Copyright © 2009 Li Hong
Permission is granted to reprint or republish this article as long as the original source information and the copyright are kept.
A hard disk is a kind of storage that uses a concentric stack of disks, or "platters", to record data. It is a block device, which means that it reads and writes data in fixed-size blocks. Generally, the block size is 512 bytes. So from a software engineer's point of view, a hard disk is just a contiguous sequence of data blocks, any of which can be accessed freely through some kind of addressing mechanism.
1 MBR
A master boot record (MBR) is the first sector of a hard disk. It serves mainly two functions:
- Holds a disk's primary partition table.
- Holds the bootstrapping code. After the BIOS initializes the PC, it loads this sector into memory and passes execution to it.
The structure of MBR is as follows:
Offset | Description | Size (bytes)
---|---|---
0x0000 | Code area | 440
0x01B8 | Disk signature | 4
0x01BC | Usually NULL (0x0000) | 2
0x01BE | Primary partition table (four entries, each 16 bytes) | 64
0x01FE | MBR signature (0x55, 0xAA) | 2
The disk signature is used by the OS, and in turn by userland processes, to uniquely identify the boot disk. Since the introduction of EDD, however, the disk signature can be omitted and the code area extended to a length of 446 bytes.
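As a rough illustration of the layout in the table above, the following Python sketch slices a 512-byte sector into its fields. The function name `split_mbr` and the synthetic test sector are made up for this example, and reading the disk signature as a little-endian integer is an assumption rather than something stated in the table.

```python
import struct

SECTOR_SIZE = 512

def split_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR sector into the fields listed in the table."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes long")
    if sector[0x1FE:0x200] != b"\x55\xAA":
        raise ValueError("missing MBR signature 0x55 0xAA")
    # Disk signature: 4 bytes at offset 0x01B8 (assumed little-endian here).
    (disk_signature,) = struct.unpack_from("<I", sector, 0x1B8)
    table = sector[0x1BE:0x1FE]  # 64 bytes holding four 16-byte entries
    return {
        "code": sector[0x000:0x1B8],
        "disk_signature": disk_signature,
        "entries": [table[i:i + 16] for i in range(0, 64, 16)],
    }

# A synthetic, otherwise empty sector used only for demonstration.
demo = bytearray(SECTOR_SIZE)
struct.pack_into("<I", demo, 0x1B8, 0x1234ABCD)
demo[0x1FE:0x200] = b"\x55\xAA"

fields = split_mbr(bytes(demo))
print(hex(fields["disk_signature"]), len(fields["code"]), len(fields["entries"]))
```

On a real system the same function could be fed the first 512 bytes read from a disk device or image file, which usually requires elevated privileges.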
By convention, there are exactly four primary partition table entries in the MBR Partition Table scheme. Both the partition length and partition start address are stored as 32-bit quantities. Because the block size is 512 bytes, this implies that neither the maximum size of a partition nor the maximum start address (both in bytes) can exceed 2^32 * 512 bytes, or 2 TiB.
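The 2 TiB figure follows directly from the field width and the block size. A minimal sketch of the arithmetic, with the hypothetical helper name `mbr_limit_bytes` chosen for this example:

```python
def mbr_limit_bytes(block_size: int = 512, field_bits: int = 32) -> int:
    """Largest byte count expressible with a field_bits-wide block counter."""
    return (2 ** field_bits) * block_size

limit = mbr_limit_bytes()
tib = limit // 2 ** 40  # 1 TiB = 2^40 bytes
print(limit, tib)
```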
See Partition Table for more information.
2 CHS
Cylinder-head-sector, also known as CHS, was an early method for giving addresses to each physical block of data on a hard disk drive. Though CHS values no longer have a direct physical relationship to the data stored on disks, pseudo CHS values (which can be translated by disk electronics or software) are still being used by many utility programs.
- Head: Data is written to or read from a platter of the hard disk by a device called a head. Usually, two heads are used to manipulate the data on the two surfaces of a platter.
- Track, Cylinder: A platter surface is composed of concentric circles called tracks. All information stored on a hard disk is recorded in tracks. The tracks are numbered from 0, starting at the outside of the platter and increasing toward the center. All tracks that have the same number across every platter surface form a cylinder.
- Sector: A track is divided into sectors, which are the basic units managed by a hard disk driver.
So each sector can be addressed by a three-dimensional coordinate system (CHS). The number of sectors a hard disk holds is:
cylinders * heads * sectors
In earlier hard drive designs, the number of sectors per track was fixed, and because the outer tracks on a platter have a larger circumference than the inner tracks, space on the outer tracks was wasted. The number of sectors that would fit on the innermost track constrained the number of sectors per track for the entire platter. However, many of today's advanced drives use a formatting technique called Multiple Zone Recording to pack more data onto the surface of the disk. Multiple Zone Recording allows the number of sectors per track to be adjusted so that more sectors are stored on the larger, outer tracks. By dividing the outer tracks into more sectors, data can be packed uniformly throughout the surface of a platter, the disk surface is used more efficiently, and higher capacities can be achieved with fewer platters. Not only is effective storage capacity increased by as much as 25 percent with Multiple Zone Recording, but the disk-to-buffer transfer rate is also boosted: with more bytes per track, data in the outer zones is read at a faster rate.
Although, as mentioned before, CHS values no longer have a direct physical relationship to the data stored on disks, pseudo CHS still uses a uniform scheme. A CHS address is 24 bits long in total. The limits of each field are listed below. See Partition Table.
Name | Bits | Start From | End Limit | Total Number
---|---|---|---|---
Cylinder | 10 | 0 | 1023 | 1024
Head | 8 | 0 | 254 | 255
Sector | 6 | 1 | 63 | 63
So when the CHS addressing scheme is used, a hard disk can be no larger than:
(1024 * 255 * 63) * (512) = 8,422,686,720 bytes (about 8.4 GB)
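The same arithmetic, written as a small Python sketch. The helper name `chs_capacity` is made up for this example; it simply multiplies out the cylinders * heads * sectors formula given earlier by the sector size.

```python
def chs_capacity(cylinders: int, heads: int, sectors_per_track: int,
                 sector_size: int = 512) -> int:
    """Capacity in bytes of a disk with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * sector_size

# Limits from the table: 1024 cylinders, 255 heads, 63 sectors per track.
chs_limit = chs_capacity(1024, 255, 63)
print(f"{chs_limit:,} bytes")  # 8,422,686,720 bytes
```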
3 LBA
Logical block addressing (LBA) is a common scheme used for specifying the location of blocks of data stored on computer storage devices, generally secondary storage systems such as hard disks. The term LBA can mean either the address or the block to which it refers. Logical blocks in modern computer systems are typically 512 or 1024 bytes each. ISO 9660 CDs (and images of them) use 2048-byte blocks. LBA is a particularly simple addressing scheme; blocks are located by an index, with the first block being LBA=0, the second LBA=1, and so on.
CHS tuples can be converted to LBA addresses using the following formula:
LBA(C,H,S) = ((C * heads_num) + H) * sectors_per_track + S - 1
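A minimal Python sketch of this formula together with its inverse. The inverse is not given in the text; it is derived here by undoing the formula with integer division. The function names are illustrative, and the 255-head, 63-sector geometry in the usage lines is only an example.

```python
def chs_to_lba(c: int, h: int, s: int,
               heads_num: int, sectors_per_track: int) -> int:
    """Apply LBA = ((C * heads_num) + H) * sectors_per_track + S - 1."""
    return (c * heads_num + h) * sectors_per_track + s - 1

def lba_to_chs(lba: int, heads_num: int, sectors_per_track: int):
    """Invert chs_to_lba; sectors are numbered from 1, hence the + 1."""
    c, rest = divmod(lba, heads_num * sectors_per_track)
    h, s0 = divmod(rest, sectors_per_track)
    return c, h, s0 + 1

first = chs_to_lba(0, 0, 1, heads_num=255, sectors_per_track=63)
second_track = chs_to_lba(0, 1, 1, heads_num=255, sectors_per_track=63)
print(first, second_track)  # 0 63
print(lba_to_chs(63, heads_num=255, sectors_per_track=63))  # (0, 1, 1)
```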
4 Partition Table
As described before, the partition table in the MBR can hold at most four records, and no partition can exceed 2 TiB. To alleviate this capacity limitation, a new partition scheme called GUID Partition Table (GPT) has been introduced in the industry. See more at UEFI.
The layout of one 16-byte partition record is as follows:
Offset | Length | Description
---|---|---
0x00 | 1 | status (0x80 = bootable, 0x00 = non-bootable, other = invalid)
0x01 | 3 | CHS address of first sector in partition
0x04 | 1 | partition type
0x05 | 3 | CHS address of last sector in partition
0x08 | 4 | LBA of first sector in the partition
0x0C | 4 | number of sectors in partition, in little-endian format
Most of the time, LBA is used to locate a partition. However, the specification says that if a partition's start block, end block, or both lie under the 8.4 GB limit, the corresponding CHS addresses should also be recorded correctly. Otherwise, the CHS fields hold some kind of default value.
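The record layout above can be decoded with Python's `struct` module, as in the sketch below. Two details are not spelled out in the table and are stated here as assumptions: the LBA field is read as little-endian like the sector count, and the three CHS bytes are unpacked using the conventional packing (head in the first byte, sector in the low 6 bits of the second byte, and the two high cylinder bits in its top 2 bits, followed by the low 8 cylinder bits). The names `decode_chs` and `parse_entry` and the sample record are made up for illustration.

```python
import struct

def decode_chs(raw: bytes):
    """Unpack a 3-byte CHS field (conventional packing, assumed here)."""
    head = raw[0]
    sector = raw[1] & 0x3F                      # low 6 bits
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]  # 2 high bits + 8 low bits
    return cylinder, head, sector

def parse_entry(entry: bytes) -> dict:
    """Decode one 16-byte partition record following the table's offsets."""
    if len(entry) != 16:
        raise ValueError("a partition record is exactly 16 bytes long")
    status, chs_first, ptype, chs_last, lba_first, num_sectors = struct.unpack(
        "<B3sB3sII", entry)
    return {
        "bootable": status == 0x80,
        "type": ptype,
        "chs_first": decode_chs(chs_first),
        "chs_last": decode_chs(chs_last),
        "lba_first": lba_first,
        "num_sectors": num_sectors,
    }

# Hand-built sample record: bootable, type 0x83, starting at LBA 63.
sample = struct.pack("<B3sB3sII", 0x80, b"\x01\x01\x00", 0x83,
                     b"\xFE\xFF\xFF", 63, 2048000)
info = parse_entry(sample)
print(info)
```

A record whose status byte is neither 0x80 nor 0x00 would be invalid according to the table; this sketch only reports whether the bootable flag is set and leaves such validation to the caller.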
The partition type labels the file system used on the partition. For example, the code for Linux ext2 is 0x83 and the code for Linux swap is 0x82. You can list the partition types with sfdisk -T. A hard disk can have at most four primary partitions because there are only four entries in the primary partition table. The following figure gives an example of a hard disk holding two primary partitions.
If you run ls /dev/sda* or ls /dev/hda*, you may see results such as the following:
/dev/sda /dev/sda1 /dev/sda2 or
/dev/hda /dev/hda1 /dev/hda2
Please note:
- The addressing mode used in the figure is LBA. In CHS terms, it would be Sector 1 - Sector 63.
- The first partition normally starts at sector 63 (LBA), that is, just after the first track. The first 63 sectors (the first track) can be used for other purposes, such as holding bootloader code.
- A partition can start and end anywhere as long as it does not overlap another partition, and the partitions need not cover all the space on a hard disk.
To get more partitions, we can subdivide a primary partition into several logical partitions. The primary partition used to house the logical partitions is called an extended partition, and it has its own partition type (0x05, extended). See more at Extended partition.