Sizing Disks: Recommending Appropriate Hardware
A load test that applies a typical workload to a test environment modelled on production yields raw performance data, which can then be used to derive sizing information.
Disk sizing should characterize two things: the storage capacity of the disks and the throughput required.
Storage capacity: This is driven by the day-to-day growth in stored data as a function of business transactions (for example, the number of orders raised or the number of users registered on a web site). In addition, we need to account for the redundancy level to be employed (RAID level: RAID1, RAID5 or RAID10) to arrive at the total storage volume required.
Throughput: By analyzing raw data such as the NT perfmon counters % Disk Time, Disk Transfers/sec, Disk Writes/sec, Disk Reads/sec, Disk Bytes/sec, Avg. Disk Bytes/Write, Avg. Disk Bytes/Read and Avg. Disk Bytes/Transfer, we can gauge the number of reads and writes per second and the amount of data being transferred. Using this data we can then estimate the minimum required disk specification with the formulas below.
sequential IO: seconds per IO = (transfer time) + (track-to-track seek time) + (avg. rotational latency) ---> (1)
random IO: seconds per IO = (transfer time) + (random seek time) + (avg. rotational latency) ---> (2)
IOPS = 1 / (seconds per IO)
avg. rotational latency (seconds) = 60 / (2 * rotations per minute) ---> (3)
To illustrate, assume a single disk with predominantly sequential IO, utilization below 80%, zero queue length and a required throughput of 80 IOPS, which corresponds to an average of 1/80 s = 12.5 ms per IO. Use formula (1), ignoring the transfer time component to keep the explanation simple (it is small compared with the other two). Look up the rotations per minute and the track-to-track seek time in the manufacturer's performance specifications for the hard drive, derive the average rotational latency from formula (3), and evaluate whether the drive would achieve the target IOPS, that is, whether its seconds per IO comes in at or below 12.5 ms. If a single disk drive cannot satisfy the throughput requirement, use multiple drives (spindles) and stripe the data across them to achieve higher throughput.
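The throughput estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, the drive figures (7200 RPM, 1.0 ms track-to-track seek, 8.5 ms random seek) are hypothetical rather than taken from any real data sheet, the transfer time is ignored by default as in the text, and the spindle count assumes that striping scales IOPS linearly, which real arrays only approximate.

```python
import math


def avg_rotational_latency_s(rpm: float) -> float:
    """Average rotational latency in seconds: half of one full rotation."""
    return 60.0 / (2.0 * rpm)


def seconds_per_io(seek_time_s: float, rpm: float, transfer_time_s: float = 0.0) -> float:
    """Formulas (1) and (2): pass the track-to-track seek time for
    sequential IO or the random seek time for random IO."""
    return transfer_time_s + seek_time_s + avg_rotational_latency_s(rpm)


def iops(seek_time_s: float, rpm: float, transfer_time_s: float = 0.0) -> float:
    """IOPS = 1 / (seconds per IO)."""
    return 1.0 / seconds_per_io(seek_time_s, rpm, transfer_time_s)


def spindles_needed(target_iops: float, per_disk_iops: float) -> int:
    """Drives to stripe across, assuming IOPS scale linearly with spindles
    (a simplifying assumption; RAID write overhead is not modelled)."""
    return math.ceil(target_iops / per_disk_iops)


if __name__ == "__main__":
    # Hypothetical drive specification, for illustration only.
    rpm = 7200
    track_to_track_seek_s = 0.0010
    random_seek_s = 0.0085
    target_iops = 80

    sequential = iops(track_to_track_seek_s, rpm)
    random_io = iops(random_seek_s, rpm)

    print(f"sequential: {sequential:.1f} IOPS, "
          f"spindles needed: {spindles_needed(target_iops, sequential)}")
    print(f"random:     {random_io:.1f} IOPS, "
          f"spindles needed: {spindles_needed(target_iops, random_io)}")
```

With these assumed figures the drive delivers roughly 194 IOPS for sequential access, so one spindle covers the 80 IOPS target, but only about 79 IOPS for random access, so the same target would need two spindles.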
A further factor that indirectly influences the throughput of the entire disk subsystem is the number of disks deployed and how data is placed on them. Different classes of storage require different service characteristics: for instance, if a database log file (sequential access) and the database data files (random access) share the same disk, the resulting disk contention reduces overall throughput.