But IOPs by themselves don’t paint the whole picture of storage performance. The following questions fill in the rest of it:
- What is the size of the data blocks being read or written while IOPs are measured?
- What is the end-to-end latency seen by the application in reading or writing the data block?
- With regard to writes, are the reported IOP numbers for synchronous writes or asynchronous writes?
- With regard to reads, what role does caching play?
Large data blocks take longer to write to and read from storage, so IOP numbers for 4kB blocks will be very different from IOPs for 1MB blocks. The most relevant block size for an IOP measurement is one that matches the sizes of the blocks the application actually writes to the storage system. Knowing the IO profile of the application is key to choosing an appropriate storage system for it.
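As a back-of-the-envelope illustration, a device limited purely by internal bandwidth completes far fewer large-block operations per second than small-block ones. The sketch below assumes a made-up 500 MB/s ceiling; real devices also have per-operation overheads, so treat the numbers as illustrative only.

```python
# Hypothetical, bandwidth-limited device: the 500 MB/s ceiling is an
# assumption for illustration, not a real product's specification.
BANDWIDTH_BYTES_PER_SEC = 500 * 1024 * 1024

for block_size in (4 * 1024, 1024 * 1024):   # 4kB vs 1MB blocks
    max_iops = BANDWIDTH_BYTES_PER_SEC / block_size
    print(f"{block_size:>8} B blocks -> at most {max_iops:10,.0f} IOPs")
```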
Latency is a key IOP qualifier. Storage system latency is the time from when the application issues an IO request to the storage system to when the request is completed - the read data is delivered to the application, or the storage system acknowledges that the data block has been written. The important question with respect to writes is when the storage system acknowledges that the data block has been persisted on non-volatile storage. For some applications, writes may be asynchronous - they are acknowledged before they have been persisted on non-volatile storage. Since asynchronous IOP and latency numbers look better (higher IOPs, lower latency), storage vendors’ promotional material often quotes asynchronous write IOPs.
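One way to see the difference is to time a single write with and without forcing it to non-volatile storage. This is a minimal sketch, assuming the placeholder path /tmp/iotest; without the fsync, the write typically lands only in the OS page cache, which is the asynchronous case described above.

```python
import os
import time

def write_latency(path, block, synchronous):
    """Time one write; if synchronous, include the fsync that forces the
    block onto non-volatile storage before the clock stops."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        start = time.perf_counter()
        os.write(fd, block)
        if synchronous:
            os.fsync(fd)   # wait until the device reports the data persisted
        return time.perf_counter() - start
    finally:
        os.close(fd)

block = os.urandom(4 * 1024)   # one 4kB block
print("asynchronous (buffered) write:", write_latency("/tmp/iotest", block, False))
print("synchronous (fsync'd) write:  ", write_latency("/tmp/iotest", block, True))
```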
Some systems have battery-backed non-volatile RAM that allows the acknowledgement to be sent to the application as soon as the data block is written to RAM - usually orders of magnitude faster than storage media like SSD or disk. The question then becomes: how much of this non-volatile RAM is available to hold data blocks before they must be persisted on the slower storage media? While some applications have bursty write profiles that play nicely with this (limited) non-volatile RAM, applications that require sustained write performance may not benefit much from such methods.
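A toy model makes the burst-versus-sustained distinction concrete. The buffer size and drain rate below are assumptions chosen only to show the shape of the behaviour: writes are acknowledged at RAM speed until the buffer fills, after which incoming writes are throttled to however fast the buffer drains to the media.

```python
BUFFER_BYTES = 1 * 1024**3      # assumed: 1 GB of battery-backed RAM
MEDIA_RATE   = 200 * 1024**2    # assumed: buffer drains to media at 200 MB/s

def ram_speed_seconds(write_rate):
    """Seconds of RAM-speed acknowledgements for a sustained write stream
    arriving at write_rate bytes/s before the buffer fills."""
    if write_rate <= MEDIA_RATE:
        return float("inf")     # the drain keeps up; the buffer never fills
    return BUFFER_BYTES / (write_rate - MEDIA_RATE)

for rate_mb in (100, 500, 1000):
    secs = ram_speed_seconds(rate_mb * 1024**2)
    note = "indefinitely" if secs == float("inf") else f"for about {secs:.1f} s"
    print(f"{rate_mb:>5} MB/s incoming -> RAM-speed acknowledgements {note}")
```

A write burst shorter than that window never sees media-speed latency; a sustained stream at the same rate does.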
Similarly, RAM can be used to cache data for reads - read latency drops when cache hits serve data blocks from RAM instead of reading them off the slower storage media. The size of the RAM cache, as well as the application’s read patterns - is some data read more often than other data? - are important considerations when working with caches.
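The sketch below replays two assumed read patterns - one uniform, one skewed toward hot blocks - against a simple LRU cache to show how much the access pattern, not just the cache size, drives the hit rate. The working-set and cache sizes are made-up numbers for illustration.

```python
import random
from collections import OrderedDict

def hit_rate(accesses, cache_blocks):
    """Replay a sequence of block reads against an LRU cache holding
    cache_blocks entries and return the fraction served from cache."""
    cache, hits = OrderedDict(), 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)         # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)    # evict least recently used
    return hits / len(accesses)

random.seed(0)
blocks = list(range(10_000))                 # assumed working set: 10k blocks
uniform = random.choices(blocks, k=100_000)
skewed  = random.choices(blocks,             # hot blocks read far more often
                         weights=[1 / (b + 1) for b in blocks], k=100_000)

cache_size = 1_000                           # cache holds 10% of the blocks
print(f"uniform reads: {hit_rate(uniform, cache_size):.0%} cache hits")
print(f"skewed reads:  {hit_rate(skewed,  cache_size):.0%} cache hits")
```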
The key to having an intelligent conversation about IOPs is to know your application and to seek definitive answers about latency, data block sizes, synchronous versus asynchronous writes, and caching.