Know Your I/O: Finding Where SSDs Excel in Solutions
(Windows* Platform)
Understanding the Characterizations of Your I/O Workload from a System Perspective
This technology brief is intended for application and system architects, database administrators, and systems and storage engineers.
It provides a framework for collecting the important OS-specific I/O measurements so that you can focus limited resources on the
architectural choices that deliver the greatest cost-effective gain in user experience. By focusing on performance indicators that
affect users, back-end applications, networks, and systems, you get a complete view of the data needed to optimize your solution.
This brief also covers the OS metrics needed to determine whether a system node has a storage bottleneck, and it highlights where a
Solid-State Drive (SSD) can help relieve bottlenecks or open opportunities for improvement.
Introduction
The following table compares the areas of I/O workload characterization and how SSD technology differs from Hard Disk Drive (HDD)
technology in each area.
Table 1: Characterization of SSD and HDD

IOPS
  Definition: Throughput capability in a particular unit size (4 KB is most common). The highest IOPS can generally be
    achieved with the smallest unit sizes.
  SSD (per drive): 20,000 IOPS and higher (SATA); 100,000 IOPS and higher (PCIe*).
    NOTE: PCIe* is typically 4 times or more powerful on a throughput basis than SATA.
  HDD (per drive): 100-350 IOPS.

Throughput
  Definition: I/O operations multiplied by the unit size (e.g., 4 KB). The highest throughput can be achieved with a
    sequential, large-block workload (e.g., 128 KB blocks).
  SSD (per drive): Throughput depends on how deeply you drive the queues on an SSD.
  HDD (per drive): Throughput can be limited by device queuing.

Latency
  Definition: Access time for the I/O operation.
  SSD (per drive): Microsecond-class performance.
  HDD (per drive): 3-20 milliseconds of access time.

Locality
  Definition: Is the data sequential and cacheable, or random and unpredictable?
  SSD (per drive): Well suited for mixed and random workloads.
  HDD (per drive): Better suited for sequential workloads.

Access Patterns
  Definition: The read/write mix is critical in determining which solution is employed and how it will tier with SSD.
  SSD (per drive): SSDs differentiate based on write endurance capabilities to provide years of service.
  HDD (per drive): HDD solutions are aided by higher spindle counts and by caching, such as a controller DRAM-based cache.

Block Size
  Definition: Every solution utilizes unique block sizes. I/O travels through layers to the devices.
  SSD (per drive): SSDs perform optimally at block sizes of 4 KB and above.
  HDD (per drive): Disks are typically aligned to, and perform at, smaller block sizes.

SSDs can produce more IOPS per device than HDDs, far more efficiently and with lower latency. Still, to improve user experience or
make better use of hardware and CPU, you must capture per-node performance measurements and analyze them across several
systems to decide how to tier SSDs into your application. You can apply an appropriate set of cost factors, but when only raw storage
costs are being considered, the acquisition cost of the raw storage should be expressed on a cost-per-IOPS basis.
SSDs will always provide a lower cost per IOPS, but the total storage solution should also factor in power usage, heat dissipation,
write endurance (for n years), and consistency of the I/O service over time. Other factors are storage subsystem maintenance,
support costs, disaster recovery, and the costs of creating copies for high availability. These are all part of a total solution cost.
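
As a rough illustration of the cost-per-IOPS comparison described above, the following Python sketch divides acquisition cost by the IOPS one drive can deliver and estimates how many drives an IOPS target requires. The prices and the 40,000 IOPS target are hypothetical placeholders chosen only for the arithmetic; the IOPS figures are taken from the ranges in Table 1.

```python
import math
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float  # hypothetical acquisition cost, for illustration only
    iops: float       # per-drive IOPS, taken from the Table 1 ranges


def cost_per_iops(drive: Drive) -> float:
    """Raw acquisition cost divided by the IOPS a single drive can deliver."""
    return drive.price_usd / drive.iops


def drives_needed(drive: Drive, required_iops: float) -> int:
    """Smallest number of drives whose combined IOPS meets the target."""
    return math.ceil(required_iops / drive.iops)


# Placeholder prices; substitute real quotes before drawing conclusions.
hdd = Drive("HDD", price_usd=200.0, iops=350)
ssd = Drive("SATA SSD", price_usd=500.0, iops=20_000)

for d in (hdd, ssd):
    print(f"{d.name}: ${cost_per_iops(d):.4f} per IOPS, "
          f"{drives_needed(d, 40_000)} drive(s) for 40,000 IOPS")
```

This covers raw storage cost only; power, endurance, and the other factors listed above would have to be added for a total solution cost.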
Applications demand consistent I/O to perform best, so you must understand your I/O through the necessary attributes outlined in the
following table.
330984-001US
Technology Brief
Table 2: Holistic tiered approach to application design optimization

User / Client
  Definition: Determines client performance and data cache opportunities.
  Common examples: HTTP network protocol (Cache-Control); HTML5 Application Cache.

Network
  Definition: Networks are extensively layered in a mobile device world.
  Common examples:
    - Physical latency restrictions must be understood.
    - Number of requests must be lowered.
    - Payload considerations are always a factor.

Web Application
  Definition: Orchestrates the user requests and can benefit from application data tiering and caching as intermediate
    nodes in a stateless fashion.
  Common examples:
    - Well suited for personalization or login-specific data to make the application start faster.
    - Common queries can be “cached” into the application tier with a memory cache.

Data Stores
  Definition: Provides the persistent original data to the user. Provides the system of reference to the data.
  Common examples:
    - SSDs can differentiate based on enhanced write endurance, lower failure rates, and fewer drives per application's
      I/O requirement.
    - Within a solution, a temp store or hot data cache partition that utilizes SSD.
    - Leverage a third-party cache solution for databases, such as Intel® CAS.

Purpose-built Applications and Service Tiers
  Definition: Purpose-built services are often high volume and have low latency service needs.
  Common examples: Personalization data, identity, user profile stores, security logging, content caches, video on demand,
    big data for operations, virtual desktop systems.

Figure 1: Workload characteristic benefits between HDD and SSD
I/O Performance Counters on Windows
IOPS (I/O Operations per Second)
To understand your I/O operations per second (IOPS), use the following PerfMon performance counter:
Counter: Disk Transfers/sec
Definition: I/O operations transferred to the drive. Find the peak over a known cycle, not just the average.
HINT: SSDs can deliver far more IOPS per drive and handle peaks efficiently.
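
To show why the peak matters and not just the average, here is a minimal Python sketch that reads Disk Transfers/sec samples from a counter log exported as CSV. The CSV layout (a timestamp column followed by one column per counter path), the host name NODE1, and all sample values are assumptions made for illustration; adjust the parsing to match your own export.

```python
import csv
import io
import statistics

# Hypothetical CSV export of a counter log. The header layout and every value
# below are invented for illustration; a blank sample is included on purpose.
SAMPLE_CSV = r'''"Time","\\NODE1\PhysicalDisk(_Total)\Disk Transfers/sec"
"01/01/2014 10:00:00"," "
"01/01/2014 10:00:15","120.0"
"01/01/2014 10:00:30","135.0"
"01/01/2014 10:00:45","2900.0"
"01/01/2014 10:01:00","140.0"
"01/01/2014 10:01:15","110.0"
'''


def column_values(csv_text, counter_suffix):
    """Return numeric samples from the first column whose header ends with counter_suffix."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    idx = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    values = []
    for row in reader:
        try:
            values.append(float(row[idx]))
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
    return values


samples = column_values(SAMPLE_CSV, "Disk Transfers/sec")
average_iops = statistics.mean(samples)
peak_iops = max(samples)
print(f"average {average_iops:.0f} IOPS, peak {peak_iops:.0f} IOPS")
```

In this invented sample, sizing to the average would leave the storage far short of the single burst interval, which is the case the peak is meant to expose.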
Find Peak Throughput (Mbytes per Second)
Engineers must design for peak throughput. Use Disk Total Bytes/sec to determine long-term trends and workload peaks.
Counter: Disk Total Bytes/sec (listed in PerfMon as Disk Bytes/sec)
Definition: Bytes transferred to the drive. Find the peak for your applications over a specified cycle time (a number of
days or weeks).
HINT: A PCIe* SSD can provide very high transfer rates, from MB/sec up to GB/sec.
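
One way to find the peak over a multi-day cycle is to reduce the byte-rate samples to a per-day peak in MB/sec and then take the largest. The sketch below assumes the samples have already been loaded as (timestamp, bytes per second) pairs; the timestamps, values, and the choice of a binary megabyte are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, Disk Total Bytes/sec) samples spanning two days.
SAMPLES = [
    ("2014-06-01 09:00:00", 52_428_800),
    ("2014-06-01 13:00:00", 419_430_400),
    ("2014-06-01 21:00:00", 104_857_600),
    ("2014-06-02 09:00:00", 62_914_560),
    ("2014-06-02 13:00:00", 524_288_000),
]

BYTES_PER_MB = 1024 * 1024  # binary megabyte; an assumption, some tools use 10**6


def daily_peak_mb_per_sec(samples):
    """Map each calendar day to its peak throughput in MB/sec."""
    peaks = defaultdict(float)
    for stamp, bytes_per_sec in samples:
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        peaks[day] = max(peaks[day], bytes_per_sec / BYTES_PER_MB)
    return dict(peaks)


daily = daily_peak_mb_per_sec(SAMPLES)
for day, peak in sorted(daily.items()):
    print(day, f"peak {peak:.0f} MB/sec")

# The figure to design for is the highest peak seen across the whole cycle.
design_peak = max(daily.values())
print(f"design for at least {design_peak:.0f} MB/sec")
```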
Block Size Always Matters
Block size is important because it affects other metrics such as throughput. Published hardware specifications tend to focus on a
common block size tested with a synthetic load tool; 4 KB random I/O is a common choice and is often the figure quoted in product
documentation. Every workload is unique, so each application will behave differently. Even within a traditional relational
database application, one part of the I/O workload differs from another. For example, database log writing might use a fixed
sector alignment that produces a sequential workload, while another database process might be page oriented and use 8 KB pages.
Application solutions all operate differently because of variations in design and function, and each process within a solution
behaves in its own way.
The formula for finding an isolated block size is:
Average block size (bytes) = Disk Total Bytes/sec ÷ Disk Transfers/sec
Intel SSDs excel at block sizes of 4 KB or larger.
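
A minimal Python sketch of the block size calculation follows. It assumes the block size is obtained by dividing throughput by IOPS, which follows from the Table 1 definition of throughput as I/O operations multiplied by the unit size. The sample values are invented.

```python
def average_block_size_bytes(total_bytes_per_sec, transfers_per_sec):
    """Average bytes moved per I/O over one sample interval."""
    if transfers_per_sec <= 0:
        return 0.0  # idle interval: no I/O, so no meaningful block size
    return total_bytes_per_sec / transfers_per_sec


# Invented samples: (Disk Total Bytes/sec, Disk Transfers/sec)
samples = [(81_920_000, 10_000), (16_384_000, 4_000), (0, 0)]

sizes_kb = [average_block_size_bytes(b, t) / 1024 for b, t in samples]
for (b, t), kb in zip(samples, sizes_kb):
    print(f"{b} bytes/sec over {t} transfers/sec -> {kb:.1f} KB per I/O")
```

Because this is an average over every I/O in the interval, isolating the block size of one process means sampling while that process dominates the partition.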
Read and Write Mixture and Randomness
A typical solution mixes several processes that ultimately randomize the I/O behavior. Virtualizing or pooling solutions into a
storage pool creates even more random I/O behavior, with some portions writing and others reading. This mix, although often
independent at the process level, tends to share the same storage, so tracking your mix with the counters below over time is
important. Use these counters to compute the read and write shares of the total bytes transferred, and make sure your sample
time is long enough to represent the full lifecycle of what your application does. A typical online, user-oriented system may
require sampling over several 24-hour periods.
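Counter            Definition
Read Bytes/sec     Read bytes transferred (listed in PerfMon as Disk Read Bytes/sec)
Write Bytes/sec    Write bytes transferred (listed in PerfMon as Disk Write Bytes/sec)
Total Bytes/sec    Total bytes transferred (listed in PerfMon as Disk Bytes/sec)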
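
The read/write mix can be derived from these counters as each direction's share of the total bytes. The Python sketch below is one possible way to do this, assuming total bytes equal read bytes plus write bytes; the sample values are invented. It sums bytes across intervals before dividing, so busy intervals weigh more than quiet ones.

```python
def read_write_mix(read_bytes_per_sec, write_bytes_per_sec):
    """Return (read %, write %) of total bytes for one interval or one sum."""
    total = read_bytes_per_sec + write_bytes_per_sec
    if total == 0:
        return (0.0, 0.0)  # idle: no bytes moved in either direction
    return (100.0 * read_bytes_per_sec / total,
            100.0 * write_bytes_per_sec / total)


def overall_mix(samples):
    """Byte-weighted mix across many intervals (sum the bytes, then divide)."""
    reads = sum(r for r, _ in samples)
    writes = sum(w for _, w in samples)
    return read_write_mix(reads, writes)


# Invented (Read Bytes/sec, Write Bytes/sec) samples, including an idle interval.
samples = [(70_000_000, 30_000_000), (10_000_000, 90_000_000), (0, 0)]

read_pct, write_pct = overall_mix(samples)
print(f"read {read_pct:.0f}% / write {write_pct:.0f}%")
```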
SSDs excel over HDDs at servicing random workloads because SSDs are inherently parallel and HDDs are not.
Randomness is a function of application behavior between different I/O (read or write) operations, so it is important to
understand the core application nuances. By looking at the basic platform counters, you can get an idea of whether your drive
does most of its work in reads or in writes.
In Windows*, you can infer randomness from your IOPS performance at a 100% busy state: if you are not reaching the theoretical
maximum for either read or write workloads, randomness is a likely cause. Real workloads typically show few truly busy states.
Any mixed workload will typically be random, and any virtualized mixed workload will be even more random.
Latency
The following Windows performance counters report disk latency, or the service time of I/Os, at the partition. These numbers are
affected by queuing and % Disk Time, which are covered in the following section. Capturing and observing queuing issues is
important in understanding latency.
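Counter              Definition
Disk sec/Read        Latency on reads (listed in PerfMon as Avg. Disk sec/Read)
Disk sec/Write       Latency on writes (listed in PerfMon as Avg. Disk sec/Write)
Disk sec/Transfer    Latency on all transferred I/O (listed in PerfMon as Avg. Disk sec/Transfer)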
SSDs can provide service time in the microsecond realm for greater than 99% of the workload.
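
These counters report latency in seconds, so values need converting before they can be compared with microsecond or millisecond targets. The Python sketch below shows one way to measure what share of samples falls under a chosen latency threshold; the 1 ms threshold and the sample values are assumptions for illustration.

```python
def share_below(latencies_sec, threshold_sec):
    """Fraction of samples whose latency is strictly below threshold_sec."""
    if not latencies_sec:
        return 0.0
    return sum(1 for v in latencies_sec if v < threshold_sec) / len(latencies_sec)


# Invented Disk sec/Transfer samples, in seconds as the counter reports them.
samples = [0.00012, 0.00009, 0.00030, 0.00450, 0.00015,
           0.00011, 0.00020, 0.00013, 0.00010, 0.01200]

sub_ms = share_below(samples, 0.001)   # share of samples faster than 1 ms
worst_us = max(samples) * 1_000_000    # slowest sample, in microseconds

print(f"{sub_ms:.0%} of samples under 1 ms, worst sample {worst_us:.0f} microseconds")
```

Each counter sample is already an average over its interval, so short sample intervals give a more faithful picture of the slow tail.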
Queuing Affects Latency
You must understand whether a drive is too busy (% Disk Time). If the partition's queue is busy, this counter shows busy for the
measured time increment. The following table gives the definitions.
Logical Disk(x) / Current Disk Queue Length
  Definition: The number of queued I/Os.
  Comments: Queued disk operations add to latency.

Logical Disk(x) / % Idle Time
  Definition: The percentage of the measured time interval in which the disk was in an idle state and had no pending operations.
  Comments: Sustained periods near 0% can indicate controller or drive saturation.
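
To make "sustained periods near 0%" concrete, the Python sketch below finds the longest run of consecutive samples in which % Idle Time stays under a threshold, and the deepest queue seen during those busy samples. The 5% threshold and the sample values are assumptions chosen for illustration, not a published guideline.

```python
def longest_saturated_run(idle_pct, threshold=5.0):
    """Length of the longest run of consecutive samples with % Idle Time below threshold."""
    longest = current = 0
    for value in idle_pct:
        current = current + 1 if value < threshold else 0
        longest = max(longest, current)
    return longest


# Invented samples taken at the same instants for one logical disk.
idle = [80, 40, 3, 1, 0, 2, 55, 4, 90]   # % Idle Time
queue = [0, 1, 6, 9, 12, 8, 1, 3, 0]     # Current Disk Queue Length

run = longest_saturated_run(idle)
peak_queue_while_busy = max(q for q, i in zip(queue, idle) if i < 5.0)

print(f"longest near-idle-free run: {run} samples")
print(f"deepest queue while busy: {peak_queue_while_busy}")
```

Multiplying the run length by the sample interval gives the duration of the saturated period.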
Focus on Specific Drive Partitions
Systems normally provide specific mount points/drive letters. Some generalized partition types are listed in the following table.
Partition Type: Logging
  Notes: SSDs provide very low latency to improve OLTP database logging, application commit time, and scalability potential.

Partition Type: Cache
  Notes: Hot data tiers and caches provide many opportunities, such as a database temporary work area or any hot data that
    can excel with SSDs.

Partition Type: Virtualization and Paging
  Notes: Well suited for mixed and random workloads.

Partition Type: Data
  Notes: Data-intensive applications can consolidate HDD spindles or break through low ceilings of random IOPS.

SSDs can provide excellent consolidation, performance or scalability benefits where noted, based on your I/O requirements being
analyzed.
CPU Privileged Time (or CPU “kernel” time)
Many applications cannot meet user or business demands because they are slowed by excessive privileged time. This can mean
that the CPU is spending too much effort on the memory, network, or disk subsystems, and perhaps on unproductive system overhead,
rather than on user workload. % Privileged Time should not exceed 30% on Windows; above that level, the driver stacks in the
system are being overused. To determine whether the devices are overly busy, focus on % Idle Time and queuing.
Object       Counter              Behavior to Address
Processor    % Privileged Time    Should not exceed 30% as a guideline.
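
A simple check against the 30% guideline is to count how often the samples exceed it. The Python sketch below does this for a list of % Privileged Time samples; the sample values are invented, and treating exactly 30% as within the guideline is an assumption.

```python
GUIDELINE_PCT = 30.0  # guideline stated in this brief


def share_over_guideline(privileged_pct, limit=GUIDELINE_PCT):
    """Fraction of samples in which % Privileged Time exceeds the limit."""
    if not privileged_pct:
        return 0.0
    over = [v for v in privileged_pct if v > limit]
    return len(over) / len(privileged_pct)


# Invented Processor / % Privileged Time samples.
samples = [12.0, 18.5, 33.0, 41.2, 25.0, 9.8, 30.0, 36.4]

over_share = share_over_guideline(samples)
print(f"{over_share:.1%} of samples exceed {GUIDELINE_PCT:.0f}% privileged time")
```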
Related System Counters
System overhead related to network and storage I/O is captured by the Windows PerfMon object and counter
“Processor(n) / % Privileged Time.” This counter reports the percentage of time the system spent executing in privileged mode,
which includes drivers such as the storage driver. When dealing with privileged mode operations, there are two modes to
consider: Deferred Procedure Call (DPC) mode and Interrupt mode.
DPC mode allows high-priority tasks to defer required but lower-priority tasks for later execution. Interrupt mode is reserved for
interrupt service routines, which are device driver functions; heavy time in this mode could be a signal that the storage driver
is working too hard.
Microsoft* recommends that this counter not exceed 30%. A higher value indicates that the system driver level is not functioning
efficiently, and you may want to look further at other systems and driver configurations to see whether you need to make a
change, such as a driver update.
A related Microsoft link with more background and guidance can be found at:
http://blogs.technet.com/b/askperf/archive/2008/01/18/do-you-know-where-your-processor-spends-its-time.aspx
Also review the following two counters for unusual spiking behavior that can require further isolation:
Object       Counter                   Behavior to Address
Processor    Interrupts / sec          Look for spikes
System       Context Switches / sec    Look for spikes
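
What counts as a spike is a judgment call. As one possible starting point, the Python sketch below flags samples that exceed a multiple of the median, which keeps a single large outlier from distorting the baseline. The factor of 3 and the sample values are assumptions for illustration only.

```python
import statistics


def find_spikes(values, factor=3.0):
    """Return (index, value) pairs where a sample exceeds factor times the median."""
    if not values:
        return []
    baseline = statistics.median(values)
    return [(i, v) for i, v in enumerate(values) if v > factor * baseline]


# Invented samples for the two counters in the table above.
interrupts_per_sec = [4100, 3900, 4000, 4200, 19500, 4050, 3950]
context_switches_per_sec = [9000, 9500, 9100, 8800, 9300, 40000, 9200]

interrupt_spikes = find_spikes(interrupts_per_sec)
context_spikes = find_spikes(context_switches_per_sec)

print("Interrupts/sec spikes:", interrupt_spikes)
print("Context Switches/sec spikes:", context_spikes)
```

The returned indexes can be matched against the disk counters sampled at the same instants to see whether a spike coincides with storage activity.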
Conclusion
A total application monitoring approach requires many points of data collection. The modern approach uses application performance
monitoring (APM), which can provide client, network, application, and system level data through agents that monitor most of your
critical needs. These solutions should monitor data from all of the necessary tiers.
This paper focuses only on the node level, so that IT architects can manage latency, transactions, and throughput at that level
(item 6, System / Hardware, in Table 3 below).
Table 3: Ecosystems of solution architecture optimization and data collection

1. Client Tiers
   Collection Source: Real User Monitoring
   Notes: Browser page load time

2. Web/Service Tiers
   Collection Source: Web Application / Web Service Monitors (APM)
   Notes: Throughput and response time of the service; trace metrics of key business transactions

3. Database Tiers
   Collection Source: Data Store / Data Service
   Notes: Response time and data throughput, then key performance indicators (KPIs) and traces

4. Network Topology
   Collection Source: Network Monitoring
   Notes: Conversation level monitoring between Cloud and Web Service boundaries

5. Storage Topology
   Collection Source: Storage System Monitoring
   Notes: Focus on throughput and service time of the shared storage environment

6. System / Hardware
   Collection Source: Node Level Metrics
   Notes: Focus on CPU, memory, node level network, and storage

Cloud service users should understand how to obtain these metrics (response time and throughput) from their cloud provider.
This might be difficult, and the variability of I/O in the cloud is a well-documented challenge. Remember these facts because,
like the laws of physics, latency (and not just throughput) always matters for a service provided to users, which nearly always
relies on data.
Additional References
Intel® SSD product specifications:
http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-ssd.html
How to save PerfMon Data (2 minute video):
http://www.youtube.com/watch?v=3yEEzqDE5qI
Microsoft* Performance Team blog series:
http://blogs.technet.com/b/askperf/
Disk Counters Explained:
http://blogs.technet.com/b/askcore/archive/2012/03/16/windows-performance-monitor-disk-counters-explained.aspx
Examining and Tuning Disk Performance:
http://technet.microsoft.com/library/Cc938959
CPU User Mode versus Privileged Mode:
http://blogs.technet.com/b/perfguide/archive/2010/09/28/user-mode-versus-privileged-mode-processor-usage.aspx
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY
INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL
ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING
LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER
INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE
OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND
AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE
ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH
MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL
PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or
instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising
from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications.
Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or
go to: http://www.intel.com/design/literature.htm
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2014 Intel Corporation. All rights reserved.