Local NVMe Storage on AWS – Pros and Cons

What is the Best Storage Solution on AWS?

The debate over the optimal storage solution on AWS has been ongoing. Local instance storage (i.e., ephemeral NVMe disks attached to EC2 instances) brings a remarkable cost-performance ratio: it offers 20 times better performance and 10 times lower access latency than EBS. It’s a powerhouse for quick, ephemeral storage needs. In simple words, a local NVMe disk is very fast and relatively cheap, but neither scalable nor persistent.

Recently, Vantage posted an article titled “Don’t use EBS for Cloud Native Services”. We agree with this problem statement; however, we also strongly believe that there is a better solution than using local NVMe SSD storage on AWS as an alternative to EBS. Comparing local NVMe to EBS is not comparing apples to apples, but rather apples to oranges.

The Local Instance NVMe Storage Advantage

Local storage on AWS excels in speed and cost-efficiency, delivering performance that’s 20 times better and latency that’s 10 times lower compared to EBS. For certain workloads with temporary storage needs, it’s a clear winner. But, let’s acknowledge the reasons why data centers have traditionally separated storage and compute.

Overcoming Traditional Challenges of Local Storage

| Challenge | Local Storage | simplyblock |
|---|---|---|
| Scalability | Limited capacity, unable to resize dynamically | Dynamic scalability |
| Reliability | Data loss if the instance is stopped or terminated | Advanced data protection; data survives instance outages |
| High Availability | Inconsistent access in case of a compute instance outage | Storage remains fully available in case of a compute instance outage |
| Data Protection Efficiency | N/A | Erasure coding instead of three replicas reduces network load and improves the effective-to-raw storage ratio by a factor of about 2.5x |
| Predictability/Consistency | Access latency increases with rising IOPS demand | Constant access latencies |
| Maintainability | Storage is impacted by compute instance upgrades | Compute instances can be upgraded and maintained without impact on storage |
| Data Services Offloading | N/A | Data services such as volume snapshots, copy-on-write cloning, instant volume resizing, erasure coding, encryption, and data compression run without impact on local CPU, performance, or access latency |
| Intelligent Storage Tiering | N/A | Infrequently accessed data chunks are automatically moved from more expensive, fast storage to cheap S3 buckets |

Simplyblock provides an innovative approach that marries the cost and performance advantages of local instance storage with the benefits of pooled cloud storage. It offers the best of both worlds: high-speed, low-latency performance close to that of local storage, coupled with the robustness and flexibility of pooled cloud storage.

Why Choose simplyblock on AWS?

  1. Performance and Cost Efficiency: Enjoy the benefits of local storage without compromising on scalability, reliability, and high availability.
  2. Data Protection: simplyblock employs advanced data protection mechanisms, ensuring that your data survives any instance outage.
  3. Seamless Operations: Upgrade and maintain compute instances without impacting storage, ensuring continuous operations.
  4. Data Services Galore: Unlock the potential of various data services without affecting local CPU performance.

While local instance storage has its merits, the future lies in a harmonious blend of the speed of local storage and the resilience of cloud-pooled storage. With simplyblock, we transcend the limitations of local NVMe disk, providing you with a storage solution that’s not just powerful but also versatile, scalable, and intelligently designed for the complexities of the cloud era.

Simplyblock for AWS: Environments with many gp2 or gp3 Volumes

When operating your stateful workloads in Amazon EC2 and Amazon EKS, data is commonly stored on Amazon’s EBS volumes. AWS supports a set of different volume types offering different performance characteristics. The most commonly used ones are gp2 and gp3 volumes, providing a good combination of performance, capacity, and cost efficiency. So why would someone need an alternative?

For environments with high-performance requirements such as transactional databases, where low-latency access and optimized storage costs are key, alternative solutions are essential. This is where simplyblock steps in, offering a new way to manage storage that addresses common pain points in traditional EBS or local NVMe disk usage—such as limited scalability, complex resizing processes, and the cost of underutilized storage capacity.

What is Simplyblock?

Simplyblock is known for providing top performance based on distributed (clustered) NVMe instance storage at low cost with great data availability and durability. Simplyblock provides storage to Linux instances and Kubernetes environments via the NVMe block storage and NVMe over Fabrics (using TCP/IP as the underlying transport layer) protocols and the simplyblock CSI Driver.

Simplyblock’s storage orchestration technology is fast. The service provides access latencies between 100 µs and 500 µs, depending on the IO access pattern and deployment topology. That means simplyblock’s access latency is comparable to, or even lower than, that of Amazon EBS io2 volumes, which typically provide between 200 µs and 300 µs.

To make sure we only provide storage which will keep up, we test simplyblock extensively. With simplyblock, you can easily achieve more than 1 million IOPS at a 4KiB block size on a single EC2 compute instance. This is several times higher than the most scalable Amazon EBS volume type, io2 Block Express. On the other hand, simplyblock’s cost of capacity is comparable to io2. However, with simplyblock, IOPS come for free – at absolutely no extra charge. Therefore, depending on the capacity-to-IOPS ratio of io2 volumes, it is possible to achieve cost advantages of up to 10x.

For customers requiring very low storage access latency and high IOPS per TiB, simplyblock provides the best cost efficiency available today.

Why Simplyblock over Simple Amazon EBS?

Many customers are generally satisfied with the performance of their gp3 EBS volumes. An access latency of 6 to 10 ms is fine for them, and they never have to go beyond the included 3,000 IOPS (on gp2 and gp3). They should still consider simplyblock, because there is more. Much more.

Simplyblock provides multiple angles to save on storage: true thin provisioning, storage tiering, multi-attach, and snapshot storage!

Benefits of Thin Provisioning

With gp3, customers have to pay for provisioned rather than utilized capacity (~USD 80 per TiB provisioned). According to our research, the average utilization of Amazon EBS gp3 volumes is only ~30%. This means that customers are actually paying more than three times the price per TiB of utilized storage: at a utilization below one-third, the effective price comes to about USD 250 per utilized TiB. The higher the utilization, the closer a customer gets to the projected USD 80 per TiB.
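
As a quick back-of-the-envelope check on these numbers (the price is approximate and varies by region):

```python
# Back-of-the-envelope: effective gp3 cost per *utilized* TiB.
# Numbers from the text above; actual AWS prices vary by region.
price_per_provisioned_tib = 80.0   # ~USD per TiB provisioned per month
avg_utilization = 0.30             # ~30% average volume utilization

effective_price = price_per_provisioned_tib / avg_utilization
print(f"~USD {effective_price:.0f} per utilized TiB")  # ~USD 267 (rounded to ~250 above)
```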

In addition to this price inefficiency, customers also have to manage the resizing of gp3 volumes when utilization reaches the current capacity limit. However, resizing has its own limitations in EBS: it is only possible once every six hours. To mitigate potential issues during that window, volumes are commonly doubled in size.

On the other hand, simplyblock provides thin provisioned logical volumes. This means that you can provision your volumes nearly without any restriction in size. Think of growable partitions that are sliced out of the storage pool. Logical volumes can also be over-provisioned, meaning you can set the requested storage capacity to exceed the storage pool’s current size. There is no charge for the over-provisioned capacity as long as you do not use it.

A thinly provisioned logical volume requires only the amount of storage actually used.

That said, simplyblock thinly provisions NVMe volumes from a storage pool which is either made up of distributed local instance storage or gp3 volumes. The underlying pool is resized before it runs out of storage capacity.

This enables you to save massively on storage while also simplifying your operations. No more manual or script-based resizing! No more custom alerts before running out of storage.
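
For illustration only, the idea behind automatic pool enlargement fits in a few lines of Python; the class, threshold, and step size below are hypothetical, not simplyblock’s actual logic:

```python
# Illustrative sketch of automatic pool enlargement; names and thresholds
# are hypothetical, not simplyblock's actual implementation.
from dataclasses import dataclass

@dataclass
class StoragePool:
    capacity_gib: int
    used_gib: int

def maybe_expand(pool: StoragePool, threshold: float = 0.8, step_gib: int = 512) -> None:
    """Grow the pool *before* it runs full, so volumes never hit a hard limit."""
    if pool.used_gib / pool.capacity_gib >= threshold:
        pool.capacity_gib += step_gib  # e.g., attach another EBS volume or node
        print(f"Pool expanded to {pool.capacity_gib} GiB")

pool = StoragePool(capacity_gib=1024, used_gib=850)  # 83% utilized
maybe_expand(pool)                                   # -> Pool expanded to 1536 GiB
```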

Benefits of Storage Tiering

But if you feel there should be even more potential to save on storage, you are absolutely right!

The total data stored on a single EBS volume has very different access patterns. Let’s explore together what the average database setup looks like. The typical corporate transactional database will easily qualify as “hot” storage. It is commonly stored on SSD-based EBS volumes. Nobody would think of putting this database on slow file storage backed by HDD or Amazon S3.

Simplyblock tiers infrequently used data blocks automatically to cheaper storage backends.

In reality, however, data that belongs to a database is never homogeneous when it comes to performance requirements. There is, for example, the so-called database transaction log, often referred to as write-ahead log (WAL) or simply a database journal. The WAL is quite sensitive to access latency and requires a high IOPS rate for writes. On the other hand, the log is relatively small compared to the entire dataset in the database.

Furthermore, some other data files store tablespaces and index spaces. Many of them are read so frequently that they are always kept in memory. They do not depend on storage performance. Others are accessed less frequently, meaning they have to be loaded from storage every time they’re accessed. They require solid storage performance on read.

Last but not least, there are large tables which are commonly used for archiving or document storage. They are written or read infrequently and typically in large IO sizes (batches). While throughput speed is relevant for accessing this data, access latency is not.

To support all of the above use cases, simplyblock supports automatic tiering. Our tiering will place less frequently accessed data on either Amazon EBS (st1) or Amazon S3, called warm storage. The tiering implementation is optimized for throughput, hence large amounts of data can be written or read in parallel. Simplyblock automatically identifies individual segments of data which qualify for tiering, moves them to secondary storage, and only after tiering has succeeded cleans them up on the “hot” tier. This reduces the storage demand in the hot pool.
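
To illustrate the principle, here is a simplified sketch of such a tiering pass. The segment layout, age threshold, and warm-store stub are all hypothetical; the real implementation works on block segments and is throughput-optimized:

```python
# Simplified sketch of a tiering pass; segment structure, threshold, and
# the warm-store stub are illustrative, not simplyblock internals.
import time

class WarmStore:
    """Stand-in for a warm tier such as an S3 bucket."""
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = data

HOT_AGE_LIMIT_S = 7 * 24 * 3600  # example: tier out segments untouched for a week

def tier_out_cold_segments(segments, warm):
    now = time.time()
    for seg in segments:
        if seg["location"] == "hot" and now - seg["last_access"] > HOT_AGE_LIMIT_S:
            warm.put(seg["id"], seg["data"])  # 1. copy the segment to warm storage
            seg["data"] = None                # 2. only after success, free the hot copy
            seg["location"] = "warm"

segments = [
    {"id": 1, "data": b"cold", "last_access": time.time() - 30 * 86400, "location": "hot"},
    {"id": 2, "data": b"hot",  "last_access": time.time(),              "location": "hot"},
]
tier_out_cold_segments(segments, WarmStore())
print([s["location"] for s in segments])  # -> ['warm', 'hot']
```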

The AWS cost ratio between hot and warm storage is about 5:1, cutting cost to about 20% for tiered data. Tiering is completely transparent to you and data is automatically read from tiered storage when requested.

Based on our observations, we often see that up to 75% of all stored data can be tiered to warm storage. This creates another massive potential for storage cost savings.
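
Putting the two figures together, the blended cost of a tiered dataset can be estimated like this:

```python
# Blended cost estimate using the numbers above: warm storage at ~20% of
# hot-tier cost (5:1 ratio) and up to ~75% of data qualifying for tiering.
hot_cost = 1.0          # normalized cost of the hot tier
warm_cost = 0.2         # ~5:1 hot-to-warm cost ratio
tiered_fraction = 0.75

blended = (1 - tiered_fraction) * hot_cost + tiered_fraction * warm_cost
print(f"Blended cost: {blended:.2f}x all-hot")  # -> 0.40x, i.e. ~60% savings
```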

How to Prevent Data Duplication

But there is yet more to come.

AWS’ gp3 volumes do not allow multi-attach, meaning the same volume cannot be attached to multiple virtual machines or containers at the same time. Furthermore, their durability is relatively low (indicated at 99.8% to 99.9%) compared to Amazon S3.

That means neither a loss of availability nor a loss of data can be ruled out in case of an incident.

Therefore, additional steps need to be taken to increase the availability of the storage-consuming service, as well as the reliability of the storage itself. The common measure is to employ storage replication (RAID-1 or application-level replication). However, this adds operational complexity, consumes network bandwidth, and duplicates the storage demand (doubling storage capacity and cost).

Simplyblock removes the need to replicate storage. First, the same thinly provisioned volume can be attached to more than one Amazon EC2 instance (or container); second, the durability of each individual volume is higher (99.9999%) due to the internal use of erasure coding (parity data) to protect the data.

Multi-attach helps to cut the storage cost by half.

The Cost of Backup

Last but not least: backups. Yes, there is even more.

A snapshot taken from an Amazon EBS volume is stored in an S3-like storage. However, AWS charges significantly more per TiB than for the same storage directly on S3: about 3.5 times as much.

Snapshots taken from simplyblock logical volumes, however, are stored in a standard Amazon S3 bucket and billed at standard S3 pricing, giving you yet another nice cost reduction.

Near-Zero RPO Disaster Recovery

Anyhow, there is one more feature that we really want to talk about. Disaster recovery is an optional feature. Our DR comes with a minimal RPO and can be deployed without any redundancy on either the block storage or the compute layer between zones. Additionally, no data transfers between zones are needed.

Simplyblock employs asynchronous replication to store any change on the storage pool to an S3 bucket. This enables a fully crash-consistent and near-real-time option for disaster recovery. You can bootstrap and restart your entire environment after a disaster. This works in the same or a different availability zone and without having to take care of backup management yourself.

And if something does happen, such as an accidental deletion or even a successful ransomware attack that encrypted your data, simplyblock is here to help. Our asynchronous replication journal provides full point-in-time recovery functionality on the block storage layer. There is no need for your service or database to support it. Just rewind the storage to any point in time in the past.
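
Conceptually, block-layer point-in-time recovery boils down to an append-only journal of writes that is replayed up to a chosen timestamp. A minimal sketch (illustrative only; this is not simplyblock’s actual journal format):

```python
# Minimal sketch of block-level point-in-time recovery through an
# append-only journal; the format is illustrative, not simplyblock's.
journal = []  # append-only list of (timestamp, block_number, data)

def write_block(ts: float, block_no: int, data: bytes) -> None:
    journal.append((ts, block_no, data))

def rewind(point_in_time: float) -> dict:
    """Rebuild the device state as of `point_in_time` by replaying the journal."""
    state = {}
    for ts, block_no, data in journal:
        if ts > point_in_time:
            break  # skip everything after the recovery point (e.g., ransomware writes)
        state[block_no] = data
    return state

write_block(1.0, 0, b"good data")
write_block(2.0, 0, b"encrypted by ransomware")
print(rewind(1.5)[0])  # -> b'good data'
```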

The journal also utilizes write and deletion protection on its S3 bucket, making it resilient to ransomware attacks itself. That said, simplyblock provides a sophisticated solution for disaster recovery and cybersecurity breaches without the need for manual backup management.

Simplyblock is Storage Optimization – just for you

Simplyblock provides a number of advantages for environments that utilize a large number of Amazon EBS gp2 or gp3 volumes. Thin provisioning enables you to consolidate unused storage capacity and minimize the spend. Due to automatic pool enlargement (growing the pool with additional EBS volumes or storage nodes), you’ll never run out of storage space while only provisioning what you actually need.

Together with automatic tiering, you can move infrequently used data blocks to warm or even cold storage, fully transparent to the application. The same is true for our disaster recovery: built into the storage layer, every application can benefit from point-in-time recovery, reducing the RPO (Recovery Point Objective) of your whole infrastructure to near zero. And with consistent snapshots across volumes, you can perform a full-blown infrastructure recovery in case of an availability zone outage, right from the ground up.

With simplyblock you get more features than mentioned here. Get started right away and learn about our other features and benefits.

IOPS vs Throughput vs Latency – Storage Performance Metrics

IOPS, throughput, and latency are interrelated metrics that provide insight into the read, write, and access performance of storage entities and network interconnects.

Measuring the performance of a storage solution isn’t hard, but understanding the measured values is. IOPS, throughput, and latency are related to each other, but how? In this blog post, I try to explain how they are related and what you should know to really get the most out of your performance tests.

IOPS, throughput, and latency are all very important metrics. HDDs (spinning disks), SSDs (meaning SATA- or SAS-connected ones), and NVMe devices (connected through PCIe one way or another) have very different performance profiles.

Simplyblock’s storage engine uses NVMe disks and the NVMe protocol to provide its virtual (or logical) storage volumes to the outside world. Enough of simplyblock, though, let’s get into the performance metrics!

What is IOPS?

IOPS defines how many read and/or write operations can be executed per second (random or sequential).

IOPS (spoken as “eye-ops”) means input/output operations per second. The IOPS value basically says how many read or write operations a storage device can perform (and sustain) in a single second.

Reads are operations that access information stored on the storage device, while writes perform operations adding new or updating existing information on the disk.

As a good rule of thumb, higher IOPS means better performance. However, it is important to understand how IOPS relates to throughput and block size. A measured or calculated amount of IOPS for one specific block size doesn’t necessarily translate to the same amount of IOPS with another block size. We’ll come back to block size later.

How to Calculate IOPS?

Calculating IOPS is a fairly simple formula. We take the number of read and write operations performed (combined) and divide it by the measurement time in seconds.

IOPS = (ReadOperations + WriteOperations) / TimeInSeconds

With that knowledge we can actually simplify the calculation down to a single measurement of read and write throughput.

IOPS = (ReadThroughput + WriteThroughput) / BlockSize
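
As a small worked example of both formulas (with made-up numbers):

```python
# Both formulas with made-up numbers: 300,000 operations over 60 seconds.
read_ops, write_ops = 240_000, 60_000
duration_s = 60
print((read_ops + write_ops) / duration_s)  # 5000.0 IOPS

# The same result from measured throughput and block size (4 KiB):
block_size = 4 * 1024
read_tp = read_ops * block_size / duration_s    # bytes per second
write_tp = write_ops * block_size / duration_s
print((read_tp + write_tp) / block_size)        # 5000.0 IOPS again
```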

With spinning disks, it is a bit more complicated because of additional seek times. Since spinning disks aren’t used for high-performance storage anymore, I’ll leave them out.

Anyhow, while this sounds easy enough, sometimes write operations require additional read operations to succeed. That can happen in a RAID setup (or similar technologies such as erasure coding), where multiple pieces of information are “mathematically” combined into recovery information (often called parity data). Meaning, we take two (or more) pieces of input information, combine them through some mathematical formula, and write the result to a different disk.

As a result, we can lose one of the inputs and still use the calculation result together with the other input to reverse the mathematical operation and recover the lost information. Anyway, the consequence is that we have one additional read and one additional write for every write operation:

  • Write: New information to write
  • Read: Second information input for mathematical calculation
  • Write: Recovery calculation result

For reading data from such a setup, as long as the information to be read isn’t lost, it’s a single read operation. In the case of recovering the information due to a disk failure (or any other issue), there will be two read operations (the parity and the second input), as well as the reversed mathematical operation.
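
In the simplest case, that “mathematical formula” is a plain XOR, as used in RAID-5-style parity. A tiny sketch shows both the extra work per write and the recovery path:

```python
# XOR parity: the simplest instance of the "mathematical connection" above.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

disk_a = b"\x12\x34"           # first input
disk_b = b"\xab\xcd"           # second input
parity = xor(disk_a, disk_b)   # recovery information, stored on a third disk

# Writing disk_a costs: 1 write (data) + 1 read (disk_b) + 1 write (parity).
# If disk_a is lost, reverse the operation with what's left:
recovered = xor(parity, disk_b)
assert recovered == disk_a     # -> the lost information is back
```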

When calculating how many IOPS a storage solution can sustain, we need to take those cases into account. Finally, some storage solutions may have slightly asymmetric write and read performance, meaning they aren’t equally fast in both directions. That said, to find the maximum IOPS value, we may have to test with a 60:40 read-write ratio, or any other one. This is a plain trial-and-error situation. It’s always best to test with your specific use case and read-write ratio.

What is Throughput?

Throughput is the measurement of how much data can be transferred (read or written) in a given amount of time. For storage solutions, it’s typically measured in bytes per second. Just be careful, since there are metrics using bits per second, too, especially when your storage is connected via a network link (like Ethernet or Fibre Channel).

A higher throughput normally means better performance, as in faster data transfers. In comparison to IOPS, throughput tells us the specific amount of data that can be read or written in a given amount of time. This metric can be used directly to understand whether a storage solution is “fast enough.” It will not tell us whether it can sustain a certain pattern of read or write operations, though.

How to Calculate Throughput?

To calculate throughput without measuring it, we take the number of IOPS (read and write) and multiply it by the configured block size.

Throughput = (Read IOPS + Write IOPS) x BlockSize

We differentiate between read and write IOPS because the maximum amount of IOPS in one direction may not necessarily equal the other direction. See the How to Calculate IOPS section for more information.
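
A quick worked example (again with made-up numbers):

```python
# Throughput from IOPS and block size (made-up numbers).
read_iops, write_iops = 3_000, 1_000
block_size = 64 * 1024                          # 64 KiB

throughput = (read_iops + write_iops) * block_size
print(f"{throughput / 2**20:.0f} MiB/s")        # -> 250 MiB/s
```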

IOPS vs Throughput

IOPS and throughput are related to each other. That means, to give a full picture of storage performance, both values are required.

Throughput gives us a feeling of how much actual data we can read or write in a given amount of time. If we know that we have to write a 1GB file in less than 10 seconds, we know what write throughput is required.

On the other hand, if we know that our database makes many small read operations, we want to know the potential read IOPS limit of our storage.

Anyhow, there is more to performance than those two metrics: latency.

Latency is defined as the time needed for a data packet to make a full round trip, while throughput is the quantity of data that can be sent and received within a given unit of time.
Figure 1: How are latency and throughput defined?

What is Latency?

Latency, in storage systems, describes the amount of time a storage entity needs to process a single data request. With spinning disks, that included the seek time (the time to position the read-write heads and for the platter to rotate to the correct position), but it also includes computation time inside the disk controller.

The latter still holds true for flash storage devices. Due to features such as wear leveling, the controller has to calculate where a piece of information is stored at the current point in time.

That said, latency directly affects how many IOPS can be performed, and hence the throughput. Thanks to flash technology, latency these days is fairly consistent (per device) and doesn’t depend on the physical location anymore. With spinning disks, the latency differed depending on the current and new positions of the heads and whether the information was stored on the inside or outside of the platter. Luckily, we can ignore this today. Measuring the latency once per disk should be enough.

IOPS, throughput, and latency are directly related to each other but provide different information.

While IOPS tells us how many actual operations we can perform per second, throughput tells us how much data can be transferred at the same time. Latency, however, tells us how long we have to wait for this operation to be performed (or finished, depending on what type of latency you’re looking for).

There is no single best metric, though. When you want to understand the performance characteristics of a storage entity, all three metrics are equally important, though for a specific question, one may carry more weight than the others.

Reads and Writes

So far, we have always talked about reading and writing. We also mentioned that their ratio may impact the overall performance. In addition, we briefly talked about RAID and erasure coding (and now I’ll just throw mirroring and replication into the mix).

Diverging read and write speeds mostly come from hardware implications, such as flash cells needing to be erased before being written. That, inherently, makes writes slower.

In other situations, such as RAID and similar setups, additional reads or writes have to be executed, so the effective write IOPS value may differ from the physically possible write IOPS.

Generally, it’s a valid approach to try to find the absolute maximum read and write values (in terms of throughput and IOPS) by trying out different read-write ratios in the same test. I’d recommend starting with a 50:50 ratio, then going 60:40 or 40:60, seeing which one yields the higher result, and then making smaller steps (increments or decrements of 1 or 5) until you get the highest numbers.
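
If you want to automate such a sweep, a minimal sketch driving fio from Python could look like the following. It assumes fio is installed on a Linux system with the libaio engine; the flags and JSON fields follow fio’s documented interface, but verify them against your fio version, and point --filename at a disposable test file:

```python
# Sweep read/write ratios with fio to look for the peak combined IOPS.
# Assumes fio is installed (Linux, libaio engine); point --filename at a
# disposable test file or device before running!
import json
import subprocess

def combined_iops(rwmix_read: int) -> float:
    result = subprocess.run(
        ["fio", "--name=ratio-sweep", "--filename=/tmp/fio-testfile",
         "--size=1G", "--rw=randrw", f"--rwmixread={rwmix_read}",
         "--bs=4k", "--ioengine=libaio", "--iodepth=32", "--direct=1",
         "--runtime=30", "--time_based", "--output-format=json"],
        capture_output=True, text=True, check=True)
    job = json.loads(result.stdout)["jobs"][0]
    return job["read"]["iops"] + job["write"]["iops"]

# Coarse pass over 40:60 .. 60:40, as suggested above; refine with
# smaller steps around the winner afterwards.
best = max(range(40, 65, 5), key=combined_iops)
print(f"Highest combined IOPS around a {best}:{100 - best} read:write mix")
```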

This doesn’t necessarily reflect your real-world pattern, though, because depending on the workload you run, you may have a different ratio requirement.

Why do I have more Reads than Writes?

Workloads like static asset web servers (found at CDN providers) generally have a much higher read than write rate. Their whole purpose is to cache data from other systems and serve it as fast and as many times as possible. In this case, the storage should provide the fastest read throughput, as well as the highest read IOPS.

While those systems need to store and update existing data in the cache, the extreme imbalance of reads over writes makes write performance (almost) negligible.

Another example of such a system may be an analytics database, which is filled with updated data irregularly but is used for analytical purposes most of the time. That means that while a lot of data is probably written in a short amount of time (ingress burst), the data is mostly read for running queries and analytics requests.

Why do I have more Writes than Reads?

Data lakes, or IoT databases, often see much higher writes than reads. In both situations we want to optimize the storage for faster writes.

Data lakes commonly collect data for later analytics. The collected data is often pre-evaluated and used as a training set for algorithms such as fraud detection.

With IoT databases, the amount of data being delivered from IoT devices is often much higher than what is necessary to present to the user. For that reason, many databases optimized for IoT data (or time series data in general) have features to pre-aggregate data for dashboards, hence increasing the write rates even higher. Anyhow, many such databases are optimized to hold the active data set (the data commonly presented to users) in RAM to prevent it from being read from disk, decreasing the typical read rates further.

Why do I have Writes Only?

Finally, there are systems that see pretty much only writes. These include mirrors or hot standby systems. In this case, changes on a primary system are replicated to the mirror and written down. While the system is “waiting” to become the primary, there are basically no read operations happening.

Reads and Writes Distribution

Depending on your workload, your read-write ratio will change. Before trying to find out if a storage solution fits your needs, you should figure out what your workload’s real-world distribution looks like. Only afterward is there a way to really make an educated guess.

Generally speaking, the read-write distribution will change the outcome of your tests. While a storage solution may boast a massive amount of IOPS, the questions are rather: Can it sustain those IOPS? At what block size can I expect them? And how will those IOPS values change when I change the block size?

Different workloads have different requirements. Just keep that in mind.

Why Block Size Matters

Finally, block size. We’ve talked about it over and over again. The block size defines how much data (in bytes) is read or written per operation. That said, 1,000 operations per second (IOPS) at 8KB each means that we manage to read 8,000KB (or 8MB) per second. We immediately see our throughput.

Does that mean that with 64KB block size, I get 8 times the speed? It depends. We all hate this answer, but it does.

It mostly depends on how the storage solution is built and how it is connected. Say you have a 1Gbit/s network interconnect to the storage solution; that gives you roughly 125MB/s through the wire. 64KB x 1000 IOPS results in 64MB/s, so it fits. If you increase the block size to 128KB, the network bandwidth becomes the limiting factor.

Anyhow, the network isn’t your only limiting factor. Transferring larger blocks takes more time, meaning your potential maximum IOPS will drop. What I didn’t tell you before: when running your benchmark and trying to find the perfect read-write ratio, you can also play with the block size values. There is a good chance that the default block size isn’t the perfect one.
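
The example above, turned into a quick calculation:

```python
# When does a 1 Gbit/s link (~125 MB/s) cap your IOPS?
link_bytes_per_s = 125 * 1000**2

def link_limited_iops(block_size: int) -> float:
    return link_bytes_per_s / block_size

print(link_limited_iops(64 * 1024))    # ~1907 -> 1000 IOPS at 64KB fits
print(link_limited_iops(128 * 1024))   # ~953  -> the wire now caps us below 1000
```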

Conclusion

That was a lot of information. Your two main metrics are IOPS and throughput. Latency should be measured but can be considered stable for most flash-based systems today. Block size is something to experiment with. It will directly impact both IOPS and throughput, up to the point where you hit another limiting factor (most likely some bandwidth limitation).

Up to that point, you can try and increase the block size for best performance. Remember, you may also have to make changes to your workload configuration or implementation to make use of the increased block size.

The following table is supposed to help a bit in terms of judgment of both IOPS and throughput.

| | IOPS | Throughput |
|---|---|---|
| Measurement | Input/output operations per second | (Mega-)bytes per second |
| Meaning | The number of input/output (read or write) operations per second. | The amount of data (in bytes) that can be transferred through the storage connection per second. |
| Difficulty | Requires specific software to measure, and results depend directly on the selected block size. For measurements, I’d recommend the command line tool fio. | Easy to measure. Most operating systems have built-in tools or provide them in their repositories (such as iostat). A specialized tool such as fio can help with benchmarking. |
| Where it helps | Good measurement of random and small operations, where raw throughput is misleading due to latency and queuing of storage requests. | Good measurement of large, sequential data transfers. |
| Where it doesn’t help | Doesn’t say much about the amount of data that can be transferred. Also very much dependent on the chosen block size. | If you have a lot of random I/O, throughput will not be a good measurement due to the influences of latency and queuing of storage requests. |

That said, both metrics, IOPS and throughput, are useful for assessing the performance of a storage system. They shouldn’t be used in isolation, though.

Simplyblock provides a disaggregated storage solution for latency-sensitive and high-IO workloads in the cloud. To achieve that, simplyblock creates a unified storage pool of NVMe disks connected to the simplyblock cluster nodes (virtual machines) and offers logical volumes through NVMe over TCP. Those logical volumes are distributed across all connected cluster nodes and disks and secured using erasure coding, bringing the highest performance without sacrificing fault tolerance. Learn more about simplyblock now, or get started right away.
