Recovery Point Objective Archives | simplyblock

Simplyblock for AWS: Environments with many gp2 or gp3 Volumes

Michael Schmidt — Thu, 19 Sep 2024 21:49:02 +0000

When operating your stateful workloads in Amazon EC2 and Amazon EKS, data is commonly stored on Amazon’s EBS volumes. AWS supports a set of different volume types which offer different performance characteristics. The most commonly used ones are gp2 and gp3 volumes, providing a good combination of performance, capacity, and cost efficiency. So why would someone need an alternative?

For environments with high-performance requirements such as transactional databases, where low-latency access and optimized storage costs are key, alternative solutions are essential. This is where simplyblock steps in, offering a new way to manage storage that addresses common pain points in traditional EBS or local NVMe disk usage—such as limited scalability, complex resizing processes, and the cost of underutilized storage capacity.

What is Simplyblock?

Simplyblock is known for providing top performance based on distributed (clustered) NVMe instance storage at low cost with great data availability and durability. Simplyblock provides storage to Linux instances and Kubernetes environments via the NVMe block storage and NVMe over Fabrics (using TCP/IP as the underlying transport layer) protocols and the simplyblock CSI Driver.

Simplyblock’s storage orchestration technology is fast. The service provides access latency between 100 µs and 500 µs, depending on the IO access pattern and deployment topology. That means simplyblock’s access latency is comparable to, or even lower than, that of Amazon EBS io2 volumes, which typically provide between 200 µs and 300 µs.

To make sure we only provide storage which will keep up, we test simplyblock extensively. With simplyblock you can easily achieve more than 1 million IOPS at a 4 KiB block size on a single EC2 compute instance. This is several times higher than the most scalable Amazon EBS volumes, io2 Block Express. On the other hand, simplyblock’s cost of capacity is comparable to io2. However, with simplyblock, IOPS come at no extra charge. Therefore, depending on the capacity-to-IOPS ratio of io2 volumes, it is possible to achieve cost advantages of up to 10x.

For customers requiring very low storage access latency and high IOPS per TiB, simplyblock provides the best cost efficiency available today.

Why Simplyblock over Simple Amazon EBS?

Many customers are generally satisfied with the performance of their gp3 EBS volumes. Access latency of 6 to 10 ms is fine for them, and they never have to go beyond the included 3,000 IOPS (on gp2 and gp3). They should still care about simplyblock, because there is more. Much more.

Benefits of Thin Provisioning

With gp3, customers pay for provisioned rather than utilized capacity (~USD 80 per TiB provisioned). According to our research, the average utilization of Amazon EBS gp3 volumes is only ~30%. This means that customers are actually paying more than three times the list price per TiB of utilized storage: at a utilization below one-third, the effective price comes to about USD 250 per utilized TiB. The higher the utilization, the closer a customer gets to the nominal USD 80 per TiB.
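The relationship between utilization and effective price can be checked with a few lines of arithmetic. The sketch below uses only the approximate figures quoted above (USD 80 per provisioned TiB, ~30% utilization); the function name is made up for illustration. At exactly 30% the result is somewhat above the rounded "about USD 250" figure, which is consistent with it being an approximation.

```python
def effective_price_per_used_tib(price_per_provisioned_tib: float, utilization: float) -> float:
    """Price paid per TiB that actually holds data."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_provisioned_tib / utilization


GP3_PRICE = 80.0  # approximate USD per provisioned TiB, as quoted in the text

for utilization in (0.30, 0.50, 1.00):
    price = effective_price_per_used_tib(GP3_PRICE, utilization)
    print(f"{utilization:.0%} utilized: USD {price:.0f} per utilized TiB")
```

At full utilization the effective price equals the list price, which is why higher utilization moves a customer toward USD 80 per TiB.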

In addition to the price inefficiency, customers also have to manage the resizing of gp3 volumes when utilization reaches the current capacity limit. However, resizing has its own limitations: in EBS, it is only possible once every six hours. To mitigate potential issues during that time, volumes are commonly doubled in size.

On the other hand, simplyblock provides thin-provisioned logical volumes. This means that you can provision your volumes with nearly no restriction on size. Think of growable partitions that are sliced out of the storage pool. Logical volumes can also be over-provisioned, meaning you can set the requested storage capacity to exceed the storage pool’s current size. There is no charge for the over-provisioned capacity as long as you do not use it.

That said, simplyblock thinly provisions NVMe volumes from a storage pool which is either made up of distributed local instance storage or gp3 volumes. The underlying pool is resized before it runs out of storage capacity.
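To make the idea concrete, here is a toy model of a thin-provisioned pool. It is a sketch under stated assumptions, not simplyblock's implementation: the class name, the 80% growth threshold, and the 100 GiB growth step are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThinPool:
    """Toy model of a thin-provisioned storage pool (all sizes in GiB)."""
    capacity: int                 # physical capacity currently backing the pool
    grow_step: int = 100          # hypothetical amount added per enlargement
    grow_threshold: float = 0.8   # hypothetical fill level that triggers growth
    requested: dict = field(default_factory=dict)  # volume name -> requested size
    used: dict = field(default_factory=dict)       # volume name -> GiB actually written

    def create_volume(self, name: str, size: int) -> None:
        # No physical space is reserved, so size may exceed the pool's capacity.
        self.requested[name] = size
        self.used[name] = 0

    def write(self, name: str, amount: int) -> None:
        if self.used[name] + amount > self.requested[name]:
            raise ValueError("write exceeds the volume's requested size")
        self.used[name] += amount
        # Enlarge the pool before it runs out of physical capacity.
        while sum(self.used.values()) > self.capacity * self.grow_threshold:
            self.capacity += self.grow_step


pool = ThinPool(capacity=100)
pool.create_volume("db", 1000)   # over-provisioned: 1000 GiB requested, 100 GiB backing
pool.write("db", 90)
print(pool.capacity, sum(pool.requested.values()), sum(pool.used.values()))
```

Only the written 90 GiB consume physical capacity; the requested 1000 GiB are a limit, not a reservation.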

Together, these mechanisms enable you to save massively on storage while also simplifying your operations. No more manual or script-based resizing! No more custom alerts before running out of storage!

Benefits of Storage Tiering

But if you feel there should be even more potential to save on storage, you are absolutely right!

The data stored on a single EBS volume has very different access patterns. Let’s explore together what the average database setup looks like. A typical corporate transactional database will easily qualify as “hot” storage. It is commonly stored on SSD-based EBS volumes. Nobody would think of putting this database on slow HDD-based file storage or on Amazon S3.

In reality, however, data that belongs to a database is never homogeneous when it comes to performance requirements. There is, for example, the so-called database transaction log, often referred to as write-ahead log (WAL) or simply a database journal. The WAL is quite sensitive to access latency and requires a high IOPS rate for writes. On the other hand, the log is relatively small compared to the entire dataset in the database.

Furthermore, some other data files store tablespaces and index spaces. Many of them are read so frequently that they are always kept in memory. They do not depend on storage performance. Others are accessed less frequently, meaning they have to be loaded from storage every time they’re accessed. They require solid storage performance on read.

Last but not least, there are large tables which are commonly used for archiving or document storage. They are written or read infrequently and typically in large IO sizes (batches). While throughput speed is relevant for accessing this data, access latency is not.

To support all of the above use cases, simplyblock supports automatic tiering. Our tiering places less frequently accessed data on either Amazon EBS (st1) or Amazon S3, called warm storage. The tiering implementation is optimized for throughput, hence large amounts of data can be written or read in parallel. Simplyblock automatically identifies individual segments of data which qualify for tiering and moves them to secondary storage. Only after tiering has succeeded are they cleaned up on the “hot” tier. This reduces the storage demand in the hot pool.

The AWS cost ratio between hot and warm storage is about 5:1, cutting cost to about 20% for tiered data. Tiering is completely transparent to you and data is automatically read from tiered storage when requested.

Based on our observations, up to 75% of all stored data can often be tiered to warm storage. This creates another massive potential for storage cost savings.
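Combining the two figures above (a warm tier at about 20% of the hot-tier cost, and up to 75% of data being tierable) gives a rough upper bound on the savings. The helper below is only a back-of-the-envelope sketch; its name is made up, and real savings depend on how much data actually qualifies for tiering.

```python
def blended_cost_factor(tiered_fraction: float, warm_to_hot_cost: float) -> float:
    """Cost relative to keeping everything hot, after tiering part of the data."""
    hot_part = 1 - tiered_fraction
    warm_part = tiered_fraction * warm_to_hot_cost
    return hot_part + warm_part


factor = blended_cost_factor(tiered_fraction=0.75, warm_to_hot_cost=0.20)
print(f"remaining cost: {factor:.0%}, savings: {1 - factor:.0%}")
```

With these inputs the remaining cost is roughly 40% of the all-hot cost, that is, savings of roughly 60% in the best case described.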

How to Prevent Data Duplication

But there is yet more to come.

AWS gp3 volumes do not allow multi-attach, meaning the same volume cannot be attached to multiple virtual machines or containers at the same time. Furthermore, their reliability is also relatively low (indicated at 99.8% to 99.9%) compared to Amazon S3.

That means neither a loss of availability nor a loss of data can be ruled out in case of an incident.

Therefore, additional steps need to be taken to increase availability of the storage consuming service, as well as the reliability of the storage itself. The common measure is to employ storage replication (RAID-1, or application-level replication). However, this leads to additional operational complexity, utilization of network bandwidth, and to a duplication of storage demand (which doubles the storage capacity and cost).

Simplyblock mitigates the requirement to replicate storage. First, the same thinly provisioned volume can be attached to more than one Amazon EC2 instance (or container) and, second, the reliability of each individual volume is higher (99.9999%) due to the internal use of erasure coding (parity data) to protect the data.
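To illustrate how parity data can replace full copies, the sketch below uses the simplest possible erasure code: a single XOR parity block over three data blocks. This is an illustrative stand-in only; the text does not specify which erasure coding scheme simplyblock uses, and the function name is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]   # three data blocks on three different nodes
parity = xor_blocks(data)            # one extra block instead of three full copies

# Simulate losing the second block and rebuild it from the survivors plus parity.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

Here the overhead is one additional block for three data blocks, compared with a full second copy for RAID-1 style replication.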

Multi-attach helps to cut the storage cost by half.

The Cost of Backup

Last but not least, backups. Yes, there is even more.

A snapshot taken from an Amazon EBS volume is stored in an S3-like storage. However, AWS charges significantly more per TiB for it than for the same storage directly on S3: about 3.5 times as much.

Snapshots taken from simplyblock logical volumes, however, are stored in a standard Amazon S3 bucket and billed at standard S3 pricing, giving you yet another nice cost reduction.

Near-Zero RPO Disaster Recovery

There is one more feature that we really want to talk about: disaster recovery, which is optional. Our DR comes with a minimal RPO and can be deployed without any redundancy on either the block storage or the compute layer between zones. Additionally, no data transfers between zones are needed.

Simplyblock employs asynchronous replication to store any change on the storage pool to an S3 bucket. This enables a fully crash-consistent and near-real-time option for disaster recovery. You can bootstrap and restart your entire environment after a disaster. This works in the same or a different availability zone and without having to take care of backup management yourself.

And if something does happen, such as an accidental deletion or even a successful ransomware attack which encrypted your data, simplyblock is here to help. Our asynchronous replication journal provides full point-in-time recovery functionality on the block storage layer. There is no need for your service or database to support it. Just rewind the storage to any point in time in the past.

It also uses write and deletion protection on its S3 bucket, making the journal itself resilient to ransomware attacks. That said, simplyblock provides a sophisticated solution to disaster recovery and cybersecurity breaches without the need for manual backup management.
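The rewind idea can be sketched as replaying a journal of block writes up to a chosen timestamp. This is a conceptual illustration only: the entry layout, the function name, and the sample data are assumptions and do not describe simplyblock's actual journal format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    timestamp: float  # when the write was recorded
    block: int        # logical block address
    data: bytes       # block contents after the write


def restore_at(journal, point_in_time):
    """Rebuild the block contents as they were at point_in_time."""
    volume = {}
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > point_in_time:
            break  # everything after the chosen point is ignored
        volume[entry.block] = entry.data
    return volume


journal = [
    JournalEntry(1.0, 0, b"customers-v1"),
    JournalEntry(2.0, 1, b"orders-v1"),
    JournalEntry(3.0, 0, b"ENCRYPTED!!!"),  # e.g. ransomware overwrites block 0
]

restored = restore_at(journal, point_in_time=2.5)
print(restored)
```

Because the replay happens on the block layer, the database on top needs no recovery support of its own; it simply sees the volume as it was before the damaging write.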

Simplyblock is Storage Optimization – just for you

Simplyblock provides a number of advantages for environments that utilize a large number of Amazon EBS gp2 or gp3 volumes. Thin provisioning enables you to consolidate unused storage capacity and minimize spend. Thanks to automatic pool enlargement (increasing the pool with additional EBS volumes or storage nodes), you’ll never run out of storage space while requiring only the minimum amount of it.

Together with automatic tiering, you can move infrequently used data blocks to warm or even cold storage, fully transparent to the application. The same is true for our disaster recovery. Built into the storage layer, it lets every application benefit from point-in-time recovery, bringing the RPO (Recovery Point Objective) close to zero across your whole infrastructure. And with consistent snapshots across volumes, you can enable a full-blown infrastructure recovery in case of an availability zone outage, right from the ground up.

With simplyblock you get more features than mentioned here. Get started right away and learn about our other features and benefits.


Disaster Recovery with Simplyblock in AWS

Michael Schmidt — Fri, 06 Sep 2024 23:41:03 +0000

When disaster strikes, a great recovery strategy is required. Oftentimes, deficiencies are only discovered when it’s already too late. Simplyblock provides comprehensive disaster recovery support for databases, file storage, and whole infrastructures, enabling a restore from the ground up in a different availability zone with minimal RTO (Recovery Time Objective) and near-zero RPO (Recovery Point Objective).

Amazon EBS, Amazon S3, and Local Instance Storage

AWS’ cloud block storage (Amazon EBS) is a great product, providing a multitude of different product types depending on your performance (random IOPS, access latency) requirements. However, the provided durability is limited. Depending on the EBS volume type, AWS provides a durability indicator between 99.8% and 99.999%. The bigger issue, though, is that in case of a disaster in your availability zone (AZ), storage will become unavailable in its entirety and, depending on the type of the disaster, data may actually be lost (partially or in full).

The durability is even worse with local instance storage. Local instance storage consists of NVMe disks which are physically located on the virtual machine host that runs your workload. That said, all data stored on local instance storage is immediately lost once the instance is turned off or a failure occurs on the physical host.

Amazon S3 storage, on the other hand, is considered to be extremely durable, offering 99.999999999% durability. In addition, it is replicated across availability zones. Therefore, the probability of data loss by any kind of disaster is close to zero. To our knowledge, and as of time of writing, it has never actually happened. In terms of durability, Amazon S3 is king. That durability, however, is traded for latency.

Data Protection for Amazon EBS

As shown, all persistent (meaning, non-ephemeral) data stored in Amazon EBS requires additional means of protection. That said, the most common way to protect your data is taking a snapshot of your EBS volume and backing it up to Amazon S3.

Those S3 backups have a number of important drawbacks, though.

A snapshot-based backup always implicitly means that you’ll have data loss of some kind. Data which has been written between the last backup and the time of the failure is irrecoverably lost. No restore procedure will be able to recover it. For low velocity data (data which is rarely changed) that may be a minor issue. Examples of this kind of data may be media files or archived documents. However, the data loss can be catastrophic for other types of data, such as that of transactional systems.

Multiple backups of different systems aren’t consistent with each other. The backup of one database may not fit the backup of another database or a file repository. That said, after restoration the systems may have inconsistent data states and will not integrate correctly. Bringing a collection of systems with backups taken at different times back into a working state can be a massive manual effort. Sometimes it is even impossible.

Backup management is a significant effort. To free up disk space, it is necessary to remove snapshots from EBS after moving them to S3. Furthermore, backups have to be configured with retention policies. The successful operation of taking backups must be monitored, and backups have to be tested regularly to make sure it is possible to restore them successfully.

Last but not least, human error in backup management may lead to missing or corrupted backups.

Data Protection for Amazon EBS with Simplyblock

Simplyblock provides a smart solution to the consistent recovery of hot data after a major incident or even a zone-level disaster.

First and foremost, simplyblock logical volumes store data synchronously in the hot tier storage backend. In addition, data is also written into an asynchronously replicated write-ahead log (WAL). Writing this log is optimized for high throughput to secondary (low IOPS) storage such as S3 or HDD pools (e.g. the Amazon EBS st1 service). Last but not least, the WAL is efficiently compacted at regular intervals to limit storage growth and optimize recovery times.
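Why compaction limits storage growth and shortens recovery can be shown with a toy log. The sketch below assumes a WAL made of `(sequence, block, data)` tuples and a compaction rule that keeps only the newest record per block; both are illustrative assumptions, not a description of simplyblock's on-disk format.

```python
def replay(wal):
    """Apply log records in sequence order and return the final block contents."""
    state = {}
    for _seq, block, data in sorted(wal):
        state[block] = data
    return state


def compact(wal):
    """Keep only the newest record per block; older overwrites are dropped."""
    latest = {}
    for seq, block, data in wal:
        if block not in latest or seq > latest[block][0]:
            latest[block] = (seq, block, data)
    return sorted(latest.values())


wal = [(1, 7, b"a"), (2, 9, b"b"), (3, 7, b"c"), (4, 7, b"d")]
compacted = compact(wal)

print(len(wal), len(compacted))
print(replay(wal) == replay(compacted))
```

The compacted log is shorter but replays to the same final state, so a restore has fewer records to read. Note that dropping intermediate versions this aggressively would give up the ability to rewind to them; a real system with point-in-time recovery has to retain history for its recovery window.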

Simplyblock’s logical volumes inherently support snapshots. Due to the copy-on-write nature of simplyblock, snapshots are taken immediately and, together with the WAL, asynchronously replicated to S3.

Data recovery, on the other hand, restores all live volumes and snapshots in a fully consistent manner. The asynchronicity of the replication limits data loss to a few hundred milliseconds.
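The difference between the two approaches can be expressed as the window of writes that are not yet durable when a failure hits. The sketch below uses made-up numbers: one write per minute, a snapshot taken 30 minutes before the failure, and an assumed replication lag of 300 ms (standing in for the "few hundred milliseconds" mentioned above).

```python
def lost_writes(write_times, failure_time, last_durable_time):
    """Writes issued after the last durable point and up to the failure."""
    return [t for t in write_times if last_durable_time < t <= failure_time]


writes = [float(t) for t in range(0, 3600, 60)]  # one write per minute for an hour
failure = 3599.0                                 # failure just before the hour ends

# Snapshot-based backup: last snapshot taken at the 30-minute mark.
snapshot_loss = lost_writes(writes, failure, last_durable_time=1800.0)

# Asynchronous replication: durable up to ~300 ms before the failure (assumed lag).
replicated_loss = lost_writes(writes, failure, last_durable_time=failure - 0.3)

print(len(snapshot_loss), len(replicated_loss))
```

With snapshots, every write since the last snapshot is gone; with continuous asynchronous replication, only writes inside the short lag window are at risk.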

Disaster Recovery with Near-Zero RPO

The solution stores all “hot” data either in distributed instance storage or within gp3 pools, providing the necessary online performance of storage. At the same time, all data is also asynchronously replicated into S3.

In case of a loss of the entire infrastructure in an availability zone (including the gp3 volumes and local instance storage) it is possible to consistently bootstrap the entire environment in a new AZ.

If a customer uses simplyblock to store not only the databases but also bootstrap and deployment information (like ArgoCD configuration, Terraform data, or similar), a recovery operation can consistently restore the entirety of the infrastructure from the ground up. Using this strategy, infrastructures supported by simplyblock can be recovered consistently and fully automatically, with near-zero RPO and a low RTO.

For this purpose, the “primary” simplyblock storage pod, which contains all data required for bootstrapping, has to be restarted in a new zone and connected to the control plane. Afterwards, all storage is consistently accessible.

First, infrastructure templates and configurations for the environment are retrieved, after which the deployment scripts are run and the infrastructure is redeployed. In this process, databases, documents, and other file stores can already be connected to their corresponding volumes, which contain all of the data in a crash-consistent manner.

At a later stage, “secondary” storage plane pods can be restarted within the new availability zone and data will be recovered.

The recovery time depends largely on the amount of data and the instance network bandwidth. Reads from S3 are optimized by using large, parallel requests wherever possible, so that hot data is pre-fetched as quickly as possible.
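As a rough planning aid, the dependency on data volume and bandwidth can be turned into a lower-bound estimate. The numbers and the `efficiency` parameter below are hypothetical; actual restore times also depend on S3 request behavior, parallelism, and how much data needs to be pre-fetched before services can start.

```python
def estimated_restore_seconds(data_gb: float, bandwidth_gbit_s: float,
                              efficiency: float = 1.0) -> float:
    """Lower-bound transfer time for data_gb gigabytes over the instance network."""
    if bandwidth_gbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    data_gbit = data_gb * 8  # gigabytes to gigabits
    return data_gbit / (bandwidth_gbit_s * efficiency)


# Hypothetical example: 1 TB of hot data over a 10 Gbit/s instance network.
seconds = estimated_restore_seconds(data_gb=1000, bandwidth_gbit_s=10)
print(f"about {seconds / 60:.0f} minutes at full line rate")
```

Lowering `efficiency` below 1.0 models the fact that a restore rarely saturates the link for its whole duration.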

Conclusion

All that said, simplyblock, the intelligent storage orchestrator, provides a powerful and feature-rich storage solution that is crash-consistent yet performant.

Built upon well-known storage solutions, such as local instance storage, Amazon EBS, and Amazon S3, simplyblock combines the ultra-low-latency access of NVMe volumes (pooled or unpooled) with the extreme durability of Amazon S3. Simplyblock’s write-ahead log and disaster recovery support enable the lowest RPO and minimal downtime, even in case of the loss of a full availability zone.

Get started with simplyblock today and learn all about the other amazing features simplyblock brings right to you.
