We Built a Tool to Help You Understand Your Real EBS Usage!
https://www.simplyblock.io/blog/ebs-volume-usage-exporter/ (Fri, 17 Jan 2025)

There is one question in life that is really hard to answer: “What is your actual AWS EBS volume usage?”

When talking to customers and users, this question is frequently left open with the note that they’ll check and tell us later. With storage being one of the main cost factors of cloud services such as Amazon’s AWS, this is not what it should be.

But who could blame them? It’s not like AWS makes it obvious how much of your storage resources (not only capacity, but especially IOPS and throughput) you really use. That might be bad for AWS’ revenue.

We just open-sourced our AWS EBS Volume Usage Exporter on GitHub. Get an accurate view of your EBS usage in EKS.

Why We Built This

We believe that there is no reason to pay more than necessary. However, since it’s so hard to get hard facts on storage use, we tend to overprovision—by a lot.

Hence, we decided to do something about it. Today, we’re excited to share our new open-source tool – the AWS EBS Volume Usage Exporter!

What makes this particularly interesting is that, based on our experience, organizations typically utilize only 20-30% of their provisioned AWS EBS volumes. That means 70-80% of provisioned storage sits idle, quietly adding to your monthly AWS bill and making someone happy, just not you.

What Our Tool Does

The EBS Volume Usage Exporter runs in your EKS cluster and collects detailed metrics about your EBS volumes, including:

  • Actual usage patterns
  • IOPS consumption
  • Throughput utilization
  • Available disk space
  • Snapshot information

All this data gets exported into a simple CSV file that you can analyze however you want.

If you like convenience, we’ve also built a handy calculator (which runs entirely in your browser – no data leaves your machine!) to help you quickly understand potential cost savings. Here’s the link to our EBS Volume Usage Calculator. You don’t have to use it, though: the exported data is easy enough to analyze for basic insights on its own. Our calculator just automates the pricing and potential-savings calculation based on current AWS price lists.
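As a rough sketch of the kind of arithmetic such a calculator performs (the gp3 price below is an illustrative assumption, not a live AWS price, and this is not simplyblock's actual calculator logic):

```python
# Hypothetical potential-savings estimate for over-provisioned gp3 volumes.
# GP3_PRICE_PER_GIB_MONTH is an assumed figure, not a live AWS price.
GP3_PRICE_PER_GIB_MONTH = 0.08

def monthly_waste(provisioned_gib: float, utilization: float) -> float:
    """Monthly cost of provisioned-but-unused capacity."""
    unused_gib = provisioned_gib * (1.0 - utilization)
    return unused_gib * GP3_PRICE_PER_GIB_MONTH

# Example: 10 TiB provisioned, only 25% actually used.
print(round(monthly_waste(10 * 1024, 0.25), 2))  # -> 614.4
```

In other words, at 25% utilization, three-quarters of what you pay for that volume buys nothing.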

Super Simple to Get Started

To get you started quickly, we packaged everything as a Helm chart to make deployment as smooth as possible. You’ll need:

  • An EKS cluster with cluster-admin privileges
  • An S3 bucket
  • Basic AWS permissions

The setup takes just a few minutes – we’ve included all the commands you need in our GitHub repository.

After a successful run, you can simply delete the Helm chart deployment and be done with it. The exported data is available for download in the provided S3 bucket.

We Want Your Feedback!

This is just the beginning, and we’d love to hear from you!

Do the calculated numbers match your actual costs?
What other features would you find useful?

We already heard people asking for a tool that can run outside of EKS, and we’re looking into it. We would also love to extend support to existing observability platforms such as Datadog, Dynatrace, and others. Most of the data is already available there and should be easy to extract.

For those storage pros out there who can answer the EBS utilization question off the top of your head – we’d love to hear your stories, too!

Share your experiences and help us make this tool even better.

Try It Out!

The AWS EBS Volume Usage Exporter is open source and available now on GitHub. Give it a spin, and let us know what you think!

And hey – if you’re wondering whether you really need this tool, ask yourself: “Do I know exactly how much of my provisioned EBS storage is actually being used right now?”

If there’s even a moment of hesitation, you should check this out!


At simplyblock, we’re passionate about helping organizations optimize their cloud storage. This tool represents our commitment to the open-source community and our mission to eliminate cloud storage waste.

AWS Cost Management: Strategies for Right-Sizing Storage in Dynamic Environments
https://www.simplyblock.io/blog/aws-cost-management-right-sizing-storage-in-dynamic-environments/ (Tue, 10 Dec 2024)

For companies that use Amazon Web Services (AWS) for storage, having firm control over costs is key. In a fast-paced world with fluctuating workloads, mismanaging storage can directly lead to soaring expenses and unexpected losses. A reliable and effective way for a business to keep costs in check is to implement efficient storage solutions at the right size. This ensures solid growth and performance, making it a plan that works for AWS-reliant businesses.

This article will look at several good ways to manage storage costs on AWS and give you practical tips to size storage in changing environments.

What Does Right-Sizing Mean for AWS Storage?

Choosing the right size in AWS means going with suitable storage solution types and scales based on the particular needs of a business. It’s a major strategic factor that can help you achieve and maintain a competitive lead in the market by avoiding unnecessary costs and putting a stop to overspending. All it takes is actively monitoring storage policies, checking how much storage usually goes unused, and proactively making prompt changes accordingly.

AWS offers various storage types, including Amazon S3 for object storage, Amazon EBS for block storage, and Amazon EFS for file storage, each suitable for different applications. By right-sizing, businesses can avoid paying for idle storage resources and only use what’s necessary.

AWS Storage Services and The Cost-Savings They Offer

With AWS, you get a few storage options at different price points that you can choose from based on your business needs:

  • Amazon S3 (Simple Storage Service) offers an incredible amount of scalability, allowing growing businesses to adapt well. It works well for unstructured data and uses a pay-as-you-go system, which keeps costs down when storage needs change.
  • Amazon EBS (Elastic Block Store) provides persistent block storage for EC2 instances. EBS prices vary based on the volume type, its size, and input/output activity, so you need to monitor usage to keep expenses in check.
  • Amazon EFS (Elastic File System) is a managed file storage service that scales automatically, which helps applications that need shared storage. While it’s convenient, costs can rise as data volume grows.

To reduce overall cloud spending, it’s essential to understand which storage type suits your workloads and to manage these services accordingly.

Editor’s note: If you’re looking for ways to consolidate your Amazon EBS volumes, simplyblock has you covered.

Ways to Optimize Storage Size on AWS

1. Use Storage Class Levels

Amazon S3 offers various storage classes with different costs and speeds. You can save money by placing data in the appropriate class based on access frequency and retrieval speed needs. Here’s a breakdown:

  • S3 Standard is best for frequently accessed data, but it’s the most expensive.
  • S3 Infrequent Access (IA) is cheaper for less-used data that still needs quick retrieval.
  • S3 Glacier and Glacier Deep Archive are the least expensive options for long-term archival data that is rarely accessed.

You can cut costs without losing access by reviewing and moving data to suitable storage classes based on usage patterns.


2. Set Up Data Lifecycle Rules

Managing data lifecycles helps companies with changing storage needs to save money. AWS lets you make rules to move, store, or delete data based on certain conditions. With S3 lifecycle policies, you can set up your data to move from S3 Standard to S3 IA and then to Glacier or be removed after a set time.
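As a sketch of what such a lifecycle rule might look like via the S3 API (the bucket name, key prefix, and day thresholds here are hypothetical; applying it requires boto3 and AWS credentials, so the actual API call is shown commented out):

```python
# Hypothetical S3 lifecycle configuration: Standard -> Standard-IA -> Glacier,
# then expiration. The prefix and day counts are illustrative assumptions.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-then-expire",
            "Filter": {"Prefix": "logs/"},  # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},  # delete objects after one year
        }
    ]
}

# Applying it would look like this (requires boto3 and credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket",  # hypothetical bucket
#     LifecycleConfiguration=lifecycle_config,
# )
print(len(lifecycle_config["Rules"][0]["Transitions"]))  # -> 2
```

Once attached to a bucket, S3 evaluates the rule automatically; no further scripting is needed.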

3. Use Automatic Tracking and Warnings

AWS offers tools to keep an eye on storage use and costs, like AWS CloudWatch and AWS Budgets. These tools help spot wasted resources, odd spikes in use, or costs that go over your set budget. Setting up warnings through AWS Budgets can tell you when you’re close to your budget limit and stop extra costs before they pile up.

4. Make EBS Volumes the Right Size


Elastic Block Store (EBS) volumes often waste resources when they’re bigger than needed. Checking EBS usage regularly can reveal volumes that are barely used or not used at all. AWS provides EBS right-sizing recommendations (via AWS Compute Optimizer) that help find volumes you can shrink without slowing things down.

EBS provides volume types such as General Purpose (gp3), Provisioned IOPS (io2), and Throughput Optimized HDD (st1). Picking the right volume type and size for each workload cuts costs: you pay only for the storage performance and capacity you actually need.

Editor’s note: Save up to 80% on your high-performance AWS storage costs.

Smart Ways to Cut AWS Storage Costs Further


1. Use Reserved Instances and Savings Plans

For workloads you can predict, think about AWS Reserved Instances (RIs) and Savings Plans. Strictly speaking, these discount compute (EC2) rather than EBS itself, but committing to a predictable usage level for one or three years gives you lower rates and makes it much easier to right-size the EBS storage attached to those instances. This works best for steady workloads that need the same amount of storage over time, where you’re less likely to overbuy.

Cost savings shouldn’t be limited to storage, either. You can also reconsider other expenses, such as switching to a cheaper web hosting service that still meets your business needs.

2. Make Multi-Region Storage More Efficient

AWS gives you ways to copy your data across different regions. This makes your data more secure and helps you recover if something goes wrong. But storing data in multiple regions can cost a lot because of copying and moving data between regions. To cut these costs, you can look at how people use your data and put it in regions close to most of your users.

3. Consider Spot Instances for Short-Term Storage Needs

Spot Instances offer a more affordable option to handle tasks that can cope with interruptions. You can use short-term storage on Spot Instances for less crucial brief projects where storage requirements fluctuate. When you combine Spot Instances with Amazon EBS or S3, you gain flexibility and cut costs. However, remember that AWS has the right to reclaim Spot Instances at any moment. This makes them unsuitable for critical or high-availability tasks.

Summing Up: Managing AWS Storage Costs


Smart AWS cost control begins with a hands-on strategy for sizing storage: picking the right S3 storage classes, setting up lifecycle rules, keeping an eye on EBS usage, and taking advantage of reserved options. These methods can help you keep a lid on your storage bills.

When you check usage and put these tried-and-true tips into action, you’ll be in a better position to handle your AWS expenses. At the same time, you’ll keep the ability to scale and the reliability your workloads need. In a cloud world where storage costs can get out of hand, clever management will pay off. It’ll help your company stay nimble and budget-friendly.

The post AWS Cost Management: Strategies for Right-Sizing Storage in Dynamic Environments appeared first on simplyblock.

]]>
Local NVMe Storage on AWS – Pros and Cons
https://www.simplyblock.io/blog/local-nvme-storage-aws/ (Thu, 03 Oct 2024)
What is the Best Storage Solution on AWS?

The debate over the optimal storage solution has been ongoing. Local instance storage on AWS (i.e., ephemeral NVMe disks attached to EC2 instances) brings a remarkable cost-performance ratio. It offers 20 times better performance and 10 times lower access latency than EBS, making it a powerhouse for quick, ephemeral storage needs. In simple words, a local NVMe disk is very fast and relatively cheap, but neither scalable nor persistent.

Recently, Vantage posted an article titled “Don’t use EBS for Cloud Native Services“. We agree with the problem statement; however, we also strongly believe there is a better solution than using local NVMe SSD storage on AWS as an alternative to EBS. Comparing local NVMe to EBS isn’t comparing apples to apples; it’s more like apples to oranges.

The Local Instance NVMe Storage Advantage

Local storage on AWS excels in speed and cost-efficiency, delivering performance that’s 20 times better and latency that’s 10 times lower compared to EBS. For certain workloads with temporary storage needs, it’s a clear winner. But, let’s acknowledge the reasons why data centers have traditionally separated storage and compute.

Overcoming Traditional Challenges of Local Storage

  • Scalability: local storage has limited capacity and cannot be resized dynamically; simplyblock scales dynamically.
  • Reliability: local storage loses data if the instance is stopped or terminated; simplyblock provides advanced data protection, and data survives an instance outage.
  • High availability: local storage becomes inaccessible during a compute instance outage; with simplyblock, storage remains fully available.
  • Data protection efficiency: not available with local storage; simplyblock uses erasure coding instead of three replicas, reducing network load and improving the effective-to-raw storage ratio by a factor of about 2.5x.
  • Predictability/consistency: local storage access latency increases with rising IOPS demand; simplyblock delivers constant access latencies.
  • Maintainability: compute instance upgrades impact local storage; with simplyblock, instances can be upgraded and maintained without impact on storage.
  • Data services offloading: not available with local storage; simplyblock offloads data services such as volume snapshots, copy-on-write cloning, instant volume resizing, erasure coding, encryption, and data compression with no impact on local CPU, performance, or access latency.
  • Intelligent storage tiering: not available with local storage; simplyblock automatically moves infrequently accessed data chunks from expensive, fast storage to cheap S3 buckets.

Simplyblock provides an innovative approach that marries the cost and performance advantages of local instance storage with the benefits of pooled cloud storage. It offers the best of both worlds—high-speed, low-latency performance close to that of local storage, coupled with the robustness and flexibility of pooled cloud storage.

Why Choose simplyblock on AWS?

  1. Performance and Cost Efficiency: Enjoy the benefits of local storage without compromising on scalability, reliability, and high availability.
  2. Data Protection: simplyblock employs advanced data protection mechanisms, ensuring that your data survives any instance outage.
  3. Seamless Operations: Upgrade and maintain compute instances without impacting storage, ensuring continuous operations.
  4. Data Services Galore: Unlock the potential of various data services without affecting local CPU performance.

While local instance storage has its merits, the future lies in a harmonious blend of the speed of local storage and the resilience of cloud-pooled storage. With simplyblock, we transcend the limitations of local NVMe disk, providing you with a storage solution that’s not just powerful but also versatile, scalable, and intelligently designed for the complexities of the cloud era.

Simplyblock for AWS: Environments with many gp2 or gp3 Volumes
https://www.simplyblock.io/blog/aws-environments-with-many-ebs-volumes/ (Thu, 19 Sep 2024)
When operating your stateful workloads in Amazon EC2 and Amazon EKS, data is commonly stored on Amazon’s EBS volumes. AWS supports a set of different volume types which offer different performance characteristics. The most commonly used ones are gp2 and gp3 volumes, providing a good combination of performance, capacity, and cost efficiency. So why would someone need an alternative?

For environments with high-performance requirements such as transactional databases, where low-latency access and optimized storage costs are key, alternative solutions are essential. This is where simplyblock steps in, offering a new way to manage storage that addresses common pain points in traditional EBS or local NVMe disk usage—such as limited scalability, complex resizing processes, and the cost of underutilized storage capacity.

What is Simplyblock?

Simplyblock is known for providing top performance based on distributed (clustered) NVMe instance storage at low cost with great data availability and durability. Simplyblock provides storage to Linux instances and Kubernetes environments via the NVMe block storage and NVMe over Fabrics (using TCP/IP as the underlying transport layer) protocols and the simplyblock CSI Driver.

Simplyblock’s storage orchestration technology is fast. The service provides access latency between 100 µs and 500 µs, depending on the IO access pattern and deployment topology. That means simplyblock’s access latency is comparable to, or even lower than, that of Amazon EBS io2 volumes, which typically provide between 200 µs and 300 µs.

To make sure we only provide storage that will keep up, we test simplyblock extensively. With simplyblock you can easily achieve more than 1 million IOPS at a 4 KiB block size on a single EC2 compute instance. This is several times higher than the most scalable Amazon EBS volumes, io2 Block Express. On the other hand, simplyblock’s cost of capacity is comparable to io2. However, with simplyblock, IOPS come for free – at absolutely no extra charge. Therefore, depending on the capacity-to-IOPS ratio of io2 volumes, it is possible to achieve cost advantages of up to 10x.
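As a quick sanity check on what such an IOPS figure implies, IOPS times block size gives the sustained throughput:

```python
# Back-of-the-envelope: throughput implied by 1 million IOPS at 4 KiB blocks.
iops = 1_000_000
block_size_kib = 4
throughput_gib_per_s = iops * block_size_kib / (1024 * 1024)  # KiB/s -> GiB/s
print(round(throughput_gib_per_s, 2))  # -> 3.81
```

So 1 million 4 KiB IOPS corresponds to roughly 3.8 GiB/s of sustained throughput to a single instance.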

For customers requiring very low storage access latency and high IOPS per TiB, simplyblock provides the best cost efficiency available today.

Why Simplyblock over Simple Amazon EBS?

Many customers are generally satisfied with the performance of their gp3 EBS volumes. Access latency of 6 to 10 ms is fine for them, and they never need to go beyond the included 3,000 IOPS (on gp2 and gp3). They should still consider simplyblock, because there is more. Much more.

Simplyblock provides multiple angles to save on storage: true thin provisioning, storage tiering, multi-attach, and snapshot storage!

Benefits of Thin Provisioning

With gp3, customers pay for provisioned rather than utilized capacity (~USD 80 per TiB provisioned). According to our research, the average utilization of Amazon EBS gp3 volumes is only ~30%. This means customers are actually paying more than three times the price per TiB of utilized storage: at utilization below one-third, the effective price works out to roughly USD 250–270 per utilized TiB. The higher the utilization, the closer a customer gets to the nominal USD 80 per TiB.
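The effective-price arithmetic can be checked directly, using the ~USD 80 per TiB and ~30% utilization figures quoted above:

```python
# Effective price per utilized TiB when paying for provisioned capacity.
provisioned_price_per_tib = 80.0  # approximate gp3 capacity price, USD/TiB-month
utilization = 0.30                # average gp3 utilization cited above
effective_price = provisioned_price_per_tib / utilization
print(round(effective_price, 2))  # -> 266.67
```

At 30% utilization, every utilized TiB effectively costs more than three times the sticker price.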

In addition to this price inefficiency, customers also have to manage the resizing of gp3 volumes when utilization reaches the current capacity limit. Resizing, however, has its own limitations: in EBS, it is only possible once every six hours. To mitigate potential issues during that window, volumes are commonly doubled in size.

On the other hand, simplyblock provides thin provisioned logical volumes. This means that you can provision your volumes nearly without any restriction in size. Think of growable partitions that are sliced out of the storage pool. Logical volumes can also be over-provisioned, meaning, you can set the requested storage capacity to exceed the storage pool’s current size. There is no charge for the over-provisioned capacity as long as you do not use it.

A thinly provisioned logical volume requires only the amount of storage actually used

That said, simplyblock thinly provisions NVMe volumes from a storage pool which is either made up of distributed local instance storage or gp3 volumes. The underlying pool is resized before it runs out of storage capacity.

These mechanisms enable you to save massively on storage while also simplifying your operations. No more manual or script-based resizing! No more custom alerts before running out of storage.

Benefits of Storage Tiering

But if you feel there should be even more potential to save on storage, you are absolutely right!

The total data stored on a single EBS volume has very different access patterns. Let’s explore together what the average database setup looks like. The typical corporate transactional database will easily qualify as “hot” storage. It is commonly stored on SSD-based EBS volumes; nobody would think of putting this database on slow file storage backed by HDD or Amazon S3.

In reality, however, data that belongs to a database is never homogeneous when it comes to performance requirements. There is, for example, the so-called database transaction log, often referred to as write-ahead log (WAL) or simply a database journal. The WAL is quite sensitive to access latency and requires a high IOPS rate for writes. On the other hand, the log is relatively small compared to the entire dataset in the database.

Furthermore, some other data files store tablespaces and index spaces. Many of them are read so frequently that they are always kept in memory. They do not depend on storage performance. Others are accessed less frequently, meaning they have to be loaded from storage every time they’re accessed. They require solid storage performance on read.

Last but not least, there are large tables which are commonly used for archiving or document storage. They are written or read infrequently and typically in large IO sizes (batches). While throughput speed is relevant for accessing this data, access latency is not.

To support all of the above use cases, simplyblock supports automatic tiering. Our tiering places less frequently accessed data on either Amazon EBS (st1) or Amazon S3, called warm storage. The tiering implementation is optimized for throughput, so large amounts of data can be written or read in parallel. Simplyblock automatically identifies individual segments of data that qualify for tiering and moves them to secondary storage, cleaning them up on the “hot” tier only after tiering has succeeded. This reduces the storage demand in the hot pool.

The AWS cost ratio between hot and warm storage is about 5:1, cutting cost to about 20% for tiered data. Tiering is completely transparent to you and data is automatically read from tiered storage when requested.

Based on our observations, we often see that up to 75% of all stored data can be tiered to warm storage. This creates another massive potential in storage costs savings.
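Using the figures above (warm storage at about 20% of the hot-storage price, and up to 75% of data tiered), the blended cost works out to:

```python
# Blended cost as a fraction of keeping everything on hot storage.
tiered_fraction = 0.75   # share of data moved to warm storage
warm_cost_ratio = 0.20   # warm storage costs ~20% of hot (the ~5:1 ratio)
blended = (1 - tiered_fraction) * 1.0 + tiered_fraction * warm_cost_ratio
print(round(blended, 2))  # -> 0.4
```

That is, the tiered setup costs roughly 40% of an all-hot configuration, or about 60% savings on that data.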

How to Prevent Data Duplication

But there is yet more to come.

AWS’ gp3 volumes do not allow multi-attach, meaning the same volume cannot be attached to multiple virtual machines or containers at the same time. Furthermore, their reliability is relatively low (indicated at 99.8%–99.9%) compared to Amazon S3.

That means neither a loss of availability nor a loss of data can be ruled out in case of an incident.

Therefore, additional steps need to be taken to increase availability of the storage consuming service, as well as the reliability of the storage itself. The common measure is to employ storage replication (RAID-1, or application-level replication). However, this leads to additional operational complexity, utilization of network bandwidth, and to a duplication of storage demand (which doubles the storage capacity and cost).

Simplyblock mitigates the requirement to replicate storage. First, the same thinly provisioned volume can be attached to more than one Amazon EC2 instance (or container) and, second, the reliability of each individual volume is higher (99.9999%) due to the internal use of erasure coding (parity data) to protect the data.

Multi-attach helps to cut the storage cost by half.

The Cost of Backup

Last but not least, backups. Yes there is even more.

A snapshot taken from an Amazon EBS volume is stored in S3-like storage. However, AWS charges significantly more per TiB than for the same data stored directly on S3: about 3.5 times as much.

Snapshots taken from simplyblock logical volumes, however, are stored into a standard Amazon S3 bucket and based on the standard S3 pricing, giving you yet another nice cost reduction.

Near-Zero RPO Disaster Recovery

There is one more feature we really want to talk about: disaster recovery, which is optional. Our DR comes with a minimal RPO and can be deployed without any redundancy on either the block storage or the compute layer between zones. Additionally, no data transfers between zones are needed.

Simplyblock employs asynchronous replication to store any change on the storage pool to an S3 bucket. This enables a fully crash-consistent and near-real-time option for disaster recovery. You can bootstrap and restart your entire environment after a disaster. This works in the same or a different availability zone and without having to take care of backup management yourself.

And if something does happen – an accidental deletion or even a successful ransomware attack that encrypted your data – simplyblock is here to help. Our asynchronous replication journal provides full point-in-time recovery functionality on the block storage layer. There is no need for your service or database to support it: just rewind the storage to any point in the past.

The journal also uses write and deletion protection on its S3 bucket, making it resilient to ransomware attacks. That said, simplyblock provides a sophisticated solution for disaster recovery and cybersecurity breaches without the need for manual backup management.

Simplyblock is Storage Optimization – just for you

Simplyblock provides a number of advantages for environments that utilize a large number of Amazon EBS gp2 or gp3 volumes. Thin provisioning enables you to consolidate unused storage capacity and minimize the spend. Due to automatic pool enlargement (growing the pool with additional EBS volumes or storage nodes), you’ll never run out of storage space while provisioning only what you actually need.

Together with automatic tiering, you can move infrequently used data blocks to warm or even cold storage, fully transparent to the application. The same is true for our disaster recovery: built into the storage layer, every application can benefit from point-in-time recovery, removing almost all RPO (Recovery Point Objective) risk from your whole infrastructure. And with consistent snapshots across volumes, you can perform a full-blown infrastructure recovery in case of an availability zone outage, right from the ground up.

With simplyblock you get more features than mentioned here. Get started right away and learn about our other features and benefits.

AWS Storage Optimization: Best Practices for Cost and Performance
https://www.simplyblock.io/blog/aws-storage-optimization/ (Mon, 12 Aug 2024)
Managing storage costs in AWS environments has become increasingly critical as organizations scale their cloud infrastructure. With storage often representing 20-30% of cloud spending, AWS storage optimization isn’t just about reducing costs – it’s about maximizing performance while maintaining data accessibility and security.

Storage optimization in AWS presents unique challenges. Organizations frequently overprovision Amazon EBS volumes, leading to poor utilization rates averaging just 30%. Meanwhile, the complexity of managing multiple storage types – from high-performance io2 volumes to cost-effective S3 buckets – can make it difficult to implement effective tiering strategies. For companies running databases and data-intensive applications, these challenges are particularly acute.

This guide explores proven best practices for AWS storage optimization, focusing on key areas including:

  • Strategic storage provisioning and capacity planning
  • Effective use of storage tiers and volume types
  • Performance optimization techniques
  • Cost reduction strategies through improved resource utilization
  • Automated storage management and monitoring

Whether you’re running managed database services, operating observability platforms, or managing enterprise applications, these optimization strategies can help you achieve the ideal balance of performance, cost, and operational efficiency in your AWS environment.

Introduction to AWS Storage

What is AWS Storage?

AWS Storage refers to the various cloud storage solutions provided by Amazon Web Services (AWS). Core AWS Storage services are S3, EBS and EFS. These services enable users to store, manage, and retrieve data over the internet, offering scalable and secure storage options tailored to different needs. AWS Storage solutions are integral for businesses and developers who require reliable, high-performance storage that can grow with their demands.

Why is AWS Storage Important?

AWS Storage services are crucial for managing vast amounts of data efficiently. They provide flexibility, scalability, and cost-effectiveness, making them suitable for a wide range of applications, from simple data backup to complex data analytics and high-performance computing. Understanding AWS Storage types and costs helps businesses optimize their data management strategies and budgets.

AWS Storage offers scalable, secure, and cost-effective solutions for all your data management needs.

AWS Storage Types: Overview and Use Cases

Amazon EBS (Elastic Block Store)

What it is: Block-level storage volumes attached to EC2 instances, behaving like physical hard drives.

Pros:

  • High performance with low latency (especially io2)
  • Consistent I/O performance
  • Supports live configuration changes
  • Automatic replication within AZ
  • Supports snapshots and encryption

Cons:

  • Limited to single AZ
  • Can only attach to one instance (except multi-attach io2)
  • Relatively expensive, especially for high IOPS
  • Pay for provisioned capacity, not used capacity
  • Volume size changes limited to every 6 hours

Best for:

  • Database storage
  • Boot volumes
  • Enterprise applications requiring consistent I/O
  • Development and test environments

Amazon S3 (Simple Storage Service)

What it is: Highly durable object storage service accessible via HTTP/HTTPS.

Pros:

  • Unlimited storage capacity
  • 99.999999999% durability
  • Cross-region availability
  • Multiple storage tiers
  • Pay only for what you use
  • Highly scalable and cost-effective

Cons:

  • Higher latency than block storage
  • Not suitable for operating systems or databases
  • Can be expensive for frequent data access
  • Object size limitations
  • No file system interface

Best for:

  • Static website hosting
  • Backup and archive
  • Data lakes
  • Content distribution
  • Application assets

Local Instance Storage

What it is: Physical storage attached to EC2 instance hardware.

Pros:

  • Extremely low latency
  • Very high IOPS
  • No additional cost beyond instance price
  • Ideal for temporary storage
  • Highest performance option

Cons:

  • Data lost when instance stops
  • Cannot be detached/reattached
  • Size limited by instance type
  • No replication or backup
  • No data persistence

Best for:

  • Cache storage
  • Temporary processing
  • High-performance scratch space
  • Buffer/queue storage
  • Instance-specific workloads

Amazon EFS (Elastic File System)

What it is: Fully managed NFS file system for EC2 instances.

Pros:

  • Shared access across multiple instances
  • Automatic scaling
  • Cross-AZ replication
  • Pay for used storage only
  • Linux-compatible file system

Cons:

  • Higher latency than EBS
  • More expensive than S3
  • Limited to Linux workloads
  • Performance scales with size
  • Regional service only

Best for:

  • Shared file storage
  • Content management systems
  • Development environments
  • Web serving
  • Analytics applications

Comparing AWS Storage Types: A Deep Dive

Amazon EBS vs Local Instance Storage (NVMe)

Performance Characteristics:

  • EBS:
    • 200-500μs latency for io2
    • Limited IOPS (64,000 max per volume; up to 256,000 with io2 Block Express)
    • Network-attached storage with consistent performance
    • Bandwidth limited by network
  • Local NVMe:
    • Ultra-low latency (100μs or less)
    • Very high IOPS (millions possible)
    • Direct-attached storage
    • No network bandwidth limitations

Durability & Availability:

  • EBS:
    • 99.8-99.9% durability
    • Persists independently of instance
    • Survives instance stops/starts
    • Supports snapshots
  • Local NVMe:
    • Ephemeral storage
    • Data lost on instance stop
    • No built-in replication
    • No snapshot support

Amazon EBS vs S3

Performance Characteristics:

  • EBS:
    • Low latency (milliseconds)
    • Block-level access
    • Consistent I/O performance
    • Limited to single AZ
  • S3:
    • Higher latency (tens of milliseconds)
    • Object-level access
    • Unlimited scale
    • Global access

Cost Structure:

  • EBS:
    • Pay for provisioned capacity
    • Additional IOPS costs for io2
    • Snapshot storage costs
    • Cross-AZ data transfer fees
  • S3:
    • Pay for used storage only
    • Tiered pricing based on volume
    • Access frequency pricing options
    • Cheaper for large datasets
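The billing difference above is the crux of the cost comparison: EBS charges for every provisioned gigabyte, while S3 charges only for what you store. A minimal sketch of the arithmetic, using illustrative per-GB prices (assumptions, not current AWS list prices):

```python
# Illustrative monthly cost comparison: provisioned EBS vs. pay-per-use S3.
# Prices below are example figures, not current AWS list prices.
GP3_PRICE_PER_GB_MONTH = 0.08          # assumed $/GB-month for provisioned gp3
S3_STANDARD_PRICE_PER_GB_MONTH = 0.023 # assumed $/GB-month for S3 Standard

def ebs_monthly_cost(provisioned_gb: float) -> float:
    """EBS bills for every provisioned GB, used or not."""
    return provisioned_gb * GP3_PRICE_PER_GB_MONTH

def s3_monthly_cost(stored_gb: float) -> float:
    """S3 bills only for bytes actually stored."""
    return stored_gb * S3_STANDARD_PRICE_PER_GB_MONTH

# A 1,000 GB volume that is only 30% full still costs the full provisioned price:
print(f"EBS: ${ebs_monthly_cost(1000):.2f}")  # EBS: $80.00
print(f"S3:  ${s3_monthly_cost(300):.2f}")    # S3:  $6.90
```

With typical 30% utilization, the provisioned model costs roughly ten times more for the same stored data, which is why utilization matters more than the headline per-GB price.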

Local NVMe vs S3

Use Case Optimization:

  • Local NVMe:
    • High-performance databases
    • Real-time analytics
    • Cache layers
    • Temporary processing
  • S3:
    • Long-term storage
    • Data lakes
    • Static content
    • Backup/archive

Management Overhead:

  • Local NVMe:
    • Requires careful capacity planning
    • No built-in data protection
    • Instance type dependent
    • Complex redundancy needs
  • S3:
    • Fully managed service
    • Built-in redundancy
    • Automatic scaling
    • Lifecycle management

Choosing the Right Storage Type

For Database Workloads:

  1. High-Performance Requirements:
    • Primary: Local NVMe
    • Secondary: EBS io2
    • Archive: S3
  2. Cost-Sensitive Applications:
    • Primary: EBS gp3
    • Secondary: S3
    • Archive: S3 Glacier

For Analytics Workloads:

  1. Real-time Analytics:
    • Hot data: Local NVMe
    • Warm data: EBS
    • Cold data: S3
  2. Batch Processing:
    • Processing: Local NVMe
    • Source data: S3
    • Results: EBS/S3

Implementing AWS storage solutions tailored to your specific needs ensures you get the most out of your cloud investment.

Understanding AWS Storage Challenges

Storage optimization in AWS presents several key challenges that organizations must address:

  • Underutilization: Organizations frequently overprovision Amazon EBS volumes, leading to poor utilization rates averaging just 30%
  • Complex Management: Balancing multiple storage types, from high-performance io2 volumes to cost-effective S3 buckets
  • Performance Tradeoffs: Finding the right balance between cost and performance for different workload types
  • Scaling Costs: Managing growing storage expenses as data volumes expand

AWS Storage Pricing and Cost Optimization

Understanding AWS Storage Costs

AWS storage pricing encompasses multiple components that organizations must carefully consider. At its core, storage costs are based on the volume of data stored, with pricing varying significantly across different storage types. While S3 charges for actual usage, EBS volumes bill for provisioned capacity regardless of utilization. This distinction becomes crucial for cost optimization strategies.

Data transfer costs represent another significant component of storage expenses. AWS charges for data movement between regions and from AWS to the internet, though transfers into AWS and within the same region are typically free or lower cost. Organizations should carefully architect their applications to minimize costly cross-region data transfers.

Request and retrieval costs, while often overlooked, can substantially impact the total storage bill. Services like S3 charge for both PUT/GET operations, while Glacier adds retrieval fees based on speed requirements. Understanding these operational costs is crucial for accurately forecasting storage expenses.

Cost Estimation and Management

The AWS Pricing Calculator serves as an invaluable tool for projecting storage costs before deployment. Organizations can model different scenarios, comparing costs across storage types and usage patterns. This proactive approach helps avoid unexpected expenses and enables better budgeting decisions.

Regular bill analysis provides insights into actual storage usage patterns and costs. AWS Cost Explorer and detailed billing reports help identify cost drivers, unused resources, and opportunities for optimization. Monthly reviews of these reports should be standard practice for effective cost management.

AWS Storage Optimization Strategies

Selecting the appropriate storage type for each workload represents one of the most effective cost optimization strategies. For instance, frequently accessed data might justify the higher costs of EBS io2 volumes, while rarely accessed data could be more cost-effectively stored in S3 Glacier. Understanding access patterns and performance requirements enables informed decision-making.

Lifecycle policies automate the movement of data between storage tiers based on age or access patterns. For example, moving infrequently accessed data from S3 Standard to S3 Glacier after 90 days can significantly reduce storage costs while maintaining data accessibility when needed.
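A lifecycle rule like the 90-day example above is expressed as a small JSON document. The sketch below builds one in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API; the bucket and `logs/` prefix are hypothetical placeholders:

```python
import json

# S3 lifecycle rule: transition objects under an assumed "logs/" prefix to
# Glacier after 90 days. The structure follows the shape expected by the
# S3 PutBucketLifecycleConfiguration API.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER"}
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))

# Applying it would look roughly like this (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-example-bucket",  # hypothetical bucket name
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Once applied, S3 moves matching objects automatically; no application change is needed, and retrieval from Glacier simply becomes slower and carries a retrieval fee.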

Data compression and deduplication technologies can substantially reduce storage requirements and costs. Modern compression algorithms offer excellent compression ratios with minimal performance impact, making them particularly valuable for large datasets or backup storage.
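The effect is easy to demonstrate: highly repetitive data (logs, backups) compresses dramatically. A minimal sketch with Python's standard `gzip` module, using a synthetic payload as a stand-in for log data:

```python
import gzip

# Compression can cut storage for repetitive data dramatically. Here a
# synthetic, highly repetitive payload stands in for real log data;
# real-world ratios will be lower.
payload = b"2025-01-17 INFO request handled in 12ms\n" * 10_000
compressed = gzip.compress(payload)

ratio = len(payload) / len(compressed)
print(f"original:   {len(payload):,} bytes")
print(f"compressed: {len(compressed):,} bytes")
print(f"ratio:      {ratio:.0f}x")
```

Real datasets rarely compress this well, but even a modest 2-3x ratio on backup storage translates directly into a 50-66% cost reduction for that tier.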

Ongoing Management

Regular storage audits should be a cornerstone of any cost optimization strategy. These reviews help identify orphaned snapshots, unused volumes, and opportunities for storage consolidation. Organizations should establish processes for regular cleanup and right-sizing of storage resources.
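The core of such an audit can be a few lines of code. The sketch below filters volume records shaped like the output of the EC2 `DescribeVolumes` API (where unattached volumes have state `available`); the sample data is invented for illustration:

```python
# A storage audit in code: given volume records shaped like the output of
# EC2 DescribeVolumes, flag unattached ("available") volumes as cleanup
# candidates. Sample data is invented for illustration.
def find_unattached_volumes(volumes):
    return [v["VolumeId"] for v in volumes if v["State"] == "available"]

sample_volumes = [
    {"VolumeId": "vol-001", "State": "in-use", "Size": 500},
    {"VolumeId": "vol-002", "State": "available", "Size": 200},  # orphaned
    {"VolumeId": "vol-003", "State": "available", "Size": 100},  # orphaned
]

print(find_unattached_volumes(sample_volumes))  # ['vol-002', 'vol-003']
```

In practice the records would come from a paginated `describe_volumes` call, and flagged volumes would be snapshotted before deletion.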

Implementing proper tagging and monitoring strategies enables better cost allocation and usage tracking. Tags help attribute costs to specific projects or departments, while monitoring helps identify usage patterns and potential cost optimization opportunities. This data-driven approach ensures storage resources are used efficiently and cost-effectively.
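Cost allocation by tag boils down to a group-by over resource records. A minimal sketch, with invented resource data and a hypothetical `team` tag key:

```python
from collections import defaultdict

# Cost allocation via tags: roll up per-resource spend by a "team" tag.
# Resource records, costs, and the tag key are invented for illustration.
def cost_by_tag(resources, tag_key):
    totals = defaultdict(float)
    for r in resources:
        owner = r.get("Tags", {}).get(tag_key, "untagged")
        totals[owner] += r["MonthlyCost"]
    return dict(totals)

resources = [
    {"Id": "vol-001", "MonthlyCost": 80.0, "Tags": {"team": "analytics"}},
    {"Id": "vol-002", "MonthlyCost": 16.0, "Tags": {"team": "platform"}},
    {"Id": "vol-003", "MonthlyCost": 8.0, "Tags": {}},  # no team tag
]

print(cost_by_tag(resources, "team"))
# {'analytics': 80.0, 'platform': 16.0, 'untagged': 8.0}
```

The `untagged` bucket is the useful part: anything landing there is spend nobody owns, and a prime candidate for the audits described above.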

Frequently Asked Questions (FAQs)

What is the Cheapest AWS Storage Option?

Amazon S3 Glacier (particularly the Deep Archive tier) is the most cost-effective storage option for long-term archival needs, though it has higher retrieval costs and longer retrieval times compared to other services.

How Can I Reduce My AWS Storage Costs?

To reduce costs, choose the appropriate storage type, use lifecycle policies to transition data to lower-cost storage, compress data, and regularly review your storage usage.

What is the Difference between S3 and EBS?

S3 is an object storage service suitable for storing and retrieving any amount of data, while EBS provides block storage for use with EC2 instances, offering high performance and low latency.

How Do I Estimate AWS Storage Costs?

Estimate costs using the AWS Pricing Calculator, which factors in storage type, amount of data, data transfer, and retrieval requests. Review your AWS bill for accurate cost management.

Can I Use Multiple AWS Storage Types Together?

AWS does not natively combine multiple storage types into a single volume for one workload. You can, however, use simplyblock orchestration to combine NVMe disk, EBS, and S3 in a single solution.

Simplyblock integrates seamlessly with AWS storage services, offering cost-efficient yet high-performance cloud storage at scale in a single solution.

How Simplyblock Can Be Used To Optimize AWS Storage Cost?

Simplyblock can help you optimize AWS storage costs and utilize various AWS storage types effectively by providing a seamless bridge between local NVMe disk, Amazon EBS, and Amazon S3, integrating these storage options into a single, cohesive system designed for scale and performance of I/O-intensive stateful workloads. By combining the high performance of local NVMe storage with the reliability of EBS and the cost-efficiency of S3, simplyblock enables enterprises to optimize their storage infrastructure for stateful applications, ensuring scalability, cost savings, and enhanced performance. With simplyblock, you can save up to 80% on your EBS costs on AWS.

Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments, ensuring optimal performance for I/O-sensitive workloads like databases. Using erasure coding (a better RAID) instead of replicas helps to minimize storage overhead without sacrificing data safety and fault tolerance. Simplyblock uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, surpassing local NVMe disks and Amazon EBS in cost/performance ratio at scale. Moreover, simplyblock can be used alongside various AWS storage types, ensuring a versatile storage solution.

With additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more, simplyblock meets your requirements before you set them. Get started using simplyblock right now or learn more about our feature set. Simplyblock is available on AWS Marketplace.

The post AWS Storage Optimization: Best Practices for Cost and Performance appeared first on simplyblock.

]]>
What is AWS Marketplace? https://www.simplyblock.io/blog/what-is-aws-marketplace/ Wed, 07 Aug 2024 01:29:19 +0000 https://www.simplyblock.io/?p=1755 Introduction to AWS Marketplace Overview of AWS Marketplace AWS Marketplace is a digital catalog that allows customers to find, buy, and deploy software and services that run on Amazon Web Services (AWS) . It’s designed to simplify the software procurement process for organizations, making it easier to access a vast array of solutions tailored to […]

The post What is AWS Marketplace? appeared first on simplyblock.

]]>
Introduction to AWS Marketplace

Overview of AWS Marketplace

AWS Marketplace is a digital catalog that allows customers to find, buy, and deploy software and services that run on Amazon Web Services (AWS). It’s designed to simplify the software procurement process for organizations, making it easier to access a vast array of solutions tailored to various business needs.

Importance of AWS Marketplace

The AWS Marketplace plays a critical role in the cloud ecosystem by providing a centralized platform where buyers and sellers can interact seamlessly. It eliminates the complexity associated with traditional software procurement, offering a streamlined approach that saves time and resources.

What is AWS Marketplace?

AWS Marketplace is a digital catalog offered by Amazon Web Services (AWS) that allows customers to find, buy, deploy, and manage third-party software, data, and services that run on AWS. It serves as a platform where software vendors can list their products and services, making it easier for AWS customers to discover and utilize a wide range of tools and solutions that complement their cloud infrastructure.

How many products are in AWS Marketplace?

As of 2024, the AWS Marketplace featured around 42,240 products and services, with 11,478 in infrastructure software. It’s a digital catalog where independent software vendors can list their offerings. Current offerings can be seen here.

Key Features of AWS Marketplace

  • Extensive Software Catalog: AWS Marketplace offers an extensive catalog of software listings across various categories, including security, networking, storage, machine learning, and DevOps. Users can browse thousands of products from leading software vendors, ensuring they find the right tools for their specific requirements.
  • Simplified Procurement Process: The procurement process in AWS Marketplace is designed to be straightforward and efficient. Users can quickly find and purchase software with a few clicks, avoiding lengthy negotiations and paperwork. The streamlined process helps organizations adopt new technologies faster.
  • Integrated Billing: AWS Marketplace integrates billing with AWS accounts, allowing users to consolidate their software and cloud infrastructure expenses into a single invoice. This simplifies financial management and provides clear visibility into spending.

Benefits of using AWS Marketplace

Cost Efficiency

  • Pay-as-you-go Pricing : One of the major advantages of AWS Marketplace is its pay-as-you-go pricing model. Users only pay for what they use, avoiding upfront costs and minimizing financial risk. This model is particularly beneficial for startups and small businesses with limited budgets.
  • Free Trials and Discounts : Many products on AWS Marketplace offer free trials and discounts, enabling users to test solutions before committing to a purchase. These offers help organizations make informed decisions and maximize their return on investment.

Ease of Deployment

  • One-Click Deployments: AWS Marketplace supports one-click deployments, allowing users to quickly deploy software solutions with minimal effort. This feature accelerates time-to-market and reduces the complexity of setting up new systems.
  • Pre-configured Solutions: The marketplace provides pre-configured solutions that are optimized for AWS environments. These solutions come with best practice configurations, ensuring optimal performance and security.

With AWS Marketplace, you can shorten procurement times, implement the controls you need to operate with confidence, and enable your organization to unlock innovation.

Enhanced Security

  • Compliance with Industry Standards: Security is a top priority for AWS Marketplace. All products listed in the marketplace comply with industry standards and best practices, ensuring that users can trust the solutions they deploy.
  • Secure Transactions: AWS Marketplace ensures secure transactions through rigorous vetting processes for sellers and robust encryption technologies. Users can confidently purchase software without worrying about security risks.

How AWS Marketplace Works

  • Browsing the Catalog: Users can browse the AWS Marketplace catalog by category, vendor, or keyword. The intuitive search functionality helps users quickly find the software that meets their needs.
  • Subscribing to Products: Once users find a product they want, they can subscribe to it directly through the marketplace. The subscription process is simple and transparent, with clear pricing and terms.
  • Deploying Solutions: After subscribing, users can deploy solutions directly from the AWS Management Console. The deployment process is automated and integrates seamlessly with existing AWS infrastructure.
  • Managing Subscriptions: AWS Marketplace provides tools for managing subscriptions, including tracking usage, monitoring costs, and renewing or canceling subscriptions as needed.

Types of Products available on AWS Marketplace

  • Software as a Service (SaaS): SaaS products in AWS Marketplace offer cloud-based applications that users can access over the Internet. These solutions cover a wide range of business needs, from CRM to project management.
  • Infrastructure as a Service (IaaS): IaaS products provide virtualized computing resources over the internet. Users can purchase and deploy servers, storage, and networking components to build and manage their IT infrastructure.
  • Platform as a Service (PaaS): PaaS products offer a platform for developing, testing, and deploying applications. These solutions provide the necessary tools and frameworks, reducing the complexity of application development.
  • Professional Services: AWS Marketplace also includes professional services such as consulting, implementation, and training. These services help organizations effectively leverage AWS solutions and achieve their business goals.

AWS Marketplace for Buyers

  • Finding the Right Solutions: Buyers can use various filters and search criteria to find the right solutions for their needs. Detailed product descriptions, specifications, and pricing information are available to aid decision-making.
  • Comparing Products: AWS Marketplace allows AWS buyers to compare multiple products side by side. This feature helps users evaluate different options based on features, pricing, and user reviews.
  • Reading Reviews and Ratings: User reviews and ratings provide valuable insights into the performance and reliability of products. Buyers can read feedback from other users to make informed purchasing decisions.

AWS Marketplace for Sellers

  • Listing Products: Sellers can list their products on AWS Marketplace by creating detailed product pages that highlight key features and benefits. The listing process is straightforward and provides access to a global audience.
  • Managing Listings: AWS Marketplace offers tools for managing product listings, including updating product information, monitoring sales performance, and responding to customer inquiries.
  • Accessing Sales Reports: Sellers can access detailed sales reports that provide insights into revenue, customer demographics, and product performance. These reports help AWS sellers optimize their offerings and marketing strategies.

Integration with other AWS Services

  • AWS CloudFormation: AWS CloudFormation integration allows users to automate the deployment of AWS Marketplace products using infrastructure as code. This ensures consistent and repeatable deployments.
  • AWS CloudTrail: Integration with AWS CloudTrail enables users to track and audit actions taken on AWS Marketplace products. This enhances security and compliance by providing detailed logs of all activities.
  • AWS CloudWatch: AWS CloudWatch integration provides monitoring and logging for AWS Marketplace products. Users can set up alerts and dashboards to track the performance and health of their deployed solutions.

Conclusion

AWS Marketplace is an essential tool for organizations looking to streamline their software procurement and deployment processes. With its extensive catalog, simplified procurement, and integrated billing, AWS Marketplace offers unmatched convenience and efficiency. The platform’s cost-effective pricing models, ease of deployment, and enhanced security measures make it an attractive option for businesses of all sizes. Whether you’re a small business looking for affordable solutions or a large enterprise seeking scalable software, AWS Marketplace has something to offer.

Moreover, AWS Marketplace’s seamless integration with other AWS services ensures that users can fully leverage their AWS environment, enhancing overall productivity and performance. AWS Marketplace fosters a vibrant ecosystem that drives innovation and growth by providing a centralized platform for buyers and sellers.

AWS Marketplace is more than just a digital catalog; it’s a comprehensive solution that empowers businesses to quickly and efficiently adopt the software and services they need to succeed in today’s fast-paced digital landscape. Explore AWS Marketplace today and discover how it can transform your approach to software procurement and deployment.

Frequently Asked Questions (FAQs)

How Do I Get Started with AWS Marketplace?

To get started with AWS Marketplace, sign in to your AWS account, browse the catalog, and select the products you want to subscribe to. Follow the prompts to complete the subscription and deployment process.

Are there any Costs associated with using AWS Marketplace?

While browsing the AWS Marketplace catalog is free, you will incur costs when you subscribe to and deploy products. Each product has its pricing model, which may include pay-as-you-go, subscription, or one-time fees.

How Secure is AWS Marketplace?

AWS Marketplace is highly secure, with all products adhering to industry standards and best practices. AWS uses robust encryption and security measures to protect transactions and customer data.

Can I Cancel My Subscription at Any Time?

Yes, you can cancel your subscription to any product on AWS Marketplace at any time. The cancellation process is straightforward, and you will only be billed for the usage up to the cancellation date.

What Support Options are available for AWS Marketplace Users?

AWS Marketplace users have access to various support options, including detailed documentation, customer support from vendors, and AWS support plans that provide technical assistance and guidance.

How can Simplyblock be used with AWS Marketplace?

AWS Marketplace storage solutions, such as simplyblock, can help reduce your database costs on AWS by up to 80%. Simplyblock offers high-performance cloud block storage that enhances the performance of your databases and applications, ensuring you get better value and efficiency from your cloud resources.

Simplyblock software provides a seamless bridge between local NVMe disk, Amazon EBS, and Amazon S3, integrating these storage options into a single, cohesive system designed for scale and performance of I/O-intensive stateful workloads. By combining the high performance of local NVMe storage with the reliability of EBS (gp2 and gp3 volumes) and the cost-efficiency of S3, simplyblock enables enterprises to optimize their storage infrastructure for stateful applications, ensuring scalability, cost savings, and enhanced performance. With simplyblock, you can save up to 80% on your EBS costs on AWS.

Our technology uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, outperforming local NVMe disks and Amazon EBS in cost/performance ratio at scale. Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments, ensuring optimal performance for I/O-sensitive workloads like databases. By using erasure coding (a better RAID) instead of replicas, simplyblock minimizes storage overhead while maintaining data safety and fault tolerance. This approach reduces storage costs without compromising reliability.

Simplyblock also includes additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more – in short, there are many ways in which simplyblock can help you optimize your cloud costs. Get started using simplyblock right now. Simplyblock is available on the AWS Marketplace.

The post What is AWS Marketplace? appeared first on simplyblock.

]]>
How to reduce AWS cloud costs with AWS marketplace products? https://www.simplyblock.io/blog/how-to-reduce-aws-cloud-costs-with-aws-marketplace-products/ Fri, 28 Jun 2024 02:19:03 +0000 https://www.simplyblock.io/?p=1793 The AWS Marketplace is a comprehensive catalog consisting of thousands of offerings that help organizations find, purchase, deploy and manage third-party software and services to optimize their cloud operations. It’s also a great place to find numerous tools specifically designed to help you optimize your AWS cloud costs. These tools can help you monitor your […]

The post How to reduce AWS cloud costs with AWS marketplace products? appeared first on simplyblock.

]]>
The AWS Marketplace is a comprehensive catalog consisting of thousands of offerings that help organizations find, purchase, deploy and manage third-party software and services to optimize their cloud operations. It’s also a great place to find numerous tools specifically designed to help you optimize your AWS cloud costs. These tools can help you monitor your cloud usage, right-size resources, leverage cost-effective pricing models, and implement automated management practices to reduce waste and improve efficiency.

In this blog post, you will learn more about the key cost drivers in AWS Cloud, what cloud cost optimization is, why you need to think about it, and what tools are at your disposal, particularly in the AWS Marketplace.

What are the Fundamental Drivers of Cost with AWS Cloud?

Industry studies show that almost 70% of organizations experience higher-than-anticipated cloud costs. Understanding the key factors that drive costs in AWS Cloud is essential for effective cost management. Below is a breakdown of the key cloud cost drivers: compute resources and storage, which together make up roughly 60-70% of total spend; data transfer and networking; database services; support plans; licensing and marketplace products; and serverless services billed per request, such as API calls.

Based on the Vantage Cloud Cost Report for Q1 2024, we can see that the most used services in public clouds are by far compute instances (EC2 on AWS, Compute Engine on Google Cloud, and Virtual Machines on Microsoft Azure), followed by storage and databases. Optimizing the costs of compute, storage, and databases will therefore have the highest impact on cloud bill reduction.

(Figure: Top 10 services by spend on AWS, Google Cloud, and Azure, Q1 2024)

Looking more granularly on AWS, here are key services to look into when optimizing cloud costs:

Compute Resources

  • EC2 Instances: The cost depends on the type, size, and number of EC2 instances you run. Different instance types have varying performance and pricing.
  • Lambda Functions: Pricing is based on the number of requests and the duration of execution.

Cloud Storage

  • S3 Buckets: Costs vary depending on the amount of data stored, the frequency of access (standard, infrequent access, or Glacier), and the number of requests made.
  • EBS Volumes: Pricing is based on the type and size of the volume, provisioned IOPS, and snapshots. Cloud block storage prices can be very high if used for highly transactional workloads such as relational, NoSQL, or vector databases.
  • EFS and FSx: Pricing is based on the service type, IOPS, and other requested services. Prices of file systems in the cloud can become very expensive with extensive usage.
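For EBS specifically, the bill composes from capacity plus any IOPS provisioned above an included baseline. A minimal sketch of that arithmetic for a gp3-style volume; all prices and baseline figures are illustrative assumptions, not current AWS list prices:

```python
# Sketch of how gp3-style billing composes: capacity plus provisioned IOPS
# above an included baseline. All prices and baselines are illustrative
# assumptions, not current AWS list prices.
PRICE_PER_GB = 0.08           # assumed $/GB-month
PRICE_PER_EXTRA_IOPS = 0.005  # assumed $/provisioned-IOPS-month above baseline
INCLUDED_IOPS = 3000          # assumed IOPS included with every volume

def gp3_monthly_cost(size_gb, provisioned_iops):
    extra_iops = max(0, provisioned_iops - INCLUDED_IOPS)
    return size_gb * PRICE_PER_GB + extra_iops * PRICE_PER_EXTRA_IOPS

# 500 GB volume at the included baseline vs. tuned up to 10,000 IOPS:
print(gp3_monthly_cost(500, 3000))   # 40.0
print(gp3_monthly_cost(500, 10000))  # 75.0 (40.0 capacity + 7000 * 0.005 IOPS)
```

This is why transactional databases are expensive on block storage: the IOPS line item can grow faster than the capacity line item.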

Data Transfer

  • Data Ingress and Egress: Inbound data transfer is generally free, but outbound data transfer (data leaving AWS) incurs charges. Costs can add up, especially with high-volume transfers across regions or to the internet.

Networking

  • VPC: Costs associated with using features like VPN connections, VPC peering, and data transfer between VPCs.
  • Load Balancers: Costs for using ELB (Elastic Load Balancers) vary based on the type (Application, Network, or Classic) and usage.

Database Services

  • RDS: Charges depend on the database engine, instance type, storage, and backup storage.
  • DynamoDB: Pricing is based on read and write throughput, data storage, and optional features like backups and data transfer.

Understanding these drivers helps you identify areas where you can cut costs without sacrificing performance, allowing for better budgeting, more efficiency in operations and better scalability as demand increases.

What is Cloud Cost Optimization?

Cloud cost optimization involves using various strategies, techniques, best practices, and tools to lower cloud expenses. It aims to find the most economical way to operate your applications in the cloud, ensuring you get the highest business value from your investment. It may involve tactics like monitoring your cloud usage, identifying waste, and making adjustments to use resources more effectively without compromising performance or reliability and using marketplace solutions instead of some cloud-provider-native offerings.

Why do you need Cloud Cost Optimization?

Organizations waste approximately 32% of their cloud spending, which is a significant amount whether you’re a small business or a large one spending millions on cloud services. Cloud cost optimization helps you minimize this waste and avoid overspending. It also goes beyond just cost-cutting, focusing on thorough analysis of current usage, identifying inefficiencies, and eliminating waste to maximize value.

More than just cutting costs, it’s also about ensuring your spending aligns with your business goals. Cloud cost optimization means understanding your cloud expenses and making smart adjustments to control costs without sacrificing performance. Also see our blog post on AWS and cloud cost optimization.

What is the AWS Marketplace?

The AWS Marketplace is a “curated digital catalog that customers can use to find, buy, deploy, and manage third-party software, data, and services to build solutions and run their businesses.” It features thousands of software solutions, including but not limited to security, networking, storage, machine learning, and business applications, from independent software vendors (ISVs). These offerings are easy to use and can be quickly deployed directly to an AWS environment, making it easy to integrate new solutions into your existing cloud infrastructure.

AWS Marketplace also offers various flexible pricing options, including hourly, monthly, annual, and BYOL (Bring Your Own License). And lastly, many of the software products available in the Marketplace have undergone rigorous security assessments and comply with industry standards and regulations. Also note that purchases from the AWS Marketplace can count towards AWS Enterprise Discount Program (EDP) commitments. See our blog post on the EDP.

Cloud Cost Optimization Tools on AWS Marketplace you can use to Optimize your Cloud Costs

In addition to its thousands of software products, AWS Marketplace also offers many products and services that can help you optimize your cloud costs. Here are some tools and ways in which you can use AWS Marketplace to do so effectively.

Cloud Cost Management Tools

AWS Marketplace hosts a variety of cost management tools that provide insights into your cloud spending. Products like CloudHealth and CloudCheckr offer comprehensive dashboards and reports that help you understand where your money is going. These tools can identify underutilized resources, recommend rightsizing opportunities, and alert you to unexpected cost spikes, enabling proactive management of your AWS expenses.

Optimizing Compute Costs: Reserved Instances and Savings Plans

One of the most effective ways to reduce AWS costs is by purchasing Reserved Instances (RIs) and Savings Plans, as mentioned above. However, understanding the best mix and commitment level can be challenging. Tools like Spot.io and Cloudability, available on AWS Marketplace, can analyze your usage patterns and recommend the optimal RI or Savings Plan purchases. These products ensure you get the best return on your investment while maintaining the flexibility to adapt to changing workloads.

Optimizing Cloud Storage Costs

Data storage can quickly become one of the largest expenses in your AWS bill. Simplyblock, available on AWS Marketplace, is the next generation of software-defined storage, meeting the storage requirements of the most demanding workloads. High IOPS-per-gigabyte density, low predictable latency, and high throughput are enabled by pooled storage and our distributed data placement algorithm. Using erasure coding (a better RAID) instead of replicas helps to minimize storage overhead without sacrificing data safety and fault tolerance.
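To make the overhead difference concrete, here is a small Python sketch comparing three-way replication against a hypothetical 4+2 erasure-coding layout. The exact coding parameters simplyblock uses are not stated here, so the numbers are purely illustrative:

```python
# Illustrative comparison of raw-capacity overhead: 3-way replication
# versus a hypothetical k+m erasure-coding scheme (4 data + 2 parity
# chunks, which tolerates any 2 chunk failures).

def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data with n-way replication."""
    return float(copies)

def erasure_coding_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data + m parity chunks."""
    return (k + m) / k

rep = replication_overhead(3)        # 3.0x raw capacity
ec = erasure_coding_overhead(4, 2)   # 1.5x raw capacity
print(f"replication: {rep:.1f}x, erasure coding: {ec:.1f}x")
```

With these example parameters, erasure coding cuts raw capacity needs in half compared to three replicas while still surviving two simultaneous failures.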

Automate Resource Management

Automated resource management tools can help you scale your resources up or down based on demand, ensuring you only pay for what you use. Products like ParkMyCloud and Scalr can automate the scheduling of non-production environments to shut down during off-hours, significantly reducing costs. These tools also help in identifying and terminating idle resources, ensuring no wastage of your cloud budget.

Enhance Security and Compliance

Security and compliance are critical but can also be cost-intensive. Utilizing AWS Marketplace products like Trend Micro and Alert Logic can enhance your security posture without the need for a large in-house team. These services provide continuous monitoring and automated compliance checks, helping you avoid costly breaches and fines while optimizing the allocation of your security budget.

Consolidate Billing and Reporting

For organizations managing multiple AWS accounts, consolidated billing and reporting tools can simplify cost management. AWS Marketplace offers solutions like CloudBolt and Turbonomic that provide a unified view of your cloud costs across all accounts. These tools offer detailed reporting and chargeback capabilities, ensuring each department or project is accountable for their cloud usage, promoting cost-conscious behavior throughout the organization.

By leveraging the diverse range of products available on AWS Marketplace, organizations can gain better control over their AWS spending, optimize resource usage, and enhance operational efficiency. Whether it’s through cost management tools, automated resource management, or enhanced security solutions, AWS Marketplace products provide the necessary tools to reduce cloud costs effectively.

How to Reduce EBS Cost in AWS?

AWS Marketplace storage solutions such as simplyblock can help reduce Amazon EBS costs and AWS database costs by up to 80%. Simplyblock offers high-performance cloud block storage that enhances the performance of your databases and applications, ensuring you get better value and efficiency from your cloud resources.

Simplyblock software provides a seamless bridge between local EC2 NVMe disk, Amazon EBS, and Amazon S3, integrating these storage options into a single, cohesive system designed for ultimate scale and performance of IO-intensive stateful workloads. By combining the high performance of local NVMe storage with the reliability and cost-efficiency of EBS and S3 respectively, simplyblock enables enterprises to optimize their storage infrastructure for stateful applications, ensuring scalability, cost savings, and enhanced performance. With simplyblock, you can save up to 80% on your EBS costs on AWS.

Our technology uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, outperforming local NVMe disks and Amazon EBS in cost/performance ratio at scale. Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments , ensuring optimal performance for I/O-sensitive workloads like databases. By using erasure coding (a better RAID) instead of replicas, simplyblock minimizes storage overhead while maintaining data safety and fault tolerance. This approach reduces storage costs without compromising reliability.

Simplyblock also includes additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more – in short, there are many ways in which simplyblock can help you optimize your cloud costs. Get started using simplyblock right now and see how simplyblock can help you on the AWS Marketplace.

To save on your cloud costs, you can also take advantage of discounts provided by various platforms. You can visit here to grab a discount on your AWS credits.

The post How to reduce AWS cloud costs with AWS marketplace products? appeared first on simplyblock.

How to benefit from AWS Enterprise Discount Program (EDP) https://www.simplyblock.io/blog/aws-enterprise-discount-program-edp/ Thu, 13 Jun 2024 12:08:44 +0000 https://www.simplyblock.io/?p=256 What is the AWS Enterprise Discount Program (EDP)? The AWS Enterprise Discount Program (EDP) is a discount initiative designed for organizations spending at least $1m per year on AWS cloud services and committed to extensive and long-term usage of Amazon Web Services (AWS). The program helps businesses optimize their cloud spending while expanding their operations […]

The post How to benefit from AWS Enterprise Discount Program (EDP) appeared first on simplyblock.

What is the AWS Enterprise Discount Program (EDP)?

The AWS Enterprise Discount Program (EDP) is a discount initiative designed for organizations spending at least $1m per year on AWS cloud services and committed to extensive and long-term usage of Amazon Web Services (AWS). The program helps businesses optimize their cloud spending while expanding their operations on AWS. By entering into an EDP agreement, enterprises can secure significant cost savings and enhanced value from their AWS investments, which is particularly advantageous during economic downturns.

How does the AWS Enterprise Discount Program (EDP) Work?

The AWS Enterprise Discount Program operates on a tiered discount system based on an organization’s annual AWS spending commitment, usually starting at $1 million per year. Key features of the program include:

  • Customizable Discounts: Discounts are negotiated based on total committed spend and commitment duration, typically ranging from 1 to 5 years. Greater commitments yield higher discounts.
  • Broad Coverage: Discounts apply to nearly all AWS services and regions, ensuring consistent savings across the AWS ecosystem.
  • Marketplace Offerings: AWS Marketplace purchases can contribute up to 25% of the EDP spend commitment.
  • Scalability: As AWS usage grows, the program allows organizations to benefit from increased discounts, promoting a sustainable and cost-effective cloud strategy.
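As a rough illustration of how a tiered commitment translates into an effective price, the following Python sketch uses entirely made-up discount rates; actual EDP rates are negotiated privately per contract and are not public:

```python
# Hypothetical EDP-style tiered discount. The tier thresholds and rates
# below are invented purely to illustrate the mechanic: a larger annual
# commitment unlocks a larger negotiated discount.
TIERS = [  # (minimum annual commitment in USD, illustrative discount rate)
    (10_000_000, 0.12),
    (5_000_000, 0.08),
    (1_000_000, 0.05),
]

def effective_annual_cost(committed_spend: float) -> float:
    """Annual cost after applying the highest tier the commitment reaches."""
    for minimum, rate in TIERS:
        if committed_spend >= minimum:
            return committed_spend * (1 - rate)
    return committed_spend  # below the EDP threshold: no discount

print(effective_annual_cost(2_000_000))  # lands in the illustrative 5% tier
```

A $2M commitment at the illustrative 5% tier would cost $1.9M, while spend below the $1M threshold receives no discount at all.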

What is EDP in AWS?

In AWS, EDP stands for the Enterprise Discount Program. This is a contractual agreement between AWS and enterprises that guarantees significant discounts in exchange for a minimum level of AWS spending over a specified period. The program helps reduce cloud costs and encourages deeper engagement with the AWS ecosystem, fostering long-term partnerships and more efficient cloud usage.

How to Negotiate AWS EDP?

When negotiating an AWS Enterprise Agreement , consider these strategies to maximize benefits:

  1. Understand Your Usage Patterns: Analyze your current and projected AWS usage to accurately determine your commitment levels.
  2. Leverage Historical Spend: Use your historical AWS spend data to negotiate better discount rates.
  3. Seek Flexibility: Aim for terms that allow flexibility in service usage and scalability.
  4. Engage AWS Account Managers: Collaborate with AWS account managers to understand all available options and potential incentives.
  5. Evaluate Support and Training: Include provisions for enhanced support and training services in the agreement.

How to Join AWS EDP?

To join the AWS Enterprise Discount Program , follow these steps:

  1. Assess Eligibility: Ensure your organization meets the minimum annual spend requirement, typically around $1 million.
  2. Contact AWS Sales: Reach out to your AWS account manager or AWS sales team to express interest in the program.
  3. Prepare for Negotiations: Gather your usage data and financial projections to negotiate the best possible terms.
  4. Sign Agreement: Finalize and sign the EDP agreement, detailing the committed spend and discount structure.
  5. Monitor and Optimize: Regularly review your AWS usage and costs to ensure you are maximizing the benefits of the EDP.

Understanding AWS Marketplace with AWS EDP

To maximize the benefits of the AWS Enterprise Discount Program , it’s crucial to understand your AWS Marketplace usage. Determine which Independent Software Vendors (ISVs) you are currently purchasing from and explore opportunities to route these purchases through the AWS Marketplace. Purchases made via the AWS Marketplace can contribute to your total commitment under the EDP, with a cap of 25%. This can be a strategic way to ensure your software investments also help you meet your EDP commitments.
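The 25% cap is easy to model: Marketplace purchases count towards the commitment only up to a quarter of its total. In the Python sketch below, `marketplace_contribution` is a hypothetical helper name for illustration, not an AWS API:

```python
# Marketplace spend counts toward an EDP commitment, but only up to
# 25% of the total committed amount.
MARKETPLACE_CAP = 0.25

def marketplace_contribution(commitment: float, marketplace_spend: float) -> float:
    """USD of Marketplace spend that counts toward the EDP commitment."""
    return min(marketplace_spend, MARKETPLACE_CAP * commitment)

# On a $1M commitment, $300k of Marketplace purchases only counts as $250k;
# the remaining $50k does not help fulfill the commitment.
print(marketplace_contribution(1_000_000, 300_000))
```

This is why routing ISV purchases through the Marketplace pays off most when that spend stays at or below a quarter of your commitment.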

Can I Join the EDP as a Startup?

For startups, joining the AWS Enterprise Discount Program (EDP) might not be feasible due to the high minimum spend requirement, typically around $1 million annually. However, there are other ways to maximize savings on AWS:

  1. AWS Credits: Startups can benefit from AWS credits through programs like the AWS Activate program. These credits can significantly reduce your cloud costs during the early stages of growth. For example, AWS Activate provides up to $100,000 in credits for eligible startups.
  2. Marketplace Solutions: Utilize the AWS Marketplace to purchase software solutions that can contribute to your overall AWS spend. For example, AWS Marketplace offerings such as simplyblock can help you significantly reduce spending on AWS storage services while scaling your operations.

By leveraging these alternatives, startups can achieve substantial savings and optimize their AWS spending without needing to meet the high thresholds required for the EDP.

What’s the Difference between an EDP and a PPA?

EDP (Enterprise Discount Program) offers custom discounts based on high-volume, long-term AWS usage commitments, providing scalable savings across most AWS services. In contrast, a PPA (Private Pricing Agreement) is a more flexible, negotiated contract tailored to specific needs, often used for unique pricing arrangements and custom terms that might not fit the broader structure of an EDP. While both aim to reduce cloud costs, an EDP is typically for larger, ongoing commitments, whereas a PPA can address more specific, immediate requirements.

Other AWS Programs and Discounts

AWS offers various pricing models to help organizations achieve cost savings based on usage frequency, volume, and commitment duration. Here are some common ones:

  • Spot Instances: You use spare AWS capacity at a lower price. But, AWS can take back this capacity when they need it. Best for flexible workloads.
  • Reserved Instances: You commit to use AWS for a long time (1-3 years), and in return, you get a big discount. Best for predictable workloads.
  • Savings Plans: Similar to Reserved Instances, but more flexible. You commit to use a certain amount of AWS services, and you get a discount.
  • Vantage Autopilot: Provides automated optimization of AWS costs by dynamically adjusting instances and resources based on usage patterns, helping organizations reduce their AWS bills without manual intervention. Vantage Autopilot can be used alongside simplyblock to further reduce storage cost through lower underlying EC2 instance costs (simplyblock deploys onto EC2 instances with local NVMe storage, pooling the resources into a scalable enterprise-grade storage system).

How can Simplyblock be used with AWS EDP?

Simplyblock can be a game-changer for your AWS Enterprise Discount Program (EDP). It offers high-performance cloud block storage that not only enhances the performance of your databases and applications but also brings cost efficiency. Most importantly, spending on simplyblock through the AWS Marketplace counts towards your EDP commitment, within the 25% marketplace cap. This means you can leverage simplyblock's services while also fulfilling your commitment to AWS. It's a win-win for AWS users seeking performance, scalability, and cost-effectiveness.

Simplyblock uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, surpassing local NVMe disks and Amazon EBS in cost/performance ratio at scale. Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments , ensuring optimal performance for I/O-sensitive workloads like databases. Using erasure coding (a better RAID) instead of replicas helps to minimize storage overhead without sacrificing data safety and fault tolerance.

With additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more, simplyblock meets your requirements before you even set them. Get started using simplyblock right now or learn more about our feature set.


Block Storage Volume Pooling for the Cloud-Age https://www.simplyblock.io/blog/block-storage-volume-pooling-for-the-cloud-age/ Wed, 17 Apr 2024 12:13:28 +0000 https://www.simplyblock.io/?p=290 If you have services running in the AWS, you’ll eventually need block storage to store data. Services like Amazon EBS (Elastic Block Storage) provide block storage to be used in your EC2 instances, Amazon EKS (Elastic Kubernetes Services), and others. While providing an easy to use, and fast option, there are several limitations you’ll eventually […]

The post Block Storage Volume Pooling for the Cloud-Age appeared first on simplyblock.

If you have services running in AWS, you'll eventually need block storage to store data. Services like Amazon EBS (Elastic Block Store) provide block storage for your EC2 instances, Amazon EKS (Elastic Kubernetes Service) clusters, and others. While an easy-to-use and fast option, there are several limitations you'll eventually run into.

Amazon EBS: the Limitations

When building out a new system, quick iterations are generally key. That includes fast turnarounds to test ideas or validate approaches. Using out-of-the-box services like Amazon EBS helps with these requirements. Cloud providers like AWS offer these services to get customers started quickly.

That said, cloud block storage volumes, such as Amazon EBS (gp3, io2, io2 Block Express), provide fast storage with high IOPS, and low latency for your compute instances (Amazon EC2) or Kubernetes environments (Amazon EKS or self-hosted) in the same availability zones.

While quick to get started, eventually you may run into the following shortcomings, which will make scaling either complicated or expensive.

Limited Free IOPS: The number of free IOPS is limited on a per-volume basis, meaning that if you need high IOPS numbers, you have to pay extra. Sometimes you even have to change the volume type (gp3 only supports up to 16k IOPS, whereas io2 Block Express supports up to 256k IOPS).
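A quick back-of-the-envelope Python calculation shows how extra IOPS and throughput drive up the gp3 price. The rates below are indicative us-east-1 list prices at the time of writing and may differ in your region or change over time:

```python
# Rough gp3 monthly cost model (indicative us-east-1 list prices):
# $0.08 per GB-month; 3,000 IOPS and 125 MB/s included for free;
# $0.005 per additional provisioned IOPS; $0.04 per additional MB/s.
def gp3_monthly_cost(size_gb: int, iops: int, throughput_mbps: int) -> float:
    cost = size_gb * 0.08
    cost += max(0, iops - 3000) * 0.005
    cost += max(0, throughput_mbps - 125) * 0.04
    return round(cost, 2)

print(gp3_monthly_cost(1000, 3000, 125))    # 1 TB at the free baseline
print(gp3_monthly_cost(1000, 16000, 1000))  # same 1 TB, maxed-out IOPS/throughput
```

Maxing out IOPS and throughput on a 1 TB gp3 volume more than doubles its monthly cost in this model, even though the capacity is unchanged.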

Limited Durability: Depending on the selected volume type, you'll have to deal with limited data durability and availability (e.g. gp3 offers only 99.9% availability). There is no way to buy your way out of it, except using two volumes in some RAID1-like configuration.

No Thin Provisioning: Storage cost is paid per provisioned capacity per time unit, not by actual usage (which is often around 50% lower). When a volume is created, the given capacity is used for calculating the price: if you create a 1TB volume but only use 100GB, you'll still pay for the full 1TB.
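The cost of missing thin provisioning can be sketched in a few lines of Python, using an illustrative $0.08/GB-month rate; your volume type and region will differ:

```python
# Provisioned-capacity billing vs usage-based billing: a 1 TiB volume
# that holds only 100 GB still bills for the full provisioned size.
RATE_PER_GB_MONTH = 0.08  # illustrative gp3-style rate

def provisioned_cost(provisioned_gb: int) -> float:
    """What you pay: the full provisioned capacity."""
    return provisioned_gb * RATE_PER_GB_MONTH

def thin_provisioned_cost(used_gb: int) -> float:
    """What usage-based (thin-provisioned) billing would charge."""
    return used_gb * RATE_PER_GB_MONTH

paid = provisioned_cost(1024)       # billed for 1 TiB
used = thin_provisioned_cost(100)   # actual data stored
print(f"overpayment: {paid - used:.2f} USD/month")
```

At 10% utilization, roughly 90% of the monthly storage bill pays for empty, pre-allocated capacity.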

Limited Capacity Scalability: While volumes can be grown in size, you can only increase the capacity once every 6 hours (at least on Amazon EBS). Therefore, you need to estimate upfront how large the volume will grow in that time frame. If you miscalculate, you'll run out of disk space.

No Replication Across Availability Zones: Volumes cannot be replicated across availability zones, limiting the high availability if there are issues with one of the availability zones. This can be mitigated using additional tools, but they invoke additional cost.

Missing Multi-Attach: Attaching a volume to multiple compute instances or containers offers shared access to a dataset. Depending on the volume type, there is no option to multi-attach a volume to multiple instances.

Limited Latency: Depending on volume type, the access latency of a volume is in the single- or double-digit millisecond range, and it may also fluctuate. If you have low and predictable latency requirements, you may be limited here.

Simplyblock, Elastic Block Storage for the Cloud-Age

Simplyblock is built from the ground up to break free of the typical cloud limitations and provide sub-millisecond predictable latency and virtually unlimited IOPS, while empowering you with scalability and a minimum of five nines (99.999%) availability.

Simple simplyblock setup with one storage node being connected to from three Kubernetes worker nodes via the NVMe over TCP protocol, offering virtual NVMe storage volumes

To overcome your typical cloud block storage service limitations, simplyblock implements a transparent and dynamic pooling of cloud-based block storage volumes, combining the individual drives into one large storage pool.

In its simplest form, block storage is pooled in a single, but separated storage node (typically a virtual machine). From this pooled storage, which can be considered a single, large virtual disk (or stripe), we carve out logical volumes. These logical volumes can differ in capacity and their particular performance characteristics (IOPS, throughput). All logical volumes are thin-provisioned, thus making more efficient use of raw disk space available to you.

On the client side (Linux and Windows), no additional drivers are required. Simplyblock's logical volumes are exported as NVMe devices using NVMe over Fabrics, the industry standard for fast SSD storage; the NVMe over TCP initiator is already part of the operating system kernel on Linux and in the latest versions of Windows Server. Simplyblock logical volumes are designed for ease of use and an out-of-the-box experience: simply partition them and format them with any operating-system-specific file system.

Additional data services include instant snapshotting of volumes (for fast backup) and instant cloning (for speed and storage efficiency) as well as storage replication across availability zones (for disaster recovery purposes). All of this is powered by the copy-on-write nature of the simplyblock storage engine.

Last but not least, logical volumes can also be multi-attached to multiple compute instances.

More importantly though, simplyblock has its own Kubernetes CSI driver for automated container storage lifecycle management under the Kubernetes ecosystem.

Scaling out the Simplyblock Storage Pool

If the processing power of a single storage node isn’t sufficient anymore, or high-availability is required, you will use a cluster. When operating as a cluster, multiple storage nodes are combined as a single virtual storage pool and compute instances are connected to all of them.

Simplyblock cluster being connected to Kubernetes workers through multi-pathing

In this scenario, a transparent online fail-over mechanism takes care of switching the connection of logical volumes from one node to another in case of connection issues. This mechanism (NVMe multi-pathing with ANA) is already built into the operating system kernels of Linux and Windows Server, therefore, no additional software is required on the clients.

It is important to note that clusters can be expanded by either increasing the attached block storage pools (on the storage nodes) or by adding additional storage nodes to the cluster. This expansion can happen online, doesn’t require any downtime, and eventually results in an automatically re-balanced storage cluster (background operation).

Simplyblock and Microsecond Latency

When double-digit microsecond latency is required, simplyblock can utilize client-local NVMe disks as caches.

Simplyblock cluster with client-local caches

In this case simplyblock can boost read IOPS and decrease read access latency to below 100 microseconds. To achieve this, you configure the local NVMe devices as a write-through (read) cache. Simplyblock client-local caches are deployed as containers. In a typical Kubernetes environment, this is done as part of the CSI driver deployment via the Helm chart. Caches are transparent to the compute instances and containers and look like any local NVMe storage access. They are, however, managed as part of the simplyblock cluster.

Simplyblock as Hyper-converged instead of Disaggregated

Simplyblock running in hyper-converged mode, alongside other services

If you have sufficient spare capacity (CPU, RAM resources) on your compute instances and don’t want to deploy additional, separated storage nodes, a hyper-converged setup can be chosen as well. A hyper-converged setup is more cost-efficient as no additional virtual server is required.

On the other hand, resources are shared between the services consuming storage and the simplyblock storage engine. While this isn't necessarily a problem, it requires some additional capacity planning on your end. Simplyblock generally recommends a disaggregated setup where storage and compute resources are strictly separated.

Simplyblock: Scalable Elastic Block Storage Made Easy

No matter which deployment strategy you choose, storage nodes are connected to simplyblock's hosted and managed control plane. The control plane is highly scalable and can serve thousands of storage clusters. If you deploy multiple clusters, the management infrastructure is shared between all of them.

Likewise, all of the shown deployment options are realized using different deployment configurations of the same core components. Additionally, none of the configurations requires special software on the client side.

Anyhow, simplyblock enables you to build your own elastic block storage, overcoming the shortcomings of typical cloud-provider services such as Amazon EBS. Simplyblock provides the following advantages:

  • Overcome the free IOPS limit, with single volumes benefiting from 100k or more free IOPS (compared to e.g. 3,000 for AWS gp3)
  • Reach read access latency lower than 100 microseconds
  • Multi-attach a volume to many instances
  • Replicate a volume across availability zones (synchronous and asynchronous replication)
  • Bring down the cost of capacity and increase storage efficiency multiple times via thin provisioning and copy-on-write clones
  • Scale the cluster according to your needs with zero-downtime cluster extension

If you want to learn more about simplyblock, see why simplyblock, or get started right away.


How the CSI (Container Storage Interface) Works https://www.simplyblock.io/blog/how-the-csi-container-storage-interface-works/ Fri, 29 Mar 2024 12:13:27 +0000 https://www.simplyblock.io/?p=302 If you work with persistent storage in Kubernetes, maybe you’ve seen articles about how to migrate from in-tree to CSI volumes, but aren’t sure what all the fuss is about? Or perhaps you’re trying to debug a stuck VolumeAttachment that won’t unmount from a node, holding up your important StatefulSet rollout? A clear understanding of […]

The post How the CSI (Container Storage Interface) Works appeared first on simplyblock.

If you work with persistent storage in Kubernetes, maybe you’ve seen articles about how to migrate from in-tree to CSI volumes, but aren’t sure what all the fuss is about? Or perhaps you’re trying to debug a stuck VolumeAttachment that won’t unmount from a node, holding up your important StatefulSet rollout? A clear understanding of what the Container Storage Interface (or CSI for short) is and how it works will give you confidence when dealing with persistent data in Kubernetes, allowing you to answer these questions and more!

Editorial: This blog post is written by a guest author, Steven Sklar from QuestDB. It appeared first on his private blog at sklar.rocks. We appreciate his contributions to the Kubernetes ecosystem and wanted to thank him for letting us repost his article. Steven, you rock! 🔥

The Container Storage Interface is an API specification that enables developers to build custom drivers which handle the provisioning, attaching, and mounting of volumes in containerized workloads. As long as a driver correctly implements the CSI API spec, it can be used in any supported Container Orchestration system, like Kubernetes. This decouples persistent storage development efforts from core cluster management tooling, allowing for the rapid development and iteration of storage drivers across the cloud native ecosystem.

In Kubernetes, the CSI has replaced legacy in-tree volumes with a more flexible means of managing storage mediums. Previously, in order to take advantage of new storage types, one would have had to upgrade an entire cluster’s Kubernetes version to access new PersistentVolume API fields for a new storage type. But now, with the plethora of independent CSI drivers available, you can add any type of underlying storage to your cluster instantly, as long as there’s a driver for it.

But what if existing drivers don’t provide the features that you require and you want to build a new custom driver? Maybe you’re concerned about the ramifications of migrating from in-tree to CSI volumes? Or, you simply want to learn more about how persistent storage works in Kubernetes? Well, you’re in the right place! This article will describe what the CSI is and detail how it’s implemented in Kubernetes.

It’s APIs all the way down

Like many things in the Kubernetes ecosystem, the Container Storage Interface is actually just an API specification. In the container-storage-interface/spec GitHub repo, you can find this spec in two different formats:

  1. A protobuf file that defines the API schema in gRPC terms
  2. A markdown file that describes the overall system architecture and goes into detail about each API call

What I’m going to discuss in this section is an abridged version of that markdown file, while borrowing some nice ASCII diagrams from the repo itself!

Architecture

A CSI Driver has two components, a Node Plugin and a Controller Plugin. The Controller Plugin is responsible for high-level volume management: creating, deleting, attaching, detaching, snapshotting, and restoring physical (or virtualized) volumes. If you're using a driver built for a cloud provider, like EBS on AWS, the driver's Controller Plugin communicates with AWS HTTPS APIs to perform these operations. For other storage types like NFS, iSCSI, ZFS, and more, the driver sends these requests to the underlying storage's API endpoint, in whatever format that API accepts.

Editorial: The same is true for simplyblock. Simplyblock’s CSI driver implements all necessary, and following described calls, making it a perfect drop-in replacement for Amazon EBS. If you want to learn more read: Why simplyblock.

On the other hand, the Node Plugin is responsible for mounting and provisioning a volume once it’s been attached to a node. These low-level operations usually require privileged access, so the Node Plugin is installed on every node in your cluster’s data plane, wherever a volume could be mounted.

The Node Plugin is also responsible for reporting metrics like disk usage back to the Container Orchestration system (referred to as the “CO” in the spec). As you might have guessed already, I’ll be using Kubernetes as the CO in this post! But what makes the spec so powerful is that it can be used by any container orchestration system, like Nomad for example, as long as it abides by the contract set by the API guidelines.

The specification doc provides a few possible deployment patterns, so let’s start with the most common one.

CO "Master" Host
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    | Controller |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    |    Node    |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+ 

Since the Controller Plugin is concerned with higher-level volume operations, it does not need to run on a host in your cluster’s data plane. For example, in AWS, the Controller makes AWS API calls like ec2:CreateVolume, ec2:AttachVolume, or ec2:CreateSnapshot to manage EBS volumes. These functions can be run anywhere, as long as the caller is authenticated with AWS. All the CO needs is to be able to send messages to the plugin over gRPC. So in this architecture, the Controller Plugin is running on a “master” host in the cluster’s control plane.

On the other hand, the Node Plugin must be running on a host in the cluster’s data plane. Once the Controller Plugin has done its job by attaching a volume to a node for a workload to use, the Node Plugin (running on that node) will take over by mounting the volume to a well-known path and optionally formatting it. At this point, the CO is free to use that path as a volume mount when creating a new containerized process; so all data on that mount will be stored on the underlying volume that was attached by the Controller Plugin. It’s important to note that the Container Orchestrator, not the Controller Plugin, is responsible for letting the Node Plugin know that it should perform the mount.

Volume Lifecycle

The spec provides a flowchart of basic volume operations, also in the form of a cool ASCII diagram:

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+             +---v----+---+             +-+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

Mounting a volume is a synchronous process: each step requires the previous one to have run successfully. For example, if a volume does not exist, how could we possibly attach it to a node?

When publishing (mounting) a volume for use by a workload, the Node Plugin first requires that the Controller Plugin has successfully published a volume at a directory that it can access. In practice, this usually means that the Controller Plugin has created the volume and attached it to a node. Now that the volume is attached, it’s time for the Node Plugin to do its job. At this point, the Node Plugin can access the volume at its device path to create a filesystem and mount it to a directory. Once it’s mounted, the volume is considered to be published and it is ready for a containerized process to use. This ends the CSI mounting workflow.

Continuing the AWS example, when the Controller Plugin publishes a volume, it calls ec2:CreateVolume followed by ec2:AttachVolume. These two API calls allocate the underlying storage by creating an EBS volume and attaching it to a particular instance. Once the volume is attached to the EC2 instance, the Node Plugin is free to format it and create a mount point on its host’s filesystem.

Here is an annotated version of the above volume lifecycle diagram, this time with the AWS calls included in the flow chart.

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+                 |    |                 +-+
   ec2:AttachVolume |    | ec2:DetachVolume
                    |    |
                +---v----+---+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

If a Controller wants to delete a volume, it must first wait for the Node Plugin to safely unmount the volume to preserve data and system integrity. Otherwise, if a volume is forcibly detached from a node before unmounting it, we could experience bad things like data corruption. Once the volume is safely unpublished (unmounted) by the Node Plugin, the Controller Plugin would then call ec2:DetachVolume to detach it from the node and finally ec2:DeleteVolume to delete it, assuming that you don’t want to reuse the volume elsewhere.
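These ordering constraints can be modeled as a tiny state machine. The Python sketch below is illustrative only (the RPC names come from the CSI spec; the `Volume` class is hypothetical) and rejects any call that arrives in the wrong state, which is exactly why a volume must be unpublished before it can be deleted:

```python
# The CSI volume lifecycle as a state machine: each RPC is only legal from one
# state. A state of None means the volume does not exist.
VALID_TRANSITIONS = {
    # (current_state, rpc) -> next_state
    (None, "CreateVolume"): "CREATED",
    ("CREATED", "ControllerPublishVolume"): "NODE_READY",
    ("NODE_READY", "NodePublishVolume"): "PUBLISHED",
    ("PUBLISHED", "NodeUnpublishVolume"): "NODE_READY",
    ("NODE_READY", "ControllerUnpublishVolume"): "CREATED",
    ("CREATED", "DeleteVolume"): None,
}

class Volume:
    def __init__(self):
        self.state = None  # volume does not exist yet

    def call(self, rpc):
        key = (self.state, rpc)
        if key not in VALID_TRANSITIONS:
            raise RuntimeError(f"{rpc} is not allowed in state {self.state}")
        self.state = VALID_TRANSITIONS[key]

vol = Volume()
vol.call("CreateVolume")
vol.call("ControllerPublishVolume")
vol.call("NodePublishVolume")
print(vol.state)  # PUBLISHED
# Calling DeleteVolume now would raise: the volume must be unpublished first.
```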

What makes the CSI so powerful is that it does not prescribe how to publish a volume. As long as your driver correctly implements the required API methods defined in the CSI spec, it will be compatible with the CSI and, by extension, usable in COs like Kubernetes and Nomad.

Running CSI Drivers in Kubernetes

What I haven’t entirely made clear yet is why the Controller and Node Plugins are plugins themselves! How does the Container Orchestrator call them, and where do they plug in?

Well, the answer depends on which Container Orchestrator you are using. Since I’m most familiar with Kubernetes, I’ll be using it to demonstrate how a CSI driver interacts with a CO.

Deployment Model

Since the Node Plugin, responsible for low-level volume operations, must be running on every node in your data plane, it is typically installed using a DaemonSet. If you have heterogeneous nodes and only want to deploy the plugin to a subset of them, you can use node selectors, affinities, or anti-affinities to control which nodes receive a Node Plugin Pod. Since the Node Plugin requires root access to modify host volumes and mounts, these Pods will be running in privileged mode. In this mode, the Node Plugin can escape its container’s security context to access the underlying node’s filesystem when performing mounting and provisioning operations. Without these elevated permissions, the Node Plugin could only operate inside of its own containerized namespace without the system-level access that it requires to provision volumes on the node.
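A hedged sketch of what such a Node Plugin DaemonSet might look like (the driver name, image, and labels here are hypothetical; real drivers such as the AWS EBS CSI driver ship their own manifests, usually with additional volumes and sidecar containers):

```yaml
# Hypothetical Node Plugin DaemonSet showing the privileged security context
# and host filesystem access described above.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-csi-node
spec:
  selector:
    matchLabels:
      app: example-csi-node
  template:
    metadata:
      labels:
        app: example-csi-node
    spec:
      containers:
      - name: csi-driver
        image: example.registry/example-csi-driver:v1.0.0
        securityContext:
          privileged: true  # needed to format and mount volumes on the host
        volumeMounts:
        - name: kubelet-dir
          mountPath: /var/lib/kubelet
          mountPropagation: Bidirectional  # mounts become visible to the kubelet
      volumes:
      - name: kubelet-dir
        hostPath:
          path: /var/lib/kubelet
```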

The Controller Plugin is usually run in a Deployment because it deals with higher-level primitives like volumes and snapshots, which don’t require filesystem access to every single node in the cluster. Again, let’s think about the AWS example I used earlier: if the Controller Plugin is just making AWS API calls to manage volumes and snapshots, why would it need access to a node’s root filesystem? Most Controller Plugins are stateless and highly available, both of which lend themselves to the Deployment model. The Controller also does not need to run in a privileged context.

Event-Driven Sidecar Pattern

Now that we know how CSI plugins are deployed in a typical cluster, it’s time to focus on how Kubernetes calls each plugin to perform CSI-related operations. A set of sidecar containers, registered with the Kubernetes API server to react to different events across the cluster, is deployed alongside each Controller and Node Plugin. In a way, this is similar to the typical Kubernetes controller pattern, where controllers react to changes in cluster state and attempt to reconcile the current state with the desired one.

There are currently 6 different sidecars that work alongside each CSI driver to perform specific volume-related operations. Each sidecar registers itself with the Kubernetes API server and watches for changes in a specific resource type. Once the sidecar has detected a change that it must act upon, it calls the relevant plugin with one or more API calls from the CSI specification to perform the desired operations.

Controller Plugin Sidecars

Here is a table of the sidecars that run alongside a Controller Plugin:

Sidecar Name          K8s Resources Watched       CSI API Endpoints Called
external-provisioner  PersistentVolumeClaim       CreateVolume, DeleteVolume
external-attacher     VolumeAttachment            Controller(Un)PublishVolume
external-snapshotter  VolumeSnapshot (Content)    CreateSnapshot, DeleteSnapshot
external-resizer      PersistentVolumeClaim       ControllerExpandVolume
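The watch-and-translate behavior in this table can be sketched as a simple dispatch mechanism. This Python sketch is purely illustrative (the `RecordingPlugin` class is a hypothetical stand-in for a plugin reached over gRPC):

```python
# Sketch of the sidecar pattern: each sidecar watches one Kubernetes resource
# type and translates events on it into CSI RPCs against the plugin.
WATCH_TABLE = {
    # resource kind -> (sidecar that watches it, CSI RPC it triggers on create)
    "PersistentVolumeClaim": ("external-provisioner", "CreateVolume"),
    "VolumeAttachment": ("external-attacher", "ControllerPublishVolume"),
    "VolumeSnapshot": ("external-snapshotter", "CreateSnapshot"),
}

class RecordingPlugin:
    """Stand-in for a Controller Plugin reached over gRPC."""
    def __init__(self):
        self.calls = []

    def call(self, rpc, resource_name):
        self.calls.append((rpc, resource_name))

def handle_create_event(plugin, resource_kind, resource_name):
    """Dispatch a 'resource created' event to the right CSI RPC."""
    sidecar, rpc = WATCH_TABLE[resource_kind]
    plugin.call(rpc, resource_name)
    return sidecar

plugin = RecordingPlugin()
sidecar = handle_create_event(plugin, "PersistentVolumeClaim", "www-0")
print(sidecar, plugin.calls)  # external-provisioner [('CreateVolume', 'www-0')]
```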

How do these sidecars work together? Let’s use an example of a StatefulSet to demonstrate. In this example, we’re dynamically provisioning our PersistentVolumes (PVs) instead of mapping PersistentVolumeClaims (PVCs) to existing PVs. We start at the creation of a new StatefulSet with a VolumeClaimTemplate.

---
apiVersion: apps/v1
kind: StatefulSet
spec:
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
         storage: 1Gi

Creating this StatefulSet will trigger the creation of a new PVC based on the above template. Once the PVC has been created, the Kubernetes API will notify the external-provisioner sidecar that this new resource was created. The external-provisioner will then send a CreateVolume message to its neighbor Controller Plugin over gRPC. From here, the CSI driver’s Controller Plugin takes over by processing the incoming gRPC message and will create a new volume based on its custom logic. In the AWS EBS driver, this would be an ec2:CreateVolume call.

At this point, the control flow moves to the built-in PersistentVolume controller, which will create a matching PV and bind it to the PVC. This allows the StatefulSet’s underlying Pod to be scheduled and assigned to a Node.

Here, the external-attacher sidecar takes over. It will be notified of the new PV and call the Controller Plugin’s ControllerPublishVolume endpoint, attaching the volume to the StatefulSet’s assigned node. This is the equivalent of ec2:AttachVolume in AWS.

At this point, we have an EBS volume that is attached to an EC2 instance, all based on the creation of a StatefulSet, a PersistentVolumeClaim, and the work of the AWS EBS CSI Controller Plugin.

Node Plugin Sidecars

There is only one unique sidecar deployed alongside the Node Plugin: the node-driver-registrar. This sidecar, running as part of a DaemonSet, registers the Node Plugin with the node’s kubelet. During the registration process, the Node Plugin informs the kubelet that it is able to mount volumes using its CSI driver. The kubelet itself will then wait until a Pod is scheduled to its corresponding node, at which point it is responsible for making the relevant CSI calls (NodePublishVolume) to the Node Plugin over gRPC.

Common Sidecars

There is also a livenessprobe sidecar that runs in both the Controller and Node Plugin Pods. It monitors the health of the CSI driver and reports back to the Kubernetes Liveness Probe mechanism.

Communication over Sockets

How do these sidecars communicate with the Controller and Node Plugins? Over gRPC, through a shared socket! Each sidecar and plugin container has a volume mount pointing to the same unix socket.
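To make the shared-socket idea concrete, here is a minimal Python sketch. It uses a raw Unix domain socket rather than gRPC, and the socket path and message are made up, but the transport mechanism is the same one the sidecars and plugins share:

```python
# A "plugin" thread and a "sidecar" client exchanging one message over a
# shared Unix domain socket path, the same transport CSI uses (with gRPC).
import os
import socket
import tempfile
import threading

sock_path = os.path.join(tempfile.mkdtemp(), "csi.sock")

# The plugin binds and listens on the shared socket path,
# like /csi/csi.sock inside a real driver Pod.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(sock_path)
srv.listen(1)

def plugin_server():
    conn, _ = srv.accept()
    with conn:
        request = conn.recv(1024).decode()
        conn.sendall(f"ok: {request}".encode())
    srv.close()

t = threading.Thread(target=plugin_server)
t.start()

# The "sidecar" connects to the same path and sends a request.
with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
    cli.connect(sock_path)
    cli.sendall(b"CreateVolume")
    reply = cli.recv(1024).decode()
t.join()
print(reply)  # prints "ok: CreateVolume"
```

In a real Pod, the directory holding the socket is a shared `emptyDir` or `hostPath` volume mounted into both the plugin container and each sidecar container.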

CSI Controller Deployment

This diagram highlights the pluggable nature of CSI Drivers. To replace one driver with another, all you have to do is swap the CSI Driver container and ensure that it’s listening on the unix socket that the sidecars send gRPC messages to. Because all drivers advertise their own capabilities and communicate over the shared CSI API contract, it’s literally a plug-and-play solution.

Conclusion

In this article, I only covered the high-level concepts of the Container Storage Interface spec and implementation in Kubernetes. While hopefully it has provided a clearer understanding of what happens once you install a CSI driver, writing one requires significant low-level knowledge of both your nodes’ operating system(s) and the underlying storage mechanism that your driver is implementing. Luckily, CSI drivers exist for a variety of cloud providers and distributed storage solutions, so it’s likely that you can find a CSI driver that already fulfills your requirements. But it always helps to know what’s happening under the hood in case your particular driver is misbehaving.

If this article interests you and you want to learn more about the topic, please let me know! I’m always happy to answer questions about CSI Drivers, Kubernetes Operators, and a myriad of other DevOps-related topics.

The post How the CSI (Container Storage Interface) Works appeared first on simplyblock.
