Cloud Spending Archives | simplyblock

AWS Cost Management: Strategies for Right-Sizing Storage in Dynamic Environments

For companies that use Amazon Web Services (AWS) for storage, firm control over costs is key. In a fast-paced world with fluctuating workloads, mismanaged storage leads directly to soaring expenses and unexpected losses. A reliable and effective way to keep costs in check is to implement efficient storage solutions at the right size, ensuring solid growth and performance for AWS-reliant businesses.

This article looks at several proven ways to manage storage costs on AWS and gives you practical tips for sizing storage in dynamic environments.

What Does Right-Sizing Mean for AWS Storage?

Right-sizing in AWS means choosing storage types and capacities that match the particular needs of a business. It is a major strategic lever that can help you achieve and maintain a competitive lead by avoiding unnecessary costs and putting a stop to overspending. All it takes is actively monitoring storage policies, tracking how much provisioned storage goes unused, and making prompt changes accordingly.

AWS offers various storage types, including Amazon S3 for object storage, Amazon EBS for block storage, and Amazon EFS for file storage, each suitable for different applications. By right-sizing, businesses can avoid paying for idle storage resources and only use what’s necessary.

AWS Storage Services and The Cost-Savings They Offer

With AWS, you get a few storage options at different price points that you can choose from based on your business needs:

  • Amazon S3 (Simple Storage Service) offers an incredible amount of scalability, allowing growing businesses to adapt well. It works well for unstructured data and uses a pay-as-you-go model, which keeps costs down when storage needs change.
  • Amazon EBS (Elastic Block Store) provides persistent block storage for EC2 instances. EBS prices depend on the volume type, its size, and input/output operations, so usage needs monitoring to keep expenses in check.
  • Amazon EFS (Elastic File System) is a managed file storage service that scales automatically, which helps applications that need shared storage. While it’s convenient, costs can rise as data volume grows.

To reduce overall cloud spending, it’s essential to understand which storage type suits your workloads and to manage each of these services deliberately.

Editor’s note: If you’re looking for ways to consolidate your Amazon EBS volumes, simplyblock has you covered.

Ways to Optimize Storage Size on AWS

1. Use Storage Class Levels

Amazon S3 offers various storage classes with different costs and speeds. You can save money by placing data in the appropriate class based on access frequency and retrieval speed needs. Here’s a breakdown:

  • S3 Standard is best for frequently accessed data, but it’s the most expensive.
  • S3 Infrequent Access (IA) is cheaper for less-used data that still needs quick retrieval.
  • S3 Glacier and Glacier Deep Archive are the least expensive options for long-term archival data that is rarely accessed.

You can cut costs without losing access by reviewing and moving data to suitable storage classes based on usage patterns.


2. Set Up Data Lifecycle Rules

Managing data lifecycles helps companies with changing storage needs to save money. AWS lets you make rules to move, store, or delete data based on certain conditions. With S3 lifecycle policies, you can set up your data to move from S3 Standard to S3 IA and then to Glacier or be removed after a set time.
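
For illustration, here is how such a lifecycle policy might be created with boto3, the AWS SDK for Python. This is a minimal sketch: the bucket name, prefix, and day thresholds are hypothetical placeholders, not recommendations.

```python
# pip install boto3; assumes AWS credentials are already configured.
import boto3

s3 = boto3.client("s3")

# Tier objects under logs/ down to cheaper classes as they age,
# then expire them entirely after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-and-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```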

3. Use Automatic Tracking and Warnings

AWS offers tools to keep an eye on storage use and costs, like AWS CloudWatch and AWS Budgets. These tools help spot wasted resources, odd spikes in use, or costs that go over your set budget. Setting up warnings through AWS Budgets can tell you when you’re close to your budget limit and stop extra costs before they pile up.
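
As a rough sketch of what such a warning looks like in practice, the snippet below creates a monthly cost budget with an 80% alert threshold via boto3. The account ID, budget amount, and email address are hypothetical.

```python
import boto3

budgets = boto3.client("budgets")

budgets.create_budget(
    AccountId="123456789012",  # hypothetical AWS account ID
    Budget={
        "BudgetName": "monthly-storage-budget",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            # Alert when actual spend crosses 80% of the budget.
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "ops@example.com"},
            ],
        }
    ],
)
```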

4. Make EBS Volumes the Right Size


Elastic Block Store (EBS) volumes often waste resources when they’re bigger than needed. Checking EBS usage regularly can reveal volumes that are barely used or not used at all. AWS provides EBS right-sizing recommendations (for example, through AWS Compute Optimizer) that help find volumes you can shrink without slowing things down.

EBS provides volume types such as General Purpose (gp3), Provisioned IOPS (io2), and Throughput Optimized (st1). Picking the right volume type for each workload, along with correct sizing, cuts costs: you pay only for the storage performance and capacity you actually need.
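
A simple audit loop like the sketch below (again boto3, using only your configured credentials) can surface obvious candidates: unattached volumes, and older gp2 volumes that are usually cheaper when migrated to gp3.

```python
import boto3

ec2 = boto3.client("ec2")

# Walk all EBS volumes and flag likely right-sizing candidates.
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate():
    for vol in page["Volumes"]:
        unattached = not vol.get("Attachments")
        if unattached or vol["VolumeType"] == "gp2":
            print(
                vol["VolumeId"],
                vol["VolumeType"],
                f"{vol['Size']} GiB",
                "UNATTACHED" if unattached else "attached",
            )
```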

Editor’s note: Save up to 80% on your high-performance AWS storage costs.

Smart Ways to Cut AWS Storage Costs Further


1. Use Reserved Instances and Savings Plans

For workloads you can predict, consider AWS Reserved Instances (RIs) and Savings Plans. People often associate these with EC2 instances; they don’t discount EBS storage directly, but a one- or three-year commitment lowers the cost of the compute that storage-heavy deployments run on. This works best for steady workloads that need the same amount of storage over time, where you’re less likely to buy too much.

Savings shouldn’t be limited to storage, either. You can also reconsider other costs, such as switching to a cheaper web hosting service that still meets your business needs.

2. Make Multi-Region Storage More Efficient

AWS gives you ways to replicate your data across different regions. This makes your data more durable and helps you recover if something goes wrong. But storing data in multiple regions can cost a lot because of replication and inter-region data transfer. To cut these costs, look at how people use your data and place it in regions close to most of your users.

3. Consider Spot Instances for Short-Term Storage Needs

Spot Instances offer a more affordable option to handle tasks that can cope with interruptions. You can use short-term storage on Spot Instances for less crucial brief projects where storage requirements fluctuate. When you combine Spot Instances with Amazon EBS or S3, you gain flexibility and cut costs. However, remember that AWS has the right to reclaim Spot Instances at any moment. This makes them unsuitable for critical or high-availability tasks.

Summing Up: Managing AWS Storage Costs


Smart AWS cost control begins with a hands-on strategy for sizing storage. This includes picking the right S3 storage classes, setting up lifecycle rules, keeping an eye on EBS usage, and taking advantage of reserved options. These methods can help you keep a lid on your storage bills.

When you check usage and put these tried-and-true tips into action, you’ll be in a better position to handle your AWS expenses. At the same time, you’ll keep the ability to scale and the reliability your workloads need. In a cloud world where storage costs can get out of hand, clever management will pay off. It’ll help your company stay nimble and budget-friendly.

RDS vs. EKS: The True Cost of Database Management

Databases can make up a significant portion of the costs for a variety of businesses and enterprises, and in particular for SaaS, Fintech, or E-commerce & Retail verticals. Choosing the right database management solution can make or break your business margins. But have you ever wondered about the true cost of your database management? Is your current solution really as cost-effective as you think? Let’s dive deep into the world of database management and uncover the hidden expenses that might be eating away at your bottom line.

The Database Dilemma: Managed Services or Self-Managed?

The first crucial decision comes when choosing the operating model for your databases: should you opt for managed services like AWS RDS or take the reins yourself with a self-managed solution on Kubernetes? It’s not just about the upfront costs – there’s a whole iceberg of expenses lurking beneath the surface.

The Allure of Managed Services

At first glance, managed services like AWS RDS seem to be a no-brainer. They promise hassle-free management, automatic updates, and round-the-clock support. But is it really as rosy as it seems?

The Visible Costs

  1. Subscription Fees: You’re paying for the convenience, and it doesn’t come cheap.
  2. Storage Costs: Every gigabyte counts, and it adds up quickly.
  3. Data Transfer Fees: Moving data in and out? Be prepared to open your wallet.

The Hidden Expenses

  1. Overprovisioning: Are you paying for more than you actually use?
  2. Personnel Costs: Using RDS and assuming you don’t need to understand databases anymore? Surprise! You still need a team to configure the database and set it up for your requirements.
  3. Performance Limitations: When you hit a ceiling, scaling up can be costly.
  4. Vendor Lock-in: Switching providers? That’ll cost you in time and money.
  5. Data Migration: Moving data between services can cost a fortune.
  6. Backup and Storage: Those “convenient” backups? They’re not free. In addition, AWS RDS does not let you plug in any storage solution other than AWS-native EBS volumes, which can get quite expensive if your database is IO-intensive.

The Power of Self-Managed Kubernetes Databases

On the flip side, managing your databases on Kubernetes might seem daunting at first. But let’s break it down and see where you could be saving big.

Initial Investment

  1. Learning Curve: Yes, there’s an upfront cost in time and training. You need engineers on your team who are comfortable with Kubernetes or Amazon EKS.
  2. Setup and Configuration: Getting things right takes effort, but it pays off.

Long-term Savings

  1. Flexibility: Scale up or down as needed, without overpaying.
  2. Multi-Cloud Freedom: Avoid vendor lock-in and negotiate better rates.
  3. Resource Optimization: Use your hardware efficiently across workloads.
  4. Resource Sharing: Kubernetes lets you efficiently allocate resources.
  5. Open-Source Tools: Leverage free, powerful tools for monitoring and management.
  6. Customization: Tailor your setup to your exact needs, no compromise.

Where Do the Savings Come from When Using Kubernetes for Database Management?

In a self-managed Kubernetes environment, you have greater control over resource allocation, leading to improved utilization and efficiency. Here’s why:

a) Dynamic Resource Allocation: Kubernetes allows fine-grained control over CPU and memory. You can set resource requests and limits at the pod level, ensuring databases only use what they need (see the sketch after this list). Example: During off-peak hours, you can automatically scale down resources, whereas in managed services you often pay for fixed resources 24/7.

b) Bin Packing: The Kubernetes scheduler efficiently packs containers onto nodes, maximizing resource usage. This means you can run more workloads on the same hardware, reducing overall infrastructure costs. Example: You might be able to run both your database and application containers on the same node, optimizing server usage.

c) Avoid Overprovisioning: With managed services, you often need to provision for peak load at all times. In Kubernetes, you can use Horizontal Pod Autoscaling to add resources only when needed. Example: During a traffic spike, you can automatically add more database replicas, then scale down when the spike ends.

d) Resource Quotas: Kubernetes allows setting resource quotas at the namespace level, preventing any single team or application from monopolizing cluster resources. This leads to more efficient resource sharing across your organization.
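
To make point (a) concrete, here is a minimal sketch using the official Kubernetes Python client. The container name, image, and resource figures are hypothetical and would need tuning for a real database.

```python
# pip install kubernetes
from kubernetes import client

# A PostgreSQL container with explicit requests (what the scheduler
# guarantees and uses for bin packing) and limits (a hard ceiling).
postgres = client.V1Container(
    name="postgres",
    image="postgres:16",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "2Gi"},
        limits={"cpu": "2", "memory": "4Gi"},
    ),
)

pod_spec = client.V1PodSpec(containers=[postgres])
```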

Self-managed Kubernetes databases can also significantly reduce data transfer costs compared to managed services. Here’s how:

a) Co-location of Services: In Kubernetes, you can deploy your databases and application services in the same cluster. This reduces or eliminates data transfer between zones or regions, which is often charged in managed services. Example: If your app and database are in the same Kubernetes cluster, inter-service communication doesn’t incur data transfer fees.

b) Efficient Data Replication: Kubernetes allows for more control over how and when data is replicated. You can optimize replication strategies to reduce unnecessary data movement. Example: You might replicate data during off-peak hours or use differential backups to minimize data transfer.

c) Avoid Provider Lock-in: Managed services often charge for data egress, especially when moving to another provider. With self-managed databases, you have the flexibility to choose the most cost-effective data transfer methods. Example: You could use direct connectivity options or content delivery networks to reduce data transfer costs between regions or clouds.

d) Optimized Backup Strategies: Self-managed solutions allow for more control over backup processes. You can implement incremental backups or use deduplication techniques to reduce the amount of data transferred for backups. Example: Instead of full daily backups (common in managed services), you might do weekly full backups with daily incrementals, significantly reducing data transfer.

e) Multi-Cloud Flexibility: Self-managed Kubernetes databases allow you to strategically place data closer to where it’s consumed. This can reduce long-distance data transfer costs, which are often higher. Example: You could have a primary database in one cloud and read replicas in another, optimizing for both performance and cost.

By leveraging these strategies in a self-managed Kubernetes environment, organizations can significantly optimize their resource usage and reduce data transfer costs, leading to substantial savings compared to typical managed database services.

Breaking down the Numbers: A Cost Comparison between PostgreSQL on RDS vs. EKS

Let’s get down to brass tacks. How do the costs really stack up? We’ve crunched the numbers for a small Postgres database, comparing the managed RDS service with self-hosting on Kubernetes. For Kubernetes, we use EC2 instances with local NVMe disks, managed on EKS, with simplyblock as the storage orchestration layer.

Scenario: 3TB Postgres Database with High Availability (3 nodes) and Single AZ Deployment

Managed Service (AWS RDS) Using Three db.m4.2xlarge On-Demand Instances with gp3 Volumes

Available resources:

  • vCPU: 8
  • Memory: 32 GiB
  • Storage: 3 TB
  • IOPS: 20,000 per volume
  • Storage latency: 1-2 milliseconds

Costs:

  • Monthly total: $2,511.18
  • 3-year total: $2,511.18 x 36 months = $90,402

Editor’s note: See the pricing calculator for Amazon RDS for PostgreSQL.

Self-Managed on Kubernetes (EKS) Using Three i3en.xlarge On-Demand Instances

Available resources:

  • vCPU: 12
  • Memory: 96 GiB
  • Storage: 3.75 TB usable (7.5 TB raw, with an assumed 50% data protection overhead for simplyblock)
  • IOPS: 200,000 per volume (10x more than with RDS)
  • Storage latency: below 200 microseconds (local NVMe disks orchestrated by simplyblock)

Costs:

  • Monthly instance cost: $989.88
  • Monthly storage orchestration cost (e.g., simplyblock): $90 (3 TB x $30/TB)
  • Monthly EKS cost: $219 ($73 per cluster x 3)
  • Monthly total: $1,298.88
  • 3-year total: $1,298.88 x 36 months = $46,759

Base savings: $90,402 – $46,759 = $43,643 (48% over 3 years)

That’s a whopping 48% saving over three years! But wait, there’s more to consider. We have made some simplistic assumptions to estimate additional benefits of self-hosting to showcase the real potential of savings. While the actual efficiencies may vary from company to company, it should at least give a good understanding of where the hidden benefits might lie.
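
For readers who want to sanity-check the base comparison, the arithmetic can be reproduced in a few lines of Python; every figure comes straight from the tables above.

```python
rds_monthly = 2511.18                # 3x db.m4.2xlarge + gp3 volumes
eks_monthly = 989.88 + 90.0 + 219.0  # instances + simplyblock + EKS

months = 36
rds_total = rds_monthly * months     # ~ $90,402
eks_total = eks_monthly * months     # ~ $46,760
savings = rds_total - eks_total

print(f"3-year savings: ${savings:,.0f} ({savings / rds_total:.0%})")
# -> 3-year savings: $43,643 (48%)
```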

Additional Benefits of Self-Hosting (Estimated Annual Savings)

  1. Resource Optimization/Sharing: Assumption: 20% better resource utilization (assuming existing Kubernetes clusters). Estimated annual saving: 20% x $989.88 x 12 ≈ $2,375
  2. Reduced Data Transfer Costs: Assumption: 50% reduction in data transfer fees. Estimated annual saving: $2,000
  3. Flexible Scaling: Avoid over-provisioning during non-peak times. Estimated annual saving: $3,000
  4. Multi-Cloud Strategy: Ability to negotiate better rates across providers. Estimated annual saving: $5,000
  5. Open-Source Tools: Reduced licensing costs for management tools. Estimated annual saving: $4,000

Disaster Recovery Insights

  • RTO (Recovery Time Objective) Improvement: Self-managed setups offer potentially 40% faster recovery. Estimated value: $10,000 per hour of downtime prevented.
  • RPO (Recovery Point Objective) Enhancement: Self-managed setups can achieve near-zero data loss. Estimated annual value: $20,000 in potential data loss prevention.

Total Estimated Annual Benefit of Self-Hosting

Self-hosting pays off. Here is the summary of estimated benefits:

  • Base savings: ≈$14,548/year ($43,643 over 3 years)
  • Additional benefits: $16,375/year
  • Disaster recovery improvements: $30,000/year (conservative estimate)

Total estimated annual benefit: ≈$60,923

Total estimated benefit over 3 years: ≈$182,769

Note: These figures are estimates and can vary based on specific use cases, implementation efficiency, and negotiated rates with cloud providers.

Beyond the Dollar Signs: the Real Value Proposition

Money talks, but it’s not the only factor in play. Let’s look at the broader picture.

Performance and Scalability

With self-managed Kubernetes databases, you’re in the driver’s seat. Need to scale up for a traffic spike? Done. Want to optimize for a specific workload? You’ve got the power.

Security and Compliance

Think managed services have the upper hand in security? Think again. With self-managed solutions, you have granular control over your security measures. Plus, you’re not sharing infrastructure with unknown entities.

Innovation and Agility

In the fast-paced tech world, agility is king. Self-managed solutions on Kubernetes allow you to adopt cutting-edge technologies and practices without waiting for your provider to catch up.

Is the Database on Kubernetes for Everyone?

Definitely not. While self-managed databases on Kubernetes offer significant benefits in terms of cost savings, flexibility, and control, they’re not a one-size-fits-all solution. Here’s why:

  • Expertise: Managing databases on Kubernetes demands a high level of expertise in both database administration and Kubernetes orchestration. Not all organizations have this skill set readily available. Self-management means taking on responsibilities like security patching, performance tuning, and disaster recovery planning. For smaller teams or those with limited DevOps resources, this can be overwhelming.
  • Scale of Operations: For simple applications with predictable, low-to-moderate database requirements, the advanced features and flexibility of Kubernetes might be overkill. Managed services could be more cost-effective in these scenarios. The same applies to very small operations or early-stage startups: the cost benefits of self-managed databases on Kubernetes might not outweigh the added complexity and resource requirements.

While database management on Kubernetes offers compelling advantages, organizations must carefully assess their specific needs, resources, and constraints before making the switch. For many, especially larger enterprises or those with complex, dynamic database requirements, the benefits can be substantial. However, others might find that managed services better suit their current needs and capabilities.

Bonus: Simplyblock

There is one more bonus benefit you get when running your databases in Kubernetes: you can add simplyblock as your storage orchestration layer behind a single CSI driver that automatically and intelligently serves the storage service of your choice. Do you need a fast NVMe cache for hot transactional data with random IO, but don’t want to keep it hot forever? We’ve got you covered!

Simplyblock is an innovative cloud-native storage product, which runs on AWS as well as other major cloud platforms. Simplyblock virtualizes, optimizes, and orchestrates existing cloud storage services (such as Amazon EBS or Amazon S3) behind an NVMe storage interface and a Kubernetes CSI driver. As such, it provides storage for compute instances (VMs) and containers. We have optimized it for IO-heavy database workloads, including OLTP relational databases, graph databases, non-relational document databases, analytical databases, fast key-value stores, vector databases, and similar solutions.

This optimization has been built from the ground up to orchestrate a wide range of database storage needs, such as reliable and fast (high write-IOPS) storage for write-ahead logs and support for ultra-low latency, as well as high IOPS for random read operations. Simplyblock is highly configurable to optimally serve the different database query engines.

Some of the key benefits of using simplyblock alongside your stateful Kubernetes workloads are:

  • Cost Reduction, Margin Increase: Thin provisioning, compression, deduplication of hot-standby nodes, and storage virtualization with multiple tenants increases storage usage while enabling gradual storage increase.
  • Easy Scalability of Storage: Single node databases require highly scalable storage (IOPS, throughput, capacity) since data cannot be distributed to scale. Simplyblock pools either Amazon EBS volumes or local instance storage from EC2 virtual machines and provides a scalable and cost effective storage solution for single node databases.
  • Enables Database Branching Features: Using instant snapshots and clones, databases can be quickly branched out and provided to customers. Due to copy-on-write, the storage usage doesn’t increase unless the data is changed on either the primary or branch. Customers could be charged for “additional storage” though.
  • Enhances Security: Using an S3-based streaming of a recovery journal, the database can be quickly recovered from full AZ and even region outages. It also provides protection against typical ransomware attacks where data gets encrypted by enabling Point-in-Time-Recovery down to a few hundred milliseconds granularity.

Conclusion: the True Cost Revealed

When it comes to database management, the true cost goes far beyond the monthly bill. By choosing a self-managed Kubernetes solution, you’re not just saving money – you’re investing in flexibility, performance, and future-readiness. The savings and benefits will always be use-case and company-specific, but the general conclusion remains unchanged. While operating databases in Kubernetes is not for everyone, for those in a position to make the choice, it should be a no-brainer.

Is managing databases on Kubernetes complex?

While there is a learning curve, modern tools and platforms like simplyblock significantly simplify the process, often making it more straightforward than dealing with the limitations of managed services. Moreover, the knowledge acquired in the process can be reused across deployments in different clouds.

How can I ensure high availability with self-managed databases?

Kubernetes offers robust features for high availability, including automatic failover and load balancing. With proper configuration, you can achieve even higher availability than many managed services offer, meeting any possible SLA out there. You are in full control of the SLAs.

How difficult is it to migrate from a managed database service to Kubernetes?

While migration requires careful planning, tools and services exist to streamline the process. Many companies find that the long-term benefits far outweigh the short-term effort of migration.

How does simplyblock handle database backups and point-in-time recovery in Kubernetes?

Simplyblock provides automated, space-efficient backup solutions that integrate seamlessly with Kubernetes. Our point-in-time recovery feature allows you to restore your database to any specific moment, offering protection against data loss and ransomware attacks.

Does simplyblock offer support for multiple database types?

Yes, simplyblock supports a wide range of database types including relational databases like PostgreSQL and MySQL, as well as NoSQL databases like MongoDB and Cassandra. Check out our “Supported Technologies” page for a full list of supported databases and their specific features.

What are AWS Credits and how to get them?

AWS Credits are promotional credits provided by Amazon Web Services (AWS) to help offset the costs of using AWS cloud services. These credits can be applied to a wide range of AWS services, such as compute, storage, or databases, allowing businesses to explore and expand their cloud infrastructure without incurring high upfront costs. AWS Credits are particularly valuable for startups and smaller companies looking to scale efficiently and affordably.

How do AWS Credits Work?

AWS credits function as a monetary balance that can be used to pay for AWS services. When you consume AWS resources, the cost is automatically deducted from your available credits until they are exhausted. These credits can cover various AWS services, including compute, storage, and data transfer costs, providing flexibility and significant savings. The lifetime of credits is usually 12 months and they are only valid for AWS services (AWS Marketplace offerings are not eligible for use of AWS Credits).

Free AWS Startup Credits

AWS Startup Credits are specifically designed to support early-stage companies in their cloud journey. These credits are typically part of the AWS Activate Program, which offers tailored packages for startups:

  1. AWS Activate Founders: Designed for bootstrapped and self-funded startups, offering up to $1,000 in credits.
  2. AWS Activate Portfolio: Available to startups in select accelerators, incubators, and VC funds, offering up to $100,000 in credits.

These credits help startups reduce their operational costs while leveraging AWS’s robust cloud infrastructure to innovate and scale. Eligibility criteria for startups applying for AWS Activate Portfolio Credits are:

  • Have not received funding beyond Series A (meaning that you are a bootstrapped, pre-seed, seed or Series A startup)
  • Not have exceeded $100,000 in awarded or redeemed AWS credits from AWS Activate
  • Active AWS account, company website, and LinkedIn page
  • Company has been incorporated in the last 10 years

How to get free AWS Credits?

There are several ways to obtain AWS credits:

  1. AWS Activate Program: This program is designed for startups and offers various credit packages based on the startup’s stage and needs. It provides up to $100,000 in AWS credits for eligible startups. More information can be found on the AWS Activate page.
  2. VCs & Institutional Investors: Most Tier 1 and Tier 2 VCs have partnered with AWS to provide credits to their portfolio companies. If your investor doesn’t know about this or hasn’t partnered with AWS yet, it is worth encouraging them to do so, as credits benefit both the startup (which gets free credits) and the investor. Investors can demonstrate added value to startups while having their investment used more efficiently (i.e., the startup doesn’t have to spend it on cloud costs).
  3. AWS Promotional Credits: AWS often provides promotional credits at conferences, webinars, and events. Attending these can be a great way to earn extra credits.
  4. AWS Competitions and Hackathons: Participating in AWS-sponsored competitions and hackathons can also be a way to earn credits.
  5. AWS Free Tier: New AWS customers can access the AWS Free Tier, which offers limited free usage of certain AWS services for 12 months. Check out the AWS Free Tier page for more details.
  6. Educational Institutions: Students and educators can often get free AWS credits through programs like AWS Educate.
  7. AWS Community Programs: A range of community programs designed to engage and support individuals and organizations working with AWS technologies: AWS Heroes, AWS Community Builders, AWS Cloud Captains, and AWS User Group Leaders.
  8. AWS Cloud Credits for Research: Designed to assist researchers and scientists by providing cloud credits for AWS services. You need to be a full-time faculty member, full-time research staff, or a graduate, post-graduate, or PhD student currently enrolled at an accredited research institution. Students can be awarded up to $5,000 in AWS Credits.
  9. AWS Credits for Nonprofits: Access to a maximum of $5,000 in AWS Promotional Credit to help alleviate the costs of adopting cloud-based solutions, enabling nonprofits to pursue their mission goals without initial investments in physical infrastructure.
  10. Ask someone who works for AWS: Although rarely listed as an official channel, asking an AWS employee to help you get credits is a genuinely effective way to obtain extra credits (e.g., in case your VC doesn’t offer you $100,000 and you have already used up their initial allocation).

It’s worth noting that you can receive AWS Credits multiple times, from multiple sources and partners. However, AWS credits are not cumulative: if a new award exceeds your previous credits, you only receive the difference.

For example, if you already received $5,000 in AWS Activate credits and applied for $25,000 with a VC partner, you will receive just $20,000 in additional credits. If you later get $100,000 from your new investor, then your credits will be topped with additional $75,000 ($100,000 less $25,000 already received). The lifetime value of credits is usually $100,000, which means that once you use up $100,000 in AWS credits, you can’t get any extra credits anymore (at least through above described channels).
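
The top-up rule described above is easy to model; the helper below is a hypothetical illustration based on the examples in this post, not an official AWS calculation.

```python
def additional_credits(already_received: float, new_award: float) -> float:
    """AWS credits are not cumulative: a new award only grants the
    difference over what you have already received (never negative)."""
    return max(0.0, new_award - already_received)

print(additional_credits(5_000, 25_000))    # 20000.0 (VC partner tops up)
print(additional_credits(25_000, 100_000))  # 75000.0 (new investor tops up)
```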

How to Check AWS Credits?

To check your AWS credits balance:

  1. Sign in to your AWS Account: Navigate to the AWS Billing and Cost Management Dashboard.
  2. View Your Credits: In the dashboard, you’ll find a section detailing your available credits, their expiration dates, and usage history.
  3. AWS Cost Explorer: Use AWS Cost Explorer to track how your credits are being used across different services.
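
If you prefer to script this, credit usage can also be pulled from the Cost Explorer API, where credits appear under the “Credit” record type as negative amounts. A hedged boto3 sketch (the date range is illustrative):

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-06-01", "End": "2024-07-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    # Credits show up as negative amounts under this record type.
    Filter={"Dimensions": {"Key": "RECORD_TYPE", "Values": ["Credit"]}},
)

for period in resp["ResultsByTime"]:
    amount = period["Total"]["UnblendedCost"]["Amount"]
    print(period["TimePeriod"]["Start"], amount)
```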

Maximizing AWS Credits for Startups

For startups, it is particularly important to remember that AWS credits usually burn down faster than expected. Hence, strategic planning is required:

  1. Optimize Resource Usage: Regularly monitor and optimize your resource usage to avoid unnecessary costs.
  2. Leverage AWS Tools: Use AWS tools like AWS Cost Explorer and AWS Budgets to manage and track your spending.
  3. Explore AWS Services: Use credits to experiment with different AWS services that can benefit your startup, such as machine learning, data analytics, or IoT services.
  4. Plan for Growth: Strategically allocate credits to services that will support your startup’s growth and scalability.
  5. Assign Responsibilities: Clearly define who the “budget” owner for AWS credits is in your company and make sure that this person monitors usage frequently, as even small inefficiencies can lead to unexpected cloud infrastructure costs. In small startups it should usually be the CTO or VP of Engineering who monitors cloud spend, as it has budget implications.

Fine Print with AWS Credits: What to Know before Applying for AWS Credits

Although free AWS Credits are often a life-saver for startups, there are some things to consider before applying for credits:

  1. Forecast your usage to make sure you don’t let credits expire. Since credits usually expire after 12 months, it is wise not to apply for the maximum amount available to you from the very beginning, as you risk letting credits expire. Instead, start with the $5,000 AWS Activate credits and only afterwards start using the credits received from your investors.
  2. Marketplace offerings can help you extend the lifetime of credits. While AWS Marketplace spend can’t be covered by credits, some Marketplace offerings might significantly reduce your AWS bill and hence extend the lifetime of your credits. You might start paying small amounts for cloud infrastructure early on, but in the long run you may spend significantly less and use credits for the services where ROI is highest.

As an example, with simplyblock you might spend on average $100 on the simplyblock license while spending $500 on EC2 instances. That setup, however, can save you up to 5x on your EKS Kubernetes volume costs on AWS: instead of using as much as $3,000 in credits on various AWS storage services, you pay just $100 a month on the Marketplace and use $500 of AWS credits.

You can’t sell or trade AWS credits. Credits are purely for your company’s use within AWS and can’t be traded, sold, or redeemed for any cash value. You also can’t use credits on Amazon.com.

What Can You Spend AWS Credits On?

AWS credits can be spent on a wide range of AWS services including:

  • Compute: EC2 instances, Lambda functions
  • Storage: S3, EFS, Glacier, EBS, EC2 ephemeral storage
  • Data Transfer: Outbound data transfer
  • Machine Learning: SageMaker
  • Databases: RDS, DynamoDB

However, AWS credits typically cannot be used for:

  • Premium Support: AWS Support plans
  • Marketplace Products: Third-party products on the AWS Marketplace
  • Reserved Instances: In some cases, specific promotions might exclude Reserved Instances

Other AWS Programs and Discounts

AWS offers numerous payment pricing models for cost savings that are premised on frequency, volume, and commitment tenure. Here are some common ones:

  1. Reserved Instances (RI) provide up to 75% discount off On-Demand Instances, but require a commitment to a specific type of compute (instance family, region, operating system…). More details are available on the AWS Reserved Instances page.
  2. Savings Plans (SP) provide discounts in exchange for a commitment to using a certain amount of compute over 1 or 3 year periods. They offer additional flexibility compared to RIs and are automatically applied by AWS to the spend that will result in the greatest discount. Check out the AWS Savings Plans page.
  3. Spot Instances: Are spare AWS capacity that users can purchase at a heavy discount. The trick is that AWS may need the capacity back at any time — potentially disrupting workloads if not managed properly. More information can be found on the AWS Spot Instances page. You can also learn more on how to leverage Spot instances in our Cloud Commute podcast with Cristian Magherusan-Stanciu from AutoSpotting on “how to cut EC2 cost by 60%.”
  4. Vantage Autopilot: Provides automated cost management and optimization solutions to help you make the most of your AWS credits and reduce overall cloud spending. Learn more about Vantage Autopilot here.

For more information, check the official AWS Credits documentation.

How can you use AWS Credits with simplyblock?

Simplyblock’s high-performance cloud storage clusters are built upon EC2 instances with local NVMe disks, which means you can use AWS Credits to cover most of your storage costs with simplyblock. The license fee for the simplyblock software is not covered by credits; however, it can be part of other discount plans, such as the AWS Enterprise Discount Program (EDP).

Simplyblock uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, surpassing local NVMe disks and Amazon EBS in cost/performance ratio at scale. Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments, ensuring optimal performance for I/O-sensitive workloads like databases. Using erasure coding (a better RAID) instead of replicas helps to minimize storage overhead without sacrificing data safety and fault tolerance.

With additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more, simplyblock meets your requirements before you even set them. Get started using simplyblock right now or learn more about our feature set.

How to benefit from AWS Enterprise Discount Program (EDP)

What is the AWS Enterprise Discount Program (EDP)?

The AWS Enterprise Discount Program (EDP) is a discount initiative designed for organizations spending at least $1 million per year on AWS cloud services and committed to extensive, long-term usage of Amazon Web Services (AWS). The program helps businesses optimize their cloud spending while expanding their operations on AWS. By entering into an EDP agreement, enterprises can secure significant cost savings and enhanced value from their AWS investments, which is particularly advantageous during economic downturns.

How does the AWS Enterprise Discount Program (EDP) Work?

The AWS Enterprise Discount Program operates on a tiered discount system based on an organization’s annual AWS spending commitment, usually starting at $1 million per year. Key features of the program include:

  • Customizable Discounts: Discounts are negotiated based on total committed spend and commitment duration, typically ranging from 1 to 5 years. Greater commitments yield higher discounts.
  • Broad Coverage: Discounts apply to nearly all AWS services and regions, ensuring consistent savings across the AWS ecosystem.
  • Marketplace Offerings: AWS Marketplace purchases can contribute up to 25% of EDP spend.
  • Scalability: As AWS usage grows, the program allows organizations to benefit from increased discounts, promoting a sustainable and cost-effective cloud strategy.

What is EDP in AWS?

In AWS, EDP stands for the Enterprise Discount Program. This is a contractual agreement between AWS and enterprises that guarantees significant discounts in exchange for a minimum level of AWS spending over a specified period. The program reduces cloud costs and encourages deeper engagement with the AWS ecosystem, fostering long-term partnerships and more efficient cloud usage.

How to Negotiate AWS EDP?

When negotiating an AWS Enterprise Agreement, consider these strategies to maximize benefits:

  1. Understand Your Usage Patterns: Analyze your current and projected AWS usage to accurately determine your commitment levels.
  2. Leverage Historical Spend: Use your historical AWS spend data to negotiate better discount rates.
  3. Seek Flexibility: Aim for terms that allow flexibility in service usage and scalability.
  4. Engage AWS Account Managers: Collaborate with AWS account managers to understand all available options and potential incentives.
  5. Evaluate Support and Training: Include provisions for enhanced support and training services in the agreement.

How to Join AWS EDP?

To join the AWS Enterprise Discount Program, follow these steps:

  1. Assess Eligibility: Ensure your organization meets the minimum annual spend requirement, typically around $1 million.
  2. Contact AWS Sales: Reach out to your AWS account manager or AWS sales team to express interest in the program.
  3. Prepare for Negotiations: Gather your usage data and financial projections to negotiate the best possible terms.
  4. Sign Agreement: Finalize and sign the EDP agreement, detailing the committed spend and discount structure.
  5. Monitor and Optimize: Regularly review your AWS usage and costs to ensure you are maximizing the benefits of the EDP.

Understanding AWS Marketplace with AWS EDP

To maximize the benefits of the AWS Enterprise Discount Program , it’s crucial to understand your AWS Marketplace usage. Determine which Independent Software Vendors (ISVs) you are currently purchasing from and explore opportunities to route these purchases through the AWS Marketplace. Purchases made via the AWS Marketplace can contribute to your total commitment under the EDP, with a cap of 25%. This can be a strategic way to ensure your software investments also help you meet your EDP commitments.
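
As a quick illustration of that 25% cap, here is a hypothetical helper (not an official AWS formula):

```python
def marketplace_contribution(annual_commit: float,
                             marketplace_spend: float,
                             cap: float = 0.25) -> float:
    """Portion of AWS Marketplace spend that counts toward an EDP
    commitment, assuming the 25% cap described above."""
    return min(marketplace_spend, cap * annual_commit)

# With a $2M annual commitment, at most $500K of Marketplace spend counts:
print(marketplace_contribution(2_000_000, 700_000))  # 500000.0
```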

Can i Join EDP as a Startup?

For startups, joining the AWS Enterprise Discount Program (EDP) might not be feasible due to the high minimum spend requirement, typically around $1 million annually. However, there are other ways to maximize savings on AWS:

  1. AWS Credits: Startups can benefit from AWS credits through programs like AWS Activate. These credits can significantly reduce your cloud costs during the early stages of growth. For example, AWS Activate provides up to $100,000 in credits for eligible startups.
  2. Marketplace Solutions: Utilize the AWS Marketplace to purchase software solutions that can contribute to your overall AWS spend. For example, AWS Marketplace offerings such as simplyblock can help you significantly reduce spending on AWS storage services while scaling operations.

By leveraging these alternatives, startups can achieve substantial savings and optimize their AWS spending without needing to meet the high thresholds required for the EDP.

What’s the Difference between an EDP and a PPA?

EDP (Enterprise Discount Program) offers custom discounts based on high-volume, long-term AWS usage commitments, providing scalable savings across most AWS services. In contrast, a PPA (Private Pricing Agreement) is a more flexible, negotiated contract tailored to specific needs, often used for unique pricing arrangements and custom terms that might not fit the broader structure of an EDP. While both aim to reduce cloud costs, an EDP is typically for larger, ongoing commitments, whereas a PPA can address more specific, immediate requirements.

Other AWS Programs and Discounts

AWS offers various pricing models to help organizations achieve cost savings based on usage frequency, volume, and commitment duration. Here are some common ones:

  • Spot Instances: You use spare AWS capacity at a lower price. But, AWS can take back this capacity when they need it. Best for flexible workloads.
  • Reserved Instances: You commit to use AWS for a long time (1-3 years), and in return, you get a big discount. Best for predictable workloads.
  • Savings Plans: Similar to Reserved Instances, but more flexible. You commit to use a certain amount of AWS services, and you get a discount.
  • Vantage Autopilot: Provides automated optimization of AWS costs by dynamically adjusting instances and resources based on usage patterns, helping organizations reduce their AWS bills without manual intervention. Vantage Autopilot can be used alongside simplyblock to further reduce storage cost through lower underlying EC2 instance costs (simplyblock deploys onto EC2 instances with local NVMe storage, pooling the resources into a scalable, enterprise-grade storage system).

How can Simplyblock be used with AWS EDP?

Simplyblock can be a game-changer for your AWS Enterprise Discount Program (EDP). It offers high-performance cloud block storage that not only enhances the performance of your databases and applications but also brings cost efficiency. Most importantly, spending on simplyblock through the AWS Marketplace can count toward the up-to-25% Marketplace portion of your AWS EDP commitment. This means you can leverage simplyblock’s services while also fulfilling your commitment to AWS. It’s a win-win for AWS users seeking performance, scalability, and cost-effectiveness.

Simplyblock uses NVMe over TCP for minimal access latency, high IOPS/GB, and efficient CPU core utilization, surpassing local NVMe disks and Amazon EBS in cost/performance ratio at scale. Ideal for high-performance Kubernetes environments, simplyblock combines the benefits of local-like latency with the scalability and flexibility necessary for dynamic AWS EKS deployments, ensuring optimal performance for I/O-sensitive workloads like databases. Using erasure coding (a better RAID) instead of replicas helps to minimize storage overhead without sacrificing data safety and fault tolerance.

With additional features such as instant snapshots (full and incremental), copy-on-write clones, thin provisioning, compression, encryption, and many more, simplyblock meets your requirements before you even set them. Get started using simplyblock right now or learn more about our feature set.

Production-grade Kubernetes PostgreSQL, Álvaro Hernández

In this episode of the Cloud Commute podcast, Chris Engelbert is joined by Álvaro Hernández Tortosa, a prominent figure in the PostgreSQL community and CEO of OnGres. Álvaro shares his deep insights into running production-grade PostgreSQL on Kubernetes, a complex yet rewarding endeavor. The discussion covers the challenges, best practices, and innovations that make PostgreSQL a powerful database choice in cloud-native environments.

This interview is part of the simplyblock Cloud Commute Podcast, available on YouTube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

Key Takeaways

Q: Should you deploy PostgreSQL in Kubernetes?

Deploying PostgreSQL in Kubernetes is a strategic move for organizations aiming for flexibility and scalability. Álvaro emphasizes that Kubernetes abstracts the underlying infrastructure, allowing PostgreSQL to run consistently across various environments—whether on-premise or in the cloud. This approach not only simplifies deployments but also ensures that the database is resilient and highly available.

Q: What are the main challenges of running PostgreSQL on Kubernetes?

Running PostgreSQL on Kubernetes presents unique challenges, particularly around storage and network performance. Network disks, commonly used in cloud environments, often lag behind local disks in performance, impacting database operations. However, these challenges can be mitigated by carefully choosing storage solutions and configuring Kubernetes to optimize performance. Furthermore, managing PostgreSQL’s ecosystem—such as backups, monitoring, and high availability—requires robust tooling and expertise, which can be streamlined with solutions like StackGres.

Q: Why should you use Kubernetes for PostgreSQL?

Kubernetes offers a powerful platform for running PostgreSQL due to its ability to abstract infrastructure details, automate deployments, and provide built-in scaling capabilities. Kubernetes facilitates the management of complex PostgreSQL environments, making it easier to achieve high availability and resilience without being locked into a specific vendor’s ecosystem.

Q: Can I use PostgreSQL on Kubernetes with PGO?

Yes, you can. Tools like the PostgreSQL Operator (PGO) for Kubernetes simplify the management of PostgreSQL clusters by automating routine tasks such as backups, scaling, and updates. These operators are essential for ensuring that PostgreSQL runs efficiently on Kubernetes while reducing the operational burden on database administrators.

EP 06: Building and operating a production-grade PostgreSQL in Kubernetes

Beyond the key takeaways, the sections below add context that makes the discussion easier to follow, clarifying the reasoning behind the questions posed by our host, Chris Engelbert, for a more engaging and insightful listening experience.

Key Learnings

Q: How does Kubernetes scheduler work with PostgreSQL?

Kubernetes uses its scheduler to manage how and where PostgreSQL instances are deployed, ensuring optimal resource utilization. However, understanding the nuances of Kubernetes’ scheduling can help optimize PostgreSQL performance, especially in environments with fluctuating workloads.

simplyblock Insight: Leveraging simplyblock’s solution, users can integrate sophisticated monitoring and management tools with Kubernetes, allowing them to automate the scaling and scheduling of PostgreSQL workloads, thereby ensuring that database resources are efficiently utilized and downtime is minimized.

Q: What is the best experience of running PostgreSQL in Kubernetes?

The best experience comes from utilizing a Kubernetes operator like StackGres, which simplifies the deployment and management of PostgreSQL clusters. StackGres handles critical functions such as backups, monitoring, and high availability out of the box, providing a seamless experience for both seasoned DBAs and those new to PostgreSQL on Kubernetes.

simplyblock Insight: By using simplyblock’s Kubernetes-based solutions, you can further enhance your PostgreSQL deployments with features like dynamic scaling and automated failover, ensuring that your database remains resilient and performs optimally under varying loads.

Q: How does disk access latency impact PostgreSQL performance in Kubernetes?

Disk access latency is a significant factor in PostgreSQL performance, especially in Kubernetes environments where network storage is commonly used. While network storage offers flexibility, it typically has higher latency compared to local storage, which can slow down database operations. Optimizing storage configurations in Kubernetes is crucial to minimizing latency and maintaining high performance.

simplyblock Insight: simplyblock’s advanced storage solutions for Kubernetes can help mitigate these latency issues by providing optimized, low-latency storage options tailored specifically for PostgreSQL workloads, ensuring your database runs at peak efficiency.

Q: What are the advantages of clustering in PostgreSQL on Kubernetes?

Clustering PostgreSQL in Kubernetes offers several advantages, including improved fault tolerance, load balancing, and easier scaling. Kubernetes operators like StackGres enable automated clustering, which simplifies the process of setting up and managing a highly available PostgreSQL cluster.

simplyblock Insight: With simplyblock, you can easily deploy clustered PostgreSQL environments that automatically adjust to your workload demands, ensuring continuous availability and optimal performance across all nodes in your cluster.

Additional Nugget of Information

Q: What are the advantages of clustering in Postgres?

A: Clustering in PostgreSQL provides several benefits, including improved performance, high availability, and better fault tolerance. Clustering allows multiple database instances to work together, distributing the load and ensuring that if one node fails, others can take over without downtime. This setup is particularly advantageous for large-scale applications that require high availability and resilience. Clustering also enables better scalability, as you can add more nodes to handle increasing workloads, ensuring consistent performance as demand grows.

Conclusion

Deploying PostgreSQL on Kubernetes offers powerful capabilities but comes with challenges. Álvaro Hernández Tortosa highlights how StackGres simplifies this process, enhancing performance, ensuring high availability, and making PostgreSQL more accessible. With the right tools and insights, you can confidently manage PostgreSQL in a cloud-native environment.

Full Video Transcript

Chris Engelbert: Welcome to this week’s episode of Cloud Commute podcast by simplyblock. Today, I have another incredible guest, a really good friend, Álvaro Hernández from OnGres. He’s very big in the Postgres community. So hello, and welcome, Álvaro.

Álvaro Hernández Tortosa: Thank you very much, first of all, for having me here. It’s an honor.

Chris Engelbert: Maybe just start by introducing yourself, who you are, what you’ve done in the past, how you got here. Well, except me inviting you.

Álvaro Hernández Tortosa: OK, well, I don’t know how to describe myself, but I would say, first of all, I’m a big nerd, big fan of open source. And I’ve been working with Postgres, I don’t know, for more than 20 years, 24 years now. So I’m a big Postgres person. There’s someone out there in the community that says that if you say Postgres three times, I will pop up there. It’s kind of like Superman or Batman or these superheroes. No, I’m not a superhero. But anyway, professionally, I’m the founder and CEO of a company called OnGres. Let’s guess what it means, On Postgres. So it’s pretty obvious what we do. So everything revolves around Postgres, but in reality, I love all kinds of technology. I’ve been working a lot with many other technologies. I know you because of being a Java programmer, which is kind of my hobby. I love programming in my free time, which almost doesn’t exist. But I try to get some from time to time. And everything related to technology in general, I’m also a big fan and supporter of open source. I have contributed and keep contributing a lot to open source. I also founded some open source communities, like for example, I’m a Spaniard. I live in Spain. And I founded Debian Spain, an association like, I don’t know, 20 years ago. More recently, I also founded a foundation, a nonprofit foundation also in Spain called Fundación PostgreSQL. Again, guess what it does? And I try to engage a lot with the open source communities. We, by the way, organized a conference for those who are interested in Postgres in the magnificent island of Ibiza in the Mediterranean Sea in September this year, 9th to 11th September for those who want to join. So yeah, that’s probably a brief intro about myself.

Chris Engelbert: All right. So you are basically the Beetlejuice of Postgres. That’s what you’re saying.

Álvaro Hernández Tortosa: Beetlejuice, right. That’s a better fit than superheroes. You’re absolutely right.

Chris Engelbert: I’m not sure if he is a superhero, but he’s different at least. Anyway, you mentioned OnGres. And I know OnGres isn’t really like the first company. There were quite a few before, I think, El Toro, a database company.

Álvaro Hernández Tortosa: Yes, Toro DB.

Chris Engelbert: Oh, Toro DB. Sorry, close, close, very close. So what is up with that? You’re trying to do a lot of different things and seem to love trying new things, right?

Álvaro Hernández Tortosa: Yes. So I sometimes define myself as a 0.x serial entrepreneur, meaning that I’ve tried several ventures and sold none of them. But I’m still trying. I like to try to be resilient, and I keep pushing the ideas that I have in the back of my head. So yes, I’ve done several ventures, all of them around certain patterns. So for example, you’re asking about Toro DB. Toro DB is essentially an open source software that is meant to replace MongoDB with, you guessed it, Postgres, right? There’s a certain pattern in my professional life. And Toro DB was. I’m speaking in the past because unfortunately it’s no longer a maintained open source project. We moved on to something else, which is OnGres. But the idea of Toro DB was to essentially replicate these documents live from MongoDB and, in the process, transform them in real time into a set of relational tables that got stored inside of a Postgres database. So it enabled you to do SQL queries on your documents that were in MongoDB. So think of a MongoDB replica. You can keep your MongoDB cluster if you want, and then you have all the data in SQL. This was great for analytics. You could have great speed-ups by normalizing data automatically and then doing queries with the power of SQL, which obviously is much broader and richer than MongoDB’s query language, especially for analytics. We got like 100 times faster on most queries. So it was an interesting project.

Chris Engelbert: So that means you basically generated the schema on the fly and then generated the table for that schema specifically? Interesting.

Álvaro Hernández Tortosa: Yeah, it was generating tables and columns on the fly.

OnGres StackGres: Operator for Production-Grade PostgreSQL on Kubernetes

Chris Engelbert: Right. Ok, interesting. So now you’re doing the OnGres thing. And OnGres has, I think, the main product, StackGres, as far as I know. Can you tell a little bit about that?

Álvaro Hernández Tortosa: Yes. So OnGres, as I said, means On Postgres. And one of our goals in OnGres is that we believe that Postgres is a fantastic database. I don’t need to explain that to you, right? But it’s kind of the Linux kernel, if I may use this parallel. It’s a bit bare bones. You need something around it. You need a distribution, right? So Postgres is a little bit the same thing. The core is small, it’s fantastic, it’s very featureful, it’s reliable, it’s trustworthy. But it needs tools around it. So our vision in OnGres is to develop this ecosystem around this Postgres core, right? And one of the things that we experienced during our professional lifetime is that Postgres requires a lot of tools around it. It needs monitoring, it needs backups, it needs high availability, it needs connection pooling.

By the way, do not use Postgres without connection pooling, right? So you need a lot of tools around it. And none of these tools come from the core. You need to look into the ecosystem. And actually, this is good and bad. It’s good because there’s a lot of options. It’s bad because there’s a lot of options. Meaning which one to choose, which one is good, which one is bad, which one goes well with a good backup solution or a good monitoring solution, and how you configure them all. So this was a problem that we coined as the stack problem. So when you really want to run Postgres in production, you need the stack on top of Postgres, right? To orchestrate all these components.

Now, the problem is that we’ve been doing this a long time for our customers. Typically, we love infrastructure as code, right? And everything was done with Ansible and similar tools, and Terraform for infrastructure and Ansible for orchestrating these components. But the reality is that every environment into which we looked was slightly different. We could just take our Ansible code and run it, and you’ve got this stack. But now the storage is different. Your networking is different. Your entry point is different. Here, one is using virtual IPs. That one is using DNS. That one is using proxies. And then the compute is also somehow different. And it was not reusable. We were doing a lot of copy, paste, modify, something that was not very sustainable. At some point, we started thinking, is there a way in which we can pack this stack into a single deployable unit that we can take essentially anywhere? And the answer was Kubernetes. Kubernetes provides us this abstraction where we can abstract away this compute, this storage, this networking, and code against a programmable API so that we can indeed create this package. So that’s StackGres.

StackGres is the stack of components you need to run production Postgres, packaged in a way that is uniform across any environment where you want to run it: cloud, on-prem, it doesn’t matter. And it is production ready! It’s packaged at a very, very high level. So basically you barely need, I would say, you don’t need Postgres knowledge to run a production-ready, enterprise-quality Postgres cluster in production. And that’s the main goal of StackGres.

Chris Engelbert: Right, right. And as far as I know, I think it’s implemented as a Kubernetes operator, right?

Álvaro Hernández Tortosa: Yes, exactly.

Chris Engelbert: And there’s quite a few other operators as well. But I know that StackGres has some things which are done slightly differently. Can you talk a little bit about that? I don’t know how much you wanna actually make this public right now.

Álvaro Hernández Tortosa: No, actually everything is open source. Our roadmap is open source, our issues are open source. I’m happy to share everything. Well, first of all, what I would say is that the operator pattern is essentially these controllers that take actions on your cluster, and the CRDs. We gave a lot of thought to these CRDs. I would say that for a lot of operators, CRDs are kind of a byproduct. An afterthought: “I have my objects and then some script generates the CRDs.” No, we said CRDs are our user-facing API. The CRDs are our extended API. And the goal of operators is to abstract away and package business logic, right? And expose it with a simple user interface.

So we designed our CRDs to be very, very high level, very amenable to the user, so that again, you don’t require any Postgres expertise. So if you look at the CRDs, in practical terms the YAMLs, right? The YAMLs that you write to deploy something on StackGres should be so simple that you could explain them to your five-year-old kid, and your five-year-old kid should be able to deploy Postgres into a production-quality cluster, right? And that’s our goal. And if we didn’t fulfill this goal, please raise an issue on our public issue tracker on GitLab, because we definitely have failed if that’s not true. So instead of focusing on the usual Postgres user, very knowledgeable, very high level, most operators focused on low-level CRDs, and they require Postgres expertise, probably a lot. We want to make Postgres more mainstream than ever, right? Postgres increases in popularity every year and it’s being adopted by more and more organizations, but not everybody’s a Postgres expert. We want to make Postgres universally accessible for everyone. So one of the things is that we put a lot of effort into this design. And instead of one big, gigantic CRD, we have multiple CRDs. They can actually be related to each other, like in an ER diagram. So you understand the relationships: you create one and then reference it many times, and you don’t need to repeat or duplicate configuration. Another area where I would say we have tried to do something is extensions. Postgres extensions are one of the most loved, if not the most loved feature, right?

And StackGres is the operator that arguably supports the largest number of extensions, over 200 extensions as of now and growing. And we did this because we developed a custom solution, which is also open source, as part of StackGres, where we can load extensions dynamically into the cluster. So we don’t need to build you a fat container with 200 extensions and a lot of security issues, right? But rather we deploy you a container with no extensions. And then you say, “I want this, this, this and that.” And then they will appear in your cluster automatically. And this is done via simple YAML. So we have a very powerful extension mechanism. And the other thing is that we not only expose the usual CRD YAML interface for interacting with StackGres, which is more than fine and I love it, but it also comes with a fully fledged web console. Not everybody likes the command line or the GitOps approach. We do, but not everybody does. And it’s a fully fledged web console which also supports single sign-on, where you can integrate with your AD, with your OIDC provider, anything that you want. It has detailed, fine-grained permissions based on Kubernetes RBAC. So you can say, “Who can create clusters, who can view configurations, who can do anything?” And last but not least, there’s a REST API. So if you prefer to automate and integrate with another kind of solution, you can also use the REST API to create and manage clusters. And these three mechanisms, the YAML files/CRDs, the REST API, and the web console, are fully interchangeable. You can use one for one operation and another one for the next; everything goes back to the same state. So you can use any one that you want.

And lately we have also added sharding. So sharding scales Postgres out with solutions like Citus, but we also support Postgres with partitioning and foreign data wrappers, and Apache ShardingSphere. Our way is to create a cluster of multiple instances. Not only one primary and one replica, but a coordinator layer and then shards, and each shard again has a primary and replicas. So typically dozens of instances, and you can create them with a simple YAML file and a very high-level description that barely requires any knowledge, and it wires everything for you. So it’s very, very convenient to make things simple.

Chris Engelbert: Right. So the plugin mechanism, or the extension mechanism, that was exactly what I was hinting at. That was mind-blowing. I’ve never seen anything like that when you showed it last year in Ibiza. The other thing that is always a little bit of a head-scratcher, I think, for a lot of people, is when they hear that a Kubernetes operator is actually written in Java. I think Red Hat built the original framework, so it kind of makes sense that Red Hat is doing that, but I think the original framework was a Go library. And Java would probably not be like the first choice to do that. So how did that happen?

Álvaro Hernández Tortosa: Well, first of all, you’re right. The operator framework is written in Go, and there was nothing else than Go at the time. So we were looking at that, but we had a team of very, very senior Java programmers, and none of them were Go programmers, right? I’ve seen in the Postgres community, and other communities, that for people who are kind of more in the DevOps world, switching to Go is a bit more natural. But at the same time, they are not senior from a Go programming perspective, right? The same would have happened with our team. They would have switched from Java to Go, but they wouldn’t have been senior in Go, obviously, right? It would have taken some time to develop those skills. On the other hand, we looked at the technology behind it: what is an operator? An operator is no more than essentially an HTTP server that receives callbacks from Kubernetes, and a client, because it makes calls to Kubernetes. And HTTP clients and servers can be written in any language. So we looked at the core, how complicated it is, and how much the operator framework brings to you. And we saw that it was not that much.

And actually, something I just mentioned before: in most frameworks, the CRDs are kind of generated from your structures, and we really wanted to do it the opposite way. It’s like with databases: you either use an ORM to read your database’s existing schema, which you developed with all your SQL capabilities, or you just create an object and let that generate the database. I prefer the former. So we did the same thing with the CRDs, right? We wanted to design them first. So Java was more than okay to develop a Kubernetes operator, and our team was expert in Java. So by doing it in Java, we were able to be very efficient and deliver a lot of value, a lot of features, very, very fast, without having to retrain anyone, learn a new language, or learn new skills. On top of this, there’s sometimes a concern that Java requires a JVM, which is kind of a heavy environment, right? It consumes memory and resources, and disk. But by default, StackGres uses a compilation technology from the GraalVM project. This allows you to generate native images that are indistinguishable from any other Linux binary you can have on your system. And we deploy StackGres with native images. You can also switch to JVM images if you prefer. We expose both, but by default, they are native images. So at the end of the day, StackGres is a Linux binary of several megabytes in a container, and that’s it.

Chris Engelbert: That makes sense. And I like that you basically pointed out that the efficiency of the existing developers was much more important than being cool and moving to a new language just because everyone does. So we talked about the operator quite a bit. What are your general thoughts on databases in the cloud, or specifically in Kubernetes? What are the issues you see, the problems running a database in such an environment?

Álvaro Hernández Tortosa: Well, it’s a wide topic, right? And I think one of the most interesting topics that we’re seeing lately is a concern about cost and performance. So there’s kind of a trade-off, as usual. There’s a trade-off between the convenience of “I want to run a database and almost forget about it”, and that’s why you switch to a cloud managed service. Which is not always true, by the way, because forgetting about it means that nobody’s gonna vacuum your database, repack your tables, right? Optimize your queries, analyze if you have unused indexes. So if you’re very small, that’s more than okay; you can assume that you don’t need to touch your database. But if you grow over a certain level, you’re gonna need the same DBAs, at least for everything beyond the basic operations of the database, which are monitoring, high availability, and backups. So those are the three main areas that a managed service provides to you.

But so there’s convenience, but then there’s an additional cost. And this additional cost sometimes is quite notable, right? It’s typically around an 80% premium, on an N+1 over N number of instances, because sometimes we need an extra instance for many cloud services, right? And that, multiplied by 1.8, ends up being two point something in the usual case. So you’re overpaying. So you need to analyze whether this is good for you from this perspective of convenience, or if you want to have something else. On the other hand, almost all cloud services use network disks. And these network disks are very good and have improved performance a lot in the last years, but still they are far from the performance of a local drive, right? And running databases with local drives has its own challenges, but they can be addressed. And you can really, really move the needle by, I don’t know if that’s the right term to call it, self-hosting, this trend of self-hosting, if we could marry it with the simplicity and the convenience of managed services, right?

With the ability of running in any environment, and running at a much higher performance, I think that’s kind of an interesting trend right now and a good sweet spot. And Kubernetes, to try to marry all the terms that you mentioned in the question, actually is one driver towards this goal, because it gives us infrastructure independence and it supports both network disks and local disks equally. So it’s kind of an enabler for this pattern, one that I see becoming more and more important, and one that we are definitely looking forward to.

Chris Engelbert: Right, I like that you pointed out that there’s ways to address the local storage issues, just shameless plug, we’re actually working on something.

Álvaro Hernández Tortosa: I heard something.

The Biggest Trend in Containers?

Chris Engelbert: Oh, you heard something. (laughing) All right, last question because we’re also running out of time. What do you see as the biggest trend right now in containers, cloud, whatever? What do you think is like the next big thing? And don’t say AI, everyone says that.

Álvaro Hernández Tortosa: Oh, no. Well, you know what? Let me do a shameless plug here, right?

Chris Engelbert: All right. I did one. (laughing)

Álvaro Hernández Tortosa: So there’s a technology we’re working on right now that works for our use case, but will work for many use cases also, which is what we’re calling dynamic containers. So containers are essentially something that is static, right? You build a container, you have a build with your Dockerfile, whatever you use, right? And then that image is static. It is what it is. It contains the layers that you specified, and that’s all. But if you look at any repository on Docker Hub, there’s plenty of tags. Take, for example, Postgres. There’s Postgres based on Debian. There’s Postgres based on Alpine. There’s Postgres with this option. Then you want this extension, then you want this other extension. And then there’s a whole variety of images, right? And each of those images needs to be built independently, maintained independently, updated independently. But they’re very orthogonal. Like upgrading the Debian base OS has nothing to do with the Postgres layer, has nothing to do with the timescale extension, has nothing to do with whether I want the debug symbols or not. So we’re working on technology with the goal of being able to, as a user, express any combination of items I want for my container and get that container image, without having to rebuild and maintain the image with the specific parameters that I want.

Chris Engelbert: Right, and let me guess, that is how the Postgres extension stuff works.

Álvaro Hernández Tortosa: It is meant to be, and then as a solution for the Postgres extensions, but it’s actually quite broad and quite general, right? Like, for example, I was discussing recently with some folks of the OpenTelemetry community, and the OpenTelemetry collector, which is the router for signals in the OpenTelemetry world, right? It has the same architecture, has like around 200 plugins, right? And you don’t want a container image with those 200 plugins, which potentially, because many are third party, may have some security vulnerabilities. Or even if there’s an update, you don’t want to update all those and restart your containers and all that, right? So why don’t you kind of get a container image with the OpenTelemetry collector with this source and this receiver and this exporter, right? So that’s actually probably even more applicable.

Chris Engelbert: Yeah, I think that makes sense, right? I think that is a really good direction, especially because with the static containers in the past, the original idea was that being static gives you some kind of consistency and some security on how the container looks, but we figured out over time that it is not the best solution. So I’m really looking forward to that being probably a more general thing.

Álvaro Hernández Tortosa: To be honest, actually the idea, I call it dynamic containers, but in reality, from a user perspective, they’re the same static as before. They are dynamic from the registry perspective.

Chris Engelbert: Right, okay, fair enough. All right, thank you very much. It was a pleasure, like always, talking to you. And to everyone else: see, hear, or read you next week with my next guest. And thank you to Álvaro for being here. It was appreciated, like always.

Álvaro Hernández Tortosa: Thank you very much.

The post Production-grade Kubernetes PostgreSQL, Álvaro Hernández appeared first on simplyblock.

AWS Cost Optimization with Cristian Magherusan-Stanciu from AutoSpotting (interview) https://www.simplyblock.io/blog/aws-cost-optimization-with-cristian-magherusan-stanciu-from-autospotting/ Thu, 28 Mar 2024 12:13:27 +0000 https://www.simplyblock.io/?p=304 This interview is part of the simplyblock Cloud Commute Podcast, available on Youtube, Spotify , iTunes/Apple Podcasts , Pandora , Samsung Podcasts, and our show site . In this installment, we’re talking to Cristian Magherusan-Stanciu from AutoSpotting , a company helping to cost-optimize their AWS EC2 spent by automatically supplying matching workloads with spot instances. […]

This interview is part of the simplyblock Cloud Commute Podcast, available on Youtube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

In this installment, we’re talking to Cristian Magherusan-Stanciu from AutoSpotting, a company helping customers cost-optimize their AWS EC2 spend by automatically supplying matching workloads with spot instances. Cristian talks about how spot instances work, how you can use them to save up to 60% of your EC2 cost, as well as how tools like ChatGPT, CoPilot, and AI Assistant help you write (better) code. See more information below on what AWS cost optimization is, what the components of cloud storage pricing are, and how simplyblock can help with cloud cost optimization.

Key Learnings

What is AWS Cost Optimization?

AWS cost optimization involves strategies and tools to reduce and manage the costs associated with using Amazon Web Services. Key components include:

- Right-Sizing of Instances: Adjusting instance types and sizes based on actual usage patterns.
- Reserved Instances and Savings Plans: Committing to long-term usage to benefit from reduced rates. For more information see our blog post on the AWS Enterprise Discount Program (EDP).
- Auto Scaling: Automatically adjusting resource capacity to meet demand without over-provisioning.
- Monitoring and Analysis: Using tools like AWS Cost Explorer and Trusted Advisor to monitor usage and identify savings opportunities.
- Resource Tagging: Implementing tags to track and allocate costs effectively.
- Reseller Programs: Using programs like DoiT Flexsave™ that provide higher flexibility in pricing.
- Alternative Providers: Looking at alternative providers of certain features, like elastic block storage.

These strategies help organizations maximize their AWS investments while maintaining performance and scalability. AWS provides a suite of management tools designed to monitor application costs and identify opportunities for modernization and right-sizing. These tools enable seamless scaling up or down, allowing you to operate more cost-effectively in an uncertain economy. By leveraging AWS, you can better position your organization for long-term success.
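As a starting point for the monitoring and analysis step above, spend can also be queried programmatically. The sketch below, assuming AWS credentials with Cost Explorer permissions are configured, breaks down one month of unblended cost by service using boto3; the dates are placeholders to adjust to your own billing period.

```python
import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

# Unblended cost per service for one month; adjust the dates to your needs.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-02-01", "End": "2024-03-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount > 0:
        print(f"{service}: {amount:,.2f} USD")
```

A report like this makes it obvious which services, storage included, deserve right-sizing attention first.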

What are the Components of Cloud Storage Pricing?

Cloud storage pricing is typically composed of several components:

- Storage Capacity: The amount of data stored, usually measured in gigabytes (GB) or terabytes (TB).
- Data Transfer: Costs associated with moving data in and out of the storage service.
- Access Frequency: Pricing can vary based on how often data is accessed (e.g. frequent vs. infrequent access tiers).
- Operations: Charges for operations like data retrieval, copying, or listing files.
- Data Retrieval: Costs associated with retrieving data from storage, especially from archival tiers.
- Replication and Redundancy: Fees for replicating data across regions for durability and availability.
- Performance and Throughput Requirements: IOPS (Input/Output Operations per Second) define how many storage operations can be performed per second on a given device. Cloud providers charge for high-performance storage that exceeds the included IOPS.

Thoroughly understanding the components of cloud storage pricing is key to optimizing your cloud costs. It matters for several reasons: reducing redundant expenses, ensuring optimal allocation of cloud resources to prevent over-provisioning and under-utilization, and enabling scalability while freeing up budget to invest in other areas that enhance overall competitiveness.
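To see how these components combine into a monthly bill, here is a deliberately simplified cost model in Python. The unit prices are hypothetical placeholders, not actual AWS rates; real bills additionally depend on tiers, regions, and minimum charges.

```python
# Hypothetical unit prices -- real prices vary by provider, region, and tier.
PRICE_PER_GB_MONTH = 0.023       # storage capacity
PRICE_PER_GB_EGRESS = 0.09       # data transfer out
PRICE_PER_1K_OPERATIONS = 0.005  # requests / operations

def monthly_storage_cost(stored_gb: float, egress_gb: float, operations: int) -> float:
    capacity = stored_gb * PRICE_PER_GB_MONTH
    transfer = egress_gb * PRICE_PER_GB_EGRESS
    ops = (operations / 1_000) * PRICE_PER_1K_OPERATIONS
    return capacity + transfer + ops

# 5 TB stored, 200 GB egress, 2.5 million operations in a month
print(f"{monthly_storage_cost(5_000, 200, 2_500_000):,.2f} USD")  # 145.50 USD
```

Even a toy model like this shows why access frequency and egress matter: for rarely accessed data, the capacity term dominates and cheaper, colder tiers win.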

How can Simplyblock help with Cloud Cost Optimization?

Simplyblock aids in cloud cost optimization by providing high-performance, low-latency elastic storage which combines the speed of local disks with the flexibility and features of a SAN (Storage Area Network) in a cloud-native environment. Simplyblock storage solutions are seamlessly integrated with Kubernetes and provide zero-downtime scalability: a storage cluster that grows with your needs. More importantly, simplyblock provides cost efficiency gains of 60% or more over Amazon EBS. Calculate your savings with simplyblock now.

The post AWS Cost Optimization with Cristian Magherusan-Stanciu from AutoSpotting (interview) appeared first on simplyblock.

Data center and application sustainability with Rich Kenny from Interact (interview) https://www.simplyblock.io/blog/data-center-and-application-sustainability-with-rich-kenny-from-interact/ Fri, 08 Mar 2024 12:13:27 +0000 https://www.simplyblock.io/?p=318 This interview is part of the simplyblock Cloud Commute Podcast, available on Youtube , Spotify , iTunes/Apple Podcasts , Pandora , Samsung Podcasts, and our show site . In this installment , we’re talking to Rich Kenny from Interact , an environmental consultancy company, about how their machine-learning based technology helps customers to minimize their […]

This interview is part of the simplyblock Cloud Commute Podcast, available on Youtube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

In this installment, we’re talking to Rich Kenny from Interact, an environmental consultancy company, about how their machine-learning based technology helps customers minimize their carbon footprint, as well as optimize infrastructure cost. He sheds light on their innovative approach to optimizing data center performance for sustainability.

Chris Engelbert: Hello, folks! Great to have you here for our first episode of the Cloud Commute Podcast by simplyblock. I’m really happy to have our first guest, Richard, who’s really interesting. He’s done a lot of things, and he’s going to talk about that in a second. But apart from that, you can expect a new episode every week from now on. So with that: thank you, Richard, for being here. Really happy to have you on board. Maybe just start with a short introduction of yourself.

Rich Kenny: Yeah, cool. So my name’s Rich Kenny. I’m the managing director of Interact. We’re a machine learning based environmental consultancy that specializes in circular economy. And I’m also a visiting lecturer and research fellow at London South Bank University, in the School of engineering. So a bit of business, a bit of academia, a bit of research. I know a few things about a few things.

Chris Engelbert: You know a few things about it, a few things. That’s always better than most people.

Rich Kenny: Certainly better than knowing nothing about a lot of things.

Chris Engelbert: That’s fair. I think it’s good to know what you don’t know. That’s the important thing. Right? So you said you’re doing a little bit of university work, but you also have a company doing sustainability through AI management. Can you? Can you go and elaborate a little bit on that?

Rich Kenny: Yeah. So we’ve got a product that looks at the performance of enterprise IT, so it’s servers, storage, networking. It’s got the world’s largest dataset behind it, and some very advanced mathematical models and energy calculations, and it basically allows us to look at data center hardware and make really, really good recommendations for lower-carbon compute, reconfiguration of assets, and product life extension. It basically lets us holistically look at the IT performance of an estate, and then apply very advanced techniques to reduce that output. So, saving cost, energy, and carbon to do the same work better. We’ve done about 400 data centers now, in the last 3 years, and we saw an average of about 70% energy reduction, which in a lot of cases is also quite often a 70% carbon reduction from a Scope 2 point of view. There’s nothing like it on the market at the moment, and we’ve been doing this, as a business, for probably 3.5 or 4 years, and as a research project for the better part of 7 years.

Chris Engelbert: So, how do I have to think about that? Is it like a web UI that shows you how much energy is being used and you can zoom right into a specific server and that would give you a recommendation like, I don’t know, exchange the graphics card or or storage, or whatever.

Rich Kenny: So specifically it looks at the configuration and what work it’s capable of doing. Every time you have a variation in the configuration of a server, it is more or less efficient. It does more or less work per watt. So what we do is we apply a massive machine learning dataset to any make, model, generation, and configuration of any type of server, and we tell you how much work it can do, how effectively it can do it, and what the utilization pathway looks like. So it’s really great to be able to apply that to existing data center architecture. Once you’ve got the utilization and the config, we can say: you could do the same work you’re doing with 2,000 servers in this way, with 150 servers in that way. And this is how much energy that would use, how much carbon that will generate, and how much work it will do. And we can do things like carbon shifting scenarios. So we can take a service application, say a CRM, that’s in 20 data centers across a 1,000 machines, using fractional parts of it, and say: this service is using X amount of carbon, costing this much energy. So basically, your CRM is costing X to run from an energy and carbon point of view. And you could consolidate that to Z, for example. So the ability to look at service-level, application-level, and system-level data and then serve that service more efficiently. We’re not talking about rewriting the application, because that’s one step lower down the stack. We’re talking about how you do the same work more efficiently and more effectively by looking at the hardware itself, the actual physical asset. And it’s a massive low-hanging fruit, because no one’s ever done this before. So it is not unusual to see consolidation options of 60+% of just waste material. A lot of it is doing the same work more effectively and efficiently. And that drives huge sustainability-based outcomes, because you’re just removing stuff you don’t need. The transparency bit is really important, because quite often you don’t know what your server can do or how it does it. Like, I bought this, it’s great, it’s new, so it must be really, really effective. But the actual individual configuration, the interplay between the CPU, RAM, and the storage, determines how good it is at doing its job, and how much bang you get for your buck. And you can see, you know, variance of 300% within a generation. Like, you know, we’ve got the DL360, and all the DL360s of this generation are supposedly pretty much the same. But they are not. There’s like a 300% variance depending on the actual bill of materials.

Chris Engelbert: Alright! So I think it sounds like, if it does things more efficiently, it’s not only about carbon footprint, it’s also about cost savings, right? So I guess that’s something that is really interesting for your customers, for the enterprise is buying that?

Rich Kenny: Yes absolutely. It’s the first time they’re saving money while working towards sustainability outcomes other than what you would do in cloud for, like GreenOps, where, realistically, you’re doing financial operations and saying, I’m gonna reduce carbon, but realistically, I’m reducing compute, reducing wastage, or removing stranded applications. We’re doing the exact same thing on the hardware level and going “how do you do the same work efficiently rather than just doing it?” And so you’re going to get huge cost savings in the millions. You get thousands of tons of carbon reduction, and none of it has an impact on your business, because you’re just eradicating waste.

Chris Engelbert: Right? So that means your customers are mostly the data center providers?

Rich Kenny: Oh no, it’s mostly primary enterprise, truth be told, because the majority of data centers operate as a colo or hyperscaler. Realistically, people have got their kit in other people’s co-located facilities. The colos [editor: colocation operators] are facility managers. They’re not IT specialists. They’re not experts in computers. They’re experts in providing a good environment for that computer. Which is why all the efficiency metrics geared towards the data center have historically been around buildings. It’s been about “how do we build efficiently? How do we cool efficiently? How do we reduce heat density?” All this sort of stuff. None of that addresses the question “why is the building there?” The building’s there to serve storage and compute. And every colocation operator washes their hands of that, and goes “it’s not our service. Someone else is renting the space. We’re just providing the space.” So you have this really unusual gap, which you don’t see in many businesses, where the supplier has a much higher level of knowledge than the owner. So when you’re talking to someone saying “I think you should buy this server,” the manufacturer tells you what to buy, and the colo tells you where to put it. But in between, it’s the IT professional who really has no control over the situation. The IT provider doesn’t tell me how good it is, and the colo doesn’t tell me how to effectively run it. So what I get is my asset, and I give it to someone else to manage. Meaning, what you get is this perfect storm of nobody really trying to serve it better, and that’s what we do. We come in and let you know “there’s this huge amount of waste here.”

Chris Engelbert: Yeah, that makes sense. So it’s the people or the companies that co-locate their hardware in a data center.

Rich Kenny: Right, or running their own data centers on premise, running their own server rooms, or cabinets. We do work sometimes with people that have got as few as 8 servers. And we might recommend changing the RAM configuration, switching out CPUs. Things like that can have 20, 30, 40% benefits, but cost almost nothing. So it could be that we see a small server estate that’s very low utilized, but massively over-provisioned on RAM. Mostly because someone, some day, 10 years ago, bought a server and went “stick 2 TB in it.” And we’d ask, “how much are you using?” With the answer: “200 gigs.” “So you’ve got 10 times more RAM than you need, even at peak, can you just take out half your RAM, please.” It sounds really counterintuitive to take out that RAM and put it on the side. If you scale up again, you can just plug it back in again next week. But you know you’ve been using this for 8 to 10 years, and you haven’t needed anywhere near that. It’s just sitting there, drawing energy, doing nothing, providing no benefit, no speed, no improvement, no performance, just hogging energy. And we’d look at that and go “that’s not necessary.”

Chris Engelbert: Yeah, and I think because you brought up the example of RAM, most people will probably think that a little bit of extra RAM can’t be that much energy, but accumulated over a whole year it comes down to something.

Rich Kenny: Yeah, absolutely. RAM can sometimes be as much as 20 or 30% of the energy use of a server, from a configuration level. CPU is the main driver, with up to 65% of the energy use of a server. I mean, we’re talking non-GPU servers. When it gets to GPUs, we’re getting into orders of magnitude. But RAM still uses up to 30% of the power on some of these servers. And if you’re only using 10% of that, you can literally eradicate almost 20% of the combined energy – just by decommissioning either certain aspects of that RAM or just removing it and putting it on the shelf until you need it next year, or the year after. The industry is so used to over-provisioning that they scale at day one, to give it scale at year five. It would be more sensible though to provision for year one and two, with an ability to upgrade, to grow with the organization. What you’ll see is that you’ll decrease your carbon energy footprint year on year, you won’t overpay month one for the asset, and then in year two you can buy some more RAM, in year three you can buy some more RAM, and in year four you can change out the CPUs with a CPU you’re buying in year four. By the time you need to use it, you haven’t paid a 300% premium for buying the latest and greatest at day one. That said, it’s also about effective procurement. You know, you want 20 servers, that’s fine, but buy the servers you want for year one and year two, and then, year three, upgrade the components. Year four, upgrade. Year five, upgrade. You know, like incremental improvement. It means you’re not paying a really high sunk energy cost at year one. Also you’re saving on procurement cost, because you don’t buy it the second it’s new. Two years later it’s half the price. If you haven’t used it to its fullest potential in years one and two, you fundamentally get a 50% saving if you only buy it in year three. But nobody thinks like that. It’s more like “fire and forget.”

Chris Engelbert: Especially for CPUs. In three years time, you have quite some development. Maybe a new generation, same socket, lower TDP, something like that. Anyhow, you shocked me with the 30%. I think I have to look at my server in the basement.

Rich Kenny: It’s crazy. Especially now that we get persistent RAM, which actually doesn’t act like RAM; it acts more like storage in some aspects and persists the data in memory. That stuff is fairly energy intensive, because it’s sitting there, constantly using energy, even when the system isn’t doing anything. But realistically, yeah, your RAM is a relatively big energy user. We know, for every tier of gigabytes, you’ve got an actual wattage figure. So it’s not inconsequential, and that’s a really easy one. That’s not exactly everything we look at, but there’s aspects of that.

Chris Engelbert: Alright, so we had CPUs, and we had RAM. You also mentioned graphics cards. I think if you have a server with a lot of graphic cards it’s obvious that it’ll use a lot of energy. You had RAM. Anything else that comes to mind? I think hard disk drives are probably worse than SSDs and NVMe drives.

Rich Kenny: Yeah, that’s an interesting one. So storage is a really fascinating one for me, because I think we’re moving back towards tape storage as a carbon-efficient method of storage. And people always look at me and go “why would you say that?” Well, if you accept the fact that 60 to 70% of data is worthless, as in you may use it once but never again. That’s a pretty standard metric. I think it may be as high as 90%. I mean data that doesn’t get used again. However, at least 65% of the data will never get used. And what we have is loads of people moving that storage to the cloud and saying that they can now immediately access data whenever they want, but they will never use or look at it again. So it sits there, on really highly available SSDs, and I can retrieve this information I never want, instantly.

Well, the SSD wears over time. Every time you read or write, every time you pass information through it, it wears out a bit more. That’s just how flash memory works. HDDs have a much longer life cycle than SSDs, but lower performance. Your average hard drive uses around six watts and an SSD uses four. So your thinking is “it is 34% more efficient to use SSDs.” And it is, except that there’s an embodied cost of the SSD. The creation of the SSD is 10 to 15x higher than that of a hard drive. So if you’re storing data that you never use, no one’s ever using that six watts of read and write. It just sits there with a really high sunk environmental cost until it runs out, and then you may be able to reuse it. You might not. But realistically, you’re going to get through two or three life cycles of SSDs for every hard drive. If you never look at the data, it’s worthless. You’ve got no benefit there, but there’s a huge environmental cost for all materials from a storage point of view. Take another great example. If you’ve got loads of storage on the cloud and you never read it, but you have to store it. Like medical data for a hundred years. Why are you storing that data on SSDs, for a hundred years, in the cloud and paying per gigabyte? You could literally save a million pounds worth of storage onto one tape and have someone like Iron Mountain run your archive as a service for you. You can say, if you need any data, they’ll retrieve it and pass it into your cloud instance. And there’s a really good company called Tes in the UK. Tes basically has this great archival system. And when I was talking to them, it really made sense of how we position systems-of-systems thinking. They run tape. So they take all your long-term storage and put it on tape. But they give you an RTO of six hours. You just raise a ticket, telling them that you need the information on this patient, and they retrieve it and put it into your cloud instance. You won’t have it immediately, but no one needs that data instantaneously. Otherwise it’s sitting there on NVMe storage, which has a really high environmental energy cost, not to forget the financial cost, just to be readily available when you never need it. Instead, stick it in a vault on tape for 30 years and have someone bring it when you need it. You know, you drop your cost 99-fold.

Chris Engelbert: That makes a lot of sense, especially with all data that needs to be stored for regulatory reasons or stuff like that. And I think some people kind of try to solve that, or mitigate it a little bit, by using some tiering technologies, going from NVMe down to HDD, and eventually, maybe, to something like S3, or even S3 Glacier. But I think that tape is still one step below that.

Rich Kenny: Yeah, S3 Glacier storage. I heard a horror story of guys moving to S3 Glacier storage as an energy and cost saving mechanism, but not understanding that you pay per file, and not per terabyte or gigabyte. They ended up with a six-figure cost to move the data over. Still, they say it’s going to save them three grand a year. But now the payback point is like 50 decades.

It’s like you don’t realize when you make these decisions. There’s a huge egress cost there, whereas how much would it have cost to take that data and just stick it onto a tape? 100? 200 quid. You know, you talk about significant cost savings, and environmentally, you’re not looking after the systems. You’re not looking after the storage. You’re using an MSP to hold that storage for you, and then guarantee your retrieval within the timescales you want. It’s a very clever business model, and I think we need to revisit when tape is the best option for long-term, archival storage. From an energy point of view and a cost point of view, it’s very clever, and sustainability-wise, it’s a real win. So yeah. Tape as a service. It’s a thing. You heard it here first.

Chris Engelbert: So going from super old technology to a little bit newer stuff. What would drive sustainability in terms of new technologies? I hinted at a lower TDP for new CPUs. Probably the same goes for RAM. I think the chips get lower in wattage? Or watt-usage over time? Are there any other specific factors?

Rich Kenny: Yeah, I think the big one for me is the new DDR5 RAM, which is really good. It unlocks a lot of potential at the CPU level, as in, the most recent jump in efficiency is not coming from CPUs. Moore’s law slowed down in 2015. I still think it’s not hitting the level it was. But the next generation for us is ASIC-based, as in application-specific integrated circuits. There’s not much further the CPU can go. We can still get some more juice out of it, but it’s not doubling every 2 years. So the CPU is not where it’s at. Whereas ASICs are very much where it’s at now, like specific chips built for very specific functions. Just like Google’s TPUs, for example. They’re entirely geared towards encoding for Youtube. 100x more efficient than a CPU or a GPU at doing that task. We saw the rise of the ASIC through Bitcoin, right? Like specific mining ASICs. So I think specific chips are really good news, and new RAM is decent.

Additionally, the GPU wars is an interesting one for me, because we’ve got GPUs, but there’s no really definable benchmark for comparison of how good a GPU is, other than total work. So we have this thing where it’s like, how much total grunt do you have? But we don’t really have metrics of how much grunt per watt? GPUs have always been one of those things to power supercomputers with. So it does 1 million flops, and this many MIPS, and all the rest of it. But the question has to be “how good does it do it? How good is it doing its job?” It’s irrelevant how much total work it can do. So we need a rebalancing of that. That’s not there yet, but I think it will come soon, so we can understand what GPU specific functions are. The real big change for us is behavioral change. Now, I don’t think it’s technology. Understanding how we use our assets. Visualizing the use in terms of non economic measures. So basically being decent digital citizens, I think, is the next step. I don’t think it’s a technological revolution. I think it’s an ethical revolution. Where people are going to apply grown-up thinking to technology problems rather than expecting technology to solve every problem. So yeah, I mean, there are incremental changes. We’ve got some good stuff. But realistically, the next step of evolution is how we apply our human brains to solve technological problems rather than throw technology at problems and hope for the solution.

Chris Engelbert: I think it’s generally a really important thing that we try not to just throw technology at problems, or even worse, create technology in search of a problem all the time.

Rich Kenny: We’re a scale-up business at Interact. We’re doing really, really well, but we don’t act like a scale-up. Last year I was mentoring some startup guys and some projects that we’ve been doing in the UK. And 90% of people were applying technology to solve a problem that didn’t need solving. The question I would ask these people is “what does this do? What is this doing? And does the world need that?”

Well, it’s a problem. I feel like you’ve created a problem because you have the solution to a problem. It’s a bit like an automatic tin opener. Do we need a Diesel-powered chainsaw tin opener to open tins? Or do we already kind of have tin openers? How far do we need to innovate before it’s fundamentally useless?

I think a lot of problems are like, “we’ve got AI, and we’ve got technology, so now we’ve got an app for that.” And it’s like, maybe we don’t need an app for that. Maybe we need to just look at the problem and go, “is it really a problem?” Have you solved something that didn’t need solving? And a lot of ingenuity and waste goes into solving problems that don’t exist. And then, conversely, there’s loads of stuff out there that solves really important problems. But they get lost in the mud, because they can’t articulate the problem it’s solving.

And in some cases, you know, the ones that are winning are the ones that sound very attractive. I remember there was a med-tech one that was talking about stress management. And it was providing all these data points on what levels of stress you’re dealing with. And it’s really useful to know that I’m very stressed. But other than telling me all these psychological factors, that I am feeling stressed, what is the solution in the product, other than to give me data telling me that I’m really stressed? Well, there isn’t any. It doesn’t do anything. It just gives you that data. And it’s like, right? And now what? And then: we can take that data, it’ll solve the problem later on. It’s like, no, you’re just creating a load of data to tell me things that I don’t really think have any benefit. If you’ve got a solution where, with this data, we can make this inference, we can solve this problem, that’s really useful. But actually, you’re just creating a load of data and going, “and what do I do with that?” And you go, “don’t know. It’s up to you.” Okay, well, it tells me that it looks like I’m struggling today. Not really helpful. Do you know what I mean?

Chris Engelbert: Absolutely! Unfortunately, we’re out of time. I could chat about that for about another hour. You must have been so happy when the proof of work finally got removed from all the blockchain stuff. Anyway, thank you very much. It was very delightful.

I love chatting and just laughing, because you hear all the stories from people. Especially about things you’re normally not part of, as with the RAM. Like I said, you completely shocked me with the 30%. Obviously, RAM takes some amount of energy. But I didn’t know that it takes that much.

Anyway, I hope that some other folks actually learned something, too, and apply a little bit of ethical thinking in the future, whenever we create new startups, build new data centers, deploy new hardware, and think about sustainability.

Rich Kenny: Thank you very much. Appreciate it.

The post Data center and application sustainability with Rich Kenny from Interact (interview) appeared first on simplyblock.

AWS EBS Pricing: A Comprehensive Guide https://www.simplyblock.io/blog/aws-ebs-pricing-a-comprehensive-guide/ Wed, 28 Feb 2024 12:13:26 +0000 https://www.simplyblock.io/?p=322 In the vast landscape of cloud computing, Amazon Elastic Block Store (Amazon EBS) stands out as a crucial component for storage in AWS’ Amazon EKS (Elastic Kubernetes Service), as well as other AWS services. As businesses increasingly migrate to the cloud, or build newer applications as cloud-native services, understanding the cloud cost becomes essential for […]

In the vast landscape of cloud computing, Amazon Elastic Block Store (Amazon EBS) stands out as a crucial component for storage in AWS’ Amazon EKS (Elastic Kubernetes Service), as well as other AWS services.

As businesses increasingly migrate to the cloud, or build newer applications as cloud-native services, understanding the cloud cost becomes essential for cost-effective operations. With Amazon EBS often making up 50% or more of the cloud cost, it is important to grasp the intricacies of Amazon EBS pricing, explore the key concepts, and find the main factors that influence cost, as well as strategies to optimize expenses.

Understanding Amazon EBS

Amazon EBS provides scalable block-level storage volumes for use with Amazon EKS Persistent Volumes, EC2 instances, and other Amazon services. It offers various volume types, each designed for specific use cases, such as General Purpose (SSD), Provisioned IOPS (SSD), and HDD-based volumes. The choice of volume type significantly impacts performance and cost, making it vital to align storage configurations with application requirements.

Amazon EBS Pricing Breakdown

AWS pricing is complicated and requires studying the different regions and available options, as well as making good estimations of a service’s behavior in terms of speed and capacity requirements.

Amazon EBS comes with a set of factors that influence availability, performance, capacity, and, most prominently, cost.

Volume Type and Performance

Different workloads demand different levels of performance. Understanding the nature of your applications and selecting the appropriate volume type is crucial to balance cost and performance. The available volume types will be discussed further down in the blog post.

Volume Size

Amazon EBS volumes come in various sizes, and costs scale with the amount of provisioned storage per volume. Assessing the actual storage requirements and adjusting volume sizes accordingly to avoid over-provisioning can reduce costs quite significantly.

Snapshot Costs

Creating snapshots for backup and disaster recovery is a common practice. However, snapshot costs can accumulate: the cost scales with the number and types of snapshots created, especially as the frequency and volume of snapshots increase. Additionally, there are two types of snapshots: standard, which is the default, and archive, which is cheaper on the storage side but incurs cost when being restored. Implementing a snapshot management strategy to control expenses is crucial.

Throughput and I/O Operations

Throughput and I/O operations may or may not incur additional costs, depending on the selected volume type.

While data transfer is often easy to estimate, the necessary values for throughput and I/O operations per second (also known as IOPS) are much harder. Especially IOPS can make up a fair share of the spend when running I/O-intensive workloads, such as databases, data warehouses, high-load webservers, or similar.

Be mindful of the amount of data transferred in and out of your EBS volumes, as well as the number of I/O operations performed.
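One way to ground these estimates is to measure actual I/O activity before provisioning. A sketch using boto3 and the standard CloudWatch EBS metrics (VolumeReadOps / VolumeWriteOps) could look like the following; the volume ID is hypothetical and the 24-hour window is just an example.

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")
volume_id = "vol-0123456789abcdef0"  # hypothetical volume ID

def hourly_ops_rates(metric_name: str) -> list[float]:
    """Average ops/s per 1-hour bucket over the last 24 hours."""
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EBS",
        MetricName=metric_name,
        Dimensions=[{"Name": "VolumeId", "Value": volume_id}],
        StartTime=datetime.utcnow() - timedelta(days=1),
        EndTime=datetime.utcnow(),
        Period=3600,
        Statistics=["Sum"],
    )
    # Ops per second = total ops in the bucket / seconds per bucket
    return [point["Sum"] / 3600 for point in stats["Datapoints"]]

for metric in ("VolumeReadOps", "VolumeWriteOps"):
    rates = hourly_ops_rates(metric)
    if rates:
        print(f"{metric}: peak hourly average {max(rates):,.0f} ops/s")
```

Measured peaks like these give a far better basis for choosing provisioned IOPS than guesswork, and help avoid paying for headroom that is never used.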

Amazon EBS Volume Types

As mentioned above, Amazon EBS has quite the set of different volume types. Some are designed for specific use cases or to provide a cost-effective alternative, while others are older or newer generations for the same usage scenario.

An in-depth technical description of the different volume types can be found on AWS’ documentation .

Cheap Storage Volumes (st1 / sc1)

The first category is designed for storage volumes that require large amounts of data storage which, at the same time, doesn’t need to provide the highest performance characteristics.

Being based upon HDDs, these volumes have high access latency and fairly low transfer speed. Volumes can be scaled up to 16 TiB each though, reaching a high capacity at a cheap price.

Durability is typically given as 99.8% – 99.9%, which corresponds to an annual failure rate of 0.1% – 0.2% per volume. Warm (throughput optimized) and cold volumes are available, relating to the types st1 and sc1 respectively.

General Purpose Volumes (gp2 / gp3)

The second category is, what AWS calls, general purpose. It has the widest applicability and is the default option when looking for an Amazon EBS volume.

When creating volumes, gp2 should be avoided, being the old generation at the same price but with fewer features. gp3 provides higher throughput and IOPS than st1 and sc1 volumes due to being SSD-based storage. Like the HDD-based volumes, durability is in the same range of 99.8% – 99.9%, i.e. an annual failure rate of 0.1% – 0.2%. Likewise with capacity: volumes can be scaled up to 16 TiB each and are therefore a good fit for a variety of use cases, such as boot volumes, simple transactional workloads, smaller databases, and similar.

Provisioned IOPS Volumes (io1 / io2)

The third option is high-performance SSD (and NVMe) based volumes, designed for I/O-intensive workloads such as large transactional databases. Their distinguishing feature is that IOPS can be provisioned independently of the volume size, at an additional cost, as shown in the pricing example below. io2 additionally targets a much higher durability of 99.999%.

Amazon EBS Pricing

Prices for Amazon EBS volumes and additional upgrades depend on the region they are created in. For that reason, it is not possible to give one universal price list. It is possible, however, to give an overview of which features are priced separately, along with an example for one specific region.

The base Amazon EBS volume types, ordered from cheapest to most expensive (per GB-month):

1. HDD-based sc1
2. HDD-based st1
3. SSD-based gp3
4. SSD-based gp2
5. SSD-based io1 and io2

In addition to the base pricing, there are certain capabilities or aspects which can be increased for an additional cost, such as I/O Operations per Second (IOPS) and throughput.

Amazon EBS Pricing Example

(Figure: Amazon EBS cost breakdown of an io2 volume)

And this is where it gets a bit more complicated. Every type of volume has its own set of base and maximum available capabilities, and not all capabilities are available on all volume types.

In our example, we want to create an Amazon EBS volume of type io2 in us-east-1 with 10 TB of storage capacity. In addition, we want to increase the available IOPS to 80,000 – just to make it complicated. For newer io2 volumes, the throughput scales proportionally with provisioned IOPS up to 4,000 MiB/s, meaning we don’t have to pay extra for it.

Base price for the io2 volume: The volume’s base cost is 0.125 USD/GB-month. That means our 10 TB volume comes to 1,250 USD per month.

Throughput capability pricing: The throughput of up to 4,000 MiB/s scales automatically and proportionally with the provisioned IOPS, so all is good here. For other volume types, additional throughput (over the base amount) can be bought.

IOPS capability pricing: The pricing for IOPS is the complicated part with io2 volumes. These have multiple “discount stages”: the price tiers are split at 32,000 and 64,000 IOPS.

With that in mind, the IOPS pricing for our 80,000 provisioned IOPS can be broken down into:

First 32,000 IOPS * 0.065 USD/IOPS-month = 2,080.00 USD/month
Next 32,000 IOPS (32,001 – 64,000) * 0.046 USD/IOPS-month = 1,472.00 USD/month
Last 16,000 IOPS (64,001 – 80,000) * 0.032 USD/IOPS-month = 512.00 USD/month

Cost of the io2 volume: That means, including all cost factors (USD 1,250.00 + USD 2,080.00 + USD 1,472.00 + USD 512.00), the cost builds up to a monthly fee of USD 5,314.00 – for a single volume.
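
The tiered calculation is easy to get wrong by hand, so here is a small sketch that reproduces it in Python. The per-tier rates are the us-east-1 list prices used above and may differ in other regions:

# Tiered io2 IOPS pricing plus flat per-GB storage pricing (us-east-1 rates).
STORAGE_USD_PER_GB_MONTH = 0.125
IOPS_TIERS = [  # (tier upper bound, USD per IOPS-month)
    (32_000, 0.065),
    (64_000, 0.046),
    (float("inf"), 0.032),
]

def io2_monthly_cost(size_gb: int, provisioned_iops: int) -> float:
    cost = size_gb * STORAGE_USD_PER_GB_MONTH
    lower = 0
    for upper, rate in IOPS_TIERS:
        in_tier = min(provisioned_iops, upper) - lower
        if in_tier <= 0:
            break
        cost += in_tier * rate
        lower = upper
    return cost

# The example from the text: 10 TB (10,000 GB) with 80,000 provisioned IOPS.
print(f"USD {io2_monthly_cost(10_000, 80_000):,.2f} per month")  # USD 5,314.00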

Strategies to Optimize Amazon EBS Spending

As just shown, Amazon EBS volumes can be expensive. Therefore, it is important to keep the following strategies for cost reduction and optimization in mind.

Rightsize Your Volumes

Regularly assess your storage requirements and resize volumes accordingly. Downsizing (which requires migrating data to a smaller volume, since EBS volumes can only grow in place) or upsizing volumes based on actual needs can result in significant cost savings. If auto-growing of volumes is enabled, keep the disk growth in check. Log files, or similar, running amok can blow your spend limit within hours.
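
Since EBS volumes can only be grown in place, size changes via the API are always increases. A sketch, assuming a hypothetical volume ID:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption

# Grow an under-sized volume in place; the filesystem must be extended afterwards
# (e.g. with growpart/resize2fs on Linux). Shrinking requires migrating the data.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",  # placeholder ID
    Size=200,  # new size in GiB, must be larger than the current size
)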

Utilize Provisioned IOPS Wisely

Provisioned IOPS volumes offer high-performance storage but come at a high cost. Use them judiciously (and not ludicrously) for applications that require consistent and low-latency performance, and consider alternatives for less demanding workloads.

Implement Snapshot Lifecycle Policies

Set up lifecycle policies for snapshots to manage retention periods and reduce unnecessary storage costs. Periodically review and clean up outdated snapshots to optimize storage usage.
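
Snapshot lifecycle policies can be automated with Amazon Data Lifecycle Manager. A minimal sketch of a daily-snapshot policy with a 7-day retention; the role ARN and the tag used for targeting volumes are assumptions:

import boto3

dlm = boto3.client("dlm", region_name="us-east-1")  # region is an assumption

# Daily snapshots at 03:00 UTC for all volumes tagged Backup=true,
# keeping only the last 7 snapshots per volume.
dlm.create_lifecycle_policy(
    ExecutionRoleArn="arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole",  # placeholder
    Description="Daily EBS snapshots, 7-day retention",
    State="ENABLED",
    PolicyDetails={
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": "Backup", "Value": "true"}],  # tag is an assumption
        "Schedules": [
            {
                "Name": "DailySnapshots",
                "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
                "RetainRule": {"Count": 7},
            }
        ],
    },
)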

Leverage EBS-Optimized Instances

Use EC2 instances that are EBS-optimized for better performance. This ensures that the network traffic between EC2 instances and EBS volumes does not negatively impact overall system performance.
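
Whether an instance is EBS-optimized can be checked, and for supported instance types enabled, via the API. A sketch with a placeholder instance ID:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption
instance_id = "i-0123456789abcdef0"  # placeholder ID

# Check whether the instance is EBS-optimized ...
attr = ec2.describe_instance_attribute(InstanceId=instance_id, Attribute="ebsOptimized")
print("EBS-optimized:", attr["EbsOptimized"]["Value"])

# ... and enable it (the instance must be stopped, and the type must support it).
if not attr["EbsOptimized"]["Value"]:
    ec2.modify_instance_attribute(InstanceId=instance_id, EbsOptimized={"Value": True})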

Conclusive Thoughts

As businesses continue to leverage AWS services, understanding and optimizing Amazon EBS spending is a key aspect of efficient cloud management. By carefully selecting the right volume types, managing sizes, and implementing cost-saving strategies, organizations can strike a balance between performance and cost-effectiveness in their cloud storage infrastructure. Regular monitoring and adjustment of storage configurations will contribute to a well-optimized and cost-efficient AWS environment.

If this feels too complicated, or the requirements are hard to predict, simplyblock offers an easier, more scalable, and future-proof solution. It runs right in your AWS account, provides the fastest and easiest way to build your own Amazon EBS alternative for Kubernetes, and saves 60% and more on storage cost at the same time. Learn here how simplyblock works.
