Serverless Compute Need Serverless Storage

The use of serverless infrastructures is steeply increasing. As the Datadog “State of Serverless 2023” survey shows, more than half of all cloud customers have already adopted a serverless environment on the three big hyperscalers—at least to some extent. The promise of saving cost while automatically and indefinitely scaling (up and down) keeps growing the user base.

Due to this movement, other cloud operators, many database companies (such as Neon and Nile), and infrastructure teams at large enterprises are building serverless environments, either on their premises or in their private cloud platforms.

While there are great options for serverless compute, providing serverless storage to your serverless platform tends to be more challenging. This is often fueled by a lack of understanding of what serverless storage has to provide and its requirements.

What is a Serverless Architecture?

Serverless architecture is a software design pattern that leverages serverless computing resources to build and run applications without managing the underlying architecture. These serverless compute resources are commonly provided by cloud providers such as AWS Lambda, Google Cloud Functions, or Azure Functions and can be dynamically scaled up and down.

Simplified serverless architecture with different clients connecting through an API gateway, a set of serverless functions executing the business logic, and a database as an example of serverless storage.

When designing a serverless architecture, you’ll encounter the so-called Function-as-a-Service (FaaS), meaning that the application’s core logic will be implemented in small, stateless functions that respond to events.

That said, an application is typically made up of several FaaS functions that send events between them. Since the underlying infrastructure is abstracted away, the functions don’t know how requests or responses are handled, and their implementations are built against a cloud-provider-specific API, which effectively bakes in vendor lock-in.

Cloud-vendor-agnostic solutions exist, such as knative, but require at least parts of the team to manage the Kubernetes infrastructure. They can, however, take the burden away from other internal and external development teams.
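To make that concrete, here is a minimal sketch of a function deployed through knative (Knative Serving). The service and image names are hypothetical, and the autoscaling annotation keys follow recent Knative releases; check the version you run.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: checkout-function                  # hypothetical function name
spec:
  template:
    metadata:
      annotations:
        # Scale to zero when idle and cap the number of instances.
        autoscaling.knative.dev/min-scale: "0"
        autoscaling.knative.dev/max-scale: "10"
    spec:
      containers:
        - image: registry.example.com/checkout-function:1.0.0  # placeholder image
          ports:
            - containerPort: 8080          # Knative routes HTTP traffic to this port
```

Knative turns this into a revisioned, autoscaled HTTP service—exactly the FaaS behavior described above, just on your own Kubernetes cluster.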

What is Serverless Compute?

While a serverless architecture describes the application design that runs on top of a serverless compute infrastructure, serverless compute itself describes the cloud computing model in which the cloud provider dynamically manages the allocation and provisioning of server resources.

Simplified serverless platform architecture: edge services (UI, API gateway, event sources), platform services (event queue, dispatcher), and the workers that run the actual serverless functions.

It is essential to understand that serverless doesn’t mean “without servers” but “as a user, I don’t have to plan, provision, or manage the infrastructure.”

In essence, the cloud provider (or whoever manages the serverless infrastructure) takes the burden from the developer. Serverless compute environments fully auto-scale, starting or stopping instances of the functions according to the needed capacity. Due to their stateless nature, it’s easy to stop and restart them at any point in time. That means that function instances are often very short-lived.

Popular serverless compute platforms include AWS Lambda, Google Cloud Functions, and Azure Functions. For self-managed operations, there is knative (mentioned before), as well as OpenFaaS and OpenFunction (the latter of which seems to have seen less activity recently).

They all enable developers to focus on writing code without managing the underlying infrastructure.

What is a Serverless Storage System?

Serverless storage refers to a cloud storage model where the underlying infrastructure, capacity planning, and scaling are abstracted away from the user. With serverless storage, customers don’t have to worry about provisioning or managing storage servers or volumes. Instead, they can store and retrieve data while the serverless storage handles all the backend infrastructure.

Serverless storage solutions come in different forms and shapes, beginning with an object storage interface, such as Amazon S3 or Google Cloud Storage. Object storage is excellent when storing unstructured data, such as documents or media.

Serverless storage options available in the three major hyperscalers (GCP, AWS, and Azure).

Another option that people love to use for serverless storage is serverless databases. Various options are available, depending on your needs: relational, NoSQL, time-series, and graph databases. This might be the easiest way to go, depending on how you need to access data. Examples of such serverless databases include Amazon Aurora Serverless, Google’s Cloud Datastore, and external companies such as Neon or Nile.

When self-managing your serverless infrastructure with knative or one of the alternative options, you can use Kubernetes CSI storage providers to provide storage to your functions. However, you may add considerable startup time if you choose the wrong CSI driver. I might be biased, but simplyblock is an excellent option with its negligible provisioning and attachment times, as well as features such as multi-attach, where a volume can be attached to multiple functions (for example, to provide a shared set of data).
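As a rough illustration, provisioning through a CSI driver boils down to a StorageClass and a PersistentVolumeClaim. The provisioner name below is hypothetical (check your driver’s documentation), and ReadWriteMany multi-attach only works if the driver supports it.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: serverless-fast
provisioner: csi.simplyblock.io      # hypothetical provisioner name
allowVolumeExpansion: true           # needed for online resizing later
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-function-data
spec:
  accessModes:
    - ReadWriteMany                  # multi-attach: several function instances share the volume
  storageClassName: serverless-fast
  resources:
    requests:
      storage: 10Gi
```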

Why Serverless Architectures?

Most people think of cost-efficiency when it comes to serverless architectures. However, this is only one side of the coin, and it only holds true if your use cases are a good fit for a serverless environment (more on when serverless makes sense later).

In serverless architectures, functions are triggered through an event, either from the outside world (like an HTTP request) or an event initiated by another function. If no function instance is up and running, a new instance will be started. The same goes for situations where all function instances are busy. If function instances idle, they’ll be shut down.

Serverless functions usually use a pay-per-use model. A function’s extremely short lifespan can lead to cost reductions over deployment models like containers and virtual machines, which tend to run longer.

Apart from that, serverless architectures have more benefits. Many point in the same direction as those of microservices architectures, but with the promise that serverless is easier to implement and maintain.

First and foremost, serverless solutions are designed for scalability and elasticity. They quickly and automatically scale up and down depending on the incoming workload. It’s all hands-free.

Another benefit is that development cycles are often shortened. Due to the limited size and functionality of a FaaS, changes are fast to implement and easy to test. Additionally, updating the function is as simple as deploying the new version. All existing function instances finish their current work and shut down. In the meantime, the latest version will be started up. Due to its stateless nature, this is easy to achieve.
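On Kubernetes-based platforms, such an update can even be rolled out gradually. Below is a sketch using Knative’s traffic splitting between two revisions; the revision and image names are hypothetical.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: checkout-function
spec:
  template:
    metadata:
      name: checkout-function-v2                 # new revision to roll out
    spec:
      containers:
        - image: registry.example.com/checkout-function:2.0.0  # placeholder image
  traffic:
    - revisionName: checkout-function-v1         # previous revision keeps most traffic
      percent: 90
    - revisionName: checkout-function-v2
      percent: 10                                # canary share for the new version
```

Because the functions are stateless, shifting percentages between revisions is safe at any point.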

What are the Complexities of Serverless Architecture?

Writing serverless solutions has the benefits of fast iteration, simplified deployments, and potential cost savings. However, they also come with their own set of complexities.

Designing real stateless code isn’t easy, at least when we’re not just talking about simple transformation functionality. That’s why a FaaS receives and passes context information along during its events.


What works great for small bits of context becomes challenging for larger pieces. In this situation, a larger context, or state, can mean lots of things: from simple cross-request information that should be available without transferring it with every request, through more involved data such as lookup information to enrich and cross-check, all the way to actually complex data, like when you want to implement a serverless database. And yes, a serverless database needs to store its data somewhere.

That’s where serverless storage comes in, and simply put, this is why all serverless solutions have state storage alternatives.

What is Serverless Storage?

Serverless storage refers to storage solutions that are fully integrated into serverless compute environments without manual intervention. These solutions scale and grow according to user demand and complement the pay-by-use payment model of serverless platforms.

Serverless storage lets you store information across multiple requests or functions. 

As mentioned above, cloud environments offer a wide selection of serverless storage options. However, all of them are vendor-bound and lock you into their services. 

However, when you design your own serverless infrastructure or service, these services don’t help you. It’s up to you to provide the serverless storage. In this case, a cloud-native storage engine with serverless support can simplify this task immensely. Whether you want to provide object storage, a serverless database, or file-based storage, an underlying cloud-native block storage solution is the perfect building block underneath. This block storage solution, however, needs to scale and grow with your needs, be quick to provision, and support snapshotting, cloning, and attaching to multiple function instances.

Why do Serverless Architectures Require Serverless Storage?

Serverless storage has particular properties designed for serverless environments. It needs to keep up with the specific requirements of serverless architectures: short lifetimes, extremely fast scaling up and down or restarts, easy use across multiple versions during updates, and easy integration through the APIs utilized by the FaaS.

The most significant requirements are that storage must be usable by multiple function instances simultaneously and must be quickly available to new instances on other nodes, regardless of whether those instances are migrated over or started for scaling out. That means the underlying storage technology must be prepared to handle these tasks easily.

These are just the most significant requirements, but there are more:

  1. Stateless nature: Serverless functions spin up, execute, and terminate due to their stateless nature. Without fast, persistent storage that can be attached or accessed without any additional delay, this fundamental property of serverless functions would become a struggle.
  2. Scalability needs: Serverless compute is built to scale automatically based on user demand. A storage layer needs to seamlessly support the growth and shrinking of serverless infrastructures and handle variations in I/O patterns, meaning that traditional storage systems with fixed capacity limits don’t align well with the requirements of serverless workloads.
  3. Cost efficiency: One reason people engage with serverless compute solutions is cost efficiency. Serverless compute users pay by actual execution time. That means that serverless storage must support similar payment structures and help serverless infrastructure operators efficiently manage and scale their storage capacities and performance characteristics.
  4. Management overhead: Serverless compute environments are designed to eliminate manual server management. Therefore, the storage solution needs to minimize its manual administrative tasks. Allocating and scaling storage must be fully automatable via API calls or happen fully automatically. Also, if multiple storage tiers are available for additional cost savings, their integration must be seamless.
  5. Performance requirements: Serverless functions require fast, if not immediate, access to data when they spin up. Traditional storage solutions introduce delays due to allocation and additional latency, negatively impacting serverless functions’ performance. As functions are paid by runtime, their operational cost increases.
  6. Integration needs: Serverless architectures typically combine many services, as individual functions use different services. That said, the underlying storage solution of a serverless environment needs to support all kinds of services provided to users. Additionally, seamless integration with the management services of the serverless platform is required.

That is quite a list of requirements. To align serverless compute and serverless storage, storage solutions need to provide an efficient and manageable layer that seamlessly integrates with the overall management layer of the serverless platform.
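To make the snapshot-and-clone requirement tangible, here is a minimal sketch using the standard Kubernetes VolumeSnapshot API: snapshot a volume once, then clone it into a fresh PVC for a newly started function instance. The snapshot class name is hypothetical, and the PVC names reuse the earlier example.

```yaml
# Take a crash-consistent snapshot of the shared data volume.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: function-data-snap
spec:
  volumeSnapshotClassName: csi-snapclass     # hypothetical snapshot class
  source:
    persistentVolumeClaimName: shared-function-data
---
# Clone the snapshot into a new volume for a freshly scaled-out instance.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: function-data-clone
spec:
  storageClassName: serverless-fast
  dataSource:
    name: function-data-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

If the CSI driver implements snapshots and clones as copy-on-write references, both operations are near-instant, which is exactly what short-lived function instances need.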

Simplyblock for Serverless Storage

When designing a serverless environment, the storage layer must be designed to keep up with the pace. Simplyblock enables serverless infrastructures to provide dynamic and scalable storage.

To achieve this, simplyblock provides several characteristics that perfectly align with serverless principles:

  1. Dynamic resource allocation: Simplyblock’s thin provisioning makes capacity planning irrelevant. Storage is allocated on demand as data is written, similar to how serverless platforms allocate resources. That means every volume can be arbitrarily large to accommodate unpredictable future growth. Additionally, simplyblock’s logical volumes are resizable, meaning that a volume can be enlarged at any point in the future (see the resize sketch after this list).
  2. Automatic scaling: Simplyblock’s storage engine can grow indefinitely. To acquire additional backend storage, simplyblock can automatically acquire additional persistent disks (like Amazon EBS volumes) from cloud providers or attach additional storage nodes to its cluster when capacity is about to be exceeded, handling scaling without user intervention.
  3. Abstraction of infrastructure: Users interact with simplyblock’s virtual drives like normal hard disks. This abstracts away the complexity of the underlying storage pooling and backend storage technologies.
  4. Unified interface: Simplyblock provides a unified storage interface, an NVMe logical device, that abstracts away underlying, diverging storage interfaces behind an easy-to-understand disk design. That enables services not specifically designed to talk to object storage or similar technologies, such as PostgreSQL or MySQL, to immediately benefit from them.
  5. Extensibility: Due to its disk-like storage interface, simplyblock is highly extensible in terms of solutions that can be run on top of it. Databases, object storage, file storage, and specific storage APIs, simplyblock provides scalable block storage to all of them, making it the perfect backend solution for serverless environments.
  6. Crash-consistent and recoverable: Serverless storage must always be up and running. Simplyblock’s distributed erasure coding (parity information similar to RAID-5 or 6) enables high availability and fault tolerance on the storage level with high storage efficiency, at a capacity overhead way below simple replication. Additionally, simplyblock provides storage cluster replication (sync / async), consistent snapshots across multiple logical volumes, and disaster recovery options.
  7. Automated management: With features like automatic storage tiering to cheaper object storage (such as Amazon S3), automatic scaling, as well as erasure coding and backups for data protection, simplyblock eliminates manual management overhead and hands-on tasks. Simplyblock clusters are fully autonomous and manage the underlying storage backend automatically.
  8. Flexible integration: Serverless platforms require storage to be seamlessly allocated and provisioned. Simplyblock achieves this through its API, which can be integrated into the standard provisioning flow of new customer sign-ups. If the new infrastructure runs on Kubernetes, integration is even easier with the Kubernetes CSI driver, allowing seamless integration with container-based serverless platforms such as knative.
  9. Pay-per-use potential: Due to the automatic scalability, thin provisioning, and seamless resizing and integration, simplyblock enables you to provide your customers with an industry-loved pay-by-use model for managed service providers, perfectly aligning with the most common serverless pricing models.
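As a minimal sketch of what thin provisioning plus resizability looks like in practice, growing a volume is just a change to the PVC’s storage request (assuming the hypothetical StorageClass from the earlier example with allowVolumeExpansion enabled):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-function-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: serverless-fast   # hypothetical class with allowVolumeExpansion: true
  resources:
    requests:
      storage: 50Gi                   # raised from 10Gi; applying this triggers online expansion
```

With thin provisioning, the larger size costs nothing until data is actually written, which is what makes a pay-by-use storage model workable.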

Simplyblock is the perfect backend storage for all your serverless storage needs while future-proofing your infrastructure. As data grows and evolves, simplyblock’s flexibility and scalability ensure you can adapt without massive overhauls or migrations.

Remember, simplyblock offers powerful features like thin provisioning, storage pooling, and tiering, helping you to provide a cost-efficient, pay-by-use enabled storage solution. Get started now and find out how easy it is to operate services on top of simplyblock.

Production-grade Kubernetes PostgreSQL, Álvaro Hernández

In this episode of the Cloud Commute podcast, Chris Engelbert is joined by Álvaro Hernández Tortosa, a prominent figure in the PostgreSQL community and CEO of OnGres. Álvaro shares his deep insights into running production-grade PostgreSQL on Kubernetes, a complex yet rewarding endeavor. The discussion covers the challenges, best practices, and innovations that make PostgreSQL a powerful database choice in cloud-native environments.

This interview is part of the simplyblock Cloud Commute Podcast, available on YouTube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

Key Takeaways

Q: Should you deploy PostgreSQL in Kubernetes?

Deploying PostgreSQL in Kubernetes is a strategic move for organizations aiming for flexibility and scalability. Álvaro emphasizes that Kubernetes abstracts the underlying infrastructure, allowing PostgreSQL to run consistently across various environments—whether on-premise or in the cloud. This approach not only simplifies deployments but also ensures that the database is resilient and highly available.

Q: What are the main challenges of running PostgreSQL on Kubernetes?

Running PostgreSQL on Kubernetes presents unique challenges, particularly around storage and network performance. Network disks, commonly used in cloud environments, often lag behind local disks in performance, impacting database operations. However, these challenges can be mitigated by carefully choosing storage solutions and configuring Kubernetes to optimize performance. Furthermore, managing PostgreSQL’s ecosystem—such as backups, monitoring, and high availability—requires robust tooling and expertise, which can be streamlined with solutions like StackGres.

Q: Why should you use Kubernetes for PostgreSQL?

Kubernetes offers a powerful platform for running PostgreSQL due to its ability to abstract infrastructure details, automate deployments, and provide built-in scaling capabilities. Kubernetes facilitates the management of complex PostgreSQL environments, making it easier to achieve high availability and resilience without being locked into a specific vendor’s ecosystem.

Q: Can I use PostgreSQL on Kubernetes with PGO?

Yes, you can. Tools like the PostgreSQL Operator (PGO) for Kubernetes simplify the management of PostgreSQL clusters by automating routine tasks such as backups, scaling, and updates. These operators are essential for ensuring that PostgreSQL runs efficiently on Kubernetes while reducing the operational burden on database administrators.

EP 06: Building and operating a production-grade PostgreSQL in Kubernetes

In addition to highlighting the key takeaways, it’s essential to provide deeper context and insights that enrich the listener’s understanding of the episode. By offering this added layer of information, we ensure that when you tune in, you’ll have a clearer grasp of the nuances behind the discussion. This approach enhances your engagement with the content and helps shed light on the reasoning and perspective behind the thoughtful questions posed by our host, Chris Engelbert. Ultimately, this allows for a more immersive and insightful listening experience.

Key Learnings

Q: How does Kubernetes scheduler work with PostgreSQL?

Kubernetes uses its scheduler to manage how and where PostgreSQL instances are deployed, ensuring optimal resource utilization. However, understanding the nuances of Kubernetes’ scheduling can help optimize PostgreSQL performance, especially in environments with fluctuating workloads.

simplyblock Insight: Leveraging simplyblock’s solution, users can integrate sophisticated monitoring and management tools with Kubernetes, allowing them to automate the scaling and scheduling of PostgreSQL workloads, thereby ensuring that database resources are efficiently utilized and downtime is minimized.

Q: What is the best experience of running PostgreSQL in Kubernetes?

The best experience comes from utilizing a Kubernetes operator like StackGres, which simplifies the deployment and management of PostgreSQL clusters. StackGres handles critical functions such as backups, monitoring, and high availability out of the box, providing a seamless experience for both seasoned DBAs and those new to PostgreSQL on Kubernetes.
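For a sense of how high-level this gets, here is a hedged sketch of a StackGres cluster definition. It assumes the StackGres operator is installed, and the field names follow the stackgres.io/v1 SGCluster CRD; verify against the StackGres documentation for your version.

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: demo-pg
spec:
  postgres:
    version: '16'        # PostgreSQL major version managed by the operator
  instances: 2           # one primary plus one replica, HA wired up automatically
  pods:
    persistentVolume:
      size: '10Gi'
```

Backups, monitoring, and pooling are configured through companion CRDs referenced from the cluster, which matches the relational CRD design Álvaro describes later in the interview.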

simplyblock Insight: By using simplyblock’s Kubernetes-based solutions, you can further enhance your PostgreSQL deployments with features like dynamic scaling and automated failover, ensuring that your database remains resilient and performs optimally under varying loads.

Q: How does disk access latency impact PostgreSQL performance in Kubernetes?

Disk access latency is a significant factor in PostgreSQL performance, especially in Kubernetes environments where network storage is commonly used. While network storage offers flexibility, it typically has higher latency compared to local storage, which can slow down database operations. Optimizing storage configurations in Kubernetes is crucial to minimizing latency and maintaining high performance.

simplyblock Insight: simplyblock’s advanced storage solutions for Kubernetes can help mitigate these latency issues by providing optimized, low-latency storage options tailored specifically for PostgreSQL workloads, ensuring your database runs at peak efficiency.

Q: What are the advantages of clustering in PostgreSQL on Kubernetes?

Clustering PostgreSQL in Kubernetes offers several advantages, including improved fault tolerance, load balancing, and easier scaling. Kubernetes operators like StackGres enable automated clustering, which simplifies the process of setting up and managing a highly available PostgreSQL cluster.

simplyblock Insight: With simplyblock, you can easily deploy clustered PostgreSQL environments that automatically adjust to your workload demands, ensuring continuous availability and optimal performance across all nodes in your cluster.

Additional Nugget of Information

Q: What are the advantages of clustering in Postgres?

A: Clustering in PostgreSQL provides several benefits, including improved performance, high availability, and better fault tolerance. Clustering allows multiple database instances to work together, distributing the load and ensuring that if one node fails, others can take over without downtime. This setup is particularly advantageous for large-scale applications that require high availability and resilience. Clustering also enables better scalability, as you can add more nodes to handle increasing workloads, ensuring consistent performance as demand grows.

Conclusion

Deploying PostgreSQL on Kubernetes offers powerful capabilities but comes with challenges. Álvaro Hernández Tortosa highlights how StackGres simplifies this process, enhancing performance, ensuring high availability, and making PostgreSQL more accessible. With the right tools and insights, you can confidently manage PostgreSQL in a cloud-native environment.

Full Video Transcript

Chris Engelbert: Welcome to this week’s episode of Cloud Commute podcast by simplyblock. Today, I have another incredible guest, a really good friend, Álvaro Hernández from OnGres. He’s very big in the Postgres community. So hello, and welcome, Álvaro.

Álvaro Hernández Tortosa: Thank you very much, first of all, for having me here. It’s an honor.

Chris Engelbert: Maybe just start by introducing yourself, who you are, what you’ve done in the past, how you got here. Well, except me inviting you.

Álvaro Hernández Tortosa: OK, well, I don’t know how to describe myself, but I would say, first of all, I’m a big nerd, big fan of open source. And I’ve been working with Postgres, I don’t know, for more than 20 years, 24 years now. So I’m a big Postgres person. There’s someone out there in the community that says that if you say Postgres three times, I will pop up there. It’s kind of like Superman or Batman or these superheroes. No, I’m not a superhero. But anyway, professionally, I’m the founder and CEO of a company called OnGres. Let’s guess what it means, On Postgres. So it’s pretty obvious what we do. So everything revolves around Postgres, but in reality, I love all kinds of technology. I’ve been working a lot with many other technologies. I know you because of being a Java programmer, which is kind of my hobby. I love programming in my free time, which almost doesn’t exist. But I try to get some from time to time. And everything related to technology in general, I’m also a big fan and supporter of open source. I have contributed and keep contributing a lot to open source. I also founded some open source communities, like for example, I’m a Spaniard. I live in Spain. And I founded Debian Spain, an association like, I don’t know, 20 years ago. More recently, I also founded a foundation, a nonprofit foundation also in Spain called Fundación PostgreSQL. Again, guess what it does? And I try to engage a lot with the open source communities. We, by the way, organized a conference for those who are interested in Postgres in the magnificent island of Ibiza in the Mediterranean Sea in September this year, 9th to 11th September for those who want to join. So yeah, that’s probably a brief intro about myself.

Chris Engelbert: All right. So you are basically the Beetlejuice of Postgres. That’s what you’re saying.

Álvaro Hernández Tortosa: Beetlejuice, right. That’s a better fit than superheroes. You’re absolutely right.

Chris Engelbert: I’m not sure if he is a superhero, but he’s different at least. Anyway, you mentioned OnGres. And I know OnGres isn’t really like the first company. There were quite a few before, I think, El Toro, a database company.

Álvaro Hernández Tortosa: Yes, Toro DB.

Chris Engelbert: Oh, Toro DB. Sorry, close, close, very close. So what is up with that? You’re trying to do a lot of different things and seem to love trying new things, right?

Álvaro Hernández Tortosa: Yes. So I sometimes define myself as a 0.x serial entrepreneur, meaning that I’ve tried several ventures and sold none of them. But I’m still trying. I like to try to be resilient, and I keep pushing the ideas that I have in the back of my head. So yes, I’ve done several ventures, all of them around certain patterns. So for example, you’re asking about Toro DB. Toro DB is essentially an open source software that is meant to replace MongoDB with, you guessed it, Postgres, right? There’s a certain pattern in my professional life. And Toro DB was... I’m speaking in the past because it is, unfortunately, a no-longer-maintained open source project. We moved on to something else, which is OnGres. But the idea of Toro DB was to essentially replicate these documents live from MongoDB and, in the process, in real time, transform them into a set of relational tables that got stored inside of a Postgres database. So it enabled you to do SQL queries on your documents that were in MongoDB. So think of a MongoDB replica. You can keep your MongoDB cluster if you want, and then you have all the data in SQL. This was great for analytics. You could get great speed-ups by normalizing data automatically and then doing queries with the power of SQL, which obviously is much broader and richer than MongoDB’s query language, especially for analytics. We got like 100 times faster on most queries. So it was an interesting project.

Chris Engelbert: So that means you basically generated the schema on the fly and then generated the table for that schema specifically? Interesting.

Álvaro Hernández Tortosa: Yeah, it was generating tables and columns on the fly.

OnGres StackGres: Operator for Production-Grade PostgreSQL on Kubernetes

Chris Engelbert: Right. Ok, interesting. So now you’re doing the OnGres thing. And OnGres has, I think, the main product, StackGres, as far as I know. Can you tell a little bit about that?

Álvaro Hernández Tortosa: Yes. So OnGres, as I said, means On Postgres. And one of our goals in OnGres is that we believe that Postgres is a fantastic database. I don’t need to explain that to you, right? But it’s kind of the Linux kernel, if I may use this parallel. It’s a bit bare bones. You need something around it. You need a distribution, right? So Postgres is a little bit the same thing. The core is small, it’s fantastic, it’s very featureful, it’s reliable, it’s trustable. But it needs tools around it. So our vision in OnGres is to develop this ecosystem around this Postgres core, right? And one of the things that we experience during our professional lifetime is that Postgres requires a lot of tools around it. It needs monitoring, it needs backups, it needs high availability, it needs connection pooling.

By the way, do not use Postgres without connection pooling, right? So you need a lot of tools around it. And none of these tools come from the core; you need to look into the ecosystem. And actually, this is good and bad. It’s good because there’s a lot of options. It’s bad because there’s a lot of options, meaning which one to choose, which one is good, which one is bad, which one goes with a good backup solution or the good monitoring solution, and how you configure them all. This was a problem that we coined as the stack problem. So when you really want to run Postgres in production, you need a stack on top of Postgres, right? To orchestrate all these components.

Now, the problem is that we’ve been doing this a long time for our customers. Typically, we love infrastructure as code, right? And everything was done with Ansible and similar tools, with Terraform for infrastructure and Ansible for orchestrating these components. But the reality is that every environment into which we looked was slightly different. We couldn’t just take our Ansible code, run it, and get this stack. The storage is different. Your networking is different. Your entry point: here, one is using virtual IPs, that one is using DNS, that one is using proxies. And then the compute is also somehow different. It was not reusable. We were doing a lot of copy, paste, modify, something that was not very sustainable. At some point, we started thinking, is there a way in which we can pack this stack into a single deployable unit that we can take essentially anywhere? And the answer was Kubernetes. Kubernetes provides us this abstraction where we can abstract away the compute, the storage, the networking, and code against a programmable API with which we can indeed create this package. So that’s StackGres.

StackGres is the stack of components you need to run production Postgres, packaged in a way that is uniform across any environment where you want to run it, cloud or on-prem, it doesn’t matter. And it is production ready! It’s packaged at a very, very high level. So basically, you barely need, I would say you don’t need, Postgres knowledge to run a production-ready, enterprise-quality Postgres cluster in production. And that’s the main goal of StackGres.

Chris Engelbert: Right, right. And as far as I know, I think it’s implemented as a Kubernetes operator, right?

Álvaro Hernández Tortosa: Yes, exactly.

Chris Engelbert: And there’s quite a few other operators as well. But I know that StackGres has some things which are done slightly differently. Can you talk a little bit about that? I don’t know how much you wanna actually make this public right now.

Álvaro Hernández Tortosa: No, actually everything is open source. Our roadmap is open source, our issues are open source. I’m happy to share everything. Well, first of all, what I would say is that the operator pattern is essentially these controllers that take actions on your cluster, plus the CRDs. We gave a lot of thought to these CRDs. I would say that for a lot of operators, the CRDs are kind of a byproduct, an afterthought: “I have my objects, and then some script generates the CRDs.” No, we said the CRDs are our user-facing API, our extended API. And the goal of operators is to abstract away and package business logic, right? And expose it with a simple user interface.

So we designed our CRDs to be very, very high level, very amenable to the user, so that, again, you don’t require any Postgres expertise. So if you look at the CRDs, in practical terms the YAMLs, right? The YAMLs that you write to deploy something on StackGres should be so simple that you could explain them to your five-year-old kid, and your five-year-old kid should be able to deploy Postgres as a production-quality cluster, right? That’s our goal. And if we didn’t fulfill this goal, please raise an issue on our public issue tracker on GitLab, because we have definitely failed if that’s not true. So instead of focusing on the usual Postgres user, very knowledgeable, very high level, most operators focus on low-level CRDs, and they require Postgres expertise, probably a lot. We want to make Postgres more mainstream than ever, right? Postgres increases in popularity every year and is being adopted by more and more organizations, but not everybody’s a Postgres expert. We want to make Postgres universally accessible for everyone. So one of the things is that we put a lot of effort into this design. And we also have, instead of one big, gigantic CRD, multiple CRDs. They actually can be attached like in an ER diagram between them. So you understand the relationships, you create one and then reference it many times, and you don’t need to repeat or reconfigure the configuration files. Another area where I would say we have tried to do something is extensions. Postgres extensions are one of the most loved, if not the most loved, feature, right?

And StackGres is the operator that arguably supports the largest number of extensions, over 200 extensions as of now and growing. And we did this because we developed a custom solution, which is also open source, by StackGres, where we can load extensions dynamically into the cluster. So we don’t need to build you a fat container with 200 extensions and a lot of security issues, right? Rather, we deploy you a container with no extensions, and then you say, “I want this, this, this and that,” and they will appear in your cluster automatically. And this is done via simple YAML. So we have a very powerful extension mechanism. And the other thing is that we not only expose the usual CRD YAML interface for interacting with StackGres, which is more than fine and I love it, but it comes with a fully fledged web console. Not everybody likes the command line or the GitOps approach. We do, but not everybody does. And it’s a fully fledged web console which also supports single sign-on, where you can integrate with your AD, with your OIDC provider, anything that you want. It has detailed fine-grained permissions based on Kubernetes RBAC. So you can say, “Who can create clusters, who can view configurations, who can do anything?” And last but not least, there’s a REST API. So if you prefer to automate and integrate with another kind of solution, you can also use the REST API to create and manage clusters. And these three mechanisms, the YAML files / CRDs, the REST API, and the web console, are fully interchangeable. You can use one for one operation and another one for the next; everything goes back to the same state. So you can use whichever one you want.

And lately we have also added sharding. So sharding scales Postgres out with solutions like Citus, but we also support Postgres with partitioning over foreign data wrappers, and Apache ShardingSphere. Our way is to create a cluster of multiple instances: not only one primary and one replica, but a coordinator layer and then shards, and each shard is again a primary with replicas. So typically dozens of instances, and you can create them with a simple YAML file and a very high-level description; it requires little knowledge, and StackGres wires everything up for you. So it’s very, very convenient to make things simple.

Chris Engelbert: Right. So the plugin mechanism, or the extension mechanism, that was exactly what I was hinting at. That was mind-blowing. I’ve never seen anything like that when you showed it last year in Ibiza. The other thing that is always a bit of a head-scratcher, I think, for a lot of people, is when they hear that a Kubernetes operator is actually written in Java. I think RedHat built the original framework, so it kind of makes sense that RedHat is doing that, but I think the original framework was a Go library. And Java would probably not be the first choice to do that. So how did that happen?

Álvaro Hernández Tortosa: Well, at first you’re right. The operator framework was written in Go, and there was nothing else than Go at the time. So we were looking at that, but our team was a team of very, very senior Java programmers, and none of them were Go programmers, right? I’ve seen in the Postgres community, and in all the communities, that for people who are kind of more in the DevOps world, switching to Go is a bit more natural, but at the same time, they are not senior from a Go programming perspective, right? The same would have happened with our team, right? They would switch from Java to Go, but they wouldn’t have been senior in Go, obviously, right? So it would have taken some time to develop those skills. On the other hand, we looked at the technology behind it: what is an operator? An operator is no more than essentially an HTTP server that receives callbacks from Kubernetes, plus a client, because it makes calls to Kubernetes. And HTTP clients and servers can be written in any language. So we looked at the core, how complicated this is, and how much this operator framework brings to you. And we saw that it was not that much.

And actually, something I just mentioned before: in most operators, the CRDs are kind of generated from your structures, and we really wanted to do it the opposite way. It’s like with a database: you either use an ORM to read your database’s existing schema, which you developed with all your SQL capabilities, or you just create an object and let that generate the database. I prefer the former. So we did the same thing with the CRDs, right? We wanted to develop them first. So Java was more than okay to develop a Kubernetes operator, and our team was expert in Java. So by doing it in Java, we were able to be very efficient and deliver a lot of value, a lot of features, very, very fast, without having to retrain anyone, learn a new language, or learn new skills. On top of this, there’s sometimes a concern that Java requires a JVM, which is kind of a heavy environment, right? It consumes memory, resources, and disk. But by default, StackGres is built with a compilation technology from a whole project called GraalVM. And this allows you to generate native images that are indistinguishable from any other Linux binary you can have on your system. And we deploy StackGres with native images. You can also switch to JVM images if you prefer; we expose both, but by default, they are native images. So at the end of the day, StackGres is a Linux binary of several megabytes in a container, and that’s it.

Chris Engelbert: That makes sense. And I like that you basically pointed out that the efficiency of the existing developers was much more important than being cool and switching to a new language just because everyone does. So, we talked about the operator quite a bit. What are your general thoughts on databases in the cloud, or specifically in Kubernetes? What are the issues you see, the problems with running a database in such an environment?

Álvaro Hernández Tortosa: Well, it’s a wide topic, right? And I think one of the most interesting topics that we’re seeing lately is a concern about cost and performance. So there’s kind of a trade-off, as usual, right? There’s a trade-off between the convenience of “I want to run a database and almost forget about it,” which is why you switch to a cloud managed service. That’s not always true, by the way, because forgetting about it means that nobody’s gonna vacuum your database, repack your tables, optimize your queries, or analyze whether you have unused indexes. So if you’re very small, that’s more than okay; you can assume that you don’t need to touch your database. But if you grow over a certain level, you’re gonna need the same DBAs, at least for everything beyond the basic operations of the database, which are monitoring, high availability, and backups. Those are the three main areas that a managed service provides to you.

So there’s convenience, but then there’s an additional cost. And this additional cost is sometimes quite notable, right? It’s typically around an 80% premium, on an (N+1)/N number of instances, because for many cloud services we sometimes even need an extra instance, right? And that, multiplied by 1.8, ends up being two point something in the usual case. So you’re overpaying that. You need to analyze whether this is good for you from this perspective of convenience, or whether you want something else. On the other hand, almost all cloud services use network disks. And these network disks are very good and have improved performance a lot in the last years, but they are still far from the performance of a local drive, right? And running databases on local drives has its own challenges, but they can be addressed. You can really, really move the needle with, I don’t know if that’s the right term to call it, self-hosting, this trend of self-hosting, if we could marry it with the simplicity and the convenience of managed services, right?

With the ability to run in any environment, and to run there at much higher performance, I think that’s kind of an interesting trend right now and a good sweet spot. And Kubernetes, to marry all the terms that you mentioned in the question, is actually one driver towards this goal, because it gives us infrastructure independence and it enables both network disks and local disks equally. So it’s kind of an enabler for this pattern, one that I see trending more and more as of now, and one that we definitely look forward to.

Chris Engelbert: Right, I like that you pointed out that there’s ways to address the local storage issues, just shameless plug, we’re actually working on something.

Álvaro Hernández Tortosa: I heard something.

The Biggest Trend in Containers?

Chris Engelbert: Oh, you heard something. (laughing) All right, last question because we’re also running out of time. What do you see as the biggest trend right now in containers, cloud, whatever? What do you think is like the next big thing? And don’t say AI, everyone says that.

Álvaro Hernández Tortosa: Oh, no. Well, you know what? Let me do a shameless plug here, right?

Chris Engelbert: All right. I did one. (laughing)

Álvaro Hernández Tortosa: So there’s a technology we’re working on right now that works for our use case, but will work for many other use cases as well, which is what we’re calling dynamic containers. So containers are essentially something static, right? You build a container, you do a build with your Dockerfile, whatever you use, right? And then that image is static. It is what it is. It contains the layers that you specified, and that’s all. But if you look at any repository on Docker Hub, there’s plenty of tags. Take, for example, Postgres. There’s Postgres based on Debian. There’s Postgres based on Alpine. There’s Postgres with this option. Then you want this extension, then you want this other extension. And then there’s a whole variety of images, right? And each of those images needs to be built, maintained, and updated independently, right? But they’re very orthogonal. Upgrading the Debian base OS has nothing to do with the Postgres layer, has nothing to do with the timescale extension, has nothing to do with whether I want the debug symbols or not. So we’re working on technology with the goal of being able to, as a user, express any combination of items I want for my container and get that container image without having to rebuild and maintain the image with the specific parameters that I want.

Chris Engelbert: Right, and let me guess, that is how the Postgres extension stuff works.

Álvaro Hernández Tortosa: It is meant to be that, to begin with, as a solution for the Postgres extensions, but it’s actually quite broad and quite general, right? For example, I was discussing recently with some folks from the OpenTelemetry community, and the OpenTelemetry Collector, which is the router for signals in the OpenTelemetry world, has the same architecture: it has around 200 plugins, right? And you don’t want a container image with those 200 plugins, which potentially, because many are third party, may have security vulnerabilities; and even if there’s just an update, you don’t want to update all of them and restart your containers and all that, right? So why don’t you get a container image with the OpenTelemetry Collector with just this source and this receiver and this exporter, right? So that’s actually probably even more applicable there.

Chris Engelbert: Yeah, I think that makes sense, right? I think that is a really good one, especially because with static containers, the original idea was that being static gives you some kind of consistency and some security about how the container looks. But we figured out over time that that is not the best solution. So I’m really looking forward to that becoming probably a more general thing.

Álvaro Hernández Tortosa: To be honest, about the idea: I call it dynamic containers, but in reality, from a user perspective, they’re as static as before. They are dynamic from the registry perspective.

Chris Engelbert: Right, okay, fair enough. All right, thank you very much. It was a pleasure like always talking to you. And for the other ones, I see, hear, or read you next week with my next guest. And thank you to Álvaro, thank you for being here. It was appreciated like always.

Álvaro Hernández Tortosa: Thank you very much.
