How to Build Scalable and Reliable PostgreSQL Systems on Kubernetes
https://www.simplyblock.io/blog/postgresql-on-kubernetes/ | Mon, 25 Nov 2024

This is a guest post by Sanskar Gurdasani, DevOps Engineer, from CloudRaft.

Maintaining highly available and resilient PostgreSQL databases is crucial for business continuity in today’s cloud-native landscape. The Cloud Native PostgreSQL Operator provides robust capabilities for managing PostgreSQL clusters in Kubernetes environments, particularly in handling failover scenarios and implementing disaster recovery strategies.

In this blog post, we’ll explore the key features of the Cloud Native PostgreSQL Operator for managing failover and disaster recovery. We’ll discuss how it ensures high availability, implements automatic failover, and facilitates disaster recovery processes. Additionally, we’ll look at best practices for configuring and managing PostgreSQL clusters using this operator in Kubernetes environments.

Why Run Postgres on Kubernetes?

Running PostgreSQL on Kubernetes offers several advantages for modern, cloud-native applications:

  1. Stateful Workload Readiness: Contrary to old beliefs, Kubernetes is now ready for stateful workloads like databases. A 2021 survey by the Data on Kubernetes Community revealed that 90% of respondents believe Kubernetes is suitable for stateful workloads, with 70% already running databases in production.
  2. Immutable Application Containers: CloudNativePG leverages immutable application containers, enhancing deployment safety and repeatability. This approach aligns with microservice architecture principles and simplifies updates and patching.
  3. Cloud-Native Benefits: Running PostgreSQL on Kubernetes embraces cloud-native principles, fostering a DevOps culture, enabling microservice architectures, and providing robust container orchestration.
  4. Automated Management: Kubernetes operators like CloudNativePG extend Kubernetes controllers to manage complex applications like PostgreSQL, handling deployments, failovers, and other critical operations automatically.
  5. Declarative Configuration: CloudNativePG allows for declarative configuration of PostgreSQL clusters, simplifying change management and enabling Infrastructure as Code practices.
  6. Resource Optimization: Kubernetes provides efficient resource management, allowing for better utilization of infrastructure and easier scaling of database workloads.
  7. High Availability and Disaster Recovery: Kubernetes facilitates the implementation of high availability architectures across availability zones and enables efficient disaster recovery strategies.
  8. Streamlined Operations with Operators: Using operators like CloudNativePG automates all the tasks mentioned above, significantly reducing operational complexity. These operators act as PostgreSQL experts in code form, handling intricate database management tasks such as failovers, backups, and scaling with minimal human intervention. This not only increases reliability but also frees up DBAs and DevOps teams to focus on higher-value activities, ultimately leading to more robust and efficient database operations in Kubernetes environments.

By leveraging Kubernetes for PostgreSQL deployments, organizations can benefit from increased automation, improved scalability, and enhanced resilience for their database infrastructure, with operators like CloudNativePG further simplifying and optimizing these processes.

List of Postgres Operators

Kubernetes operators represent an innovative approach to managing applications within a Kubernetes environment by encapsulating operational knowledge and best practices. These extensions automate the deployment and maintenance of complex applications, such as databases, ensuring smooth operation in a Kubernetes setup.

The Cloud Native PostgreSQL Operator is a prime example of this concept, specifically designed to manage PostgreSQL clusters on Kubernetes. This operator automates various database management tasks, providing a seamless experience for users. Some key features include direct integration with the Kubernetes API server for high availability without relying on external tools, self-healing capabilities through automated failover and replica recreation, and planned switchover of the primary instance to maintain data integrity during maintenance or upgrades.

Additionally, the operator supports:

  • Scalable architecture with the ability to manage multiple instances
  • Declarative management of PostgreSQL configuration and roles
  • Compatibility with Local Persistent Volumes and separate volumes for WAL files
  • Continuous backup to object stores like AWS S3, Azure Blob Storage, and Google Cloud Storage, ensuring data safety and recoverability
  • Full recovery and point-in-time recovery from existing backups
  • TLS support with client certificate authentication
  • Rolling updates for PostgreSQL minor versions and operator upgrades
  • Synchronous replicas and HA physical replication slots
  • Replica clusters for multi-cluster PostgreSQL deployments
  • Connection pooling through PgBouncer
  • A native, customizable Prometheus metrics exporter
  • LDAP authentication support

By leveraging the Cloud Native PostgreSQL Operator, organizations can streamline their database management processes on Kubernetes, reducing manual intervention and ensuring high availability, scalability, and security in their PostgreSQL deployments. This operator showcases how Kubernetes operators can significantly enhance application management within a cloud-native ecosystem.

Here are the most popular PostgreSQL operators:

  1. CloudNativePG (formerly known as Cloud Native PostgreSQL Operator)
  2. Crunchy Data Postgres Operator (first released in 2017)
  3. Zalando Postgres Operator (first released in 2017)
  4. StackGres (released in 2020)
  5. Percona Operator for PostgreSQL (released in 2021)
  6. Kubegres (released in 2021)
  7. Patroni (a Python-based template for building HA PostgreSQL solutions)

Understanding Failover in PostgreSQL

Primary-Replica Architecture

In a PostgreSQL cluster, the primary-replica (formerly master-slave) architecture consists of:

  • Primary Node: Handles all write operations and read operations
  • Replica Nodes: Maintain synchronized copies of the primary node’s data
Simplyblock architecture diagram of a PostgreSQL cluster running on Kubernetes with local persistent volumes

Automatic Failover Process

When the primary node becomes unavailable, the operator initiates the following process:

  1. Detection: Continuous health monitoring identifies primary node failure
  2. Election: A replica is selected to become the new primary
  3. Promotion: The chosen replica is promoted to primary status
  4. Reconfiguration: Other replicas are reconfigured to follow the new primary
  5. Service Updates: Kubernetes services are updated to point to the new primary

Implementing Disaster Recovery

Backup Strategies

The operator supports multiple backup approaches:

1. Volume Snapshots

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgresql-cluster
spec:
  instances: 3
  backup:
    volumeSnapshot:
      className: csi-hostpath-snapclass
      enabled: true
      snapshotOwnerReference: true

2. Barman Integration

spec:
  backup:
    barmanObjectStore:
      destinationPath: 's3://backup-bucket/postgres'
      endpointURL: 'https://s3.amazonaws.com'
      s3Credentials:
        accessKeyId:
          name: aws-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY

Disaster Recovery Procedures

  1. Point-in-Time Recovery (PITR)
    • Enables recovery to any specific point in time
    • Uses WAL (Write-Ahead Logging) archives
    • Minimizes data loss (see the configuration sketch after this list)
  2. Cross-Region Recovery
    • Maintains backup copies in different geographical regions
    • Enables recovery in case of regional failures
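
To make the PITR option concrete, here is a hedged CloudNativePG sketch that replays WAL from an object-store backup up to a chosen timestamp; the cluster name, bucket path, and target time are placeholders:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pitr
spec:
  instances: 2
  storage:
    size: 20Gi
  bootstrap:
    recovery:
      source: example
      # Stop WAL replay at this point in time instead of the end of the archive
      recoveryTarget:
        targetTime: '2024-11-01 08:00:00+00'
  externalClusters:
    - name: example
      barmanObjectStore:
        destinationPath: 's3://your-bucket-name'
        s3Credentials:
          accessKeyId:
            name: s3-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: s3-creds
            key: ACCESS_SECRET_KEY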

Demo

This section provides a step-by-step guide to setting up a CloudNativePG cluster, testing failover, and performing disaster recovery.

Architecture of a PostgreSQL cluster with primary and replica

1. Installation

Method 1: Direct Installation

kubectl apply --server-side -f \
https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/main/releases/cnpg-1.24.0.yaml

Method 2: Helm Installation

helm repo add cnpg https://cloudnative-pg.github.io/charts
helm upgrade --install cnpg \
  --namespace cnpg-system \
  --create-namespace \
  cnpg/cloudnative-pg

Verify the Installation

kubectl get deployment -n cnpg-system cnpg-controller-manager

Install CloudNativePG Plugin

CloudNativePG provides a plugin for kubectl to manage a cluster in Kubernetes. You can install the cnpg plugin using a variety of methods.

Via the installation script

curl -sSfL \
  https://github.com/cloudnative-pg/cloudnative-pg/raw/main/hack/install-cnpg-plugin.sh | \
  sudo sh -s -- -b /usr/local/bin

If you already have Krew installed, you can simply run:

kubectl krew install cnpg

2. Create S3 Credentials Secret

First, create an S3 bucket and an IAM user with S3 access. Then, create a Kubernetes secret with the IAM credentials:

kubectl create secret generic s3-creds \
  --from-literal=ACCESS_KEY_ID=your_access_key_id \
  --from-literal=ACCESS_SECRET_KEY=your_secret_access_key

3. Create PostgreSQL Cluster

Create a file named cluster.yaml with the following content:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example
spec:
  backup:
    barmanObjectStore:
      destinationPath: 's3://your-bucket-name/retail-master-db'
      s3Credentials:
        accessKeyId:
          name: s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-creds
          key: ACCESS_SECRET_KEY
  instances: 2
  imageName: ghcr.io/clevyr/cloudnativepg-timescale:16-ts2
  postgresql:
    shared_preload_libraries:
      - timescaledb
  bootstrap:
    initdb:
      postInitTemplateSQL:
        - CREATE EXTENSION IF NOT EXISTS timescaledb;
  storage:
    size: 20Gi

Apply the configuration to create the cluster:

kubectl apply -f cluster.yaml

Verify the cluster status:

kubectl cnpg status example

4. Getting Access

Deploying a cluster is one thing; actually accessing it is another. CloudNativePG creates three services for every cluster, named after the cluster name. In our case, these are:

kubectl get service

  • example-rw: Always points to the primary node
  • example-ro: Points only to replica nodes (round-robin)
  • example-r: Points to any node in the cluster (round-robin)

5. Insert Data

Create a PostgreSQL client pod:

kubectl run pgclient --image=postgres:13 --command -- sleep infinity

Connect to the database:

kubectl exec -ti example-1 -- psql app

Create a table and insert data:

CREATE TABLE stocks_real_time (
  time TIMESTAMPTZ NOT NULL,
  symbol TEXT NOT NULL,
  price DOUBLE PRECISION NULL,
  day_volume INT NULL
);

SELECT create_hypertable('stocks_real_time', by_range('time'));
CREATE INDEX ix_symbol_time ON stocks_real_time (symbol, time DESC);
GRANT ALL PRIVILEGES ON TABLE stocks_real_time TO app;

INSERT INTO stocks_real_time (time, symbol, price, day_volume)
VALUES
  (NOW(), 'AAPL', 150.50, 1000000),
  (NOW(), 'GOOGL', 2800.75, 500000),
  (NOW(), 'MSFT', 300.25, 750000);

6. Failover Test

Force a backup:

kubectl cnpg backup example

Initiate failover by deleting the primary pod:

kubectl delete pod example-1

Monitor the cluster status:

kubectl cnpg status example

Key observations during failover:

  1. Initial status: “Switchover in progress”
  2. After approximately 2 minutes 15 seconds: “Waiting for instances to become active”
  3. After approximately 3 minutes 30 seconds: Complete failover with new primary

Verify data integrity after failover through service:

Retrieve the database password:

kubectl get secret example-app -o \
  jsonpath="{.data.password}" | base64 --decode

Connect to the database using the password:

kubectl exec -it pgclient -- psql -h example-rw -U app

Execute the following SQL queries:

-- Confirm the count matches the number of rows inserted earlier; it should show 3
SELECT COUNT(*) FROM stocks_real_time;

-- Insert new data to test write capability after failover
INSERT INTO stocks_real_time (time, symbol, price, day_volume)
VALUES (NOW(), 'NFLX', 500.75, 300000);

SELECT * FROM stocks_real_time ORDER BY time DESC LIMIT 1;

Check read-only service:

kubectl exec -it pgclient -- psql -h example-ro -U app

Once connected, execute:

SELECT COUNT(*) FROM stocks_real_time;

Review logs of both pods:

kubectl logs example-1
kubectl logs example-2

Examine the logs for relevant failover information.

Perform a final cluster status check:

kubectl cnpg status example

Confirm both instances are running and roles are as expected.

7. Backup and Restore Test

First, check the current status of your cluster:

kubectl cnpg status example

Note the current state, number of instances, and any important details.

Promote the example-1 node to Primary:

kubectl cnpg promote example example-1

Monitor the promotion process, which typically takes about 3 minutes to complete.

Check the updated status of your cluster, then create a new backup:

kubectl cnpg backup example --backup-name=example-backup-1

Verify the backup status:

kubectl get backups
NAME               AGE   CLUSTER   METHOD              PHASE       ERROR
example-backup-1   38m   example   barmanObjectStore   completed

Delete the original cluster, then prepare for the recovery test:

kubectl delete cluster example

There are two ways to perform a cluster recovery bootstrap from another cluster in CloudNativePG (for further details, please refer to the documentation):

  • Using a recovery object store, that is a backup of another cluster created by Barman Cloud and defined via the barmanObjectStore option in the externalClusters section (recommended)
  • Using an existing Backup object in the same namespace (this was the only option available before version 1.8.0).

Method 1: Recovery from an Object Store

You can recover from a backup created by Barman Cloud and stored on supported object storage. Once you have defined the external cluster, including all the required configurations in the barmanObjectStore section, you must reference it in the .spec.recovery.source option.

Create a file named example-object-restored.yaml with the following content:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-object-restored
spec:
  instances: 2
  imageName: ghcr.io/clevyr/cloudnativepg-timescale:16-ts2
  postgresql:
    shared_preload_libraries:
      - timescaledb
  storage:
    size: 1Gi
  bootstrap:
    recovery:
      source: example
  externalClusters:
    - name: example
      barmanObjectStore:
        destinationPath: 's3://your-bucket-name'
        s3Credentials:
          accessKeyId:
            name: s3-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: s3-creds
            key: ACCESS_SECRET_KEY

Apply the restored cluster configuration:

kubectl apply -f example-object-restored.yaml

Monitor the restored cluster status:

kubectl cnpg status example-object-restored

Retrieve the database password:

kubectl get secret example-object-restored-app \
  -o jsonpath="{.data.password}" | base64 --decode

Connect to the restored database:

kubectl exec -it pgclient -- psql -h example-object-restored-rw -U app

Verify the restored data by executing the following SQL queries:

-- It should show 4
SELECT COUNT(*) FROM stocks_real_time;
SELECT * FROM stocks_real_time;

The successful execution of these steps to recover from an object store confirms the effectiveness of the backup and restore process.

Delete the example-object-restored cluster, then prepare for the backup object restore test:

kubectl delete cluster example-object-restored

Method 2: Recovery from Backup Object

If a Backup resource is already available in the namespace in which the cluster should be created, you can specify its name through .spec.bootstrap.recovery.backup.name.

Create a file named example-restored.yaml:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-restored
spec:
  instances: 2
  imageName: ghcr.io/clevyr/cloudnativepg-timescale:16-ts2
  postgresql:
    shared_preload_libraries:
      - timescaledb
  storage:
    size: 1Gi
  bootstrap:
    recovery:
      backup:
        name: example-backup-1

Apply the restored cluster configuration:

kubectl apply -f example-restored.yaml

Monitor the restored cluster status:

kubectl cnpg status example-restored

Retrieve the database password:

kubectl get secret example-restored-app \
  -o jsonpath="{.data.password}" | base64 --decode

Connect to the restored database:

kubectl exec -it pgclient -- psql -h example-restored-rw -U app

Verify the restored data by executing the following SQL queries:

SELECT COUNT(*) FROM stocks_real_time;
SELECT * FROM stocks_real_time;

The successful execution of these steps confirms the effectiveness of the backup and restore process.

Kubernetes Events and Logs

1. Failover Events

Monitor events using:

# Watch cluster events
kubectl get events --watch | grep postgresql

# Get specific cluster events
kubectl describe cluster example | grep -A 10 Events

Key events to monitor:

  • Primary selection process
  • Replica promotion events
  • Connection switching events
  • Replication status changes

2. Backup Status

Monitor backup progress:

# Check backup status
kubectl get backups

# Get detailed backup info
kubectl describe backup example-backup-1

Key metrics:

  • Backup duration
  • Backup size
  • Compression ratio
  • Success/failure status

3. Recovery Process

Monitor recovery status:

# Watch recovery progress
kubectl cnpg status example-restored

# Check recovery logs
kubectl logs example-restored-1 -c postgres

Important recovery indicators:

  • WAL replay progress
  • Timeline switches
  • Recovery target status

Conclusion

The Cloud Native PostgreSQL Operator significantly simplifies the management of PostgreSQL clusters in Kubernetes environments. By following these practices for failover and disaster recovery, organizations can maintain highly available database systems that recover quickly from failures while minimizing data loss. Remember to regularly test your failover and disaster recovery procedures to ensure they work as expected when needed. Continuous monitoring and proactive maintenance are key to maintaining a robust PostgreSQL infrastructure.

Everything fails, all the time. ~ Werner Vogels, CTO, Amazon Web Services

Editorial: If you are looking for distributed, scalable, reliable, and durable storage for your PostgreSQL cluster in Kubernetes, or for any other Kubernetes storage need, simplyblock is the solution you're looking for.


Best Open Source Tools For PostgreSQL
https://www.simplyblock.io/blog/best-open-source-tools-for-postgresql/ | Thu, 24 Oct 2024

What is PostgreSQL?

PostgreSQL is a powerful, open-source object-relational database system that uses and extends the SQL language, combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley, and the core platform has seen more than 35 years of active development.

PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open-source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open-source relational database of choice for many people and organizations.

What are the best open-source tools for your PostgreSQL setup?

As PostgreSQL continues to be the go-to choice for developers and organizations around the world, the demand for robust and reliable open-source tools has surged. Database administrators and developers are consistently seeking tools that can help them manage their PostgreSQL databases more effectively. In this post, we will explore nine must-know open-source tools that can help you optimize your PostgreSQL environment.

1. pgAdmin

pgAdmin is the most widely used open-source management tool for PostgreSQL. It provides a comprehensive graphical interface for managing PostgreSQL databases, making it easier to design schemas, run queries, and manage database operations. With pgAdmin, you can manage multiple PostgreSQL servers, perform backups, and even debug your queries, all within a user-friendly interface.

2. PostGIS

PostGIS is an extension that transforms PostgreSQL into a spatial database for geographic information system (GIS) applications. It adds support for geographic objects, allowing you to run spatial queries, store geographic data, and integrate with various mapping tools. PostGIS is crucial for any project involving geospatial data, enabling complex geographical queries with high efficiency.
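
A brief sketch of the kind of spatial workload PostGIS enables; the table and coordinates are illustrative:

CREATE EXTENSION IF NOT EXISTS postgis;

CREATE TABLE places (
  id   SERIAL PRIMARY KEY,
  name TEXT,
  geom GEOMETRY(Point, 4326)
);

INSERT INTO places (name, geom)
VALUES ('Office', ST_SetSRID(ST_MakePoint(-87.6298, 41.8781), 4326));

-- Find places within roughly 10 km of a point (geography cast measures in meters)
SELECT name
FROM places
WHERE ST_DWithin(geom::geography, ST_MakePoint(-87.63, 41.88)::geography, 10000);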

3. pgBackRest

pgBackRest is a reliable, open-source backup and restore solution for PostgreSQL. It supports full, differential, and incremental backups, parallel backup and restore, compression, and encryption, and it can store backup repositories in cloud object stores such as AWS S3, Azure Blob Storage, and Google Cloud Storage. Combined with WAL archiving, pgBackRest enables point-in-time recovery, making it a cornerstone of many PostgreSQL disaster recovery strategies.
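
A minimal command sketch, assuming a stanza named main has already been configured in /etc/pgbackrest/pgbackrest.conf:

# Initialize the stanza once the repository and cluster settings are configured
pgbackrest --stanza=main stanza-create

# Take a full backup; subsequent runs can use --type=diff or --type=incr
pgbackrest --stanza=main backup --type=full

# List available backups and repository info
pgbackrest --stanza=main info

# Restore the latest backup into the configured data directory
pgbackrest --stanza=main restore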

4. pgbouncer

pgbouncer is a lightweight connection pooler for PostgreSQL, designed to manage and optimize database connections. As your applications scale, managing database connections becomes crucial, and pgbouncer helps by reusing existing connections and reducing the overhead associated with creating new ones. This tool is especially useful in high-traffic environments where performance is paramount.
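
A minimal pgbouncer.ini sketch, assuming a local PostgreSQL on port 5432; pool sizes and paths are illustrative:

[databases]
; Clients connecting to "app" are routed to the local PostgreSQL instance
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; Transaction pooling returns connections to the pool after each transaction
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20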

5. pg_stat_statements

pg_stat_statements is an essential extension for performance tuning in PostgreSQL. It tracks the execution statistics of SQL statements, enabling you to identify and optimize long-running queries. By providing insights into query performance, pg_stat_statements helps you make informed decisions to enhance database efficiency and reduce resource consumption.
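
A short usage sketch; note that the extension must be listed in shared_preload_libraries (a restart is required), and the column names below follow PostgreSQL 13+:

-- postgresql.conf: shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top five queries by total execution time
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;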

6. WAL-G

WAL-G is an open-source tool that focuses on the Write-Ahead Logging (WAL) mechanism in PostgreSQL, providing an efficient solution for archival and restoration. It supports full and incremental backups and integrates with various cloud storage services, making it a powerful tool for implementing Point-In-Time Recovery (PITR) and disaster recovery strategies.
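
A hedged setup sketch; the bucket prefix and data directory are placeholders, and credentials are typically supplied through environment variables or envdir:

# postgresql.conf: ship completed WAL segments to the object store
#   archive_mode = on
#   archive_command = 'wal-g wal-push %p'

export WALG_S3_PREFIX='s3://your-bucket-name/wal-g'

# Push a full base backup of the data directory
wal-g backup-push /var/lib/postgresql/data

# List backups, then fetch the latest one for recovery
wal-g backup-list
wal-g backup-fetch /var/lib/postgresql/data LATEST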

7. pgAudit

pgAudit enhances PostgreSQL’s native logging capabilities by providing detailed audit logs for all database activities. This tool is invaluable for organizations that need to comply with security policies and regulatory requirements. pgAudit allows you to track data access and modifications, ensuring a secure and compliant database environment.
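
A minimal configuration sketch; the audited statement classes are illustrative:

-- postgresql.conf: shared_preload_libraries = 'pgaudit' (requires restart)
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Log all DDL and data-modifying statements
ALTER SYSTEM SET pgaudit.log = 'write, ddl';
SELECT pg_reload_conf();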

8. pgLoader

pgLoader simplifies the process of migrating data into PostgreSQL from various sources, including MySQL, SQLite, and CSV files. This open-source tool automates data migration, ensuring data integrity and minimizing downtime during the migration process. pgLoader is essential for organizations looking to transition to PostgreSQL smoothly and efficiently.
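
The canonical pgLoader example is a single-command migration; the connection strings below are placeholders:

# Migrate schema and data from MySQL to PostgreSQL in one run
pgloader mysql://user:password@mysql-host/sourcedb \
         postgresql://user:password@pg-host/targetdb

# SQLite sources work the same way
pgloader ./app.db postgresql://user:password@pg-host/targetdb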

9. Patroni

Patroni is an open-source tool that automates the management of PostgreSQL clusters, ensuring high availability. It leverages PostgreSQL’s streaming replication feature to keep your database available even in the event of a server failure. Patroni integrates seamlessly with tools like etcd and Consul, providing a robust solution for maintaining high availability in PostgreSQL environments.
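
A heavily abridged patroni.yml sketch, assuming an etcd cluster for consensus; real deployments also need a bootstrap section, full authentication, and tuned PostgreSQL parameters:

scope: pg-ha-cluster
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

etcd3:
  hosts: 127.0.0.1:2379

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    replication:
      username: replicator
      password: change-me
    superuser:
      username: postgres
      password: change-me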

Key facts about the PostgreSQL Ecosystem and best open source tools for PostgreSQL

Why Choose simplyblock for PostgreSQL?

While PostgreSQL provides robust data management capabilities, optimizing storage performance and ensuring efficient management of WAL files and tablespaces is crucial. This is where simplyblock’s intelligent storage optimization creates unique value:

Optimized PostgreSQL I/O Management

Simplyblock enhances PostgreSQL’s storage efficiency through intelligent volume management. By separating WAL files and tablespaces onto different logical volumes with tailored performance characteristics, simplyblock ensures optimal I/O patterns. WAL files benefit from ultra-low latency NVMe storage for fast commit times, while tablespaces can leverage tiered storage, automatically moving cold data to cost-effective S3 storage. This architecture is particularly valuable for time-series data or partitioned tables where storage costs can grow significantly.
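
With CloudNativePG, this separation can be expressed declaratively. A minimal sketch, assuming hypothetical storage class names; the walStorage stanza places WAL on its own volume:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-split-storage
spec:
  instances: 3
  # Tablespace/data files on a general-purpose volume
  storage:
    size: 100Gi
    storageClass: general-purpose
  # WAL segments on a separate low-latency volume for fast commits
  walStorage:
    size: 20Gi
    storageClass: low-latency-nvme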

PostgreSQL High Availability Enhancement

Simplyblock streamlines PostgreSQL’s replication capabilities through its multi-attach features and unified storage approach. Instead of maintaining separate EBS volumes for primary and standby instances, simplyblock enables efficient storage sharing while maintaining data consistency. The platform’s NVMe over TCP protocol ensures minimal replication lag, while its pooled storage approach eliminates the complexity of managing individual volumes. This is especially beneficial for organizations running PostgreSQL with streaming replication or using tools like Patroni for automated failover.

Enterprise-Grade Disaster Recovery

Simplyblock strengthens PostgreSQL’s disaster recovery capabilities through sophisticated backup and point-in-time recovery features. The platform’s ability to stream write-ahead logs to S3 provides near-zero RPO disaster recovery without impacting database performance. By maintaining consistent snapshots across multiple volumes, simplyblock ensures that backups preserve referential integrity across database clusters, particularly crucial for organizations running multiple PostgreSQL instances alongside other databases. This approach significantly enhances PostgreSQL’s native backup capabilities while simplifying the operational complexity of managing disaster recovery procedures.

How to Optimize PostgreSQL Storage with Open-source Tools

This guide explored nine essential open-source tools for PostgreSQL, from pgAdmin’s management interface to Patroni’s high-availability features. While these tools excel at different aspects – PostGIS for spatial data, pgbouncer for connection pooling, and WAL-G for backup management – proper implementation is crucial. Tools like pg_stat_statements enable performance monitoring, while pgAudit provides comprehensive auditing. Each tool offers unique approaches to managing and optimizing PostgreSQL deployments.

If you’re looking to further streamline your PostgreSQL operations, simplyblock offers comprehensive solutions that integrate seamlessly with these tools, helping you get the most out of your PostgreSQL environment.

Ready to take your PostgreSQL management to the next level? Contact simplyblock today to learn how we can help you simplify and enhance your PostgreSQL journey.

PostgreSQL for Everything | Mike Freedman
https://www.simplyblock.io/blog/postgresql-for-everything-mike-freedman/ | Fri, 09 Aug 2024

Introduction:

This interview is part of the simplyblock Cloud Frontier Podcast, available on YouTube, Spotify, iTunes/Apple Podcasts, and our show site.

In this episode of simplyblock’s Cloud Commute podcast, Chris Engelbert hosts Mike Freedman, co-founder and CTO of Timescale, to explore how TimescaleDB extends PostgreSQL for time-series data. Mike delves into how TimescaleDB optimizes Postgres for high-ingestion workloads, real-time analytics, and event-driven applications. Whether you’re handling IoT data or large-scale event logging, this episode offers valuable insights into the capabilities of TimescaleDB for modern data management.

Key Takeaways

What is TimescaleDB, and how does it Extend Postgres for Time-series Data?

TimescaleDB is an open-source time-series database that builds on PostgreSQL, offering powerful capabilities for managing time-series data, event data, and real-time analytics. Unlike traditional Postgres tables, TimescaleDB introduces hypertables that optimize data storage and querying for time-series data, while maintaining full SQL compatibility. This allows developers to work within the familiar Postgres environment while gaining access to features like automated partitioning, compression, and scaling for time-series workloads.

How does TimescaleDB Handle Time-series Data Storage and Performance Scaling?

TimescaleDB automates data partitioning using hypertables, breaking time-series data into chunks based on time intervals. It also offers compression, transforming data from row-based to columnar format to reduce storage costs while optimizing query performance. For even more efficiency, TimescaleDB employs tiered storage, shifting older data to slower, cheaper storage without sacrificing accessibility. This architecture allows TimescaleDB to handle massive amounts of time-series data while maintaining high performance for both recent and historical data queries.
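
A hedged SQL sketch of these features; the conditions table is illustrative, and the by_range form of create_hypertable assumes TimescaleDB 2.13 or newer:

-- Illustrative sensor table
CREATE TABLE conditions (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT NOT NULL,
  temperature DOUBLE PRECISION
);

-- Convert it into a hypertable, chunked by time
SELECT create_hypertable('conditions', by_range('time'));

-- Enable columnar compression, segmented by device for better ratios
ALTER TABLE conditions SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id'
);

-- Automatically compress chunks older than seven days
SELECT add_compression_policy('conditions', INTERVAL '7 days');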

What are the Key Benefits of using TimescaleDB for Real-time Analytics?

TimescaleDB excels at real-time analytics, enabling fast data ingestion and high-performance querying over large datasets. With features like automated partitioning, compression, and tiered storage, TimescaleDB can efficiently process real-time data, making it ideal for applications in IoT, financial services, and infrastructure monitoring, where quick insights are critical. Its ability to scale seamlessly as data grows also ensures that businesses can continue to derive value from their data without hitting performance bottlenecks.

EP24: PostgreSQL for Everything - Interview with Mike Freedman from Timescale

In addition to highlighting the key takeaways, it’s essential to provide deeper context and insights that enrich the listener’s understanding of the episode. By offering this added layer of information, we ensure that when you tune in, you’ll have a clearer grasp of the nuances behind the discussion. This approach enhances your engagement with the content and helps shed light on the reasoning and perspective behind the thoughtful questions posed by our host, Chris Engelbert. Ultimately, this allows for a more immersive and insightful listening experience.

Key Learnings

What is Time-series Data, and how can it be used in Applications?

Time-series data refers to data points collected or recorded at successive points in time, often used to track trends, monitor systems, or log events. Applications for time-series data range from IoT devices recording sensor readings, to financial systems tracking stock prices, to IT infrastructure logging performance metrics. Time-series data allows organizations to analyze patterns and trends over time, making it crucial for real-time analytics and forecasting.

Simplyblock Insight:

Handling time-series data effectively requires an infrastructure capable of ingesting, processing, and storing vast amounts of data. Simplyblock’s cloud storage platform is designed to manage these demands, offering high-throughput storage access for data pipelines that scale with your needs. By leveraging simplyblock, businesses can ensure their time-series applications run smoothly, even as data volumes grow.

What are the best Practices for Managing Large-scale Time-series Data in Postgres?

Managing large-scale time-series data in Postgres involves techniques such as partitioning, indexing, and using efficient data types. Partitioning data based on time intervals can significantly speed up queries, while indexing frequently queried columns improves performance. Additionally, compressing old data helps to reduce storage costs without affecting query performance.
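
As a minimal sketch of these practices with vanilla PostgreSQL's declarative partitioning (table and interval names are illustrative):

-- Parent table partitioned by time range
CREATE TABLE metrics (
  time      TIMESTAMPTZ NOT NULL,
  device_id TEXT NOT NULL,
  reading   DOUBLE PRECISION
) PARTITION BY RANGE (time);

-- One partition per month; create these ahead of time or via a scheduled job
CREATE TABLE metrics_2024_11 PARTITION OF metrics
  FOR VALUES FROM ('2024-11-01') TO ('2024-12-01');

-- Index the columns queries filter on most
CREATE INDEX ON metrics (device_id, time DESC);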

Simplyblock Insight:

When dealing with large datasets, performance optimization is key. Simplyblock’s storage solutions provide the scalability needed to manage large time-series databases while ensuring low-latency access to frequently queried data. With automated scaling and cost-efficient storage, simplyblock makes it easier to implement best practices for managing time-series data in Postgres.

How can Developers Build Real-time APIs using Postgres and Time-series Data?

Developers can build real-time APIs using Postgres by leveraging time-series databases like TimescaleDB, which are optimized for handling high-ingestion workloads. With Postgres’ powerful querying capabilities and TimescaleDB’s optimizations for time-series data, developers can create APIs that deliver real-time insights by querying the most recent data and providing up-to-the-minute results to users.

Simplyblock Insight:

Building real-time APIs requires an infrastructure capable of handling continuous data ingestion and rapid query execution. Simplyblock’s high-performance storage resources ensure that APIs can deliver real-time results, even when handling massive amounts of time-series data. With simplyblock, developers can scale their API services seamlessly, ensuring reliable performance under heavy workloads.

What are the use Cases for Time-series Data in IoT, Finance, and Analytics?

In IoT, time-series data is used to monitor and control devices, track sensor readings, and optimize system performance. In finance, it powers real-time trading systems, risk management, and fraud detection by analyzing stock prices, market trends, and transaction logs. Time-series data also plays a key role in analytics, helping organizations visualize trends, monitor KPIs, and predict future outcomes.

Simplyblock Insight:

IoT and financial applications demand both high data ingestion rates and near-instantaneous analysis. Simplyblock’s cloud storage solution is designed to handle the unique challenges of these industries, providing scalable storage and fast access to real-time data. With simplyblock, businesses can ensure that their IoT and financial analytics applications perform optimally, delivering timely insights and actions.

Additional Nugget of Information

What are the Key Benefits of Real-time Analytics for Data-driven Industries?

Real-time analytics enables organizations to make immediate, data-driven decisions by processing and analyzing data as it’s ingested. This is particularly valuable in industries like finance, healthcare, and logistics, where instant access to insights can improve operational efficiency, reduce risks, and enhance customer experiences. By monitoring data in real-time, businesses can identify trends, detect anomalies, and respond to critical events as they happen.

Conclusion

TimescaleDB brings the power of PostgreSQL to time-series data, making it an excellent choice for applications that require real-time analytics and event data management. Its features, such as hypertables, automated partitioning, and compression, make it easy to handle high-ingestion workloads and large datasets while maintaining fast query performance.

Simplyblock’s cloud infrastructure provides the ideal environment for scaling TimescaleDB applications, offering elastic storage, high-throughput data processing, and reliable performance. Whether you’re working with IoT data, financial transactions, or monitoring real-time analytics, simplyblock ensures your applications can scale seamlessly as data grows, delivering the performance and cost-efficiency you need.

For more insights on cloud-native databases and the future of data-driven applications, be sure to tune in to future episodes of the Cloud Commute podcast!

How I designed PostgreSQL High Availability with Shaun Thomas from Tembo (video + interview)
https://www.simplyblock.io/blog/how-i-designed-postgresql-high-availability-with-shaun-thomas-from-tembo-video/ | Thu, 20 Jun 2024

This interview is part of simplyblock’s Cloud Commute Podcast, available on YouTube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

In this installment, we’re talking to Shaun Thomas (Twitter/X, personal blog), affectionately known as “Mr. High Availability” in the Postgres community, to discuss his journey from a standard DBA to a leading expert in high availability solutions for Postgres databases. Shaun shares his experiences working in financial services, where he redefined high availability using tools like Pacemaker and DRBD, and the path that led him to authoring a comprehensive book on the subject. Shaun also talks about his current work at Tembo, an organization dedicated to advancing open-source Postgres, and their innovative approaches to high availability, including the use of Kubernetes and containerized deployments.

EP17 - How I designed PostgreSQL High Availability with Shaun Thomas from Tembo

Chris Engelbert: Hello, welcome back to this week’s episode of simplyblock’s Cloud Commute podcast. This week I have – no, I’m not saying that. I’m not saying I have another incredible guest, even though I have. He’s already shaking his head. Nah, I’m not incredible. He’s just known as Mr. High Availability in the Postgres space for a very specific reason. I bet he’ll talk about that in a second.

So hello, Shaun. Shaun Thomas, thank you for being here. And maybe just introduce yourself real quick. Who are you? Well, where are you from? How did you become Mr. High Availability?

Shaun Thomas: Yeah, so glad to be here. Kind of hang out with you. We talked a little bit. It’s kind of fun. My background is I was just a standard DBA, kind of working on programming stuff at a company I was at and our DBA quit, so I kind of had to pick it up to make sure we kept going. And that was back in the Oracle days. So I just kind of read a bunch of Oracle books to kind of get ready for it. And then they had some layoffs, so our whole division got cut. And then my next job was as a DBA. And I just kind of latched onto it from there.

And as far as how I got into high availability and where I kind of made that my calling card was around 2010, I started working for a company that was in financial services. And they had to keep their systems online at all times because every second they were down, they were losing millions of dollars.

So they actually already had a high availability stack, but it was using a bunch of proprietary tools. So when I started working there, I basically reworked everything. We ended up using the standard stack at the time, which was Pacemaker with Corosync and DRBD for distributed replicating block device because we didn’t really trust replication back then; it was still too new.

We were also running Enterprise DB at the time, so there were a bunch of beta features they had kind of pushed into 9.2 at the time, I think. Because of that whole process and not really having any kind of guide to follow, since there were not a lot of high availability tools back in 2010, 2011, I basically wrote up our stack and the process I used. I presented it at the second Postgres Open that was in Chicago. I did a live demo of the entire stack, and that video is probably online somewhere. My slides, I think, are also on the Postgres Wiki. But after that, I was approached by Packt, the publisher. They wanted me to write a book on it. So I did. I did it mainly because I didn’t have a book to follow. Somebody else in this position really needs to have some kind of series or a book or some kind of step-by-step thing because high availability in Postgres is really important. You don’t want your database to go down in a lot of situations. Until there’s a lot more tools out there to cover your bases, being able to do it is important. Now there’s tons of tools for it, so it’s not a big problem. But back then, man, oof.

Chris Engelbert: Yeah, yeah. I mean, you just mentioned Pacemaker. I’m not sure when I heard that thing the last time. Is that even still a thing?

Shaun Thomas: There’s still a couple of companies using it. Yeah, you would be surprised. I think DFW does in a couple of spots.

Chris Engelbert: All right. I haven’t heard about that in at least a decade, I think. Everything I’ve worked with had different– or let’s say other tools, not different tools. Wow. Yeah, cool. So you wrote that book. And you said you came from an Oracle world, right? So how did the transition to Postgres happen? Was that a choice?

Shaun Thomas: For me, it wasn’t really much of a transition because, like I said, our DBA quit at the company I was at. And it was right before a bunch of layoffs that took out that entire division. But at the time, I was like, ooh, Oracle. I should learn all this stuff. So the company just had a bunch of old training materials lying around. And there were like three or four of the huge Oracle books lying around. So I spent the next three or four weeks just reading all of them back to back.

I was testing in a cluster that we had available, and I set the local version up on my computer just to see if it worked and to learn all the stuff I was trying to understand at the time. But then the layoffs hit, so I was like, what do I do now?

I got another job at a company that needed a DBA. And that was MySQL and Postgres. But that was back when Postgres was still 6.5. Back when it crashed if you looked at it funny. So I got kind of mad at it. And I basically stopped using it from like 2005 to 2010. Or no, that was, sorry, from 2001 to 2005. From 2005, I switched to a company that they were all Postgres. So I got the purple Postgres book. The one that everyone used back then was I think it was 8.1 or 8.2. And then I revised their entire stack also because they were having problems with vacuum. Because back then, the settings were all wrong. So you would end up loading yourself out of your disk space. I ended up vacuuming their systems down from I think it was 20 gigs down to like 5. And back then, that was a lot of disk space.

Chris Engelbert: I was just about to say that in 2005, 20 gigabytes of disk space was a lot.

Shaun Thomas: But back then, the problem with vacuum was you actually had to set the size of the free space map. And the default was way too small. So what would happen is vacuum would actually only keep track of the last 200,000 unused reusable rows by default. But by default, it only kept track of the first 200,000.

So if you had more than that, even if you were vacuuming constantly, it would still bloat like a little bit every day until your whole disk was used. So I actually had to clean all that up or their system was going to crash. They were days away from going down when I joined. They had already added all the disks they could. And back then, you couldn’t just add virtual disk space.

Chris Engelbert: I know those situations, not in the Postgres or database space, but in the software development space where– same thing, I literally joined days before it all would fall apart. Let’s say those are not the best days to join.

Shaun Thomas: Hey, that’s why they hired you, right?

Chris Engelbert: Exactly. All right. So let’s talk a little bit about these days. Right now, you’re with Tembo. And you just have this very nice blog post that blew up on Hacker News for all the wrong reasons.

Shaun Thomas: Well, I mean, we created it for all the right reasons. And so let me just start on Tembo a little bit. So Tembo is like they are all in on Postgres. We are ridiculously all in. Basically, everything we do is all open sourced. You can go to Tembo.io on GitHub. And basically, our entire stack is there. And we even just released our on-prem. So you can actually use our stack on your local system and basically have a Kubernetes cloud management thing for all the clusters you want to manage. And it’ll just be our stack of tools. And the main calling card of Tembo is probably our– if you go to trunk, I think it’s called PGT.dev . We just keep track of a bunch of extensions. And it’s got a command line tool to install them, kind of like a PGXN. And we’re so kind of into this that we actually hired the guy who basically maintained PGXN, David Wheeler. Because we were like, we need to kind of hit the extension drum. And we’re very glad he’s re-standardizing PGXN 2. He’s starting a whole initiative. And he’s got a lot of buy-in from tons of different committers and devs and people who are really pushing it. Maybe we’ll create the gold standard of extension networks. Because the idea is to get it all so that it’s packaged, right? Kind of like a Debian or an RPM or whatever package system you want to use. It’ll just install the package on your Postgres wherever it is. Like the source install, if it’s like a package install, or if it’s something with on your Mac, whatever.

So he’s working on that really. And he’s done some demos that are very impressive. And it looks like it’ll actually be a great advancement. But Tembo is – it’s all about open source Postgres. And our tools kind of show that. Like if you’ve ever heard of Adam Hendel, he goes by Chuck. But if you heard of PGMQ or PG Vectorize, which kind of makes PG Vector a little easier to use, those tools are all coming from us, basically. So we’re putting our money where our mouth is, right?

All right. That’s why I joined him. Because I kept seeing them pop up on Twitter. And I’m like, man, these guys really– they’re really dedicated to this whole thing.

Chris Engelbert: Yeah, cool. So back to PG and high availability. Why would I need that? I mean, I know. But maybe just give the audience a little bit of a clue.

Shaun Thomas: So high availability– and I kind of implied this when I was talking about the financial company, right? The whole idea is to make sure Postgres never goes down. But there’s so much more to it. I’ve done conferences. And I’ve done webinars. And I’ve done trainings. And I’ve done the book. Just covering that topic is it’s essentially an infinite font of just all the different ways you can do it, all the different prerequisites you need to fulfill, all the different things you need to set up to make it work properly. But the whole point is keep your Postgres up. But you also have to define what that means. Where do you put your Postgres instances? Where do you put your replicas? How do you get to them? Do you need an intermediate abstraction layer so that you can connect to that? And it’ll kind of decide where to send you afterwards so you don’t have any outages as far as routing is concerned?

It’s a very deep topic. And it’s easy to get wrong. And a lot of the tools out there, they don’t necessarily get it wrong. But they expect the user to get it right. One of the reasons my book did so well in certain circles is because if you want to set up EFM or repmgr or Patroni or some other tool, you have to follow very closely and know how the tool works extremely well. You have to be very familiar with the documentation. You can’t just follow step by step and then expect it to work in a lot of cases.

Now, there’s a lot of edge cases you have to account for. You have to know why and the theories behind the high availability and how it works a certain way to really deploy it properly.

So even as a consultant when I was working at EDB and a second quadrant, it’s easy to give a stack to a customer and they can implement it with your recommendations. And you can even set it up for them. There’s always some kind of edge case that you didn’t think of.

So the issue with Postgres, in kind of my opinion, is it gives you a lot of tools to build it yourself, but it expects you to build it yourself. And even the other stack tools, like I had mentioned earlier, like repmgr or EFM or Patroni, those are pg auto_failover, another one that came out recently. They work, but you’ve got to install them. And you really do need access to an expert that can come in if something goes wrong. Because if something goes wrong, you’re kind of on your own in a lot of ways.

Postgres doesn’t really have an inherent integral way of managing itself as a cluster. It’s more of like a database that just happens to be able to talk to other nodes to keep them up to date with sync and whatnot. So it’s important, but it’s also hard to do right.

Chris Engelbert: I think you mentioned one important thing. It is important to upfront define your goals. How much uptime do you really need? Because one thing that not only with Postgres, but in general, whenever we talk about failure tolerance systems, high availability, all those kinds of things, what a lot of people seem to forget is that high availability or fault tolerance is a trade-off between how much time and money do I invest and how much money do I lose if something really, well, you could say, s***t hits the fan, right?

Shaun Thomas: Exactly. And that’s the thing. Companies like the financial company I worked at, they took high availability to a fault. They had two systems in their main data center and two more in their disaster recovery data center, all fully synced and up to date. They maintained daily backups on local systems, with copies sent to another system locally holding seven days’ worth. Additionally, backups were sent to tape, which was then sent to Glacier for seven years as per SEC rules.

So, someone could come into our systems and maliciously erase everything, and we’d be back up in an hour. It was very resilient, a result of our design and the amount of money we dedicated toward it because that was a very expensive deployment. That’s at least 10 servers right there.

Chris Engelbert: But then, when you say you could be back up in an hour, the question is, how much money do you lose in that hour?

Shaun Thomas: Well, like I said, that scenario is like someone walking in and literally smashing all the servers. We’d have to rebuild everything from scratch. In most cases, we’d be up – and this is where your RTO and RPO come in, the recovery time objective and your recovery point objective. Basically, how much do you want to spend to say I want to be down for one minute or less? Or if I am down for that one minute, how much data will I lose? Because the amount of money you spend or the amount of resources you dedicate toward that thing will determine the end result of how much data you might lose or how much money you’ll need to spend to ensure you’re down for less than a minute.

Chris Engelbert: Exactly, that kind of thing. I think that becomes more important in the cloud age. So perfect bridge to cloud, Postgres and cloud, perfect. You said setting up HA is complicated because you have to install the tools, you have to configure them. These days, when you go and deploy Postgres on something like Kubernetes, you would have an operator claiming at least doing all the magic for you. What is your opinion on the magic?

Shaun Thomas: Yeah, so my opinion on that is it evolved a lot. Back when I first started seeing containerized systems like Docker and that kind of thing, my opinion was, I don’t know if I’d run a production system in a container, right? Because it just seems a little shady. But that was 10 years ago or more. Now that Kubernetes tools and that kind of thing have matured a lot, what you get out of this now is you get a level of automation that just is not possible using pretty much anything else. And I think what really sold it to me was – so you may have heard of Gabriele Bartolini. He basically heads up the team that writes and maintains Cloud Native Postgres, the Cloud Native PG operator. We’ll talk about operators probably a bit later. But the point of that was back when—and 2ndQuadrant was before they were bought by EDB—we were selling our BDR tool for bi-directional replication for Postgres, right? So multi-master. And we needed a way to put that in a Cloud service for obvious purposes so we could sell it to customers. And that meant we needed an operator. Well, before Cloud Native Postgres existed, there was the BDR operator that we were cycling internally for customers.

And one day while we were in Italy—because every employee who worked at 2ndQuadrant got sent to Italy for a couple of weeks to get oriented with the team, that kind of thing. During that time when I was there in 2020, I think I was there for February, for the first two weeks of February. He demoed that, and it kind of blew me away. We were using other tools to deploy containers. And it was basically Ansible to automate the deployment with Terraform. And then you kind of set everything up and then deploy everything. It takes minutes to set up all the packages and get everything deployed and reconfigure everything. Then you have to wait for syncs and whatnot to make sure everything’s proper.

On someone’s laptop, they set up Kubernetes Docker deployment. Kind, I think we were using at that point, Kubernetes in Docker. And in less than a minute, he had on his laptop set up a full Kubernetes cluster of three replicating, bidirectional replicating, so three multi-master nodes of Postgres on his laptop in less than a minute. And I was just like, my mind was blown. And the thing is, basically, it’s a new concept. The data is what matters. The nodes themselves are completely unimportant. And that’s why, to kind of bring this back around, when Cloud Native Postgres was released by Enterprise DB kind of as an open-source tool for Postgres and not the bidirectional replication stuff for just Postgres.

The reason that was important was because it’s an ethos. The point is your compute nodes—throw them away. They don’t matter. If one goes down, you provision a new one. If you need to upgrade your tooling or the packages, you throw away the old container image, you bring up a new one. The important part is your data. And as long as your data is on your persistent volume claim or whatever you provision that as, the container itself, the version of Postgres you’re running, those aren’t nearly as important. So it complicates debugging to a certain extent. And we can kind of talk about that maybe later. But the important part is it brings high availability to a level that can’t really be described using the old methods. Because the old method was you create two or three replicas. And if one goes down, you’ve got a monitoring system that switches over to one of the alternates. And then the other one might come back or might not. And then you rebuild it if it does, that kind of thing.

With the Kubernetes approach, or the container approach, as long as your storage wasn’t corrupted, you can just bring up a new container to represent that storage. And you can actually have a situation where the primary goes down (maybe it got OOM-killed for some reason), gets a new container provisioned, and comes back up before the monitors even notice there was an outage and switch over to a replica and promote it. There’s a whole set of mechanisms in there to reduce the number of timeline switches and other complications behind the scenes. So you have a cohesive, stable timeline. You maximize your uptime. They’ve got layers to redirect connections from the outside world through Traefik or some other kind of proxy to get into your actual cluster. You always get an endpoint somehow, unless something has gone horribly wrong, but that’s true for anything. And the ethos that your machines aren’t important spoke to me. Because sure, bare hardware is great, and I actually prefer it; I’ve got servers in my basement specifically for testing clusters and Postgres and whatnot. But if you have the luxury of provisioning what you need at the time, if I want more compute, like I said, throw away my image, bring up a new one that’s got more resources allocated to it, and suddenly I’ve grown vertically. That’s something you can’t really do with bare hardware, at least not very easily.

So then I was like, well, maybe this whole container thing isn’t really a problem, right? So yeah, it’s because of my time at 2ndQuadrant and Gabriele’s team that I believe high availability does belong in the cloud. And you can run production in the cloud on Kubernetes and containers. And in fact, I encourage it.

Chris Engelbert: I love that. I love that. I also think high availability and the cloud, and especially cloud native, are concepts that are perfectly in line and perfectly in sync. Unfortunately, we’re out of time. I didn’t want to stop you, but I think we have to invite you again and keep talking about that. But one last question. One last question. By the way, I love that you said containers were a new thing like 10 years ago, except you came from the Solaris or BSD world where those things were –

Shaun Thomas: Jails!

Chris Engelbert: But it’s still different, right? You didn’t have this orchestration layer on top. The whole ecosystem evolved very differently in the Linux space. Anyway, last question. What do you think is the next big thing? What is upcoming in the Postgres, the Linux, the container world, what do you think is amazing on the horizon?

Shaun Thomas: I mean, I hate to be cliche here, but it’s got to be AI. If you look at pgvector, it basically allows you to do vectorized similarity searches right in Postgres. And I think Timescale even released pgvectorscale, which is an extension that makes pgvector even better. It makes it apparently faster than dedicated vector databases like Pinecone. And it’s just an area that, if you’re going to do any kind of retrieval-augmented generation, like RAG searches, or if you’re doing any LLM work at all, if you’re building chatbots, or if you’re just doing, like I said, augmented searches, any of that kind of work, you’re going to want the data that’s in Postgres already, right? You’re going to want to make that available to your AI. And the easiest way to do that is with pgvector.

Tembo even wrote an extension we call pg_vectorize, which automatically maintains your embeddings, which is how you interface your searches with the text. And then you can feed that back into an LLM. It also has the ability to do that for you. It can send messages directly to OpenAI. We can also interface with arbitrary endpoints, so you can set up an Ollama or something on a server or locally, and then set that as the end target. So you can even keep your messages from hitting external resources like Microsoft or OpenAI, and just do it all locally. And that’s all very important. So that, I think, is going to be, well, not the only thing, but something a lot of people are focusing on. And a lot of people find it annoying, it’s another AI thing, right? But I wrote two blog posts on this where I wrote a RAG app using some Python and pgvector. And then I wrote a second one where I used pg_vectorize and cut my Python code by like 90%. It just basically talks to Postgres. Postgres is doing it all. And that’s because of the extension ecosystem, right? That’s one of the reasons Postgres is on top of everyone’s mind right now: it’s leading the charge. And it’s bringing in a lot of people that may not have been interested before.
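Editor’s note: For readers who want to try what Shaun describes, here is a minimal pgvector sketch. The table, column names, and vectors are made up for illustration (real embeddings typically have hundreds or thousands of dimensions), and the HNSW index requires a reasonably recent pgvector release:

```sql
-- Enable the extension (it must be installed on the server).
CREATE EXTENSION IF NOT EXISTS vector;

-- A hypothetical documents table with one embedding per row.
CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(3)  -- toy dimension; e.g. 1536 for common embedding models
);

-- Approximate nearest-neighbor index using cosine distance.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Fetch the five documents most similar to a query embedding.
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'  -- your query embedding here
LIMIT 5;
```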

Chris Engelbert: I love that. And I think that’s a perfect sentence to end the show. The Postgres ecosystem or extension system is just incredible. And there’s so much stuff that we’ve seen so far and so much more stuff to come. I couldn’t agree more.

Shaun Thomas: Yeah, it’s just the beginning, man.

Chris Engelbert: Yeah, let’s hope that AI is not going to try to build our HA systems. And I’m happy.

Shaun Thomas: Maybe not yet, yeah.

Chris Engelbert: Yeah, not yet at least. Exactly. All right, thank you for being here. It was a pleasure. As I said, I think I have to invite you again somewhere in the future.

Shaun Thomas: More than willing.

Chris Engelbert: And to the audience, thank you for listening in again. I hope you come back next week. And thank you very much. Take care.

The post How I designed PostgreSQL High Availability with Shaun Thomas from Tembo (video + interview) appeared first on simplyblock.

How to choose your Kubernetes Postgres Operator? https://www.simplyblock.io/blog/choosing-a-kubernetes-postgres-operator/ Thu, 06 Jun 2024 12:09:42 +0000 https://www.simplyblock.io/?p=260 A Postgres Operator for Kubernetes eases the management of PostgreSQL clusters inside the Kubernetes (k8s) environment. It watches changes (additions, updates, deletes) of PostgreSQL related CRD (custom resource definitions) and applies changes to the running clusters accordingly. Therefore, a Postgres Operator is a critical component of the database setup. In addition to the “simply” management […]

The post How to choose your Kubernetes Postgres Operator? appeared first on simplyblock.

A Postgres Operator for Kubernetes eases the management of PostgreSQL clusters inside the Kubernetes (k8s) environment. It watches changes (additions, updates, deletes) of PostgreSQL related CRD (custom resource definitions) and applies changes to the running clusters accordingly.

Therefore, a Postgres Operator is a critical component of the database setup. In addition to the “simple” management of the Postgres cluster, it often also provides integration with external tools like pgpool II or pgbouncer (for connection pooling), pgbackrest or Barman (for backup and restore), as well as Postgres extension management.

Critical components should be chosen wisely. Oftentimes, it’s hard to exchange one tool for another later on. Additionally, depending on your requirements, you may need a tool with commercial support available, while others prefer the vast open-source ecosystem. If both are available, even better.

But how do you choose your Postgres Operator of choice? Let’s start with a quick introduction of the most common options.

Stolon

Stolon is one of the oldest operators available. Originally released in November 2015, it even predates the term Kubernetes Operator.

The project is well known and has over 4.5k stars on GitHub. However, its age shows. Many features aren’t cloud-native; it doesn’t support CRDs (custom resource definitions), hence configuration doesn’t follow the Kubernetes way. All changes are handled through a command-line tool. Apart from that, the last release, 0.17.0, is from September 2021, and there isn’t really any new activity on the repository. You can read that as “extremely stable” or “almost abandoned”. I guess both views on the matter are correct. Anyhow, from my point of view, the lack of activity is concerning for a tool that’s potentially being used for 5-10 years, especially in a fast-paced world like the cloud-native one.

Personally, I don’t recommend using it for new setups. I still wanted to mention it though since it deserves it. It did and does a great job for many people. Still, I left it out of the comparison table further below.

CloudNativePG

CloudNativePG is the new kid on the block, or so you might think. Its official first commit happened in March 2022. However, CloudNativePG was originally developed and sponsored by EDB (EnterpriseDB), one of the oldest companies built around PostgreSQL.

As the name suggests, CloudNativePG is designed to bring the full cloud-native feeling across. Everything is defined and configured using CRDs. No matter if you need to create a new Postgres cluster, want to define backup schedules, configure automatic failover, or scale up and down. It’s all there. Fully integrated into your typical Kubernetes way of doing things. Including a plugin for kubectl.
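To give a feeling for what that looks like, here is a minimal sketch of a CloudNativePG cluster definition. The name and storage size are placeholders, not recommendations, and many optional fields (backups, monitoring, PostgreSQL parameters) are omitted:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-example          # placeholder name
spec:
  instances: 3              # one primary plus two replicas, automatic failover
  storage:
    size: 10Gi              # placeholder volume size
```

Applied with kubectl apply -f, the operator bootstraps the cluster, wires up streaming replication, and handles failover on its own.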

On GitHub, CloudNativePG skyrocketed in terms of stargazers and collected over 3.6k stars over the course of its lifetime. And while it’s not officially a CNCF (Cloud Native Computing Foundation) project, the CNCF stands pretty strongly behind it. And not to forget, the feature set is on par with the older projects. No need to hide here.

All in all, CloudNativePG is a strong choice for a new setup. Fully in line with your Kubernetes workflows, feature rich, and a strong and active community.

Editor’s note: Btw, we had Jimmy Angelakos as a guest on the Cloud Commute podcast, and he talks about CloudNativePG. You should listen in.

Crunchy Postgres Operator (PGO)

PGO, or the Postgres Operator from Crunchy Data, has been around since March 2017 and is a beloved choice for a good chunk of people. On GitHub, PGO has over 3.7k stars, an active community, quick response times on issues, and an overall pretty stable activity timeline.

Postgres resources are defined and managed through CRDs, as are Postgres users. Likewise, PGO provides integration with a vast tooling ecosystem, such as patroni (for automatic failover), pgbackrest (for backup management), pgbouncer (connection pooling), and more.

A good number of common extensions are provided out of the box; however, adding further extensions is a little more complicated.

Overall, Crunchy PGO is a solid, production-proven option, also going into the future. While not as fresh and hip as CloudNativePG, it checks all the necessary boxes, and has done so for many people for many years.

OnGres StackGres

StackGres by OnGres is also fairly new and tries to do a few things differently. While all resources can be managed through CRDs, they can also be managed through a CLI, and even a web interface. You can create a resource through a CRD, change a small property through the CLI, and scale the Postgres cluster using the web UI. All management interfaces are completely interchangeable.

The same goes for extension management. StackGres has the largest number of supported Postgres extensions available, and it’s hard to find an extension that isn’t supported out of the box.

In terms of tooling integration, StackGres supports all the necessary tools to build a highly available, fault tolerant, automatically backed up and restored, scalable cluster.

In comparison to some of the other operators, I really like the independent CRD types, which give a better overview of a specific resource, instead of bundling all of the complexity of the Postgres and tooling ecosystem configuration into one major CRD with hundreds or thousands of lines.

StackGres is my personal favorite, even though it has only accumulated around 900 stars on GitHub so far.

While the project is still young, the team behind it is a group of veteran PostgreSQL folks. They just know what they’re doing. If you prefer a bit of a bigger community, and less of a company-driven project, you’ll be better off with CloudNativePG, but apart from that, StackGres is the way to go.

Editor’s note: We had Álvaro Hernández, the founder and CEO of OnGres, in our Cloud Commute podcast, talking about StackGres and why you should use it. Don’t miss the first-hand information.

AppsCode KubeDB

KubeDB by AppsCode is different. In development since 2017, it’s well known in the community, but it’s an all-in commercial product. That said, the commercial support is great and loved.

Additionally, KubeDB isn’t just PostgreSQL. It supports Elasticsearch, MongoDB, Redis, Memcached, MySQL, and more. It’s a real database-in-a-box deployment solution.

All boxes are checked for KubeDB and there isn’t much to say about it other than, if you need a commercially supported operator for Postgres, look no further than KubeDB.

Zalando Postgres Operator

Last but not least, the Postgres Operator from Zalando. Yes, that Zalando, the one that sells shoes and clothes.

Zalando is a big Postgres user and started early on its cloud-native journey. Their operator has been around since 2017 and has a sizable fanbase. On GitHub, the operator managed to collect over 4k stars, has a very active community, a stable release cadence, and is a great choice.

In terms of integrations with the tooling ecosystem, it provides less flexibility and is slightly opinionated about how things are done. It was developed first and foremost for Zalando’s own infrastructure, though.

Anyhow, the Zalando operator has been and still is a great choice. I actually used it myself for my previous startup and it just worked.

Which Postgres Kubernetes Operator should I Use?

You already know the answer: it depends. I know we all hate that answer, but it is true. As hinted at in the beginning, if you need commercial support, certain options are already out of scope.

It also depends on whether you already have another Postgres cluster with an operator running. If it works, is there really a reason to change it or introduce another one for the new cluster?

Anyhow, below is a quick comparison table of features and supported versions that I think are important.

| | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
|---|---|---|---|---|---|
| Tool version | 1.23.1 | 5.5.2 | 1.10.0 | v2024.4.27 | 1.12.0 |
| Release date | 2024-04-30 | 2024-05-23 | 2024-04-29 | 2024-04-30 | 2024-05-31 |
| License | Apache 2 | Apache 2 | AGPL3 | Commercial | MIT |
| Commercial support | | | | | |

Supported PostgreSQL Features

| | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
|---|---|---|---|---|---|
| Supported versions | 12, 13, 14, 15, 16 | 11, 12, 13, 14, 15, 16 | 12, 13, 14, 15, 16 | 9.6, 10, 11, 12, 13, 14 | 11, 12, 13, 14, 15, 16 |
| Postgres Clusters | | | | | |
| Streaming replication | | | | | |
| Supports Extensions | | | | | |

High Availability and Backup Features

| | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
|---|---|---|---|---|---|
| Hot Standby | | | | | |
| Warm Standby | | | | | |
| Automatic Failover | | | | | |
| Continuous Archiving | | | | | |
| Restore from WAL archive | | | | | |
| Supports PITR | | | | | |
| Manual backups | | | | | |
| Scheduled backups | | | | | |

Kubernetes Specific Features

| | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
|---|---|---|---|---|---|
| Backups via Kubernetes | | | | | |
| Custom resources | | | | | |
| Uses default PG images | | | | | |
| CLI access | | | | | |
| WebUI | | | | | |
| Tolerations | | | | | |
| Node affinity | | | | | |

How to choose your Postgres Operator?

The graph below shows the GitHub stars of the above Postgres Operators. What we see is a clear domination of Stolon, PGO, and Zalando’s PG Operator, with CloudNativePG rushing in from behind. StackGres, while around longer than CloudNativePG, doesn’t have the same community backing yet. But GitHub stars aren’t everything.

All of the above tools are great options, with the exception of Stolon, which isn’t a bad tool, but I’m concerned about the lack of activity. Make of it what you like.

GitHub stars history for CloudNativePG, Crunchy PGO, StackGres, the Zalando Postgres Operator, KubeDB (docs), and Stolon

Before closing, I want to quickly give some honorable mentions to two further tools.

Percona has an operator for PostgreSQL, but the community is very small right now. Let’s see if they manage to bring it on par with the other tools. If you use other Percona tools, it’s certainly worth giving it a look: Percona Operator for PostgreSQL.

The other one is the External PostgreSQL Server Operator by MoveToKube. It didn’t really fit the topic of this blog post, as it’s less of a Postgres Operator and more of a management tool for databases (in Postgres’ relational sense) and users. Meaning, it uses CRDs to add, update, and remove databases on external PG servers, and does the same for Postgres users. Anyhow, this tool also works with services like Timescale Cloud, Amazon RDS, and many more. Worth mentioning, and maybe you can make use of it in the future.

The post How to choose your Kubernetes Postgres Operator? appeared first on simplyblock.

PostgreSQL mistakes and how to avoid them with Jimmy Angelakos https://www.simplyblock.io/blog/postgresql-mistakes-and-how-to-avoid-them-video/ Thu, 02 May 2024 12:12:35 +0000 https://www.simplyblock.io/?p=279 This interview is part of the simplyblock’s Cloud Commute Podcast, available on Youtube , Spotify , iTunes/Apple Podcasts , Pandora , Samsung Podcasts, and our show site . In this installment of podcast, we’re joined by Jimmy Angelakos (X/Twitter) , a freelance consultant, talks about his experiences with customers running PostgreSQL on bare-metal, in the […]

The post PostgreSQL mistakes and how to avoid them with Jimmy Angelakos appeared first on simplyblock.

]]>
This interview is part of simplyblock’s Cloud Commute Podcast, available on Youtube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

In this installment of the podcast, we’re joined by Jimmy Angelakos (X/Twitter), a freelance consultant, who talks about his experiences with customers running PostgreSQL on bare-metal, in the cloud, and on Kubernetes. He also talks about his new book, “PostgreSQL Mistakes and How to Avoid Them”.

EP10: PostgreSQL mistakes and how to avoid them with Jimmy Angelakos

Chris Engelbert: Welcome back everyone. Welcome to the next episode of simplyblock’s Cloud Commute podcast. Today I have a very interesting guest, very different from the other ones before, because he’s actually an author, writing a book right now. Well, I think he already published one or two at least. But he’ll talk about that himself. Welcome, Jimmy.

Jimmy Angelakos: Hi, very nice to be here.

Chris Engelbert: Very nice. Thank you for being here. Maybe we just start simple with the basic stuff. Who are you? Where are you from? What do you do for a living? Except for writing a book.

Jimmy Angelakos: My name is Jimmy Angelakos, which is obviously a Greek name. I live in Edinburgh in Scotland. I’ve been working with Postgres for maybe around 16 years now, exclusively. I haven’t used any other database in 16 years in a professional capacity. Naturally, the time came to share my experiences and I wrote a couple of books on this. Well, I actually co-wrote the “PostgreSQL 16 Administration Cookbook” with my lovely co-authors Boris Mejías, Gianni Ciolli, Vibhor Kumar, and the sadly departed Simon Riggs, who was an awesome fellow. I’d like to pay a little tribute to him as a person, as a mentor to the entire Postgres community. He will be greatly missed.

Chris Engelbert: Thank you very much. I appreciate you sharing that because I think it was last week at the time of recording. It is a sad story for the Postgres community as a whole. Thank you for sharing that. From your professional life, for the last couple of years next to writing books, I think you’re mostly working as a consultant with a couple of different companies and customers. What do you think is the most common task? I mean, you’re probably coming in to help them optimize Postgres, optimize queries.

Jimmy Angelakos: Right. I’ve done all sorts of things in the past few years, like training customers to use Postgres in general, training them to use Postgres in a specific way that is suited to their needs. I have provided support to customers who ran Postgres, and also professional services like consulting. I can’t really say what the thing they use or request the most is, but I can tell you a few of the things. Some customers come in and say, “My queries aren’t running well. What can I do?” It’s like the most frequent thing you hear. Some other people say, “Tell me what hardware to buy for Postgres.” You tell them, “I can’t really give you a response, because it really depends on your workload,” which is the most important factor, I think, with databases. Everyone uses them differently. And with a database as widely used as Postgres, there are so many use cases and so many different ways to use it. You can do analytics on it to an extent, you can use it for transaction processing (OLTP), you can use it as a document database with JSONB. There’s all sorts of things you can do. There’s no good answer to the things that people ask, like “Give me the best tuning parameters for Postgres,” or “How do I write a query the right way?” It really depends on the amount of data you have, the type of data you have, and the sort of queries you’re going to be running.
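Editor’s note: One practical way to let the workload speak for itself, before touching any tuning parameters, is the pg_stat_statements extension. A small sketch (column names as of PostgreSQL 13 and later; the extension must first be added to shared_preload_libraries):

```sql
-- Once per database, after adding pg_stat_statements to shared_preload_libraries:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- The ten statements that consumed the most total execution time.
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```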

Chris Engelbert: Yeah, that makes a lot of sense. It’s not only for the Postgres community or for Postgres. That is very true for a lot of things. From my own personal background, with a lot of programming languages or runtime environments, people ask, “What is the optimized or the optimal way of configuring it?” And they’re like, “I don’t know. Can’t give you the answer.” So, yeah, I hear where you’re coming from. All right, so… Sorry, I’m still having a little bit of a flu. So, from your personal background, you said you’ve co-written one book, but I also hinted on the fact that you’re writing another book right now, and I looked a little bit into it because it’s on Manning and it has Early Access, which is nice. But maybe you can give us a little bit of an insight of what you’re writing about.

Jimmy Angelakos: Right. So, the book that is under construction is called PostgreSQL Mistakes and how you can avoid them. So, it’s a bit of an anti-how-to. So, for people that are used to how-to books, like, “How do I partition? How do I do this? How do I do that?” It’s a bit of the other way around. I was trying to do this, but things went wrong. So, it’s experiences that I’ve collected from the things I’ve seen our customers do or the things I’ve done in the past.

Chris Engelbert: Right.

Jimmy Angelakos: And it’s really important to learn from mistakes. Everyone makes mistakes. And Postgres is very particular in how it wants things done. So if you get it right, the database is fantastic. It works very well with excellent performance. And when you start to do things a different way, you can see different results. And that’s basically the whole idea. There’s three chapters. Three chapters up on the web now. And there’s a huge fourth chapter that’s being published as we speak. That has anti-patterns that are not really restricted to Postgres. It’s things like, don’t improvise, don’t create your own distributed systems. There’s people that have spent hundreds of thousands of hours working on these problems, and you don’t need to reinvent the wheel.

Chris Engelbert: I hear you. As you said, there’s three chapters out right now. I haven’t seen the fourth one yet, so I think I have to look into that right after the recording.

Jimmy Angelakos: Manning are in the process of publishing it as we speak.

Chris Engelbert: All right, cool. But so far, I really like the second chapter and you bringing up all of the SQL code examples and showing the execution plans. And I think just by saying the word execution plan or the term execution plan, I probably lost half of the audience right now. So maybe you can give them a little bit of a feeling of what is an execution plan? Why is it so important to understand those things?

Jimmy Angelakos: Yeah, so Postgres has a quasi-intelligent query planner, which basically examines the way your query is written and produces a plan for how it’s going to get executed by the database server. It’s like: oh, they wrote this WHERE clause, this and that, and it looks like a join. So I’m going to perform a join of these tables and then I’m going to order the results in this way. So that’s the execution plan. It’s basically telling you how the database is going to execute your SQL query. Now, the planner takes into account things such as how much memory you have, or how fast your disks are, which you’ve already specified in the Postgres configuration. It also takes into account things like: what’s the nature of the data? What’s the cardinality, let’s say, in your tables? And these are things that are updated automatically by Postgres itself in its statistics tables. So it produces, most of the time, a really good plan. And what is a good plan? It’s the cheapest plan in terms of arbitrary cost. And arbitrary cost is calculated using those factors that I just mentioned. It iterates through many plans for the execution and chooses the cheapest one, which will probably end up being the fastest one to execute in real-world terms. And seeing the execution plans is key to understanding why your queries are running well or why they’re running slowly. Because then you can see: ah, this is what Postgres was trying to do. So maybe I should force its hand by writing this slightly differently.
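Editor’s note: Seeing a plan is as simple as prefixing a query with EXPLAIN; the tables below are hypothetical. Plain EXPLAIN only shows the chosen plan, while EXPLAIN (ANALYZE, BUFFERS) actually executes the query and reports how the estimates compare with reality:

```sql
-- Show the chosen plan without executing the query.
EXPLAIN
SELECT c.name, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';

-- Execute the query and report actual timings, row counts, and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.name, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';
```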

Chris Engelbert: Yeah, that’s true. I think my personal favorite example is a common table expression which ends up being a join, because the query planner understands that a join is actually better; it doesn’t need the temporary heap table to store the intermediate result. So we kind of hinted at where people can find the early access version. It’s at Manning. Do you want to add anything more to that? Maybe you have a specific simple URL or something where people can find it.
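Editor’s note: Since PostgreSQL 12, a side-effect-free CTE that is referenced only once is inlined into the outer query, so it plans as an ordinary join instead of being materialized. A sketch with the same hypothetical tables as above:

```sql
-- The CTE is inlined by the planner, so this plans as a plain join.
EXPLAIN
WITH recent_orders AS (
    SELECT * FROM orders WHERE created_at > now() - interval '7 days'
)
SELECT c.name, r.total
FROM recent_orders r
JOIN customers c ON c.id = r.customer_id;

-- Forces the pre-12 behavior: the CTE is materialized separately.
EXPLAIN
WITH recent_orders AS MATERIALIZED (
    SELECT * FROM orders WHERE created_at > now() - interval '7 days'
)
SELECT count(*) FROM recent_orders;
```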

Jimmy Angelakos: I can share the URL, but I certainly cannot spell it out.

Chris Engelbert: Ok, that’s fair enough. We’re going to put it in the show notes. That’s totally fine.

Jimmy Angelakos: Thanks very much. Yeah, I think it’s going to be an interesting book because it’s real-world use cases. And where it isn’t a real-world use case, it’s close enough. And I will tell you so in the text.

Chris Engelbert: That is true. And I agree. As I said, I’ve kind of read through the first three chapters, as much as I had time for, and I really enjoyed it. And many of those code examples you brought up, especially in the second chapter, were like: yes, either I’ve been there, or I’ve helped people with exactly that. I’ve worked for a Postgres-based startup in the past, and we had people asking pretty much the same questions over and over again. So yes, for everyone using Postgres or starting to use Postgres, it’s probably a pretty good pick.

Jimmy Angelakos: Thank you. I appreciate that. Yeah, as you know, people coming to Postgres are often familiar with other databases, because Postgres has only recently exploded in popularity. It was kind of a niche database for a few years. And now it looks like all the enterprises are using it, all the hyperscalers are starting to use it, like AWS, Google, Azure. This means that they have recognized the value that Postgres brings to the table.

Chris Engelbert: Yeah, I agree. And I think it’s kind of interesting, because you hinted at that earlier. You can do a lot of things with Postgres. There is a lot of stuff in Postgres itself. If you want a document database, you have XML and JSON. If you want key-value, you have hstore. But there is also really good extensibility in Postgres, giving you the chance to plug everything else in, like time series or graph databases, and I don’t know what else. You could probably define Postgres as the only true multimodal database in the world.

Jimmy Angelakos: Right, yeah. And we were actually considering changing the description of Postgres on the website, where you go in and it says it’s an object-relational database, which is kind of a formal, traditional way to put it. But nowadays, you’re right, I think it’s more of a multimodal database. And I think that is also the term that Simon Riggs preferred. Because it does all of these things, and it also lets you do things that the developers of Postgres hadn’t even thought of, thanks to the extension system. A very famous extension is PostGIS, which provides the GIS (geospatial) capabilities for Postgres, and is now considered the gold standard in geographical databases.

Chris Engelbert: True.

Jimmy Angelakos: From an open-source extension to an open-source database. And there’s like thousands of people that are professionally employed to use this extension in their day jobs, which is amazing.

Chris Engelbert: True. I agree. So let me see. Let me flip back a little bit. I mean, we’re officially a cloud podcast. We talked a lot about the cool Postgres world. And I was part of a Postgres world. I was part of the Java world. So that is mostly the guests I had so far. But because we’re a cloud podcast, what do you think, like working with all the different customers, what is your feeling? Like how many people are actually deploying Postgres in the cloud, in Kubernetes, in EC2, or anything like that?

Jimmy Angelakos: Well, the company I’m working with right now are using it on RDS. They’re using RDS Postgres because it suits their use case better in the sense that they don’t have a team that wants to worry about replication and backups and things like that. And availability zones, they want that handled as a service. And that fits their use case quite well. When you want more flexibility, you can still use the cloud. You can run, for example, Postgres on Azure boxes or EC2 boxes or whatever you want. But then you have to take care of these things yourself.

Chris Engelbert: Right.

Jimmy Angelakos: But it still gives you freedom from having to worry about hard drives and hardware and purchase orders and things like that. You just send off a check every month and you’re done. Now, Kubernetes is an interesting case. There are a couple of operators for Postgres. The most recent one is CloudNativePG, which is starting to get supported and is getting traction from the Cloud Native Computing Foundation, which is great. They are trying to do things in a different way that is totally cloud-native. So everything is defined as a resource in Kubernetes. But the resources map to things that are well known in Postgres, like clusters and nodes and backups, actual things, so that you don’t have to perform black magic like running it in a pod, but also having to configure the pod manually to talk to another pod that is your replica, things like that. And there are other operators that have evolved over time to approximate this ease of use. The Crunchy Data Operator comes to mind. It started off being very imperative. They had a command-line utility that created clusters and so on. And now they’ve turned it into something declarative, which is more cloud-native, more preferred by the Kubernetes world. I think these two are the major Postgres things that I’ve seen in Kubernetes, at least that I’ve seen in use in the past few years. There are still things that haven’t been sorted out because, as we said, Postgres is super flexible. And this flexibility, together with the ease of use of Kubernetes, where everything is taken care of automatically, comes at a cost. You have reduced flexibility when you’re on Kubernetes. So there are things that haven’t been totally worked out yet, like: how do you one-click migrate from a cluster that is outside Kubernetes to something that is running in Kubernetes? Or can you take a backup that was produced elsewhere and create a Postgres cluster in Kubernetes from that backup? Once they have these things sorted, and also hardware support, which is very important when you’re talking about databases, I think we’ll see many more people running Postgres on Kubernetes in production. But specifically hardware, and specifically disk performance, throughput, and latency: you have to get into the hardware nitty-gritty of Kubernetes to take maximum advantage of Postgres, because as a database, it loves fast disks. Generally speaking, the faster your disk, the faster Postgres will go.
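Editor’s note: As an illustration of Jimmy’s point that CloudNativePG maps Postgres concepts to Kubernetes resources, even a backup schedule is a resource of its own. A minimal sketch (the cluster name is a placeholder, and the referenced cluster needs a configured backup destination, such as an object store):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: nightly-backup
spec:
  schedule: "0 0 2 * * *"   # six-field cron: every day at 02:00
  cluster:
    name: pg-example        # an existing Cluster resource
```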

Chris Engelbert: That is true. And just like a shameless plug, we’re working on something. But because we’re running out of time already, 20 minutes is always so super short. What do you think is going to be the next thing for Postgres, the database world, the cloud world, whatever you like. What do you think is the next thing?

Jimmy Angelakos: I can’t give you an answer, but you can go search on YouTube and find Simon’s last contribution to the Postgres community. He gave a talk at PostgreSQL Conference Europe last December called “Postgres: The Next 20 Years”, or something to that effect, where he predicted how things will go for Postgres and its future directions. That’s a very interesting talk for anyone who wants to watch it. I wouldn’t want to hazard a guess myself, because I’ve seen people just blindly accept that AI is the next big thing, and that everything in Postgres and databases and Java and Python is going to revolve around AI in the future. That remains to be seen.

Chris Engelbert: I like that because normally I start to say, please don’t say AI. Everyone says that. And I think AI will be a big part of the future, but I agree with you. It remains to be seen how exactly. Yeah, thank you very much. We’re going to put the video link in the show notes as well for everyone interested. And yeah, Jimmy, thank you very much. It was a pleasure having you.

Jimmy Angelakos: Thanks very much. I appreciate the invitation.

Chris Engelbert: My pleasure. And for the audience, we’re going to see or hear us next week. And thank you very much for being here.

The post PostgreSQL mistakes and how to avoid them with Jimmy Angelakos appeared first on simplyblock.

Production-grade Kubernetes PostgreSQL, Álvaro Hernández https://www.simplyblock.io/blog/production-grade-postgresql-on-kubernetes-with-alvaro-hernandez-tortosa-from-ongres/ Fri, 05 Apr 2024 12:13:27 +0000 https://www.simplyblock.io/?p=298 In this episode of the Cloud Commute podcast, Chris Engelbert is joined by Álvaro Hernández Tortosa, a prominent figure in the PostgreSQL community and CEO of OnGres. Álvaro shares his deep insights into running production-grade PostgreSQL on Kubernetes, a complex yet rewarding endeavor. The discussion covers the challenges, best practices, and innovations that make PostgreSQL […]

The post Production-grade Kubernetes PostgreSQL, Álvaro Hernández appeared first on simplyblock.

In this episode of the Cloud Commute podcast, Chris Engelbert is joined by Álvaro Hernández Tortosa, a prominent figure in the PostgreSQL community and CEO of OnGres. Álvaro shares his deep insights into running production-grade PostgreSQL on Kubernetes, a complex yet rewarding endeavor. The discussion covers the challenges, best practices, and innovations that make PostgreSQL a powerful database choice in cloud-native environments.

This interview is part of the simplyblock Cloud Commute Podcast, available on Youtube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

Key Takeaways

Q: Should you deploy PostgreSQL in Kubernetes?

Deploying PostgreSQL in Kubernetes is a strategic move for organizations aiming for flexibility and scalability. Álvaro emphasizes that Kubernetes abstracts the underlying infrastructure, allowing PostgreSQL to run consistently across various environments—whether on-premise or in the cloud. This approach not only simplifies deployments but also ensures that the database is resilient and highly available.

Q: What are the main challenges of running PostgreSQL on Kubernetes?

Running PostgreSQL on Kubernetes presents unique challenges, particularly around storage and network performance. Network disks, commonly used in cloud environments, often lag behind local disks in performance, impacting database operations. However, these challenges can be mitigated by carefully choosing storage solutions and configuring Kubernetes to optimize performance. Furthermore, managing PostgreSQL’s ecosystem—such as backups, monitoring, and high availability—requires robust tooling and expertise, which can be streamlined with solutions like StackGres.

Q: Why should you use Kubernetes for PostgreSQL?

Kubernetes offers a powerful platform for running PostgreSQL due to its ability to abstract infrastructure details, automate deployments, and provide built-in scaling capabilities. Kubernetes facilitates the management of complex PostgreSQL environments, making it easier to achieve high availability and resilience without being locked into a specific vendor’s ecosystem.

Q: Can I use PostgreSQL on Kubernetes with PGO?

Yes, you can. Tools like the PostgreSQL Operator (PGO) for Kubernetes simplify the management of PostgreSQL clusters by automating routine tasks such as backups, scaling, and updates. These operators are essential for ensuring that PostgreSQL runs efficiently on Kubernetes while reducing the operational burden on database administrators.

EP 06: Building and operating a production-grade PostgreSQL in Kubernetes

In addition to highlighting the key takeaways, it’s essential to provide deeper context and insights that enrich the listener’s understanding of the episode. By offering this added layer of information, we ensure that when you tune in, you’ll have a clearer grasp of the nuances behind the discussion. This approach enhances your engagement with the content and helps shed light on the reasoning and perspective behind the thoughtful questions posed by our host, Chris Engelbert. Ultimately, this allows for a more immersive and insightful listening experience.

Key Learnings

Q: How does Kubernetes scheduler work with PostgreSQL?

Kubernetes uses its scheduler to manage how and where PostgreSQL instances are deployed, ensuring optimal resource utilization. However, understanding the nuances of Kubernetes’ scheduling can help optimize PostgreSQL performance, especially in environments with fluctuating workloads.

simplyblock Insight: Leveraging simplyblock’s solution, users can integrate sophisticated monitoring and management tools with Kubernetes, allowing them to automate the scaling and scheduling of PostgreSQL workloads, thereby ensuring that database resources are efficiently utilized and downtime is minimized.

Q: What is the best experience of running PostgreSQL in Kubernetes?

The best experience comes from utilizing a Kubernetes operator like StackGres, which simplifies the deployment and management of PostgreSQL clusters. StackGres handles critical functions such as backups, monitoring, and high availability out of the box, providing a seamless experience for both seasoned DBAs and those new to PostgreSQL on Kubernetes.

simplyblock Insight: By using simplyblock’s Kubernetes-based solutions, you can further enhance your PostgreSQL deployments with features like dynamic scaling and automated failover, ensuring that your database remains resilient and performs optimally under varying loads.

Q: How does disk access latency impact PostgreSQL performance in Kubernetes?

Disk access latency is a significant factor in PostgreSQL performance, especially in Kubernetes environments where network storage is commonly used. While network storage offers flexibility, it typically has higher latency compared to local storage, which can slow down database operations. Optimizing storage configurations in Kubernetes is crucial to minimizing latency and maintaining high performance.

simplyblock Insight: simplyblock’s advanced storage solutions for Kubernetes can help mitigate these latency issues by providing optimized, low-latency storage options tailored specifically for PostgreSQL workloads, ensuring your database runs at peak efficiency.

Q: What are the advantages of clustering in PostgreSQL on Kubernetes?

Clustering PostgreSQL in Kubernetes offers several advantages, including improved fault tolerance, load balancing, and easier scaling. Kubernetes operators like StackGres enable automated clustering, which simplifies the process of setting up and managing a highly available PostgreSQL cluster.

simplyblock Insight: With simplyblock, you can easily deploy clustered PostgreSQL environments that automatically adjust to your workload demands, ensuring continuous availability and optimal performance across all nodes in your cluster.

Additional Nugget of Information

Q: What are the advantages of clustering in Postgres?

A: Clustering in PostgreSQL provides several benefits, including improved performance, high availability, and better fault tolerance. Clustering allows multiple database instances to work together, distributing the load and ensuring that if one node fails, others can take over without downtime. This setup is particularly advantageous for large-scale applications that require high availability and resilience. Clustering also enables better scalability, as you can add more nodes to handle increasing workloads, ensuring consistent performance as demand grows.

Conclusion

Deploying PostgreSQL on Kubernetes offers powerful capabilities but comes with challenges. Álvaro Hernández Tortosa highlights how StackGres simplifies this process, enhancing performance, ensuring high availability, and making PostgreSQL more accessible. With the right tools and insights, you can confidently manage PostgreSQL in a cloud-native environment.

Full Video Transcript

Chris Engelbert: Welcome to this week’s episode of Cloud Commute podcast by simplyblock. Today, I have another incredible guest, a really good friend, Álvaro Hernández from OnGres. He’s very big in the Postgres community. So hello, and welcome, Álvaro.

Álvaro Hernández Tortosa: Thank you very much, first of all, for having me here. It’s an honor.

Chris Engelbert: Maybe just start by introducing yourself, who you are, what you’ve done in the past, how you got here. Well, except me inviting you.

Álvaro Hernández Tortosa: OK, well, I don’t know how to describe myself, but I would say, first of all, I’m a big nerd, big fan of open source. And I’ve been working with Postgres, I don’t know, for more than 20 years, 24 years now. So I’m a big Postgres person. There’s someone out there in the community that says that if you say Postgres three times, I will pop up there. It’s kind of like Superman or Batman or these superheroes. No, I’m not a superhero. But anyway, professionally, I’m the founder and CEO of a company called OnGres. Let’s guess what it means, On Postgres. So it’s pretty obvious what we do. So everything revolves around Postgres, but in reality, I love all kinds of technology. I’ve been working a lot with many other technologies. I know you because of being a Java programmer, which is kind of my hobby. I love programming in my free time, which almost doesn’t exist. But I try to get some from time to time. And everything related to technology in general, I’m also a big fan and supporter of open source. I have contributed and keep contributing a lot to open source. I also founded some open source communities, like for example, I’m a Spaniard. I live in Spain. And I founded Debian Spain, an association like, I don’t know, 20 years ago. More recently, I also founded a foundation, a nonprofit foundation also in Spain called Fundación PostgreSQL. Again, guess what it does? And I try to engage a lot with the open source communities. We, by the way, organized a conference for those who are interested in Postgres in the magnificent island of Ibiza in the Mediterranean Sea in September this year, 9th to 11th September for those who want to join. So yeah, that’s probably a brief intro about myself.

Chris Engelbert: All right. So you are basically the Beetlejuice of Postgres. That’s what you’re saying.

Álvaro Hernández Tortosa: Beetlejuice, right. That’s a better fit than the superheroes. You’re absolutely right.

Chris Engelbert: I’m not sure if he is a superhero, but he’s different at least. Anyway, you mentioned OnGres. And I know OnGres isn’t really like the first company. There were quite a few before, I think, El Toro, a database company.

Álvaro Hernández Tortosa: Yes, Toro DB.

Chris Engelbert: Oh, Toro DB. Sorry, close, close, very close. So what is up with that? You’re trying to do a lot of different things and seem to love trying new things, right?

Álvaro Hernández Tortosa: Yes. So I sometimes define myself as a 0.x serial entrepreneur, meaning that I’ve tried several ventures and sold none of them. But I’m still trying. I like to be resilient, and I keep pushing the ideas that I have in the back of my head. So yes, I’ve done several ventures, all of them around certain patterns. So for example, you’re asking about ToroDB. ToroDB is essentially open-source software that is meant to replace MongoDB with, you guessed it, Postgres, right? There’s a certain pattern in my professional life. I’m speaking in the past tense because it is unfortunately no longer a maintained open-source project. We moved on to something else, which is OnGres. But the idea of ToroDB was to essentially replicate these documents live from MongoDB and, in the process, in real time, transform them into a set of relational tables that got stored inside of a Postgres database. So it enabled you to do SQL queries on your documents that were in MongoDB. So think of it as a MongoDB replica. You can keep your MongoDB cluster if you want, and then you have all the data in SQL. This was great for analytics. You could get great speed-ups by normalizing data automatically and then doing queries with the power of SQL, which obviously is much broader and richer than MongoDB’s query language, especially for analytics. We got like 100 times faster on most queries. So it was an interesting project.

Chris Engelbert: So that means you basically generated the schema on the fly and then generated the table for that schema specifically? Interesting.

Álvaro Hernández Tortosa: Yeah, it was generating tables and columns on the fly.

OnGres StackGres: Operator for Production-Grade PostgreSQL on Kubernetes

Chris Engelbert: Right. Ok, interesting. So now you’re doing the OnGres thing. And OnGres has, I think, the main product, StackGres, as far as I know. Can you tell a little bit about that?

Álvaro Hernández Tortosa: Yes. So OnGres, as I said, means On Postgres. And one of our goals in OnGres is that we believe that Postgres is a fantastic database. I don’t need to explain that to you, right? But it’s kind of the Linux kernel, if I may use this parallel. It’s a bit bare bones. You need something around it. You need a distribution, right? So Postgres is a little bit the same thing. The core is small, it’s fantastic, it’s very featureful, it’s reliable, it’s trustable. But it needs tools around it. So our vision in OnGres is to develop this ecosystem around this Postgres core, right? And one of the things that we experience during our professional lifetime is that Postgres requires a lot of tools around it. It needs monitoring, it needs backups, it needs high availability, it needs connection pooling.

By the way, do not use Postgres without connection pooling, right? So you need a lot of tools around it. And none of these tools come from the core. You need to look into the ecosystem. And actually, this is good and bad. It’s good because there are a lot of options. It’s bad because there are a lot of options. Meaning: which one to choose, which one is good, which one is bad, which one is a good backup solution, which one is a good monitoring solution, and how do you configure them all? So this was a problem that we coined the stack problem. When you really want to run Postgres in production, you need a stack on top of Postgres, right? To orchestrate all these components.

Now, the problem is that we’ve been doing this for a long time for our customers. Typically, we love infrastructure as code, right? And everything was done with Terraform for the infrastructure and Ansible for orchestrating these components. But the reality is that every environment we looked into was slightly different. We couldn’t just take our Ansible code, run it, and have this stack. Because now the storage is different. Your networking is different. Your entry point: here, one is using virtual IPs, that one is using DNS, that one is using proxies. And then the compute is also somehow different. It was not reusable. We were doing a lot of copy, paste, modify, something that was not very sustainable. At some point, we started thinking: is there a way in which we can pack this stack into a single deployable unit that we can take essentially anywhere? And the answer was Kubernetes. Kubernetes provides us this abstraction, where we can abstract away the compute, the storage, the networking, and code against a programmable API, so that we can indeed create this package. So that’s StackGres.

StackGres is the stack of components you need to run production Postgres, packaged in a way that is uniform across any environment where you want to run it: cloud, on-prem, it doesn’t matter. And it is production ready! It’s packaged at a very, very high level. So basically, I would say you barely need, you don’t need, Postgres knowledge to run a production-ready, enterprise-quality Postgres cluster in production. And that’s the main goal of StackGres.

Chris Engelbert: Right, right. And as far as I know, I think it’s implemented as a Kubernetes operator, right?

Álvaro Hernández Tortosa: Yes, exactly.

Chris Engelbert: And there’s quite a few other operators as well. But I know that StackGres has some things which are done slightly differently. Can you talk a little bit about that? I don’t know how much you wanna actually make this public right now.

Álvaro Hernández Tortosa: No, actually everything is open source. Our roadmap is open source, our issues are open source. I’m happy to share everything. Well, first of all, what I would say is that the operator pattern is essentially these controllers that take actions on your cluster, plus the CRDs. We gave a lot of thought to these CRDs. I would say that for a lot of operators, the CRDs are kind of a byproduct. An afterthought: “I have my objects and then some script generates the CRDs.” No, we said: the CRDs are our user-facing API. The CRDs are our extended API. And the goal of operators is to abstract away and package business logic, right? And expose it with a simple user interface.

So we designed our CRDs to be very, very high level, very amenable to the user, so that, again, you don’t require any Postgres expertise. If you look at the CRDs, in practical terms the YAMLs, right? The YAMLs that you write to deploy something on StackGres: you should be able to explain them to your five-year-old kid, and your five-year-old kid should be able to deploy Postgres as a production-quality cluster, right? That’s our goal. And if we didn’t fulfill this goal, please raise an issue on our public issue tracker on GitLab, because we have definitely failed if that’s not true. So instead of focusing on the usual Postgres user, very knowledgeable, very high level, most operators focus on low-level CRDs, and they require Postgres expertise, probably a lot. We want to make Postgres more mainstream than ever, right? Postgres increases in popularity every year and is being adopted by more and more organizations, but not everybody’s a Postgres expert. We want to make Postgres universally accessible for everyone. So one of the things is that we put a lot of effort into this design. And instead of one big, gigantic CRD, we have multiple, and they can actually be attached to each other, like in an ER diagram. So you understand the relationships; you create one and then reference it many times, and you don’t need to repeat or reconfigure the configuration files. Another area where I would say we have tried to do something different is extensions. Postgres extensions are one of the most loved, if not the most loved, feature, right?

And StackGres is the operator that arguably supports the largest number of extensions: over 200 extensions as of now, and growing. We did this because we developed a custom solution, which is also open source, as part of StackGres, where we can load extensions dynamically into the cluster. So we don’t need to ship you a fat container with 200 extensions and a lot of security issues, right? Rather, we deploy a container with no extensions, and then you say, “I want this, this, this, and that,” and they will appear in your cluster automatically. And this is done via simple YAML. So we have a very powerful extension mechanism. And the other thing is that we not only expose the usual CRD YAML interface for interacting with StackGres, which is more than fine and I love it, but it comes with a fully fledged web console. Not everybody likes the command line or the GitOps approach. We do, but not everybody does. And it’s a fully fledged web console which also supports single sign-on, where you can integrate with your AD, with your OIDC provider, anything that you want. It has detailed, fine-grained permissions based on Kubernetes RBAC. So you can say who can create clusters, who can view configurations, who can do anything. And last but not least, there’s a REST API. So if you prefer to automate and integrate with another kind of solution, you can also use the REST API to create and manage clusters. And these three mechanisms, the YAML files/CRDs, the REST API, and the web console, are fully interchangeable. You can use one for one operation and another one for the next; everything goes back to the same state. So you can use whichever one you want.

And lately we have also added sharding. Sharding scales out with solutions like Citus, but we also support Postgres with partitioning and Apache ShardingSphere. Our way is to create a cluster of multiple instances: not only one primary and one replica, but a coordinator layer and then the shards, each with its own replicas. So typically dozens of instances, and you can create them with a simple YAML file and a very high-level description. It barely requires any knowledge and wires everything up for you. So it’s very, very convenient to make things simple.
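Editor’s note: The dynamic extension loading Álvaro describes is driven from the cluster definition itself. A minimal SGCluster sketch (names, sizes, and the chosen extensions are placeholders; which extension versions are available depends on the StackGres extensions repository):

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: pg-example
spec:
  instances: 3
  postgres:
    version: "16"
    extensions:
      - name: postgis      # fetched and loaded into the running cluster
      - name: pg_cron
  pods:
    persistentVolume:
      size: 10Gi
```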

Chris Engelbert: Right. So the plugin mechanism, or the extension mechanism, that was exactly what I was hinting at. That was mind-blowing. I’ve never seen anything like that when you showed it last year in Ibiza. The other thing that is always a little bit of a head-scratcher, I think, for a lot of people, is when they hear that a Kubernetes operator is actually written in Java. I think Red Hat built the original operator framework, as a Go library, so it kind of makes sense that Red Hat is doing that. And Java would probably not be the first choice to do that. So how did that happen?

Álvaro Hernández Tortosa: Well, first of all, you’re right. The operator framework is written in Go, and there was nothing else than Go at the time. So we were looking at that, but we had a team of very, very senior Java programmers, and none of them were Go programmers, right? I’ve seen in the Postgres community, and in all the communities, that for people who are more in the DevOps world, switching to Go is a bit more natural. But at the same time, they are not senior from a Go programming perspective, right? The same would have happened with our team. They could have switched from Java to Go, but obviously they wouldn’t have been senior in Go, right? So it would have taken some time to develop those skills. On the other hand, we looked at what the technology behind an operator is. An operator is essentially no more than an HTTP server that receives callbacks from Kubernetes, plus a client, because it makes calls to Kubernetes. And HTTP clients and servers can be written in any language. So we looked at the core, at how complicated this is and how much the operator framework brings to you, and we saw that it was not that much.

And actually, something I just mentioned before: in most operators, the CRDs are kind of generated from your structures, and we really wanted to do it the opposite way. It’s like with databases: either you use an ORM to read your existing database schema, which you developed with all your SQL capabilities, or you just create an object and let it generate the database. I prefer the former. So we did the same thing with the CRDs, right? We wanted to design them first. So Java was more than okay to develop a Kubernetes operator, and our team was expert in Java. By doing it in Java, we were able to be very efficient and deliver a lot of value, a lot of features, very fast, without having to retrain anyone, learn a new language, or learn new skills. On top of this, there’s sometimes a concern that Java requires a JVM, which is kind of a heavy environment, right? It consumes memory, resources, and disk. But by default, StackGres uses a compilation technology called GraalVM, and we built the whole project around it. It allows you to generate native images that are indistinguishable from any other Linux binary you can have on your system. And we deploy StackGres with native images. You can switch to JVM images if you prefer, we publish both, but by default they are native images. So at the end of the day, StackGres is a Linux binary of several megabytes in a container, and that’s it.

Chris Engelbert: That makes sense. And I like that you basically pointed out that the efficiency of the existing developers was much more important than being cool and moving to a new language just because everyone does. So, we talked about the operator quite a bit. What are your general thoughts on databases in the cloud, or specifically in Kubernetes? What are the issues you see, the problems with running a database in such an environment?

Álvaro Hernández Tortosa: Well, it’s a wide topic, right? I think one of the most interesting aspects we’re seeing lately is the concern about cost and performance. There’s kind of a trade-off, as usual: a trade-off between the convenience of “I want to run a database and almost forget about it”, which is why you switch to a cloud managed service. That’s not always true, by the way, because forgetting about it means that nobody is going to tune your database, repack your tables, optimize your queries, or analyze whether you have unused indexes. If you’re very small, that’s more than okay, and you can assume you don’t need to touch your database. But if you grow over a certain level, you’re going to need the same DBAs anyway, at least for everything beyond the basic operations of the database, which are monitoring, high availability, and backups. Those are the three main areas that a managed service provides to you.

So there’s convenience, but then there’s an additional cost. And this additional cost is sometimes quite notable: typically around an 80% premium, applied to (N+1)/N instances, because for many cloud services you need an extra instance. Multiply that by 1.8 and you end up paying two point something times the self-managed cost in the usual case (with N = 2, for example, 3/2 × 1.8 = 2.7). So you’re overpaying, and you need to analyze whether that is good for you from the perspective of convenience, or whether you want something else. On the other hand, almost all cloud services use network disks. These network disks are very good and have improved performance a lot in the last years, but they are still far from the performance of a local drive. Running databases on local drives has its own challenges, but they can be addressed, and you can really, really move the needle. I don’t know if “self-hosting” is the right term to call it, but there is this trend of self-hosting, and the question is whether we could marry it with the simplicity and the convenience of managed services.

If we could combine that with the ability to run in any environment, and at much higher performance, I think that’s an interesting trend right now and a good sweet spot. And Kubernetes, to tie together all the terms you mentioned in the question, is actually one driver towards this goal, because it gives us infrastructure independence and it supports both network disks and local disks equally. It’s an enabler for this pattern, which I see trending more and more now, and one that we are definitely looking forward to.

Chris Engelbert: Right. I like that you pointed out that there are ways to address the local storage issues. Just a shameless plug: we’re actually working on something.

Álvaro Hernández Tortosa: I heard something.

The Biggest Trend in Containers?

Chris Engelbert: Oh, you heard something. (laughing) All right, last question because we’re also running out of time. What do you see as the biggest trend right now in containers, cloud, whatever? What do you think is like the next big thing? And don’t say AI, everyone says that.

Álvaro Hernández Tortosa: Oh, no. Well, you know what? Let me do a shameless plug here, right?

Chris Engelbert: All right. I did one. (laughing)

Álvaro Hernández Tortosa: So there’s a technology we’re working on right now that works for our use case, but will also work for many other use cases, which is what we’re calling dynamic containers. Containers are essentially seen as something static, right? You build a container image with your Dockerfile, or whatever you use, and then that image is static. It is what it is: it contains the layers that you specified, and that’s all. But if you look at any repository on Docker Hub, there are plenty of tags. Take, for example, Postgres. There’s Postgres based on Debian. There’s Postgres based on Alpine. There’s Postgres with this option. Then you want this extension, then you want this other extension. And so there’s a whole variety of images, and each of those images needs to be built, maintained, and updated independently. But they’re very orthogonal: upgrading the Debian base OS has nothing to do with the Postgres layer, which has nothing to do with the Timescale extension, which has nothing to do with whether I want the debug symbols or not. So we’re working on a technology with the goal of letting a user express any combination of items they want for their container and get that container image, without having to rebuild and maintain an image for the specific parameters they want.
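A quick back-of-the-envelope count shows why this combinatorial explosion cannot reasonably be handled by prebuilding every variant; the numbers below are assumed for illustration only, not taken from the episode.

public class ImageCombinations {
    public static void main(String[] args) {
        int baseImages = 2;          // e.g. Debian and Alpine variants
        int postgresMajors = 5;      // supported major versions
        int optionalExtensions = 10; // each either present or absent: 2^10
        long images = (long) baseImages * postgresMajors * (1L << optionalExtensions);
        // 2 * 5 * 1024 = 10,240 images to prebuild, maintain, and patch,
        // even though the layers are orthogonal and change independently.
        System.out.println("Distinct prebuilt images needed: " + images);
    }
}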

Chris Engelbert: Right, and let me guess, that is how the Postgres extension stuff works.

Álvaro Hernández Tortosa: It is meant to be, initially, a solution for the Postgres extensions, but it’s actually quite broad and quite general. For example, I was recently discussing with some folks from the OpenTelemetry community about the OpenTelemetry Collector, which is the router for signals in the OpenTelemetry world. It has the same architecture: it has around 200 plugins, and you don’t want a container image with all 200 plugins, which, because many come from third parties, may potentially have security vulnerabilities. And even if there’s just an update to one of them, you don’t want to update all of them and restart your containers and all that. So why don’t you get a container image with the OpenTelemetry Collector with just this receiver and this exporter, right? So that’s actually probably even more applicable.

Chris Engelbert: Yeah, I think that makes sense. I think that is a really good direction, especially because the original idea behind static containers was that being static gives you some kind of consistency and some certainty about how the container looks, but we figured out over time that that is not the best solution. So I’m really looking forward to that becoming a more general thing.

Álvaro Hernández Tortosa: To be honest, although I call them dynamic containers, from a user perspective they are as static as before. They are dynamic from the registry’s perspective.

Chris Engelbert: Right, okay, fair enough. All right, thank you very much. It was a pleasure, as always, talking to you. And to everyone else: see, hear, or read you next week with my next guest. And thank you, Álvaro, for being here. It was appreciated, as always.

Álvaro Hernández Tortosa: Thank you very much.

The post Production-grade Kubernetes PostgreSQL, Álvaro Hernández appeared first on simplyblock.
