Persistent Volume Claim Archives | simplyblock

Kubernetes Storage 201: Concepts and Practical Examples

Rob Pankow — Mon, 23 Dec 2024 09:08:57 +0000

What is Kubernetes Storage?

Kubernetes storage is a sophisticated ecosystem designed to address the complex data management needs of containerized applications. At its core, Kubernetes storage provides a flexible mechanism to manage data across dynamic, distributed computing environments. It allows your containers to store, access, and persist data with unprecedented flexibility.

Storage Types in Kubernetes

Fundamentally, Kubernetes provides two types of storage: ephemeral volumes are bound to the container’s lifecycle, and persistent volumes survive a container restart or termination.

Ephemeral (Non-Persistent) Storage

Ephemeral storage represents the default storage mechanism in Kubernetes. It provides a temporary storage solution, existing only for the duration of a container’s lifecycle. Therefore, when a container is terminated or removed, all data stored in this temporary storage location is permanently deleted.

This type of storage is ideal for transient data that doesn’t require long-term preservation, such as temporary computation results or cache files. Most stateless workloads utilize ephemeral storage for these kinds of temporary data. That said, a “stateless workload” doesn’t necessarily mean no data is stored temporarily. It means there is no issue if this storage disappears from one second to the next.

Persistent Storage

Persistent storage is a critical concept in Kubernetes that addresses one of the fundamental challenges of containerized applications: maintaining data integrity and accessibility across dynamic and ephemeral computing environments.

Unlike ephemeral storage, which exists only for the lifetime of a container, persistent storage is not bound to the lifetime of a container. Hence, persistent storage provides a robust mechanism for storing and managing data that must survive container restarts, pod rescheduling, or even complete cluster redesigns. You enable persistent Kubernetes storage through the concepts of Persistent Volumes (PV) as well as Persistent Volume Claims (PVC).

Fundamental Kubernetes Storage Entities

Figure 1: The building blocks of Kubernetes Storage

Storage in Kubernetes is built up from multiple entities, depending on how storage is provided and if it is ephemeral or persistent.

Persistent Volumes (PV)

A Persistent Volume (PV) is a slice of storage in the Kubernetes cluster that has been provisioned by an administrator or dynamically created through a StorageClass. Think of a PV as a virtual storage resource that exists independently of any individual pod’s lifecycle. Consequently, this abstraction allows for several key capabilities:

Lifecycle Independence: A PV exists even when the owner-pod is deleted, recreated, or moved to different nodes.
Storage Type Abstraction: A PV is not bound to a specific storage technology. Hence, it can represent various storage types, including:

Persistent Volume Claims (PVC): Requesting Storage Resources

Persistent Volume Claims act as a user’s request for storage resources. Image your PVC as a demand for storage with specific requirements, similar to how a developer requests computing resources.

When a user creates a PVC, Kubernetes attempts to find and bind an appropriate Persistent Volume that meets the specified criteria. If no existing volume is found but a storage class is defined or a cluster-default one is available, the persistent volume will be dynamically allocated.

Key PersistentVolumeClaim Characteristics:

Size Specification: Defines a user storage capacity request
Access Modes: Defines how the volume can be accessed
- ReadWriteOnce (RWO): Allows all pods on a single node to mount the volume in read-write mode.
- ReadWriteOncePod: Allows a single pod to read-write mount the volume on a single node.
- ReadOnlyMany (ROX): Allows multiple pods on multiple nodes to read the volume. Very practical for a shared configuration state.
- ReadWriteMany (RWO): Allows multiple pods on multiple nodes to read and write to the volume. Remember, this could be dangerous for databases and other applications that don’t support a shared state.
StorageClass: Allows requesting specific types of storage based on performance, redundancy, or other characteristics

The Container Storage Interface (CSI)

The Container Storage Interface (CSI) represents a pivotal advancement in Kubernetes storage architecture. Before CSI, integrating storage devices with Kubernetes was a complex and often challenging process that required a deep understanding of both storage systems and container orchestration.

The Container Storage Interface introduces a standardized approach to storage integration. Storage providers (commonly referred to as CSI drivers) are so-called out-of-process entities that communicate with Kubernetes via an API. The integration of CSI into the Kubernetes ecosystem provides three major benefits:

CSI provides a vendor-neutral, extensible plugin architecture
CSI simplifies the process of adding new storage systems to Kubernetes
CSI enables third-party storage providers to develop and maintain their own storage plugins without modifying Kubernetes core code

Volumes: The Basic Storage Units

In Kubernetes, volumes are fundamental storage entities that solve the problem of data persistence and sharing between containers. Unlike traditional storage solutions, Kubernetes volumes are not limited to a single type of storage medium. They can represent:

Volumes provide a flexible abstraction layer that allows applications to interact with storage resources without being directly coupled to the underlying storage infrastructure.

StorageClasses: Dynamic Storage Provisioning

StorageClasses represent a powerful abstraction that enables dynamic and flexible storage provisioning because they allow cluster administrators to define different types of storage services with varying performance characteristics, such as:

High-performance SSD storage
Economical magnetic drive storage
Geo-redundant cloud storage solutions

When a user requests storage through a PVC, Kubernetes tries to find an existing persistent volume. If none was found, the appropriate StorageClass defines how to automatically provision a suitable storage resource, significantly reducing administrative overhead.

Figure 2: Table with features for ephemeral storage and persistent storage

Best Practices for Kubernetes Storage Management

Resource Limitation
- Implement strict resource quotas
- Control storage consumption across namespaces
- Set clear boundaries for storage requests
Configuration Recommendations
- Always use Persistent Volume Claims in container configurations
- Maintain a default StorageClass
- Use meaningful and descriptive names for storage classes
Performance and Security Considerations
- Implement quality of service (QoS) controls
- Create isolated storage environments
- Enable multi-tenancy through namespace segregation

Practical Storage Provisioning Example

While specific implementations vary, here’s a conceptual example of storage provisioning using Helm:

helm install storage-solution storage-provider/csi-driver \
  --set storage.size=100Gi \
  --set storage.type=high-performance \
  --set access.mode=ReadWriteMany

Kubernetes Storage with Simplyblock CSI: Practical Implementation Guide

Simplyblock is a storage platform for stateful workloads such as databases, message queues, data warehouses, file storage, and similar. Therefore, simplyblock provides many features tailored to the use cases, simplifying deployments, improving performance, or enabling features such as instant database clones.

Basic Installation Example

When deploying storage in a Kubernetes environment, organizations need a reliable method to integrate storage solutions seamlessly. The Simplyblock CSI driver installation process begins by adding the Helm repository, which allows teams to easily access and deploy the storage infrastructure. By creating a dedicated namespace called simplyblock-csi, administrators ensure clean isolation of storage-related resources from other cluster components.

The installation command specifies critical configuration parameters that connect the Kubernetes cluster to the storage backend. The unique cluster UUID identifies the specific storage cluster, while the API endpoint provides the connection mechanism. The secret token ensures secure authentication, and the pool name defines the initial storage pool where volumes will be provisioned. This approach allows for a standardized, secure, and easily repeatable storage deployment process.

Here’s an example of installing the Simplyblock CSI driver:

helm repo add simplyblock-csi https://raw.githubusercontent.com/simplyblock-io/simplyblock-csi/master/charts

helm repo update

helm install -n simplyblock-csi --create-namespace \
  simplyblock-csi simplyblock-csi/simplyblock-csi \
  --set csiConfig.simplybk.uuid=[random-cluster-uuid] \
  --set csiConfig.simplybk.ip=[cluster-ip] \
  --set csiSecret.simplybk.secret=[random-cluster-secret] \
  --set logicalVolume.pool_name=[cluster-name]

Advanced Configuration Scenarios

1. Performance-Optimized Storage Configuration

Modern applications often require precise control over storage performance, making custom StorageClasses invaluable.

Firstly, by creating a high-performance storage class, organizations can define exact performance characteristics for different types of workloads. The configuration sets a specific IOPS (Input/Output Operations Per Second) limit of 5000, ensuring that applications receive consistent and predictable storage performance.

Secondly, bandwidth limitations of 500 MB/s prevent any single application from monopolizing storage resources, promoting fair resource allocation. The added encryption layer provides an additional security measure, protecting sensitive data at rest. This approach allows DevOps teams to create storage resources that precisely match application requirements, balancing performance, security, and resource management.

# Example StorageClass configuration
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-performance-storage
provisioner: csi.simplyblock.io
parameters:
  qos_rw_iops: "5000"    # High IOPS performance
  qos_rw_mbytes: "500"   # Bandwidth limit
  encryption: "True"      # Enable encryption

2. Multi-Tenant Storage Setup

As a large organization or cloud provider, you require a robust environment and workload separation mechanism. For that reason, teams organize workloads between development, staging, and production environments by creating a dedicated namespace for production applications.

Therefore, the custom storage class for production workloads ensures critical applications have access to dedicated storage resources with specific performance and distribution characteristics.

The distribution configuration with multiple network domain controllers (NDCs) provides enhanced reliability and performance. Indeed, this approach supports complex enterprise requirements by enabling granular control over storage resources, improving security, and ensuring that production workloads receive the highest quality of service.

# Namespace-based storage isolation
apiVersion: v1
kind: Namespace
metadata:
  name: production-apps

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: encrypted-volume
  annotations:
    simplybk/secret-name: encrypted-volume-keys
spec:
  storageClassName: encrypted-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

Multipath Storage Configuration

Network resilience is a critical consideration in enterprise storage solutions. Hence, multipath storage configuration provides redundancy by allowing multiple network paths for storage communication. By enabling multipathing and specifying a default network interface, organizations can create more robust storage infrastructures that can withstand network interruptions.

The caching node creation further enhances performance by providing an intelligent caching layer that can improve read and write operations. Furthermore, this configuration supports load balancing and reduces potential single points of failure in the storage network.

cachingnode:
  create: true
  multipathing: true
  ifname: eth0  # Default network interface

Best Practices for Kubernetes Storage with Simplyblock

Always specify a unique pool name for each storage configuration
Implement encryption for sensitive workloads
Use QoS parameters to control storage performance
Leverage multi-tenancy features for environment isolation
Regularly monitor storage node capacities and performance

Deletion and Cleanup

# Uninstall the CSI driver
helm uninstall "simplyblock-csi" --namespace "simplyblock-csi"

# Remove the namespace
kubectl delete namespace simplyblock-csi

The examples demonstrate the flexibility of Kubernetes storage, showcasing how administrators can fine-tune storage resources to meet specific application requirements while maintaining performance, security, and scalability. Try simplyblock for the most flexible Kubernetes storage solution on the market today.

The post Kubernetes Storage 201: Concepts and Practical Examples appeared first on simplyblock.

Avoiding Storage Lock-in: Block Storage Migration with Simplyblock

Rob Pankow — Tue, 27 Aug 2024 00:08:26 +0000

Storage and particularly block storage is generally easy to migrate. It doesn’t create vendor lock-in, which is very different from most database systems. Therefore, it’s worth to briefly line out where this difference comes from.

Why is Vendor Lock-In Dangerous?

For most companies, data is the most crucial part of their business. Therefore, it is dangerous to forfeit the control on how to store this incredibly important good and storage vendor lock-in poses a significant risk to these businesses. When a company becomes overly reliant on a single storage provider or technology, it can find itself trapped in an inflexible situation with far-reaching consequences.

The dangers of vendor lock-in manifest in several ways:

Limited flexibility when requirements change or scalability needs grow.
Potential cost increases when vendors decide for sudden price rise, knowing that customers depend on their services and migration is complicated or tortuous.
Innovation constraints where other vendors provide advanced features.
Data migration challenges when moving large amounts of data from one system to another which can be complex and expensive.
Reduced bargaining power due to limited alternatives and complex migrations.
Business continuity risks if a vendor faces issues or goes out of business.
Compatibility problems due to proprietary formats or API which limited interoperability and compatibility.

These factors can lead to increased operational costs, decreased competitiveness, and potential disruptions to business continuity. As such, it’s crucial for organizations to carefully consider their storage strategies and implement measures to mitigate risks of vendor lock-in.

Migrating Block Storage on Linux

The interfaces provided by a database system are extremely complex compared to block storage. While some of it is standardized in SQL, there are a lot of system specifics in data and administrative interfaces. Migrating a database system from one to another—or even upgrading a release—requires entire projects.

On the other hand, the block storage interface on Linux is extremely simple in its essence, it allows you to write, read, or delete (trim) a range of blocks. The NVMe protocol itself is a bit more complicated, but is fully standardized (industry standard, managed by NVM Express, Inc.) and the majority of advanced features are neither required nor used. Most commonly they aren’t even accessible through the Linux block storage interface.

In essence, to migrate block storage under Linux, just follow a few simple steps, which have to be performed volume-by-volume: Take your volume offline Create your new volume of the same size or larger Copy or replicate the data on block-level (under Linux just use dd) Potentially resize the filesystem (if necessary) Verify the results In some cases, it is even possible to lower or eliminate the down-time and perform online replication. For example, the Linux Volume Manager (LVM) provides a feature to move data between physical volumes for a particular logical volume under the hood and while the volume is online (pvmove).

When operating in a Kubernetes-based environment, this simple migration is still perfectly available when using file or block storage.

Migration to Simplyblock

Figure 1: Migrating to simplyblock is easy. Any software that writes to raw disks will work as a standard block device.

Migrating from any block storage to a simplyblock logical volume (virtual NVMe block device) is simple and supported through sbcli (the simplyblock command line interface).

Plain Linux

Within a plain linux environment, it is possible to use sbcli migrate with an input of a list of block storage volumes. The necessary and corresponding simplyblock logical volumes are created first. Those volumes may be of the same size or larger. The source volumes are then unmounted, and volume level replication takes place. Finally, source volumes may be deleted and replicated volumes are mounted instead.

Kubernetes

To migrate existing PVCs (Persistent Volume Claim) from any underlying storage, we need to first replicate them into simplyblock. Simplyblock’s internal Kubernetes job sbcli ‑ migrate can automatically select all PVs (Persistent Volume) of a particular type, storage class, or label. During the migration, PVCs may still be active, meaning that PVs can be mounted, but pods must be stopped.

Simplyblock will then create corresponding volumes and PVCs. Afterwards it will replicate the source’s content over to the new volumes, and deploy them under the same mount points.

Optionally, it is possible to resize the volumes during this process and to automatically delete the originals when the process finishes successfully.

Migration from Simplyblock

Migrating a specific volume away from simplyblock is just as easy. Outside of Kubernetes, using dd is the easiest way with the source and destination volumes being unmounted and just copied.

Inside a Kubernetes environment, the process of migrating block and file storage is straight-forward, too.

Individual PVs can simply be backed up after deleting the PVC. Make sure that the lifecycle of the PV and PVC aren’t bound, otherwise the PV will be deleted by Kubernetes in the process. Afterwards, the PV can be restored to new volumes and eventually re-mounted as a new PVC.

Velero is a tool that greatly helps to simplify this process.

Simplyblock: Storage without Vendor Lock-in

Utilizing block storage brings the best of all worlds. Easy migration options, compatibility due to standardized interfaces, and the possibility to choose the best tool for the job by mixing different block storage options.

Simplyblock embraces the fact that there is no one-fits-all solution and enables full interoperability and compatibility with the default standard interfaces in modern computing systems, such as block storage and the NVMe protocol. Hence, simplyblock’s logical volumes provide an easy migration path from and to simplyblock.

However, simplyblock logical volumes provide additional features that make users want to stay.

Simplyblock volumes are full copy-on-write block storage devices which enable immediate snapshots and clones. Those can be used for fast backups, or to implement features such as database branching, enabling fast turn-around times when implementing customer facing functionality.

Furthermore, multi-tenancy (including encryption keys per volume) and thin provisioning enable storage virtualization with overprovisioning. Making use of the fact that a typical storage utilization is around 30% brings down bundled storage requirements by 70% and provides a great way to optimize for cost efficiency. Additional features such as deduplication can decrease storage needs even further.

All this and more makes simplyblock, the intelligent storage orchestrator, the perfect storage solution for database operators and everyone who operates stateful Kubernetes workloads that require high performance and low latency block or file storage.

The post Avoiding Storage Lock-in: Block Storage Migration with Simplyblock appeared first on simplyblock.