Persistent Archives | simplyblock https://www.simplyblock.io/blog/tags/persistent/ NVMe-First Kubernetes Storage Platform Mon, 03 Feb 2025 13:47:24 +0000 en-US hourly 1 https://wordpress.org/?v=6.7.1 https://www.simplyblock.io/wp-content/media/cropped-icon-rgb-simplyblock-32x32.png Persistent Archives | simplyblock https://www.simplyblock.io/blog/tags/persistent/ 32 32 Kubernetes Storage 201: Concepts and Practical Examples https://www.simplyblock.io/blog/kubernetes-storage-concepts/ Mon, 23 Dec 2024 09:08:57 +0000 https://www.simplyblock.io/?p=4731 What is Kubernetes Storage? Kubernetes storage is a sophisticated ecosystem designed to address the complex data management needs of containerized applications. At its core, Kubernetes storage provides a flexible mechanism to manage data across dynamic, distributed computing environments. It allows your containers to store, access, and persist data with unprecedented flexibility. Storage Types in Kubernetes […]

The post Kubernetes Storage 201: Concepts and Practical Examples appeared first on simplyblock.

]]>
What is Kubernetes Storage?

Kubernetes storage is a sophisticated ecosystem designed to address the complex data management needs of containerized applications. At its core, Kubernetes storage provides a flexible mechanism to manage data across dynamic, distributed computing environments. It allows your containers to store, access, and persist data with unprecedented flexibility.

Kubernetes Storage 201: Concepts and Practical Examples

Storage Types in Kubernetes

Fundamentally, Kubernetes provides two types of storage: ephemeral volumes are bound to the container’s lifecycle, and persistent volumes survive a container restart or termination.

Ephemeral (Non-Persistent) Storage

Ephemeral storage represents the default storage mechanism in Kubernetes. It provides a temporary storage solution, existing only for the duration of a container’s lifecycle. Therefore, when a container is terminated or removed, all data stored in this temporary storage location is permanently deleted.

This type of storage is ideal for transient data that doesn’t require long-term preservation, such as temporary computation results or cache files. Most stateless workloads utilize ephemeral storage for these kinds of temporary data. That said, a “stateless workload” doesn’t necessarily mean no data is stored temporarily. It means there is no issue if this storage disappears from one second to the next.

Persistent Storage

Persistent storage is a critical concept in Kubernetes that addresses one of the fundamental challenges of containerized applications: maintaining data integrity and accessibility across dynamic and ephemeral computing environments.

Unlike ephemeral storage, which exists only for the lifetime of a container, persistent storage is not bound to the lifetime of a container. Hence, persistent storage provides a robust mechanism for storing and managing data that must survive container restarts, pod rescheduling, or even complete cluster redesigns. You enable persistent Kubernetes storage through the concepts of Persistent Volumes (PV) as well as Persistent Volume Claims (PVC).

Fundamental Kubernetes Storage Entities

The building blocks of Kubernetes Storage (Persistent Volume, Persistent Volume Claim, Container Storage Interface, Volume, Storage Class)
Figure 1: The building blocks of Kubernetes Storage

Storage in Kubernetes is built up from multiple entities, depending on how storage is provided and if it is ephemeral or persistent.

Persistent Volumes (PV)

A Persistent Volume (PV) is a slice of storage in the Kubernetes cluster that has been provisioned by an administrator or dynamically created through a StorageClass. Think of a PV as a virtual storage resource that exists independently of any individual pod’s lifecycle. Consequently, this abstraction allows for several key capabilities:

Persistent Volume Claims (PVC): Requesting Storage Resources

Persistent Volume Claims act as a user’s request for storage resources. Image your PVC as a demand for storage with specific requirements, similar to how a developer requests computing resources.

When a user creates a PVC, Kubernetes attempts to find and bind an appropriate Persistent Volume that meets the specified criteria. If no existing volume is found but a storage class is defined or a cluster-default one is available, the persistent volume will be dynamically allocated.

Key PersistentVolumeClaim Characteristics:

  • Size Specification: Defines a user storage capacity request
  • Access Modes: Defines how the volume can be accessed
    • ReadWriteOnce (RWO): Allows all pods on a single node to mount the volume in read-write mode.
    • ReadWriteOncePod: Allows a single pod to read-write mount the volume on a single node.
    • ReadOnlyMany (ROX): Allows multiple pods on multiple nodes to read the volume. Very practical for a shared configuration state.
    • ReadWriteMany (RWO): Allows multiple pods on multiple nodes to read and write to the volume. Remember, this could be dangerous for databases and other applications that don’t support a shared state.
  • StorageClass: Allows requesting specific types of storage based on performance, redundancy, or other characteristics

The Container Storage Interface (CSI)

The Container Storage Interface (CSI) represents a pivotal advancement in Kubernetes storage architecture. Before CSI, integrating storage devices with Kubernetes was a complex and often challenging process that required a deep understanding of both storage systems and container orchestration.

The Container Storage Interface introduces a standardized approach to storage integration. Storage providers (commonly referred to as CSI drivers) are so-called out-of-process entities that communicate with Kubernetes via an API. The integration of CSI into the Kubernetes ecosystem provides three major benefits:

  1. CSI provides a vendor-neutral, extensible plugin architecture
  2. CSI simplifies the process of adding new storage systems to Kubernetes
  3. CSI enables third-party storage providers to develop and maintain their own storage plugins without modifying Kubernetes core code

Volumes: The Basic Storage Units

In Kubernetes, volumes are fundamental storage entities that solve the problem of data persistence and sharing between containers. Unlike traditional storage solutions, Kubernetes volumes are not limited to a single type of storage medium. They can represent:

Volumes provide a flexible abstraction layer that allows applications to interact with storage resources without being directly coupled to the underlying storage infrastructure.

StorageClasses: Dynamic Storage Provisioning

StorageClasses represent a powerful abstraction that enables dynamic and flexible storage provisioning because they allow cluster administrators to define different types of storage services with varying performance characteristics, such as:

  • High-performance SSD storage
  • Economical magnetic drive storage
  • Geo-redundant cloud storage solutions

When a user requests storage through a PVC, Kubernetes tries to find an existing persistent volume. If none was found, the appropriate StorageClass defines how to automatically provision a suitable storage resource, significantly reducing administrative overhead.

Table with features for ephemeral storage and persistent storage
Figure 2: Table with features for ephemeral storage and persistent storage

Best Practices for Kubernetes Storage Management

  1. Resource Limitation
    • Implement strict resource quotas
    • Control storage consumption across namespaces
    • Set clear boundaries for storage requests
  2. Configuration Recommendations
    • Always use Persistent Volume Claims in container configurations
    • Maintain a default StorageClass
    • Use meaningful and descriptive names for storage classes
  3. Performance and Security Considerations
    • Implement quality of service (QoS) controls
    • Create isolated storage environments
    • Enable multi-tenancy through namespace segregation

Practical Storage Provisioning Example

While specific implementations vary, here’s a conceptual example of storage provisioning using Helm:

helm install storage-solution storage-provider/csi-driver \
  --set storage.size=100Gi \
  --set storage.type=high-performance \
  --set access.mode=ReadWriteMany

Kubernetes Storage with Simplyblock CSI: Practical Implementation Guide

Simplyblock is a storage platform for stateful workloads such as databases, message queues, data warehouses, file storage, and similar. Therefore, simplyblock provides many features tailored to the use cases, simplifying deployments, improving performance, or enabling features such as instant database clones.

Basic Installation Example

When deploying storage in a Kubernetes environment, organizations need a reliable method to integrate storage solutions seamlessly. The Simplyblock CSI driver installation process begins by adding the Helm repository, which allows teams to easily access and deploy the storage infrastructure. By creating a dedicated namespace called simplyblock-csi, administrators ensure clean isolation of storage-related resources from other cluster components.

The installation command specifies critical configuration parameters that connect the Kubernetes cluster to the storage backend. The unique cluster UUID identifies the specific storage cluster, while the API endpoint provides the connection mechanism. The secret token ensures secure authentication, and the pool name defines the initial storage pool where volumes will be provisioned. This approach allows for a standardized, secure, and easily repeatable storage deployment process.

Here’s an example of installing the Simplyblock CSI driver:

helm repo add simplyblock-csi https://raw.githubusercontent.com/simplyblock-io/simplyblock-csi/master/charts

helm repo update

helm install -n simplyblock-csi --create-namespace \
  simplyblock-csi simplyblock-csi/simplyblock-csi \
  --set csiConfig.simplybk.uuid=[random-cluster-uuid] \
  --set csiConfig.simplybk.ip=[cluster-ip] \
  --set csiSecret.simplybk.secret=[random-cluster-secret] \
  --set logicalVolume.pool_name=[cluster-name]

Advanced Configuration Scenarios

1. Performance-Optimized Storage Configuration

Modern applications often require precise control over storage performance, making custom StorageClasses invaluable.

Firstly, by creating a high-performance storage class, organizations can define exact performance characteristics for different types of workloads. The configuration sets a specific IOPS (Input/Output Operations Per Second) limit of 5000, ensuring that applications receive consistent and predictable storage performance.

Secondly, bandwidth limitations of 500 MB/s prevent any single application from monopolizing storage resources, promoting fair resource allocation. The added encryption layer provides an additional security measure, protecting sensitive data at rest. This approach allows DevOps teams to create storage resources that precisely match application requirements, balancing performance, security, and resource management.

# Example StorageClass configuration
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-performance-storage
provisioner: csi.simplyblock.io
parameters:
  qos_rw_iops: "5000"    # High IOPS performance
  qos_rw_mbytes: "500"   # Bandwidth limit
  encryption: "True"      # Enable encryption

2. Multi-Tenant Storage Setup

As a large organization or cloud provider, you require a robust environment and workload separation mechanism. For that reason, teams organize workloads between development, staging, and production environments by creating a dedicated namespace for production applications.

Therefore, the custom storage class for production workloads ensures critical applications have access to dedicated storage resources with specific performance and distribution characteristics.

The distribution configuration with multiple network domain controllers (NDCs) provides enhanced reliability and performance. Indeed, this approach supports complex enterprise requirements by enabling granular control over storage resources, improving security, and ensuring that production workloads receive the highest quality of service.

# Namespace-based storage isolation
apiVersion: v1
kind: Namespace
metadata:
  name: production-apps

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: encrypted-volume
  annotations:
    simplybk/secret-name: encrypted-volume-keys
spec:
  storageClassName: encrypted-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

Multipath Storage Configuration

Network resilience is a critical consideration in enterprise storage solutions. Hence, multipath storage configuration provides redundancy by allowing multiple network paths for storage communication. By enabling multipathing and specifying a default network interface, organizations can create more robust storage infrastructures that can withstand network interruptions.

The caching node creation further enhances performance by providing an intelligent caching layer that can improve read and write operations. Furthermore, this configuration supports load balancing and reduces potential single points of failure in the storage network.

cachingnode:
  create: true
  multipathing: true
  ifname: eth0  # Default network interface

Best Practices for Kubernetes Storage with Simplyblock

  1. Always specify a unique pool name for each storage configuration
  2. Implement encryption for sensitive workloads
  3. Use QoS parameters to control storage performance
  4. Leverage multi-tenancy features for environment isolation
  5. Regularly monitor storage node capacities and performance

Deletion and Cleanup

# Uninstall the CSI driver
helm uninstall "simplyblock-csi" --namespace "simplyblock-csi"

# Remove the namespace
kubectl delete namespace simplyblock-csi

The examples demonstrate the flexibility of Kubernetes storage, showcasing how administrators can fine-tune storage resources to meet specific application requirements while maintaining performance, security, and scalability. Try simplyblock for the most flexible Kubernetes storage solution on the market today.

The post Kubernetes Storage 201: Concepts and Practical Examples appeared first on simplyblock.

]]>
kubernetes-storage-concepts-and-practical-examples-hero building-blocks-of-kubernetes-storage table-features-ephemeral-storage-and-persistent-storage
Kubernetes CSI: Container Attached Storage https://www.simplyblock.io/blog/kubernetes-csi-container-attached-storage-and-container-storage-interface/ Wed, 27 Mar 2024 12:13:27 +0000 https://www.simplyblock.io/?p=306 Containerized services must be stateless, a doctrine that was widely used in the early days of containerization, which came hand-in-hand with microservices. While it makes elasticity easy, these days, we containerize many types of services, such as databases, which cannot be stateless—at least, without losing their meaning. This is where the Kubernetes container storage interface […]

The post Kubernetes CSI: Container Attached Storage appeared first on simplyblock.

]]>
Containerized services must be stateless, a doctrine that was widely used in the early days of containerization, which came hand-in-hand with microservices. While it makes elasticity easy, these days, we containerize many types of services, such as databases, which cannot be stateless—at least, without losing their meaning. This is where the Kubernetes container storage interface (CSI) comes in.

Docker, initially released in 2013, brought containerized applications to the vast majority of users (outside of the Solaris and BSD world), making them a commodity to the masses. Kubernetes, however, eased the process of orchestrating complex container-based systems. Both systems enable data storage options, ephemeral (temporary) or persistent. Let’s dive into concepts of container-attached storage and Kubernetes CSI.

What is Container Attached Storage (CAS)?

When containerized services need disk storage, whether ephemeral or persistent, container-attached storage (or CAS) provides the requested “virtual disk” to the container.

High level architecture diagram for container attached storage (CSI) with Kubernetes

The CAS resources are managed alongside other container resources and are directly attached to the container’s own lifecycle. That means that storage resources are automatically provisioned and potentially de-provisioned. To achieve this functionality, the management of container-attached storage resources isn’t provided by the host operating system but directly integrated into the container runtime environment, hence systems such as Kubernetes, Docker, and others.

Since the storage resource is attached to the container, it isn’t used by the host operating system or other containers. Detaching storage and compute resources provides one of the building blocks of loosely coupled services, which small and independent development teams can easily manage.

From my perspective, five main principles are important to CAS:

  1. Native: Storage resources are a first-class citizen of containerized environments. Therefore, the overall container runtime environment seamlessly integrates with and fully manages it.
  2. Dynamic: Storage resources are (normally) coupled to their container’s lifecycle. This allows for on-demand provisioning of storage volumes whose size and performance profile are tailored to the applications’ needs. The dynamic nature and automatic resource management prevent manual intervention of volumes and devices.
  3. Decoupled: Storage resources are decoupled from the underlying infrastructure, meaning the container doesn’t know (and care) where the provided storage comes from. That makes it easy to provide different storage options, like high performance or highly resilient storage, to different containers. For super-high performance but ephemeral storage, even RAM disks would be an option.
  4. Efficient: By eliminating the need for traditional storage, e.g., local storage, it is easy to optimize resource utilization using special storage clusters, thin provisioning, and over-commitment. It also makes it easy to provide multi-regional backups and enables immediate re-attachment in case the container needs to be rescheduled on another cluster node.
  5. Agnostic: The storage provider can be easily exchanged due to the decoupling of storage resources and container runtime. This prevents vendor lock-in or provides the option to utilize multiple different storage options, depending on the needs of specific applications. A database running in a container will have very different storage requirements from a normal REST API service.

Given the five features above, we have the chance to provide each and every container with the exact storage option necessary. Some may need only ephemeral storage. Hence, temporary storage can be discarded when the container itself stops, while others need persistent storage, which either lives until the container is deleted or, in specific cases, will even survive this to be reattached to a new container (for example, in the case of container migration).

What is a Container Storage Interface (Kubernetes CSI)?

Like everything in Kubernetes, the container-attached storage functionality is provided by a set of microservices orchestrated by Kubernetes itself, making it modular by design. That said, services internally provided and extended by vendors make up the container storage interface (or Kubernetes CSI). Together, they create a well-defined interface for any type of storage option to be plugged into Kubernetes.

The container storage interface defines a standard set of functionalities, some mandatory and some optional, to be implemented by the Kubernetes CSI drivers. Those drivers are commonly provided by the different vendors of storage systems.

Hence, the CSI drivers build bridges between Kubernetes and the actual storage implementation, which can be physical, software-defined, or fully virtual (like an implementation sending all data “stored” to /dev/null). On the other hand, it allows vendors to implement their storage solution as efficiently as possible, providing a minimal set of operations towards provisioning and general management. That way, vendors can choose how to implement storage, with the two main categories being hyperconverged (compute and storage sharing the same cluster nodes), disaggregated, meaning that the actual storage environment is fully separated from the Kubernetes workloads using them, bringing a clear separation of storage and compute resources.

Internal architecture of the container storage interface (CSI) from the storage backend over the CSI driver to the client pod utilizing the persistent volume.

Just like Kubernetes, the container storage interface is developed as a collaborative effort inside the Cloud Native Computing Foundation (better known as CNCF) by members from all sides of the industry, vendors, and users.

The main goal of Kubernetes CSI is to deliver on the premise of being fully vendor-neutral. In addition, it enables parallel deployment of multiple different drivers, offering storage classes for each of them. This provides us, as users, with the ability to choose the best storage technology for each container, even in the same Kubernetes cluster.

As mentioned, the Kubernetes CSI driver interface provides a standard storage (or volume) operation set. These include creation or provisioning, resizing, snapshotting, cloning, and volume deletion. The operations can either be performed directly or through Kubernetes’ container resource descriptors (CRD), integrating into the consistent approach to managing container resources.

Editor’s note: We also have a deep dive into how the Kubernetes Container Storage Interface works.

Kubernetes and Stateful Workloads

For many people, containerized workloads should be fully stateless; in the past, it was the most commonly used mantra. With the rise of orchestration platforms, such as Kubernetes, it also became more typical to deploy more stateful workloads, often due to the simplified deployment. Orchestrators offer features like automatic elasticity, restarting containers after crashes, automatic migration of containers for rolling upgrades, as well as many more typical operational procedures. Having them built-in into an orchestration platform takes a lot of the burden, hence people started to deploy more and more databases.

Databases aren’t the only stateful workloads, though. Other applications and services may also require storage of some kind of state, sometimes as a local cache, using ephemeral storage, and sometimes in a more persistent fashion, as databases.

Benjamin Wootton (then working for Contino, now at Ensemble) wrote a great blog post about the difference between stateless and stateful containers and why the latter is needed. You should read it, but only after this one.

Your Kubernetes Storage with Simplyblock

The container storage interface in Kubernetes serves as the bridge between Kubernetes and external storage systems. It provides a standardized and modular approach to provisioning and managing container-attached storage resources.

By decoupling storage functionality from the Kubernetes core, Kubernetes CSI promotes interoperability, flexibility, and extensibility. This enables organizations to seamlessly leverage a wide range of storage solutions in their Kubernetes environments, tailoring the storage to the needs of each container individually.

With the evolving ecosystem and changing Kubernetes workloads towards databases and other IO-intensive or low-latency applications, storage becomes increasingly important. Simplyblock is your distributed, disaggregated, high-performance, predictable, low-latency, and resilient storage solution. Simplyblock is tightly integrated with Kubernetes through the CSI driver and available as a StorageClass. It enables storage virtualization with overcommitment, thin provisioning, NVMe over TCP access, copy-on-write snapshots, and many more features.

If you want to learn more about simplyblock, read “Why simplyblock?” If you want to get started, we believe in simple pricing.

The post Kubernetes CSI: Container Attached Storage appeared first on simplyblock.

]]>
container-attach-storage-high-level-architecture-diagram internal-architecture-csi-driver-container-pod