The post Serverless Compute Needs Serverless Storage appeared first on simplyblock.
Due to this movement, other cloud operators, many database companies (such as Neon and Nile), and infrastructure teams at large enterprises are building serverless environments, either on their premises or in their private cloud platforms.
While there are great options for serverless compute, providing serverless storage for your serverless platform tends to be more challenging. This challenge is often fueled by a lack of understanding of what serverless storage has to provide and which requirements it must meet.
Serverless architecture is a software design pattern that leverages serverless computing resources to build and run applications without managing the underlying architecture. These serverless compute resources are commonly provided by cloud providers such as AWS Lambda, Google Cloud Functions, or Azure Functions and can be dynamically scaled up and down.
When designing a serverless architecture, you’ll encounter the so-called Function-as-a-Service (FaaS), meaning that the application’s core logic will be implemented in small, stateless functions that respond to events.
That said, typically, several FaaS make up the actual application, sending events between them. Since the underlying infrastructure is abstracted away, the functions don’t know how requests or responses are handled, and their implementations are usually built against a cloud-provider-specific API, resulting in vendor lock-in.
Cloud-vendor-agnostic solutions exist, such as Knative, but they require at least part of the team to manage the Kubernetes infrastructure. They can, however, take that burden away from other internal and external development teams.
While a serverless architecture describes the application design that runs on top of a serverless compute infrastructure, serverless compute itself describes the cloud computing model in which the cloud provider dynamically manages the allocation and provisioning of server resources.
It is essential to understand that serverless doesn’t mean “without servers” but “as a user, I don’t have to plan, provision, or manage the infrastructure.”
In essence, the cloud provider (or whoever manages the serverless infrastructure) takes the burden from the developer. Serverless compute environments fully auto-scale, starting or stopping instances of the functions according to the needed capacity. Due to their stateless nature, it’s easy to stop and restart them at any point in time. That means that function instances are often very short-lived.
Popular serverless compute platforms include AWS Lambda, Google Cloud Functions, and Azure Functions. For self-managed operations, there is Knative (mentioned before), as well as OpenFaaS and OpenFunction (the latter of which seems to have seen less activity recently).
They all enable developers to focus on writing code without managing the underlying infrastructure.
Serverless storage refers to a cloud storage model where the underlying infrastructure, capacity planning, and scaling are abstracted away from the user. With serverless storage, customers don’t have to worry about provisioning or managing storage servers or volumes. Instead, they can store and retrieve data while the serverless storage handles all the backend infrastructure.
Serverless storage solutions come in different forms and shapes, beginning with an object storage interface, such as Amazon S3 or Google Cloud Storage. Object storage is excellent when storing unstructured data, such as documents or media.
Another option that people love to use for serverless storage is serverless databases. Various options are available, depending on your needs: relational, NoSQL, time-series, and graph databases. This might be the easiest way to go, depending on how you need to access data. Examples of such serverless databases include Amazon Aurora Serverless, Google’s Cloud Datastore, and external companies such as Neon or Nile.
When self-managing your serverless infrastructure with Knative or one of the alternatives, you can use Kubernetes CSI storage providers to provide storage to your functions. However, you may add considerable startup time if you choose the wrong CSI driver. I might be biased, but simplyblock is an excellent option with its negligible provisioning and attachment times, as well as features such as multi-attach, where a volume can be attached to multiple functions (for example, to provide a shared set of data).
Most people think of cost-efficiency when it comes to serverless architectures. However, this is only one side of the coin. If your use cases aren’t a good fit for a serverless environment, the opposite can be true—more on when serverless makes sense later.
In serverless architectures, functions are triggered through an event, either from the outside world (like an HTTP request) or an event initiated by another function. If no function instance is up and running, a new instance will be started. The same goes for situations where all function instances are busy. If function instances idle, they’ll be shut down.
Serverless functions usually use a pay-per-use model. A function’s extremely short lifespan can lead to cost reductions over deployment models like containers and virtual machines, which tend to run longer.
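The pay-per-use effect is easy to put into numbers. Below is a minimal back-of-the-envelope sketch using entirely made-up prices and workload figures (none of these values come from a real provider’s price list):

```shell
# Hypothetical pay-per-use vs. always-on comparison; all prices and
# workload numbers are illustrative assumptions, not real cloud pricing.
invocations_per_day=50000
avg_runtime_ms=120
price_per_100ms_nanousd=200          # 0.0000002 USD per 100 ms slice
vm_price_per_hour_nanousd=10000000   # 0.01 USD per hour for a small VM

# Functions are billed in 100 ms slices; round the runtime up.
billed_slices=$(( (avg_runtime_ms + 99) / 100 ))
faas_cost_nanousd=$(( invocations_per_day * billed_slices * price_per_100ms_nanousd ))

# A comparable always-on VM pays for all 24 hours, busy or idle.
vm_cost_nanousd=$(( 24 * vm_price_per_hour_nanousd ))

echo "FaaS: ${faas_cost_nanousd} nano-USD/day, VM: ${vm_cost_nanousd} nano-USD/day"
```

With these assumptions, 50,000 short invocations cost about 0.02 USD per day, while the always-on instance costs 0.24 USD. The comparison flips once functions run nearly continuously, which is part of why fit matters.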
Apart from that, serverless architectures have more benefits. Many of them overlap with those of microservices architectures, but with the premise that serverless is easier to implement and maintain.
First and foremost, serverless solutions are designed for scalability and elasticity. They quickly and automatically scale up and down depending on the incoming workload. It’s all hands-free.
Another benefit is that development cycles are often shortened. Due to the limited size and functionality of a FaaS, changes are fast to implement and easy to test. Additionally, updating the function is as simple as deploying the new version. All existing function instances finish their current work and shut down. In the meantime, the latest version will be started up. Due to its stateless nature, this is easy to achieve.
Writing serverless solutions has the benefits of fast iteration, simplified deployments, and potential cost savings. However, they also come with their own set of complexities.
Designing truly stateless code isn’t easy, at least when we’re not just talking about simple transformation functionality. That’s why a FaaS receives and passes along context information with its events.
What works great for small bits of context becomes challenging for larger pieces. A larger context, or state, can mean many things: simple cross-request information that should be available without being transferred with every request; more involved data, such as lookup information used to enrich and cross-check requests; or genuinely complex data, for example when you want to implement a serverless database. And yes, a serverless database needs to store its data somewhere.
That’s where serverless storage comes in, and simply put, this is why all serverless solutions have state storage alternatives.
Serverless storage refers to storage solutions that are fully integrated into serverless compute environments without manual intervention. These solutions scale and grow according to user demand and complement the pay-by-use payment model of serverless platforms.
Serverless storage lets you store information across multiple requests or functions.
As mentioned above, cloud environments offer a wide selection of serverless storage options. However, all of them are vendor-bound and lock you into their services.
However, when you design your own serverless infrastructure or service, these services don’t help you. It’s up to you to provide the serverless storage. In this case, a cloud-native, serverless-ready storage engine can simplify this task immensely. Whether you want to provide object storage, a serverless database, or file-based storage, an underlying cloud-native block storage solution is the perfect building block underneath. However, this block storage solution needs to scale and grow with your needs, be quick to provision, and support snapshotting, cloning, and attaching to multiple function instances.
Serverless storage has particular properties designed for serverless environments. It needs to keep up with the specific requirements of serverless architectures, most notably short lifetimes, extremely fast scaling (up, down, or restarting), seamless use across multiple versions during updates, and easy integration through the APIs utilized by the FaaS.
The most significant requirements are that the storage must be usable by multiple function instances simultaneously and must be quickly available to new instances on other nodes, regardless of whether those were migrated over or added for scaling out. That means the underlying storage technology must be prepared to handle these tasks with ease.
These are just the most significant requirements, but there are more.
That is quite a list of requirements. To align serverless compute and serverless storage, storage solutions need to provide an efficient and manageable layer that seamlessly integrates with the overall management layer of the serverless platform.
When designing a serverless environment, the storage layer must be designed to keep up with the pace. Simplyblock enables serverless infrastructures to provide dynamic and scalable storage.
To achieve this, simplyblock provides several characteristics that perfectly align with serverless principles.
Simplyblock is the perfect backend storage for all your serverless storage needs while future-proofing your infrastructure. As data grows and evolves, simplyblock’s flexibility and scalability ensure you can adapt without massive overhauls or migrations.
Remember, simplyblock offers powerful features like thin provisioning, storage pooling, and tiering, helping you to provide a cost-efficient, pay-by-use enabled storage solution. Get started now and find out how easy it is to operate services on top of simplyblock.
The post Avoiding Storage Lock-in: Block Storage Migration with Simplyblock appeared first on simplyblock.
For most companies, data is the most crucial part of their business. Therefore, it is dangerous to forfeit control over how this incredibly important asset is stored, and storage vendor lock-in poses a significant risk to these businesses. When a company becomes overly reliant on a single storage provider or technology, it can find itself trapped in an inflexible situation with far-reaching consequences.
The dangers of vendor lock-in manifest in several ways and can lead to increased operational costs, decreased competitiveness, and potential disruptions to business continuity. As such, it’s crucial for organizations to carefully consider their storage strategies and implement measures to mitigate the risks of vendor lock-in.
The interfaces provided by a database system are extremely complex compared to block storage. While some of them are standardized in SQL, there are many system specifics in the data and administrative interfaces. Migrating from one database system to another—or even upgrading a release—can require an entire project.
On the other hand, the block storage interface on Linux is extremely simple in its essence: it allows you to write, read, or delete (trim) a range of blocks. The NVMe protocol itself is a bit more complicated, but it is fully standardized (an industry standard managed by NVM Express, Inc.), and the majority of its advanced features are neither required nor used. Most commonly, they aren’t even accessible through the Linux block storage interface.
In essence, to migrate block storage under Linux, just follow a few simple steps, which have to be performed volume-by-volume:

1. Take your volume offline.
2. Create your new volume of the same size or larger.
3. Copy or replicate the data on block level (under Linux, just use dd).
4. Resize the filesystem, if necessary.
5. Verify the results.

In some cases, it is even possible to lower or eliminate the downtime and perform online replication. For example, the Linux Volume Manager (LVM) provides a feature (pvmove) to move a logical volume’s data between physical volumes under the hood, while the volume stays online.
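These steps can be sketched in a few shell commands. The sketch below uses two plain image files as stand-ins for real block devices (all paths and sizes are examples), so it can be run without root privileges:

```shell
# Volume-by-volume block-level migration, demonstrated on image files
# instead of real block devices (paths are examples).
src=/tmp/old-volume.img
dst=/tmp/new-volume.img

# Stand-in for the source volume, pre-filled with 8 MiB of data.
dd if=/dev/urandom of="$src" bs=1M count=8 status=none

# 1. (With real volumes: unmount / take the volume offline first.)
# 2. Create the new volume of the same size or larger.
truncate -s 16M "$dst"

# 3. Copy the data block-by-block; conv=notrunc keeps the larger size.
dd if="$src" of="$dst" bs=1M conv=notrunc status=none

# 4. (With real volumes: grow the filesystem, e.g. resize2fs, if needed.)
# 5. Verify: the first 8 MiB of both volumes must be identical.
cmp -n $((8 * 1024 * 1024)) "$src" "$dst" && echo "verified"
```

With real block devices, only the paths change (e.g. /dev/sdb to /dev/nvme1n1); the offline/copy/verify sequence stays the same.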
When operating in a Kubernetes-based environment, this simple migration is still perfectly available when using file or block storage.
Migrating from any block storage to a simplyblock logical volume (virtual NVMe block device) is simple and supported through sbcli (the simplyblock command line interface).
Within a plain Linux environment, it is possible to use sbcli migrate with a list of block storage volumes as input. The necessary and corresponding simplyblock logical volumes are created first. Those volumes may be of the same size or larger. The source volumes are then unmounted, and volume-level replication takes place. Finally, the source volumes may be deleted, and the replicated volumes are mounted in their place.
To migrate existing PVCs (Persistent Volume Claims) from any underlying storage, they first need to be replicated into simplyblock. Simplyblock’s internal Kubernetes job sbcli migrate can automatically select all PVs (Persistent Volumes) of a particular type, storage class, or label. During the migration, PVCs may still be active, meaning that PVs can stay mounted, but pods must be stopped.
Simplyblock will then create corresponding volumes and PVCs. Afterwards it will replicate the source’s content over to the new volumes, and deploy them under the same mount points.
Optionally, it is possible to resize the volumes during this process and to automatically delete the originals when the process finishes successfully.
Migrating a specific volume away from simplyblock is just as easy. Outside of Kubernetes, using dd is the easiest way with the source and destination volumes being unmounted and just copied.
Inside a Kubernetes environment, the process of migrating block and file storage is straightforward, too.
Individual PVs can simply be backed up after deleting the PVC. Make sure that the lifecycle of the PV and PVC aren’t bound, otherwise the PV will be deleted by Kubernetes in the process. Afterwards, the PV can be restored to new volumes and eventually re-mounted as a new PVC.
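Decoupling the PV and PVC lifecycles comes down to the PV’s reclaim policy. A minimal sketch of the relevant field (the PV name is hypothetical; dynamically provisioned PVs typically default to Delete):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv            # hypothetical PV name
spec:
  # Retain keeps the volume alive when its PVC is deleted;
  # the default for dynamically provisioned PVs is Delete.
  persistentVolumeReclaimPolicy: Retain
```

On an existing PV, the same change can be applied in place with kubectl patch pv my-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' before deleting the PVC.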
Velero is a tool that greatly helps to simplify this process.
Utilizing block storage brings the best of all worlds: easy migration options, compatibility due to standardized interfaces, and the possibility to choose the best tool for the job by mixing different block storage options.
Simplyblock embraces the fact that there is no one-size-fits-all solution and enables full interoperability and compatibility with the standard interfaces of modern computing systems, such as block storage and the NVMe protocol. Hence, simplyblock’s logical volumes provide an easy migration path both from and to simplyblock.
However, simplyblock logical volumes provide additional features that make users want to stay.
Simplyblock volumes are full copy-on-write block storage devices which enable immediate snapshots and clones. Those can be used for fast backups or to implement features such as database branching, enabling fast turnaround times when implementing customer-facing functionality.
Furthermore, multi-tenancy (including per-volume encryption keys) and thin provisioning enable storage virtualization with overprovisioning. Making use of the fact that typical storage utilization is around 30%, this brings down bundled storage requirements by up to 70% and provides a great way to optimize for cost efficiency. Additional features such as deduplication can decrease storage needs even further.
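The utilization claim is simple arithmetic; a quick sketch (the 30% figure is the article’s assumption, and the 100 TB fleet size is invented for illustration):

```shell
# Back-of-the-envelope thin-provisioning math: if provisioned volumes
# are only ~30% utilized on average, pooling them over shared physical
# storage needs ~70% less raw capacity.
provisioned_tb=100          # sum of all logical volume sizes (example)
utilization_pct=30          # assumed average utilization

physical_tb=$(( provisioned_tb * utilization_pct / 100 ))
savings_tb=$(( provisioned_tb - physical_tb ))
echo "physical capacity needed: ${physical_tb} TB (saves ${savings_tb} TB)"
```

In practice some headroom above the raw 30% is kept, since thin-provisioned volumes can grow at any time.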
All this and more makes simplyblock, the intelligent storage orchestrator, the perfect storage solution for database operators and everyone who operates stateful Kubernetes workloads that require high performance and low latency block or file storage.
The post What is Block Storage? appeared first on simplyblock.
Block storage is the most versatile type of storage, as it is the underlying structure of other storage options, such as file or object storage. It is also the most known type of storage since most typical storage media (HDD, SSD, NVMe, …) are exposed to the system as block storage devices.
Block storage devices are split into a number of independent blocks. Each block has a logical block address (LBA) which uniquely identifies it. Furthermore, the blocks are all the same size for the same block device, and typically only one piece of information can be stored within a single block (as it is the smallest addressable unit).
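This fixed-size addressing is what makes block devices so simple: a block’s byte offset is just its LBA multiplied by the block size. The sketch below demonstrates this on a sparse image file standing in for a block device, assuming 4 KiB blocks (the path and block size are arbitrary choices):

```shell
# Address blocks by LBA on a sparse image file acting as a block device.
img=/tmp/blockdev.img
bs=4096                               # assumed block size: 4 KiB
truncate -s $((1024 * bs)) "$img"     # a "device" with 1024 blocks

# Write a payload into the block at LBA 42 (seek counts in units of bs,
# so the data lands at byte offset 42 * 4096).
printf 'hello-block' | dd of="$img" bs=$bs seek=42 conv=notrunc status=none

# Read block 42 back: skip jumps over the first 42 blocks.
dd if="$img" bs=$bs skip=42 count=1 status=none | head -c 11
```

The same seek/skip arithmetic is exactly what filesystems and databases do internally when they translate a logical position into a block address.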
When an application wants to write a file, it is first determined whether the file fits into a single block. If this is the case, it’s an easy operation: find a free (unused) block and write the file to it.
If the file is larger than a single block, it is split into multiple parts, with each part being written to a separate free block. The order or consecutive positioning of these blocks is not guaranteed.
Anyhow, after the file is written to one or more blocks, the block address(es) are written to a lookup table. The lookup table is provided through the filesystem that was installed onto the block device and varies depending on the filesystem in use. If you’ve ever heard the term Inode in Linux, that’s part of the lookup mechanism.
When reading the file, the blocks and their read order are looked up in the lookup table based on the filename, and the block storage reads the requested blocks back into memory, where the file is pieced together in the right order.
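On Linux you can peek at this bookkeeping with stat, which reports a file’s inode number and how many blocks are allocated to it (the path below is just an example):

```shell
# The inode is the entry point of the lookup: it records which blocks
# hold a file's data. stat exposes the inode number and block usage.
f=/tmp/lookup-demo.txt
head -c 10000 /dev/zero > "$f"    # a ~10 KB file spans multiple blocks

# %i = inode number, %b = allocated blocks, %B = size of each counted
# block (512 bytes on Linux, independent of the filesystem block size)
stat -c 'inode=%i blocks=%b block-size=%B' "$f"
```

For a 10,000-byte file you would expect roughly 20 of those 512-byte units, though filesystems may round allocation up to their own block size.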
Due to the unique characteristics of block storage, it can be used for any kind of use case. Typical simple use cases include computer storage, including virtual hard drives for virtual machines, being used to store and boot the operating system.
Where block storage really shines, though, is when high performance is required, or when I/O-intensive, latency-sensitive, and mission-critical workloads, such as relational or transactional databases, time-series databases, or container storage, need storage. In these cases, the common claim holds: the faster, the better.
A transactional workload is a series of changes from different users, meaning that the database receives reads and writes from various users over time. Related modifications need to be atomic (that is, happen at once or not at all), which is known as a transaction. A common example of a transactional workload is a banking system, where multiple (money) transactions happen in parallel.
Due to the nature of block storage, where each block is an independent unit, databases can optimally read and write data, either with a filesystem in between, or taking on the role of managing the block assignment themselves. With a growing data set, the underlying physical storage can be split into multiple devices, or even multiple storage nodes. The logical view of a block storage device stays intact.
Virtual machines and containers are designed to be a flexible way to place workloads on machines, isolated from each other. This flexibility requires storage which is just as flexible and can easily be grown in size and migrated to other locations (servers, data centers, or operating environments). While alternative storage technologies are available, none of them is as flexible as pure block storage devices.
Workloads with high data velocity, meaning rapidly changing data, oftentimes within seconds, need storage solutions that can keep up with the speed of writes and reads. Typical use cases of such workloads include big data analytics, but also real-time use cases, such as GPS tracking data (Uber, DHL, etc.). In these cases, directly addressable block access improves read and write performance by removing additional, non-standard access layers.
File level storage, or file storage refers to storage options that work purely on a file level. File storage is commonly associated with local file systems such as NTFS, ext4, or network file systems such as SMB (the Windows file sharing protocol), or NFS.
From a user’s perspective, file storages are easy to use and navigate since their design replicates how we operate with local file systems. They present directories and files and mimic their hierarchical nesting. File storages often provide access control and permissions on a per-file basis.
While easy to use, the way these storages are implemented introduces a single access path; hence, performance can suffer compared to block storage, especially in situations with many concurrent accesses. It also means that interoperability may be reduced compared to a pure block storage device, since not every file system implementation is available on every operating system.
Typically, a file storage is backed by a block storage device in combination with a file system. This file system is either used locally, or made available remotely through one of the available network file systems.
Object storage, sometimes also known as blob storage, is a storage approach which stores information in blobs or objects (which explains the origin of its name). Each object has a variable amount of metadata attached to it, and is globally uniquely identifiable.
The object identities are commonly collected and managed by the application that stores or reads the file. These identities are commonly represented by URIs, because most object storages (these days) are based on HTTP services, such as AWS S3 or Azure Blob Storage. This means that typical file access patterns aren’t available and that object storages most often require application changes. The S3 protocol is currently a de facto standard across many object storage implementations; yet, not all of them (especially those of other cloud providers) implement it, meaning that implementations aren’t necessarily compatible or interchangeable.
Object storages, while versatile, impact the performance and accessibility of files. The additional protocol overhead, as well as access patterns are great for unstructured, static files, such as images, video data, backup files, and similar, but aren’t a good fit for frequently accessed or updated data.
In summary, block storage, file storage, and object storage each offer distinct advantages and are suited to different use cases. While block storage excels in performance-critical applications, file storage is ideal for shared file access, and object storage provides scalable storage for unstructured data.
Anyhow, each of those storage types is available via one or more storage-as-a-service offerings. While some may be compatible inside their own category, others are not. Being able to interchange implementations, or to change cloud providers when necessary, may be a requirement. Incompatible protocols, especially in the case of object storages, paired with the performance overhead compared to block storage, make basic block storage the tool of choice for most use cases.
Simplyblock offers a highly distributed, NVMe-optimized block storage solution. It combines the performance and capacity of many storage devices throughout the attached cluster nodes and enables the creation of logical block devices of various sizes and performance characteristics. Imagine virtualization, but for your storage. To build your own Amazon EBS-like storage solution today, you can get started right away. An overview of all the features, such as snapshots, copy-on-write clones, online scalability, and much more, can be found on our feature page.
The post Simplyblock for AWS (demo videos) appeared first on simplyblock.
We have recently conducted a demo showcasing the capabilities of simplyblock’s high-performance block storage on the AWS platform. The demo offers a glimpse into the seamless integration, exceptional performance, and cost-effectiveness that simplyblock brings to AWS users. In this article, we delve into the highlights of the simplyblock demo and explore how it can revolutionize storage infrastructure on AWS.
During the simplyblock demo, AWS users can witness firsthand the unparalleled performance of simplyblock’s block storage solution. Leveraging the power of local NVMe storage attached to EC2 instances, simplyblock achieved impressive IOPS rates, ensuring lightning-fast data access and processing. The demo showcases the seamless scalability and linear performance scaling of simplyblock’s cluster-based architecture, allowing organizations to handle demanding workloads with ease.
One of the key takeaways from the simplyblock demo is its cost-effectiveness. By offering a highly efficient alternative to traditional AWS storage solutions, simplyblock demonstrated its ability to significantly reduce storage costs for AWS users. The demo highlights how simplyblock’s solution leverages resource pooling, thin provisioning, and effective compaction technology to optimize storage utilization and drive substantial cost savings. With simplyblock, organizations can achieve superior performance while staying within budgetary constraints.
The simplyblock demo for AWS showcases the game-changing potential of our block storage solution. With its unmatched performance, seamless scalability, and impressive cost efficiency, simplyblock offers AWS users a powerful storage alternative that can revolutionize their infrastructure. By harnessing the performance capabilities of local NVMe storage and leveraging innovative storage technologies, simplyblock opens up new possibilities for organizations seeking optimal performance, scalability, and cost-effectiveness in their AWS environments. With simplyblock, AWS users can unlock the true potential of their storage infrastructure and propel their business forward into a new era of performance and efficiency.
Read more about our block storage for AWS on our website: https://www.Simplyblock.io/aws-storage