When data grows, storage needs to grow, too. That’s when remotely attached SAN (Storage Area Network) systems come in. So far, these have commonly been connected through one of three protocols: Fibre Channel, Infiniband, or iSCSI, with the latter at the “low end” of the spectrum since it requires no special hardware to operate. NVMe over Fabrics (NVMe-oF), and specifically NVMe over TCP (NVMe/TCP) as the successor of iSCSI, is on the rise to replace these legacy protocols and bring immediate improvements in latency, throughput, and IOPS.
iSCSI is a protocol that connects remote storage solutions (commonly hardware storage appliances) to storage clients. The latter are typically servers without (or with minimal) local storage, as well as virtual machines. In recent years, we have also seen iSCSI being used as a backend for container storage.
iSCSI stands for Internet Small Computer System Interface and encapsulates standard SCSI commands within TCP/IP packets. That means iSCSI works over commodity Ethernet networks, removing the need for specialized hardware such as dedicated network adapters and switches.
The iSCSI standard was first released in early 2000, a world that was very different from today. Do you remember what a phone looked like in 2000?
That said, while the first flash-based systems were already available, prices were still outrageous, and storage systems were designed with spinning disks in mind. Remember that; we’ll come back to it later.
SCSI, or, you guessed it, the Small Computer System Interface, is a set of standards for connecting and transferring data between computers and peripheral devices. Originally developed in the 1980s, SCSI has been a foundational technology for data storage interfaces, supporting various device types, primarily hard drives, optical drives, and scanners.
While SCSI kept improving, with new commands added for newer technologies over the years, its foundation is still rooted in the early 1980s. Nevertheless, several standards still carry the SCSI command set today, most notably SAS (servers) and iSCSI, while SATA (home computers) descends from the same era’s ATA lineage.
Non-Volatile Memory Express (NVMe) is a modern PCI Express-based (PCIe) storage interface. With the original specification dating back to 2011, NVMe is engineered specifically for solid-state drives (SSDs) connected via the PCIe bus. Therefore, NVMe devices are directly connected to the CPU and other devices on the bus, increasing throughput and reducing latency. NVMe dramatically reduces latency and increases input/output operations per second (IOPS) compared to traditional storage interfaces.
As part of the NVMe standard, additional specifications are being developed, such as the transport specification, which defines how NVMe commands are transported (e.g., via the PCI Express bus, but also via networking protocols like TCP/IP).
Traditional spinning hard disk drives (HDDs) rely on physical spinning platters and movable read/write heads to write or access data. When data is requested, the read/write head must physically move to the correct location in the platter stack, resulting in significant access latencies of 10 to 14 milliseconds.
Flash storage, including NVMe devices, eliminates the mechanical parts, utilizing NAND flash chips instead. NAND stores data purely electronically and achieves access latencies as low as 20 microseconds (and even lower on super high-end gear). That makes flash hundreds of times faster than its HDD counterparts.
For a long time, flash storage had the massive disadvantage of limited storage capacity. However, this disadvantage slowly fades away with companies introducing higher-capacity devices. For example, Toshiba just announced a 180TB flash storage device.
Cost, the second significant disadvantage, also keeps falling with improvements in development and production. Technologies like QLC NAND offer incredible storage density for an affordable price.
Anyhow, why am I bringing up the mechanical vs. electrical storage principle? The reason is simple: access latency. SCSI and iSCSI were never designed for super-low-latency devices because such devices didn’t really exist at the time of their development. And while some adjustments were made to the protocol over the years, its fundamental design is outdated and can’t be changed for backward-compatibility reasons.
NVMe over Fabrics (also known as NVMe-oF) is an extension to the NVMe base specification. It allows NVMe storage to be accessed over a network fabric while maintaining the low-latency, high-performance characteristics of local NVMe devices.
NVMe over Fabrics is itself a collection of sub-specifications defining multiple transport layer protocols.
When comparing NVMe over TCP vs iSCSI, we see considerable improvements in all three primary metrics: latency, throughput, and IOPS.
The folks over at Blockbridge ran an extensive comparison of the two technologies, which shows that NVMe over TCP outperformed iSCSI, regardless of the benchmark.
I’ll provide the most critical benchmarks here, but I recommend you read through the full benchmark article right after finishing here.
Anyhow, let’s dive a little deeper into the actual facts on the NVMe over TCP vs iSCSI benchmark.
Editor’s Note: Our Developer Advocate, Chris Engelbert, recently gave a talk at SREcon in Dublin on the performance of NVMe over TCP versus iSCSI, which led to this blog post. Find the full presentation: “NVMe/TCP makes iSCSI look like Fortran.”
Evaluating storage performance involves comparing three major performance indicators: latency, throughput, and IOPS.
Editor’s note: For latency, throughput, and IOPS, we have an exhaustive blog post that goes deeper into the basics, their relationships, and how to calculate them.
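To make those relationships tangible, here is a minimal Python sketch using idealized formulas (real devices deviate, but these are useful for back-of-the-envelope checks of benchmark results): throughput is IOPS multiplied by block size, and Little’s Law ties average latency to queue depth and IOPS.

```python
# Minimal sketch of how the three primary storage metrics relate.
# Idealized formulas -- useful for sanity-checking benchmark numbers.

def throughput_gbit_s(iops: float, block_size_bytes: int) -> float:
    """Throughput = IOPS x block size (converted to Gbit/s)."""
    return iops * block_size_bytes * 8 / 1e9

def avg_latency_us(queue_depth: int, iops: float) -> float:
    """Little's Law: mean latency = outstanding I/Os / completion rate."""
    return queue_depth / iops * 1e6

# Example: 200,000 IOPS at a 4 KiB block size with queue depth 32.
iops = 200_000
print(f"{throughput_gbit_s(iops, 4096):.2f} Gbit/s")  # ~6.55 Gbit/s
print(f"{avg_latency_us(32, iops):.0f} us")           # ~160 us
```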
Comprehensive performance testing involves simulated workloads that mirror real-world scenarios. To simplify this process, benchmarks use tools like FIO (Flexible I/O Tester) to generate consistent, reproducible test data and results across different storage configurations and systems.
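As a sketch of what such a reproducible test could look like, the following Python snippet drives FIO for a 4KiB random-read run and parses its JSON output. The device path is a placeholder, and the JSON field names assume a recent FIO version; direct I/O against a raw device is destructive, so only point it at a disposable test disk.

```python
import json
import subprocess

# Sketch: a reproducible 4 KiB random-read benchmark driven by fio.
# /dev/nvme0n1 is a placeholder -- use a disposable test device only,
# since direct I/O against a raw device destroys its contents.
result = subprocess.run(
    [
        "fio",
        "--name=randread-4k",
        "--filename=/dev/nvme0n1",
        "--rw=randread",
        "--bs=4k",
        "--iodepth=128",
        "--ioengine=libaio",
        "--direct=1",
        "--runtime=60",
        "--time_based",
        "--output-format=json",
    ],
    capture_output=True, text=True, check=True,
)

job = json.loads(result.stdout)["jobs"][0]
print(f"IOPS: {job['read']['iops']:.0f}")
print(f"mean latency: {job['read']['lat_ns']['mean'] / 1000:.1f} us")
```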
When running IOPS-intensive applications, the number of IOPS available from a storage system is critical. IOPS-intensive applications include databases, analytics platforms, asset servers, and similar systems.
Improving IOPS simply by exchanging the storage network protocol is an immediate win.
Using NVMe over TCP instead of iSCSI shows a dramatic increase in IOPS, especially for smaller block sizes. At a 512-byte block size, Blockbridge found an average 35.4% increase in IOPS. At the more common 4KiB block size, the average increase was 34.8%.
That means the same hardware can provide over one-third more IOPS using NVMe over TCP vs iSCSI at no additional cost.
While IOPS-hungry use cases, such as compaction events in databases (e.g., Cassandra), benefit from the immense increase in IOPS, latency-sensitive applications love low access latencies. Latency is the primary factor that drives people to choose local NVMe storage over remotely attached storage, despite knowing many or all of its drawbacks.
Latency-sensitive applications range from high-frequency trading systems, where milliseconds translate into hard money, through telecommunication systems, where latency can introduce synchronization issues, to cybersecurity and threat-detection solutions that need to react as fast as possible.
Therefore, decreasing latency is a significant benefit for many industries and solutions. Apart from that, a lower access latency always speeds up data access, even if your system isn’t necessarily latency-sensitive. You will feel the difference.
Blockbridge found the most significant access latency reduction at a block size of 16KiB with a queue depth of 128 (which I/O-demanding solutions can easily hit). The average latency for iSCSI was 5,871μs, compared to 5,089μs for NVMe over TCP: a 782μs (~13%) decrease in access latency, just by exchanging the storage protocol.
As the third primary metric of storage performance, throughput describes how much data is actually pumped from the disk into your workload.
Throughput is the major factor for applications such as video encoding or streaming platforms, large analytical systems, and game servers streaming massive worlds into memory. Time-series storage, data lakes, and historian databases fall into the same category.
Throughput-heavy systems benefit from higher throughput to get the “job done faster.” Oftentimes, increasing the throughput isn’t easy. You’re either bound by the throughput provided by the disk or, in the case of a network-attached system, the network bandwidth. To achieve high throughput and capacity, remote network storage utilizes high bandwidth networking or specialized networking systems such as Fibre Channel or Infiniband.
Blockbridge ran their tests on a dual-port 100Gbit/s network card, limited by the PCI Express x16 Gen3 bus to a maximum throughput of around 126Gbit/s. Newer PCIe generations achieve much higher throughput, so NVMe devices and NICs are no longer bound by the “limiting” factor of the PCIe bus.
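The roughly 126Gbit/s ceiling can be derived from the PCIe Gen3 signaling rate itself; here’s a quick back-of-the-envelope check in Python:

```python
# Back-of-the-envelope check of the PCIe Gen3 x16 throughput ceiling.
# Gen3 signals at 8 GT/s per lane and uses 128b/130b line coding.
raw_gt_per_s = 8.0          # gigatransfers per second per lane
encoding = 128 / 130        # 128b/130b line-coding efficiency
lanes = 16

usable_gbit_s = raw_gt_per_s * encoding * lanes
print(f"{usable_gbit_s:.1f} Gbit/s")  # ~126.0 Gbit/s
```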
With a 16KiB block size and a queue depth of 32, their benchmark saw a whopping 2.3Gbit/s increase in throughput for NVMe over TCP vs iSCSI. Throughput increased from 10.387Gbit/s with iSCSI to 12.665Gbit/s, an easy 20% on top, again using the same hardware. That’s how you save money.
We’ve seen that NVMe over TCP has significant performance advantages over iSCSI in all three primary storage performance metrics. Nevertheless, there are more advantages to NVMe over TCP vs iSCSI.
Deciding between two technologies isn’t always easy. In the case of NVMe over TCP vs iSCSI, it isn’t that hard. The use cases for new iSCSI deployments are very sparse. From my perspective, the only valid use case is the integration of pre-existing legacy systems that don’t yet support NVMe over TCP.
That’s why simplyblock, as an NVMe over TCP-first solution, still provides iSCSI if you really need it. We offer it precisely because migrations don’t happen overnight. Still, you want to leverage the benefits of newer technologies, such as NVMe over TCP, wherever possible. With simplyblock, logical volumes can easily be provisioned as NVMe over TCP or iSCSI devices. You can even switch over from iSCSI to NVMe over TCP later on.
In any case, you should go with NVMe over TCP when:

- you are building new infrastructure or selecting a new storage solution, or
- your clients and operating systems support NVMe over TCP (modern Linux and Windows Server versions do).

You should stay on iSCSI (or slowly migrate away) when:

- pre-existing legacy clients or appliances in your environment don’t yet support NVMe over TCP.
You see, there aren’t a lot of reasons. Given that, it’s just a matter of selecting your new storage solution. Personally, these days, I would always recommend software-defined storage solutions such as simplyblock, but I’m biased. Anyhow, an SDS provides the best of both worlds: commodity storage hardware (with the option to go all in with your 96-bay storage server) and performance.
Simplyblock demonstrates forward-thinking storage design by supporting both NVMe over TCP and iSCSI, providing customers with the best performance when available and the chance to migrate slowly in the case of existing legacy clients.
Furthermore, simplyblock offers features known from traditional SAN storage systems or “filesystems” such as ZFS. This includes a full copy-on-write backend with instant snapshots and clones. It includes synchronous and asynchronous replication between storage clusters. Finally, simplyblock is your modern storage solution, providing storage to dedicated hosts, virtual machines, and containers. Regardless of the client, simplyblock offers the most seamless integration with your existing and upcoming environments.
As enterprise and cloud computing continue to evolve, NVMe over TCP stands as the technology of choice for remotely attached storage. Firstly, it combines simplicity, performance, and broad compatibility. Secondly, it provides a cost-efficient and scalable solution utilizing commodity network gear.
The protocol’s ongoing development (last specification update May 2024) and increasing adoption show continued improvements in efficiency, reduced latency, and enhanced scalability.
NVMe over TCP represents a significant step forward in storage networking technology. Combining the raw performance of NVMe with the ubiquity of Ethernet networking offers a compelling solution for modern computing environments. While iSCSI remains relevant for specific use cases and during migration phases, NVMe over TCP represents the future and should be adopted as soon as possible.
We, at simplyblock, are happy to be part of this important step in the history of storage.
Yes, NVMe over TCP is superior to iSCSI in almost any way. NVMe over TCP provides lower protocol overhead, better throughput, lower latency, and higher IOPS compared to iSCSI. It is recommended that iSCSI not be used for newly designed infrastructures and that old infrastructures be migrated wherever possible.
NVMe over TCP is superior in all primary storage metrics, meaning IOPS, latency, and throughput. NVMe over TCP shows up to 35% higher IOPS, 25% lower latency, and 20% increased throughput compared to iSCSI using the same network fabric and storage.
NVMe/TCP is a storage networking protocol that utilizes the common internet standard protocol TCP/IP as its transport layer. It is deployed through standard Ethernet fabrics and can be run parallel to existing network traffic, while separation through VLANs or physically separated networks is recommended. NVMe over TCP is considered the successor of the iSCSI protocol.
iSCSI is a storage networking protocol that utilizes the common internet standard protocol TCP/IP as its transport layer. It connects remote storage solutions (commonly hardware storage appliances) to storage clients through a standard Ethernet fabric. iSCSI was initially standardized in 2000. Many companies replace iSCSI with the superior NVMe over TCP protocol.
SCSI (Small Computer System Interface) is a command set for connecting computers and peripheral devices and transferring data between them. Initially developed in the 1980s, SCSI has been a foundational technology for data storage interfaces, supporting various device types such as hard drives, optical drives, and scanners.
NVMe (Non-Volatile Memory Express) is a specification that defines the connection and transmission of data between storage devices and computers. The initial specification was released in 2011. NVMe is designed specifically for solid-state drives (SSDs) connected via the PCIe bus. NVMe devices offer improved latency and performance compared to older standards such as SCSI, SATA, and SAS.
What is NVMe Storage?
While NVMe storage is commonly found in home computers and laptops (in the M.2 form factor), it is designed from the ground up for all types of commodity and enterprise workloads. It guarantees fast load times and response times, even in demanding application scenarios.
The main intention in developing the NVMe storage protocol was to transfer data through the PCIe (PCI Express) bus. Thanks to the protocol’s low overhead, more use cases have since been found through NVMe specification extensions managed by the NVM Express group. Those extensions include additional transport layers, such as Fibre Channel, Infiniband, and TCP (collectively known as NVMe-oF or NVMe over Fabrics).
Traditionally, computers used SATA or SAS (and before that, ATA, IDE, SCSI, …) as their main protocols for data transfers from the disk to the rest of the system. Those protocols were all developed when spinning disks were the prevalent type of high-capacity storage media.
NVMe, on the other hand, was developed as a standard protocol to communicate with modern solid-state drives (SSDs). Unlike traditional protocols, NVMe takes full advantage of SSDs’ capabilities. It also supports much lower latencies, since there are no read/write heads to reposition and no spindles to rotate.
The main reason for developing the NVMe protocol was that SSDs were starting to experience throughput limitations due to the traditional protocols SAS and SATA.
Anyhow, NVMe communicates through a high-speed Peripheral Component Interconnect Express bus (better known as PCIe). The logic for NVMe resides inside the controller chip on the storage adapter board, which is physically located inside the NVMe-capable device. This board is often co-located with controllers for other features, such as wear leveling. When accessing or writing data, the NVMe controller talks directly to the CPU through the PCIe bus.
The NVMe standard defines registers (basically special memory locations) to control the protocol, a command set of possible operations to be executed, and additional features to improve performance for specific operations.
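On Linux, parts of this command set can be exercised through the nvme-cli tool. As a small sketch (the device path is a placeholder, and it requires nvme-cli and root privileges), the following issues an Identify Controller admin command and reads the result as JSON:

```python
import json
import subprocess

# Sketch: issue an NVMe Identify Controller admin command via nvme-cli.
# /dev/nvme0 is a placeholder; requires nvme-cli and root privileges.
out = subprocess.run(
    ["nvme", "id-ctrl", "/dev/nvme0", "--output-format=json"],
    capture_output=True, text=True, check=True,
).stdout

ctrl = json.loads(out)
print("model:", ctrl["mn"].strip())     # model number
print("serial:", ctrl["sn"].strip())    # serial number
print("firmware:", ctrl["fr"].strip())  # firmware revision
```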
Compared to traditional storage protocols, NVMe has much lower overhead and is better optimized for high-speed and low-latency data access.
Additionally, the PCI Express bus can transfer data faster than SATA or SAS links. That means NVMe-based SSDs provide latencies of a few microseconds, compared to the 40-100 microseconds of SATA-based ones.
Furthermore, NVMe storage comes in many different packages, depending on the use case. Many people know the M.2 form factor from home use; however, it is limited in bandwidth due to the few PCIe lanes available on consumer-grade CPUs. Enterprise NVMe form factors, such as U.2, provide more and higher-capacity uplinks. These enterprise types are specifically designed to sustain high throughput for demanding data center workloads, such as high-load databases or ML/AI applications.
Last but not least, NVMe commands can be streamlined, queued, and multipathed for more efficient parsing and execution. Due to the non-rotational nature of solid-state drives, multiple operations can be executed in parallel. This makes NVMe a perfect candidate for tunneling the protocol over high-speed communication links.
NVMe over Fabrics is a tunneling mechanism for accessing remote NVMe devices. It extends the performance of solid-state drive access beyond what traditional tunneling protocols, such as iSCSI, can deliver.
NVMe over Fabrics is directly supported by the NVMe driver stacks of common operating systems, such as Linux and Windows (Server), and doesn’t require additional software on the client side.
At the time of writing, the NVM Express group has standardized the tunneling of NVMe commands through Fibre Channel, Infiniband, and Ethernet, or more precisely, over TCP.
NVMe over Fibre Channel is a high-speed transport that connects NVMe storage solutions to client devices. Fibre Channel, initially designed to transport SCSI commands, had to translate NVMe commands into SCSI commands and back to communicate with newer solid-state hardware. To mitigate that overhead, the Fibre Channel protocol was enhanced to natively support the transport of NVMe commands. Today, it supports native, in-order transfers between NVMe storage devices across the network.
Because Fibre Channel is its own networking stack, cloud providers (at least to my knowledge) don’t offer support for NVMe/FC.
NVMe over TCP provides an alternative way of transferring NVMe communication through a network. In the case of NVMe/TCP, the underlying network layer is the TCP/IP protocol, hence an Ethernet-based network. That increases the availability and commodity of such a transport layer beyond separate and expensive enterprise networks running Fibre Channel.
NVMe/TCP is rising to become the next protocol for mainstream enterprise storage, offering the best combination of performance, ease of deployment, and cost efficiency.
Due to its reliance on TCP/IP, NVMe/TCP can be utilized without additional modifications in all standard Ethernet network gear, such as NICs, switches, and copper or fiber transports. It also works across virtual private networks, making it extremely interesting in cloud, private cloud, and on-premises environments, specifically with public clouds with limited network connectivity options.
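On a Linux client, attaching an NVMe/TCP volume boils down to two nvme-cli commands: discovering the subsystems a target exposes, then connecting to one of them. Here is a sketch; the target address and NQN are placeholders, and it assumes the nvme_tcp kernel module and root privileges:

```python
import subprocess

# Sketch: attach a remote NVMe/TCP volume on a Linux client via nvme-cli.
# The target address and subsystem NQN below are placeholders.
TARGET_ADDR = "192.168.10.20"   # storage server IP (placeholder)
TARGET_PORT = "4420"            # default NVMe/TCP port
SUBSYS_NQN = "nqn.2024-01.io.example:volume1"  # placeholder NQN

# 1. Ask the target which NVMe subsystems it exposes.
subprocess.run(
    ["nvme", "discover", "-t", "tcp", "-a", TARGET_ADDR, "-s", TARGET_PORT],
    check=True,
)

# 2. Connect; the volume then shows up as a local /dev/nvmeXnY device.
subprocess.run(
    ["nvme", "connect", "-t", "tcp",
     "-a", TARGET_ADDR, "-s", TARGET_PORT, "-n", SUBSYS_NQN],
    check=True,
)
```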
A special version of NVMe over Fabrics is NVMe over RDMA (NVMe/RDMA). It implements a direct communication channel between a storage controller and a remote memory region (RDMA = Remote Direct Memory Access). This lowers the CPU overhead of remote access to storage (and other peripheral devices). To achieve that, NVMe/RDMA bypasses the kernel networking stack, avoiding memory copies between the driver stack, the kernel stack, and application memory.
NVMe over RDMA has two sub-protocols: NVMe over Infiniband and NVMe over RoCE (Remote Direct Memory Access over Converged Ethernet). Some cloud providers offer NVMe over RDMA access through their virtual networks.
NVMe over TCP provides performance and latency benefits over the older iSCSI protocol. The improvements include about 25% lower protocol overhead, meaning more actual data can be transferred with every TCP/IP packet, increasing the protocol’s throughput.
Furthermore, NVMe/TCP enables native transfer of the NVMe protocol, removing multiple translation layers between the older SCSI protocol (which is used in iSCSI, hence the name) and NVMe.
That said, the difference is measurable. Blockbridge Networks, a provider of all-flash storage hardware, ran a performance benchmark of both protocols and found a general access latency improvement of up to 20% and an IOPS improvement of up to 35% using NVMe/TCP over iSCSI to access remote block storage.
NVMe storage’s benefits and ability to be tunneled through different types of networks (including virtual private networks in cloud environments through NVMe/TCP) open up a wide range of high-performance, latency-sensitive, or IOPS-hungry use cases.
One prominent example is relational databases with high load or high-velocity data.
No matter how we look at it, the amount of data we need to transfer (quickly) from and to storage devices won’t shrink. NVMe is the current gold standard for high-performance and low-latency storage. Making NVMe available throughout a network and accessing the data remotely is becoming increasingly popular over the still-prevalent iSCSI protocol. The benefits are immediate wherever NVMe-oF is deployed.
The storage solution by simplyblock is designed around the idea of NVMe being the better way to access your data. Built from the ground up to support NVMe throughout the stack, it combines NVMe solid-state drives into a massive storage pool and creates logical volumes, with data spread across all connected storage devices and simplyblock cluster nodes. Simplyblock provides these logical volumes as NVMe over TCP devices, which are directly accessible from Linux and Windows. Additional features such as copy-on-write clones, thin provisioning, compression, and encryption are included.
If you want to learn more about simplyblock, read our feature deep dive. If you want to test it out, get started right away.