PCIe Bifurcation and NVMe RAID in Linux Part 2: Benchmarks and Encryption

📅 April 10, 2023
“How fast is it?”

With the hardware configured, installed, and running, it is time to run a few benchmarks using the Disks benchmarking utility and KDiskMark to get an idea of the maximum synthetic speeds possible with our new arrangement.

Will NVMe RAID utilizing PCIe bifurcation achieve worthwhile results or will this be underwhelming?

Hardware Review

Note: Nobody sponsors this. I explored a project that I liked and wanted to share the results with others. Any links to Amazon are affiliate links to help readers locate the items for more information and to help cover the time spent researching this article since I earn a commission on qualifying purchases at no extra cost to readers.

Part 1 covered the hardware and setup and described PCIe bifurcation, but here is a brief review of the hardware used for this project.

Adapter card (left) with two SN770 NVMe devices installed in the heat sinks. The heat sinks sandwich the NVMe SSDs.

Once installed, the card looks like this and is ready to be installed into the computer.

Unencrypted Benchmarks

We will look at both unencrypted and encrypted LUKS performance, so let us begin with unencrypted single NVMe benchmarks. No RAID yet. Just the individual NVMe SSDs. These numbers will be the highest we can achieve on a PCIe 3.0 system, and we will use these numbers as a baseline.

KDiskMark (Single NVMe)

Single NVMe. Each NVMe was benchmarked independently to show that each is working at full potential.

Good. These numbers are normal and reach about the limits of PCIe 3.0 x4. Each M.2 slot on the adapter card has four PCIe lanes, which puts the ceiling at roughly 3600 MB/s. Had this been installed in a motherboard supporting PCIe 4.0, these numbers would be higher. But PCIe 3.0 is what I have on hand.
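KDiskMark drives fio under the hood, so a rough approximation of its sequential read preset can be run directly against a test file on the NVMe under test (the file path and size here are examples, not the exact preset KDiskMark uses):

```shell
# Sequential 1 MiB reads, queue depth 8, direct I/O, 10-second timed run.
# Point --filename at a file on the filesystem you want to measure;
# O_DIRECT will not work on tmpfs, so avoid /tmp if it is RAM-backed.
fio --name=seqread --filename=/mnt/nvme/fio-test.bin --size=1G \
    --rw=read --bs=1M --iodepth=8 --numjobs=1 \
    --ioengine=libaio --direct=1 --runtime=10 --time_based
```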

Disks (Single NVMe)

Excellent! The graphs in both show consistent performance without any major dips or spikes. This is the sign of a quality NVMe. Ignore the slight difference in read/write speeds since no two benchmark runs produce identical numbers. The numbers are close enough to assume that both M.2 slots on the adapter card offer identical performance.

These numbers from KDiskMark and Disks establish that the NVMe devices can operate at roughly 3.6/3.3 GB/s for reads and writes. Always keep in mind that real-world usage will be lower. Never become dazzled and awed by benchmark numbers. However, this shows that the hardware has the potential to reach these numbers. That is why we need a baseline.

RAID-0 (WD Black SN770 x2)

Yes, yes, RAID-0 is not really RAID because if one NVMe fails, all data is lost. However, it is too tempting to experiment with RAID-0 just to see what kind of benchmark number we can achieve.

To create the RAID-0 array, use mdadm and treat the NVMe devices like any other hard drive or storage device. Neither the adapter card nor the motherboard supports NVMe RAID (despite calling it that in the BIOS), so we must use a software solution.

sudo apt install mdadm
sudo mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1

Then, format as ext4 in Disks. Also, remove the reserved blocks with tune2fs.

sudo tune2fs -m 0 /dev/md2

(Use the RAID array device. /dev/md2 is only an example.)
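If you prefer the command line over Disks, the formatting step can also be done with mkfs.ext4, and the new array can be verified afterward (the device and label here are examples):

```shell
# Format the new array as ext4 (the label is arbitrary)
sudo mkfs.ext4 -L nvme-raid0 /dev/md2

# Confirm the array is assembled and both members are active
cat /proc/mdstat
sudo mdadm --detail /dev/md2
```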

Oh, wow! Using RAID-0, it is possible to exceed the limits of PCIe 3.0 x4 by using two x4 slots in parallel for reads and writes. These are reaching single PCIe Gen 4.0 x4 numbers.

Disks 100x100M with the same RAID-0 array. Expected numbers. We are essentially doubling what is possible with a single NVMe in a single PCIe 3.0 x4 M.2 slot even though it requires two PCIe 3.0 x4 M.2 slots to achieve it.

RAID-0 combines two 2TB NVMe devices into a single 4TB storage area. Unless you need to manage multiple NVMe devices as a single storage device for disposable data, such as a temporary scratch area for video editing or data manipulation, I cannot recommend RAID-0 because there is no redundancy involved.

RAID-0 is unusable for my needs, but this was a fun test anyway.

RAID-1 (2x WD Black SN770)

To create a RAID-1, we must first destroy the RAID-0 array. Let us assume that the existing array is /dev/md2 and the two member drives (the two SN770 NVMe devices) are /dev/nvme1n1 and /dev/nvme2n1. Always double check with sudo fdisk -l or Disks to avoid ruining the partition table on the wrong device.

sudo mdadm --stop /dev/md2
sudo mdadm --zero-superblock /dev/nvme1n1 /dev/nvme2n1

Make sure to zero out the superblocks so we can start fresh with a new array. I would recommend rebooting the system at this point to ensure that Linux is updated with the current device table information. I encountered issues where the benchmarks would not run between RAID arrays after destroying and recreating many arrays in succession.

Now, create the RAID-1 array using mdadm.

sudo mdadm --create --verbose /dev/md2 --level=1 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1

Format in Disks as ext4 and remove the reserved blocks with tune2fs.

sudo tune2fs -m 0 /dev/md2
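A new RAID-1 array resyncs in the background after creation. Its progress, speed, and estimated time remaining can be checked at any time (the array device is an example):

```shell
# Show resync progress for all md arrays
cat /proc/mdstat

# Or refresh the view every five seconds until the resync finishes
watch -n 5 cat /proc/mdstat
```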

KDiskMark (RAID-1)

Note: All benchmarks were taken while RAID-1 was resyncing. It takes 145 minutes to resync a new SN770 RAID-1 array, and I did not have time to wait.

KDiskMark. These are normal numbers because RAID-1 does not increase the read/write speeds. It only adds redundancy by mirroring the data to both NVMe devices.

Disks (100x100M)


The performance graph is still good with RAID-1. I recommend using paired NVMe devices of the same make and model. Read/write performance is what we should expect from RAID-1. The goal of RAID-1 is redundancy, not speed. However, there should not be too much of a drop from the baseline.

With RAID-1, the end result is an array that reads and writes at a little less than the speed possible with a single NVMe. For redundancy, this is excellent. It definitely beats a mechanical hard drive or even an SSD, but is an NVMe as likely to fail as a mechanical drive with spinning platters? As the comparison above shows, RAID-1 performs lower than a single NVMe.

“Why use RAID-1 NVMe?”

RAID-1 was my original goal. I wanted to create a fast, reliable system (faster than using mechanical drives), so I tried using NVMe. It works as shown above. This achieves that goal, but I have not had a single NVMe fail yet compared to mechanical drives that have failed. Is RAID-1 even necessary for NVMe? Still thinking about that one…

Encryption Benchmarks

Faster data access is only one half of the solution. The other half involves encryption — protecting data at rest when the system is off. Is it possible to encrypt the NVMe devices themselves and the RAID arrays using LUKS?

LUKS is the Linux Unified Key Setup, which provides full-disk encryption in Linux. You will need to install it, but after that, it offers AES encryption on storage devices. It is very useful.

Another full-disk encryption program is VeraCrypt, which offers AES as well as many other time-tested encryption algorithms. I am choosing AES in order to utilize the CPU’s AES-NI instruction set for faster hardware encryption/decryption.

Both single NVMe and RAID arrays can be encrypted using either LUKS or VeraCrypt. When using plain AES, the two performed similarly, though LUKS tended to be a few hundred MB/s faster than VeraCrypt, so I chose LUKS for the majority of this project.
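For reference, encrypting an array (or a single NVMe) with LUKS follows this general shape; the device and mapper names are examples, and luksFormat destroys whatever is on the target:

```shell
# Create a LUKS container on the array (THIS ERASES EXISTING DATA)
sudo cryptsetup luksFormat /dev/md2

# Open the container under a mapper name of your choosing
sudo cryptsetup open /dev/md2 cryptraid

# Format the decrypted mapping and mount it like any block device
sudo mkfs.ext4 /dev/mapper/cryptraid
sudo mount /dev/mapper/cryptraid /mnt
```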

KDiskMark (Single NVMe with LUKS)

Single NVMe LUKS encryption comparison with unencrypted NVMe.

Hmm. It looks like encryption causes NVMe performance to take a significant hit. Notice that the KDiskMark preset used affects the numbers. The NVMe preset yields higher numbers, but the Q1T1 number from the standard preset is the one to watch since it tends to best reflect real-world performance during everyday usage.

Let’s see what LUKS does to RAID.

KDiskMark (Single vs. RAID-0 vs. RAID-1)

What effect does LUKS encryption have on performance? Shown here is a comparison of RAID-0 and RAID-1 with a single NVMe when using LUKS on NVMe RAID arrays. VeraCrypt (AES) is added for comparison (NVMe VeraCrypt preset missing).

Well, this is certainly unexpected. RAID-0 with LUKS performs hardly any better than a single LUKS-encrypted NVMe. This was consistent no matter how many times I reran the RAID-0 LUKS benchmark.

KDiskMark (Multiple RAID-0 + LUKS Benchmarks)

Multiple RAID-0 + LUKS benchmark results shown. Comparison with unencrypted RAID-0. Standard preset.

Unencrypted RAID-0 delivers high numbers, but LUKS and VeraCrypt (AES) crush those numbers. I wanted to be sure that this was not a fluke. No matter how many times I rebooted the system and reran the benchmark, RAID-0 + LUKS (or VeraCrypt AES) consistently returned results worse than a single NVMe with LUKS.

The same is true with RAID-1. Encryption results in slower performance.

RAID-1 also results in lower performance when encryption is used.

The Q1T1 numbers are what I focus on the most since they better reflect the “do something one-at-a-time” nature of the everyday desktop user. The higher sequential numbers seen in Q8T1 and Q32T1 are mostly useful for consecutive file transfers. Unless you are copying files back and forth extensively, these numbers are somewhat irrelevant. This is why I caution against becoming dazzled by high benchmark numbers. Computing is a mixture of different tasks, and specialized numbers, such as Q8T1 or Q32T1, only apply to certain situations.

In any case, these numbers demonstrate one point: RAID with encryption offers little benefit. Disks confirms this too.

Disks (RAID 100x100M)

Disks RAID-0 and RAID-1 comparison with and without LUKS encryption.

When using LUKS with NVMe RAID, RAID-1 performs better than RAID-0. Wow. I was not expecting that. Of course, for protecting data, I would not recommend RAID-0, but these results are surprising coming from an era of mechanical drives. NVMe SSDs are different beasts and require a different mindset to understand.

LVM

Curious, I set up the two SN770 as physical volumes in their own volume group using LVM (Logical Volume Management) just to see what would happen. Yes, we can use these two NVMe SSDs in LVM instead of RAID if we wish. LVM works great.
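The LVM setup was roughly the following; the device, volume group, and logical volume names here are examples:

```shell
# Mark both NVMe devices as LVM physical volumes
sudo pvcreate /dev/nvme1n1 /dev/nvme2n1

# Group them into a single volume group
sudo vgcreate nvmevg /dev/nvme1n1 /dev/nvme2n1

# Carve out one logical volume spanning all free space
sudo lvcreate -l 100%FREE -n nvmelv nvmevg

# Format and use it like any other block device
sudo mkfs.ext4 /dev/nvmevg/nvmelv
```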

NVMe LVM. A single 4TB logical volume.

LVM works fine, but performance is no better than RAID-0 or RAID-1 when using LUKS encryption. By itself, LVM offers the same performance as a single NVMe (as it should), so LVM does not degrade reading or writing.

Real-World File Transfers

What are real-world file transfers like when using real data? I copied data from an external USB 20Gbps device encrypted with VeraCrypt (AES) to RAID-0 + LUKS, RAID-1 + LUKS, and a single NVMe + LUKS to find out how long it would take to copy about 1.8TB worth of various files. A very real-world scenario.

The same files and USB were used in all tests.

On my test system, these are the results:

     RAID-0 + LUKS (4T size) 76m23s             (2465/2147)
     RAID-1 + LUKS (2T size) 102m58s (+26 mins) (2391/1969)
Single NVMe + LUKS (2T size) 79m56s             (2511/2129)

(The 2465/2147 numbers seen at the end of each line denote the measured KDiskMark standard preset results for that configuration. These are the Q8T1 numbers.)

RAID-0 and a single drive are practically the same, showing again that there is no significant reason to use RAID-0 with encryption. It offers no better performance than a single NVMe with LUKS.

RAID-1 took longer, which I expected. In fact, it took an extra 26 minutes to copy data. However, RAID-1 requires rebuilding the array upon creation, and I performed the file transfer while the array was rebuilding. Maybe this affected the times? I did not have time to wait for the 145-minute rebuild to complete given how many times I needed to destroy and rebuild arrays for these tests.

The point here is that despite the previous KDiskMark benchmarks showing RAID-1 faster than RAID-0, RAID-1 was the slowest of the three file transfer tests. Keep in mind that encryption plays a major role in the file transfer. Here is the file transfer process used during each test:

  1. First, the data is stored encrypted on the external USB using VeraCrypt (AES). It must be decrypted.
  2. Second, the data is transferred via USB 20Gbps, which is lower than PCIe 3.0 x4 limits. However, this should not affect performance much given how LUKS and VeraCrypt cause reads and writes to plummet.
  3. Third, data is encrypted using LUKS onto the destination. The same data is encrypted all over again using LUKS.

From these results, RAID is looking less and less appealing when used with NVMe and encryption.

Encryption Lowers Performance

For my case, encryption is a requirement, not an option.

Despite the performance hit, encryption is necessary to help protect the data at rest or in transit on external devices or when transferred across networks. While unencrypted numbers no doubt offer the best performance, I simply must use encryption to secure the data.

Because of this, I need to look at benchmarks performed on encrypted devices, not the unencrypted devices themselves. In every case, LUKS and VeraCrypt resulted in lower performance. Exactly why, I cannot say. Perhaps the Ryzen 5 2600 CPU is too slow to encrypt/decrypt at the NVMe device’s maximum potential?
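One way to test the CPU-bottleneck theory is cryptsetup’s built-in benchmark, which measures raw cipher throughput in memory without touching the disk:

```shell
# Measures encryption/decryption speed using RAM only. If the
# aes-xts throughput reported here is below the NVMe's sequential
# speed, the CPU is the likely bottleneck, not the drive.
cryptsetup benchmark
```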

We know that the NVMe device can read and write to the limits of PCIe 3.0 x4, but with encryption? No way. So this might be a processing issue because on this test system, LUKS numbers would not go any higher than what is shown in this article no matter what I tried. It seems this test system hit a limit.

Mechanical drives are slow enough that a modern CPU can encrypt and decrypt at the drive’s full speed, so encrypted and unencrypted reads and writes proceed at the same rate. But when an NVMe rivals RAM itself, it is much harder for a system to deliver encrypted performance equal to unencrypted performance.

To RAID or Not To RAID

In general, NVMe RAID-1 + LUKS performs better than NVMe RAID-0 + LUKS. That was a surprise. Plus, RAID-1 offers redundancy. NVMe RAID-0 + LUKS performs only slightly better than a single NVMe + LUKS. Because of this and the lack of redundancy, there is no reason to use NVMe RAID-0 at all.

What About RAID-1?

RAID-1 + LUKS was my original goal. The entire reason for using NVMe was to benefit from the faster transfer rates. Plus, the NVMe package itself is tiny compared to 3.5″ mechanical drives or SSDs. So, why not combine the best of both worlds?

Tiny size + super fast NVMe speeds + RAID + encryption

Sounds great on paper, but in reality, at least for my testing, the results were different. Yes, it works, but the RAID-1 + LUKS result (according to the standard KDiskMark preset) was lower than a single NVMe + LUKS.

NVMe Reliability?

This leads to another question: how likely is an NVMe to fail?

The point of using RAID-1 is to offer redundancy. This was necessary during the days of spinning hard drives, where mechanical failure was a real possibility. I have seen and replaced a number of failed mechanical hard drives, and RAID saved the day. So, certainly, RAID made good sense then to protect the data.

These days, NVMe is superior in many ways, and I have not seen a single NVMe fail during my usage. So is it necessary to use NVMe RAID-1? Backups are still important, but redundancy for NVMe? Hmm, I wonder. NVMe RAID-1 + LUKS would sound attractive if not for the fact that a single NVMe + LUKS performs better. In addition, it feels like a waste of NVMe potential just for redundancy given that NVMe devices are rated for a finite lifespan in TBW (terabytes written). Writing the same data to two NVMe devices practically cuts the effective lifespan in half.
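Drive wear can be checked with smartmontools to see how much of the rated TBW has already been consumed (the device name is an example):

```shell
# Look for "Percentage Used" and "Data Units Written" in the output.
# Each data unit is 512,000 bytes, so multiply that count by 512,000
# and divide by 1e12 for an approximate terabytes-written figure.
sudo smartctl -a /dev/nvme0
```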

Other Thoughts

Perhaps another strategy is needed? Maybe RAID is no longer needed for NVMe? If you are not using encryption and need a large pool of fast storage, NVMe RAID-0 is the way to go. RAID-0 doubles the read/write speeds and increases the storage space. Sure, there is no redundancy, but it performs far better than LVM offering the same storage space.

RAID-0 or LVM?

During this project, I created a 4TB RAID-0 array and a 4TB logical volume using the same SN770 NVMe devices in LVM. Both offered 4TB of storage space. Neither approach had redundancy — meaning if one NVMe failed, you lost data. However, RAID-0 was twice as fast because of parallel reading and writing. LVM performance was the same as a single NVMe. No boost at all. Because of this, I would recommend NVMe RAID-0 over NVMe LVM, but only in situations where data is unimportant. For critical data that must be encrypted and redundant, I would avoid both LVM and RAID-0.

Tied to the Computer

Another issue to keep in mind is that PCIe bifurcation and this specific adapter card practically bind these two pieces of hardware together. I would not be able to swap the adapter card with both NVMe SSDs into another computer unless it also allowed x4/x4 PCIe bifurcation. x8/x8 will not work, so I would need a different adapter card for x8/x8 mode or a different motherboard.

Looks good and glimmers but not very portable across systems.

Conclusion

This was a fun project that allowed me to answer a number of questions in my quest for faster data storage. NVMe RAID sounds awesome at first. I was coming from a mechanical hard drive background, but reality shattered those illusions when confronted with NVMe numbers.

Certainly, NVMe RAID will be faster than mechanical hard drive RAID, but the bigger question is, “Is NVMe RAID even worth it?” I would say no unless you either 1) need a large, fast scratch space, or 2) need redundancy. Otherwise, it feels like a waste of high-end NVMe components.

If you need redundancy, then performance is probably not on your mind. In my case, I want both, which was the point of using NVMe in RAID-1. However, that did not work out. A better option might be to use a single NVMe on the adapter card and periodically back up its data to the second NVMe on the same adapter card through cron and scripts. That way, I can benefit from the advantages of a single NVMe and swap it to a different system if needed. Of course, these are just ideas. Options. Options. Options. When it comes to Linux, many possibilities are available.

Encryption, whether using LUKS or VeraCrypt, results in lower transfer speeds, and this becomes very apparent with NVMe devices. Expect benchmark numbers to plummet with encryption.

Overall, I am most happy with how this project went. If you need to use RAID or LVM but lack adequate M.2 slots, PCIe bifurcation will make it possible. Please keep in mind that different systems might yield different results. I can only report on what my test system reports, so your experience might vary.

NVMe RAID + encryption using PCIe bifurcation and Linux is somewhat of an esoteric topic, so there is not much information available — especially with benchmarks. Hopefully, this helps others who might be pondering similar questions.

Have fun!

 
