PCIe Bifurcation and NVMe RAID in Linux Part 1: The Hardware

📅 April 8, 2023
“PCIe Bifur…..WHAT?!”

Ooooh! Sounds fancy, right? On some motherboards, the BIOS will allow a single physical PCIe x16 slot to be divided into two or more logical PCIe slots in order to install multiple NVMe SSDs (two, three, or four) using an adapter card. This is PCIe bifurcation, and Linux is compatible with motherboards that support it.

What would benchmark numbers look like if we put two NVMe devices in RAID-0? How about RAID-1? How well would it compare to a single NVMe? What would be the best data storage arrangement if using NVMe? Are there different techniques to follow compared to RAID with mechanical drives?

These are my experiments in protecting data on a Linux system with fast, redundant storage while exploring PCIe bifurcation on a system running Ubuntu Cinnamon 22.04.

PCIe bifurcation allows you to install two M-key NVMe devices in a single PCIe slot for storage expansion, LVM, or RAID. Each NVMe device gets its own dedicated x4 lanes for maximum speed, limited only by the PCIe slot it is connected to.

An example of the kind of ideal RAID-0 performance we can expect with two SN770 NVMe devices. This is a synthetic benchmark that shows the absolute maximum speeds possible in a PCIe Gen 3.0 slot utilizing PCIe bifurcation. However, do not be moved by benchmark numbers. Real-world performance is more mundane.

Why Not Just Use Motherboard M.2 Slots?

In my case, the entire technique requires PCIe bifurcation because the M.2 slots on the motherboard are inadequate for what I want to achieve: RAID-1 using two M-key NVMe devices.

“If you have two M.2 slots, why not just RAID them together?”

Reason 1. The primary M.2 slot is reserved for the OS NVMe. This is the slot near the CPU, and, as good practice, it should be dedicated to hosting the OS and nothing but the OS for optimal system performance.

Reason 2. The motherboard in the test system I am using is an older X470 that has a single PCIe 3.0 x4 M.2 slot connected to the CPU, and a secondary PCIe 3.0 x2 M.2 slot hosted by the chipset. Both are M-key slots.

Did you catch that limitation?

The second slot only has two PCIe lanes, not four. An NVMe installed in this slot maxes out at roughly PCIe 2.0 x4 speeds (PCIe 3.0 x2 delivers about 2 GB/s, the same bandwidth as PCIe 2.0 x4). If these two slots were RAIDed together, performance would not be optimal. Encryption already reduces performance, so this arrangement would degrade RAID performance even further. Not good.

The solution? PCIe bifurcation!

Since the motherboard M.2 slots are out, the only other (inexpensive) option is PCIe bifurcation. A dual NVMe adapter card plugged into a bifurcated slot provides two dedicated PCIe 3.0 x4 M.2 slots. It also turns out that the test motherboard I have happens to support it.

What Is PCIe Bifurcation?

To bifurcate means to divide into two parts. On the X470 motherboard I am using for testing, only one PCIe 3.0 x16 slot supports PCIe bifurcation, and it must be enabled in BIOS. It will not turn itself on automatically; you must specifically enable it by entering BIOS, which changes the behavior of the x16 slot.

How the PCIe x16 slot is bifurcated on this particular motherboard. Bifurcation takes the x8 half of the slot and creates two virtual x4 slots. The result is referred to as an x4/x4 arrangement as shown here.

Not just any motherboard will work. You cannot take any existing motherboard PCIe slot, pop in a dual NVMe adapter, and expect it to work. It will not. A motherboard must specifically support PCIe bifurcation, and you will need to research and check the motherboard manual in great detail because this is often a hidden feature that does not receive attention.

PCIe Bifurcation Gotchas

PCIe bifurcation opens up new possibilities but also brings a few caveats.

Must have a compatible motherboard

Check the user manual online. Different vendors might call it something else. In my case, it was labeled “NVMe RAID,” not PCIe bifurcation, but it changed the physical x16 slot with x8 lanes (x8 electrical contacts; the rest of the x16 slot has no connections) into a split x4/x4, exactly as PCIe bifurcation should. This means an x8 dual NVMe adapter would work.

Must use a dual NVMe adapter that supports PCIe bifurcation

This is a separate purchase and not included with the motherboard (in my case). If you search for PCIe NVMe adapter cards, there are four kinds to be aware of. Only one will work with PCIe bifurcation.

  1. Single PCIe to NVMe adapter card. Does not work. This allows one NVMe device to be installed in any PCIe slot with at least as many lanes as the card. It is a simple adapter card and should be compatible with any motherboard, including older boards lacking PCIe bifurcation support. They are inexpensive, so if all you need is a single extra NVMe, this is a good solution. You can even use two of them, each in its own PCIe slot, for RAID or LVM. Works great, but I do not have the extra PCIe slots. I only have one, so this option is out.

    A good all-round performer that I have used is this single M-key M.2 slot to PCIe adapter card. It delivers four PCIe lanes to the M-key slot for full performance. You would need two of these, one in each PCIe slot, to support NVMe RAID or LVM without PCIe bifurcation. In my test system, I did not have two available PCIe slots.

  2. “Dual” M.2 adapter card. Does not work. At first glance, the product picture will show space for two NVMe devices on an inexpensive card, but if you read the fine print, one M.2 is B-key for a SATA device and the other is M-key for a true NVMe. This will not work either. I want two M-key NVMe slots for full NVMe performance, and this will not achieve that.

    Even though two M.2 slots are present on this dual M.2 PCIe adapter card, one is B-key (slower SATA) and the other is M-key (faster NVMe). This will not work with PCIe bifurcation and RAID.

  3. Non-bifurcation adapter card. Works, but expensive. This is a PCIe adapter card that hosts its own logic for multiple NVMe devices, but it is horribly expensive. For the price, you might as well purchase a brand new motherboard that supports PCIe bifurcation natively. The purpose is to allow multiple NVMe expansion on any motherboard that lacks PCIe bifurcation, including older boards. I avoided this route because I already had a perfectly fine motherboard on hand, so why buy another if it can be avoided?

    This quad NVMe adapter card allows any motherboard to enjoy four M-key NVMe devices in a single PCIe slot…but it is expensive. Why not just purchase a new motherboard at this price point?

  4. Dual NVMe adapter card (Requires PCIe bifurcation). Works perfectly. This is a PCIe adapter card that has two M-key M.2 slots (some have four, so check carefully) that allow two individual NVMe devices to occupy a single PCIe 3.0/4.0 slot on the motherboard. This is exactly what I need. It costs much, much less than the non-bifurcation cards, but it requires a motherboard that supports PCIe bifurcation.

    This dual M.2 NVMe adapter from 10GTek is the one I used. It supports two M-key NVMe SSDs, each with a full four lanes of PCIe 3.0 or 4.0 goodness. Its cost is low, but it requires a motherboard and BIOS that support PCIe bifurcation.

Must be supported in BIOS

BIOS must have an option that allows you to turn PCIe bifurcation on or off. By default, it is disabled, so look through your BIOS or motherboard manual to find advanced PCIe options. If your motherboard supports PCIe bifurcation, then there will be an option for this in BIOS.

Limited PCIe Slot

You cannot pick and choose which PCIe slot on the motherboard to bifurcate. This is predetermined by your motherboard. In my case, PCIe bifurcation was only supported on the second PCIe 3.0 x16 slot (x8 lanes). The first PCIe x16 slot near the CPU could not be bifurcated. Only the second one.

Other motherboards might vary. Shown here is a secondary PCIe 3.0 x16 slot.

Preset Bifurcation

The BIOS also dictates how the PCIe slot can be bifurcated. You do not get to choose x4/x4 or x8/x8 or x4/x4/x4/x4, for example. On my test system, only the second PCIe slot could be bifurcated, and even then it would only allow x4/x4 mode.

How Bifurcation Works

The motherboard on the test system limited PCIe bifurcation to this PCIe 3.0 x16 slot only. It might be x16 in size, but it is only an x8 lane slot.

This is why I needed to use an x8 PCIe dual NVMe adapter card. Only the first 8 lanes are relevant. The BIOS divides this slot into an x4/x4 arrangement. Both virtual slots are still PCIe 3.0 x4 each.

This dual NVMe adapter card needs a PCIe x8 slot, but it still fits in the x16 slot shown above. Notice the circuit traces? The eight lanes from the PCIe x8 connector are split so that each M.2 slot has its own x4 connection. This is the way to go for maximum performance from both NVMe devices. The PCIe x16 (x8 electrical) slot this is connected to could only be bifurcated into x4/x4, which was perfect. On another motherboard where the PCIe x16 slot could only be bifurcated into x8/x8, this card did not work properly; only one NVMe was recognized. Why? The first x8 of the x8/x8 split was treated as a single slot, so this card was seen as a single NVMe adapter card.

On a different motherboard, only the first PCIe x16 slot near the CPU could be bifurcated, and then only into x8/x8 mode. There was no x4/x4/x4/x4 mode available, which would allow four NVMe devices on a single adapter card. Again, check and double-check the manual.

Linux Works Perfectly with PCIe Bifurcation

PCIe bifurcation is a hardware-level setting, so it is 100% compatible with Linux. There are no drivers to install and no special modifications necessary. You can take your existing Linux system, enable PCIe bifurcation, and Linux will recognize the result without issue. Linux simply sees the NVMe devices as new drives added to the system. Ubuntu Cinnamon 22.04 performed every bit as well after enabling PCIe bifurcation as it did before.

Of course, I have not tried every Linux distribution, but from what I tested, everything worked out of the box with Linux.
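For anyone who prefers the terminal over a GUI, here is a minimal sketch of how to confirm that both drives appear after enabling bifurcation. The device names are examples from my system and will vary, and nvme-cli is an optional package installed separately.

    # List the NVMe controllers Linux detected on the PCIe bus
    lspci | grep -i "non-volatile memory"

    # List block devices; both drives should appear (nvme1n1 and nvme2n1 here)
    lsblk -o NAME,MODEL,SIZE,TRAN

    # Optional: nvme-cli shows model, capacity, and firmware for each device
    sudo nvme list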

The Hardware

Note: Nobody sponsors this. I found a project that I liked and wanted to share the results with others. Any links to Amazon are affiliate links to help readers locate the items and to help cover the time spent researching this article since I earn a commission on qualifying purchases at no extra cost to readers.

WD SN770 NVMe

My experiments will be conducted using two Western Digital Black SN770 NVMe devices. These are truly excellent NVMe SSDs on their own, so I want to see what performance would be like with RAID and PCIe bifurcation.

“If you have a PCIe 3.0 motherboard, why use SN770 PCIe Gen 4.0 NVMe?”

Three reasons: speed, price, and future upgrades.

Speed

I wanted to ensure that the NVMe was not the limiting factor. The SN770 is a speedy device delivering over 5000 MB/s reads in a PCIe Gen 4.0 slot, and it is backwards compatible, so it should be able to saturate a PCIe 3.0 x4 slot. Indeed, I benchmarked this, and a single SN770 maxes out what a PCIe 3.0 M.2 slot can deliver. If I see low benchmark numbers, it will not be because of the SN770.

Price

I found the 2TB SN770 NVMe for the same price as a standard 2.5″ SATA SSD and slower PCIe 3.0 NVMe devices. If they all cost the same, why not buy the fastest of the group?

Future Upgrades

If I upgrade the motherboard in the future to one with PCIe 4.0 slots, then the SN770s get an upgrade too, without needing to purchase new NVMe devices and restore the data. I can use the NVMe devices as-is in future builds. If I bought PCIe 3.0 NVMe now, I would certainly want to upgrade to PCIe 4.0 NVMe later. Why not get a reasonable PCIe 4.0 NVMe now and be ready?

Since these will be used in an existing PCIe motherboard, there is no need to buy the latest and greatest NVMe for twice the cost such as the Samsung 990 Pro. That would be overkill in a PCIe 3.0 system. A PCIe 3.0 M.2 slot is limited to about 3600-3800 MB/s real-world throughput, so the extra cost of a more expensive NVMe would be wasted unless I upgraded the motherboard. What I have works well, so I saw little reason to do that.

Dual M.2 NVMe PCIe Adapter Card

This card has two M.2 slots that only accept M-key NVMe devices. Four lanes from the x8 slot connect to each M.2 slot, so each NVMe gets full x4 bandwidth.

Ventilation holes allow airflow. There are also two green LEDs, one per NVMe and visible through the holes, that blink during NVMe activity.

“What happens if you connect this into a non-bifurcated slot?” It will still work, but only one NVMe will be recognized by the system. Which of the two gets recognized might take some trial and error to determine.

NVMe Heat Sinks

The SN770s run painfully hot under load, so I installed a heat sink on each. This particular heat sink sandwiches the NVMe between thermal pads on the bottom and top. They still become warm, but nothing painfully hot. Screws are included with the heat sinks. The SN770 does not include a heat sink.
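If you want to keep an eye on how warm the drives actually run, nvme-cli (a separate package) can report the composite temperature from SMART data. A quick sketch; the device path is an example from my system:

    # Show SMART data for the first NVMe and pull out the temperature lines
    sudo nvme smart-log /dev/nvme1n1 | grep -i temperature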

The Setup

With the research out of the way and the parts in hand, the first step is to prepare the dual adapter card.

Prepare the Adapter Card

Adapter card (left) with heat sinks installed on both SN770 NVMe devices. Top view and bottom view of heat sinks shown.

Both NVMe devices install easily and fasten in place using M.2 screws. This increases the weight of the card a little, but nothing to be concerned about.

Install in Computer

Installed in test system.

Enable PCIe Bifurcation in BIOS

Motherboards might vary, so check your manual. This option was buried in the Advanced\Onboard Devices Configuration menu. Nowhere does it read “PCIe Bifurcation.” Instead, it refers to it as “PCIe RAID Mode.” Same thing, so select this to switch the PCIe slot into x4/x4 operation. If not enabled, Linux will see only one of the two installed NVMe devices. The description in the box at the bottom explains what the setting does.

NOTE: The name “PCIe RAID Mode” shown in the BIOS is misleading. This is not hardware RAID, nor does it create a RAID array from within BIOS. Any NVMe RAID used on this board must be created as software RAID. All RAID was set up using mdadm in Linux.

Check Disks Utility in Ubuntu

After rebooting the system, open Disks to find out if Ubuntu Cinnamon 22.04 detects both NVMe devices.

Yes! Both SN770 NVMe SSDs are shown in the left pane of Disks. This means PCIe bifurcation is working.

Second NVMe successfully formatted as ext4.
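Disks handles the partitioning and formatting graphically, but the same result can be reached from a terminal. A minimal sketch, assuming the second NVMe shows up as /dev/nvme2n1 as it did on my system (verify yours with lsblk before formatting anything):

    # Create a single GPT partition spanning the drive, then format it ext4
    sudo parted --script /dev/nvme2n1 mklabel gpt mkpart primary ext4 0% 100%
    sudo mkfs.ext4 -L data2 /dev/nvme2n1p1

    # Mount it to confirm the file system works
    sudo mkdir -p /mnt/nvme2
    sudo mount /dev/nvme2n1p1 /mnt/nvme2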

Separate NVMe

Linux sees both NVMe devices as two separate and totally independent devices, and they will have their own device names. In this case, they are identified as /dev/nvme1n1 and /dev/nvme2n1, but this can vary depending upon any other NVMe devices installed in the system.

Treat them like any other storage devices on the system. LVM, RAID, single disks. It is up to your imagination at this point. They can be formatted with different file systems, placed in an NVMe RAID array, or set up as physical volumes in LVM. I tested these situations and they all work flawlessly with Linux.
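As one example of the LVM route, here is a hedged sketch that pools both drives into a single volume group instead of formatting them individually. The device names match my system, and the group and volume names (vg_nvme, lv_data) are placeholders of my own choosing.

    # Mark both NVMe devices as LVM physical volumes
    sudo pvcreate /dev/nvme1n1 /dev/nvme2n1

    # Pool them into one volume group (the name is arbitrary)
    sudo vgcreate vg_nvme /dev/nvme1n1 /dev/nvme2n1

    # Carve out a logical volume using all free space and format it
    sudo lvcreate -l 100%FREE -n lv_data vg_nvme
    sudo mkfs.ext4 /dev/vg_nvme/lv_data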

Quick Benchmarks

Benchmarks will be covered in a separate article because there are a number of RAID-related surprises and performance issues when using LUKS or VeraCrypt full-disk encryption. But to test what these drives can do right now without any RAID or encryption, I ran Disks benchmark and KDiskMark to view the maximum potential possible using PCIe 3.0.
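Both Disks and KDiskMark are graphical tools. For a scriptable rough equivalent of a sequential-read test, here is a sketch using fio (installed separately); the target file path is an example, and its numbers will not match the GUI presets exactly.

    # Sequential 1 MiB reads, queue depth 32, for 30 seconds against a 4 GiB
    # test file on the NVMe file system being measured. Adjust --filename to
    # point at the drive you want to test.
    fio --name=seqread --filename=/mnt/nvme2/fiotest --size=4G \
        --rw=read --bs=1M --iodepth=32 --ioengine=libaio --direct=1 \
        --runtime=30 --time_based --group_reporting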

KDiskMark

Some quick tests comparing the KDiskMark standard preset with the NVMe preset. (Presets can be chosen from the KDiskMark menu.) No encryption used here. Both SN770 NVMe SSDs are on the dual adapter card using PCIe bifurcation.

The KDiskMark tests above show what we should expect from PCIe 3.0 x4 slots. Since each NVMe has its own four lanes (x4) and each SN770 has a theoretical maximum read speed of over 5000 MB/s, both can operate up to the limits of PCIe 3.0 x4 speeds. We know that the SN770s and the dual adapter card will not be the bottlenecks in future tests.

Disks 100x100M

First NVMe in adapter card. Disks shows that we are reaching about the best PCIe 3.0 x4 speeds possible. Graph performance is every bit as good as I was expecting.

Second NVMe in dual adapter card tested. Results are just as good as the first, so both M.2 slots operate at their full potential. From this, we know that both NVMe devices perform identically in the same dual adapter card no matter the M.2 slot.

Quick RAID-0 Test

Happy with these individual results, I had to perform a quick test to see what to expect when both SN770 devices are members of a RAID-0 array created with mdadm.
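The stripe itself was built with mdadm. A minimal sketch of a two-device RAID-0, assuming the same device names as before; double-check yours first, because mdadm will happily overwrite whatever disks you hand it.

    # Create a two-device RAID-0 (stripe) array from the bifurcated NVMe drives
    sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 \
        /dev/nvme1n1 /dev/nvme2n1

    # Check the array status
    cat /proc/mdstat

    # Format and mount the new 4TB stripe
    sudo mkfs.ext4 /dev/md0
    sudo mkdir -p /mnt/raid0
    sudo mount /dev/md0 /mnt/raid0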

Wow! I had calculated about double speeds during synthetic benchmarks, but to actually see it in operation after careful research and planning makes this project worthwhile. Each SN770 is 2TB in size, so RAID-0 doubles the available space to 4TB. However, this is just for testing. I want to see how RAID-1 will work with LUKS encryption to better protect data. RAID-0 stripes data, so it does not offer any protection in the event that one NVMe fails or goes missing. As mentioned in the beginning, these are synthetic numbers, so avoid being dazzled. Real-world performance with encryption is another story.

RAID-0 is one way to break the limits of a single PCIe 3.0 x4 M.2 slot. This is because reads and writes involving RAID-0 occur simultaneously across the two NVMe devices.

Conclusion

After running the RAID-0 test, I was elated with the success of this experiment, so I had high hopes for RAID-1 and encryption. Well, well, well. It turns out that NVMe RAID (both RAID-0 and RAID-1) and encryption introduce their own set of issues that I was not expecting. RAID might have been ideal with mechanical drives, but NVMe devices are a different technology to wrestle with.

Unexpectedly, I discovered that, for what I wanted to do with this configuration, a single NVMe yielded better overall performance than RAID-0. Surprise! Because of this, I found myself wondering if RAID was even worthwhile anymore when dealing with NVMe, but we will look at these details in part 2 of this series.

Part 2 will test various benchmarks and NVMe arrangements using these two SN770 devices. Is it worth the extra effort?

Have fun!
