PCIe x8 slot bifurcation & risers (splitting lanes)

Hey gang, I’m wondering if maybe I just don’t know where to look, but I can’t seem to find any riser cards that split PCIe lanes out to more slots, e.g. from an x8 down to four x1 slots. Everything I find looks like it’s for miners trying to get eight slots out of a single x1 slot.

I want to keep adding cards for high-speed USB and NVMe, but I’m worried sharing one slot will destroy transfer speeds from that USB card to an NVMe PCIe card if they’re both x1 cards sharing the same slot.

Isn’t there a way I can split my x8 slot into 4 or 8 x1 slots so that every card I plug in gets uninterrupted bandwidth? I’ve looked all over Amazon, eBay, forums and so on, but it seems like people only want to split x1 to larger slots and not the other way around like what I’m looking for. I see a couple of used ones on eBay, like item “284918265571”, but they’re wicked expensive and I can’t find reviews on them to know compatibility, or even whether or not I’m buying a rubber ducky.

Question: does what I’m trying to do make sense? Does it maybe not matter if a low-end NVMe shares bandwidth with a USB 3.1 card in an x1 slot? Is there another name for what I’m looking for, or a better place to find these types of riser cards? I really wanna make full use of that open x8 PCIe slot if I can.

It’s a reasonable request, but uncommon in this particular form, which is why you’re not finding much.

At a basic level, PCI Express supports a single device per physical connection (board slot), which negotiates to use a particular number of lanes for communication: x1, x2, x4, x8, or x16. Since this requires physical wiring, it’s the simple baseline everything starts from: one slot, one device.
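
If you want to poke at this yourself on Linux, here’s a rough sketch that just reads the standard sysfs link attributes to show what width each device actually negotiated (some functions don’t expose these, so it skips them):

```python
#!/usr/bin/env python3
# Rough sketch: print negotiated vs. maximum link width for each PCIe device,
# using the standard Linux sysfs attributes (skips functions that lack them).
from pathlib import Path

for dev in sorted(Path("/sys/bus/pci/devices").iterdir()):
    try:
        cur_w = (dev / "current_link_width").read_text().strip()
        max_w = (dev / "max_link_width").read_text().strip()
        cur_s = (dev / "current_link_speed").read_text().strip()
    except OSError:
        continue  # device doesn't report PCIe link attributes
    print(f"{dev.name}: x{cur_w} of x{max_w} @ {cur_s}")
```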

Of course more complexity can always be added, so there are two main ways we get multiple devices onto a single board slot.

The first is for the PCIe controller powering the slot to support treating each group of lanes as belonging to individual devices, when paired with the appropriate card. Today that PCIe controller will be either in the CPU or in the chipset. A bifurcation card (riser) handles power and control signal redistribution, and physically routes each set of lanes to a new physical slot. Then the PCIe controller has to be explicitly configured for the number of new slots. This is the lowest-cost option, and it’s what that eBay item and things like the ASUS 4x NVMe card do.

For AMD desktops, that typically shows up in the BIOS as the ability to change a block of x16 lanes to x8/x8, x8/x4/x4, or x4/x4/x4/x4 configurations. That would allow a riser card to plug into an x16 slot and provide two x8 slots, or four x4 slots. In some cases it also allows plugging into an x8 slot and providing two x4 slots, though this depends on how the board created the x8 slot in the first place.

As far as I am aware, AMD does not supply BIOS components that support any other configurations, so getting down to e.g. x2 or x1 would require a board manufacturer to create custom support, which is unlikely given the lack of a market for it. I also don’t know whether the desktop AMD PCIe controllers support any finer-grained bifurcation. The EPYC I/O controller they’re based on can only get down to x2 in its general configuration; eight x1 links are not possible.

I’m much less familiar with Intel’s current options, but my understanding is they’re even less flexible here.

The other main approach is to add a PCIe controller, a “bridge” in PCI terms, though you’ll often see them referred to by brand as PLX or PEX chips. These can support whatever the electrical engineer desires, and can choose to guarantee individual bandwidth or share it among many devices or slots. They’re also expensive, both the controllers themselves and the riser card designs required to support them, since it’s basically like designing the main board.

Because of the cost, these tend to show up only for specific uses (e.g. NVMe drive aggregation, or combo USB/network/storage cards), and not so much as general-purpose slot expansion, though that does exist too in both small and large forms.

The elephant in the room for all of this is physical: fast PCIe has very tight electrical tolerances, and maintaining them across connectors, boards, and cables is challenging, so chassis constraints come into play for anything low-cost. The cryptocurrency-mining uses you see don’t need the speed at all, since the communication over the PCIe bus is minimal, so they can make do with much worse tolerances.

So you’re not likely to find practical x2 or x1 splits from bifurcation boards. There is a long-running thread here on the forums if you want to go down the bifurcation rabbit hole.

Can you describe more about your use case, though? NVMe on PCIe x1 is already bandwidth-constrained, and I’m not clear what you mean by two cards sharing the same slot.


Thanks a million for the detailed response! What I mean by two cards, one slot: if I used one of those 1x-to-many crypto-riser-card-dealies, I had a feeling I’d have bandwidth or signal integrity issues using the bus with two cards plugged into it (for example a PCIe NVMe card and a PCIe USB card with an external HDD plugged in, then transferring files from the NVMe expansion to the USB drive plugged into the USB expansion, both sharing the same slot). It seems you’ve confirmed that by noting the tight tolerances and signal integrity requirements. I wasn’t sure if I was overthinking things or not, but it seems not.

My use case is that of a polymath lol. I don’t have money for a Threadripper, but I want to do everything I can on this desktop as if it were a workstation. Over the years I’ve gotten into machine learning, audio/video production, web and software development, 3D modelling, some server stuff for media, and lately I’ve been using a fair bit more hard drive space playing with virtual machines, routing with Docker, Kubernetes cluster computing, etc.

Since I don’t have much disposable income, I’ve been buying smaller hard drives as I can afford them over the years and organizing my different projects and boot options across several different physical drives, which has been working out awesome so far.

Fast forward a few years from today, though, and I plan to still be on this AM4 platform. I’m already using all of the physical slots on my motherboard, but a ton of lanes are going completely unused… if I could split my second x16 slot into x1 slots, I’d still have a ton of bandwidth spare for more additions in the years to come. I was hoping I wouldn’t have to be frugal with PCIe just because I’ve run out of physical slots, so long as there was still bandwidth in lanes I could use. That’d be ideal. Right now I’ve got the slots all in use, and it bugs me that there are lanes in the larger slots I wish I could be using for more USB 3.2 on my machine, for example. One can never have too much USB with so many projects / peripherals / external drives ;p


I’m not sure there’s quite as much available bandwidth as you’re thinking, since it tends to be tied up by allocating lanes to devices. For PCIe itself (ignoring protocol overhead):

PCIe3: ~1GB/s @ x1, ~2GB/s @ x2, ~4GB/s @ x4
PCIe4: ~2GB/s @ x1, ~4GB/s @ x2, ~8GB/s @ x4
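
Those figures fall straight out of the per-lane signaling rate and line encoding, if you want to sanity-check other widths yourself; here’s a quick back-of-envelope sketch (raw link throughput only, so real-world numbers land a bit lower):

```python
# Approximate per-direction PCIe throughput from signaling rate and line
# encoding, ignoring packet/protocol overhead.
GT_PER_LANE = {2: 5.0, 3: 8.0, 4: 16.0}                 # GT/s per lane
ENCODING    = {2: 8 / 10, 3: 128 / 130, 4: 128 / 130}   # line-code efficiency

def pcie_gbs(gen, lanes):
    """Rough usable GB/s for a link of `lanes` lanes at PCIe generation `gen`."""
    return GT_PER_LANE[gen] * ENCODING[gen] / 8 * lanes

for gen in (3, 4):
    widths = {f"x{w}": round(pcie_gbs(gen, w), 1) for w in (1, 2, 4, 8, 16)}
    print(f"PCIe{gen}: {widths}")
```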

AM4 has 20 expansion lanes from the CPU, usually arranged as x4 feeding an M.2 slot, and the rest funneled into either a single x16 graphics slot or a pair of slots that will run at x8 each. Then there’s an additional x4 feed to the chipset, which may create other PCIe slots but all multiplexed through the single x4 link.

Fast NVMe is PCIe x4, and for each generation (PCIe3 and 4) there are NVMe devices that can come close to saturating it. So e.g. that ASUS 4x M.2 x16 card I mentioned earlier would pretty much put it at maximum unconstrained storage bandwidth for the entire system. Lower-end NVMe might be closer to x2 in terms of bandwidth, but x1 would likely be a bottleneck on pretty much any of them; might as well use SATA instead at that point. Similarly, a single 10Gbps USB3.2 Gen2 port requires an x2 link on PCIe3. (I haven’t seen any PCIe4 USB controllers yet.)
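
To illustrate the lane math (the ~3.5 GB/s figure for a fast PCIe3 NVMe drive is just an assumed round number):

```python
import math

# Minimum PCIe3 lanes needed to keep a device from being link-limited.
# Illustrative only; protocol overhead pushes the real requirement up a bit.
PCIE3_LANE_GBS = 8 * 128 / 130 / 8        # ~0.985 GB/s per PCIe3 lane

def lanes_needed(device_gbs):
    return math.ceil(device_gbs / PCIE3_LANE_GBS)

print(lanes_needed(10 / 8))   # 10 Gb/s USB3.2 Gen2 port -> 2 lanes
print(lanes_needed(3.5))      # fast PCIe3 NVMe (assumed ~3.5 GB/s) -> 4 lanes
```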

In this scenario I’d probably focus on optimizing for expansion of connection types instead of PCIe in general. So if you have x8 free, let’s see what could be done with a pair of x4 slots.

For USB, you could look for a card that has unconstrained individual controllers for each port. The idea is to take as much PCIe bandwidth as possible to feed some 10Gb/s USB3.2 Gen2 ports, and attach a good hub to each. Then you can connect USB devices to the hubs, and be able to handle a few without much in the way of bottlenecks.

Unfortunately I can’t find ideal hardware examples at the moment. For USB cards, this one is a PCIe3 x4 card but only makes use of x2 with a single controller, so it’s not making the most of the available PCIe lanes, and this one is too big at x8. Sonnet’s other cards are all slower, but that 8-port card has the kind of specs you’d need to look for in an x4 card: dual ASMedia controllers. There should be at least a couple of options out there, just nothing I bookmarked, and I can’t recall any names at the moment.

For hubs, I only have this reference for USB4/TB4. USB4 by nature basically has USB3.2 Gen2 inside, so those will work just fine, but are also a bit overkill in this case. Again nothing else I remembered to bookmark, but the goal is to just find a solid USB3.2 Gen 2 hub.

The cards I mentioned can also do USB3.2 Gen2x2 20Gbps on the ports, but I don’t know if any of the hubs can. That would be an ideal combination in order to maximize aggregate bandwidth for the devices on each hub, although it would be fewer ports (and hubs) on the card before you’d hit the fully-loaded max bandwidth.
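
To put rough numbers on the hub idea, here’s what each busy drive would get from one hub uplink, assuming the bandwidth splits evenly and ignoring protocol overhead (so treat these as upper bounds):

```python
# Back-of-envelope: per-device share of one hub uplink when several external
# drives are busy at once (assumed ideal sharing, no protocol overhead).
def per_device_mb_s(uplink_gbps, busy_devices):
    return uplink_gbps / 8 * 1000 / busy_devices   # Gb/s -> MB/s, split evenly

for uplink in (10, 20):   # USB3.2 Gen2 vs Gen2x2 uplink per port
    shares = [round(per_device_mb_s(uplink, n)) for n in (1, 2, 4)]
    print(f"{uplink} Gb/s uplink: {shares} MB/s for 1/2/4 busy devices")
```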

For direct storage, you could start with SAS as the base interconnect, but probably use SATA drives for cost. For example, an LSI2308-based card providing 8 SAS lanes would have a max of ~6GB/s across all of them, but in an x4 slot that would be down to ~4GB/s. They can be connected to individual drives directly, but you also have the option of using SAS expanders (same concept as USB hubs), either internally or in a separate chassis via the cards that have ports for external cabling. Combine a bunch of relatively slower SATA drives with software RAID, and you could make the most of the bandwidth in aggregate. For more on these components, I’d suggest this YouTube channel.
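
As a rough illustration of the aggregate math (the per-drive speed here is just an assumed figure for a modest SATA HDD, not a spec):

```python
# N SATA drives in software RAID versus the HBA's x4 slot ceiling.
SLOT_GBS  = 3.9   # ~PCIe3 x4 usable bandwidth, GB/s
DRIVE_GBS = 0.2   # assumed sequential throughput of a modest SATA HDD, GB/s

for drives in (4, 8, 16):
    aggregate = min(drives * DRIVE_GBS, SLOT_GBS)
    print(f"{drives} drives: ~{aggregate:.1f} GB/s aggregate "
          f"(slot cap {SLOT_GBS} GB/s)")
```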

Of course this still requires picking up a good x8 to x4/x4 bifurcation card somewhere, and your BIOS to have a bifurcation option that covers the slot. If you don’t need the SATA connectivity, then maybe that single x8 USB card would cover enough.

Both of these approaches still require some investment, but they allow low-cost end devices, so it might work out better from that standpoint.

Not sure any of this really addresses what you’re aiming for, but maybe there’s at least one useful idea in here.


Thank you so much for your careful consideration. The primary objective is expansion slots; bandwidth is secondary. The idea is I would like to be able to keep expanding with x1 cards, because I seriously can’t tell the difference on my low-end NVMe drives enough to warrant using up a bunch of lanes to save a few milliseconds. My bandwidth concern is that if I were to use one of those aforementioned 1x-to-many-1x PCIe splitters, I would get even lower than SATA-level speeds when two drives (or devices like USB) are sharing the same x1 slot.

The idea is I want lots of expansion slots, so I want to take an x16 slot and get sixteen x1 slots out of it so they all have their own dedicated lane. I’m not worried so much about getting 1000 MB/s on my NVMe drives; 500 is still faster than SATA for playing with neural networks and loading large datasets into RAM, and it’s more than enough for transfer speeds through USB since my external drives basically only have USB 3.0 write speeds anyway.

I’m having trouble figuring out a way to split the faster-rated slots into lower-rated dedicated slots so I can expand to my heart’s content with as few traffic jams over the bus as possible, while providing as much connectivity as possible on my Franken-workstation. Almost no expansion cards I want or need actually use enough x4, x8, or x16 bandwidth for it to make sense to occupy all those lanes in a dedicated slot. I just can’t find any way to split one x16 slot into sixteen x1 slots.

So far I haven’t encountered any product configuration that would fill that niche, at least at low cost. The mining-expansion-over-USB3-connector things I’ve seen are PCIe 2.0 on a good day. For what you’re after, you’d need a PCIe 3.0 multiplexing switch. The prices I can find for just a minimal chip are ~$200, so that’s probably the lower bound on the cost of any product that may be out there.

Note that SATA’s ceiling is just under 600 MB/s for SSDs, so a cryptominer-style x1 expansion setup would be slower even with only one device connected.
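
Rough numbers behind that, if it helps (line-rate ceilings only, before protocol overhead):

```python
# Why even one SSD behind a PCIe2 x1 cryptominer-style splitter can't reach
# full SATA3 speed: both links use 8b/10b encoding, so compare line rates.
sata3_gbs   = 6 * 8 / 10 / 8   # ~0.60 GB/s ceiling; real SSDs top out near 0.55
pcie2x1_gbs = 5 * 8 / 10 / 8   # ~0.50 GB/s, shared by everything on the splitter
print(f"SATA3 ~{sata3_gbs:.2f} GB/s vs PCIe2 x1 ~{pcie2x1_gbs:.2f} GB/s")
```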

A multiplexing switch? Is there another term or name for that? I looked it up on Amazon but didn’t find any PCIe cards / risers that match that description. Even at $200, if it allows me to connect like 8 or 16 x1 devices, that’d still be worth it in my eyes.

I’m getting 2x faster reads and writes on NVMe drives connected to x1 slots than on my 2.5" Kingston SSDs. In practice it seems PCIe x1 is much faster than SATA. The 4-port SATA card I have in an x1 slot transfers at standard SATA speeds too, but I think it chugs if I’m moving data between two drives connected to that same card, of course. I have 6 SATA and 2 NVMe drives on my mobo, plus 2 NVMe in their own x1 slots, which has been working out nicely. I just want to take my other x16 slot and get a bunch more use out of those lanes that otherwise do nothing, if possible.


Sorry, I left out a qualifier: “a cryptominer-style PCIe2 x1 expansion setup would be slower”. What motherboard and NVMe adapters do you have currently? If they’re at least PCIe3 (likely), then they will be faster.

“Multiplexing switch” is informal; most of the products I’ve seen mention “switch” or “PEX chip”, but are usually just bundled as PCIe expansion. One thing you could do is search for specific switch chip models. Here’s Broadcom’s list of switch chips; you’ll want to filter for r3.0. To narrow down which one you care about, the lane count would be the total of the uplink plus all slots (e.g. x8 + 8 × x1 = 16), and the port count would be the number of slots plus the uplink (e.g. 9).
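
A trivial helper for working out what to filter for, just restating that arithmetic:

```python
# Total lanes and ports a switch chip would need for a given uplink width and
# list of downstream slot widths (illustrative only).
def switch_requirements(uplink, slots):
    lanes = uplink + sum(slots)   # e.g. x8 uplink + eight x1 slots = 16 lanes
    ports = 1 + len(slots)        # uplink port + one port per slot = 9 ports
    return lanes, ports

print(switch_requirements(8, [1] * 8))     # (16, 9)
print(switch_requirements(16, [1] * 16))   # (32, 17)
```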

It may be possible to link together some seemingly unrelated products. For example, a Linkreal LRNV9349-8I creates 8 distinct ports carrying the equivalent of x4 PCIe3 each (AliExpress: $258). Or the similar 4-port LRNV9347L-4I for $178. Then there’s the Delock 62788 and similar things that seem to be floating around for $60-$110. And you’d need some solid SFF-8643 cables. (These connectors are often called U.2, which is silly because they have nothing to do with U.2 itself, but it’s a term to search for.)

Would that particular ~$500-800 combination actually work for creating new slots? No idea; PCIe wasn’t specifically designed for this, but maybe you can find reports from people who have tried something similar at least. It’s the PCIe3-level signaling that distinguishes these from the typical cryptominer expansion kits; the PCIe2 hardware is much lower cost.


Duuuude!! That is so effn helpful, thank you!!! I wish we could normalize this type of lane splitting.

It makes way more sense to let builders split and position slots themselves rather than leaving it up to motherboard manufacturers to come up with fixed hardware designs. If it were standard that we could all just split our own extra dedicated slots/lanes for ourselves and put PCIe devices wherever they fit best in our cases (thank you Nvidia for 4-slot cards now…), we wouldn’t have to make all these compromises when buying a motherboard, trying to match slot layouts and configurations on top of everything else on the mobo like PCB layers, DIMM channels, BIOS features, I/O, etc. I think most people may not think about doing this, but if it were normal to have the option, I bet everyone would choose to set up their own PCIe slots and lanes themselves rather than be stuck with whatever motherboard designs are offered.

It makes sense for the GPU and NVMe drives to be directly on the board, for sure, but the rest of the expansion and lanes are always problematic for builders. I don’t know why such a flexible standard comes with such rigid constraints. I’m gonna make this my 2023 pet project :smiley:

Hopefully I’ll remember to post updates on this thread later on for anyone else interested in getting the most out of their budget workstations. Thanks again!

