Resurrecting simnet
12 Aug 2019
I’m petrified every time I make a change to the SmartOS networking stack. It feels like I’m always one small mistake away from ruin. If I’m lucky, it’s a kernel panic that affects a handful of people, and I can patch it quickly. When I’m not so lucky, it’s a subtle bug involving a multitude of factors that results in bad checksums, dropped packets, and poor performance. Bugs are inevitable, but that’s of little comfort when you break something that everyone’s traffic runs across. It takes a certain level of paranoia, patience, and vigilance to work on the network stack, but you never have enough to catch all the bugs. To err is human. I’ve been looking for ways to remove some of this human error, and that’s where simnet comes in.
Simnet is a long-forgotten, maybe never discovered, feature introduced during the days of OpenSolaris and carried into the current-day illumos kernel. Don’t go looking for a manual page on it, as one doesn’t exist. The only real documentation is the original PSARC case [1] and an old blog post from Joerg Moellenkamp [2]. Luckily, those are enough to get a good idea of what simnet is and how to use it. To start, I’ll quote some choice sections of PSARC 2009/200.
Simulated network devices (simnet) are pseudo GLDv3 network devices that aid in the creation of point-to-point network links on a system. They are intended to be a testing resource for OpenSolaris developers. Simnets should help in developing test suites that can run with minimal network hardware to test protocols and codepaths that were previously not possible to test on a single system. This case has two components: the new 'simnet' pseudo GLDv3 network device driver and the changes in dladm(1M) to create, modify and delete simnets.
Once created administrators can plumb IP, snoop and create VLANs using the new simnet. In other words upon creation simnets appear as regular Ethernet/WiFi hardware devices.
Using a combination of such links one can simulate a number of network configurations within a single system. The point-to-point nature of the device allows us to write test software that can send and capture packets at both the end-points and verify network software without the distractions that arise from the need and the use of actual network hardware.
Simnets are different from VNICs. Simnets are used to simulate point-to-point links and are MAC-level objects that appear to consumers as real Ethernet/WiFi devices. Therefore, you may create an aggregation on top of a collection of simnets, while such a thing is not possible with VNICs or the existing Etherstubs. Such a configuration would allow the testing of aggregations (and LACP signaling) without having to set up multiple physical point-to-point connections between two systems. A VNIC on the other hand is created over an Etherstub or other existing MAC-level objects (such as Simnets) to provide bandwidth control of traffic on the underlying link.
Put simply, a simnet device is a software simulated Network Interface Card (NIC). To the operating system, it’s no different than a physical NIC. But unlike NICs, we can configure simnets in all sorts of ways to mimic production scenarios without the actual cost and headache of a full production setup. Best of all, we can do this on one physical host with the help of zones, bridges, and IP forwarding (software routing). As Joerg demonstrated in his post: you can create an entire internet in a box.
That’s all well and good, but it’s not what got me excited about simnet. I’ve spent the last three years at Joyent working on the lower layers of the networking stack, specifically the Generic LAN Driver framework and Layer 2 code, known as GLDv3/MAC. As I said earlier, working on this code scares me. All traffic passes through this code. A bug here can be catastrophic.
A particularly tricky aspect of this code is that the data path changes depending on the capabilities of the underlying hardware. A simple example is checksum offload. Any NIC worth its PCI Express lanes provides the ability to fully offload both Tx checksum calculation and Rx checksum verification, relieving the networking stack of performing these tasks on the CPU. A more interesting capability, which goes hand-in-hand with checksum offload, is Large Send Offload (LSO, also called TSO). LSO allows the TCP/IP stack to send large packets (up to 64K for IPv4) down to the NIC, which then segments them into smaller packets that fit the MTU of the link. Once again, the NIC is relieving the CPU of all that work, giving time back to the system to perform other tasks. These capabilities, by design, must affect the data path taken through the network stack. The changes are subtle but often have ripple effects throughout the stack, leading to head-scratching bug reports from users when things go wrong.
While I have the hardware to test various configurations, I don’t have the memory or reliability of a computer. Not only that, but it’s extremely time-consuming to test umpteen combinations of different hardware and software configurations. It’s an especially debilitating prospect when you have to start testing from scratch after each change you make during development (anyone who thinks they can get away with testing only what they changed has never had the pleasure of having the firmware and operating system routinely punch them in the face after making such a proclamation).
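To make the LSO description above a bit more concrete, here is a minimal Python sketch of the segmentation step: one large TCP payload handed down by the stack is cut into pieces no larger than the MSS. This is only an illustration and not the simnet or MAC implementation; the names `Segment` and `lso_segment` are hypothetical, and the MSS of 1460 in the usage lines assumes a 1500-byte MTU with 20-byte IPv4 and TCP headers and no options.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One on-the-wire TCP segment produced by the (simulated) NIC."""
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes  # at most `mss` bytes of data


def lso_segment(payload: bytes, mss: int, start_seq: int = 0) -> List[Segment]:
    """Split one large TCP payload into MSS-sized segments.

    Hypothetical helper: a real device also replicates and fixes up the
    IP/TCP headers (lengths, IP ID, flags, checksums) for every segment.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for off in range(0, len(payload), mss):
        chunk = payload[off:off + mss]
        # TCP sequence numbers are 32 bits wide and wrap around.
        segments.append(Segment((start_seq + off) % 2**32, chunk))
    return segments


if __name__ == "__main__":
    # Assumed numbers: a 65000-byte send over a link with MSS 1460.
    segs = lso_segment(bytes(65000), mss=1460)
    print(len(segs), len(segs[-1].payload))  # prints "45 760"
```

Without LSO the TCP/IP stack performs this loop itself, on the CPU, before handing packets to the driver; with LSO the loop moves into the device, which is why the data path through the stack differs between the two cases.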
Then it hit me: I could give simnet the powers to expose these capabilities! With these new powers in hand I could both a) test different data paths purely in software, and b) write automated regression tests that can be run by anyone on any machine. This second point is particularly important: one of the other issues with testing the network stack is that it requires not only special configurations, but also the knowledge of how to set up those configurations. Simnet, with capability simulation, democratizes this ability and greatly increases the confidence in modifying the network stack in the future.
The desire for this new feature became especially pressing when I realized I had caused a major regression in the IP forwarding code [3]. It was just the sort of thing I could have caught with automated regression tests. With that in mind, I used my failure as an opportunity to avoid making the same mistake twice by creating five regression tests powered by simnet. To write these tests I added the following capabilities to simnet.
- Tx IPv4 header checksum offload
- Tx IPv4 ULP partial checksum offload (Upper Layer Protocol, like TCP)
- Tx IPv4 ULP full checksum offload
- LSO
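The difference between the partial and full ULP checksum offloads listed above can be sketched in a few lines of Python. This follows the way partial offload is commonly described (the stack seeds the checksum field with the pseudo-header sum and the device sums the segment and complements the result); it is an assumption for illustration, not a description of simnet's code. All function names are hypothetical, and the checksum field offset of 16 is the TCP one.

```python
import struct

TCP_CKSUM_OFFSET = 16  # byte offset of the checksum field in a TCP header


def ones_sum(data: bytes, acc: int = 0) -> int:
    """16-bit one's-complement sum (RFC 1071), folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    for (word,) in struct.iter_unpack("!H", data):
        acc += word
    while acc >> 16:  # fold carries back into the low 16 bits
        acc = (acc & 0xFFFF) + (acc >> 16)
    return acc


def pseudo_header(src: bytes, dst: bytes, proto: int, ulp_len: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol, ULP length."""
    return src + dst + struct.pack("!BBH", 0, proto, ulp_len)


def with_cksum_field(segment: bytes, value: int) -> bytes:
    """Return a copy of a TCP segment with its checksum field set to `value`."""
    o = TCP_CKSUM_OFFSET
    return segment[:o] + struct.pack("!H", value) + segment[o + 2:]


def full_offload(src: bytes, dst: bytes, proto: int, segment: bytes) -> int:
    """Full offload: the device does all the work, pseudo-header included."""
    zeroed = with_cksum_field(segment, 0)
    acc = ones_sum(pseudo_header(src, dst, proto, len(zeroed)))
    return ~ones_sum(zeroed, acc) & 0xFFFF


def partial_seed(src: bytes, dst: bytes, proto: int, ulp_len: int) -> int:
    """Partial offload, stack's half: the pseudo-header sum, not complemented."""
    return ones_sum(pseudo_header(src, dst, proto, ulp_len))


def partial_finish(seeded_segment: bytes) -> int:
    """Partial offload, device's half: sum the segment (seed included), complement."""
    return ~ones_sum(seeded_segment) & 0xFFFF
```

Both paths should produce the same final checksum for the same segment. The point of simulating them separately is that the stack prepares the packet differently in each case, and it is that preparation code that the regression tests need to exercise.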
And now, with those features in hand, it’s a simple matter of creating three native zones and modifying a single configuration file with the zone names to run the IP forwarding test suite.
# /opt/net-tests/bin/nettest
Test: /opt/net-tests/tests/forwarding/ip_fwd_no_cksum (run as root) [00:31] [PASS]
Test: /opt/net-tests/tests/forwarding/ip_fwd_partial_cksum (run as root) [00:31] [PASS]
Test: /opt/net-tests/tests/forwarding/ip_fwd_full_cksum (run as root) [00:32] [PASS]
Test: /opt/net-tests/tests/forwarding/ip_fwd_partial_cksum_lso (run as root) [00:32] [PASS]
Test: /opt/net-tests/tests/forwarding/ip_fwd_full_cksum_lso (run as root) [00:32] [PASS]
Results Summary
PASS 5
Running Time: 00:02:40
Percent passed: 100.0%
This is great, but it’s just a start. I look forward to adding the following in the future, which will enable more tests to be written.
- Rx IPv4 header checksum
- Rx ULP partial and full checksum
- MAC groups/rings and traffic steering (delivering packets to specific groups based on destination)
- link-layer packet drops
- adding documentation to the dladm(1M) manual
In short, simnet, along with zones, bridges, and IP forwarding, allows one to simulate an arbitrary Ethernet network on a single machine. It now also has the ability to simulate Tx checksum offloads and LSO. These features were used to create a mostly-automated networking test suite for SmartOS IP forwarding (that should eventually make its way to illumos). Finally, I have plans to extend it with more features that allow me (and others) to write more tests.