The SDLMUX software combines two network adapters into one IP interface, providing a fail over capability if the link on the active adapter goes down or the adapter fails. Once set up, it requires virtually no administration, but there are some things you should be aware of.
First, SDLMUX does not provide any load balancing; all traffic is transmitted out of and received on the active adapter.
Second, on the ftServer VSeries hardware the active adapter has a MAC address of the form 00:00:A8:4v:wx:yz. The 00:00:A8 is the Stratus organizationally unique identifier (OUI), assigned by the IEEE. The “v” is an index based on the order in which the SDLMUX devices were initialized, starting at 0. The value “wx:yz” is based on the system serial number. The standby address differs from the active address only in the upper nibble of the fourth byte: instead of a 4 it is a 6. Likewise, the MAC addresses of two active (or two standby) adapters on the same module differ only in the lower nibble of the fourth byte. The easiest way to get a list of all the MAC addresses is with the analyze_system request “dump_sdlmux”, matching on “MAC”; see figure 1.
as: match 'MAC' ; dump_sdlmux
MAC address               = 0000A8405A8B
MAC address               = 0000A8605A8B
MAC address               = 0000A8415A8B
MAC address               = 0000A8615A8B
MAC address               = 0000A8425A8B
MAC address               = 0000A8625A8B
MAC address               = 0000A8435A8B
MAC address               = 0000A8635A8B
as:
Figure 1
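The address layout described above can be checked mechanically. The following Python sketch is illustrative only: the function names and the returned dictionary keys are hypothetical, and it assumes nothing beyond the 00:00:A8:4v:wx:yz layout given in the text (4 for active, 6 for standby, “v” as the index).

```python
STRATUS_OUI = "0000A8"
ROLE_BY_NIBBLE = {"4": "active", "6": "standby"}


def _hex_digits(mac: str) -> str:
    """Normalize a MAC address to 12 upper-case hex digits."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits


def decode_sdlmux_mac(mac: str) -> dict:
    """Split an SDLMUX MAC address into the fields described in the text."""
    digits = _hex_digits(mac)
    if not digits.startswith(STRATUS_OUI):
        raise ValueError(f"OUI is not 00:00:A8: {mac!r}")
    # Upper nibble of the fourth byte: 4 = active, 6 = standby.
    role = ROLE_BY_NIBBLE.get(digits[6])
    if role is None:
        raise ValueError(f"fourth byte does not start with 4 or 6: {mac!r}")
    return {
        "role": role,
        "index": int(digits[7], 16),   # lower nibble of the fourth byte ("v")
        "serial_part": digits[8:],     # "wx:yz", based on the serial number
    }


def partner_mac(mac: str) -> str:
    """Return the address that differs only in the upper nibble of byte four."""
    digits = _hex_digits(mac)
    role = decode_sdlmux_mac(mac)["role"]
    other = "6" if role == "active" else "4"
    return digits[:6] + other + digits[7:]
```

For the addresses in figure 1, decode_sdlmux_mac("0000A8615A8B") reports a standby address with index 1, and partner_mac("0000A8425A8B") gives 0000A8625A8B.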
The MAC addresses by themselves aren’t very helpful; by matching on “MAC” or “#” you get the MAC addresses along with the SDLMUX and network adapter device names (figure 2). Note that this list will not contain any network adapters that are not part of an SDLMUX partnership.
as: match 'MAC' -or '#' ; dump_sdlmux
sdlmux device             = #sdlmuxA.m16.10-5-0.11-5-0
MAC address               = 0000A8405A8B
Interface device          = %phx_vos#enetA.m16.10-5-0
MAC address               = 0000A8605A8B
Interface device          = %phx_vos#enetA.m16.11-5-0
sdlmux device             = #sdlmuxA.m16.10-5-1.11-5-1
MAC address               = 0000A8415A8B
Interface device          = %phx_vos#enetA.m16.10-5-1
MAC address               = 0000A8615A8B
Interface device          = %phx_vos#enetA.m16.11-5-1
sdlmux device             = #sdlmux.m16.11-2
MAC address               = 0000A8425A8B
Interface device          = %phx_vos#enet.m16.11.11-2
MAC address               = 0000A8625A8B
Interface device          = %phx_vos#enet.m16.10.11-2
sdlmux device             = #sdlmux.m16.11-3
MAC address               = 0000A8435A8B
Interface device          = %phx_vos#enet.m16.10.11-3
MAC address               = 0000A8635A8B
Interface device          = %phx_vos#enet.m16.11.11-3
as:
Figure 2
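If you save output like figure 2 to a file, a few lines of Python can turn it into a table of groups and adapters. This is only a sketch: the function name is hypothetical, and it assumes that each “MAC address” line belongs to the “Interface device” line that follows it, which is how the lines appear in figure 2 but is not something I have confirmed against the full dump_sdlmux output.

```python
def parse_dump_sdlmux(text: str) -> dict:
    """Group (interface device, MAC address) pairs by sdlmux device.

    Assumption: a "MAC address" line describes the "Interface device"
    line that comes after it, as the ordering in figure 2 suggests.
    """
    groups = {}
    current = None       # adapter list of the group being read
    pending_mac = None   # MAC seen but not yet paired with a device
    for line in text.splitlines():
        if "=" not in line:
            continue     # prompts, blank lines, anything without key = value
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "sdlmux device":
            current = groups.setdefault(value, [])
            pending_mac = None
        elif key == "MAC address":
            pending_mac = value
        elif key == "Interface device" and current is not None:
            current.append((value, pending_mac))
            pending_mac = None
    return groups


SAMPLE = """as: match 'MAC' -or '#' ; dump_sdlmux
sdlmux device             = #sdlmux.m16.11-2
MAC address               = 0000A8425A8B
Interface device          = %phx_vos#enet.m16.11.11-2
MAC address               = 0000A8625A8B
Interface device          = %phx_vos#enet.m16.10.11-2
as:
"""

for group, adapters in parse_dump_sdlmux(SAMPLE).items():
    for device, mac in adapters:
        print(group, device, mac)
```

Splitting on the first “=” and stripping whitespace means the parser does not care how the columns are padded.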
Third, SDLMUX network adapters send Ethernet 802.2 LLC frames to their partners to ensure that the network path is functioning correctly. Five sets of these test frames go out at three-second intervals, followed by a 33-second gap, so a complete cycle takes about 45 seconds. Trace 1 shows three cycles of this pattern, along with the contents of the actual frames sent by the active adapter (frame 1) and the standby adapter (frame 2). The frames that follow each 33-second gap (frames 11, 21, and 31) mark the start of a new cycle. These are not Ethernet type II frames or IP packets, so the switches connected to the network adapters, and any switches along the path between the two adapters, must be configured so that they do not block these 802.2 LLC frames.
No. delta Time       Source              Destination   Protocol Info
1 0.000000   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
0000 00 00 a8 62 5a 8b 00 00 a8 42 5a 8b 00 1b ac ac   ...bZ....BZ.....
0010 03 31 32 39 2e 31 2e 30 00 00 00 00 00 00 00 00   .129.1.0........
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030 00 00 00 00 00 00 00 00 00 00 00 00 b1 60 74 48   .............`tH
2 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
0000 00 00 a8 42 5a 8b 00 00 a8 62 5a 8b 00 1b ac ac   ...BZ....bZ.....
0010 03 31 32 39 2e 31 2e 30 00 00 00 00 00 00 00 00   .129.1.0........
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030 00 00 00 00 00 00 00 00 00 00 00 00 45 23 24 c0   ............E#$.
3 2.999945   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
4 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
5 2.999943   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
6 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
7 2.999882   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
8 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
9 2.999946   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
10 0.000009   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
11 32.99900   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
12 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
13 2.999945   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
14 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
15 2.999938   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
16 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
17 2.999942   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
18 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
19 2.999943   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
20 0.000008   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
21 32.99900   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
22 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
23 2.999944   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
24 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
25 2.999947   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
26 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
27 2.999938   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
28 0.000007   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
29 2.999946   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
30 0.000010   StratusC_62:5a:8b     StratusC_42:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
31 32.99900   StratusC_42:5a:8b     StratusC_62:5a:8b LLC   U, func=UI; DSAP 0xac Individual, SSAP 0xac Command
Trace 1
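To make the difference between these frames and Ethernet type II frames concrete, here is a small Python sketch that decodes the bytes of frame 1 from Trace 1. The function name is hypothetical. It relies on the standard rule that a value of 1500 or less in bytes 12 and 13 is an 802.3 length (EtherTypes start at 0x0600), followed by the 802.2 LLC header: DSAP, SSAP and a control byte. It makes no attempt to interpret the data or the bytes that follow the padding.

```python
import struct

# Frame 1 from Trace 1, copied row by row from the hex dump.
FRAME_1 = bytes.fromhex(
    "00 00 a8 62 5a 8b 00 00 a8 42 5a 8b 00 1b ac ac "
    "03 31 32 39 2e 31 2e 30 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 b1 60 74 48 "
)


def format_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def parse_llc_frame(frame: bytes) -> dict:
    """Decode the 802.3 header and 802.2 LLC header of one frame."""
    dst, src, length = struct.unpack("!6s6sH", frame[:14])
    if length > 1500:
        # 0x0600 and above is an EtherType, i.e. an Ethernet type II frame.
        raise ValueError(f"field 0x{length:04x} is not an 802.3 length")
    dsap, ssap, control = frame[14], frame[15], frame[16]
    # The length covers the 3-byte LLC header plus the data that follows.
    data = frame[17:14 + length]
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "length": length,
        "dsap": dsap,
        "ssap": ssap,
        "control": control,   # 0x03 is an unnumbered information (UI) frame
        "data": data,
    }


info = parse_llc_frame(FRAME_1)
print(info["src"], "->", info["dst"], "length", info["length"])
print(f"DSAP 0x{info['dsap']:02x} SSAP 0x{info['ssap']:02x} "
      f"control 0x{info['control']:02x}")
print(info["data"].rstrip(b"\x00"))
```

The length field here is 0x001b (27), which is why an analyzer decodes the frame as LLC rather than looking for an EtherType. A switch filter that only permits specific EtherTypes would not match it.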
Note that you cannot use packet_monitor to see these test frames. They are sent below the point where packet_monitor taps into the stack, and SDLMUX removes the frames from the stack before packet_monitor can read them.
Fourth, if each adapter is connected to a different switch and the link between those switches fails, or the test frames do not get through for some other reason, you will see something that looks like figure 3 in the syserr_log. Starting in releases 16.2.1ak and 17.0.0ah, when there is a test frame failure SDLMUX triggers an ARP request to the last host that was successfully ARP’ed over the suspect interface. If it gets an answer, it knows that the active adapter is working, so it resets the standby adapter to try to get it to work. If a network issue is blocking the test frames, resetting the adapter doesn’t help and the adapter eventually goes MTBF; see time stamp 08:08:10 in figure 3. The dlmux_admin command will then report the broken adapter as DOWN (figure 4), and a trace will show no test frames, since at this point there is nothing to test. If no ARP reply is received, or the system is on a release that does not send the ARP request, SDLMUX will fail over the adapters and break the new standby adapter. If the test frames fail again during the next cycle and no ARP reply is received, the fail over and break are repeated. Eventually one adapter will go MTBF.
08:05:02 WARNING(64): SDLMUX: the devices in group #sdlmux.m16.11-2
08:05:02 WARNING(65): SDLMUX: are not exchanging test packets with each other
08:05:02 WARNING(66): SDLMUX: but the active adapter is able to communicate wi
+th other hosts!
08:05:02 WARNING(67): SDLMUX: This indicates some sort of network or cabling i
+ssue!
08:05:02 WARNING(68): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:05:02 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:05:02 WARNING(69): SDLMUX: device name %phx_vos#enet.m16.10.11-2 is broken
08:05:02 PCI 10/11/2           enet.m16.10.11-2 Adding
08:05:04 PCI 10/11/2           enet.m16.10.11-2 Online
08:05:04 WARNING(70): genet in (10/11/2) Link is Up.
08:05:04 WARNING(71): SDLMUX: device %phx_vos#enet.m16.10.11-2 back to servic
+e
08:05:49 WARNING(72): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:05:49 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:05:49 WARNING(73): SDLMUX: device name %phx_vos#enet.m16.10.11-2 is broken
08:05:49 PCI 10/11/2           enet.m16.10.11-2 Adding
08:05:52 PCI 10/11/2           enet.m16.10.11-2 Online
08:05:52 WARNING(74): genet in (10/11/2) Link is Up.
08:05:52 WARNING(75): SDLMUX: device %phx_vos#enet.m16.10.11-2 back to servic
+e
08:07:22 WARNING(76): SDLMUX: the devices in group #sdlmux.m16.11-2
08:07:22 WARNING(77): SDLMUX: are not exchanging test packets with each other
08:07:22 WARNING(78):  SDLMUX: but the active adapter is able to communicate wi
+th other hosts!
08:07:22 WARNING(79): SDLMUX: This indicates some sort of network or cabling i
+ssue!
08:07:22 WARNING(80): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:07:22 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:07:22 WARNING(81): SDLMUX: device name %phx_vos#enet.m16.10.11-2 is broken
08:07:22 PCI 10/11/2           enet.m16.10.11-2 Adding
08:07:25 PCI 10/11/2           enet.m16.10.11-2 Online
08:07:25 WARNING(82): genet in (10/11/2) Link is Up.
08:07:25 WARNING(83): SDLMUX: device %phx_vos#enet.m16.10.11-2 back to servic
+e
08:08:10 WARNING(84): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:08:10 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:08:10 PCI 10/11/2                             MTBF Failure
08:08:10 WARNING(85): SDLMUX: device name %phx_vos#enet.m16.10.11-2 is broken
Figure 3
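If you have a copy of the syserr_log text, the break and MTBF events in figure 3 are easy to pull out with a script. The Python sketch below is illustrative only: the function names are hypothetical, and the patterns are based solely on the message wording shown in figure 3, including the lines that wrap with a leading “+”.

```python
import re

BREAK_RE = re.compile(r"(\d\d:\d\d:\d\d) .*SDLMUX: breaking adapter (\S+?):")
MTBF_RE = re.compile(r"(\d\d:\d\d:\d\d) PCI (\S+)\s+MTBF Failure")


def join_continuations(lines):
    """Re-attach wrapped lines, which start with '+' in figure 3."""
    joined = []
    for line in lines:
        if line.startswith("+") and joined:
            joined[-1] += line[1:]
        else:
            joined.append(line)
    return joined


def sdlmux_events(log_text: str) -> list:
    """Return (time, kind, name) tuples for break and MTBF messages."""
    events = []
    for line in join_continuations(log_text.splitlines()):
        match = BREAK_RE.match(line)
        if match:
            events.append((match.group(1), "break", match.group(2)))
            continue
        match = MTBF_RE.match(line)
        if match:
            events.append((match.group(1), "mtbf", match.group(2)))
    return events


SAMPLE_LOG = """\
08:07:22 WARNING(80): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:07:22 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:08:10 WARNING(84): SDLMUX: breaking adapter %phx_vos#enet.m16.10.11-2: XID
+communication issue
08:08:10 PCI 10/11/2           enet.m16.10.11-2 Break Requested
08:08:10 PCI 10/11/2                             MTBF Failure
"""

for event in sdlmux_events(SAMPLE_LOG):
    print(*event)
```

Repeated break events for the same adapter a cycle or two apart, ending in an MTBF entry, are the signature of the test frame problem described above.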
dlmux_admin #sdlmux.m16.11-2 sdlmux_status
Group Name:          #sdlmux.m16.11-2
Device Name:         %phx_vos#enet.m16.11.11-2
Adapter State:       ACTIVE   UP
Partner:             %phx_vos#enet.m16.10.11-2
Partner State:       DOWN
Figure 4
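The status output in figure 4 is simple enough to check from a script if you capture it as text. This Python sketch uses hypothetical function names and looks only for the DOWN keyword that figure 4 shows. I have not listed every state string dlmux_admin can print, so treat the check as an assumption rather than a complete test.

```python
def parse_sdlmux_status(text: str) -> dict:
    """Turn 'Name:   value' lines into a dictionary."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def has_down_adapter(status: dict) -> bool:
    """True if either state field contains the word DOWN (as in figure 4)."""
    return any(
        "DOWN" in status.get(field, "").split()
        for field in ("Adapter State", "Partner State")
    )


SAMPLE_STATUS = """\
Group Name:          #sdlmux.m16.11-2
Device Name:         %phx_vos#enet.m16.11.11-2
Adapter State:       ACTIVE   UP
Partner:             %phx_vos#enet.m16.10.11-2
Partner State:       DOWN
"""

status = parse_sdlmux_status(SAMPLE_STATUS)
if has_down_adapter(status):
    print("adapter down in group", status["Group Name"])
```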
To bring the adapter back into service you need to add it back with the board_admin command (figure 5). Of course, if you do that without correcting the underlying problem, the adapter will just go MTBF again.
board_admin 10/11/2 add
board_admin
device_id:    10/11/2
action:       add
Do you want to continue? (yes, no) yes
Command completed.
Figure 5
Fifth, since the active adapter always has the same MAC address, the effect of a fail over is to move that MAC address from one switch port to another. Any security settings on the switch ports must allow this change. Also, switch ports can be configured to negotiate various settings with other switch ports. These negotiations can be triggered when the switch notices a change in topology, such as a new MAC address or a link being restored, and until they complete the switch may not pass regular frames. The switch ports connected to the SDLMUX adapters should be configured not to perform these negotiations. In addition, those switch ports should be configured either not to run the spanning tree protocol or to skip its listening and learning steps (Cisco calls this PortFast), because during those steps the switch will not pass regular data frames. In extreme cases the delay caused by these settings can be so long that SDLMUX triggers another fail over.
Finally, the link status of the adapters needs to be monitored to be sure that both links are up. The failure of a link will not cause the system to call home, and because the system is fault tolerant there is no loss of connectivity to alert you to the problem. I covered this in a previous blog post; see Monitoring network adapter status.