Redundant NICs on Solaris
 Thomas Kranz
              You have your resilient network in place. Dual switches, dual 
              routers, HSRP, failover, redundant firewalls -- they're 
              all there. Now, what about your Sun boxes? Having two network interface 
              cards (NICs) on your Solaris server, with the primary NIC failing 
              over to the secondary, seems like an obvious and easy task, yet 
              there are many pitfalls that make it unnecessarily complicated. 
              In this article, I'm going to explore how to provide redundant 
              NICs simply and cheaply.
              I'm not going to be covering Sun Trunking or Alternate Pathing 
              in this article. Sun Trunking needs to be purchased as add-on software 
              and it's designed to trunk interfaces together, not provide 
redundancy. Alternate Pathing might fit the bill; however, it introduces a higher level of complexity, making overall management harder.
              The first place to look is interface grouping. Introduced in Solaris 
              2.6, interface groups changed the behavior of the IP stack and the 
              kernel. Previously, if Solaris received packets on the secondary 
              interface, it would send them out the primary, regardless of whether 
              the primary was up. With interface groups, you tell Solaris that 
the two NICs are connected to the same subnet. Routing tables are then manually modified to show that routes can go out of either interface; each interface keeps its own IP address. Interface groups are enabled by using ndd to set ip_enable_group_ifs to 1 in /dev/ip. Although early releases of Solaris 2.6 and later had this enabled by default, a subsequent patch (to fix problems with sending the correct hostname) changed the default to disabled.
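As a rough sketch (hme0 and hme1 and the addresses here are placeholders; adjust for your own hardware), enabling the feature and adding the second NIC to the same subnet looks like this:

   # switch on interface groups in the IP stack
   ndd -set /dev/ip ip_enable_group_ifs 1

   # hme0 is assumed to be already up from boot; give the second NIC
   # its own address on the same subnet
   ifconfig hme1 plumb
   ifconfig hme1 192.168.1.11 netmask 255.255.255.0 up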
              When a packet comes in on either interface, the kernel updates 
              the ARP (address resolution protocol) cache to reflect the fact 
              that a particular host is available on a particular interface. All 
              packets still go out the primary interface. However, when the primary 
goes down, any destination host whose ARP cache entry marks it as reachable via the secondary interface will have its packets sent out that interface instead. Beyond that, any incoming packets on the secondary interface will update the ARP cache, so hosts previously marked as being on the primary interface will now be seen as available on the secondary.
              That's the theory, anyway. In practice, the results are unexpected 
              and unreliable. Interface groups only work well for directing traffic 
              in via multiple interfaces; the redundancy side of things doesn't 
              work too well. Symptoms include excessive TCP retransmits, resets 
              on the interfaces, and the ARP cache not being updated.
So, although they looked promising, interface groups don't
              completely solve the redundant NIC issue. Another way to approach 
              the problem is to run a script that monitors the interfaces, and 
              brings them up or down accordingly. The first logical place to start 
              is with some poking around with ndd. /dev/hme contains 
              a link_status variable. A value of 1 indicates that a link 
is up, and a value of 0 indicates it is down. Thus, we could write a script that queries link_status through ndd and brings interfaces up and down accordingly.
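Querying the driver is straightforward; for hme (and qfe) devices, you first select the instance you want to look at. For example, to check hme0:

   # point /dev/hme at instance 0 (hme0), then read the link state
   ndd -set /dev/hme instance 0
   ndd -get /dev/hme link_status      # 1 = link up, 0 = link down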
              However, it doesn't quite work like that in the field. link_status 
              will report a value of 1 (link is up) for any interface that hasn't 
              been plumbed, or that hasn't been configured with an IP address 
              (even a null one). For example, if you have a qfe board, and qfe2 
              isn't physically plugged in, or configured, link_status 
              will be 1. If you ifconfig qfe2 up with a random IP address, 
              link_status will then correctly report 0.
              Things are complicated on the switch end, as well. For example, 
              let's say you have redundant Cisco Catalyst switches providing 
              your network connectivity. To test the ndd script, we can 
              go to the switch and disable a port. link_status reports 
              0, and the link is down. However, when we re-enable the port, link_status 
              still reports 0, and the Catalyst sees the port as not connected.
              It's not until we actually try to send data over that link 
              that the interface driver wakes up and sets link_status to 
              1, whereupon the Catalyst also wakes up and flags the port as connected. 
              This poses problems for us -- while all this is going on, the 
              Catalyst will be discarding traffic, because it sees the link as 
              down.
              So, monitoring interfaces with ndd is a dead end. However, 
              could we accomplish the same thing using a much simpler method, 
              such as ping? A simple script (Listing 1) pings the 
              default gateway, which will be our redundant network kit. If that 
              ping fails, we know we've lost connectivity. The script 
              will then ifconfig down that interface, and ifconfig 
              up the second interface, which has already been configured with 
              the same IP address. It will then try another ping to the 
              default gateway, to bring up the link and check that all is indeed 
              well.
              This script runs from cron once a minute, and after some extensive 
              testing seems to fit the bill. Total runtime when failing over is 
around 40 seconds, so there's no danger of the script still running when cron kicks off the next run.
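Listing 1 accompanies this article; the core of the approach, in sketch form (the interface names, gateway address, and ping timeout below are placeholders rather than the listing's exact values), looks like this:

   #!/bin/sh
   # Ping-based failover check, run from cron once a minute.
   PRIMARY=hme0
   SECONDARY=qfe0
   GATEWAY=192.168.1.1

   # one ping, waiting up to 5 seconds for a reply
   if /usr/sbin/ping $GATEWAY 5 >/dev/null 2>&1; then
       exit 0                # gateway answered; nothing to do
   fi

   # lost the gateway: swing traffic over to the secondary NIC,
   # which is already configured with the same IP address
   /usr/sbin/ifconfig $PRIMARY down
   /usr/sbin/ifconfig $SECONDARY up

   # prod the new link so the driver and the switch see it as active
   /usr/sbin/ping $GATEWAY 5 >/dev/null 2>&1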
              The final piece of the puzzle is configuring that redundant interface 
              on bootup. Adding an entry to /etc/hosts and creating a corresponding 
              /etc/hostname.<int> file will configure the interface, 
              but will also bring it up at boot time -- not something we want 
              to happen. This means we need to have a script that's run on 
              bootup, which will correctly configure the secondary interface with 
              a duplicate IP address, but not bring it up.
              Listing 2 shows such a script. It depends on a config file 
              called /etc/redundant.int, which contains the name of your 
redundant interface. From there, it parses your current configuration and configures the secondary interface with the duplicate IP address, but doesn't bring it up. I run this script as /etc/rcS.d/S31scnd_int, so it runs just after /etc/rcS.d/S30rootusr.sh, the script responsible for bringing up the primary interface and setting routes.
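Again, Listing 2 has the full script; the essential step might look something like this sketch (hme0 stands in for your primary interface, and the awk parsing assumes standard Solaris ifconfig output):

   #!/sbin/sh
   # Configure the redundant NIC with the primary's address, but leave it down.
   # /etc/redundant.int holds the redundant interface name, e.g. qfe0.
   # /usr is already mounted by S30rootusr.sh, so /usr/bin/awk is available.
   [ -f /etc/redundant.int ] || exit 0
   SECONDARY=`cat /etc/redundant.int`

   # copy the address and netmask from the primary interface
   # (Solaris ifconfig prints the netmask in hex without a leading 0x)
   IPADDR=`/sbin/ifconfig hme0 | /usr/bin/awk '$1 == "inet" {print $2}'`
   MASK=`/sbin/ifconfig hme0 | /usr/bin/awk '$1 == "inet" {print $4}'`

   # plumb the secondary, give it the duplicate address, and keep it down
   /sbin/ifconfig $SECONDARY plumb
   /sbin/ifconfig $SECONDARY inet $IPADDR netmask 0x$MASK down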
              There are several advantages to this approach. We don't need 
              to worry about what the network kit thinks is going on; it's 
              free; it utilizes existing kit without introducing a need for extra 
              purchases; it's simple; and it could easily be expanded to 
other platforms. As there are no Solaris-specific aspects to this, it can be modified to work on any UNIX platform where you need redundant connectivity.
Thomas Kranz has been a sys admin for six years, and is currently
              Senior SysAdmin at Flutter.com. In his copious free time, he enjoys 
              cycling and spending time with his family. He can be reached at: 
              thomas.kranz@flutter.com.