What to Do When the Server Doesn't Serve -- Duplicating Data
Brett Lymn
In my previous article (Sys Admin, February 2001), I talked about
providing file system failover using features of Sun's cachefs
implementation to serve files transparently when the server is down.
In this article, I will take a more traditional approach to server
failover and discuss duplicating data on multiple servers and having
client machines select another server if the one that they are
talking to stops responding.
The process by which a client automatically mounts file systems
from a server is handled by an automounter. An automounter is a daemon,
running on the client machine, that monitors a configured set of
              directory paths. When an access is made to one of these monitored 
              paths, the daemon mounts the appropriate portion of the file system 
              from a remote server using NFS to make it appear as though the path 
              was always there. After a period of inactivity on the mounted path, 
              the daemon will unmount the path to prevent the system from accumulating 
              mounted file systems. On Sun systems, this is simply called the 
              automounter; for other systems, there is an open source equivalent 
              called amd. Both Sun's automounter and amd allow 
for multiple servers to serve the same file system. On versions of
Solaris prior to 2.6, the binding of a client to a particular server
was resolved only at mount time. So, if the server from which your
client automounted a file system went down, too bad -- you had to
reboot the client to force it to mount the file system from
another server.
              For amd and Sun's automounter on Solaris 2.6 and later, 
              the automounter daemon monitors the connection to the server. If 
              the connection to the server fails, the automounter automatically 
              tries to connect to the other listed servers so that the client 
              can continue working.
              Sun Automounter
For Sun's automounter, two kinds of files control what gets
mounted: the master map and the indirect maps. The master map may
contain mount entries directly but, in practice, this is discouraged
because any changes to such entries require the automounter to be
restarted before they take effect. Normally, the master map simply
lists the names of other files. These files are called indirect maps
and are where the automounter configuration is usually kept, because
the indirect maps can be updated and take effect without restarting
the automounter.
The master map is normally a file called /etc/auto_master
and contains lines like the following:
              
             
/file  /usr/automount/file_map
that specify the locations of the indirect maps. In the example above, 
            the automounted directory /file is controlled by the indirect 
            map /usr/automount/file_map. Note that you may store the indirect 
            maps anywhere that is accessible to the automounter daemon. My preference 
            is to put all of the indirect maps into a single directory, which 
I typically share out to all the clients. As mentioned before, you
may put mount entries directly into the master map, but doing
so will create headaches later when you need to change them. For
            a typical automounted directory, the indirect map contains lines of 
            the form:  
             
tree  server1:/share/file/tree
which tell the automounter that if a process accesses the directory 
            /file/tree, it should mount the file system /share/file/tree 
            from the machine server1 as /file/tree. The above example gives 
            no failover protection. If server1 goes down, then the client will 
            hang. The automounter syntax for specifying failover servers is a 
            space-separated list of file server names and directories in the automounter 
            entry, like this:  
             
tree    server1:/share/file/tree server2:/share/file/tree
In this case (for automounters running on Solaris 2.6 and later),
if server1 goes down, then the automounter will automatically change
over to server2 for file serving. As noted earlier, automounters running
on versions of Solaris prior to 2.6 will need to have the machine
rebooted to change file servers.
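On Solaris 2.6 and later, you can also check which replica is currently
serving a failover mount with nfsstat on the client. The exact output
varies between releases, but a replicated mount shows a Failover line
naming the server currently in use; treat the details here as an
assumption to verify against your nfsstat(1M) man page.

# on the client: list NFS mount details; a replicated failover mount
# shows a "Failover:" line that includes the server currently in use
nfsstat -m
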
amd

amd may already be installed on your system. There are
quite a few vendors that bundle amd as part of the operating
system or as a package on the distribution CDs. Red Hat Linux, for
              example, has amd as a package on the latest distribution 
              CDs. Check your man pages to see if it exists. If it does not, then 
              check your vendor site for a downloadable package.
For amd, the concepts are similar to those of Sun's automounter,
but the syntax for the amd maps is different. amd
does not have a master map in the same manner as Sun's automounter
does. The mapping
              of root directories to maps is either done on the command line in 
              simple cases, or by creating an amd.conf file. To perform 
              the same function as the /etc/auto_master file shown in the 
              previous example, we would add the following to the end of the amd 
              command line:
              
             
/file  /etc/amd/file_map
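For illustration, a complete startup line might look like the
following. The -a and -l flags (which set the directory where amd
performs the real mounts and where it sends its log messages) are
shown as an assumption about a typical am-utils setup; check your
amd(8) man page for the options your version supports.

# hypothetical startup line, e.g. in an rc script: mount under /.amd_mnt,
# log to syslog, and manage /file using the map in /etc/amd/file_map
amd -a /.amd_mnt -l syslog /file /etc/amd/file_map
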
As with the Sun automounter, the amd map files should be put 
            into a single directory to ease the task of administering the amd 
            files. The amd syntax for the file_map file would be 
            as follows:  
             
tree      -opts:=ro;type:=nfs;rfs:=/share/file/tree rhost:=server1
This tells amd to mount the shared file system /share/file/tree 
            from server1 as the tree subdirectory of the /file top-level 
directory. The file system will be mounted read-only.
Again, this does not provide any failover protection. If the machine
              server1 goes down, then the client will hang until server1 comes 
              back on line. As with Sun's automounter, amd can mount 
              a file system from multiple servers and will monitor the servers 
              to detect when one has gone down. To specify redundant file servers 
              in amd, more rhost parameters must be added to the entry. 
              So, to have two servers, server1 and server2, serving a replicated 
              file system to the clients, the amd map entry looks like:
              
             
tree      -opts:=ro,soft,intr;type:=nfs;rfs:=/share/file/tree rhost:=server1 rhost:=server2
Now, if server1 goes down while the file system is not mounted, the
client will automatically switch to server2 to serve the
files. Unfortunately, unlike Sun's automounter, if a process
            is using the mount when the server goes down, that particular process 
            will hang. Subsequent processes that access the file tree will not 
            hang because amd will failover to the server that is still 
            up, but you will be left with a hung process. By using the intr 
            flag in the mount options, we can allow hung processes to be killed, 
            and the soft option will allow file operations to fail on a 
server error rather than hang. Because of this behavior, any long-running
processes accessing the file system when the server fails will have
to be restarted. This can be done with a simple script that monitors
server availability and restarts critical processes; a sketch of such
a script appears after the next paragraph.
I noticed another quirk with amd (using am-utils
              version 6.0.1s11 on a NetBSD 1.5 machine). When using multiple servers 
              for a mount, once amd has marked a server as down, it will 
              not use that server to mount from, even after the server has come 
              back up. Thus, you should not rely on amd to load balance 
              your NFS traffic, because all the clients will automatically migrate 
              to your most reliable server.
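Here is the restart monitor mentioned above as a minimal sketch. The
server name, pid file, and restart command are all assumptions for
illustration; rpcinfo is used simply as a cheap way to ask whether the
server's NFS service is still answering.

#!/bin/sh
# Sketch only: watch the primary NFS server and bounce a critical
# process once when the server stops answering, so the process
# re-opens its files from the failover server via amd.
SERVER=server1                          # primary file server
PIDFILE=/var/run/mydaemon.pid           # hypothetical critical process
RESTART="/etc/init.d/mydaemon start"    # hypothetical restart command
was_up=yes

while true; do
    if rpcinfo -u $SERVER nfs > /dev/null 2>&1; then
        was_up=yes
    elif [ $was_up = yes ]; then
        # the server has just gone away: kill the (possibly hung)
        # process and start it again; new accesses go to the
        # surviving server
        kill `cat $PIDFILE` 2> /dev/null
        $RESTART
        was_up=no
    fi
    sleep 60
done
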
              Synchronizing the Servers
              To make failover work, multiple servers must contain complete 
              copies of the exported file systems. One server is designated as 
              the "master" repository for the file system, and a periodic 
              job is run to update the files on the slave servers with any changes 
              using something like rdist, rsync, or unison.
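The periodic job is usually nothing more than a cron entry on the
master server. A sketch, assuming a hypothetical wrapper script that
calls whichever of the tools discussed below you settle on:

# root's crontab on the master server: push the replicated tree to the
# slave servers once an hour (the script name is hypothetical)
0 * * * * /usr/local/sbin/push_file_tree > /dev/null 2>&1
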
              The tool rdist normally comes standard on UNIX systems 
              and is intended for updating file trees from a master tree. The 
              command for performing the update:
              
             
rdist -R -c /share/file/tree server2
will update the directory tree /share/file/tree on server2 from the
server on which the rdist command was run. Though this is quite
            straightforward, there are some problems using rdist. In the 
            past, rdist has been known for some easily exploitable security 
            holes, and since rdist was, by default, a setuid program, it 
has been a target for intruders to gain root access. For this reason,
rdist may have been removed or disabled on your systems. Another problem
with rdist is that it relies on rsh-style host equivalence (trust granted
via hosts.equiv or .rhosts), which means that the user running rdist
on the master has access to the other machines without requiring a
password. This may cause
security problems in some environments.
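If there is more than one slave, rdist is more conveniently driven
from a Distfile than from the -c form shown above. A minimal sketch
follows; the slave names are hypothetical, and the option spelling
differs between rdist versions, so check your rdist(1) man page.

# Distfile: distribute the replicated tree to every slave server and
# remove files on the slaves that no longer exist on the master (-R)
HOSTS = ( server2 server3 )
FILES = ( /share/file/tree )

${FILES} -> ${HOSTS}
	install -R ;

This would be run on the master as rdist -f Distfile, typically from cron.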
Another method of syncing the file trees is to use some free software
called rsync. (See the Resources section for information
              on obtaining rsync and other tools mentioned here.) The rsync 
              tool was written by Andrew Tridgell (the original author of Samba) 
              and is designed to efficiently copy files from one server to another. 
              rsync is efficient because it only transfers the differences 
              between the files on the master and the files on the slave machine, 
              which dramatically reduces the amount of data pushed over the network. 
              Not only does rsync make efficient use of network bandwidth, 
              but the transport method by which rsync talks to the remote 
              machine can be manually selected. By default, rsync will 
              use rsh to talk to the remote machine, but you can change 
              this to use something like ssh or openssh, which will 
              not only allow stronger verification of the remote machine's 
              credentials but also allow the traffic between the two machines 
              to be encrypted. To perform the same function as shown in the rdist 
              command above, but using ssh as our transport mechanism, 
              the rsync command looks like:
              
             
rsync -va --rsh /usr/local/bin/ssh --delete /share/file/tree/ server2:/share/file/tree/
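To push the tree to more than one slave, the same command can simply
be run in a loop; the slave names here are hypothetical.

#!/bin/sh
# push the master copy of the tree to each slave in turn over ssh
for slave in server2 server3; do
    rsync -va --rsh=/usr/local/bin/ssh --delete \
        /share/file/tree/ $slave:/share/file/tree/
done
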
One note of warning with rsync -- the trailing slashes
on the directory paths in these rsync commands are important. Without
the trailing slash on the source, rsync copies the directory itself
into the target (creating /share/file/tree/tree), and combined with
--delete that can remove the files you meant to keep. So, it is wise
to try out the rsync command on a backed-up sample directory to ensure
you do not accidentally destroy something important.
Another problem with rdist and rsync is that both
              assume the master source is on a single machine, which limits where 
              the updates can be performed. If files on the slave servers are 
              updated without updating the master, then updates on the slave will 
              be overwritten the next time the master distributes files to the 
              slave servers. To overcome this, we can either strictly enforce 
              the rule that only the master server files are updated, or we can 
              use a program such as unison, which allows files to be updated 
              on multiple servers and can synchronize the updated files between 
              the servers automatically.
              unison is currently under active development and is considered 
              beta software by the authors. Despite the beta status, the capabilities 
that unison provides sound very promising in cases where
there is either a lack of discipline or a lack of will to enforce
the rule of updating only one master server. unison will
              detect conflicting changes between two file trees so that the administrator 
              can decide what to do about the conflict. If the file on one of 
              the machines has been modified but is unchanged on the remote machine, 
              then the two files are simply synchronized. By default, unison 
              uses ssh as its transport, so data is not sent over the wire 
              in the clear. The relative newness of unison does make selecting 
              it over rdist or rsync a more risky proposition, but 
              if you are willing to help debug the code or put up with some teething 
              problems, then unison may be a good choice for your site.
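For reference, a unison run between the local tree and the copy on
server2 over ssh might look something like the following; the exact
invocation depends on the unison version, so treat this as an
assumption to check against the unison documentation.

# reconcile the local tree with the copy on server2 over ssh; unison
# will prompt about any conflicting changes it finds
unison /share/file/tree ssh://server2//share/file/tree
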
              Although the approach of having multiple servers is simple, it 
              also has some disadvantages. The method can be used only for file 
              systems that will be mounted read-only on the clients. Also, any 
              updates to the file tree must be carried out only on the master 
              machine -- unless a program like unison is used to update 
              the servers. Finally, there is a small risk that the master server 
              will fail just when it is performing the update. If you have multiple 
              slave machines, you may encounter a situation in which one slave 
              has a truncated version of a file (because the server went down 
              during the copy) or some of the slave servers have an out-of-date 
              version of some files (because the master went down part way through 
              the update). If the slave servers are updated frequently and the 
              replicated file system is small, the chance of inconsistencies arising 
              is remote. However, these inconsistencies may cause much confusion 
              if they occur. You must also take into account that you are replicating 
              all the resources required to support the duplicated file systems. 
              This includes all machines and disk space, adding to the cost of 
              providing the failover capability. Also, there are some bursts of 
              network traffic involved in the synchronization process when the 
              servers replicate data between themselves, which may impact the 
              client network. To address this, you may want to set up a private 
              network between the servers rather than causing the replication 
              traffic to go across the client network.
              Conclusion
              That's it for this time. In my next article, I will look 
              at a more adventurous way of providing failover using a file system 
              specifically designed to handle network disconnections between the 
              client and the server.
              Resources
              rsync information can be found at: http://www.samba.org/
              
              ssh information can be found at: http://www.cs.hut.fi/ssh
              
              OpenSSH information can be found at: http://www.openssh.com/
              
              unison information can be found at: http://www.utu.net/unison/unison.html
              Brett Lymn works for a global company helping to manage a bunch 
              of UNIX machines. He spends entirely too much time fiddling with 
              computers but cannot help himself. 