What To Do When the Server Doesn't Serve
Brett Lymn
              Not that long ago when a server stopped serving files, most people 
              would ask when the machine would be back on the air, smile ruefully, 
              and wait patiently for their files to reappear on their network 
              drives. The excuse "sorry, I cannot tell you that because the 
              computer is down" was accepted, and people would call back 
              later for the information. Those were the days. Servers are now 
              expected to reliably serve up files 24 hours a day, 7 days a week. 
Downtime is no longer an inconvenience -- it costs money. If
              your Web server is down because of a file server failure, then people 
              will rarely wait patiently for the Web server to come back on line. 
              They will take their business, and their money, elsewhere. Because 
              of this, there has been a lot of focus placed on building systems 
              that do not rely on a single point of failure, so that even if one 
              component fails, the system as a whole will continue functioning. 
              Hardware designers have been working at this for some time, and 
it shows in the latest machines, which have dual this and hot-swap that
to provide the ability to ride out a hardware failure and repair
              the machine without requiring it to be shut down. Of course, the 
              operating systems that sit on top of this hardware have been modified 
              to exploit the new hardware features to provide resiliency in the 
              face of failures. The problem now is that modern machines rarely 
              live in isolation. They are typically networked to other machines 
with the result that, for all your hardware and software redundancy,
everything still grinds to a halt when the file server goes down. To prevent this situation,
              we need to have a system that will detect when the file server has 
              gone down and take some action to ensure the continuity of service 
              in the face of a file server failure. 
              Managing Failover with cachefs 
              There are quite a few solutions to the problem of file 
              serving failure. My three-part series of articles will contain a 
              sampling of what is available. My articles will have a heavy Sun 
Solaris slant because that is what I predominantly work with. The
              first article in the series describes providing failover using a 
little-known feature of Sun's cachefs implementation
              that allows cachefs clients to operate disconnected from 
              the file server. 
              Both Solaris 2 and amd running on Solaris 2 have a file 
              system called cachefs. This filesystem uses the client's 
              local disk to cache copies of files that originate from a remote 
              server. The idea behind this file system is to improve performance 
              by providing files at local disk access speeds for frequently used 
              files, but still have the files served from a central location. 
              Although amd supports the cachefs on Solaris 2, it 
              does not provide all the features that Sun's implementation 
does and, hence, is not as useful in providing a failover service.
              In this article, I will concentrate on the features of the Sun cachefs 
              implementation. 
              One of the interesting things with the cachefs is that 
              once a file has been accessed on the client, the file resides on 
              the client's local disk. So, if the cachefs can be convinced 
              to serve up a (possibly out of date) file from cache, even if the 
              origin server is currently down, then the system could mount files 
              from a central server, but still ride out server outages if the 
              files required by the client are in the client's local cache. 
              With some judicious use of flags, this is exactly what we can do 
              with the cachefs. In cachefs parlance, the file system 
              mounted from the server is called the back file system, and the 
              local cachefs file system is called the front file system. 
              To create a cachefs mount, you first need to initialize the 
              front file system storage on the client using the cachefs 
              administration tool, cfsadmin, like this: 
              
             cfsadmin -c /var/cache
This creates and populates the file system cache in the directory 
            /var/cache. The location of the cachefs cache directory 
            is arbitrary; I will use the location /var/cache in the examples 
            in this article. The contents of this directory must only be manipulated 
            by the cachefs or cfsadmin. Any attempts to edit or 
            rename files in the cache directory will probably corrupt the cache 
            file system. After we have created the front file system, we can mount 
            the back file system onto the cachefs:  
             mount -F cachefs -o backfstype=nfs,cachedir=/var/cache server1:/file/tree /file/tree
The options here tell the cachefs that the back file 
system is from an NFS server and that the cache directory is located
            in /var/cache. The client will now be caching files from server1 
            onto local disk in /var/cache. The whole process is transparent 
            to the file system user; all the user sees is a normal directory and 
            is not aware that the files are not coming straight from the server. 
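If you want to convince yourself that the cache really is in play, a quick look at the mount table and the cache directory will show it; for example (using the same cache directory and mount point as above):

    mount -v | grep /file/tree
    cfsadmin -l /var/cache

The first command should report the file system type as cachefs, and the second lists the file systems held in the cache along with the current cache parameters.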
            The example as given will not handle a server outage. If the server 
            goes down, then the cachefs mount in the above example will 
            hang when attempting to validate the file on the server.  Fortunately, the cachefs has some options that help the 
              situation. One of these options is the "local-access" 
              cachefs mount option. This option tells the cachefs 
              to use the file attributes of the cached copy of the file, rather 
              than validate those file attributes on the back file system server. 
              This is meant to save a round trip to the server when checking the 
              file attributes, but it also serves to decouple the cachefs 
              a bit more from the back file system server; we no longer have to 
              rely on the server's being up to get file attributes for files 
              that are in cache. 
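For example, adding local-access to the earlier mount is just a matter of extending the option list (the server and paths are the same example ones used above):

    mount -F cachefs -o backfstype=nfs,cachedir=/var/cache,local-access \
        server1:/file/tree /file/tree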
Another pair of handy options is demandconst and noconst,
              which affect the way the cachefs validates the contents of 
the cache against the back file system server. Normally, the cachefs
automatically validates the contents of the cache against the back
file system server at regular intervals. By using the demandconst mount
              flag, you can indicate to the cachefs that validation will 
              be done manually using the cfsadmin -s command. The noconst 
              mount option tells the cachefs that the cache will never 
              be validated against the back file system server. 
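As a rough sketch of how the manual approach hangs together, the client mounts with demandconst and a consistency check is then requested whenever it suits you (again using the example server and paths from above):

    mount -F cachefs -o backfstype=nfs,cachedir=/var/cache,demandconst \
        server1:/file/tree /file/tree

    # later, once the files on server1 are known to have changed:
    cfsadmin -s /file/tree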
Either of these mount options is useful if the files on the back
              file system are modified infrequently. With the demandconst 
              mount option, the clients can be instructed to revalidate their 
              caches after the changes have been made. With the noconst 
mount option, the affected files must be cleared from the client's
cache before the updates will flow through. Note that with both
              the automatic validation and the demandconst mount option, 
              if the back file system server is down when the cache object is 
              validated, then the cachefs will remove the object from the 
              cache. Clearly, this is undesirable if the primary reason for running 
              the cachefs is to provide some resiliency in the server mount. 
              An approach to this problem is to use the demandconst mount 
              option on the client and to either NFS ping the server prior 
              to requesting an update to ensure the validation will work, or to 
              make the server send a signal out to inform the clients to revalidate 
their cached files. Fortunately, this is not necessary.
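Should you ever want such a belt-and-braces check anyway, one rough way to do the "NFS ping" is to make sure the server's NFS service answers a null RPC call before asking for the revalidation; rpcinfo will do that for you. A minimal sketch, assuming the same server and mount point as above:

    # only ask for revalidation if server1's NFS service answers
    if rpcinfo -u server1 nfs > /dev/null 2>&1; then
        cfsadmin -s /file/tree
    fi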
              Sun has built a mount option into the cachefs called disconnectable. 
              This option is only available when the back file system type is 
              NFS. The disconnectable option is very poorly documented -- 
              it does not appear in the man pages for mount_cachefs, nor 
              is there a man page for the daemon called cachefsd that supports 
              the disconnectable mode. I found out about this mode by chance when 
              I was searching the Sun support pages looking for patches that I 
might need to apply to my system. I found infodoc number
21701, which provided information on how to set up the cachefs
              in disconnectable mode. The procedure is quite simple. You create 
              the directory: 
              
             /etc/fs/cachefs
and add the mount option "disconnectable" to the 
            cachefs mount command. The mount command now looks like:  
    mount -F cachefs -o backfstype=nfs,cachedir=/var/cache,disconnectable \
        server1:/file/tree /file/tree

Or, if you want the file system to be mounted automatically when the
machine boots, then add a line like the one below to the /etc/vfstab file:

    server1:/file/tree /var/cache /file/tree cachefs - yes \
        backfstype=nfs,cachedir=/var/cache,disconnectable,local-access

Those who are familiar with the syntax of the vfstab entry will note
that where there is normally a raw device name, we have the directory
/var/cache. This field is used during boot to fsck the file system, so
for a cachefs file system we must provide the cache directory for
fsck_cachefs to operate on. To properly implement the disconnectable
mode, you will need to make an entry in /etc/vfstab and either run the
cachefs scripts from /etc/init.d or simply reboot the system, because
there is a supporting daemon, cachefsd, that needs to be run.
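Once the file system has been mounted this way, it is worth checking that the supporting pieces really are in place; for example, after an access to the mount point you should be able to see the cachefsd daemon running and the file system registered in the cache:

    ls /file/tree > /dev/null
    ps -ef | grep cachefsd
    cfsadmin -l /var/cache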
  Understanding Disconnectable Mode 
              The disconnectable mode changes the behavior of the cachefs 
              in some subtle ways. First, the back file system is mounted in what 
              Sun terms a "semi-soft" mode, which allows the back file 
              system access to fail without hanging (much like a soft NFS mount). 
              However, if the accessed file exists in the cache, then the requestor 
will receive the cached copy of the file instead of the error
that would be returned if the mount were a normal NFS soft mount. In disconnectable
mode, the cachefs will block writes to the file system, just as
a normal hard NFS mount would. Second, the operating system starts a cachefsd
              daemon on the first access to the cachefs filesystem. This 
              daemon monitors the link to the back file system server and will 
              manage the process of keeping the client cache in sync with the 
              files on the back file system server. 
By using the disconnectable mount option, the behavior of
file attribute fetching changes as well. If the back file system
              server is down, then the cachefs will wait for the RPCs to 
              the server to time out, but rather than returning an error, the 
              cachefs detects the failure and offers the file attributes 
              from the cached object instead. You can speed things up in disconnectable 
mode by also using the local-access mount flag to prevent the cachefs
              from checking the file attributes on the server. This provides a 
              noticeable improvement if the server is down. Another advantage 
              of the cachefs in disconnectable mode is that it supports 
              writes to the file system, even when the back file system server 
              is down, but only if you mount the cachefs with the non-shared 
              mount option. This option indicates that only one client will modify 
              the files on the server. In this manner, Sun has neatly avoided 
              the problem of multiple clients modifying a file when they are disconnected 
              from the server by limiting who can modify the files. If you do 
              not have the non-shared mount flag, then attempts to write to the 
              file system when the back file system server is down will result 
              in the write blocking until the server comes back up. 
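If writes during an outage matter to you, the mount ends up looking something like this (just a sketch combining the options discussed so far, with the same example server and paths):

    mount -F cachefs -o backfstype=nfs,cachedir=/var/cache,disconnectable,non-shared \
        server1:/file/tree /file/tree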
              What to Watch For 
As with any solution, the cachefs approach has problems.
              If the files you are interested in are not in the local cache when 
              the server is down, then you will get a read failure on those files. 
              To make matters worse, there is no way to tell the cachefs 
system which files should be held in the client cache except
              by performing a read of each file that is deemed critical. You also 
              must provision sufficient client-side storage to accommodate all 
              the files that you want available during a possible server outage. 
Normally, you do not need to match the server's storage capacity
byte for byte on the client. It is expected that the client will only
              use a small subset of the server files, and some objects in the 
              cachefs can be purged to make room for newer objects. If 
              you use the cachefs as a failover store, then you must ensure 
              there is enough client-side storage for the cachefs cache 
to hold all the files that you need during a server outage. In addition,
the limit on the number of clients that can update the file
system (although it does avoid the problem of conflicting updates)
may be problematic in some circumstances. Also watch the default
              size limit for caching files in the cachefs. This limit is 
              set at 3 MB, but can be increased by using the cfsadmin command 
              like this: 
              
             cfsadmin -o maxfilesize=5 /var/cache
The effect of this command is to increase the maximum size 
            limit of the files cached by the cachefs to 5 MB. This limit 
            may need to be tweaked to cope with the largest file that is required 
            to be cached so that it is available during a server outage. In addition 
            to the file size limit, there are a lot of options available in cfsadmin 
            that control the resource utilization of the cachefs. Details 
            about these options are outside the scope of this article, but if 
            you are planning to cache a large number of files, check the man page 
            of cfsadmin to ensure that all the files you require to be 
            cached will get cached without exceeding the default resource limits. 
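Because the only way to get a critical file into the cache is to read it, it can be worth priming the cache from a small script once the mount is in place. The following is only a sketch; /file/tree/critical is a hypothetical directory holding the files you care about, and you should check that the cfsadmin resource limits cover their total size:

    #!/bin/sh
    # Prime the cachefs cache by reading every critical file once;
    # the data is thrown away, but copies end up in /var/cache.
    find /file/tree/critical -type f -exec cat {} \; > /dev/null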
            Finally, when the cachefs is in disconnectable mode, the cachefs 
behaves as if the demandconst mount flag were being used. That
            is, changes to the back file system are not propagated unless you 
            run:  
             cfsadmin -s all
This behavior is odd, but it can be worked around by putting 
            a cron job onto the client system to periodically run the cfsadmin 
            command. If the server is down, the cfsadmin will quit without 
            doing anything. Note the difference disconnectable mode makes here 
            -- if you just use demandconst and run the cfsadmin 
            -s command when the server is down, the contents of the cachefs 
will be flushed, but in disconnectable mode the files stay intact.
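For example, a root crontab entry along the following lines revalidates all cachefs mounts once an hour (the path to cfsadmin may vary between Solaris releases, so check where it lives on your system):

    # revalidate all cachefs mounts at the top of every hour
    0 * * * * /usr/sbin/cfsadmin -s all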
Conclusion
That about does it for cachefs, which gives you
              the option to use some local machine resources to provide a way 
              to ride out server outages. This approach is best suited to situations 
              where you have a small number of files necessary for the local machine 
operation. With larger file trees, more time will be spent validating
the cachefs contents, and the validation will also generate more network
traffic, which may make the cachefs
              approach undesirable. 
              Furthermore, if you are planning to use the cachefs, check 
              that you have the appropriate patches for your version of Solaris. 
              Even though cachefs has been available since Solaris 2.3, 
              the disconnectable mode was a later addition, which may require 
              a patch to your system. In my next article, I will cover a more 
              traditional way of implementing failover by using replicated servers. 
              Brett Lymn works for a global company helping to manage a bunch 
              of UNIX machines. He spends entirely too much time fiddling with 
              computers but cannot help himself. 