Real-Time Remote Data Mirroring
Primitivo Cervantes
              When planning for a disaster, there are many issues to consider 
              and resolve. Some of these issues are resolved with the use of a 
              remote location as an emergency production site in case the primary 
              site is unavailable due to a disaster. In this situation, a remote 
              data center is set up with systems similar to the primary production 
              systems. This remote disaster recovery site is usually connected 
              to the primary site via a network.
              Even with a remote disaster recovery site, there are still many 
              issues to consider. The remote site is usually not in use during 
              normal production hours and is a cost center until it is used as 
a recovery center. You must determine, for example, whether you want a system identical to the one at the primary site or whether you can make do with reduced performance during an emergency. You must decide whether
              you can afford to lose some data in an emergency or whether you 
              need all of the most current data when bringing the systems online 
              at the remote site. You must also consider how current the data 
              must be and how much you are willing to pay for it. In this article, 
              I will discuss various data-replication techniques and their associated 
              advantages and disadvantages. I will also describe costs associated 
              with these techniques and describe cases where it might be justifiable 
              for them to be implemented. I will also describe an installation 
              of one of these techniques, using IBM's HAGEO product to replicate 
              data in real-time to a remote site.
              Before I get into the data-mirroring techniques, I will explain 
              the types of data that clients usually mirror. There are three basic 
              types of data that are usually required to run a customer application: 
              operating system data, database data, and non-database application 
              data. Operating system data consists of the programs and files needed 
to run the operating system, such as IBM's AIX. Database data is the data contained in a client's application, typically managed by a database manager such as Oracle, Sybase, or Informix. Non-database data consists of the client application's executables, configuration files, and other files needed to run the application (such as Oracle Financials, SAP, or PeopleSoft).
A system backup is typically used to back up and replicate the
              operating system data, so this will not be covered in this article. 
              I will discuss only techniques to mirror database data and non-database 
              application data.
              
            Remote Data Mirroring Techniques
              System Tools
              Historically, data has been mirrored to a remote location using 
              operating system tools such as ftp, rcp, uucp, 
              etc. These tools have distinct advantages in that they are available 
              with the operating system, easy to use, and very reliable.
              Typically, a system backup is used to replicate a system at a 
              remote location. An initial database backup is created and restored 
              at the remote location. From this point on, the only things that 
              need to be sent to the remote location are the database archive 
              logs and any changed application executable or configuration file. 
              By sending only the database logs to the remote location and then 
              using the database to "play the logs" forward, the database 
              at the remote location can be kept consistent but will always be 
              behind the primary site. In other words, we send changes to the 
              database to the remote location, but the most current changes are 
              not sent.
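As a concrete (and simplified) sketch of this approach, the fragment below dumps a Sybase transaction log, ships it to the remote site with a standard system tool, and notes how it would be applied there. All server names, the host "drsite", the database name, and the paths are hypothetical examples, not values from a real installation:

#!/bin/ksh
# Hypothetical log-shipping script using only standard system tools.
STAMP=$(date +%Y%m%d%H%M)
DUMP=/backups/labordb_tlog.$STAMP

# 1. Dump the transaction log on the primary database server.
isql -Usa -SPRIMARY <<EOF
dump transaction labordb to "$DUMP"
go
EOF

# 2. Ship the dump to the disaster-recovery site.
rcp $DUMP drsite:$DUMP

# 3. At the remote site (typically from cron), "play the log forward":
#
#      isql -Usa -SSTANDBY
#      1> load transaction labordb from "/backups/labordb_tlog.<stamp>"
#      2> go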
              The main disadvantage of these tools is that the remote site is 
              not always as current as the primary site (behind in database transactions). 
This is usually reason enough to prevent many clients, such as banks and hospitals, from using these tools. These clients demand a real-time
              mirroring solution. 
              Built-In Database Tools
              Database vendors have been adding and enhancing remote mirroring 
              capabilities to their products, and this is another way of mirroring 
              data to a remote location. Using the database vendor products, the 
              database data can be mirrored and kept consistent across a geography. 
              The data at the remote location is just as current as the primary 
              site.
              The main advantages of database vendor tools are that they are 
              available with the database at a low cost and are guaranteed to 
              work by the vendor. If there is a problem, the vendor has a support 
              structure that can help the client determine the cause and also 
              provide a fix, if necessary.
              The main disadvantages of database vendor tools are that they 
              replicate only the database data and not application data. They 
are also sometimes difficult to use and may require that the application's source code be modified and recompiled.
              Hardware-Based Tools
              While database vendor tools are specific to the database vendor, 
              hardware-based solutions aim at providing data mirroring regardless 
              of the application or database vendor. One way to use hardware to 
              mirror data is with a storage system from a storage vendor, such 
              as EMC or IBM. EMC has long had remote mirroring capabilities, and 
              IBM recently announced (Dec. 2000) mirroring capabilities in some 
              of their storage products. These storage products allow you to mirror 
              data from one storage system to another storage system at a remote 
              site regardless of the data being mirrored. This mirroring is done 
              in real-time and works very well. Because these hardware storage 
              systems can use large amounts of cache, system performance can be 
              very good.
              In cases where there are long distances (greater than 60 miles) 
              between sites, EMC uses three storage systems in series to mirror 
              data. A primary storage system is mirrored in real-time to a secondary 
              system at a nearby remote location (usually less than 10 miles). 
              This secondary system then mirrors the data to the third storage 
              system at best speed. This means the third storage system is not 
              as current as the first storage system; but no data is lost, because 
              it will eventually be fully synchronized from the second storage 
              system.
              IBM's mirroring solution is relatively new, so I have not 
              had a chance to implement it at long distances. IBM says the mirroring 
              capabilities will work at longer distances than EMC (about 100 miles).
              Another consideration with the hardware solution is that performance 
              degrades significantly as the distance between sites increases. 
I have seen write times increase by 4000% with as little as 10 miles between sites. This is probably a worst case, but it is nonetheless significant and should be considered.
              The main disadvantage of these hardware solutions is the cost. 
              A single EMC or IBM storage system can easily cost $750K and the 
              long-distance mirroring solutions will certainly be in the millions. 
              Even so, for clients that need this additional safety net, I think 
              the hardware solution will eventually provide the best combination 
              of performance and data integrity. A hardware solution will probably 
              be the most expensive but provide the most functionality.
              IBM's GEOrm or HAGEO
              If the client application happens to be running on an IBM AIX 
              system, there are a couple of software products from IBM, called 
              HAGEO and GEOrm, that could be used for real-time data mirroring.
              IBM's HAGEO and GEOrm actually mirror logical volumes from 
              one site to a remote site. They provide device drivers that intercept 
              disk writes to the AIX logical volumes and mirror this data in real-time 
              to the remote location.
              The difference between the HAGEO product and the GEOrm product 
              is that the HAGEO product also integrates with IBM's high-availability 
              product HACMP. The HACMP product detects system failures and allows 
              a secondary system to take over an application should the primary 
              system fail. With the HACMP/HAGEO product combination, if your primary 
              system fails, the secondary system will automatically detect this 
              failure and take over the application, usually within half an hour 
              or so.
              IBM typically charges anywhere from $100K to $400K in software 
              and services to implement a two-system HACMP/HAGEO cluster. Implementing 
              GEOrm does not involve HACMP, so it usually costs less (anywhere 
              from $40K to $150K). These are all ballpark figures, and you should 
              contact an IBM representative for an actual quote.
              The main advantage of the HAGEO and GEOrm solutions is that they 
              are software based and work regardless of the database vendor or 
application. The main disadvantages are that these products usually require a more highly skilled systems administrator to maintain and that they have system performance implications. Because the data mirroring is done in real time and without a large cache, it can have a high impact on system performance. Also, this solution is only available on IBM AIX systems.
              An HAGEO Implementation
              I have implemented and managed several HACMP/HAGEO projects, so 
              I will relate my experience with this product at a utility company 
              in California. In this example, the company had two systems, the 
              primary one located in Los Angeles County and the secondary one 
in Orange County. They were using IBM SP2 systems (the same type that defeated Kasparov at chess). The systems chosen used multiple
              processors and were several times faster than the single processor 
              systems they were replacing. Everyone was initially optimistic that 
              performance was not going to be an issue.
              The disk drives were IBM SSA drives (only 14 4-GB disks in all). 
The data was locally mirrored, so the usable capacity was seven 4-GB disks, a relatively small application. Note that we configured HAGEO in
              its most secure mode of mwc (called mirror-write-consistency). This 
              required us to create logical volumes (called state maps) to use 
              with HAGEO, so we had to allocate two of these drives exclusively 
              for HAGEO use.
              The systems were connected to the network with two 10-Mb Ethernet 
              adapters for the data mirroring (between the sites) and one 10-Mb 
              Ethernet adapter for the client traffic (where users were going 
              to be logging in).
For ease of debugging problems, we chose the following schedule:

1. Set up systems (local and remote sites) and get the operating system working.
2. Install the application and Sybase database on one site.
3. Verify that the application works correctly.
4. Install HACMP/HAGEO.
5. Initiate HACMP/HAGEO mirroring capabilities.
6. Verify that the application still works correctly and that the data is mirrored to the remote location.
  Initial Application Installation (Before HACMP/HAGEO)
              During the initial application installation (Steps 1-3 above), 
              the application AIX logical volumes and filesystems were created 
              and the application software installed on the primary system. This 
              was done without HACMP/HAGEO, so the application installation processes 
              were typical of any IBM AIX system with that application. The application 
              was a custom labor management system using a Sybase database.
              Up until this point, the overall cluster looked like Figure 1. 
              In this diagram, the two systems are installed and have all network 
              connections, but the application is only installed and running on 
              one system. The application and data will eventually be mirrored 
              to the remote system, so it is not necessary to install it on both 
              systems.
              Note that the AIX logical volumes (LVs) have been defined to contain 
              the application and data. Sybase knows about them and reads and 
              writes to the LVs directly. These LVs are defined as:
              
/dev/appslv00 -- Where the Sybase DBMS and application files are located
/dev/sybaselv01 -- Sybase database data
/dev/sybaselv02 -- Sybase database data
/dev/sybaselv03 -- Sybase database data
/dev/sybaselv04 -- Sybase database data
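For reference, creating logical volumes like these takes only a few AIX commands. The volume group name "appsvg" comes from the HACMP resource definitions shown later, but the partition counts and the /apps mount point below are assumptions for illustration:

# Raw logical volumes for the Sybase devices (Sybase writes to these directly).
mklv -y sybaselv01 -t raw appsvg 64
mklv -y sybaselv02 -t raw appsvg 64
mklv -y sybaselv03 -t raw appsvg 64
mklv -y sybaselv04 -t raw appsvg 64

# Logical volume and JFS filesystem for the Sybase binaries and application files.
mklv -y appslv00 -t jfs appsvg 32
crfs -v jfs -d appslv00 -m /apps -A yes
mount /apps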
  
              There are two Ethernet adapters for the user or client network. 
              One of these adapters is the HACMP "service" adapter to 
              which the clients connect. The other adapter is the HACMP "standby" 
              adapter used by HACMP as a backup adapter if the primary adapter 
              fails.
              There are two Ethernet adapters for the HAGEO mirroring function. 
              HAGEO will load balance between the two adapters, so there is no 
              need for a standby adapter in this network.
In the following discussion, I will give the actual commands we used to configure HACMP and, lastly, show how we configured and started the HAGEO mirroring.
              Installing HACMP and HAGEO
              Installing HACMP and HAGEO (Step 4 above) is a simple process 
              of loading the CDs on the system and running SMIT to install the 
              base HACMP and HAGEO packages. To do this on the command line, run 
              the following commands:
              To install HACMP (all on one line):
              
             
/usr/lib/instl/sm_inst installp_cmd -a -Q -d '/dev/cd0' -f 'cluster.base ALL  @@cluster.base _all_filesets' '-c' '-N' '-g' '-X' '-G'

To install HAGEO (all on one line):

/usr/lib/instl/sm_inst installp_cmd -a -Q -d '.' -f 'hageo.man.en_US ALL  @@hageo.man.en_US _all_filesets,hageo.manage ALL  @@hageo.manage _all_filesets,hageo.message ALL  @@hageo.message _all_filesets,hageo.mirror ALL  @@hageo.mirror _all_filesets' '-c' '-N' '-g' '-X' '-G'
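A quick way to confirm that the filesets installed cleanly is to list them with lslpp; the fileset name patterns below follow the installp commands above:

lslpp -l "cluster.base*"
lslpp -l "hageo.*"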
Configuring HACMP
Configuring HACMP and HAGEO (Step 5 above) requires that you enter all of the cluster information into HACMP and HAGEO. Most of this information is shown in Figures 1 and 2. In this particular case, the information needed to configure HACMP and HAGEO is:

1. Create the HACMP cluster.
2. Add the node names and IP addresses.
3. Create the HACMP resource group.
4. HAGEO geo-mirror device information.
5. Activate the mirroring capabilities.

To configure the HACMP cluster:

/usr/sbin/cluster/utilities/claddclstr -i'1' -n'hageo1'
To configure the HACMP nodes and IP addresses:  
             
/usr/sbin/cluster/utilities/clnodename -a 'labor1'
/usr/sbin/cluster/utilities/clnodename -a 'labor2'
When configuring the IP addresses, note that there are two addresses 
            associated with the primary adapter on each system (en0). One of these 
            addresses is the "service" address or the address to which 
            the clients connect. The second address is a "boot" address 
            used only when the system boots up. This is because the "service" 
            address will move from the primary system (labor1) to the standby 
            system (labor2) during a primary system failure. To bring up "labor1" 
            when this happens, we must have an address for the system, so it does 
            not conflict with the "service" address that just moved 
            over to "labor2". This is the "boot" address.  Figures 1 and 2 do not show the interface names that resolve to 
              the IP addresses. These are as follows:
              
             
10.25.172.32    labor1svc
10.25.172.31    labor1stby
10.25.182.32    labor1boot
10.25.172.42    labor2svc
10.25.172.41    labor2stby
10.25.182.42    labor2boot
To configure these IP addresses in HACMP:  
             
/usr/sbin/cluster/utilities/claddnode -a'labor1svc' :'ether' :'ether1' \
  :'public' :'service' :'10.25.172.32' :'0004ac5711aa' -n'labor1'
/usr/sbin/cluster/utilities/claddnode -a'labor1stby' :'ether' :'ether1' \
  :'public' :'standby' :'10.25.172.31' : -n'labor1'
/usr/sbin/cluster/utilities/claddnode -a'labor1boot' :'ether' :'ether1' \
  :'public' :'boot' :'10.25.182.32' : -n'labor1'
/usr/sbin/cluster/utilities/claddnode -a'labor2svc' :'ether' :'ether1' \
  :'public' :'service' :'10.25.172.42' :'0004ac5711bb' -n'labor2'
/usr/sbin/cluster/utilities/claddnode -a'labor2stby' :'ether' :'ether1' \
  :'public' :'standby' :'10.25.172.41' : -n'labor2'
/usr/sbin/cluster/utilities/claddnode -a'labor2boot' :'ether' :'ether1' \
  :'public' :'boot' :'10.25.182.42' : -n'labor2'
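At this point, it is worth confirming that HACMP recorded the topology the way you intended. On the HACMP releases I have worked with, the cluster utilities directory includes commands to list the cluster and adapter definitions:

/usr/sbin/cluster/utilities/cllsclstr     # show the cluster ID and name
/usr/sbin/cluster/utilities/cllsif        # list the adapters just added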
Creating the HACMP Resource Groups
The application resource information comprises all the system
              resources that HACMP uses to manage an application. In other words, 
              this is everything needed to start or stop the application. In our 
              case, we will need the AIX logical volumes (and AIX volume group) 
              associated with the application, the IP address the users log into, 
              and a script to start and stop the application.
              We created three resource groups, one for "labor1", 
              one for "labor2", and one for moving the application from 
              site "losangeles" to site "orange".
              Here are the commands we used to configure HACMP. To create the 
              HACMP "resource groups":
              
             
/usr/sbin/cluster/utilities/claddgrp -g 'resourcegroup1' -r 'cascading' \
  -n 'losangeles orange'
/usr/sbin/cluster/utilities/claddgrp -g 'resourcegroup2' -r 'cascading' \
  -n 'labor1'
/usr/sbin/cluster/utilities/claddgrp -g 'resourcegroup3' -r 'cascading' \
  -n 'labor2'

To create the HACMP "application server" that contains the start and stop script information:

/usr/sbin/cluster/utilities/claddserv -s'appserver1' \
  -b'/apps/hacmp/start_script' -e'/apps/hacmp/stop_script'
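HACMP stores only the paths to these scripts; their contents are entirely site-specific. The following is a minimal sketch of what a start script such as /apps/hacmp/start_script might contain for this Sybase-based application; the Sybase directories, RUN server file, user names, and application command are assumptions for illustration only:

#!/bin/ksh
# /apps/hacmp/start_script -- hypothetical sketch only.
# HACMP runs this after it has acquired the volume group, mounted
# the filesystems, and brought up the service IP address.

SYBASE=/apps/sybase                      # assumed Sybase directory
export SYBASE

# Start the Sybase server in the background.
su - sybase -c "$SYBASE/install/startserver -f $SYBASE/install/RUN_LABORDB" &

# Start the labor-management application (placeholder command).
su - laborapp -c "/apps/labor/bin/start_labor" &

exit 0

The stop script would do the reverse: shut down the application, then shut down Sybase, and exit 0 so that HACMP can continue releasing the resources.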
To put all of the system resources in the HACMP resource group:  
             
/usr/sbin/cluster/utilities/claddres -g'resourcegroup1' SERVICE_LABEL= FILESYSTEM= \
  FSCHECK_TOOL='fsck' RECOVERY_METHOD='sequential' EXPORT_FILESYSTEM= MOUNT_FILESYSTEM= \
  VOLUME_GROUP='appsvg' CONCURRENT_VOLUME_GROUP= DISK= AIX_CONNECTIONS_SERVICES= \
  AIX_FAST_CONNECT_SERVICES= APPLICATIONS='appserver1' SNA_CONNECTIONS= MISC_DATA= \
  INACTIVE_TAKEOVER='false' DISK_FENCING='false' SSA_DISK_FENCING='false' FS_BEFORE_IPADDR='false'
/usr/sbin/cluster/utilities/claddres -g'resourcegroup2' SERVICE_LABEL='labor1svc' FILESYSTEM= \
  FSCHECK_TOOL='fsck' RECOVERY_METHOD='sequential' EXPORT_FILESYSTEM= MOUNT_FILESYSTEM= \
  VOLUME_GROUP= CONCURRENT_VOLUME_GROUP= DISK= AIX_CONNECTIONS_SERVICES= \
  AIX_FAST_CONNECT_SERVICES= APPLICATIONS= SNA_CONNECTIONS= MISC_DATA= INACTIVE_TAKEOVER='false' \
  DISK_FENCING='false' SSA_DISK_FENCING='false' FS_BEFORE_IPADDR='false'
/usr/sbin/cluster/utilities/claddres -g'resourcegroup3' SERVICE_LABEL= FILESYSTEM= \
  FSCHECK_TOOL='fsck' RECOVERY_METHOD='sequential' EXPORT_FILESYSTEM= MOUNT_FILESYSTEM= \
  VOLUME_GROUP='appsvg' CONCURRENT_VOLUME_GROUP= DISK= AIX_CONNECTIONS_SERVICES= \
  AIX_FAST_CONNECT_SERVICES= APPLICATIONS= SNA_CONNECTIONS= MISC_DATA= INACTIVE_TAKEOVER='false' \
  DISK_FENCING='false' SSA_DISK_FENCING='false' FS_BEFORE_IPADDR='false'
After configuring HACMP, it's time to configure HAGEO.

Configuring HAGEO
The HAGEO configuration involves some basic steps:

1. Import the HACMP configuration.
2. Start GEOmessage.
3. Configure the actual mirroring device drivers.
4. Start the mirroring process.

To import the HACMP configuration:

/usr/sbin/krpc/krpc_migrate_hacmp

To start GEOmessage:

/usr/sbin/krpc/cfgkrpc -ci
Configuring HAGEO Mirroring Device Drivers
I mentioned previously that Sybase knows about the AIX logical volumes and reads and writes directly to them. To configure the HAGEO device drivers and have the application work without reconfiguration, we renamed the AIX logical volumes and created the HAGEO devices using the original AIX logical volume names.
Look at Figure 2 and see that the AIX logical volumes were renamed using the following names:

/dev/appslv00 renamed to /dev/appslv00_lv
/dev/sybaselv01 renamed to /dev/sybaselv01_lv
/dev/sybaselv02 renamed to /dev/sybaselv02_lv
/dev/sybaselv03 renamed to /dev/sybaselv03_lv
/dev/sybaselv04 renamed to /dev/sybaselv04_lv

We then created the GEOmirror devices (GMDs) with the previous AIX LV names (the ones Sybase is configured for):

/dev/appslv00
/dev/sybaselv01
/dev/sybaselv02
/dev/sybaselv03
/dev/sybaselv04
Here's the command to create the GMD "/dev/appslv00" on "labor1":

mkdev -c geo_mirror -s gmd -t lgmd -l'appslv00' '-S' -a minor_num='1' \
  -a state_map_dev='/dev/appslv_sm' -a local_device='/dev/rappslv00_lv' \
  -a device_mode='mwc' -a device_role='none' \
  -a remote_device='labor2@/dev/rappslv00_lv'

where "appslv00" is the GMD name, "/dev/appslv_sm" is an AIX logical volume used as a log (called a "state map"), "/dev/rappslv00_lv" is the local AIX logical volume being mirrored, and "labor2@/dev/rappslv00_lv" is the remote AIX logical volume mirror target.
The other GMDs were created pointing to the appropriate logical volumes:
              
             
/dev/sybaselv01 (mirrored the local and remote /dev/sybaselv01_lv)
/dev/sybaselv02 (mirrored the local and remote /dev/sybaselv02_lv)
/dev/sybaselv03 (mirrored the local and remote /dev/sybaselv03_lv)
/dev/sybaselv04 (mirrored the local and remote /dev/sybaselv04_lv)
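These remaining GMDs can be created with the same mkdev call used for appslv00, varying only the GMD name, minor number, and underlying logical volumes. Here is a sketch of how that can be scripted; the per-GMD state-map names and the minor numbers are my assumptions, not values taken from the actual cluster:

minor=2
for lv in sybaselv01 sybaselv02 sybaselv03 sybaselv04
do
    # Same options as the appslv00 example; only the name, minor
    # number, state map, and local/remote LVs change.
    mkdev -c geo_mirror -s gmd -t lgmd -l"$lv" '-S' \
        -a minor_num="$minor" \
        -a state_map_dev="/dev/${lv}_sm" \
        -a local_device="/dev/r${lv}_lv" \
        -a device_mode='mwc' \
        -a device_role='none' \
        -a remote_device="labor2@/dev/r${lv}_lv"
    minor=$((minor + 1))
done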
Almost Ready to Start HAGEO Mirroring
After configuring HACMP and the HAGEO devices, we are almost ready to start mirroring across the WAN. Before doing this, however, we need to tell HAGEO which copy of the data is the good one, using the gmddirty and gmdclean commands. By marking one side as "dirty" and the other side as "clean", we are telling HAGEO that all of the data needs to be copied from one site to the other. HAGEO moves data from the "dirty" site to the "clean" site. To mark a site as "dirty", that is, the site to copy from (in this case "labor1"):
              
             
/usr/sbin/gmd/gmddirty -l appslv00 (and do this for all of the other GMDs)

To mark a site as "clean", that is, the site to copy to (in this case "labor2"):

/usr/sbin/gmd/gmdclean -l appslv00 (and do this for all of the other GMDs)
Starting the HAGEO Mirroring
Now that we have configured HACMP/HAGEO, all we have to do is
              start HACMP. HACMP will then start the HAGEO GMDs and commence the 
              mirroring.
              Here's the command to start HACMP:
              
             
/usr/sbin/cluster/etc/rc.cluster -boot '-N' '-b'  '-i'
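Once rc.cluster completes, a couple of quick checks show whether the cluster daemons are up and the GMDs are configured; the device class "geo_mirror" matches the mkdev commands above:

lssrc -g cluster          # the HACMP subsystems should be active
lsdev -C -c geo_mirror    # the GMDs should show as Available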
Final HAGEO Configuration
The outcome of all this is shown in Figure 2. If you look closely at the Sybase DBMS, you can see that it is writing to what it thinks are the AIX logical volumes. Sybase is actually sending the data to the HAGEO
              GMDs. The HAGEO GMDs are sending the data to the AIX logical volumes 
              and to the remote site. This is remote mirroring using HAGEO.
              Performance Issues
As far as the configuration and mirroring functionality was concerned, everything went smoothly. However, we ran into severe performance problems. Again, the systems that we installed were several times faster (using multiple processors) than the previous system, so we did not expect performance problems from the systems themselves. We did not know the application's write characteristics, however, so we expected some performance issues, but not severe ones.
              We were surprised that many aspects of the application were single-threaded 
              and did not utilize the multiple processors of the system. A good 
              deal of work had to be done by the customer and application vendor 
              to streamline the application, and in some cases change it, so it 
              could use multiple processors. This work was unexpected and caused 
              several months of delay in the implementation of the solution.
              Another performance issue was the use of the state maps by the 
              HAGEO product. The state maps are AIX logical volumes that are used 
              to log changes to the application logical volumes. This allows HAGEO 
              to keep track of changes to the logical volumes and synchronize 
              only the changes. We had initially thought that scattering the state 
              maps across drives would give us good performance. After some testing, 
we discovered that better performance was achieved by placing the state maps together on a single drive (with no application logical volumes on it). This change alone gave us a 20% improvement in
              the performance tests that we were using (which were specific to 
              this application).
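To illustrate, the placement that worked best is simply a matter of creating all of the state-map logical volumes on one dedicated physical disk; the disk name (hdisk7), the volume group, and the partition counts below are assumptions for illustration:

mklv -y appslv_sm     -t raw appsvg 4 hdisk7
mklv -y sybaselv01_sm -t raw appsvg 4 hdisk7
mklv -y sybaselv02_sm -t raw appsvg 4 hdisk7
mklv -y sybaselv03_sm -t raw appsvg 4 hdisk7
mklv -y sybaselv04_sm -t raw appsvg 4 hdisk7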
              After the performance testing and changes, we were able to mirror 
              data across the sites with performance levels satisfactory to the 
customer. The performance was actually only slightly better than that of their old systems, but with more data integrity. Also, with the integration
              of HAGEO with IBM's HACMP product, when the primary system 
              failed, the secondary system could automatically detect the error 
              and be up and running with the customer application within 20 minutes.
              This cluster has been in production for a couple of years now, 
              and has worked well. Again, it has required good systems administration 
              skills to maintain, but in the several cases where the primary system 
              has failed (for hardware reasons mostly), the secondary system successfully 
              took over the workload within the expected amount of time. This 
              particular configuration cost the customer approximately $500K (not 
              including application changes).
              Summary
              I have presented several methods of mirroring data between sites 
focusing on the real-time mirroring techniques. This is by no means a detailed or all-inclusive summary, but it describes the most common techniques for achieving this function.
              If the database vendors could make the data replication capabilities 
              of their products easier to implement, there would certainly be 
              a lot more usage of their products. Hardware storage vendor solutions 
              currently provide very good performance at short distances and probably 
              have the best potential of all of the solutions in terms of functionality. 
              They also are the most expensive.
              I also discussed IBM's HAGEO software solution as implemented 
in a real-life client situation. This provided the functionality the customer was looking for, along with performance that was acceptable to its users.
              Primitivo Cervantes is an IT Specialist who has worked as a 
              consultant for the last nine years. He has been in the computer/systems 
              industry for fifteen years and has specialized in high-availability 
              and disaster-recovery systems for the last seven years.