DRBD on ClearOS 6.X
DRBD is a data replication engine that works well in redundant ClearBOX configurations. With DRBD you can replicate the data storage so that in the event of a failover, data from the volumes will be available to the redundant server.
Getting started
Hardware
ClearBOX 300 is an appropriate platform for DRBD because it has enough network interfaces to handle data replication and the heartbeat engine required for failover. For greater storage, use ClearBOX 400 or contact ClearCenter for additional hardware options.
At a minimum, you will need a connection for:
DRBD connection requirements |
---|
LAN-facing interface. This is where you will offer services. |
Replication interface. This should be a dedicated connection used only for data transfer. |
Heartbeat interface. This supports the heartbeat services and can be a NIC or a serial interface; using both is preferred. |
You will need to designate a box to be your primary and a separate box to be your secondary. The disk(s) of the secondary box must meet or exceed the performance of the primary so that replication will not bottleneck at the secondary. A backlog of data means your data is NOT well replicated.
Demonstration
For this demonstration, we will be using two boxes with 3 network cards each. The servers are set to standalone mode.
- eth0
  - Role: External
  - Network facing interface
- eth1
  - Role: LAN
  - Sync network. Crossover between boxes.
- eth2
  - Role: LAN
  - Heartbeat network. Crossover between boxes.
The address designations for these boxes in our example are as follows:
- server1
  - eth0: 10.10.10.141
  - eth1: 192.168.40.40
  - eth2: 192.168.41.40
- server2
  - eth0: 10.10.10.238
  - eth1: 192.168.40.41
  - eth2: 192.168.41.41
Prerequisites
Ensure that your system is up to date by running the following:
yum update
Set up the hosts file on your servers so that they are the same. For example, they may look similar to this:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.10.10.141 server1.example.com server1
10.10.10.238 server2.example.com server2
192.168.40.40 server1-s.example.com server1-s sync1 server1-sync primary master
192.168.40.41 server2-s.example.com server2-s sync2 server2-sync backup slave
192.168.41.40 server1-hb.example.com server1-hb
192.168.41.41 server2-hb.example.com server2-hb
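Because both servers must resolve the same names, it can help to confirm that every name used later in this howto actually appears in the hosts file. The following is a minimal sketch, not part of ClearOS; the function name missing_hosts is hypothetical, and it only inspects a hosts-format file rather than querying DNS.

```shell
# List which of the given names have no entry in a hosts-format file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Drop comments, then look for the name in any field after the address.
        if ! awk -v want="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
                END { exit (found ? 0 : 1) }
            ' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

For example, running missing_hosts /etc/hosts server1-s.example.com server2-s.example.com server1-hb server2-hb on each server should print nothing when all four names are present.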
Software
You will need to install the DRBD software packages. These are currently maintained in a testing repository and may cause issues in a production environment. This howto is intended for experimentation and testing at this point; use it at your own risk.
Install DRBD by running the following command:
yum --enablerepo=clearos-dev install drbd
It will install other packages as well; the list should look like this:
================================================================================
 Package          Arch         Version            Repository         Size
================================================================================
Installing:
 drbd             x86_64       8.4.2-1.v6         clearos-dev        26 k
Installing for dependencies:
 drbd-udev        x86_64       8.4.2-1.v6         clearos-dev       6.0 k
 drbd-utils       x86_64       8.4.2-1.v6         clearos-dev       258 k

Transaction Summary
================================================================================
Install       3 Package(s)
Once this is installed you will be able to set up DRBD using the next section.
ClearBOX
If you are using ClearBOX, you will probably want to decouple the /store/data0 partition and reuse it for your data. Be sure to back up this partition before performing these steps.
Stop the services associated with the data partition and prevent them from starting automatically (you will likely add these services to the high availability (heartbeat) daemon later). For example:
service httpd stop
service smb stop
service nmb stop
service cyrus-imapd stop
service mysqld stop
chkconfig --level 2345 httpd off
chkconfig --level 2345 smb off
chkconfig --level 2345 nmb off
chkconfig --level 2345 cyrus-imapd off
chkconfig --level 2345 mysqld off
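If your list of services differs from the example, the same two steps can be expressed as a loop. This is only a sketch: the names DATA_SERVICES, run, and stop_and_disable_services are hypothetical, and by default the commands are printed rather than executed so they can be reviewed first.

```shell
# Services that depend on the data partition (taken from the example above).
DATA_SERVICES="httpd smb nmb cyrus-imapd mysqld"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Stop every listed service and disable it in runlevels 2-5.
stop_and_disable_services() {
    for svc in $DATA_SERVICES; do
        run service "$svc" stop
        run chkconfig --level 2345 "$svc" off
    done
}
```

Calling stop_and_disable_services prints the ten commands; setting DRY_RUN=0 before the call runs them for real.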
Unmount the bind mounts (must be logged in as root, not su):
umount /home
umount /var/spool/imap
umount /var/lib/mysql
umount /root/support
umount /var/samba/drivers
umount /var/samba/netlogon
umount /var/samba/profiles
umount /var/flexshare/shares
umount /var/www/cgi-bin
umount /var/www/html
umount /var/www/virtual
Finally, unmount the data0 partition.
umount /store/data0
You will also want to comment out the mount point for the /store/data0 partition in /etc/fstab (this will prevent the system from automatically mounting the volume on reboot). Also add the noauto option, which is needed when the line is uncommented again later:
#/dev/data/data0 /store/data0 ext3 defaults,noauto 1 2
While you are in this file, change the bind mounts to not automatically mount the devices by adding the noauto option:
/store/data0/live/server1/home /home none bind,noauto,rw 0 0
/store/data0/live/server1/imap /var/spool/imap none bind,noauto,rw 0 0
/store/data0/live/server1/mysql /var/lib/mysql none bind,noauto,rw 0 0
/store/data0/live/server1/root-support /root/support none bind,noauto,rw 0 0
/store/data0/live/server1/samba-drivers /var/samba/drivers none bind,noauto,rw 0 0
/store/data0/live/server1/samba-netlogon /var/samba/netlogon none bind,noauto,rw 0 0
/store/data0/live/server1/samba-profiles /var/samba/profiles none bind,noauto,rw 0 0
/store/data0/live/server1/shares /var/flexshare/shares none bind,noauto,rw 0 0
/store/data0/live/server1/www-cgi-bin /var/www/cgi-bin none bind,noauto,rw 0 0
/store/data0/live/server1/www-default /var/www/html none bind,noauto,rw 0 0
/store/data0/live/server1/www-virtual /var/www/virtual none bind,noauto,rw 0 0
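Editing eleven lines by hand is error-prone, so the change can be previewed with a small filter first. The sketch below is an assumption-laden helper, not a ClearOS tool: the function name add_noauto_to_bind_mounts is hypothetical, it appends noauto to the end of the option list rather than placing it after bind, and it writes the result to standard output instead of modifying the file.

```shell
# Print an fstab-format file with noauto added to every bind mount that lacks it.
# Usage: add_noauto_to_bind_mounts [FILE]   (reads standard input when FILE is omitted)
add_noauto_to_bind_mounts() {
    awk '
        # Pass comments, blank lines, and short lines through untouched.
        /^[[:space:]]*#/ || NF < 4 { print; next }
        {
            n = split($4, opts, ",")
            has_bind = 0
            has_noauto = 0
            for (i = 1; i <= n; i++) {
                if (opts[i] == "bind") has_bind = 1
                if (opts[i] == "noauto") has_noauto = 1
            }
            # Changing a field makes awk rejoin the line with single spaces.
            if (has_bind && !has_noauto) $4 = $4 ",noauto"
            print
        }
    ' "${1:--}"
}
```

A cautious way to use it is to run add_noauto_to_bind_mounts /etc/fstab > /tmp/fstab.new, compare the two files with diff, and copy the new file into place only after checking the result.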
Configuring DRBD
Disk
For DRBD to function properly, you will need a volume to replicate. For ClearBOX, we will use the partition previously used by data0 for this purpose. On ClearBOX this is a multi-disk array (this is done even on a single-drive system, for easy replacement under predictive failure). Find the data volume by running the following and looking for the largest mirror:
cat /proc/mdstat
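When the system has several arrays, the largest one can be picked out mechanically. This is a rough sketch that assumes the usual /proc/mdstat layout, in which a device line such as "md3 : active raid1 ..." is followed by a line giving the size in blocks; the function name largest_md_device is hypothetical, and it considers every array rather than only RAID 1 mirrors.

```shell
# Print the md device with the most blocks from an mdstat-format file.
# Usage: largest_md_device [FILE]   (FILE defaults to /proc/mdstat)
largest_md_device() {
    awk '
        # Device line: remember the array name.
        /^md[0-9]+ :/ { dev = $1 }
        # Size line: the first field is the block count for that array.
        $2 == "blocks" && dev != "" {
            if ($1 + 0 > max) { max = $1 + 0; best = dev }
            dev = ""
        }
        END { if (best != "") print "/dev/" best }
    ' "${1:-/proc/mdstat}"
}
```

Check the printed device against the full output of /proc/mdstat before using it in a DRBD resource.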
If you are using a completely unused volume (i.e. a blank disk) you c
Naming
Name your servers with the DRBD replication network in mind. The servers should reference each other by friendly names associated with the DRBD network. For example, if your LAN segment is 10.10.10.x, you will have a separate network for DRBD replication; in our example we use 192.168.40.x. You will need to have modified your /etc/hosts files as described above (this can also be done in the DNS server settings in Webconfig) so that you have a similar configuration.
SSH trusting
Optionally, you can set up your servers so that they trust each other, which will allow you to SSH between them without requiring a password. This can be convenient for transferring files via scp or rsync.
DRBD configs
We will begin by modifying files.
/etc/drbd.conf
This file installs like this by default. Ensure that it looks the same.
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";
/etc/drbd.d/global_common.conf
For the most part, this file is unchanged from the original that should be installed. Of particular note is the addition of protocol C, DRBD's fully synchronous replication mode, which is described in the DRBD documentation.
global {
    usage-count yes;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;

    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }

    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }

    disk {
        # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
        # no-disk-drain no-md-flushes max-bio-bvecs
    }

    net {
        # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
        # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
        # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
    }

    syncer {
        # rate after al-extents use-rle cpu-mask verify-alg csums-alg
    }
}
/etc/drbd.d/data0.res
If you are using ClearBOX and your old data storage device was md3, you can use the following configuration.
resource data0 {
    device    drbd0;
    disk      /dev/md3;
    meta-disk internal;
    on server1-s.example.com {
        address 192.168.40.40:7789;
    }
    on server2-s.example.com {
        address 192.168.40.41:7789;
    }
}
Otherwise, you will need to create a new configuration file with a name that ends in .res and specify the disk location on each server. The resource definition should match between the servers; otherwise you will need to give per-host settings for each side of the mirror.
For example, if I wanted to mirror the entire disk /dev/sdb between both my primary and my backup I could use the following for /etc/drbd.d/sdb.res:
resource sdb {
    device    drbd0;
    disk      /dev/sdb;
    meta-disk internal;
    on server1-s.example.com {
        address 192.168.40.40:7789;
    }
    on server2-s.example.com {
        address 192.168.40.41:7789;
    }
}
Once you have the configuration, copy it to the backup server.
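Since the two example resources differ only in their name and disk, a resource file for another volume can be generated from a template. The function below is a sketch under stated assumptions: the name make_drbd_resource is hypothetical, and it hard-codes the device drbd0, internal metadata, and port 7789 exactly as in the examples above.

```shell
# Print a two-node DRBD resource definition in the layout used above.
# Usage: make_drbd_resource NAME DISK HOST1 ADDR1 HOST2 ADDR2
make_drbd_resource() {
    cat <<EOF
resource $1 {
    device    drbd0;
    disk      $2;
    meta-disk internal;
    on $3 {
        address $4:7789;
    }
    on $5 {
        address $6:7789;
    }
}
EOF
}
```

For example, make_drbd_resource sdb /dev/sdb server1-s.example.com 192.168.40.40 server2-s.example.com 192.168.40.41 > /etc/drbd.d/sdb.res reproduces the file above. Note that DRBD matches the name in each "on" section against the host name reported by uname -n, so check that value on both servers, and consider running drbdadm dump to confirm that the configuration parses.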
rsync -av /etc/drbd.d/* server2-s.example.com:/etc/drbd.d/
If you don't have rsync installed, run the following (replacing server2-s.example.com with your backup server's hostname):
yum -y install rsync && ssh server2-s.example.com yum -y install rsync
Start DRBD
Once the configuration is in place, you can start the DRBD daemon by running the following:
service drbd start
At this point the server will bring up the volume. You will need to perform the previous actions on both servers in order to proceed. Once that is done, we will issue commands that coordinate between the two servers.
Run the following commands to view the drbd status:
cat /proc/drbd
drbd-overview
If the disk you are working with is referenced by DRBD as disk 0, run the following on the primary server only:
drbdsetup 0 invalidate-remote
drbdsetup 0 primary
This will start the synchronization process. If you want to monitor the synchronization run the following (use Ctrl+C to stop the monitor):
watch cat /proc/drbd
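For scripted checks, the state fields can be extracted from /proc/drbd instead of being watched by eye. This sketch assumes the DRBD 8.x status line format, for example " 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----"; the function names drbd_state and drbd_in_sync are hypothetical.

```shell
# Print "connection-state roles disk-states" for one DRBD minor.
# Usage: drbd_state [MINOR] [FILE]   (defaults: minor 0, /proc/drbd)
drbd_state() {
    awk -v minor="${1:-0}" '
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ro, ds
        }
    ' "${2:-/proc/drbd}"
}

# Succeed only when both sides report UpToDate data.
# Usage: drbd_in_sync [MINOR] [FILE]
drbd_in_sync() {
    [ "$(drbd_state "$@" | awk '{ print $3 }')" = "UpToDate/UpToDate" ]
}
```

A loop such as "until drbd_in_sync; do sleep 10; done" then blocks until the initial synchronization has finished.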
Setup partition with LVM
You will want to configure the partition with LVM and activate it. The specifics of setting up the partition with LVM are beyond the scope of the DRBD documentation. Your device is now /dev/drbd0 and NOT /dev/md3.
If /dev/md3 is still in your LVM list, you will need to remove it and add /dev/drbd0 as your LVM volume. Assign it to the name data0. You should get similar results from the following commands:
[root@primary ~]# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/drbd0 data lvm2 a-   439.97G 13.22G
  /dev/md2   main lvm2 a-    24.41G  7.53G
[root@primary ~]# vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  data   1   1   0 wz--n- 439.97G 13.22G
  main   1   3   0 wz--n-  24.41G  7.53G
[root@primary ~]# lvs
  LV    VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  data0 data -wi-ao 426.75G
  logs  main -wi-ao   4.88G
  root  main -wi-ao  10.00G
  swap  main -wi-ao   2.00G
Centralized Storage
You will want to set up Centralized Storage for the data on the partition. You will also want to change services so that they don't start unless the data is online. Uncomment the /store/data0 line in /etc/fstab and make sure the noauto option is present.
/dev/data/data0 /store/data0 ext3 defaults,noauto 1 2
Mount the data partition and create the necessary structure:
mount /store/data0/
cd /store/data0/
mkdir live
cd live/
mkdir server1
cd server1/
mkdir home imap mysql root-support samba-drivers samba-netlogon samba-profiles shares www-default www-virtual www-cgi-bin
chown winadmin:domain_users samba*
chown mysql:mysql mysql
chown cyrus:mail imap
If you previously kept data on the root partition (i.e. the system was not a ClearBOX), you will need to sync the data over to the new partition and delete the previous data before mounting the bind mounts.
rsync -av /home/ /store/data0/live/server1/home/
rsync -av /var/spool/imap/ /store/data0/live/server1/imap/
rsync -av /var/lib/mysql/ /store/data0/live/server1/mysql/
rsync -av /root/support/ /store/data0/live/server1/root-support/
rsync -av /var/samba/drivers/ /store/data0/live/server1/samba-drivers/
rsync -av /var/samba/netlogon/ /store/data0/live/server1/samba-netlogon/
rsync -av /var/samba/profiles/ /store/data0/live/server1/samba-profiles/
rsync -av /var/flexshare/shares/ /store/data0/live/server1/shares/
rsync -av /var/www/cgi-bin/ /store/data0/live/server1/www-cgi-bin/
rsync -av /var/www/html/ /store/data0/live/server1/www-default/
rsync -av /var/www/virtual/ /store/data0/live/server1/www-virtual/
Once the data is copied, you will need to delete it from the original locations before mounting the devices via bind mounts, or you will not be able to reclaim the disk space. Do this step with caution and make sure your data is properly copied to /store/data0 before proceeding.
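The copy commands follow one pattern: each original directory maps to a subdirectory of the live tree. The sketch below generates them from a single mapping so that the list stays consistent with the bind mounts; the function names bind_mount_pairs and print_rsync_commands are hypothetical. The trailing slash on each source tells rsync to copy the contents of the directory rather than the directory itself, which avoids ending up with paths such as home/home.

```shell
# Source directory and target subdirectory for each bind mount listed above.
bind_mount_pairs() {
    cat <<'EOF'
/home home
/var/spool/imap imap
/var/lib/mysql mysql
/root/support root-support
/var/samba/drivers samba-drivers
/var/samba/netlogon samba-netlogon
/var/samba/profiles samba-profiles
/var/flexshare/shares shares
/var/www/cgi-bin www-cgi-bin
/var/www/html www-default
/var/www/virtual www-virtual
EOF
}

# Print (do not run) one rsync command per pair.
# Usage: print_rsync_commands [BASE]   (BASE defaults to /store/data0/live/server1)
print_rsync_commands() {
    base=${1:-/store/data0/live/server1}
    bind_mount_pairs | while read -r src sub; do
        echo "rsync -av $src/ $base/$sub/"
    done
}
```

Review the printed commands first; piping the output to sh runs them.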
Mount the new virtual devices:
mount /store/data0/live/server1/home/
mount /store/data0/live/server1/imap/
mount /store/data0/live/server1/mysql/
mount /store/data0/live/server1/root-support/
mount /store/data0/live/server1/samba-drivers/
mount /store/data0/live/server1/samba-netlogon/
mount /store/data0/live/server1/samba-profiles/
mount /store/data0/live/server1/shares/
mount /store/data0/live/server1/www-cgi-bin/
mount /store/data0/live/server1/www-default/
mount /store/data0/live/server1/www-virtual/
Validate your mount points.
mount
At a minimum you should have:
/dev/mapper/data-data0 on /store/data0 type ext3 (rw)
/store/data0/live/server1/home on /home type none (rw,bind)
/store/data0/live/server1/imap on /var/spool/imap type none (rw,bind)
/store/data0/live/server1/mysql on /var/lib/mysql type none (rw,bind)
/store/data0/live/server1/root-support on /root/support type none (rw,bind)
/store/data0/live/server1/samba-drivers on /var/samba/drivers type none (rw,bind)
/store/data0/live/server1/samba-netlogon on /var/samba/netlogon type none (rw,bind)
/store/data0/live/server1/samba-profiles on /var/samba/profiles type none (rw,bind)
/store/data0/live/server1/shares on /var/flexshare/shares type none (rw,bind)
/store/data0/live/server1/www-cgi-bin on /var/www/cgi-bin type none (rw,bind)
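Comparing the listing by eye is easy to get wrong, so the check can be scripted. This is a minimal sketch that assumes the "DEVICE on MOUNTPOINT type FSTYPE" output format shown above; the names EXPECTED_MOUNTS and missing_mounts are hypothetical, and the list contains only the mount points from that listing.

```shell
# Mount points shown in the listing above.
EXPECTED_MOUNTS="/store/data0 /home /var/spool/imap /var/lib/mysql /root/support
/var/samba/drivers /var/samba/netlogon /var/samba/profiles /var/flexshare/shares
/var/www/cgi-bin"

# Print each expected mount point that does not appear in the mount output.
# Usage: missing_mounts [FILE]   (FILE holds saved mount output; default runs mount)
missing_mounts() {
    if [ -n "${1:-}" ]; then
        mounts=$(cat "$1")
    else
        mounts=$(mount)
    fi
    for mp in $EXPECTED_MOUNTS; do
        printf '%s\n' "$mounts" | grep -qF " on $mp type " || echo "$mp"
    done
}
```

Running missing_mounts with no arguments should print nothing once every mount is in place.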