Centralized User Data
In the ClearOS 5.x release, important user data is stored in various locations on the file system. For example:
- Mail is stored in /var/spool/imap and /var/lib/imap
- Home directories are found in /home
- LDAP is stored in /var/lib/ldap, but backup dump files are in /etc
Centralizing this user data not only helps with storage management, but also simplifies backup. This is a draft proposal, so please feel free to send us your feedback. It is far from finished, but the concept lends itself to many possibilities, including user data preservation between installs, data migration, foreign mounts, clustering, and other options.
Problem
Part of the problem with Linux systems is the diverse ways in which user data is stored. Because of a long legacy with POSIX standards, data ends up in many different locations, depending on the distribution and even on the version. Moreover, because the data is stored in varied locations, volume management becomes a chore, and reaching your data centrally can cause grief as an administrator tries to shuffle data and partitions around.
It might seem that the solution is to move the data once and for all. That notion is untenable, because third-party applications expect to find their data in locations that are 'standard' to them.
The need for a centralized data management policy and a consistent interface is made more pressing by the rapid adoption of iSCSI storage systems and by the expanding use of eSATA attached storage. iSCSI storage is by its very nature storage that can be (and often is) shared by many local and remote computer systems. The absence of a centralized storage policy and a consistent implementation prevents efficient and effective use of these emerging storage technologies and will be a blocker to adoption of ClearOS.
As we move to new ideas and concepts associated with de-coupling Webconfig from the OS, we need to address this issue, because other platforms suffer from this data management nightmare as well. The solution seems to be to move the files but to keep them where they are. We can do both.
Solution Overview
We propose a standard for volume management that keeps user and customer data centralized. This gives the customer the following advantages.
- Centralized data is easier to locate and is therefore easier to back up
- Centralized data can be added nicely to clustered volumes
- Additional storage can be added and shows up in the same general location
- Easier data and service migrations between servers
With this centralized data, Webconfig will use the bind mount method to present the data where it is supposed to go. This also helps us to fulfill the objective of having a distro-agnostic Webconfig. By storing the user and customer data in a specific central location, we can move the data between different distributions and place the data back where the specific distribution expects it to be.
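The bind mount method can be illustrated with a minimal sketch. It assumes the store path /store/data0/live/server1/home used later in this proposal; the function name bind_store and the DRY_RUN variable are hypothetical and not part of any existing tool. The real mount call requires root, so the sketch only prints the command by default.

```shell
# Hypothetical helper: present a directory from the central store at the
# location the distribution expects. With DRY_RUN=1 (the default here) the
# command is only printed, so the sketch can be tried without root.
DRY_RUN=${DRY_RUN:-1}

bind_store() {
    # $1 = directory inside the store, $2 = standard location
    if [ "$DRY_RUN" = "1" ]; then
        echo "mount --bind $1 $2"
    else
        mount --bind "$1" "$2"
    fi
}

bind_store /store/data0/live/server1/home /home
```

After a real bind mount, applications keep reading and writing /home while the files physically live under /store.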
A Webconfig interface for managing user data types and bind mounts must be created to allow for manipulation of the data stores.
Standard Locations
The standard location for user and customer data is called /store. The various physical, virtual, exported, and targeted mount points on the system are mounted under this directory. These mounts are numbered and called data0, data1, … dataN. Under each of these mount points are the following directories:
Location | Description |
---|---|
live | contains live data used by the current system |
backup | contains backup data |
log | contains log files specific to volume management |
sbin | contains script and executable information for the volumes |
So some example paths would look like:
- /store/data0/live/
- /store/data0/backup/
- /store/data0/log/
- /store/data0/sbin/
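As a rough sketch of the layout above, the four standard directories could be created like this. The function name make_data_skeleton is hypothetical, and the example runs against a scratch directory rather than the real /store.

```shell
# Hypothetical helper: create the four standard directories under one
# data mount. $1 = store root (e.g. /store), $2 = ordinal (0, 1, ... N).
make_data_skeleton() {
    for sub in live backup log sbin; do
        mkdir -p "$1/data$2/$sub" || return 1
    done
}

# Try it against a scratch directory instead of the real /store.
scratch=$(mktemp -d)
make_data_skeleton "$scratch" 0
ls "$scratch/data0"
rm -rf "$scratch"
```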
Live
The live directory contains one subdirectory per server, named after the server's hostname. This tells Webconfig which server currently presents the data. Where multiple systems present the same data, links will be used; for clustered systems, the cluster's presentation name will be used.
Under each hostname directory are the bind mount source directories. Currently these include:
Location | Description |
---|---|
shares | mapped to /var/flexshare/shares |
mailman | mapped to /var/lib/mailman |
www-virtual | mapped to /var/www/virtual |
www-default | mapped to /var/www/html |
www-cgi-bin | mapped to /var/www/cgi-bin |
home | mapped to /home |
imap | mapped to /var/spool/imap |
mysql | mapped to /var/lib/mysql |
samba-profiles | mapped to /var/samba/profiles |
samba-netlogon | mapped to /var/samba/netlogon |
samba-drivers | mapped to /var/samba/drivers |
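The table above can also be treated as data. The following sketch prints one fstab bind entry per row, using the same "none bind,rw 0 0" fields as the fstab additions later in this proposal. The function name print_bind_entries is hypothetical, and the server1 hostname directory is only an example.

```shell
# Hypothetical helper: print one fstab bind entry per mapping in the table.
# $1 = live directory for this server (e.g. /store/data0/live/server1)
print_bind_entries() {
    while read -r name target; do
        [ -n "$name" ] || continue
        printf '%s/%s %s none bind,rw 0 0\n' "$1" "$name" "$target"
    done <<'EOF'
shares /var/flexshare/shares
mailman /var/lib/mailman
www-virtual /var/www/virtual
www-default /var/www/html
www-cgi-bin /var/www/cgi-bin
home /home
imap /var/spool/imap
mysql /var/lib/mysql
samba-profiles /var/samba/profiles
samba-netlogon /var/samba/netlogon
samba-drivers /var/samba/drivers
EOF
}

print_bind_entries /store/data0/live/server1
```

Keeping the mapping in one place means a new data type only needs one new line.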
Backup
Contains backup data. TODO.
Log
Contains centralized volume log data. TODO.
Sbin
Contains scripts and tools for centralized volume management. TODO.
Complications
Run-time Automagic Required
The mapping needs to happen at run time (i.e. before the application starts up for the first time) or at module install time (e.g. when a user installs the web server component via Webconfig). There are two reasons:
- The target directory (e.g. /var/www/html) may not exist at OS install time.
- The required owners and groups may not exist at OS install time (e.g. apache, winadmin, cyrus) so file ownership changes are not possible.
Action: implementation must be done at run time.
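One way to handle the missing-owner case at run time is to check for the account before changing ownership. This is a minimal sketch; the function names account_exists and set_owner_when_ready are hypothetical, and only the standard id and chown commands are used.

```shell
# Hypothetical check: succeed only if the named account exists.
account_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Hypothetical helper: change ownership only once the account is present.
# $1 = owner, $2 = group, $3 = directory
set_owner_when_ready() {
    if account_exists "$1"; then
        chown "$1:$2" "$3"
    else
        echo "account $1 not present yet; skipping chown of $3" >&2
        return 1
    fi
}
```

A call such as set_owner_when_ready cyrus mail /store/data0/live/server1/imap would then fail cleanly until the mail package has created the cyrus account.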
Mixed Data / Non-Data
The /var/lib/imap directory is a bit messy. This directory contains meta-data on mailboxes, including:
- Mailbox folders and permissions (mailboxes.db)
- Mailbox types (annotations.db)
- Per-user filtering rules (sieve directory)
For now, this can be considered “configuration” data and outside the scope of the data store. However, we may run into this kind of messy situation in other third party applications.
Action: a pre-mount script could be used to migrate data from the original directory to the new data store directory.
Pre-populating Data Directory
When the Apache web server software is installed, the system drops a default index.html and logo into /var/www/html. Leaving this directory empty would cause a 404 error, so the directory should be populated with something.
Action: a post-mount script is required.
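A post-mount script along these lines might look like the following sketch. The function name populate_if_empty and the placeholder page content are hypothetical; the real default page would come from the web server package.

```shell
# Hypothetical post-mount hook: add a placeholder page only when the
# freshly mounted directory is empty, so existing content is never touched.
# $1 = document root (e.g. /var/www/html)
populate_if_empty() {
    if [ -z "$(ls -A "$1")" ]; then
        echo '<html><body><h1>It works</h1></body></html>' > "$1/index.html"
    fi
}
```

Checking for emptiness first makes the hook safe to run on every mount, not only the first one.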
Failed Mount Should Cause Application To Not Start
If the “mount --bind” fails for some reason or a Layer 8 issue comes along, we want to prevent the target application from starting. For example, we don't want the MySQL server to start up using the old and empty /var/lib/mysql directory. That would be… bad.
Action: hack the init.d scripts from upstream. In the event that this happens, raise an immediate alarm via pager or another alert system.
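The init.d hook could be as small as the following sketch, which looks for the target directory in the kernel's mount table before starting anything. The names is_mounted, start_if_mounted and MOUNT_TABLE are hypothetical. Note that /proc/mounts escapes spaces in paths, so this simple comparison assumes paths without spaces.

```shell
# Mount table to consult; overridable so the check can be tried on a copy.
MOUNT_TABLE=${MOUNT_TABLE:-/proc/mounts}

# Hypothetical check: succeed only if $1 appears as a mount point.
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' "$MOUNT_TABLE"
}

# Hypothetical guard: run the start command only when the store is in place.
# $1 = target directory, remaining arguments = command that starts the service
start_if_mounted() {
    target=$1
    shift
    if is_mounted "$target"; then
        "$@"
    else
        echo "data store not mounted on $target; refusing to start" >&2
        return 1
    fi
}
```

For example, start_if_mounted /var/lib/mysql service mysqld start would leave MySQL stopped if the bind mount is missing. The alerting step is left out of this sketch.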
Migration
It should be possible to manually migrate from a non-datastore implementation to a datastore implementation. It should also be possible to manually move stores around, e.g. move /store/data0/www to /store/data1/www.
Action: in version 2, allow the end user to “stop” a particular application's data store, move files, then “start” the application's data store. For example:
- Copy files from old data store to a temp directory (e.g. cp -a /var/www/html /tmp/myhtml)
- Run: datastore httpd stop
- Edit /etc/datastore.d/httpd.conf to whatever is desired (e.g. /store/data11 instead of /store/data0)
- Run: datastore httpd start
- Copy/move the files from temp directory to new data store
Of course, you could skip the temporary directory copy by adding even more functionality:
- Edit /etc/datastore.d/httpd.conf to whatever is desired (e.g. /store/data11 instead of /store/data0)
- Run: datastore httpd remap
- Copy/move the files from old data store to the new data store
- Run: datastore httpd start
Implementation
Configuration
Data store configuration is stored in /etc/datastore.conf and configlets are stored in /etc/datastore.d/.
/etc/datastore.conf
The /etc/datastore.conf file contains the default mapping and any other global configuration variables that may be required. Examples:
base=/store
base=/store/data0/server1
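A script could read the base setting with something like the sketch below. The file format beyond the base= lines shown above is not yet defined, so this assumes one key=value pair per line and that the last base= line wins; the function name read_base is hypothetical.

```shell
# Hypothetical parser: print the value of the last "base=" line in a
# datastore.conf-style file; every other line is ignored.
# $1 = configuration file (e.g. /etc/datastore.conf)
read_base() {
    sed -n 's/^base=//p' "$1" | tail -n 1
}
```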
/etc/datastore.d/httpd.conf
A module (e.g. app-httpd) can drop a configlet file into /etc/datastore.d. This configlet contains mapping details for the particular application, for example:
$mapping['/var/www/virtual']['target'] = '%{base}/www/virtual';
$mapping['/var/www/virtual']['permissions'] = '0755';
$mapping['/var/www/virtual']['owner'] = 'root';
$mapping['/var/www/virtual']['group'] = 'root';
$mapping['/var/www/cgi-bin']['target'] = '%{base}/www/cgi-bin';
$mapping['/var/www/cgi-bin']['permissions'] = '0755';
$mapping['/var/www/cgi-bin']['owner'] = 'root';
$mapping['/var/www/cgi-bin']['group'] = 'root';
$mapping['/var/www/www']['target'] = '%{base}/www/www';
$mapping['/var/www/www']['permissions'] = '0755';
$mapping['/var/www/www']['owner'] = 'root';
$mapping['/var/www/www']['group'] = 'root';
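The %{base} placeholder in the target values has to be expanded against the base setting from /etc/datastore.conf. A minimal sketch of that substitution, assuming the placeholder only ever appears at the start of the target; the function name expand_target is hypothetical.

```shell
# Hypothetical expansion of the %{base} placeholder used in configlets.
# $1 = target template (e.g. '%{base}/www/virtual'), $2 = base directory
expand_target() {
    placeholder='%{base}'
    case $1 in
        "$placeholder"*)
            rest=${1#"$placeholder"}
            printf '%s%s\n' "$2" "$rest"
            ;;
        *)
            # No placeholder: treat the target as a literal path.
            printf '%s\n' "$1"
            ;;
    esac
}

# prints /store/data0/server1/www/virtual
expand_target '%{base}/www/virtual' /store/data0/server1
```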
In the future, premount and postmount hooks should be added. For example, populating the web server store with a default web page could be a postmount action. These hooks could also be used to migrate existing files to the data store.
$premount = '/usr/share/system/modules/httpd/do-something';
$postmount = '/usr/share/system/modules/httpd/add-default';
Installation
When a data-store aware application is installed (e.g. app-httpd), the configlet file will be installed in /etc/datastore.d. This should then trigger the datastore script to:
- Create the data store directories with proper permissions
- Add the /etc/fstab entries
- Mount the datastore directories
- Run the postmount script (if defined)
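Because packages can be installed more than once, the fstab step in the list above should be safe to repeat. The following sketch appends a bind entry only when the exact line is missing. The function name add_bind_entry is hypothetical, and the fstab file is passed as an argument so the sketch can be tried on a copy rather than on /etc/fstab.

```shell
# Hypothetical helper: append a bind entry to an fstab-style file only if
# the exact line is not already present, so repeated installs are harmless.
# $1 = store directory, $2 = target directory, $3 = fstab file
add_bind_entry() {
    entry="$1 $2 none bind,rw 0 0"
    if ! grep -qxF "$entry" "$3"; then
        echo "$entry" >> "$3"
    fi
}
```

Running add_bind_entry twice with the same arguments leaves a single line in the file.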
Upstream
All software packages that use the data store should refuse to start if the datastore script fails for any reason. All init.d scripts need an added hook.
New Boxes
The best time to implement these changes seems to be at system installation time. Below are the additions to make and the commands that get this all going. Reboot when you are done.
Prep drive
You can relabel the drive that you use for data. Do this only if you store data on a separate disk from the rest of your OS, and replace the device name with that of your data drive:
tune2fs -L data0 /dev/sda5 && tune2fs -l /dev/sda5
Also, make the mount point for the data drive.
mkdir /store /store/data0
You will need to update the label in your /etc/fstab for this device. You should also change the mount point to /store/data0.
For example:
LABEL=/ / ext3 defaults 1 1
LABEL=data0 /store/data0 ext3 defaults 1 2
LABEL=/boot /boot ext3 defaults 1 2
Additions to /etc/fstab
Add these entries also to your fstab.
/store/data0/live/server1/home /home none bind,rw 0 0
/store/data0/live/server1/imap /var/spool/imap none bind,rw 0 0
/store/data0/live/server1/mysql /var/lib/mysql none bind,rw 0 0
/store/data0/live/server1/root-support /root/support none bind,rw 0 0
/store/data0/live/server1/samba-drivers /var/samba/drivers none bind,rw 0 0
/store/data0/live/server1/samba-netlogon /var/samba/netlogon none bind,rw 0 0
/store/data0/live/server1/samba-profiles /var/samba/profiles none bind,rw 0 0
/store/data0/live/server1/shares /var/flexshare/shares none bind,rw 0 0
/store/data0/live/server1/www-cgi-bin /var/www/cgi-bin none bind,rw 0 0
/store/data0/live/server1/www-default /var/www/html none bind,rw 0 0
/store/data0/live/server1/www-virtual /var/www/virtual none bind,rw 0 0
Migration commands and setup
Provided you have completed all the previous steps, you should be able to run these commands and transition all the data to the new bind mounts.
Mount the device
mount /store/data0
Make the directories
mkdir /store/data0/live /store/data0/live/server1
cd /store/data0/live/server1
mkdir shares www-virtual www-default www-cgi-bin home imap mysql samba-profiles samba-netlogon samba-drivers root-support
mkdir /root/support
mkdir /var/samba/drivers /var/samba/netlogon /var/samba/profiles
Set Permissions
chown cyrus:mail imap && chmod 700 imap
chown mysql:mysql mysql
chmod 750 /root/support ./root-support
chown winadmin:domain_users samba-drivers/ samba-netlogon/ samba-profiles/
chmod 755 samba-drivers/ samba-netlogon/ && chmod 775 samba-profiles/
chmod g+s samba-drivers/ samba-profiles/ samba-netlogon/
Duplicate the data
rsync -av /home/. ./home
service cyrus-imapd stop
rsync -av /var/spool/imap/. ./imap
service mysqld stop
rsync -av /var/lib/mysql/. ./mysql
rsync -av /root/support/. ./root-support
rsync -av /var/samba/drivers/. ./samba-drivers
rsync -av /var/samba/netlogon/. ./samba-netlogon
rsync -av /var/samba/profiles/. ./samba-profiles
service smb stop
service nmb stop
service winbind stop
service postfix stop
service proftpd stop
service httpd stop
rsync -av /var/flexshare/shares/. ./shares
rsync -av /var/www/cgi-bin/. ./www-cgi-bin
rsync -av /var/www/html/. ./www-default
rsync -av /var/www/virtual/. ./www-virtual
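Before mounting over the original directories, it may be worth confirming that each copy matches its source. This is a minimal sketch using diff -r, which compares file contents only and not ownership or permissions; the function name verify_copy is hypothetical.

```shell
# Hypothetical check: compare file contents between the original location
# and its copy in the store. diff -r does not compare ownership or modes.
# $1 = original directory, $2 = copy in the data store
verify_copy() {
    if diff -r "$1" "$2" >/dev/null; then
        echo "OK       $1"
    else
        echo "MISMATCH $1" >&2
        return 1
    fi
}
```

Run from /store/data0/live/server1, a call such as verify_copy /var/lib/mysql ./mysql should report OK before you continue.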
Mount the devices
mount -a
Show your mounts
mount
Reboot