👉 This post was initially written in 2007 and referred to specific software versions. When tuning your system, always consider which version you are running. Information below may be outdated. Use it at your own risk.
This post covers a creative solution for a cheap SAN and secure backup architecture for a small platform (5 to 10 servers). In this case study we will see how to use Network Block Devices (nbd) and software RAID with mdadm.
Initial platform
The initial platform is made of five production servers (Apache, PHP, MySQL, Postfix), one “monitoring server” running cacti, nagios, POP3, IMAP, Subversion, Trac, primary DNS, etc., and one “backup server” used as secondary DNS, MySQL slave and, finally, rsnapshot server. Of course, I manage all of this remotely with my old laptop. In this setup the backup server can handle only 100 GB of backups.
Cons
- The backup server is a single point of failure (SPOF): if I lose its hard drive, I lose all my backup data
- The unused disk space on my production servers is wasted
- When I need more space on the backup server, I have to rent a new, more expensive server and migrate to it
- The backup server represents 15% of my total monthly cost (really huge for such a small platform!)
New platform
There is a large set of options when it comes to distributed filesystems (Coda, AFS, LegionFS, MogileFS, DRBD, etc.). However, for this use case the goal is not to distribute data to a large number of clients under heavy load. Instead, the requirements are:
- ability to grow backup disk space easily
- store backup data in a redundant and secure manner (removing the SPOF)
- cheap (aka not expensive)
I have a lot of unused space on my production servers, so I chose to use Network Block Devices (NBD) with a RAID 6 implementation. In this setup I currently have four 100 GB devices in the array, which gives me 200 GB of usable space for my backups (RAID 6 keeps two devices’ worth of parity, hence (4 - 2) x 100 GB).
Why RAID 6 instead of RAID 5?
As I wrote earlier, my goal is not read/write performance but the reliability of the system. Because of the risk of losing a network member, I chose the RAID 6 implementation, which allows me to lose up to two nodes at the same time.
Pros
- Backup data is distributed across all the array members
- No more SPOF: if the backup server crashes, I can mount the array on another server or on my laptop
- Uses the otherwise wasted space on my servers
- Less expensive platform, and easy to upgrade
Cons
- If you lose more than two nodes of your RAID array, you lose all your data. This means you must be careful when you decide to reboot a server: RAID reconstruction is not fast!
- Writing to the RAID device generates a lot of network I/O and uses some CPU on each nbd-server
How to do it?
First of all, I install nbd-server on each server with unused disk space; I chose three of my five production servers. I am using the Ubuntu Server distribution, so I just have to run apt-get install nbd-server. Then I build a disk image on each server that will be used for the network block device: I use dd to build a 100 GB file image, then I start nbd-server on each server (node).
dd if=/dev/zero of=/home/nbd/backup.img bs=1024 count=100000000
nbd-server 10130 /home/nbd/backup.img
I encourage you to look at the man page of nbd-server to learn more about all the available options.
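If writing 100 GB of zeros on each node takes too long, the image can also be created as a sparse file; a minimal sketch, assuming your filesystem supports sparse files (ext3 does) and that you keep enough real disk space free for the file to grow over time:
# create a ~100 GB sparse file: nothing is written, only the size is set
dd if=/dev/zero of=/home/nbd/backup.img bs=1024 count=0 seek=100000000
nbd-server 10130 /home/nbd/backup.img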
Next, we have to set up the main server that will host the RAID array. I chose RAID 6. We previously set up three servers providing three network block devices, and I need one more device to complete my RAID 6 array, so I create a file image on the local server where the array is built and attach it with a loop device using losetup. Then we “mount” the network block devices locally with nbd-client (look at its man page too).
dd if=/dev/zero of=/home/nbd/backup.img bs=1024 count=100000000
losetup /dev/loop0 /home/nbd/backup.img
modprobe nbd
nbd-client serv1.example.com 10130 /dev/nbd0 -persist
nbd-client serv2.example.com 10130 /dev/nbd1 -persist
nbd-client serv3.example.com 10130 /dev/nbd2 -persist
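Before building the array, it is worth a quick check that all four devices are attached and report the expected size:
# every nbd and loop device should appear here with roughly 100 million 1 KB blocks
cat /proc/partitions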
We now have all our devices up on the main server; we just need to use mdadm to build the array, then create and mount a filesystem to use it like any other partition.
mdadm --create /dev/md0 --raid-devices=4 --level=6 --spare-devices=0 --chunk=256 /dev/nbd[012] /dev/loop0
mkfs.ext3 -b 4096 -E stride=64 /dev/md0
mount /dev/md0 /.snapshots
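The initial RAID 6 synchronization over the network takes a while; before trusting the array with backups, it is worth watching its state. A minimal sketch (the mdadm.conf path below is the Ubuntu/Debian one and may differ on other distributions):
# watch the resync progress and the state of each member
cat /proc/mdstat
mdadm --detail /dev/md0
# optionally record the array definition so it can be reassembled by name later
mdadm --detail --scan >> /etc/mdadm/mdadm.conf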
That’s all folks! I can now run rsnapshot on top of my distributed RAID 6 array and stop wasting my unused space. NB: I use a chunk size of 256 KB because of the RAID 6 implementation and the network constraint; performance looks better this way. Because of this chunk size, I use a block size of 4 KB and a stride of 64 while building the filesystem (4 KB x 64 = 256 KB).
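rsnapshot itself only needs to be pointed at the mounted array. A minimal sketch of the relevant lines of /etc/rsnapshot.conf, with serv1.example.com standing in for a real host to back up (fields in this file must be separated by tabs, and newer rsnapshot versions also accept “retain” instead of “interval”):
snapshot_root	/.snapshots/
interval	daily	7
interval	weekly	4
backup	root@serv1.example.com:/etc/	serv1/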
I need more space, what do I do?
It’s easy! You just need to install a new nbd-server on a server that has enough space, then attach the new export with nbd-client on the server running the RAID array. When that is done, just use mdadm to add the new device and grow the array.
mdadm --add /dev/md0 /dev/nbd3
mdadm --grow /dev/md0 --raid-devices=5
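Once the reshape has finished, the array is larger but the filesystem on it is not; it still has to be grown. A minimal sketch, assuming the ext3 filesystem built earlier (depending on your kernel and e2fsprogs versions you may have to unmount it first):
# grow the ext3 filesystem to fill the enlarged array
resize2fs /dev/md0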
The server running the RAID device crashed. How can I get my backups?
It’s easy! You just need to rebuild the array on another server or on your laptop.
mdadm --create /dev/md0 --assume-clean --raid-devices=4 --chunk=256 --level=6 /dev/nbd[012] missing
“missing” means that the last device of our array (/dev/loop0) is missing; the RAID array will start in degraded mode.
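Before doing this, the three remaining exports have to be attached on the new machine with nbd-client, exactly as before. Also note that, since the RAID superblocks are written on the nbd devices themselves, you can usually avoid re-creating the array and simply assemble the existing, degraded one; a sketch of this alternative:
modprobe nbd
nbd-client serv1.example.com 10130 /dev/nbd0 -persist
nbd-client serv2.example.com 10130 /dev/nbd1 -persist
nbd-client serv3.example.com 10130 /dev/nbd2 -persist
# --run starts the array even though one member (/dev/loop0) is missing
mdadm --assemble --run /dev/md0 /dev/nbd0 /dev/nbd1 /dev/nbd2
mount /dev/md0 /.snapshots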
I have a big external HD at home for a second, off-site backup. What do I have to do to copy and read it?
You just need to copy all the file images (“backup.img”) to your external disk. Take care to stop your RAID array first (unless you are sure nothing writes to the images while you copy them). Then you just have to attach each image locally with losetup and build a RAID array with mdadm.
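A minimal sketch of reading the copies at home, assuming the four images were copied under hypothetical names in /backup on the external disk:
# attach each copied image to a loop device
losetup /dev/loop0 /backup/serv1-backup.img
losetup /dev/loop1 /backup/serv2-backup.img
losetup /dev/loop2 /backup/serv3-backup.img
losetup /dev/loop3 /backup/main-backup.img
# assemble the RAID 6 array from the four loop devices and mount it read-only
mdadm --assemble /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
mount -o ro /dev/md0 /mnt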
To sum up
I think this design is really useful for a small platform (5 to 10 servers) because you get a distributed backup solution at low cost: you don’t need a dedicated server to manage your backups. However, you have to be careful with the touchy aspects of any RAID implementation: a wrong manipulation with mdadm, or too many server crashes, can lead to the loss of all your data. Keep in mind that RAID 6 needs a lot of time to rebuild large storage; if you plan to reboot many of your servers, don’t forget to stop the RAID array first.
If you are interested in this kind of platform to build your SAN, I encourage you to look at the iSCSI or AoE protocols.