In short: there is currently no real cluster filesystem (such as GFS or OCFS2) for FreeBSD. Other distributed-filesystem projects such as GlusterFS, PVFS or DRBD are either not ported to FreeBSD, or their ports are very old.
Since I needed four identical data filesystems (which have to be in sync within seconds of an upload), I wrote a small workaround built on rsync and the FreeBSD audit system. I got the idea of using the audit system to trigger rsync from Luke Marsden, who monitors filesystem activity with audit_control and some Python scripts.
First of all, the audit system must be activated and configured. Event auditing is part of FreeBSD and has to be compiled into the kernel.
Add the following line to your kernel configuration:
options AUDIT
Then rebuild and reinstall your kernel as described in the FreeBSD Handbook.
After this, add the following line to your /etc/rc.conf:
auditd_enable="YES"
The next step is to configure the audit system. Open the file /etc/security/audit_control and change the config to:
dir:/var/audit
flags:fc,fd,fw
minfree:20
naflags:lo
policy:cnt
filesz:0
That’s all for now. You can start the audit system either by running
/etc/rc.d/auditd start
or by rebooting your system.
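Before starting auditd it can help to confirm that the three file classes from the config above (fc, fd, fw) really are on the flags line. The following is only a minimal sketch: the helper name missing_audit_flags is made up for this example, and it assumes a single flags: line with plain comma-separated class names (no + or - prefixes).

```shell
#!/bin/sh
# Print the audit classes (space-separated) that are missing from the
# "flags:" line of an audit_control-style file.
# $1: path to the file; remaining arguments: the classes to look for.
# (Hypothetical helper, not part of FreeBSD.)
missing_audit_flags() {
    conf=$1
    shift
    # Take the value of the first "flags:" line; empty if there is none.
    flags=$(sed -n 's/^flags:[[:space:]]*//p' "$conf" 2>/dev/null | head -n 1)
    missing=""
    for class in "$@"; do
        case ",$flags," in
            *",$class,"*) ;;                    # class is configured
            *) missing="$missing $class" ;;     # class is absent
        esac
    done
    printf '%s\n' "${missing# }"
}

# Example: check the live config for the three file classes used above.
if [ -r /etc/security/audit_control ]; then
    missing_audit_flags /etc/security/audit_control fc fd fw
fi
```

An empty line of output means all three classes are configured.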
If rsync isn’t already installed on your system, you can install it from the ports tree:
cd /usr/ports/net/rsync
make
make install
Installation of rsync should be no issue.
The next step is to set up an alternative path to your data directory using a symbolic link (I’ll explain why later).
ln -s /path/to/your/data/ /alternative_data_path/
Now we have to configure rsync to run as a daemon. To do so, create (or change) the rsync config file /etc/rsyncd.conf:
max connections = 5
log file = /var/log/rsync.log
timeout = 30

[shareName]
comment = Name of this "Rsync mount"
path = /alternative_data_path/
read only = no
list = yes
uid = validUser
gid = validGroup
hosts allow = ,
hosts deny = *
To start the rsync daemon, you have to call:
/usr/local/bin/rsync --config=/etc/rsyncd.conf --daemon
It is probably a good idea to supervise rsyncd with daemontools to make sure the rsync service is always available (you then have to run it with the --no-detach option).
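For the daemontools variant, a run script along the following lines should be enough. This is a sketch under assumptions: the helper name write_rsyncd_run and the service directory in the usage note are made up for illustration, and the rsync path and config file are the ones used above.

```shell
#!/bin/sh
# Write a daemontools "run" script for rsyncd into the given service directory.
# $1: service directory (hypothetical example: /var/service/rsyncd)
write_rsyncd_run() {
    svc_dir=$1
    mkdir -p "$svc_dir" || return 1
    cat > "$svc_dir/run" <<'EOF'
#!/bin/sh
# exec replaces the shell, so supervise watches the rsync process itself.
exec /usr/local/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf
EOF
    chmod 755 "$svc_dir/run"
}
```

Calling write_rsyncd_run /var/service/rsyncd would create the script in a directory watched by svscan, assuming that is where your daemontools installation looks for services.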
#!/usr/bin/perl
##
# This software is published under the Apache 2.0 license.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Author: Erik Scholtz
# Web: http://blog.elitecoderz.net
###

# We are strict, cauz we are elitecoderz!
use strict;
use threads qw(yield);
use threads::shared;
use Thread::Semaphore;

# No caching
$|=1;

################
# Configuration
my $debug = 1;                     # 1 enables logging to the console, 0 disables it
my $path = '/path/to/your/data/';  # Path to sync
my @cmds;                          # Syncer commands that should be executed
$cmds[0] = '/usr/local/bin/rsync -raz --progress --size-only /path/to/symboliclink/data/<!--target--> rsync:///shareName/<!--target-->';
$cmds[1] = '/usr/local/bin/rsync -raz --progress --size-only /path/to/symboliclink/data/<!--target--> rsync:///shareName/<!--target-->';
$cmds[2] = '/usr/local/bin/rsync -raz --progress --size-only /path/to/symboliclink/data/<!--target--> rsync:///shareName/<!--target-->';

###############################################################################
# DO NOT CHANGE ANYTHING BELOW THIS LINE, UNLESS YOU KNOW WHAT YOU ARE DOING! #
###############################################################################

###
# Set Threads yield
threads->yield();

# Set up some thread-shared variables
my $commands :shared;
$commands = &share([]);
my $run :shared;
$run = &share({});
$run->{'status'} = 1;
my $sema :shared;
$sema = &share({});

# Local array where all syncer threads are stored
my @threads;

# Create a thread for each syncer
my $maxid = -1;
for (my $i=0;$i<=$#cmds;$i++) {
    print "Starting syncer $i\n" if $debug;
    $sema->{$i} = Thread::Semaphore->new(0);
    my $syncer = threads->create('syncJob',$run,$sema,$i,$commands,$path,$cmds[$i],$debug);
    push(@threads,$syncer);
    $maxid = $i;
}

# Create the checker thread, which cleans up the jobs and makes sure all syncers keep working
$sema->{'checker'} = Thread::Semaphore->new(0);
my $syncer = threads->create('JobChecker',$run,$sema,$commands,$maxid,$debug);
push(@threads,$syncer);

# Create the audit thread
print "Starting audit\n" if $debug;
my $auditthread = threads->create('audit',$run,$sema,$commands,$path,$maxid,$debug);
print "Waiting for audit to terminate\n" if $debug;
$auditthread->join();  # If the audit thread gets joinable, we have to terminate everything

# Terminate all threads and clean up
$run->{'status'} = 0;
while ($#threads >=0 ) {
    my $worker = shift(@threads);
    print "Shutdown of syncer ...\n" if $debug;
    $worker->join();
}
print "Shutdown clean completed\n" if $debug;
exit(0);

####################################################################################################
# audit thread
sub audit {
    my $r = shift;
    my $sp = shift;
    my $c = shift;
    my $p = shift;
    my $m = shift;
    my $d = shift;
    print " audit started ...\n" if $d;
    # open listener on the audit device
    open(STATUS, "/usr/sbin/praudit /dev/auditpipe |") || die "can't fork: $!";
    while (<STATUS>) {
        my $line = $_;
        last if ($line eq '' || $r->{'status'}<=0);  # Terminate if audit terminated
        if ($line =~ /path,$p(.+)/) {                # Check if the changed file is in the observed path
            my $file = $1;
            print "Change detected on file: $file\n" if $d;
            my $hash :shared;                        # Create a command for the syncers
            $hash = &share({});
            $hash->{'file'} = $file;
            $hash->{'status'} = '';
            $hash->{'time'} = '';
            for (my $j=0;$j<=$m;$j++) {              # init the per-syncer "job done" table
                $hash->{$j} = 'no';
            }
            if (1) {
                lock($c);
                push(@{$c},$hash);
                print "Added new job for $file\n" if $d;
            }
            for (my $j=0;$j<=$m;$j++) {              # wake up syncers
                $sp->{$j}->up();
            }
            $sp->{'checker'}->up();
        }
    }
    close STATUS || die "audit not closed correctly: $! $?";
    print " audit terminated ...\n" if $d;
    return(0);
}

####################################################################################################
# syncer thread
sub syncJob {
    my $r = shift;
    my $sp = shift;
    my $id = shift;
    my $c = shift;
    my $p = shift;
    my $e = shift;
    my $d = shift;
    print " syncer $id started ...\n" if $d;
    while ($r->{'status'}>0) {
        if ($#{$c}>=0) {                              # if there are any jobs to be done
            for (my $j=0; $j<=$#{$c}; $j++) {
                next if ($c->[$j]->{$id} eq 'ok');    # if my job is already done, skip this job and check the next
                my $file = $c->[$j]->{'file'};
                if (-e $p.$file) {                    # check if the file exists
                    $c->[$j]->{$id} = 'working';      # mark this job as being worked on
                    my $dif = 1;
                    while ($dif>0) {                  # check if the file is still being uploaded and changes size within 1.5 secs
                        print "Checking Filesize ...\n" if $d;
                        my $ssize = -s $p.$file;
                        select(undef, undef, undef, 1.5);  # wait 1.5 secs; sleep() would truncate this to 1 second
                        my $eesize = -s $p.$file;
                        $dif = $eesize - $ssize;
                        print "Checking Filesize $ssize - $eesize = $dif\n" if $d;
                    }
                    my $cm = $e;
                    $cm =~ s/<!--target-->/$file/g;
                    system($cm);                      # rsync to other server
                }
                lock($c);
                $c->[$j]->{$id} = 'ok';               # mark job as done for me
            }
        }
        $sp->{$id}->down();
    }
    print " syncer $id terminated ...\n" if $d;
    return(0);
}

####################################################################################################
# checker thread that checks if all jobs are done
sub JobChecker {
    my $r = shift;
    my $sp = shift;
    my $c = shift;
    my $m = shift;
    my $d = shift;
    print " checker started ...\n" if $d;
    while ($r->{'status'}>0) {
        while ($#{$c} >= 0 && $r->{'status'}>0) {
            print " Checker loop ...\n" if $d;
            my $rem = 0;
            foreach my $job (@{$c}) {                        # loop through all jobs
                my $mem = 'ok';
                for (my $j=0;$j<=$m;$j++) {                  # check the "job done" table
                    if ($job->{$j} eq 'no') {                # job not handled
                        $mem = 'no' if ($mem ne 'working');  # job not handled (may never override a job-in-progress state)
                    } elsif ($job->{$j} eq 'working') {      # job in progress (always overrides not handled)
                        $mem = 'working';
                    }
                }
                # Job not completed
                if ($mem eq 'no') {
                    if ($job->{'time'} eq '') {              # Set timestamp to know how long this job has been waiting
                        $job->{'time'} = time;
                    } else {                                 # Job already got a timestamp
                        my $watch = time - $job->{'time'};
                        print "Job age: $watch\n" if $d;
                        if (time - $job->{'time'} > 300) {   # Job has waited for more than 5 minutes: terminate program
                            print "TIME FOR JOB EXCEEDED - shutting down syncer";
                            $r->{'status'} = 0;
                            for (my $j=0;$j<=$m;$j++) {      # wake up syncers
                                $sp->{$j}->up();
                                $sp->{'checker'}->up();      # wake up ourselves
                            }
                        }
                    }
                } elsif ($mem eq 'working') {                # job in progress - just refresh the timestamp
                    $job->{'time'} = time;
                } else {
                    $job->{'status'} = 'complete';           # job is completely done and is marked for removal
                    $rem = 1;
                }
            }
            # Job to remove available
            if ($rem > 0) {
                lock($c);                                    # lock the command queue
                my @arr;
                while (@{$c}) {                              # drain the whole queue: keep unfinished jobs, drop completed ones
                    my $ex = shift(@{$c});
                    if ($ex->{'status'} ne 'complete') {
                        push(@arr,$ex);
                    }
                }
                for (my $j=0;$j<=$#arr;$j++) {               # put all stored (not finished) jobs back into the command queue
                    push(@{$c},$arr[$j]);
                }
            }
            print " Checker reloop ...\n" if $d;
            sleep(1);
        }
        print " Checker sleeping (".$#{$c}.")...\n" if $d;
        $sp->{'checker'}->down();
    }
    print " checker terminated ...\n" if $d;
    return(0);
}
This script does all the magic: it listens via the audit system for changed or added files and then uses rsync to copy each file to the other systems. This is also why we need a symbolic link to the data directory: when the script uses rsync to copy a file to a second system, the audit system of that second system notifies the script running there about the change. That script would then copy the file back to the first system, and so on. Because rsyncd writes through the symbolic link while the script only watches the physical path, the receiving side ignores those writes. If you do not use a symbolic link for rsync, you create an endless loop of copy and re-copy processes!
Copy this script to each system that should be kept in sync with the others. I recommend supervising this script with daemontools as well. Then edit the script on each system as follows:
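The path filter behind this can be sketched in a few lines of shell. This is only an illustration of the matching the Perl script does with its path,$p regex: the function name changed_file and the two sample lines are made up, and it assumes (as this how-to does) that writes made by rsyncd show up in the audit trail under the symlink path.

```shell
#!/bin/sh
# Print the path relative to the observed directory if the praudit line
# refers to a file below it; return non-zero otherwise.
# $1: one line of praudit output, $2: observed physical path (trailing slash)
changed_file() {
    prefix="path,$2"
    case "$1" in
        "$prefix"?*)
            rest=${1#"$prefix"}     # strip "path,<observed path>"
            printf '%s\n' "$rest"
            ;;
        *)
            return 1
            ;;
    esac
}

# A local upload is picked up; a write that arrived through the symlink is not.
while IFS= read -r line; do
    if file=$(changed_file "$line" /path/to/your/data/); then
        echo "sync: $file"
    else
        echo "skip: $line"
    fi
done <<'EOF'
path,/path/to/your/data/upload/a.jpg
path,/alternative_data_path/upload/a.jpg
EOF
```

Only the first sample line is treated as a change to sync, which is what breaks the copy loop.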
$debug can be set to 0 (for no debug output) or 1 (for debugging output).
$path should be set to the physical path of your data.
For each system that should be kept in sync, add the following line. Remember to increase the number in the square brackets ($cmds["number"]) by 1 for each line, and leave the <!--target--> placeholders in place; the script replaces them with the changed file:
$cmds[0] = '/usr/local/bin/rsync -raz --progress --size-only /path/to/symboliclink/data/<!--target--> rsync:///shareName/<!--target-->';
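With more than a couple of targets it can be less error-prone to generate these lines. The sketch below is only an illustration: the function name print_cmds and the host names are made up, and it assumes each target is addressed as rsync://HOST/shareName/.

```shell
#!/bin/sh
# Print one $cmds[N] line per target host, numbered from 0.
# $1: symlinked source path, $2: rsync module name, remaining args: host names
print_cmds() {
    src=$1
    module=$2
    shift 2
    i=0
    for host in "$@"; do
        printf "\$cmds[%d] = '/usr/local/bin/rsync -raz --progress --size-only %s<!--target--> rsync://%s/%s/<!--target-->';\n" \
            "$i" "$src" "$host" "$module"
        i=$((i + 1))
    done
}

# Hypothetical host names; replace them with your own systems.
print_cmds /path/to/symboliclink/data/ shareName node2.example.org node3.example.org
```

The output can be pasted into the configuration section of the script.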
Before changing anything on your system, make sure you have a complete backup of your system! You use this script and how-to entirely at your own risk. If you suffer any data loss by using this how-to or the script, you cannot hold me responsible.
To get close to a “realtime sync”, the script starts a separate thread for each system to keep in sync, so you need a thread-enabled Perl installation.
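Whether a Perl build is thread-enabled can be read from its configuration: perl -V:useithreads prints useithreads='define'; on a threaded build. A small sketch (the function name perl_has_threads is made up for this example):

```shell
#!/bin/sh
# Succeed if the given perl binary was built with interpreter threads.
# $1: perl binary to check (defaults to "perl" from PATH)
perl_has_threads() {
    perl_bin=${1:-perl}
    case "$("$perl_bin" -V:useithreads 2>/dev/null)" in
        "useithreads='define';") return 0 ;;
        *) return 1 ;;
    esac
}

if perl_has_threads; then
    echo "perl is thread-enabled"
else
    echo "perl has no thread support (or was not found)"
fi
```

If the check fails, Perl has to be rebuilt with thread support before the sync script can run.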
After over one year working together with Danny Braniss and testing several thousands of options, settings and configurations, I managed to get the iStore iSCSI-device working together with FreeBSD.
As a reminder, the following error occurred when trying to create a UFS filesystem on the device:
newfs -O2 /dev/da0s1
/dev/da0s1: 782023.5MB (1601584044 sectors) block size 16384, fragment size 2048
using 4256 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
super-block backups (for fsck -b #) at:
160, 376512, 752864, … … …
1601377920
internal error: can't find block in cyl 0
And in dangerously dedicated mode:
# newfs -O2 /dev/da0
Creating the label in this mode fails with the message:
newfs -O2 /dev/da0
/dev/da0: 782023.5MB (1601584044 sectors) block size 16384, fragment size 2048
using 4256 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
super-block backups (for fsck -b #) at:
160, 376512, 752864, … … …
1601377920
internal error: cg 0: bad magic number
The important hint came from a test with a PetaStor system, where everything worked perfectly. On the FreeBSD-FS mailing list I got the last piece of the puzzle. Creating the filesystem works with these commands:
# gpart create -s GPT da0
# gpart show da0
# gpart add -b 34 -s 20971519 -t freebsd-ufs -l AnosLabel da0
# newfs -O2 /dev/da0p1
Important: Replace 20971519 with the size of your device, as reported by gpart show da0.
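Reading the numbers off gpart show by hand is easy to get wrong, so here is a rough sketch that extracts them. It assumes the line starting with => has the layout "=> start size provider scheme (human size)"; the function name gpt_usable_area is made up, and the sample text is illustrative only, not output captured from the device in this post.

```shell
#!/bin/sh
# Read `gpart show <disk>` output on stdin and print "start size" (in sectors)
# of the usable area, taken from the line that begins with "=>".
gpt_usable_area() {
    awk '$1 == "=>" { print $2, $3; exit }'
}

# Illustrative sample only (numbers derived from the sector count shown above).
sample='=>        34  1601583977  da0  GPT  (764G)
          34  1601583977       - free -  (764G)'

printf '%s\n' "$sample" | gpt_usable_area | {
    read -r start size
    echo "gpart add -b $start -s $size -t freebsd-ufs -l AnosLabel da0"
}
```

On a real system you would pipe gpart show da0 into gpt_usable_area instead of the sample text.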
Today I had to install FreeBSD 6.3 on a server with a “Tyan S2925 Tomcat n3400B” mainboard. Unfortunately, the chipset of this board (nForce Pro 3400 / MAC with Marvell 88E1116-CAA Gigabit Ethernet PHY) isn’t supported by FreeBSD 6.3. So after successfully installing the system, the ifconfig output looked like this:
/root# ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
In this particular case the onboard network interfaces had to be used, so I looked at FreeBSD 7.0, and I was lucky!
After downloading the driver files from the FreeBSD source tree, I wrote a Makefile for this driver:
.PATH: ${.CURDIR}
KMOD= if_nfe
SRCS= if_nfe.c miibus_if.h opt_bdg.h device_if.h bus_if.h pci_if.h

.include <bsd.kmod.mk>
So, the directory with the files looked like this:
drwxr-xr-x 4 root wheel 512 Dec 22 17:11 ..
-rw-r--r-- 1 root wheel 156 Nov 24 2007 Makefile
-rw-r--r-- 1 root wheel 82631 Nov 24 2007 if_nfe.c
-rw-r--r-- 1 root wheel 10223 Nov 24 2007 if_nfereg.h
-rw-r--r-- 1 root wheel 3633 Nov 24 2007 if_nfevar.h
Compiling the driver with a simple “make” on the command line caused no problems, but after installing the driver there were some weird effects that prevented it from working properly (you shouldn’t install the driver at this point; I explain how later). After a lot more work I found out that the native nve support in the kernel must be disabled first.
Edit the kernel config (building a new kernel is described in the FreeBSD Handbook) and comment out the following line:
device nve # nVidia nForce MCP on-board Ethernet Networking
change to:
#device nve # nVidia nForce MCP on-board Ethernet Networking // removed 21.12.2008 sch
After building the new kernel, the devices can be used as expected. To install the driver you compiled, you must copy the resulting driver into the modules directory:
/root# cp if_nfe.ko /boot/modules/.
To load this driver at boot time, add the following line to /boot/loader.conf:
if_nfe_load="yes"
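If you script this step, it is worth making it idempotent so the line does not pile up when the script runs twice. A minimal sketch (the helper name ensure_line is made up for this example):

```shell
#!/bin/sh
# Append a line to a config file unless an identical line is already present.
# $1: file to edit, $2: exact line to ensure
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the line as a fixed string.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For the driver above the call would be ensure_line /boot/loader.conf 'if_nfe_load="yes"', run as root.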
You do not need to restart the system. To load the driver during runtime, simply type the following into your command-line:
kldload if_nfe
After this, your ifconfig will look like this:
/root# ifconfig
nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=1b
inet 192.168.1.110 netmask 0xffffff00 broadcast 192.168.1.255
ether 00:e0:81:b5:45:08
media: Ethernet autoselect (100baseTX)
status: active
nfe1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
options=1b
ether 00:e0:81:b5:45:09
media: Ethernet autoselect (none)
status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
That’s it – happy networking!