Archive for February, 2006

amanda cron jobs

Wednesday, February 15th, 2006

In addition to adding the commands to prune log files periodically, I also wanted to make it easy to determine from the server’s console which tape is required without having to log in or run any amanda commands.  The solution is to redirect the output of the appropriate amadmin command to a virtual tty that is not being used for logins.  Fedora allows the key sequence Ctrl-Alt-Fn to select a virtual tty or X session.  By default, getty runs on ttys 1 through 6 and tty7 is linked to the first X display.  But tty8 is accessible via Ctrl-Alt-F8 and getty is not running on it.  So here is a first cut at a good command:

date;/usr/sbin/amadmin CCH1 tape
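
The command as shown prints to standard output; the redirection to tty8 described above might look like the following.  This is only a sketch: the function name show_next_tape, the AMADMIN override and the optional second argument are hypothetical, added so the idea can be tried against an ordinary file before pointing it at a real console.

```shell
#!/bin/sh
# Sketch: write a timestamped "which tape is next" notice to a console device.
# Assumptions (not from the original post): the function name, the AMADMIN
# override, and the optional second argument for the output target.
show_next_tape() {
    config="$1"                      # amanda configuration name, e.g. CCH1
    target="${2:-/dev/tty8}"         # virtual tty with no getty running on it
    amadmin="${AMADMIN:-/usr/sbin/amadmin}"
    # Group both commands so a single redirection covers all of their output.
    { date; "$amadmin" "$config" tape; } > "$target" 2>&1
}
```

Calling show_next_tape CCH1 would then send both the date and the tape name to tty8.  Writing to /dev/tty8 usually requires root or suitable permissions on the device, so the job may need to run from root’s crontab rather than the amanda user’s.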

With this addition, my current amanda crontab looks like this:

# backup daily and eject tape when done
0 0 * * * /usr/sbin/amdump CCH1;/usr/sbin/ammt -f /dev/nst0 offline
# During the week, send warning if proper tape is not loaded
0 16 * * 1,2,3,4,5 /usr/sbin/amcheck -m -M pah1@bmw.hapgoods.com CCH1
# Prune old logfiles
0 8 * * * find /var/log/amanda -type f -mtime +25 -exec rm {} \;
0 8 * * * find /var/lib/amanda/CCH1/oldlog/log.* -type f -mtime +25 -exec rm {} \;

Error Messages from NTPD

Wednesday, February 15th, 2006

Lately I’ve noticed that ntpd on ducati and suzuki (but not on triumph and bmw) has been generating error messages like this:

can't open /var/lib/ntp/drift.TEMP: No such file or directory

Indeed, there is no /var/lib/ntp directory on these servers.  Curiously, the directory does exist on bmw and triumph, and I don’t think these error messages have been present on suzuki and ducati since Day One.  What changed?

The solution to the problem is easy enough:

# mkdir /var/lib/ntp

# chown ntp:ntp /var/lib/ntp

# restorecon -v -R /var/lib/ntp
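
To check other hosts for the same problem before ntpd starts complaining, a small test like the following might help.  It is only a sketch: the function name drift_dir_ok is hypothetical, and the default drift file path is inferred from the error message above.

```shell
#!/bin/sh
# Sketch: report whether the directory that should hold ntpd's drift file
# exists and is writable by the current user. The function name is hypothetical.
drift_dir_ok() {
    driftfile="${1:-/var/lib/ntp/drift}"
    dir=$(dirname "$driftfile")
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable by $(id -un): $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

The writability check is only meaningful when the function is run as the user that ntpd runs as; root can write almost anywhere.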

Done.

Yet another new tape drive

Sunday, February 12th, 2006

Matt got tired of having used DLT7000 drives and I agreed to help him get a Quantum VS160 drive working.  He bought it for about $1300 and I’ve got it running some burn-in tests (amtapetype…).  Frankly, I would not have spent so much cabbage, but it is a cool drive: 1/2 height, DLT convenience/reliability, 80GB capacity (160GB compressed), 40MB/s SCSI interface, etc.

amanda log files

Thursday, February 9th, 2006

Amanda generates a ton of log files and debug files by default.  Some are cleared automatically, others are not.  Here are commands to prune the old log files.

Prune old debug files (age well beyond tapecycle):

# find /var/log/amanda -type f -mtime +25 -exec rm {} \;

Prune old log files (age well beyond tapecycle):

# find /var/lib/amanda/CCH1/oldlog/log.* -type f -mtime +25 -exec rm {} \;
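
Since the two commands differ only in the path, they could be folded into one helper.  The sketch below is an assumption-laden variant, not the commands I actually run: the function name prune_older_than is made up, and it prunes every regular file under the given directory rather than only those matching log.*.

```shell
#!/bin/sh
# Sketch: delete regular files under a directory that were last modified more
# than a given number of days ago. Function name and defaults are assumptions.
prune_older_than() {
    dir="$1"
    days="${2:-25}"
    [ -d "$dir" ] || return 0        # nothing to prune if the directory is absent
    # "-exec rm -f {} +" batches many files into few rm invocations.
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}
```

For example, prune_older_than /var/log/amanda 25 would cover the debug files.  Note that find counts whole 24-hour periods, so -mtime +25 only matches files that are at least 26 days old.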

References:

http://wiki.zmanda.com/index.php/Amanda_log_files

More advanced amanda configuration

Thursday, February 9th, 2006

Now that I have become more confident with the operation of amanda, I’ve decided to increase the number of tapes in the rotation.  This should allow for a richer file history, a partial measure of redundancy and an opportunity to manage off-site tapes more efficiently.  These are my current parameters:

dumpcycle 1 weeks

runspercycle 5 days

tapecycle 6 tapes

runtapes 1 tapes

bumpdays 1 day

I run amdump every day, but I expect to change tapes only about five times per week.  The holding disk is used to catch the dumps from days when I forget to change tapes.  I have the autoflush parameter set so those stored dumps get written out on the next amdump run with an appropriate tape.  With the current parameters:

  • Files that change daily should have six versions available
  • Static files will be on one tape about 4/5 of the time.

This interpretation depends on several factors, including available excess tape capacity and holding disk capacity, the timing of tape swaps, etc.  But if I hold all factors the same and increase the number of tapes to eight (tapecycle 8 tapes) and bumpdays to 2:

  • Files that change daily should have eight versions available
  • Static files will be on one tape about 5/8 of the time.

The bumpdays parameter has a dramatic effect on the redundancy of taped data.  It multiplies the number of days at a particular dump level.  This can effectively multiply the required tape and tape drive capacity as well.  However, it seems to be a prudent move to set it to at least two to ensure that a given file is on at least two tapes two days after its last revision.

Another parameter critical to understanding how amanda works is runspercycle.  I have found distressingly poor definitions of how this parameter works, even on the official amanda sites.  Some key insight comes from the original changelog for amanda:

“A new `runspercycle’ keyword in amanda.conf to specify the number of amdump runs in a dumpcycle. The default is one run every day.  A value of 0 (the default) means the same value as dumpcycle. A value of -1 means guess the number of runs from the tapelist file, which is the number of tape used in the last dumpcycle days / runtapes.  If you don’t run amdump every days, you must set runspercycle otherwise amanda will not be able to balance the dump. You must set runspercycle to -1 if you want the same behavior as previous version of amanda.”

Pasted from <http://stuff.mit.edu/afs/sipb/user/zacheiss/amanda-2.4.1p1/NEWS>

It confirms that runspercycle is not used to define the maximum targeted time between level zero dumps (that would be dumpcycle), but is used instead to schedule the dumps needed to both meet the target and outperform it (“balance”).  It is probably used to calculate a factored backup capacity per dumpcycle:

capacity = length * dumpcycle * runtapes * (runspercycle / dumpcycle)
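
As a worked example of the guessed formula above, the sketch below plugs in the parameters from this post.  The 80 GB tape length is a hypothetical figure chosen only for illustration, the function name cycle_capacity is made up, and dumpcycle is expressed in days (1 week = 7) so that the units cancel.

```shell
#!/bin/sh
# Sketch: evaluate the guessed capacity formula with integer arithmetic.
# length_gb is a hypothetical tape length; the other values are the
# parameters listed earlier, with dumpcycle converted to days.
cycle_capacity() {
    length_gb="$1"; dumpcycle_days="$2"; runtapes="$3"; runspercycle="$4"
    # capacity = length * dumpcycle * runtapes * (runspercycle / dumpcycle)
    # Multiply before dividing so integer division does not truncate to zero.
    echo $(( length_gb * dumpcycle_days * runtapes * runspercycle / dumpcycle_days ))
}

cycle_capacity 80 7 1 5    # prints 400
```

Because dumpcycle appears once as a factor and once as a divisor, the formula reduces to length * runtapes * runspercycle: with these numbers, five tapes’ worth of capacity per dumpcycle.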

More amrecover

Thursday, February 9th, 2006

Even though amrecover and selinux don’t get along, I still suspect that amrecover will be the best choice for most single-file recoveries.  Here is an annotated recovery session:

  1. Change to the appropriate directory to ensure the restored file is placed where you want it.  Keep in mind that amanda backs up and restores with paths relative to the root of the disklist entry (DLE).  For example, if the relevant DLE is /home, and you restore /home/cch1/BigFile, amrecover will create the file ./cch1/BigFile.  If the current working directory is /home, the file is thus restored to its original location.  Otherwise, a directory cch1 is created under the current working directory and BigFile is placed into that directory.  This can be handy if you don’t want to restore files into their original location.  The lpwd and lcd commands can help you display and adjust, respectively, the current working directory after you have started amrecover, but to be safe, set it before starting.

In this example, I do want to restore into the original location.

# cd /home

  2. Disable selinux enforcement on both the local host (where the file is missing) and the amanda index/tape server.

# setenforce 0

  3. Start the recovery application as root.

# amrecover CCH1 -s triumph -t triumph

  4. You may need to select the disk (DLE) holding the files to be restored if you did not start amrecover from a directory within the same DLE that held the file you want to restore.  Otherwise, the default DLE will be the right one.

amrecover> listdisk

…

amrecover> setdisk /home

  5. Check the potentially relevant tapes.  This can help you determine the dates on which backups were performed.

amrecover> history

  6. Select an appropriate as-of date.  If you want the most recent version of the backed-up file, the default date is appropriate.  Otherwise, set the date you need.

amrecover> setdate 2006-02-03

  7. Change to the directory within the backup that holds the file in question.

amrecover> cd cch1

amrecover> ls

…

  8. Identify the file(s) that you wish to restore and add them to the restore set.  Confirm your selection and note the tape that will be required.

amrecover> add BigFile

amrecover> list

…

  9. Start the recovery process.  Amanda will do a decent job of walking you through the remainder of the process.

amrecover> extract

…
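
The path rule from the first step (restored paths are relative to the root of the DLE) can be sketched as a small helper that predicts where a file will land.  The function name restored_path is hypothetical, and the sketch assumes the DLE is not the filesystem root.

```shell
#!/bin/sh
# Sketch: predict where amrecover will place a restored file, based on the
# rule that paths are relative to the root of the DLE. Names are hypothetical.
# Assumes the DLE is a directory other than "/".
restored_path() {
    dle="$1"     # root of the disklist entry, e.g. /home
    orig="$2"    # original absolute path of the file, e.g. /home/cch1/BigFile
    cwd="$3"     # directory from which amrecover is started
    rel="${orig#"$dle"/}"            # strip the DLE root: cch1/BigFile
    printf '%s/%s\n' "${cwd%/}" "$rel"
}

restored_path /home /home/cch1/BigFile /home           # /home/cch1/BigFile
restored_path /home /home/cch1/BigFile /tmp/restore    # /tmp/restore/cch1/BigFile
```

The second call shows the “handy” case: starting amrecover from a scratch directory keeps the restored copy away from the original location.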

Proliant Storage Options

Thursday, February 9th, 2006

In the search for offsite backup solutions, I looked into rsync. It solves a lot of problems, but requires double the hard drive space: there is no way to store the rsync blocks directly on tape, and compressing filesystems don’t seem to be ready for Fedora Prime Time. So I decided to look at increasing the disk space on Matt’s server (6/333) and one of my servers (probably bmw, my 6/550). The first problem that presents itself is that both of these servers use the old-style drive cages that can hold at most eight drives, and that only with dual controllers, a duplex backplane and 1″ drives. Here are some of the valid configurations:

Configuration: Wide-Ultra (SCSI-2/SCSI-3) Simplex Drive Cage (Spare PN 306572-001)

  • Simplex Backplane (Spare PN )
  • Simplex Drive Cage (Spare PN 328695-001): holds 8 x 1″ or 6 x 1.6″ drives
  • Wide-Ultra SCSI-2 and SCSI-3
  • Probably identical to
  • Pass-through Board (Spare PN 306569-001): splits single signal across both rows of cage
  • Drives: Wide-Ultra SCSI-3 SCA Drives (various); Wide-Ultra SCSI-2 SCA and lesser Drives (various)
  • Controller: Wide-Ultra SCSI-3 Controller (various, including Smart-2DH Array); only a single channel (<=7 drives) is required

Configuration: Wide-Ultra (SCSI-2/SCSI-3) Simplex/Duplex Drive Cage (Option Kit 328073-B21)

  • Simplex/Duplex Backplane (Spare PN )
  • Simplex/Duplex Drive Cage (Spare PN 328696-001): holds 8 x 1″ drives or 6 x 1.6″ drives
  • Drives: Wide-Ultra SCSI-3 SCA and lesser Drives (various)
  • Controller: Wide-Ultra SCSI-3 Controller (various, including the SmartArray 3200)

Configuration: Wide-Ultra2/3 Simplex Drive Cage (Option Kit 382157-001)

  • LVD SCA Simplex Backplane
  • 10 x 1″ Drive Cage (Spare PN 387087-001)
  • Drives: Wide-Ultra SCSI-3 SCA Drives (various)
  • Controller: Wide-Ultra SCSI-3 Controller (various, including the SmartArray 3200)

Configuration: Wide-Ultra2/3 Simplex/Duplex Drive Cage (Option Kit 123135-B21, Spare PN 144575-001)

  • LVD SCA Duplex Backplane
  • 10 x 1″ Drive Cage (Spare PN )
  • Drives: Wide Ultra2 Drives (various); Wide Ultra3 Drives (various)
  • Controller: Wide Ultra2/Ultra3 Controller (various, including the SmartArray 5300)

References:

http://partsurfer.hp.com/cgi-bin/spi/main?sel_flg=partlist&uniqparts=Y&template=main&model=PRO3000&plist_styp=flag&plist_sval=ALL

More readable amanda reports

Wednesday, February 8th, 2006

According to this blog (http://saintaardvarkthecarpeted.com/blog/?p=214), you can improve the look of amanda’s reports with the columnspec configuration directive.  Since I had never heard of the columnspec directive, I went to the relevant amanda manual section (http://www.amanda.org/docs/amanda.conf.5.html) and read up on my options there.  I found a couple of useful directives that I incorporated into my configuration:

displayunit m

#Changes unit used for reports and displays to megabytes

columnspec "HostName=0:12,Disk=1:8,DumpRate=1:9,TapeRate=1:9"

The effect is to change the standard amanda summary

[Screen clipping: standard summary, 08/02/2006, 12:11]

to something more readable:

[Screen clipping: summary with displayunit and columnspec applied, 08/02/2006, 13:49]

Useful Commands:

# amreport CCH1 -l /var/lib/amanda/CCH1/log.20060207.0 -f /tmp/t; view /tmp/t

NTP Clients

Tuesday, February 7th, 2006

Today I noticed that my simple webcam (a D-Link DCS-1000W) was not getting time from its configured server (triumph). In fact, I could not get time from any “basic” LAN client, yet I know this used to work.

A little research on the web shows that the meaning of the notrust restrict option changed substantially between ntpd version 4.1 and ntpd version 4.2. I am now running 4.2.0.1.20040617. In effect, the notrust option previously meant “don’t trust as a time source” and now it means “don’t allow non-authenticated connections.” This recycling of a configuration keyword seems like a monumentally bad decision.

Simply removing the notrust option from the restrict line allows LAN clients to again get time from the server. However, I wonder what restrictions, if any, are now in place that would prevent a client from adjusting the time on the server. I suspect the mere fact that the client is not mentioned as a server or peer is adequate.
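
To find the restrict lines that still carry the flag after an upgrade, something like the following might do.  It is a sketch only: the function name restrict_with_notrust is hypothetical, and /etc/ntp.conf is assumed as the default location of the configuration file.

```shell
#!/bin/sh
# Sketch: list restrict lines in an ntp.conf-style file that still carry the
# notrust flag. The function name and default path are assumptions.
restrict_with_notrust() {
    conf="${1:-/etc/ntp.conf}"
    # Match uncommented "restrict ..." lines containing notrust as a whole word.
    grep -nE '^[[:space:]]*restrict[[:space:]].*[[:space:]]notrust([[:space:]]|$)' "$conf"
}
```

The output is prefixed with line numbers, and the function returns non-zero when no such line exists, so it can also serve as a quick check on the other servers.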

Interesting commands:

# ntpdate -q triumph

Should approximate query of simple client

# ntpdate -d triumph

Same query in debug mode: prints each step in detail without adjusting the local clock

# ntpq -p triumph

Shows the sync state of the server

References:

http://ntp.isc.org/bin/view/Support/AccessRestrictions#Section_6.4.3.2.

Insight into Passive FTP problem

Friday, February 3rd, 2006

After reviewing some open issues with amanda, I noted the similarities between my passive FTP problem and the remote backup problem I have with suzuki (Bugzilla 172845).  In both cases, a firewall helper module resolves a problem for “local” hosts but fails with hosts on the other end of an IPSec tunnel.  Hmmm…

Since the problems are so similar, I raised a bug for the FTP problem as well (179890).