Friday, May 22, 2015

Firewall build (part 3) VLAN Security

Since I plan on using VLANs on the WiFi Access Points to separate guest vs friend vs trusted traffic, I need to make sure that I'm doing VLANs in a secure fashion and not leaving any large holes.

The primary recommendations from the listed sources are:

  • Don't use VLAN 1 (the default VLAN) for anything
  • Any ports that do VLAN trunking should use a dedicated VLAN ID
  • Do explicit tagging of the native VLAN on trunking ports
  • Set all end-user ports to non-trunking (a.k.a. "access ports")
  • Disable unused ports and put them in a separate VLAN
  • Disable Spanning Tree Protocol (STP) on end-user ports
  • Use MD5 authentication for VLAN Trunking Protocol (VTP)
  • Physically secure the switch and control access to the management functions
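
If the firewall itself ends up terminating the trunk (a Linux box, for example), the tagged sub-interfaces can be created with iproute2.  A minimal sketch, assuming the trunk arrives on eth1 and using made-up VLAN IDs:

# load the 802.1Q tagging module if it isn't already loaded
modprobe 8021q

# one tagged sub-interface per VLAN carried on the trunk
ip link add link eth1 name eth1.20 type vlan id 20   # guest WiFi
ip link add link eth1 name eth1.30 type vlan id 30   # trusted WiFi
ip link set dev eth1.20 up
ip link set dev eth1.30 up

Per the list above, no user traffic should ride on the untagged/native VLAN of the trunk interface itself.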


Reference Links:

  1. Virtual LAN Security: weaknesses and countermeasures (SANS)
  2. VLAN Hacking (InfoSec Institute)
  3. VLAN Hopping (Wikipedia)

Thursday, May 21, 2015

Firewall build (part 2) hardware needed for VPN duties

One thing to think about when sizing the hardware for a firewall is how much CPU power will be needed for OpenVPN (or IPSec / L2TP).  OpenSSL comes with a built-in "speed" command which will benchmark your system and give you an idea of the maximum possible bandwidth.

Just run "openssl speed" at the command line and look for the AES-128 and/or Blowfish results.  I prefer to look at the 1024 byte or 8192 byte columns in the output to figure out the upper range.  While Blowfish is good at the smaller block sizes, AES-128 catchs up and surpasses it with the larger block sizes.

Values at or above 100000k should indicate that the firewall has enough performance to drive an OpenVPN connection at close to gigabit speeds, or to handle multiple OpenVPN connections at the same time without completely saturating the CPU.
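
If you only care about the two ciphers above, you can pass them to the speed command directly (the exact cipher names can vary a little between OpenSSL versions, so treat this as a sketch):

$ openssl speed bf-cbc aes-128-cbc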

AMD Opteron 2210 HE @ 1.8GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
blowfish cbc     68358.84k    74350.46k    75845.03k    76373.67k    76556.97k
aes-128 cbc      50477.29k    53816.28k    55093.08k   128709.63k   130465.79k

AMD Phenom II X4 810 @ 2.6GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
blowfish cbc     94477.65k   101825.28k   103154.35k   103857.83k   104060.25k
aes-128 cbc      76376.65k    81608.09k    83915.50k   213516.45k   216016.95k

AMD Opteron 4180 @ 2.6GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
blowfish cbc     93781.75k   101154.41k   102983.68k   103730.52k   103923.71k
aes-128 cbc      76233.64k    81631.07k    83197.52k   213366.89k   215309.87k

103720k is roughly 99 MiB/s or ~830Mbps, which is reasonably close to gigabit speeds
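
Since openssl reports those figures in thousands of bytes per second, converting to Mbps is just shell arithmetic; for example:

$ echo "$(( 103720 * 1000 * 8 / 1000000 )) Mbps"
829 Mbps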

General guidelines/notes:

  • I'm a firm believer in multi-core for servers and desktops.  So look for hardware that is at least dual-core when shopping.  An inexpensive quad-core would be even better and give a bit of headroom for monitoring tasks.
  • For the AMD CPUs (Opteron / Athlon64 / Phenom) made in 2007-2011, you'll want at least a 2.2GHz core.  For Intel Core2 CPUs or 1st/2nd generation i3/i5, try to get at least a 2.0GHz core.
  • Intel Atom CPUs are underpowered; the 1.8GHz dual-core units are reported to top out at around 500Mbps for general routing and definitely can't handle gigabit-speed OpenVPN.  But they are low power, so maybe that outweighs the performance issue.  A rule of thumb is that the Atom CPUs are about 1/3 to 1/2 as powerful as an i3/i5 at the same clock speed.


Wednesday, May 20, 2015

Migration of domain to tgharold.blogspot.com (from techblog.tgharold.com)

Hopefully at some point, Google adds SSL support for blogs hosted at the blogspot.com domain.  In anticipation of that, I'm changing the primary URL to be "tgharold.blogspot.com" and will point "techblog.tgharold.com" at it so that old links still work.

Sunday, May 17, 2015

md5sum bash script to create check file for a directory tree

Just a quick script that will run through the current directory and all descendant directories, creating a single "verify-tree.md5" file (using md5sum). If the check file already exists, then the existing one gets moved out of the way to "verify-tree.yyyymmdd-hhmmss.md5" for safekeeping.

Check files are useful for whenever you have a set of files that will not (or should not) change over time.  Such as files written off to an archive tape / disk / optical media / flash drive.  While the media and file system might also do some checking, it's good to have a second layer that is under your control and which can be queried.

(If you want file recovery features, you should look into MultiPar or par2j.)

The script prints the output filename, a count of the files found, the total size of the tree, and timing information for both the hashing pass and the verification pass, finishing with an "All files verified." confirmation.

The file (verify-tree.md5) that is created is a standard md5sum file and can be read by just about any compatible software.  Some software cannot handle sub-folders, however, so you may have to use the md5sum program to do the verification.

----------------------------------------------------------------
#!/bin/bash

# stop script on errors
set -e

PROG=md5sum
FILENAME=verify-tree

if [[ -e "${FILENAME}.md5" ]]; then
    mv "${FILENAME}.md5" \
    "${FILENAME}.$(date --reference=${FILENAME}.md5 '+%Y%m%d-%H%M%S').md5"
fi

echo ""
echo "Output Filename: ${FILENAME}.md5"
echo "Files Found: $(find . -type f -not -name "${FILENAME}*.md5" | wc -l)"
echo "Size: $(du -chs . | grep 'total')"

time find . -type f -not -name "${FILENAME}*.md5" \
    -exec ${PROG} "{}" \; >> "${FILENAME}.md5"

echo ""
echo "Files Processed: $(wc -l ${FILENAME}.md5)"

echo ""
echo "Checking..."
time ${PROG} -c --quiet "${FILENAME}.md5"
echo ""
echo "All files verified."
----------------------------------------------------------------

Note: There are multiple ways to write the md5sum line.  The xargs variants should be faster (one md5sum process per batch of files instead of one process per file), and the -print0 / -0 form is the one that copes with spaces in filenames, but I have not tested them.

find . -type f -not -name "${FILENAME}*.md5" -exec ${PROG} "{}" \; >> "${FILENAME}.md5"

find . -type f -not -name "${FILENAME}*.md5" -print0 | xargs -0 ${PROG} >> "${FILENAME}.md5"

find . -type f -not -name "${FILENAME}*.md5" | xargs ${PROG} >> "${FILENAME}.md5"

It's also easy to adapt this script to work with sha256sum or sha1sum.
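
For example, switching the script over to SHA-256 should only require changing the PROG variable (untested sketch):

PROG=sha256sum
# ...and optionally swap the hard-coded ".md5" suffix for ".sha256" elsewhere in the script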


The verification script is a trimmed down version of the original script:

----------------------------------------------------------------
#!/bin/bash

# stop script on errors
set -e

PROG=md5sum
FILENAME=verify-tree

echo ""
echo "Checking: ${FILENAME}.md5"
echo "File Hashes: $(wc -l ${FILENAME}.md5)"
time ${PROG} -c --quiet "${FILENAME}.md5"
echo ""
echo "All files verified."
----------------------------------------------------------------

I have tested this with Cygwin 64-bit on Windows 7 Professional 64-bit, but it should also work fine on Linux / Unix / OS X systems as long as the md5sum command is available.  The script is conservative in design with very few "tricks", so it should be portable.

Sunday, May 10, 2015

MD5 vs SHA-1 vs SHA-256 performance

I was curious this weekend about how MD5 vs SHA-1 vs SHA-256 performance stacks up.  If you have the OpenSSL libraries installed, you can run a short test to calculate performance on your CPU.  It gives ballpark estimates, which may or may not carry over to real-world performance on actual file data.

$ openssl speed md5 sha1 sha256

Estimates/Summary:

SHA-1 is about 55-75% the speed of MD5
SHA-256 is about 25-40% the speed of MD5
SHA-256 is about 50-60% the speed of SHA-1

Whether or not you will be CPU-bound when computing the file hashes depends on where you are reading the files from.  Over a gigabit LAN, from older mechanical hard drives, or from external USB2/USB3 drives, most modern CPUs can keep up even with SHA-256.  But if you are reading the files off a local SSD or a really fast mechanical drive (or RAID array), then you are likely to be bottlenecked by the CPU.
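
One rough way to see what the CPU alone can do, independent of the storage, is to hash a stream of zeros and let dd report the throughput (the size here is arbitrary, and hashing zeros only gives a ballpark figure):

$ dd if=/dev/zero bs=1M count=2048 | md5sum
$ dd if=/dev/zero bs=1M count=2048 | sha256sum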

Details (openssl test):

AMD Opteron 2210 HE @ 1.8GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              25350.32k    80336.17k   191208.72k   295806.29k   353239.04k
sha1             27497.51k    77933.74k   168168.28k   235850.41k   268205.53k
sha256           22543.94k    51445.66k    88818.90k   108275.37k   116416.81k

AMD Opteron 6212 @ 2.6GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              45318.86k   134308.27k   324486.83k   493194.58k   582216.36k
sha1             47027.79k   128927.42k   278593.02k   410244.44k   471425.02k
sha256           28690.18k    62662.12k   106333.61k   127998.98k   136325.80k

AMD Opteron 1214 @ 2.2GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              30417.85k    96951.65k   235868.24k   361845.33k   433094.16k
sha1             34126.39k    98072.04k   208456.70k   291041.35k   328654.85k
sha256           27362.92k    63127.50k   108537.35k   132564.38k   142727.86k

AMD Opteron 4180 @ 2.6GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              43180.73k   129437.27k   298339.15k   442248.19k   519602.18k
sha1             45699.96k   123160.41k   255322.20k   347329.88k   390250.50k
sha256           33757.18k    75191.85k   128929.37k   157659.82k   168790.70k

Intel Xeon E5520 @ 2.27GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              25945.50k    88897.43k   220292.01k   354356.91k   430339.41k
sha1             26568.69k    78716.78k   174506.35k   251495.08k   288858.11k
sha256           20495.77k    49206.73k    90735.59k   114904.75k   124605.78k

AMD Phenom II X4 810 @ 2.6GHz
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              42430.27k   130101.07k   301091.11k   444996.84k   522918.05k
sha1             46015.86k   125298.40k   256391.07k   349968.95k   393227.03k
sha256           34432.84k    76378.03k   130828.67k   158859.76k   168782.78k

Additional tests/experiments:

Tests using "(sha256|sha1|md5)sum" programs on Opteron 4180 w/ SSDs, using 133GB of large files.  The CPU core assigned to the task was pegged at 100% utilization according to "atop".  The SSD used has an estimated top speed of 490 MB/s.

MD5: ~318 MiB/s (428 sec)
SHA-1: ~187 MiB/s (729 sec) - 1.7x slower than MD5
SHA-256: ~113 MiB/s (1204 sec) - 2.8x slower than MD5, 1.65x slower than SHA-1

Sunday, April 26, 2015

Firewall build (part 1)

Part of moving to a new place is reevaluating your network.  On my current network, I have a fairly basic setup:

  • One WiFi Access Point (WAP) running 802.11 b/g
  • A Linux server acting as firewall / file share / backup storage
  • A few laptops
  • A few tablets/phones
  • A few other PCs
When I set this all up a few years ago I kept it very simple.  The Linux server is the gateway device with routing / filtering / NAT and other features.  The WAP is part of the internal network, running in WPA/PSK mode with a very long and randomly generated password.

After I move, I want to accomplish a few things:
  • Use a refurb or low-power PC to run just the firewall / VPN
  • Put the WiFi access point on a separate NIC
  • Possibly run a DMZ
  • Provide limited guest WiFi
  • Evaluate pfSense instead of Linux+Shorewall
To do all that, I need a minimum of four network ports for full physical separation, or something with two ports if I use VLANs (not as safe, and more difficult to configure and get right).

I've done some looking around and while a low-power 25-35W compact PC for the firewall would be nice, it would cost me around $600.  Maybe $400-$500 if I shop around.  There are also the really tiny units that will run monowall (m0n0wall), but those are also $200-$300 for something that will handle the faster WiFi / FIOS / cable modems.  Plus it can be difficult to find something with four network ports.

Firewalls don't need a lot of CPU power, but a dual/quad-core Intel Atom isn't enough.  An i5/i7 would likely be complete overkill, even for 802.11ac / 802.11n or gigabit traffic.  The older Pentium / Celeron / Core Duo chips are probably a bit on the slow side.  The AMD Phenom or Athlon64 chips are probably okay.

So what I've settled on is a refurbished PC that is at least a Core 2 Duo (2 cores) with 4GB of RAM, along with a refurbished NIC.  The pfSense distro only needs a handful of gigabytes to install, so any unit with at least 40GB of space will be plenty.  The base units can be picked up for as little as $50-$125, and add-in NIC cards are $10-$40 depending on what you use.  If the box dies, I get another and move the drive over.  If one of the NICs fries, I can pick up another NIC.  Power requirements will probably be around 80W to 120W.

For the smaller sized PCs, you might only have 1-2 expansion slots, which means you'll need a multi-port NIC.  The cost of the dual-port NICs is likely to be more than what you pay for the base PC.  I've seen dual-port refurbished NICs for as low as $50, but paying $100-$150 is more likely.  However, a good NIC tends to work fine for close to a decade, and it can be moved from PC to PC.

Friday, August 08, 2014

Postfix: Calculate number of TLS encrypted SMTP sessions

I was curious as to what amount of SMTP traffic is encrypted to our servers.

This assumes that you are running Postfix, and you might need to adjust smtpd_tls_loglevel to be 1 or 2.  I'm not sure if this catches all instances where the SMTP connection switches to SSL, or just those that support TLS.

# fgrep 'postfix/smtpd' maillog* | fgrep ': connect from' | wc -l
# fgrep 'postfix/smtpd' maillog* | fgrep ': setting up TLS connection' | wc -l
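
A small (untested) sketch that wraps those two searches and computes the percentage:

TOTAL=$(fgrep 'postfix/smtpd' maillog* | fgrep ': connect from' | wc -l)
TLS=$(fgrep 'postfix/smtpd' maillog* | fgrep ': setting up TLS connection' | wc -l)
awk -v tls="$TLS" -v total="$TOTAL" \
    'BEGIN { printf "%d out of %d connections were TLS (%.1f%%)\n", tls, total, 100 * tls / total }'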

On box #1 that we have at the office:

16151 out of 293746 connections were TLS (5.5%)

On box #2:

27485 out of 654294 connections were TLS (4.2%)

A very rough estimate is that one connection = one message delivered to the server.  Assuming that is true, only 4-5% of SMTP traffic to our domains (via port 25/tcp) is sent over an encrypted channel.  On the other hand, probably 90% of all of our connections are spam zombies who probably don't do TLS.  In order to dig deeper, I would have to tie every non-spam message to a specific connection in the Postfix log file.

Thursday, October 03, 2013

Install GRUB onto multiple boot disks in Software RAID-1 (quick reference)

Here is an example where I have a 3-way RAID-1 array. The /boot partition is stored at /dev/md0. This installs GRUB to each disk, so that if one disk fails, you can boot off one of the other disks.

# grub
grub> find /grub/stage1
 (hd0,0)
 (hd1,0)
 (hd2,0)
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

With this you should be able to now boot from any of the disks in the RAID-1 array, no matter what boot order you set in the BIOS.

For safety, I suggest using UUIDs in your /etc/fstab file for your /boot and / (root) partitions. This way the machine will boot off the UUIDs of the file systems, even if mdadm (software RAID) decides to renumber your /dev/md# devices. Note: This is the default behavior in RHEL 6 / CentOS 6.
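
For reference, here is a hedged example of what that looks like in /etc/fstab (the UUIDs below are placeholders; "blkid /dev/md0" will print the real values):

# reference file systems by UUID instead of /dev/md# device names
UUID=11111111-2222-3333-4444-555555555555  /boot  ext4  defaults  1 2
UUID=66666666-7777-8888-9999-aaaaaaaaaaaa  /      ext4  defaults  1 1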

Monday, September 09, 2013

View and changing the SSH HostKey files

With all of the NSA leaks in the past few months, I figured it was a good time to go look at the SSH keys that we use on the servers and decide whether we want to re-key things. Naturally, this is a bit of a PITA because you'll have to let all clients know that the SSH host key changed and users will have to edit their ~/.ssh/known_hosts file.

First off, let's look at the current key information (using the "-l" option to display the fingerprint, and the "-f filename" option to look at an existing file):

# /usr/bin/ssh-keygen -l -f /etc/ssh/ssh_host_dsa_key
1024 86:72:0c:d8:47:ce:c4:4a:79:25:9b:ad:22:1b:de:87 /etc/ssh/ssh_host_dsa_key.pub (DSA)
# /usr/bin/ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key 
3072 2b:3d:27:77:49:bf:05:09:ee:b7:74:68:e8:f3:fc:3f /etc/ssh/ssh_host_rsa_key.pub (RSA)

This displays a few useful pieces of information:

#1 - The key size is 1024 for the DSA key. All DSA keys are 1024 bits in size due to FIPS 186-2 (Federal Information Processing Standard 186-2). While the newer FIPS 186-3 and FIPS 186-4 standards allow larger DSA keys, I'm not sure how well supported they are in OpenSSH.

My RSA key is 3072 bits in size instead of the default 2048 bits in CentOS 6. Older releases had a default of only RSA/1024 bits, which is considered to be a bit weak today. The current recommended minimum is 2048 bits and the maximum in common use is 4096 bits.

A good read is Anatomy of a change - Google announces it will double its SSL key sizes.

#2 - The key fingerprint, which should be communicated to your users via out-of-band communications.

To re-key, I suggest using the following for DSA keys:

# /usr/bin/ssh-keygen -N '' -C 'servername SSH host key Sep 2013' -t dsa -f /etc/ssh/ssh_host_dsa_key
Generating public/private dsa key pair.
/etc/ssh/ssh_host_dsa_key already exists.
Overwrite (y/n)? y
Your identification has been saved in /etc/ssh/ssh_host_dsa_key.
Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub.
The key fingerprint is:
86:72:0c:d8:47:ce:c4:4a:79:25:9b:ad:22:1b:de:87 servername SSH host key Sep 2013

For RSA keys, you need to change "-t dsa" to "-t rsa", change the filename, and add a "-b 2048" option before the "-f filename" option. Suggested key sizes are 2048 for short-term use, 3072 for 1-2 decades, and 4096 for keys that will be in use past 2030. The downside is that as the key length doubles, performance drops by a factor of 6-7x.
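
Putting that together, the RSA version of the re-key command would look something like this (adjust the comment and key size to suit):

# /usr/bin/ssh-keygen -N '' -C 'servername SSH host key Sep 2013' -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key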

Friday, August 23, 2013

Linux server partitioning with mdadm software RAID and LVM2

Over the years, I've really come to appreciate what judicious use of LVM (or LVM2) brings to the table when administering servers. If you rely on it heavily and leverage it properly, you can do things like:
  • Snapshot any file system (other than /boot) for backups, or to make images, or to test something out.
  • Migrate a logical volume (LV) from one physical volume (PV) to another, without having to take the file system offline or deal with downtime.
  • Resize file systems that are too large or too small, with minimal downtime (if the file system supports it).
Basically, other than /boot, if you're thinking of creating a separate partition or Software RAID device, you should be using LVM instead of physical partitions or raw RAID devices. You gain a lot of flexibility in the long run, and setting up LVM on top of hardware RAID, software RAID, or plain old disks is no longer that difficult.

These days, when I set up disk partitions to hold a server's boot-up files, I only create (2) partitions on the drive. One is a Software RAID-1 mirror set to hold /boot (usually 256-1024MB), and the rest of the drive is a second RAID-1 mirror set that is turned into an LVM physical volume (PV) and assigned to a volume group. I will usually only partition out to about 99% of the drive size if I'm doing Software RAID, because that makes it easier later to put in a different model disk of the same capacity and still have things work. Drives from different manufacturers have slightly different capacities, so you can run into trouble down the road when you go to replace a failed drive if you assumed all drives were exactly the same size as your original drives.

Inside that LVM volume group (VG), I create all of my other partitions. These days, that means:
  • / - the "root" partition, usually 16-32GB for CentOS 6
  • /home - Usually starts at 4GB for a server where people won't be logging in much.
  • /opt - 4-24GB
  • /srv - 1-4GB (sub-directories get their own LV later)
  • /tmp - 4-24GB
  • /usr/local - 8-24GB
  • /var - 4-24GB
  • /var/log - 4-24GB
  • /var/spool - 2-4GB to start
  • /var/tmp - 2-8GB to start
And that's just the basic operating system file systems. For things like squid, e-mail, web servers, samba shares, etc., each of those will get its own LV, allocated from the server-wide volume group.
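
For reference, a rough sketch of how that base layout gets created (device names, the volume group name, and the sizes are all illustrative, not a recipe):

# two RAID-1 mirrors: a small one for /boot, the rest of the drive for LVM
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

# turn the large mirror into a PV and add it to a volume group
pvcreate /dev/md1
vgcreate vg0 /dev/md1

# carve individual logical volumes out of the volume group
lvcreate -L 16G -n root vg0
lvcreate -L 4G -n var_log vg0
mkfs.ext4 /dev/vg0/root
mkfs.ext4 /dev/vg0/var_log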

Follow ups:
  • GRUB2 understands mdadm (software RAID) and LVM. So we will eventually be able to put /boot in an LVM volume. But the GRUB that ships with RHEL6 and CentOS6 is still GRUB 0.97.

Sunday, August 18, 2013

TLS: SSL_read() failed: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48

Here's a fun error message that we're getting on our mail server at the office:

Aug 15 10:52:26 fvs-pri dovecot: imap-login: Disconnected (no auth attempts in 1 secs): user=<>, rip=172.30.0.221, lip=172.30.0.1, TLS: SSL_read() failed: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48, session=

The odd thing is that public SSL testing tools (such as the one at DigiCert) do not report any problems with the mail server's SSL configuration. And this only seems to affect some clients, and possibly only acts up with Dovecot. So my guess is that Apache/OpenSSL is configured correctly, but Dovecot is not.

The key to figuring this out is the "openssl s_client" command:

openssl s_client -connect mail.example.com:143 -starttls imap

This showed us that the openssl library was having problems validating the server's certificate, because the intermediate certificates were not also stored in the certificate file that gets sent to the client. The solution is to adjust the file pointed to by Dovecot's "ssl_cert" argument and add your certificate vendor's intermediate certificates to the end of the file.

The order of the certificates inside that file matters. Your server certificate needs to be first, then list the rest of the certificates in order as you move up the certificate chain to the root CA.
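
Building the combined file is just a concatenation in the right order (the file names below are made up, and the ssl_cert line assumes Dovecot 2.x syntax):

cat mail.example.com.crt intermediate-ca.crt > /etc/ssl/certs/mail-chain.pem

# then point Dovecot at the combined file:
# ssl_cert = </etc/ssl/certs/mail-chain.pem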

Wednesday, August 14, 2013

LUKS: /dev/mapper "read failed after 0 of 4096 at 0: Input/output error"

We're using external USB drives for our backups, protected using LUKS/cryptsetup. On our 3-4 year old Opteron 2210 HE CPU running at 1.8GHz, we estimate that LUKS can perform at about 60-70 MB/s per CPU core. We mount the LUKS volumes automatically (at server boot) by listing them in /etc/crypttab and using a key-file instead of having to enter a password, and autofs handles the automatic mounting/dismounting of the ext4 file system inside the LUKS volume.

It all works very well, until you remove the USB device and then run LVM's pvscan/lvscan commands, which then throw the following errors:
# pvscan
  /dev/mapper/USBOFFSITE12B: read failed after 0 of 4096 at 999201636352: Input/output error
  /dev/mapper/USBOFFSITE12B: read failed after 0 of 4096 at 999201693696: Input/output error
  /dev/mapper/USBOFFSITE12B: read failed after 0 of 4096 at 0: Input/output error
  /dev/mapper/USBOFFSITE12B: read failed after 0 of 4096 at 4096: Input/output error
  PV /dev/md12   VG vg13   lvm2 [2.67 TiB / 821.62 GiB free]
  Total: 1 [2.67 TiB] / in use: 1 [2.67 TiB] / in no VG: 0 [0   ]

If 'autofs' allowed us to execute a script after it dismounts the drive due to inactivity, it might be a good option for us. But I'm not sure that is possible.

Another option would be to dismount the volume automatically at the end of the daily backup script which copies files off to the attached device.

A third option would be to check every hour and see whether the /dev/mapper/NAME is no longer in use, then tell cryptsetup to dismount it. The command to check that might be "dmsetup ls --tree -o uuid,open | grep -i 'CRYPT-LUKS'".

Still exploring options at this point. I need to do some more testing first. I'm also searching for a way to auto-open LUKS volumes upon device insertion.
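
A rough, untested sketch of that third option, using the device name from the errors above (checking the open count via dmsetup is an assumption on my part):

#!/bin/bash
NAME=USBOFFSITE12B

# the "open" count drops to 0 once nothing holds the mapped device any more
OPEN=$(dmsetup info -c --noheadings -o open "$NAME" 2>/dev/null | tr -d ' ')

if [ "${OPEN:-1}" -eq 0 ]; then
    cryptsetup luksClose "$NAME"
fi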

Monday, August 12, 2013

Auto-mounted external USB drives with Linux autofs daemon

This explains how to use 'autofs' to automatically mount external USB hard drives at a predictable path under CentOS 6 (but it probably also works for CentOS 5, RHEL 5, RHEL 6, etc.). On one of my Ubuntu desktops, file systems on USB drives get automatically mounted, but that requires a GUI environment to be installed. On our servers at work, we generally do not install a GUI environment. We also had some special requirements in order to use the USB drives as backup targets:
  • The USB drive needs to mount at a standard location. Such as /mnt/offsite/LABEL or something. This way the backup scripts are not overly complex.
  • Mounting needs to be automatic, without any user intervention.
  • Ideally, the file system should be dismounted when not in use. That way the user who swaps out the backup drives only needs to check the drive activity light before knowing that it is safe to swap out the drive.

So the standard Ubuntu method of doing things via Gnome comes close, but I explored other options as well. The one I settled on is called 'autofs'. It is a script/daemon that is found in the standard CentOS 6 repositories, so you just need to run "yum install autofs".

Configuration of the autofs daemon consists of two parts:

A. You need to find and edit the master configuration file for autofs, also called the 'master map file'. Under CentOS 6, this is located at /etc/auto.master, or you can look at '/etc/sysconfig/autofs' to find out the configuration file location.

If you want to mount your USB backup drives at /mnt/offsite/XYZ, then your auto.master file only needs to contain the following:

# USB backup drives
/mnt/offsite            /etc/auto.offsite       --timeout=1800

As you can see, you tell autofs the location, what configuration file to look at for devices that need to be mounted, and optional arguments such as custom timeout settings. Note that you need to create the location directory by hand (e.g. "# mkdir /mnt/offsite") before starting autofs.

B. The second part of configuration is telling autofs which devices should be automatically mounted. These entries go in the map file named above (/etc/auto.offsite). It is best if you use either the UUID of the partition (/dev/disk/by-uuid/XYZ) or the device ID (/dev/disk/by-id/XYZ-part0).

OFFSITE1 -fstype=ext4,rw,noatime,data=journal,commit=1 
    :/dev/disk/by-uuid/b5c1db0d-776f-499b-b4f2-ac53ec3bf0ef

Please note that the above should be all on one line; I have broken it up here for clarity.
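
To find the UUID to put in the map entry, run blkid against the USB drive's partition (the device name below is just a placeholder), or list /dev/disk/by-uuid:

# blkid /dev/sdX1
# ls -l /dev/disk/by-uuid/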

The only mount options that you really need are "fstype=auto,rw,noatime". I have added "data=journal,commit=1" to make the ext4 file system on the drive a bit more resilient.

One limitation of autofs is that if you have multiple USB drives to be mounted, each one needs its own unique mount point (/mnt/offsite/OFFSITE1, /mnt/offsite/OFFSITE2, etc.). However, you could decide to mount all of the drives at the same location if you give them all the same UUID. But I'm not sure how well autofs would deal with two drives, having the same UUID, being hooked up to the server at the same time.



After editing your mapping file, you (probably) need to restart autofs. Assuming that you have done everything correctly, attempting to list the contents 'ls -l /mnt/offsite/OFFSITE1' will cause the drive to be automatically mounted. After the timeout period expires, the drive will automatically dismount.

Wednesday, May 29, 2013

Dovecot fails to compile against PostgreSQL 9.2

If you are trying to compile dovecot against PostgreSQL, then you are probably running configure with the "--with-pgsql" option.  Except that if you installed PostgreSQL via the PGDG repository on CentOS 6, you are probably stuck with the following error:

checking for shadow.h... yes
checking for pam_start in -lpam... no
checking for auth_userokay... no
checking for pg_config... NO
checking for PQconnectdb in -lpq... no
configure: error: Can't build with PostgreSQL support: libpq not found


The root cause is that pg_config is not in the PATH.  So you should add "/usr/pgsql-9.2/bin" to your PATH before calling ./configure.  If you are not sure where pg_config is located, try "find / -name pg_config".

#!/bin/bash
PATH=/usr/pgsql-9.2/bin:$PATH
export PATH
./configure \
        --with-pgsql


Once you do the above, the dovecot 2.2 configure script will run to completion.

Saturday, May 18, 2013

FSVS automated snapshots

One common use for FSVS is to make automated snapshots of portions of your Linux file system, such as monitoring changes to log files, executables, or data directories.  The downside of this is that, even if nothing has changed, FSVS will still generate an SVN commit.  So if you are running FSVS on an hourly basis through the day, your SVN log will be cluttered with hundreds of commits that contain no useful information.

The following is an example of how we keep track of changes to a /cfmc directory on the server.  This runs hourly and is our primary backup against data loss on this server.  Because FSVS and SVN only send the differences across the wire, it's a very efficient method.  And since this is protecting client data, we're going to version control it and keep it for years.

The magic trick is the FCOUNT= line which runs FSVS and looks to see whether there were any changes to files in the monitored directory tree.  If it found changes, then we go ahead and do an automated commit.

#!/bin/sh
# Only executes FSVS if FSVS reports outstanding changes

FSVS_CONF=~/.fsvs-conf
FSVS_WAA=~/.fsvs-waa
export FSVS_CONF FSVS_WAA

cd /cfmc

FCOUNT=`/usr/local/bin/fsvs | grep -v 'dir.*\.$' | wc -l`

if [ $FCOUNT -gt 0 ] ; then
    /usr/local/bin/fsvs ci -m "Automatic FSVS snapshot"
else
    echo "Nothing changed"
fi
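
An /etc/cron.d style entry for the hourly runs might look like this (the path to the wrapper script is made up):

# take the conditional FSVS snapshot a few minutes past every hour
5 * * * * root /usr/local/sbin/fsvs-snapshot.sh >> /var/log/fsvs-snapshot.log 2>&1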

Wednesday, February 13, 2013

Backing up SVN (SubVersion) repository directories

When backing up subversion (SVN) repositories, I find it best to use a bash shell script to search for the SVN repositories.  These can then be passed to the svnadmin hotcopy command or the svnadmin dump command to dump out each repository by itself.

First off, you should define a few variables at the top of your bash shell script.  The key one is ${BASE} which lets you define the location of your SVN repositories. 

# BASE should be location of SVN repositories (no trailing slash)
# such as: BASE=`pwd` or BASE="/var/svn"
BASE="/var/svn"


FIND=/usr/bin/find
GREP=/bin/grep
RM=/bin/rm
SED=/bin/sed


Next is the bit of find/grep/sed magic that turns the output of find into a list of repository directories (relative to ${BASE}).  In this particular case, we are searching for an item named 'current' at a maximum depth of 3 directories, then making sure it is 'db/current' in the full pathname.  Last, we sort the list of paths so that we process things in alphabetical order.

DIRS=`$FIND ${BASE} -maxdepth 3 -name current | \
    $GREP 'db/current$' | $SED 's:/db/current$::' | $SED "s:^${BASE}/::" | \
    sort`

As an alternative to processing in alphabetical order, you can use the following perl fragment to randomize the order of the directories.  The advantage of this is that if your backup script breaks in the middle of a run for some reason, you have a far better chance that the repositories at the bottom of the list won't be too far out of date (they might be a few days old, but probably not a few months old).  This is an especially good idea if you are sending the backups out over a WAN link using rsync.

We also, in order to speed up our backups, only search for repositories modified on-disk in the last 15 days.

DIRS=`$FIND ${BASE} -maxdepth 3 -name current -mtime -15 | \
    $GREP 'db/current$' | $SED 's:/db/current$::' | $SED "s:^${BASE}/::" | \
    perl -MList::Util -e 'print List::Util::shuffle <>'`


The loop portion is simply (this particular example shows how to use "svnadmin verify"):

for DIR in ${DIRS}
do

    echo "verifying ${DIR}"
    svnadmin verify --quiet ${BASE}/${DIR}
    status=$?
    if [ $status -ne 0 ]; then
        echo "svnadmin verify FAILED with status: $status"
    else
        echo "svnadmin verify succeeded"
    fi

    echo ""
done
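
The same loop works for the backups themselves; an untested sketch of a dump-based body (the destination directory is made up):

    echo "dumping ${DIR}"
    mkdir -p "$(dirname "/var/backups/svn/${DIR}")"
    svnadmin dump --quiet "${BASE}/${DIR}" | gzip > "/var/backups/svn/${DIR}.svndump.gz"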

Hope these tricks help.

Tuesday, January 15, 2013

mdadm: Using bitmaps to speed up rebuilds

As SATA drives have gotten larger, the chance of a minor error creeping in during a RAID rebuild has greatly increased.  For the new 2-4 terabyte models, assuming a rebuild rate of 100 MB/s, a mirror rebuild can take 5-10 hours.  The situation is even worse for RAID 5 and RAID 6 arrays where you have to update multiple discs and rebuild times tend to scale out with the total size of the array.

One low-level solution I have been using is to partition the drive into smaller segments (usually 1/4 to 1/2 of the drive capacity), use mdadm Software RAID across each segment, then put all the segments into a single LVM Volume Group (VG).  The advantage is that it's simple and often only a single segment of the drive has to be re-sync'd from scratch if there is a power outage or other glitch during the rebuild.

The other (probably better) solution is to use the mdadm --bitmap option (kernel.org link).  This allows the array to keep track of which blocks are dirty (not yet sync'd to all disks) or clean.  It speeds up resync operations greatly if there is a power failure or glitch during the write operation.  The main disadvantage is that you are looking at three write operations whenever you change a bit of data.  First, mdadm has to mark the bit relating to that section of the disk as dirty.  Second, it writes out the data.  Third, it has to go back and mark the bit as clean.  This can severely impact performance.

By default, when using internal bitmaps, mdadm splits the disk into as many chunks as possible given the small size of the bitmap area.  For smaller partitions, the chunk size can be as small as 4MiB, but you can also specify larger values with the "--bitmap-chunk=NNNN" argument.  For larger drives, you will want to consider chunk sizes of at least 16-128MiB.

Warnings:

- I've run into a situation where my version of mdadm (v2.6.9 - 10th March 2009, Linux version 2.6.18-194.32.1.el5) would cause the machine to lock up hard when removing a bitmap.  Another machine has a newer CentOS5 kernel (2.6.18-308.16.1.el5xen) and experienced no issues.  So make sure you are running a fairly recent kernel.

Instructions:

In order to add bitmaps to an existing Software RAID array, the array must be "clean".  The command is simply:

# mdadm --grow --bitmap=internal --bitmap-chunk=32768 /dev/mdX

If you want to resize the bitmap chunk, you must first remove the existing bitmap:

# mdadm --grow --bitmap=none /dev/mdX
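
Either of these will confirm whether the bitmap is present or gone (the device name is a placeholder):

# cat /proc/mdstat
# mdadm --detail /dev/mdX | grep -i bitmap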

Performance:

I did some testing on a system I had access to which had a 7 drive RAID-10 array (6 active spindles, 1 spare) using 7200 RPM 500GB SATA drives.  Values are in KB/sec using bonnie++ as the test program (15GB test size).

#1 No bitmap:
Seq Write: 139035
Seq ReWrite:  43732
Seq Read: 76221

#2 bitmap size of 4096KiB
Seq Write: 109720 (21% lower)
Seq ReWrite: 40179
Seq Read: 72917

#3 bitmap size of 16384KiB
Seq Write: 127924 (8% lower)
Seq ReWrite: 40734
Seq Read: 73870

#4 bitmap size of 65536KiB
Seq Write: 124694 (10% lower)
Seq ReWrite: 40674
Seq Read: 74501

As can be seen, the larger chunk sizes do not impact sequential write performance as much.

Tuesday, September 18, 2012

MultiPar (the spiritual successor to QuickPar)

We archive a lot of data onto CD/DVD, which has never been a reliable medium, even if you use the high-quality Taiyo Yuden media.  Because a CD/DVD can become unreadable over the course of years or decades, you have to take one of two approaches:

1) Burn a second (or third) copy of every CD/DVD that you create.  The primary downside is that you double the number of disks that you have to keep track of.  If you store the disks in two separate geographical locations, this is not necessarily a bad thing.  But back when media was far more expensive, this also drove up your costs a lot.  You still need to create some sort of checksum / verification data at the file level so that you can validate your archives down the road (such as MD5 or SHA1 hashes).

2) Add some sort of parity / error recovery data to the disk contents.  While CD/DVD media both include Reed-Solomon error correction at the sector level, you can't always get information about how clean the disc is and whether or not it is failing.  In many cases, the first sign of trouble occurs after the point where the built-in error correction is no longer able to do its job.  So you use a program like WinRAR, QuickPAR, par1 or par2 command line programs, or something else to create additional error correction data and add it to the data being written to the media.

An important concept when dealing with long term archival is "recovery window".  In most cases, when media starts to fail, it is a progressive condition where only a few sectors will have issues at the start.  As time goes on, more and more sectors will fail verification and less and less data will be recoverable.  The exception to this is if the TOC (table of contents) track goes bad, which will then require the use of special hardware in order to read any data off of the media.

In the case of the above approaches:

1) Multiple copies -- The recovery window is from the point that you find out that one of the copies has failed until you make a new copy from one of the remaining copies that is still valid. Depending on where the physical media is located, this might be a quick process, or it might require a few days to transport media between locations.  The problem comes when multiple copies are experiencing data loss, because then you have to hope that the same files/sectors are not corrupt on every copy.

Note that the multiple copies approach is only recoverable at the "file" level in most archive situations.  Most verification codes are calculated at the file level, which means a file is either completely good or completely bad.  Unless the file contains internal consistency checks, you cannot combine two damaged files to create a new undamaged copy.

2) Error correction data -- Again, the recovery window starts at the point in time where you first discover an issue.  But because the error correction data lives on the disk next to the raw data, you are able to immediately determine whether the media has failed to the point where data is actually lost.  Some of the tools (QuickPar in particular) used to create verification data can even recover disks where the file system has been corrupted by digging through the block level data and pulling out good blocks.

Note that the two approaches are not exclusive to each other.  For the truly paranoid, creating two copies of the media along with dedicating 5-20% of the media's capacity to error correction will give you lots of options when dealing with "bit rot".

So, back to the original point of the posting...

We used to use QuickPar to create our recovery data.  It was written for Windows XP and had a nice GUI which made it quick to bundle up a bunch of files and create recovery data for those files.  Speed was fairly good, but it never did multi-threading nor did it ever support subdirectories.  It has also not been updated since the 2003-2004 timeframe, so is a bit of a "dead" project.

The successor to QuickPar, for those of us wanting a Windows program with a GUI, seems to be MultiPar.  I stumbled across this from Stuart's Blog posting about MultiPar.  Even though the download page is written in Japanese, MultiPar does have an English GUI option.  Just look for the green button in the middle of the page which says "Download now" and look at the filename (such as "MultiPar121_setup.exe").

Thursday, June 14, 2012

Windows 2003: Loses connection to a network share on a Windows 7 machine

This has been perplexing us at work for a bit this week.  We have a Windows 2003 server which attempts to send its backup logs to a desktop PC.  It used to send it to a Windows XP machine, but that has now been replaced with a Windows 7 Professional workstation.

The problem is that everything works fine for 10-20 minutes at a time.  Then the Windows 2003 server will lose connection to the Windows 7 desktop and you will be unable to map to the share points on the Win7 machine until you reboot the Windows 7 desktop.

On the Windows 2003 server you will see:

C:> net view \\hostname.example.com
System error 58 has occurred.

The specified server cannot perform the requested operation.

This will also show up in the error log on the Windows 7 machine as:

Error 2017: The server was unable to allocate from the system nonpaged pool because the server reached the configured limit for nonpaged pool allocations.

The fix for this is two-fold and is performed entirely on the Win7 machine; it requires editing a pair of registry entries.

1) HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache

This gets changed from "0" (zero) to "1" (one).

2) HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\Size

This gets changed from "1" to "3".

Reference links:

Windows 7 Nonpaged Pool Srv Error 2017
Systems Internals Tips and Trivia

Saturday, May 12, 2012

SpringSource Tool Suite, Spring Roo 1.2.1 shell will not start

The dreaded "Please stand by until the Roo Shell is completely loaded." message when setting up a new machine with SpringSource Tool Suite 2.9.1 can be very frustrating (Spring Roo 1.2.1).  This is on a fairly new install of Windows 7 64bit and the symptom will be that the Roo shell never manages to get past the "Please stand by" message.

Another symptom is that if you attempt to create a new Spring Roo project using the wizard, the only thing that will show up in the Package Explorer is the "JRE System Library" tree.  The wizard will not be able to create the standard src/main/java, src/test/java, or the pom.xml files.

The problem boils down to permission issues in Windows 7 when running as a restricted user and Spring Roo's desire to create files/folders under "C:\Program Files\springsource".

The simple workaround is to go to "C:\Program Files\springsource", go into properties, then the Security tab.  Click the UAC "Edit" button and give the "Users" group permission to Modify (and Write) files under the springsource folder.

Once you restart STS 2.9.1, things should now work if you attempt to open up the Spring Roo shell.

A second issue that you may run into is that by default, Spring Roo expects to create a Java 1.6 project, not a Java 1.7 project.  If you have the 1.7 JDK installed, you may need to also install the older 1.6 JDK (or figure out how to tell Spring Roo to create 1.7 projects by default).