2015-05-02

Bandwidth throttling in FreeNAS for external traffic

I have a FreeNAS box running in my home. It's the backup target for my laptop, it has some files I share with Christy, it stores all my photos, and so on. All in all it's great.

I recently set up the s3cmd plugin to back up stuff to S3. Despite having redundant disks it's always good to have a backup. It takes days to upload all my photos to S3 but that's fine - it doesn't impact regular use of the NAS.

Unfortunately, the backup process saturated my home DSL upstream link. Nothing else in the house could do anything reasonable on the internet while the backup was running - just surfing normal websites would time out.

I wanted to limit the bandwidth the backups could consume to leave enough for the rest of the house. If this were a professional situation, I would suggest setting up QoS on the router so that it could use 100% of the link when nobody else was but when other traffic came through it would take second place. But this is my home, and I don't have a fancy router that can do QoS. Throttling on the NAS just makes sense.

At the same time, I don't want to just throttle all traffic; I still want it to run full speed when I mount the NAS from my laptop on my local network.

I don't have a lot of internet: 6Mb down, 1Mb up. I decided that I would let backups take 80% of the upstream bandwidth, since most things like browsing the internet and streaming movies send very little data - most of it is received.

The FreeNAS folks argue that it's not a NAS's job to limit its bandwidth usage - it should focus on serving data as quickly as possible (see https://bugs.freenas.org/issues/5666). While it's always their prerogative to decide what features go into the product, to me that seems to ignore the large number of people who use it at home without things like fancy QoS-enabled routers.

Well, that's fine, because FreeNAS runs on FreeBSD, which means that it has ipfw and dummynet, top-of-the-line tools for doing bandwidth shaping. There are many guides out there on the net that can help you do fancy things with them.

Steps I took to limit the upstream bandwidth my FreeNAS uses when interacting with the public net, while still allowing it to serve data full speed on my LAN:

* Enable dummynet, the FreeBSD traffic shaping module: edit /boot/loader.conf and add 'dummynet_load="YES"' at the end of the file, then run kldload dummynet to load the module
* Set up a pipe that restricts traffic to 800Kbit/s (80% of my 1Mb link):
  ipfw pipe 1 config bw 800Kbit/s
* Send outgoing traffic destined for anywhere other than the local network through the pipe:
  ipfw add pipe 1 ip from 10.0.0.0/8 to not 10.0.0.0/8 out

That does it.
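To sanity-check the result, ipfw can show you the rules and pipes it knows about, and can tear them down if you want to start over. A sketch (the rule number 00100 below is a guess - use whatever number ipfw list actually prints):

```shell
# list firewall rules; the new rule shows up with the number ipfw assigned it
ipfw list

# show the pipe's configured bandwidth and its current queue
ipfw pipe show

# to undo everything, delete the rule by its number and the pipe by its pipe number
ipfw delete 00100
ipfw pipe 1 delete
```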

A couple of notes:

* Most home networks are in the 192.168.0.0/24 range rather than the 10.0.0.0/8 range. Change that rule to match your local network range.
* Watching bandwidth usage in realtime helps you see what works and what doesn't. The systat view of bandwidth utilization was super helpful: systat -ifstat 1
* It helps to have a load running as you're doing this - start a large upload, then muck with these commands while watching systat.
* I haven't tried a reboot. I'm going to bet that this setting will not persist and I'll have to do something else to make it stick.
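On that last note: my guess (untested - an assumption on my part) is that putting the commands in a script and registering it as a post-init script in the FreeNAS GUI (System > Init/Shutdown Scripts in the 9.x interface) would make it stick. Something like this - the path is just an example:

```shell
#!/bin/sh
# /root/throttle.sh - reapply the throttle at boot (hypothetical path and name)

# load dummynet if it isn't already loaded
kldstat -q -m dummynet || kldload dummynet

# recreate the 800Kbit/s pipe and send traffic leaving the local network through it
ipfw pipe 1 config bw 800Kbit/s
ipfw add pipe 1 ip from 10.0.0.0/8 to not 10.0.0.0/8 out
```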

2015-03-10

Setting Nagios Downtime from a script

I don't know why I've never done this until now.

Create a script "nagios_downtime" or what have you:

#!/bin/bash

# check for usage
if [ $# -ne 4 ]
then
  echo "Usage: $0 <host> <service> <duration_in_sec> <message>"
  echo "Example: nagios_downtime web3 check_http 3600 'downtime reason'"
  echo "must run as nagios user"
  exit 1
fi

# snarf arguments
host=$1
svc=$2
dur=$3
message=$4

# calculate timestamps for now + duration
start=$(date +%s)
end=$((start + dur))

# initiate downtime
echo "[${start}] SCHEDULE_SVC_DOWNTIME;${host};${svc};${start};${end};1;0;0;nagiosadmin;${message}" > /var/lib/nagios3/rw/nagios.cmd

# print saying you did it
echo "$(date): scheduling downtime for ${svc} on ${host} for ${dur} seconds"

Then, when you're taking an action (say, a backup) and want to set downtime programmatically from within your backup script, just shell out to:

sudo -u nagios nagios_downtime db5 check_mysql_replication 600 'downtime for backup'

For more information, look at the Nagios External Command documentation. Specifically, this script uses the SCHEDULE_SVC_DOWNTIME command to schedule service downtime.
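The same documentation covers SCHEDULE_HOST_DOWNTIME, which downtimes a whole host and takes one fewer field (no service description). A minimal sketch of a helper that builds that command line - the host name, duration, and comment here are made-up examples, and nagiosadmin is the same assumed author as above:

```shell
#!/bin/bash

# Build a SCHEDULE_HOST_DOWNTIME external command line.
# Fields after the host: start;end;fixed;trigger_id;duration;author;comment
build_host_downtime() {
  local host=$1 dur=$2 msg=$3
  local start end
  start=$(date +%s)
  end=$((start + dur))
  echo "[${start}] SCHEDULE_HOST_DOWNTIME;${host};${start};${end};1;0;0;nagiosadmin;${msg}"
}

# prints the command; redirect it into nagios.cmd to actually schedule it
build_host_downtime web3 3600 'kernel upgrade'
```

As with the service version, write the resulting line to your Nagios command file (e.g. /var/lib/nagios3/rw/nagios.cmd) as the nagios user.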

keywords: nagios service downtime command line

P.S. This is not a resilient script. It doesn't validate any of its input. Don't run this behind xinetd and think you have an API for remotely setting downtime in Nagios.
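If you did want to harden it a little, one check worth bolting on is rejecting a non-numeric duration before doing the timestamp math. A sketch (the guard comment shows a hypothetical place to use it in the script above):

```shell
#!/bin/bash

# is_digits: succeed only when the argument is a non-empty string of digits
is_digits() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# A guard you could add near the top of nagios_downtime:
#   is_digits "$3" || { echo "duration must be a number of seconds" >&2; exit 1; }
is_digits 600 && echo "600 looks like a duration"
is_digits "ten minutes" || echo "'ten minutes' does not"
```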

Removing Accidental Chef Attributes

If you've accidentally set attributes on a bunch of nodes in a way that breaks your system, you'll want to delete them. This can happen when you accidentally include a cookbook you didn't mean to, one that sets attributes in its attributes/default.rb file.

Here's how to fix it:

for i in $(cat /tmp/hosts); do
  echo -n "$i: "
  knife exec -E "nodes.transform('name:$i') {|n| puts n.hostname ; n.normal_attrs['my_bad_attribute_name'].delete('self')}"
done

Note: refer to the attribute by replacing the nested bracket format with underscores. In other words, node['my']['bad']['attribute']['name'] becomes my_bad_attribute_name.
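Afterward it's worth spot-checking that the attribute is really gone. A sketch of one way to do that (this assumes a Chef server to talk to and reuses the same /tmp/hosts list; web1 is a hypothetical node name):

```shell
# show one node's full detail, including its attributes
knife node show web1 -l

# or just print the top-level normal attribute keys across the whole list
for i in $(cat /tmp/hosts); do
  echo -n "$i: "
  knife exec -E "nodes.find('name:$i') {|n| puts n.normal_attrs.keys.inspect}"
done
```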