2017-03-25

Trying to parse key value log lines in go

I want to build a honeytail parser that will take log lines that look like

key1=val1 key2=val2 key3=val3...

There are a few additional constraints

  • if the value is quoted, I'd like it to have the quotes stripped
  • if the value has an equals sign in it, that's cool
  • if the value is multiple words, that's fine even if it's not quoted.  (I won't go so far as to say that unquoted space separated strings should be allowed to have equals signs embedded.)


I found a library on the net to do this: github.com/kr/logfmt.

Here is the code playing with it

Unfortunately, it had a few other ideas.

  • a bare word (no equals sign attached) should be treated as a key with no value
  • equals signs embedded in strings should be treated as separators
You can see these rules coming through in test cases 2 and 4 in the output

Ah well.  It was an interesting attempt.


(one test case that's missing: does a number in quotes turn in to a number (float or int) or stay a string?)

2017-01-19

Working with ulimits and open file descriptors

Yet another in the long list of things I always forget how to do between needing to do them...

Cheatsheet to working with ulimits and too many open files.
list how many open file descriptors a process has
lsof -P [pid] | wc -l
ls /proc/[pid]/fd | wc -l (cheaper)
look at limits for a specific process
cat /proc/[pid]/limits
set limits for a user
futz with /etc/security/limits.conf or /etc/security/limits.d/*
show ulimits for the current process
ulimit -a
set number of open file handle limit for the current process (and processes spawned from this one)
ulimit -n 16384 (note - only root can raise hard limits)
Hope that helps, future self.

2016-10-10

Working with the MySQL slow query log on RDS with the AWS Go SDK

In order to let you submit RDS logs directly to Honeycomb, I've been exploring how to use the go RDS SDK to tail the mysql slow query log. It's fascinating the things I've discovered that I didn't find in the docs. (All of this is probably there somewhere, I just failed to find it.)
  • The docs say the logs rotate every hour but they don't say it's on the hour or what the rotating style is. The current log (slowquery.log) is rotated to slowquery.log.# where # is the current hour (0-23) UTC. So at 18:00 UTC, the file slowquery.log.18 is written with the logs from 17:00 to 18:00 (almost. more later)
  • When asking for a portion of the log file, you can hand it an index from which to start reading and the number of lines you would like to retrieve (both optional). You get back a struct with a LogFileData *string containing log lines and a Marker *string containing the index for the next segment of the log file.  The Marker looks like #:# where the first number is the hour and the second is the distance in to the file.
Asking for current data is rather straightforward. First ask for one line of  the current log file with no marker. You'll get back the most recent line (which you throw away) and the current marker. Wait a bit, then ask for the current file with the marker, and you'll get all lines since that first one. Repeat.

There are two places things get interesting: log rotation (aka the top of the hour) and asking for older data.

Let's say it's 17:58 UTC and you're happily tailing content, fetching additional lines every 5 seconds. You get to 18:00:02 or so and the marker switches from 18:xxxx to 19:0 and suddenly you're not getting any new data anymore. You wait and wait and ... at 18:05:00, you start getting data! but it's from 18:05:00; the data from between 18:00 and 18:05 is missing.

You take the last marker you got from your tail before data stopped flowing (with the 18: prefix) and use that marker as an index into slowquery.log.18. This gets you the data from between 18:00:02 and 18:05:00, then stops.  Or! Once some amount of time (not necessarily the full 5 minutes) you can ask the slowquery.log for data since the 18:offset marker and you'll get the data that's missing.

Back you go to tailing the slowquery.log.

Ok, so here's the next fun part - you can hand a marker to either the previous hour or the current hour to AWS when asking for the slowquery.log and it will give you data since then. If you do this after the 5 minute log rotation period has ended, you get all the intermediate data too! However, if you ask it for data where the number before the colon is not the current or previous hour, it ignores that and hands you back current data at the offset specified by the second half of the marker.

You can ask for data from older files (the numbered log files) and pass them a marker, but in that case it always ignores the hour portion of the marker. It will happily return the hour portion you gave it, but it doesn't affect the data you get. In other words, when asking for data from slowquery.log.8 passing it a marker of 12:123456 and 4:123456 are both treated as though they were 8:123456.

To poke around this for yourself, you have to have RDS configured to write the slow query log to a file. If that's the case, grab the DB instance identifier aws rds describe-db-instances --output table --region us-east-1 | grep DBInstanceIdentifier to list identifiers) and the log file aws rds describe-db-log-files --db-instance-identifier <ID> --output table --region <region> | sort -n to show log files.

Once you've got your instance identifier, log file, and region handy, download this go code[1] and give it a whirl:
go run tailrdslog.go --identifier myrdsinstanceid
Try passing it different markers and log files
go run tailrdslog.go -i prod-mysql -l slowquery/mysql-slowquery.log.10 -m 14.123456
and see what happens!

It is especially interesting to run this across an hour boundary to see what happens as the log rolls over. The script attempts to grab some of the data hiding in the previous log until everything to return to normal at 5 past the hour.

It will output a line every 5s (the default polling interval) with
  • the current time
  • the starting marker
  • the length of the content downloaded
  • the ending marker
  • the first timestamp encountered in the slow query log
  • the last timestamp encountered in the slow query log
  • (optional) sleep duration before the next poll
Oct  7 21:35:43 starting from 22:20523279, got 37424 content, next section from 22:20560703. first time is 21:35:39, last time is 21:35:43 sleeping 5 sec...
Oct  7 21:35:49 starting from 22:20560703, got 59056 content, next section from 22:20619759. first time is 21:35:43, last time is 21:35:48 sleeping 5 sec...
Oct  7 21:35:54 starting from 22:20619759, got 54649 content, next section from 22:20674408. first time is 21:35:48, last time is 21:35:53 sleeping 5 sec...
Oct  7 21:35:59 starting from 22:20674408, got 42533 content, next section from 22:20716941. first time is 21:35:54, last time is 21:35:59 sleeping 5 sec...
One nice thing to notice - the starting marker plus the content length gives you the ending marker. \o/

[1] I know, and I'm sorry. It's terrible code. It's sort of OK because it's scratch code just used to explore the API. But I should be giving the world better examples. It doesn't check pointers aren't nil before derefenencing them. The control flow is awkward. Ifs are nested too deeply. Code is repeated. I encourage you to do none of those things when you cargo cult snippets in to your production application. :)

2016-04-01

Simulating network delay on OSX

As of Mac OSX Yosemite, the tool to manipulate the networking stack changed from ipfw to pf.
There are many guides out there on the internet for both ipfw and pf, but it took me a while to sort through which ones are still applicable. Also, the most common use case is bandwidth throttling (just like I wrote about when throttling my FreeNAS uploads), but I was interested in doing some network testing.

Goal: see how my client/server app behaved with a network delay of 500ms by inserting a delay between every packet.
Tools:
  • pfctl - the utility to manipulate the PF packet filter
  • dnctl - the utility to manipulate dummynet, the part that actually inserts delay
Summary:
enable pf
pfctl -E
add a temporary extra ruleset (called "anchor") named "deeelay
(cat /etc/pf.conf && echo "dummynet-anchor \"deeelay\"" && echo "anchor \"deeelay\"") | sudo pfctl -f -
add a rule to the deeelay set to send any traffic to my app (running on port 8081) to dummynet pipe 1
echo "dummynet in quick proto tcp from any to any port 8081 pipe 1" | sudo pfctl -a deeelay -f -
Add a rule to dummynet pipe 1 to delay every packet by 500ms
sudo dnctl pipe 1 config delay 500
View your all PF rules
sudo pfctl -sa
View your PF dummynet anchors
sudo pfctl -sd
View the rules in your PF dummynet anchor
sudo pfctl -a "deeelay" -sd
View your dummynet pipe
sudo dnctl show
When you're done, reset everything to its default state
sudo pfctl -f /etc/pf.conf
View the results:

Use tshark (terminal) or wireshark (GUI) to view your traffic and verify that there is delay inserted into the conversation.
Read the dnctl and pfctl man pages for more goodies.

2016-03-31

Log all mysql queries

For the record, to log all mysql queries (as opposed to the slow query log, which only runs "slow" queries, even when you set the threshold to 0) on a running database (i.e. without a restart):
mysql> set global general_log=1;
Query OK, 0 rows affected (0.02 sec)

mysql> select @@global.general_log;
+----------------------+
| @@global.general_log |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0.00 sec)

mysql> select @@global.general_log_file;
+--------------------------------+
| @@global.general_log_file      |
+--------------------------------+
| /usr/local/var/mysql/peach.log |
+--------------------------------+
1 row in set (0.00 sec)
then tail -f /usr/local/var/mysql/peach.log to get your stream of queries.

2016-03-23

Copying Playlists from iTunes to Plex Media Server on FreeNAS

I've managed to copy one playlist from iTunes to my Plex media server running on FreeNAS. It was an ordeal.

For the next time I try, here are the juicy bits:

Prep the m3u file
  • iTunes -> File -> Library -> Export Playlist
    • select 'm3u' as the file format
  • open the playlist.m3u file and replace all ^Ms with real carriage returns
  • remove any files that have non-ascii characters (eg ü, ç, ñ, é, à, etc.)
Get the Converter
  • https://github.com/dakusan/PlexPlaylistImporter 
  • The README says it requires python 3 but I got it to run with python 2.7.8
  • I had to make some changes
    • syntax error in print("...", end="")
    • munge path at line 104 to change the path necessary because of the chrooted plex jail
    • add some debugging print statements
Run the converter
  • python imp.py ./playlists/playlist.m3u 90s "/mnt/bluepool/jails/plexmediaserver_1/var/db/plexdata/Plex\ Media\ Server/Plug-in\ Support/Databases/com.plexapp.plugins.library.db" 

2016-03-17

Sublime Text CLI doesn't work after upgrading to El Capitan

I suffered this problem on my new laptop: a staple of opening files in sublime from the command line stopped working:
$ subl -n /tmp/foo.sh

I tracked it down to a few things:
  • it worked correctly in a new shell
  • it was only broken inside tmux
Searching yielded lots of half answers that were confusing about reattach-to-user-namespace. My solution:
  • brew install reattach-to-user-namespace
  • alias subl='reattach-to-user-namespace subl'
Assuming you have your subl symlink set up and in your path,  these changes will make it work from within tmux.