svn tricks and rails on sundays
I've got a few projects that I work on when I get the time. Since I usually work on all of them at the same time, it seems none of them moves forward very fast. I got curious to see how much work I am actually doing over time, and came up with a few little SVN hacks.
First, get the svn logs and pipe them into a file:
% cd <head_of_the_svn_tree>
% svn log -q | egrep '^r' > activity.csv
Right, that gives us a file with all of the project checkins. The 'egrep' part strips out all of the annoying dashes that come with the svn log. The data looks kind of like this:
r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)
r1 | danielw | 2006-12-20 00:33:41 +0200 (Wed, 20 Dec 2006)
Now, with some command-line tricks I can break down the activity a little more:
% svn log -q | egrep '^r' | cut -d '|' -f 2 | sort | uniq -c | sort -n
This breaks down the log and counts the number of checkins per person. You can point svn log at a repository URL as well. Results on one of my SVN trees look something like this:
6 carl
123 danielw
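Since the same pipeline works just as well on a saved log, it can be wrapped in a small shell function. This is only a sketch: the name count_by_author is made up here, and it assumes the log lines look like the sample above.

```shell
# Hypothetical helper: count checkins per author from `svn log -q` output on stdin.
count_by_author() {
    grep -E '^r[0-9]+ ' |         # keep only the revision lines, drop the dashes
        cut -d '|' -f 2 |         # the second |-separated field is the author
        sort | uniq -c | sort -n  # count per author, smallest count first
}

# Usage:
#   svn log -q | count_by_author
#   count_by_author < activity.csv
```

Matching on '^r[0-9]+ ' rather than just '^r' is a little stricter, so a stray line that merely starts with an "r" won't be counted as a checkin.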
What I am really interested in is how this activity progresses over time. I don't know how to do this on the command line, but SQL could do this in no time. We need to create a database and a table to hold the data. In postgres, like this:
% createdb work_activity
% psql -d work_activity
work_activity => create table svn_activity (revision varchar, who varchar, date timestamp);
Now we need to populate this with data. The end of each SVN log line carries a timezone offset and a human-readable copy of the date, so we'll get AWK to keep just the revision, the author and the date. Since the default column delimiter for postgres COPY is the tab (\t), we'll delimit our records with tabs. And let's use the rails project to get more interesting stats.
% svn log -q http://svn.rubyonrails.org/rails/trunk > activity_rails.txt
% egrep '^r' activity_rails.txt | awk '{print $1"\t"$3"\t"$5}' > activity_rails.data
This puts all of the data into a file, which we can now load into the DB in a single easy command:
% psql -d work_activity -c 'COPY svn_activity FROM STDIN' < activity_rails.data
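To see what the AWK step actually produces, here is a small sketch that runs it on one of the sample lines from earlier. The function name log_to_tsv is made up for illustration. AWK splits on whitespace, so $2 and $4 are the "|" separators, and $5 is only the date; the time of day is $6 and gets dropped, which is fine for per-month and per-weekday counts.

```shell
# Hypothetical helper: turn `svn log -q` lines into revision<TAB>author<TAB>date.
# Assumes author names contain no spaces, otherwise the field numbers shift.
log_to_tsv() {
    grep -E '^r[0-9]+ ' |
        awk '{ print $1 "\t" $3 "\t" $5 }'   # $2 and $4 are the "|" separators
}

# Try it on a sample line; prints r2, danielw and 2006-12-20 separated by tabs.
printf '%s\n' 'r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)' | log_to_tsv
```

Postgres accepts a bare date like 2006-12-20 in a timestamp column and stores it as midnight of that day.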
Now it's all in the database, and we can do loads of fancy queries on it:
% psql -d work_activity -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1;"
date_trunc | count
---------------------+-------
2004-11-01 00:00:00 | 30
2004-12-01 00:00:00 | 259
2005-01-01 00:00:00 | 218
2005-02-01 00:00:00 | 219
2005-03-01 00:00:00 | 227
2005-04-01 00:00:00 | 199
2005-05-01 00:00:00 | 99
2005-06-01 00:00:00 | 172
2005-07-01 00:00:00 | 304
2005-08-01 00:00:00 | 63
2005-09-01 00:00:00 | 263
2005-10-01 00:00:00 | 306
2005-11-01 00:00:00 | 265
2005-12-01 00:00:00 | 93
2006-01-01 00:00:00 | 79
2006-02-01 00:00:00 | 163
2006-03-01 00:00:00 | 347
2006-04-01 00:00:00 | 162
2006-05-01 00:00:00 | 60
2006-06-01 00:00:00 | 116
2006-07-01 00:00:00 | 96
2006-08-01 00:00:00 | 162
2006-09-01 00:00:00 | 216
2006-10-01 00:00:00 | 130
2006-11-01 00:00:00 | 139
2006-12-01 00:00:00 | 97
2007-01-01 00:00:00 | 155
2007-02-01 00:00:00 | 92
2007-03-01 00:00:00 | 101
2007-04-01 00:00:00 | 65
2007-05-01 00:00:00 | 192
2007-06-01 00:00:00 | 115
2007-07-01 00:00:00 | 39
2007-08-01 00:00:00 | 43
2007-09-01 00:00:00 | 278
2007-10-01 00:00:00 | 236
2007-11-01 00:00:00 | 105
Looks like a very healthy project. Ok, let's find out on what day of the week rails developers have been most prolific:
% psql -d work_activity -c "select extract(dow from date) as day, count(*) from svn_activity group by 1 order by 1;"
day | count
-----+-------
0 | 1040
1 | 969
2 | 874
3 | 755
4 | 790
5 | 688
6 | 789
(7 rows)
Day 0 is Sunday (postgres numbers the days of the week 0 to 6, starting from Sunday)! Thanks for the hard work, guys.
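As a footnote, both breakdowns can probably be approximated straight from the log without a database, since each line already carries the date and the weekday. A rough sketch, assuming the log format shown above; count_by_month and count_by_weekday are made-up names, author names are assumed to contain no spaces, and the weekday abbreviations depend on the locale svn was run in.

```shell
# Hypothetical helpers; both read `svn log -q` output on stdin.

# Checkins per month: trim the date field "2006-12-20" down to "2006-12".
count_by_month() {
    grep -E '^r[0-9]+ ' |
        awk '{ print substr($5, 1, 7) }' |
        sort | uniq -c
}

# Checkins per weekday: field 8 looks like "(Wed," so take the 3 letters after "(".
count_by_weekday() {
    grep -E '^r[0-9]+ ' |
        awk '{ print substr($8, 2, 3) }' |
        sort | uniq -c | sort -rn   # busiest day first
}

# Usage:
#   svn log -q | count_by_month
#   svn log -q | count_by_weekday
```

The SQL route is still more flexible once you want to slice by author and date at the same time, but for a quick look this saves a trip through the database.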
