Here are some statistics on commits I’ve authored, collected with git-cstat over a few work and open-source repositories.
Note the logarithmic x-axis (words).
With my Dotfiles included:
Without my Dotfiles included:
The shortest messages are things like "init" and "chore: upgrade dependencies".
Occasionally my shortest contribution to a project is long, though! I
think this happens most when I contribute to open source or innersource
projects where I need to spell out my reasoning more clearly regardless of the
commit.

My average average drops from a little over 40 to a little over 20 when
including my Dotfiles. That makes sense: my Dotfiles have a long life, going
back to before I used Git the way I do today. Even now there are plenty of
~10 word commit messages and only a few long ones: for example, using my
fields and bucket10 [1] scripts:
λ g cstat len | fields 1 | bucket10 | sort -n
1 21 ██
2 191 █████████████████████
3 648 ███████████████████████████████████████████████████████████████████████
4 380 ██████████████████████████████████████████
5 346 ██████████████████████████████████████
6 263 █████████████████████████████
7 172 ███████████████████
8 111 ████████████
9 68 ███████
10 178 ████████████████████
20 209 ███████████████████████
30 101 ███████████
40 73 ████████
50 16 ██
60 18 ██
70 14 ██
80 15 ██
90 7 █
100 13 █
200 4
700 1
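The bucketing in the histogram above (single digits counted one by one, then tens, then hundreds) can be approximated by truncating each word count to its leading digit. What follows is only a sketch under that assumption, not the actual bucket10 script: the function name bucket_leading_digit is made up for illustration, and it prints plain counts without drawing the bars.

```shell
# Hypothetical stand-in for a bucket10-style filter (not the author's script).
# Reads one positive integer per line and prints "<bucket> <count>" pairs,
# where the bucket is the number truncated to its leading digit
# (17 -> 10, 42 -> 40, 705 -> 700).
bucket_leading_digit() {
  awk '
    $1 >= 1 {
      n = int($1)
      scale = 1
      while (n >= scale * 10) scale *= 10   # largest power of 10 that is <= n
      count[int(n / scale) * scale]++       # zero out everything after the leading digit
    }
    END { for (b in count) print b, count[b] }   # unordered; pipe through sort -n
  '
}
```

As in the pipeline above, the output needs a final sort, for example `printf '3\n3\n17\n42\n705\n' | bucket_leading_digit | sort -n`.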
A comment marks the Dotfiles data
# work
# Short commits: "init", "chore: upgrade dependencies", etc.
# Extremely long commits: squashed several reasonable lengths
min: 1
max: 55
avg: 11.8571
min: 3
max: 662
avg: 50.4
min: 4
max: 179
avg: 41.8889
min: 208
max: 356
avg: 282
min: 1
max: 202
avg: 43.5833
min: 23
max: 3995
avg: 1141.25
min: 2
max: 8046
avg: 193.755
min: 2
max: 1145
avg: 62.1368
min: 1
max: 27
avg: 6.42222
min: 2
max: 251
avg: 55.1667
min: 7
max: 68
avg: 20.2222
min: 93
max: 598
avg: 299
min: 4
max: 9
avg: 7
min: 2
max: 279
avg: 104.364
min: 4
max: 98
avg: 33.3636
min: 3
max: 120
avg: 21.1
min: 1
max: 105
avg: 13.8854
min: 1
max: 215
avg: 10.2741
min: 1
max: 160
avg: 15.6222
min: 4
max: 464
avg: 115.13
# OSS
min: 1
max: 730
avg: 17,1961
min: 41
max: 343
avg: 103,4
min: 2
max: 102
avg: 23,9048
# Dotfiles
min: 1
max: 705
avg: 10,3589
This is also in my bin directory.
It would arguably have been more efficient to read the data into a tidy dataframe from the outset, but I was more confident in this version of the parsing code, and it works.
The empty racket/gui require is to make it easy to replace (save-pict out) with
show-pict when you just want to see the results. I experimented with an
interface to either show or save the picture, and I didn’t come up with anything
I particularly liked.
PS Apparently the syntax highlighter here (rouge) doesn’t know how to parse Racket’s reader syntax for (Perl-ish) regular expressions.
#! /usr/bin/env racket
#lang racket
; vim: ft=racket

; The empty racket/gui import makes it easy to swap save-pict for show-pict.
(require (only-in racket/gui)
         pict
         threading
         data-frame
         sawzall
         graphite)

; Usage: <script> data-file out
(define-values (data-file out)
  (command-line
   #:args (data-file out)
   (values data-file out)))

; Collect the min, max, and avg values into three lists, one entry per
; matching line; every other line yields #f, which is filtered out at the end.
(define-values (min max avg)
  (for/lists (min? max? avg?
              #:result (values (filter values min?)
                               (filter values max?)
                               (filter values avg?)))
             ([line (in-lines (open-input-file data-file))])
    (match line
      [(pregexp #px"^min: ([[:digit:].]+)" (list _ (app string->number min))) (values min #f #f)]
      [(pregexp #px"^max: ([[:digit:].]+)" (list _ (app string->number max))) (values #f max #f)]
      [(pregexp #px"^avg: ([[:digit:].]+)" (list _ (app string->number avg))) (values #f #f avg)]
      [_ (values #f #f #f)])))

; One column per statistic, one row per repository.
(define df
  (make-data-frame #:series (list (make-series "min" #:data (list->vector min))
                                  (make-series "max" #:data (list->vector max))
                                  (make-series "avg" #:data (list->vector avg)))))

; Reshape to long form (stat, words), then draw horizontal box plots on a
; logarithmic x-axis and save the result.
(~> df
    (pivot-longer everything #:names-to "stat" #:values-to "words")
    (graph #:data _
           #:mapping (aes #:y "stat" #:x "words")
           #:y-label "Stat" #:x-label "Number of words"
           #:x-transform logarithmic-transform
           #:title "Commit message length (words)"
           (boxplot #:invert? #t #:show-outliers? #t)
           (points))
    (save-pict out))
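For a quick look at what the parsing step extracts, the same line matching can be sketched in shell. This is a rough equivalent under stated assumptions, not part of the script above: the name parse_stats is hypothetical, it assumes each repository's lines arrive in min, max, avg order, and it rewrites a decimal comma (as in the OSS and Dotfiles averages) to a decimal point. The Racket patterns as written stop at the comma, so they would capture only the integer part of those averages.

```shell
# Hypothetical helper: print one tab-separated "min max avg" row per
# repository from a data file shaped like the one shown earlier.
# Comment lines such as "# work" match none of the patterns and are skipped.
parse_stats() {
  awk '
    /^min: [0-9.]+/ { min = $2 }
    /^max: [0-9.]+/ { max = $2 }
    /^avg: [0-9.]+/ {
      sub(/,/, ".", $2)               # assumption: a comma here is a decimal comma
      print min "\t" max "\t" $2      # avg closes out the repository
    }
  ' "$@"
}
```

Run as `parse_stats data-file`, or pipe the data in on standard input.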
[1] Based on that old post about Perl.