Personal Commit Statistics

D. Ben Knoble on 19 Sep 2025 in Blog

Here are some statistics on commits I’ve authored, collected with git-cstat across a few work and open-source repositories.

Min, max, average commit length in words

Note the logarithmic x-axis (words).

With my Dotfiles included:

Commit stats distributions with Dotfiles

Without my Dotfiles included:

Commit stats distributions without Dotfiles

Commentary

Raw Data

A comment marks the Dotfiles data.

# work
# Short commits: "init", "chore: upgrade dependencies", etc.
# Extremely long commits: squashed several reasonable lengths
min: 1
max: 55
avg: 11.8571
min: 3
max: 662
avg: 50.4
min: 4
max: 179
avg: 41.8889
min: 208
max: 356
avg: 282
min: 1
max: 202
avg: 43.5833
min: 23
max: 3995
avg: 1141.25
min: 2
max: 8046
avg: 193.755
min: 2
max: 1145
avg: 62.1368
min: 1
max: 27
avg: 6.42222
min: 2
max: 251
avg: 55.1667
min: 7
max: 68
avg: 20.2222
min: 93
max: 598
avg: 299
min: 4
max: 9
avg: 7
min: 2
max: 279
avg: 104.364
min: 4
max: 98
avg: 33.3636
min: 3
max: 120
avg: 21.1
min: 1
max: 105
avg: 13.8854
min: 1
max: 215
avg: 10.2741
min: 1
max: 160
avg: 15.6222
min: 4
max: 464
avg: 115.13

# OSS
min: 1
max: 730
avg: 17.1961
min: 41
max: 343
avg: 103.4
min: 2
max: 102
avg: 23.9048
# Dotfiles
min: 1
max: 705
avg: 10.3589

Pict script

This is also in my bin directory.

It would arguably have been more efficient to read the data into a tidy dataframe from the outset, but I was more confident in this version of the parsing code, and it works.
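As a rough illustration of the tidy-from-the-outset alternative, here is a minimal sketch that parses the same "stat: number" lines directly into long-form rows, one (stat . words) pair per observation. The helper name parse-tidy and the inline sample are assumptions made for the example; they are not part of the script, and the two columns would still need to be loaded into a data frame (for instance as a "stat" series and a "words" series) before plotting.

```racket
#lang racket/base
(require racket/port) ; for call-with-input-string

;; Hypothetical helper: read "min: N", "max: N", and "avg: N" lines from a
;; port and return a list of (stat . words) pairs, skipping all other lines.
(define (parse-tidy in)
  (for*/list ([line (in-lines in)]
              [m (in-value (regexp-match #px"^(min|max|avg): ([[:digit:].]+)" line))]
              #:when m)
    (cons (cadr m) (string->number (caddr m)))))

;; A small sample in the same format as the raw data.
(define sample "# work\nmin: 1\nmax: 55\navg: 11.8571\n")
(define rows (call-with-input-string sample parse-tidy))

;; Long-form columns: one entry per observation in each list.
(define stats (map car rows))
(define words (map cdr rows))
```

With the data already in long form, the pivot-longer step would presumably no longer be needed.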

The empty racket/gui require makes it easy to replace (save-pict out) with show-pict when you just want to see the results. I experimented with an interface to either show or save the picture, but I didn’t come up with anything I particularly liked.
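For what it's worth, one possible shape for such an interface, offered only as a sketch: a hypothetical --show flag, with the output path made optional. The names parse-args and --show and the program name "plot-cstat" are assumptions for illustration; none of this is something the script implements.

```racket
#lang racket/base
(require racket/cmdline)

;; Hypothetical argument parser. Returns two values: the data file, and
;; either an output path (save the plot) or #f (show it in a window).
(define (parse-args argv)
  (define show? #f)
  (command-line
   #:program "plot-cstat"
   #:argv argv
   #:once-each
   [("--show") "Open the plot in a window instead of saving it"
               (set! show? #t)]
   #:args (data-file . out)
   (values data-file
           (and (not show?) (pair? out) (car out)))))

;; The last step of the plotting pipeline could then dispatch on the result,
;; along the lines of:
;;   (if out (save-pict plot out) (show-pict plot))
```

Calling (parse-args '("data.txt" "plot.png")) yields the path to save to, while passing --show or omitting the output path yields #f, meaning the plot should be displayed instead.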

PS Apparently the syntax highlighter here (rouge) doesn’t know how to parse Racket’s reader syntax for (Perl-ish) regular expressions.

#! /usr/bin/env racket
#lang racket
; vim: ft=racket

(require (only-in racket/gui)
         pict
         threading
         data-frame
         sawzall
         graphite)

(define-values (data-file out)
  (command-line
   #:args (data-file out)
   (values data-file out)))

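; Collect each stat into its own list. Every input line contributes one value
; and two #f placeholders (or three #f for non-matching lines), and the
; #:result clause filters the placeholders out. Note that these bindings
; shadow Racket's min and max for the rest of the module.
(define-values (min max avg)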
  (for/lists (min? max? avg?
              #:result (values (filter values min?)
                               (filter values max?)
                               (filter values avg?)))
             ([line (in-lines (open-input-file data-file))])
    (match line
      [(pregexp #px"^min: ([[:digit:].]+)" (list _ (app string->number min))) (values min #f #f)]
      [(pregexp #px"^max: ([[:digit:].]+)" (list _ (app string->number max))) (values #f max #f)]
      [(pregexp #px"^avg: ([[:digit:].]+)" (list _ (app string->number avg))) (values #f #f avg)]
      [_ (values #f #f #f)])))

(define df
  (make-data-frame #:series (list (make-series "min" #:data (list->vector min))
                                  (make-series "max" #:data (list->vector max))
                                  (make-series "avg" #:data (list->vector avg)))))

(~> df
    (pivot-longer everything #:names-to "stat" #:values-to "words")
    (graph #:data _
            #:mapping (aes #:y "stat" #:x "words")
            #:y-label "Stat" #:x-label "Number of words"
            #:x-transform logarithmic-transform
            #:title "Commit message length (words)"
            (boxplot #:invert? #t #:show-outliers? #t)
            (points))
    (save-pict out))

Notes

  1. Based on that old post about Perl

