Junk Drawer Logo Junk Drawer

For all those little papers scattered across your desk

Commit subject case in Git history

D. Ben Knoble on 04 Jan 2025 in Blog

Some insist that commit subjects be capitalized. Does this actually happen in practice?

Here’s a quick analysis of the Git source repository (see my fields script or use your own):

git log --oneline --no-decorate | fields 2 |
    tee >(grep -c '^[[:lower:]]' >lower) >(grep -c '^[[:upper:]]' >upper) >/dev/null
diff -y lower upper
# => 44019 | 30636

Looks roughly even, with a slight preference for lower case words (~59%). But if we filter out automatic messages from merges and reverts:

git log --oneline --no-decorate | fields 2 |
    tee >(grep -c '^[[:lower:]]' >lower) \
      >(grep -v -e Merge -e Revert | grep -c '^[[:upper:]]' >upper) >/dev/null
diff -y lower upper
# => 44019 | 11423

Well, that’s just above 79% preference for lowercase.

Measurements taken from the branch I happened to be on, which is daee763610 (completion: repair config completion for Zsh, 2024-12-29).

Note that the total from the first report (74655) undercounts the number of commits by about 1109 according to git rev-list --count HEAD. This is exactly the number of commits output by

git log --oneline --no-decorate | fields 2 | grep -cv -e '^[[:lower:]]' -e '^[[:upper:]]'

Here’s a histogram of the starts that got omitted by my simple measure (remove -c on the previous grep and pipe to sort | uniq -c | sort -rn):

 892 [PATCH]
  36 .mailmap:
  19 [PATCH
  14 "git
  12 .gitignore:
   9 .gitattributes:
   4 __attribute__:
   4 2.36
   4 *:
   3 [patch]
   3 .mailmap
   3 --walk-reflogs:
   3 --pretty=format:
   3 *.sh:
   3 *.h:
   3 *.c
   3 *.[ch]:
   3 'git
   3 "log
   3 "Assume
   2 {fetch,upload}-pack:
   2 0th
   2 -u
   2 --dirstat:
   2 *.[ch]
   2 (trivial)
   2 $EMAIL
   2 "remote
   2 "rebase
   2 "make
   1 {upload,receive}-pack
   1 {reset,merge}:
   1 {lock,commit,rollback}_packed_refs():
   1 {cvs,svn}import:
   1 {builtin/*,repository}.c:
   1 `git
   1 _XOPEN_SOURCE
   1 _GIT_INDEX_OUTPUT:
   1 \n
   1 [fr]
   1 ?alloc:
   1 3%
   1 2.3.2
   1 .gitignore
   1 .github/workflows/main.yml:
   1 .github/PULL_REQUEST_TEMPLATE.md:
   1 ./configure.ac:
   1 -Wuninitialized:
   1 -Wold-style-definition
   1 --utf8
   1 --summary
   1 --prune
   1 --name-only,
   1 --format=pretty:
   1 --dirstat-by-file:
   1 --base-path-relaxed
   1 *config.txt:
   1 *.h
   1 (various):
   1 (squash)
   1 (short)
   1 (revert
   1 (encode_85,
   1 (cvs|svn)import:
   1 (Hopefully)
   1 'make
   1 'git-merge':
   1 'build'
   1 $GIT_COMMON_DIR:
   1 "test"
   1 "reset
   1 "needs
   1 "lib-diff"
   1 "init-db"
   1 "git-tag
   1 "git-push
   1 "git-merge":
   1 "git-fetch
   1 "git-apply
   1 "git-add
   1 "git"
   1 "format-patch
   1 "diff
   1 "current_exec_path"
   1 "core.sharedrepository
   1 "color.diff
   1 "checkout
   1 "branch
   1 "blame
   1 "assume
   1 "add
   1 

Just for kicks, if I try the 2nd word of the commit message in those cases:

git log --oneline --no-decorate | fields 2 3 |
    grep -v -e '^[[:lower:]]' -e '^[[:upper:]]' | fields 2 |
    tee >(grep -c '^[[:lower:]]' >lower) >(grep -c '^[[:upper:]]' >upper) >/dev/null
diff -y lower upper
# => 491 | 567

We’re still undercounting by 51, but the above doesn’t modify our results much (essentially still within the rounding I used). The 51 commits ought to be in the noise. Even attributing them all as upper case, the rounded percentages don’t change.


Tags:

Categories: Blog

Load Comments
Previous Next
Back to posts