For all those little papers scattered across your desk
This is not a regex tutorial. Those are in the manpages.
I got my official machine running last night, but actually worked on it for the first time today. I’m still using the CentOS laptop because that’s where my current code is, and I haven’t configured the VPN yet. But for the things I was doing (Slack, email, web, even getting my Dotfiles setup), I felt much better.
Ubuntu terminal even supports truecolor and italics out of the box!
Yes, it’s really pronounced that way. Apparently the language is revered for both it’s simplicity and it’s use in networking situations. I wouldn’t know; we use it for some test suits (part of gash, the whole setup was never explained to me). And today I had to use a little to fix a broken build.
It was broken because of…
So, here’s the deal. Like I said, I’m not going to give a tutorial on regex. The
manpages are actually pretty good at it. But I am going to point you to them,
just so you can enjoy the hell many flavors of regex. (None of them taste
good.)
POSIX gives us:
You can read about them at man re_format
and man regex
.
Most UNIX-like utilities (sed
, grep
) use BREs, with a flag (usually -E
)
for extended. Whether or not it is enhanced can be gleaned from the manpages
(does it, for example, support black magic). Basically, man
and info
the
tool.
After that, we basically have the following
man perlrequick
):help pattern
)help(re)
)Don’t get me started on escape-character hell–no one should have to read
"\\"\\\\"
and try to figure out what it’s doing. Heck, certain formats even
have special rules and tables for when characters have special meaning or need
escaped (vim’s (very)?magic
is famous for this).
Does it support back references? Capture groups? What the hell is a negative
look-behind? Why did vim decide to give us \zs
and \ze
(they’re brilliantly
useful, by the way) ? And what about non-greedy modifiers ??
Can I use my favorite feature of all, shortcut character classes? This one
really gets me steamed up–shortcuts like \w
for word-like characters and \d
for digits are really helpful. It’s shorter than the equivalent [:digit:]
,
which often has to become [[:digit:]]
for true RE use.
What I recently discovered, though, is that sed
natively supports certain
shortcuts. And digit isn’t one of them.
So, the regex in the post of this title will match the number of days I’ve
worked up ‘til now, but \d
won’t. At least not in sed
. And tomorrow, it gets
even better.