Perl Regular Expression \K Trick

Regular expressions are a frequently useful tool in our profession, and Perl is probably the most advanced arena for testing your ability to wield regexes.  That’s because Perl has the most feature-rich regular expressions out there (that I know of anyways).  There’s always some new trick to learn about Perl regexes.

Case in point: \K.  Let’s say you want to replace the end of every line that begins with ‘Parent Commit:’, where that string is followed by whitespace and a forty-character hash.  You want to replace the hash.  But you have to hold on to the beginning of the string.  Here’s one way to go about it:

s/^Parent Commit:\s+[0-9a-f]{40}$/Parent Commit: $new_hash/gi

This works, but repeating ‘Parent Commit’ is duplication we would like to avoid.

s/^(Parent Commit:)\s+[0-9a-f]{40}$/$1 $new_hash/gi

Here we capture the beginning of the string so that we can insert it into the replacement.  This saves us from manually copying the text, but (and maybe this is just me) having to capture that text is annoying.  It kinda feels like a waste of a group.

Enter \K.  Available since Perl 5.10, this escape tells the regex engine to “keep” everything it has matched up to that point: the text to the left of \K still has to match, but it is dropped from the final match (the part that ends up in $&).  In the context of s///, it means that our replacement won’t affect anything before the \K, because that text is no longer part of what gets replaced.  That means we can write the regex above in the form

s/^Parent Commit:\s+\K[0-9a-f]{40}$/$new_hash/gi
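A quick way to see what \K does is to look at $& after a plain match, with and without it.  This is only a minimal sketch: the sample line and its hash are made up for illustration, and the variable names are mine.

```perl
use strict;
use warnings;

# Hypothetical sample line; the 40-character hash is made up.
my $line = 'Parent Commit:   ' . ('a1b2c3d4e5' x 4);

my ($without_k, $with_k);

# Without \K, $& holds everything the pattern matched (here, the whole line).
$without_k = $& if $line =~ /^Parent Commit:\s+[0-9a-f]{40}$/i;

# With \K, the text to its left must still match but is left out of $&.
$with_k = $& if $line =~ /^Parent Commit:\s+\K[0-9a-f]{40}$/i;

print "without \\K: '$without_k'\n";
print "with \\K:    '$with_k'\n";
```

The second print shows only the hash, which is exactly the part that s/// would go on to replace.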

After the \K we are left matching only the hash.  The ‘Parent Commit:\s+’ section still has to be there for the pattern to succeed, but it is not part of what gets replaced, so the initial part of the string will still be left intact after the replacement.  This way we don’t need to repeat ‘Parent Commit’ or use a capture group to prevent it from getting replaced.
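To compare the three substitutions side by side, here is a small sketch that runs each of them over a few sample lines.  The hashes, the sample lines, and the helper names (replace_repeat, replace_capture, replace_keep) are all made up for this example; only the three s/// expressions come from the post.

```perl
use strict;
use warnings;

# Made-up hashes for illustration only (each is 40 hex characters).
my $old_hash = 'deadbeef' x 5;
my $new_hash = 'cafebabe' x 5;

my @lines = (
    "Parent Commit:    $old_hash",   # should be rewritten
    "Author: someone",               # no hash, left alone
    "Commit: $old_hash",             # wrong prefix, left alone
);

# Version 1: repeat the prefix in the replacement.
sub replace_repeat {
    my ($s) = @_;
    $s =~ s/^Parent Commit:\s+[0-9a-f]{40}$/Parent Commit: $new_hash/gi;
    return $s;
}

# Version 2: capture the prefix and reinsert it with $1.
sub replace_capture {
    my ($s) = @_;
    $s =~ s/^(Parent Commit:)\s+[0-9a-f]{40}$/$1 $new_hash/gi;
    return $s;
}

# Version 3: use \K so that only the hash is replaced.
sub replace_keep {
    my ($s) = @_;
    $s =~ s/^Parent Commit:\s+\K[0-9a-f]{40}$/$new_hash/gi;
    return $s;
}

for my $line (@lines) {
    print "repeat:  ", replace_repeat($line),  "\n";
    print "capture: ", replace_capture($line), "\n";
    print "keep:    ", replace_keep($line),    "\n\n";
}
```

One small difference shows up in the output: the first two versions collapse the whitespace after the colon to a single space, while the \K version keeps whatever whitespace was there, because the \s+ sits to the left of \K and is never touched.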

Anyone have any other regex tricks or tips?  Please share if you do.