Avoid naughty words in hex digest strings by changing the set of characters
Hexadecimal strings typically contain 0–9 and a–f, which gives plenty of opportunity for (naughty) words to form (Batchelder, 2007), e.g. “deadbabe”.
If we also accept “9” as a substitute for “g,” then even more (naughty) words can be built, e.g. “defacedfa9”. Your imagination is the limit.
To avoid accidentally generating words, we can remove vowels (A E I O U, and arguably Y) from the list of possible characters, and additionally remove characters that can stand in for vowels (0 1 3 4).
The set of characters we use for representing hexadecimal strings therefore becomes shifted: instead of
0123456789abcdef, we now have
Performing this replacement is trivial with
tr on the command line:
% echo 1337cafebabe | tr 0123456789abcdef 256789bcdfghjklm 577cjgmlhghl
String#tr does the same:
'1337cafebabe'.tr('0123456789abcdef', '256789bcdfghjklm') # => "577cjgmlhghl"
Remove ambiguous characters
We can go further and remove ambiguous characters as well. While not necessary, it is helpful when writing down hexadecimal strings (especially when using non-conventional characters, like we do).
Ambiguous characters include B D G J Q S Z and 0 1 2 5 6 8 (with 0 and 1 already removed because they can substitute a vowel). The set of characters we use for representing hexadecimal strings has shifted again: instead of
256789bcdfghjklm (which replaced
0123456789abcdef), we now have
We can use it the same way. Command line:
% echo 1337cafebabe | tr 0123456789abcdef 79cfhklmnprtvwxz 9ffmvrzxtrtx
'1337cafebabe'.tr('0123456789abcdef', '79cfhklmnprtvwxz') # => "9ffmvrzxtrtx"
79cfhklmnprtvwxz is exactly 16 characters and ends with
z. A wonderful coincidence.
Batchelder, Ned. “Hex Words,” March 20, 2007. https://nedbatchelder.com/text/hexwords.html.