Knowing how to write RegEx is crucial for creating or customizing modules or rules for Fail2Ban, Snort, etc.

A RegEx is a pattern that describes the text to match. The engine scans the input and reports the parts that fit that pattern.

A great online tool for testing expressions in real time is RegExr [Link].

Basics

  • /abc/
    • Searches for abc in the text and stops at the first match.
  • /abc/g
    • Searches for abc in the text. The /g flag continues searching for additional matches throughout the text.
  • /abc/gi
    • Adds the /i flag, which makes the search case-insensitive.
  • /a+/g
    • Matches one or more consecutive a's.
  • /ab?/g
    • Matches a followed by an optional b. The ? makes the preceding character optional.
  • /ab*/g
    • The * matches zero or more b's after a.
  • /.b/g
    • The . is a wildcard that matches any single character except a line break, so this matches any character followed by b. That includes a literal period; escape the dot when only a period should match.
  • /\./g
    • Matches a literal period. The same applies to other special characters like ()[]{} — prefix them with a backslash (e.g. \(\)\[\]\{\}) to match them literally instead of using them as syntax.
  • /\.$/g
    • Matches a period at the end of the text.
  • /\.$/gm
    • Matches a period at the end of each line.
  • /^abc/g
    • Matches abc at the beginning of the text.
  • /^abc/gm
    • Matches abc at the beginning of each line in multiline text.
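  • /\w/g
    • Matches any word character (letters, digits, and the underscore).
  • /\W/g
    • Matches any non-word character (anything \w does not match, such as spaces and punctuation).
  • /\s/g
    • Matches any whitespace character (spaces, tabs, and line breaks).
  • /\S/g
    • Matches any non-whitespace character.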
  • /\w{5}/g
    • Matches exactly 5 word characters.
  • /\w{5,}/g
    • Matches 5 or more word characters.
  • /\w{5,8}/g
    • Matches between 5 and 8 word characters.
  • /\d/g
    • \d matches any digit (0-9).
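  • /[aáàãăâ]bc/g
    • Matches any one character from the set in brackets, followed by bc.
  • /[a-zA-Z0-9]/g
    • Matches any single letter or digit, using ranges inside the set.
  • /[^0-9]/g
    • Matches any single character that is not in the set (here, anything that is not a digit). The ^ at the start of the set negates it.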
  • /(abc|xyz)/g
    • A group using the | (or) operator to match either alternative.
  • /(x|y|z){2,3}/g
    • Matches two or three consecutive characters, each of which is x, y, or z.
  • /(?<=abc)./g
    • A positive lookbehind: matches any single character preceded by abc, without including abc in the match.
  • /(?<!abc)./g
    • A negative lookbehind: matches any single character not preceded by abc.
  • /.(?=abc)/g
    • A positive lookahead: matches any single character followed by abc, without including abc in the match.
  • /.(?!abc)/g
    • A negative lookahead: matches any single character not followed by abc.
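
As a rough illustration of the flag, quantifier, and anchor entries above, here is a minimal Python sketch. It assumes Python's built-in re module instead of the /pattern/flags notation used in the list: re.search stands in for a pattern without g, re.findall for g, re.IGNORECASE for i, and re.MULTILINE for m. The sample strings are made up for the example.

```python
import re

text = "abc ABC abc"

# /abc/ : re.search stops at the first match, like a pattern without the g flag.
first = re.search(r"abc", text)
print(first.span())        # (0, 3)

# /abc/g : re.findall returns every non-overlapping match.
all_matches = re.findall(r"abc", text)
print(all_matches)         # ['abc', 'abc']

# /abc/gi : re.IGNORECASE plays the role of the i flag.
any_case = re.findall(r"abc", text, flags=re.IGNORECASE)
print(any_case)            # ['abc', 'ABC', 'abc']

# Quantifiers: + (one or more), ? (optional), * (zero or more).
one_or_more = re.findall(r"a+", "caaat at")
optional_b = re.findall(r"ab?", "ab a abb")
zero_or_more = re.findall(r"ab*", "ab a abb")
print(one_or_more)         # ['aaa', 'a']
print(optional_b)          # ['ab', 'a', 'ab']
print(zero_or_more)        # ['ab', 'a', 'abb']

# An unescaped dot matches any character (including a period);
# an escaped dot matches only a literal period.
wildcard = re.findall(r".b", "ab .b")
literal_dot = re.findall(r"\.", "a.b.c")
print(wildcard)            # ['ab', '.b']
print(literal_dot)         # ['.', '.']

# Anchors: without MULTILINE, $ and ^ refer to the whole text;
# with MULTILINE they apply to every line.
lines = "first.\nsecond\nthird."
end_of_text = re.findall(r"\.$", lines)
end_of_lines = re.findall(r"\.$", lines, flags=re.MULTILINE)
print(end_of_text)         # ['.']
print(end_of_lines)        # ['.', '.']

log = "abc 1\nabc 2"
start_of_text = [m.start() for m in re.finditer(r"^abc", log)]
start_of_lines = [m.start() for m in re.finditer(r"^abc", log, flags=re.MULTILINE)]
print(start_of_text)       # [0]
print(start_of_lines)      # [0, 6]
```

One difference to keep in mind: in Python, $ without re.MULTILINE also matches just before a single trailing newline at the end of the text, which the sample above avoids.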

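The character classes, counted repetitions, sets, groups, and lookarounds from the Basics list can be tried out the same way. This is only a sketch using Python's re module with invented sample strings, and the lookaround patterns use abc as the surrounding text. Note that Python's \w and \d also match non-ASCII letters and digits by default; passing re.ASCII restricts them.

```python
import re

sample = "user_42 logged in at 10:15"

# Shorthand classes.
words = re.findall(r"\w+", sample)
digits = re.findall(r"\d", sample)
spaces = re.findall(r"\s", "a b\tc")
non_words = re.findall(r"\W", "a-b c")
print(words)        # ['user_42', 'logged', 'in', 'at', '10', '15']
print(digits)       # ['4', '2', '1', '0', '1', '5']
print(spaces)       # [' ', '\t']
print(non_words)    # ['-', ' ']

# Counted repetition: {5}, {5,}, {5,8}.
exactly_5 = re.findall(r"\w{5}", "abcdefghijkl")
at_least_5 = re.findall(r"\w{5,}", "four seven eleven")
five_to_8 = re.findall(r"\w{5,8}", "abcdefghijklm")
print(exactly_5)    # ['abcde', 'fghij']
print(at_least_5)   # ['seven', 'eleven']
print(five_to_8)    # ['abcdefgh', 'ijklm']

# Sets and negated sets.
accented = re.findall(r"[aáàãăâ]bc", "abc ábc xbc")
non_digits = re.findall(r"[^0-9]", "a1b2")
print(accented)     # ['abc', 'ábc']
print(non_digits)   # ['a', 'b']

# Alternation inside a group, and a repeated group.
either = re.findall(r"(abc|xyz)", "abc-xyz")
# finditer + group(0) gives the whole match; findall would return
# only the last repetition captured by the group.
repeated = [m.group(0) for m in re.finditer(r"(x|y|z){2,3}", "xy zzzz x")]
print(either)       # ['abc', 'xyz']
print(repeated)     # ['xy', 'zzz']

# Lookarounds check the surrounding text without consuming it.
after_abc = re.findall(r"(?<=abc).", "abcd abce")
not_after_abc = re.findall(r"(?<!abc).", "abcd")
before_abc = re.findall(r".(?=abc)", "xabc yabc")
not_before_abc = re.findall(r".(?!abc)", "xabc")
print(after_abc)        # ['d', 'e']
print(not_after_abc)    # ['a', 'b', 'c']
print(before_abc)       # ['x', 'y']
print(not_before_abc)   # ['a', 'b', 'c']
```
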
Expressions

  • /(?<name1>abc)(?<name2>xyz)(?:mnt)/
    • Assigns names to the capturing groups: abc becomes name1 and xyz becomes name2, while (?:mnt) is a non-capturing group that takes part in the match but is not stored. In a find and replace, you can reference these names to reorder the groups, for example using $<name2>$<name1> in JavaScript to swap them (the reference syntax varies by engine).
  • /(\+?[1-9]{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}/
    • Validates phone numbers in a variety of formats, including:
      • 1234567890
      • 123-456-7890
      • 123 456 7890
      • (123) 456 7890
      • 1 (123) 456 7890
      • +1(123)4567890
      • +55 1234567890
      • +551234567890
  • /[a-z0-9-._%+]{1,50}@[a-z0-9-._]{1,50}\.[a-z]{2,}/
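    • Validates email addresses in a simplified form. As written it only accepts lowercase letters; add the /i flag to accept uppercase as well.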
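
The named-group entry in this list can be sketched in Python as follows. This is an adaptation under stated assumptions and not the same syntax as the pattern above: Python's re module spells named groups (?P<name>...) and refers to them in a replacement as \g<name>. The input string is invented for the example.

```python
import re

# Python spelling of /(?<name1>abc)(?<name2>xyz)(?:mnt)/.
pattern = re.compile(r"(?P<name1>abc)(?P<name2>xyz)(?:mnt)")

source = "--abcxyzmnt--"
match = pattern.search(source)

# Named groups are available by name; the non-capturing group is not stored.
groups = match.groupdict()
print(groups)     # {'name1': 'abc', 'name2': 'xyz'}

# Swap the two named groups. "mnt" is part of the match but is not
# referenced in the replacement, so it is dropped from the result.
swapped = pattern.sub(r"\g<name2>\g<name1>", source)
print(swapped)    # --xyzabc--
```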

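To see how the phone and email patterns behave as validators, the sketch below runs them through Python's re.fullmatch, which requires the whole string to match. That is an assumption of this example, since the patterns as written carry no ^ or $ anchors. The helper names is_phone and is_email are hypothetical, and the email addresses are made-up test values.

```python
import re

PHONE = r"(\+?[1-9]{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}"
EMAIL = r"[a-z0-9-._%+]{1,50}@[a-z0-9-._]{1,50}\.[a-z]{2,}"


def is_phone(value: str) -> bool:
    """Return True when the whole string fits the phone pattern."""
    return re.fullmatch(PHONE, value) is not None


def is_email(value: str, ignore_case: bool = False) -> bool:
    """Return True when the whole string fits the email pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(EMAIL, value, flags) is not None


phone_samples = [
    "1234567890",
    "123-456-7890",
    "123 456 7890",
    "(123) 456 7890",
    "1 (123) 456 7890",
    "+1(123)4567890",
    "+55 1234567890",
    "+551234567890",
]
phone_results = [is_phone(s) for s in phone_samples]
print(all(phone_results))                         # True
print(is_phone("123-45-678"))                     # False

print(is_email("john.doe+tag@example.com"))       # True
print(is_email("user@localhost"))                 # False (no dot in the domain)
print(is_email("John@Example.com"))               # False (uppercase, no i flag)
print(is_email("John@Example.com", ignore_case=True))  # True
```

Both patterns are deliberately loose. The email pattern, for instance, still accepts consecutive dots, so it is better suited to filtering obvious mistakes than to strict validation.
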
Sources

Excellent deep dive into Regular Expressions [Link].