Knowing how to write RegEx is crucial for creating or customizing modules or rules for Fail2Ban, Snort, etc.
A regular expression (RegEx) searches text by comparing it against a pattern that defines what to match.
A great online tool for testing expressions in real time is RegExr [Link].
Basics
- /abc/
- Searches for abc in the text and stops at the first match.
- /abc/g
- Searches for abc in the text. The /g flag continues searching for additional matches throughout the text.
- /abc/gi
- The i flag makes the search case-insensitive, so it also matches ABC, Abc, and so on.
- /a+/g
- Matches one or more consecutive a's.
- /ab?/g
- Matches a followed by an optional b. The ? makes the preceding character optional.
- /ab*/g
- The * matches zero or more b's after a.
- /.b/g
- The . is a wildcard that matches any single character except a line break, here followed by b. Because it also matches a period, escape it to match only a literal period.
- /\./g
- Matches a literal period. The same applies to other special characters like ()[]{} — prefix them with a backslash (e.g. \(\)\[\]\{\}) to match them literally instead of using them as syntax.
- /\.$/g
- Matches a period at the end of the text.
- /\.$/gm
- Matches a period at the end of each line.
- /^abc/g
- Matches abc at the beginning of the text.
- /^abc/gm
- Matches abc at the beginning of each line in multiline text.
- /\w/g
- Matches any word character (a letter, a digit, or an underscore).
- /\W/g
- Matches any non-word character.
- /\s/g
- Matches any whitespace character (space, tab, line break, and so on).
- /\S/g
- Matches any non-whitespace character.
- /\w{5}/g
- Matches exactly 5 word characters.
- /\w{5,}/g
- Matches 5 or more word characters.
- /\w{5,8}/g
- Matches between 5 and 8 word characters.
- /\d/g
- Matches any digit (0-9).
- /[aáàãăâ]bc/g
- Matches any one character from the set in brackets, followed by bc.
- /[a-zA-Z0-9]/g
- Matches any single character that falls in one of the ranges: a lowercase letter, an uppercase letter, or a digit.
- /[^0-9]/g
- The ^ at the start of a set negates it, so this matches any single character that is not a digit.
- /(abc|xyz)/g
- A group using the | (or) operator to match either alternative.
- /(x|y|z){2,3}/g
- Matches two or three consecutive characters, each being x, y, or z (for example xy or zzy).
- /(?<=abc)./g
- A positive lookbehind: matches any single character preceded by abc, without including abc in the match.
- /(?<!abc)./g
- A negative lookbehind: matches any single character not preceded by abc.
- /.(?=abc)/g
- A positive lookahead: matches any single character followed by abc, without including abc in the match.
- /.(?!abc)/g
- A negative lookahead: matches any single character not followed by abc.
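The basics above can be tried outside an online tester as well. The following is a minimal sketch in JavaScript, whose /pattern/flags literals use the same notation as this list; the sample strings and variable names are made up for illustration, and the lookarounds are written around abc as in the descriptions.

```javascript
// Sample strings and variable names are made up for illustration.

// Flags: without g only the first match is reported; i ignores case.
const firstOnly = "abc ABC abcabc".match(/abc/);    // firstOnly[0] is "abc", at index 0
const allMatches = "abc ABC abcabc".match(/abc/g);  // ["abc", "abc", "abc"]
const anyCase = "abc ABC abcabc".match(/abc/gi);    // ["abc", "ABC", "abc", "abc"]

// Quantifiers: + (one or more), ? (optional), * (zero or more).
const plus = "caaat".match(/a+/g);                  // ["aaa"]
const optional = "a ab abb".match(/ab?/g);          // ["a", "ab", "ab"]
const star = "a ab abb".match(/ab*/g);              // ["a", "ab", "abb"]

// The dot matches any character except a line break, including a period.
const dot = "ab.b".match(/.b/g);                    // ["ab", ".b"]
const literalDot = "a.b.".match(/\./g);             // [".", "."]

// Anchors: ^ and $ refer to the whole text unless the m flag is set.
const twoLines = "abc.\nabc.";
const endOfText = twoLines.match(/\.$/g);           // ["."]
const endOfLines = twoLines.match(/\.$/gm);         // [".", "."]
const startOfLines = twoLines.match(/^abc/gm);      // ["abc", "abc"]

// Classes, counts, and alternation.
const fiveChars = "hello worlds".match(/\w{5}/g);   // ["hello", "world"]
const fivePlus = "hello worlds".match(/\w{5,}/g);   // ["hello", "worlds"]
const nonDigits = "a1b2".match(/[^0-9]/g);          // ["a", "b"]
const groups = "xy zzz q".match(/(x|y|z){2,3}/g);   // ["xy", "zzz"]

// Lookarounds check the surrounding text without including it in the match.
const afterAbc = "abcd abce".match(/(?<=abc)./g);   // ["d", "e"]
const notAfterAbc = "abcd".match(/(?<!abc)./g);     // ["a", "b", "c"]
const beforeAbc = "xabc yabc".match(/.(?=abc)/g);   // ["x", "y"]
const notBeforeAbc = "xabc".match(/.(?!abc)/g);     // ["a", "b", "c"]
```

Lookbehind was added to JavaScript in ES2018, so older engines may reject those two patterns. Other engines (for example the ones behind Fail2Ban or Snort) can differ in flag handling and supported syntax, so treat the results as specific to JavaScript.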
Expressions
- /(?<name1>abc)(?<name2>xyz)(?:mnt)/
- Captures abc as the named group name1 and xyz as name2; (?:mnt) is a non-capturing group, so mnt must be present but is not stored. In a find and replace, you can reference the names to reorder the groups, for example $<name2>$<name1> in JavaScript to swap them (the reference syntax varies by engine).
- /(\+?[1-9]{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}/
- Matches phone numbers in a variety of formats. To validate a whole string rather than find a number inside it, anchor the pattern with ^ and $. Supported formats include:
- 1234567890
- 123-456-7890
- 123 456 7890
- (123) 456 7890
- 1 (123) 456 7890
- +1(123)4567890
- +55 1234567890
- +551234567890
- /[a-z0-9-._%+]{1,50}@[a-z0-9-._]{1,50}\.[a-z]{2,}/
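- Matches typical email addresses. The character classes list only lowercase letters, so add the i flag to accept uppercase, and anchor the pattern with ^ and $ to validate a whole string. It is a practical approximation, not a complete check of every valid address.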
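The three expressions above can be exercised with a short script. This is a minimal sketch in JavaScript under a few assumptions: the phone and email patterns are copied with ^ and $ added so the whole string must match, the email pattern gets the i flag, and isPhone and isEmail are hypothetical helper names, not part of any library. JavaScript refers to a named group in a replacement string as $<name>.

```javascript
// Named groups: swap the two captured parts; (?:mnt) is matched but not captured.
const groupPattern = /(?<name1>abc)(?<name2>xyz)(?:mnt)/;
const swapped = "abcxyzmnt".replace(groupPattern, "$<name2>$<name1>"); // "xyzabc"
const captured = groupPattern.exec("abcxyzmnt").groups; // name1 is "abc", name2 is "xyz"

// Anchored copies of the phone and email patterns (assumption: ^, $ and the
// i flag are additions for validation, not part of the original patterns).
const phone = /^(\+?[1-9]{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}$/;
const email = /^[a-z0-9-._%+]{1,50}@[a-z0-9-._]{1,50}\.[a-z]{2,}$/i;

// Hypothetical helpers: true only when the entire string matches.
function isPhone(s) {
  return phone.test(s);
}
function isEmail(s) {
  return email.test(s);
}

// The formats listed above.
const phoneSamples = [
  "1234567890",
  "123-456-7890",
  "123 456 7890",
  "(123) 456 7890",
  "1 (123) 456 7890",
  "+1(123)4567890",
  "+55 1234567890",
  "+551234567890",
];
const allPhonesValid = phoneSamples.every(isPhone);     // true
const shortPhone = isPhone("123-456-789");              // false, one digit short

const goodEmail = isEmail("User.Name+tag@Example.com"); // true
const badEmail = isEmail("user@example");               // false, no top-level domain
```

Without the anchors, test() returns true as soon as any part of the input matches, which suits searching log lines but not validating a form field.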
SOURCES
Excellent deep dive into Regular Expressions [Link].