20.9. Summary of regular expression notation

MATCHING

Positional restrictions

^	Matches (accepting no text) only at the start of the text
$	Matches (accepting no text) only at the end of the text
\b	Word boundary: matches at either end of text or between a \w and a \W
\B	Matches anywhere where \b does not match
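
The positional restrictions can be tried out in any Perl-style engine. What follows is a minimal sketch in Python's "re" module, used here only as a stand-in and not as Inform source text; the sample texts are invented, and they were chosen so that Python's idea of a word character agrees with the definition of \w given below.

```python
import re

# ^ and $ accept no text: they only assert a position.
starts_with_the = re.search(r"^the", "the lamp") is not None
ends_with_lamp = re.search(r"lamp$", "the lamp") is not None

# \b sits between a word character and a non-word character, or at either end.
whole_word = re.findall(r"\bcat\b", "cat concatenate cat.")

# \B matches wherever \b does not, so this finds "cat" only inside a longer word.
inside_word = re.findall(r"\Bcat\B", "cat concatenate cat.")

print(whole_word)   # ['cat', 'cat']
print(inside_word)  # ['cat']
```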

Backslashed character classes

\char	If char is other than a-z, A-Z, 0-9 or space, matches that literal char
\\	For example, this matches literal backslash "\"
\n	Matches literal line break character
\t	Matches literal tab character (but use this only with external files)

\d	Matches any single digit
\l	Matches any lower case letter (by Unicode 4.0.0 definition)
\p	Matches any single punctuation mark: . , ! ? - / " : ; ( ) [ ] { }
\s	Matches any single spacing character (space, line break, tab)
\u	Matches any upper case letter (by Unicode 4.0.0 definition)
\w	Matches any single word character (neither \p nor \s)

\D	Matches any single non-digit
\L	Matches any non-lower-case-letter
\P	Matches any single non-punctuation-mark
\S	Matches any single non-spacing-character
\U	Matches any non-upper-case-letter
\W	Matches any single non-word-character (i.e., matches either \p or \s)
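
Python's "re" module has no \p, \l or \u, and its \w is defined differently, so the sketch below builds stand-ins from the definitions in the table: a punctuation class from the marks listed for \p, and a word class as "neither \p nor \s". The names PUNCTUATION, P_CLASS and W_CLASS are invented for this illustration, which approximates the definitions above and is not Inform's own implementation.

```python
import re

# The punctuation marks listed for \p in the table above.
PUNCTUATION = '.,!?-/":;()[]{}'

# Stand-in for \p: any one of those marks.
P_CLASS = "[" + re.escape(PUNCTUATION) + "]"

# Stand-in for \w: neither punctuation nor spacing.
W_CLASS = "[^" + re.escape(PUNCTUATION) + r"\s]"

text = "Wait, what? Room 101!"

digits = re.findall(r"\d", text)         # \d: single digits
marks = re.findall(P_CLASS, text)        # \p: single punctuation marks
words = re.findall(W_CLASS + "+", text)  # \w+: runs of word characters

print(digits)  # ['1', '0', '1']
print(marks)   # [',', '?', '!']
print(words)   # ['Wait', 'what', 'Room', '101']
```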

Other character classes

.	Matches any single character
<...>	Character range: matches any single character inside
<^...>	Negated character range: matches any single character not inside

Inside a character range

e-h	Any character in the run "e" to "h" inclusive (and so on for other runs)
>...	Starting with ">" means that a literal close angle bracket is included
\	Backslash has the same meaning as for backslashed character classes: see above
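
Character ranges behave like the bracketed sets of other Perl-style engines, except that they are written with angle brackets. As a rough sketch, the Python equivalents below use square brackets, with the angle-bracket form shown in each comment; the sample text "plugh" is arbitrary.

```python
import re

sample = "plugh"

vowels = re.findall(r"[aeiou]", sample)       # angle-bracket form: <aeiou>
non_vowels = re.findall(r"[^aeiou]", sample)  # angle-bracket form: <^aeiou>
e_to_h = re.findall(r"[e-h]", sample)         # angle-bracket form: <e-h>

print(vowels)      # ['u']
print(non_vowels)  # ['p', 'l', 'g', 'h']
print(e_to_h)      # ['g', 'h']
```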

Structural

|	Divides alternatives: "fish|fowl" matches either
(?i)	Always matches: switches to case-insensitive matching from here on
(?-i)	Always matches: switches to case-sensitive matching from here on
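
Alternatives and case switching can be sketched in Python as well, with one caveat: Python's "re" module confines a case switch to a group, written "(?i:...)", and does not apply it "from here on" as described above. The example is therefore an approximation of the behaviour, and the sample texts are invented.

```python
import re

# Alternatives: the expression matches if either side matches.
either = [bool(re.search(r"fish|fowl", t)) for t in ("goldfish", "wildfowl", "newt")]

# Case-insensitive for "fish" only; "Fowl" must still match exactly.
pattern = r"(?i:fish) and Fowl"
mixed = re.fullmatch(pattern, "FISH and Fowl") is not None
strict = re.fullmatch(pattern, "FISH and fowl") is not None

print(either)         # [True, True, False]
print(mixed, strict)  # True False
```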

Repetitions

...?	Matches "..." either 0 or 1 times, i.e., makes "..." optional
...*	Matches "..." 0 or more times: e.g. "\s*" matches an optional run of space
...+	Matches "..." 1 or more times: e.g. "x+" matches any run of "x"s
...{6}	Matches "..." exactly 6 times (similarly for other numbers, of course)
...{2,5}	Matches "..." between 2 and 5 times
...{3,}	Matches "..." 3 or more times
....?	"?" after any repetition makes it "lazy", matching as few repeats as it can
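
The difference between an ordinary (greedy) repetition and a lazy one is easiest to see side by side. This sketch again uses Python's "re" module as a stand-in, on the assumption that the repetition operators behave alike in both engines; the sample text is invented.

```python
import re

text = 'say "one" and "two"'

# Greedy: .* takes as much as it can, so the match runs to the last quote.
greedy = re.search(r'".*"', text).group()

# Lazy: .*? takes as little as it can, so the match stops at the next quote.
lazy = re.search(r'".*?"', text).group()

# "?" makes the "u" optional.
optional = [bool(re.fullmatch(r"colou?r", w)) for w in ("color", "colour", "colouur")]

# {2,5} accepts between 2 and 5 repeats, here tested on runs of 1 to 6 "x"s.
counted = [bool(re.fullmatch(r"x{2,5}", "x" * n)) for n in range(1, 7)]

print(greedy)    # "one" and "two"
print(lazy)      # "one"
print(optional)  # [True, True, False]
print(counted)   # [False, True, True, True, True, False]
```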

Numbered subexpressions

(...)	Groups part of the expression together: matches if the interior matches
\1	Matches the contents of the 1st subexpression reading left to right
\2	Matches the contents of the 2nd, and so on up to "\9" (but no further)
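
A numbered subexpression can be referred back to later in the same expression. As a small sketch in Python's "re" module (a stand-in, with an invented sample word), "(\w)\1" finds any character that is immediately repeated:

```python
import re

# (\w) captures one character; \1 then requires that same character again.
doubled = re.findall(r"(\w)\1", "bookkeeper")

# Two subexpressions: \2 and \1 in reverse order match a four-letter palindrome.
palindrome = re.search(r"(\w)(\w)\2\1", "the abba song").group()

print(doubled)     # ['o', 'k', 'e']
print(palindrome)  # abba
```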

Unnumbered subexpressions

(# ...)	Comment: always matches, and the contents are ignored
(?= ...)	Lookahead: matches if the text ahead matches "...", but doesn't consume it
(?! ...)	Negated lookahead: matches if lookahead fails
(?<= ...)	Lookbehind: matches if the text behind matches "...", but doesn't consume it
(?<! ...)	Negated lookbehind: matches if lookbehind fails
(> ...)	Possessive: tries to match "..." and if it succeeds, never backtracks on this
(?(1)...)	Conditional: if \1 has matched by now, require that "..." be matched
(?(1)...|...)	Conditional: ditto, but if \1 has not matched, require the second part
(?(?=...)...|...)	Conditional with lookahead as its condition for which to match
(?(?<=...)...|...)	Conditional with lookbehind as its condition for which to match
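
Lookahead, lookbehind and the "(?(1)...)" conditional are written the same way in Python's "re" module, so they can be sketched there. The comment, possessive and lookaround-conditional forms are left out, because the sketch does not assume that Python spells them identically. The sample texts are invented.

```python
import re

weights = "12kg 7lb 30kg"
prices = "$5 and 7 and $12"

# Lookahead: digits that are followed by "kg", without consuming the "kg".
before_kg = re.findall(r"\d+(?=kg)", weights)

# Negated lookahead: digits not followed by another digit or by "k".
not_kg = re.findall(r"\d+(?![\dk])", weights)

# Lookbehind: digits that are preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negated lookbehind: digits not preceded by a dollar sign or another digit.
plain = re.findall(r"(?<![\$\d])\d+", prices)

# Conditional: if subexpression 1 (an opening bracket) matched, require a closing one.
balanced = r"(\()?\d+(?(1)\))"
results = [bool(re.fullmatch(balanced, t)) for t in ("42", "(42)", "(42", "42)")]

print(before_kg, not_kg)  # ['12', '30'] ['7']
print(dollars, plain)     # ['5', '12'] ['7']
print(results)            # [True, True, False, False]
```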

IN REPLACEMENT TEXT

\char	If char is other than a-z, A-Z, 0-9 or space, expands to that literal char
\\	In particular, "\\" expands to a literal backslash "\"
\n	Expands to a line break character
\t	Expands to a tab character (but use this only with external files)
\0	Expands to the full text matched
\1	Expands to whatever the 1st bracketed subexpression matched
\2	Expands to whatever the 2nd matched, and so on up to "\9" (but no further)
\l0	Expands to \0 converted to lower case (and so on for "\l1" to "\l9")
\u0	Expands to \0 converted to upper case (and so on for "\u1" to "\u9")
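
Replacement text can be sketched in Python too, with two differences of spelling that are particular to Python: the full match is written "\g<0>" where the table has \0, and there is no equivalent of \l1 or \u1, so case conversion is done by passing a function. The sample texts are invented.

```python
import re

# \1 and \2 expand to what the 1st and 2nd subexpressions matched.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# The full text matched (written \0 in the table above) is \g<0> in Python.
wrapped = re.sub(r"\d+", r"[\g<0>]", "room 101")

# Stand-in for \u1: convert what the 1st subexpression matched to upper case.
capitalised = re.sub(r"\b(\w)", lambda m: m.group(1).upper(), "the old lamp")

print(swapped)      # world hello
print(wrapped)      # room [101]
print(capitalised)  # The Old Lamp
```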
