- Published on
A Practical Guide to Regular Expressions for Developers
- Authors
- Name
Regular expressions (regex) are pattern-matching strings that can validate input, extract data, and transform text with a single line of code. They're available in every major programming language and are indispensable for log analysis, form validation, data parsing, and code refactoring.
Use Intoolhub's Regex Tester to test every example in this guide live, with real-time match highlighting.
The Basics
A regular expression is a sequence of characters that defines a search pattern. At its simplest:
/hello/
This matches the literal string "hello" anywhere in the input.
Character Classes
Character classes match one character from a set:
| Pattern | Matches |
|---|---|
[abc] | a, b, or c |
[a-z] | any lowercase letter |
[A-Z] | any uppercase letter |
[0-9] | any digit |
[^abc] | anything except a, b, or c |
Shorthand classes
| Shorthand | Equivalent | Meaning |
|---|---|---|
\d | [0-9] | digit |
\D | [^0-9] | non-digit |
\w | [a-zA-Z0-9_] | word character |
\W | [^a-zA-Z0-9_] | non-word character |
\s | [ \t\r\n\f] | whitespace |
\S | [^ \t\r\n\f] | non-whitespace |
. | (anything) | any character except newline |
Quantifiers
Quantifiers control how many times a pattern must match:
| Quantifier | Meaning |
|---|---|
* | 0 or more |
+ | 1 or more |
? | 0 or 1 (optional) |
{3} | exactly 3 |
{3,} | 3 or more |
{3,6} | between 3 and 6 |
Example — match a US phone number:
/\d{3}[-.\s]?\d{3}[-.\s]?\d{4}/
Matches: 555-867-5309, 555.867.5309, 5558675309
Anchors
Anchors assert position, not characters:
| Anchor | Matches |
|---|---|
^ | Start of string (or line in multiline mode) |
$ | End of string (or line in multiline mode) |
\b | Word boundary |
\B | Non-word boundary |
Example — validate that a string is exactly a 5-digit ZIP code:
/^\d{5}$/
Groups and Capturing
Parentheses group patterns and capture matches:
const match = '2025-02-24'.match(/(\d{4})-(\d{2})-(\d{2})/)
// match[1] = "2025", match[2] = "02", match[3] = "24"
Named groups (modern syntax)
const { year, month, day } = '2025-02-24'.match(
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
).groups
Non-capturing group
Use (?:...) when you need grouping for quantifiers but don't need to capture:
/(?:https?|ftp):\/\//
Flags
| Flag | Meaning |
|---|---|
g | Global — find all matches, not just the first |
i | Case-insensitive |
m | Multiline — ^ and $ match line boundaries |
s | Dotall — . also matches newline |
Practical Examples
Email validation (simplified)
/^[^\s@]+@[^\s@]+\.[^\s@]+$/
URL matching
/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_+.~#?&/=]*)/
Extract all hex colors from CSS
/#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})\b/g
Validate a UUID v4
/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i
Strip HTML tags
const stripped = html.replace(/<[^>]*>/g, '')
Common Pitfalls
Greedy vs. lazy matching
.* is greedy — it matches as much as possible. Add ? to make it lazy:
/<.+>/ // greedy: matches the entire "<b>bold</b>"
/<.+?>/ // lazy: matches just "<b>"
Catastrophic backtracking
Nested quantifiers like (a+)+ can cause exponential runtime on certain inputs. Always test regex against adversarial inputs before using in production.
Escaping special characters
These characters have special meaning and must be escaped with \ when used literally:
. * + ? ^ $ { } [ ] | ( ) \
Testing Regex
The Regex Tester on Intoolhub lets you:
- Enter a pattern and test string side by side
- See all matches highlighted in real time
- Toggle flags (g, i, m, s) with checkboxes
- View captured groups in a structured table