wat

www.thebigduck.us

String Pattern Matching

Often used for data validation

Test against pattern (true/false)

Find matching groups
(and possibly replace content)

Regex Engines

Same language, different engines.

For example: PHP native versus PCRE

POSIX (standard), BRE/ERE (Linux grep), Perl, java.util.regex, XRegExp (JS)

What do they look like?

/the regex goes in here/

`the regex goes in here`

"the regex goes in here"

"/the regex goes in here/"

Okay, but what does it really look like?

/foo/.test("foobar");

preg_match("/foo/", "foobar");

/foo/ =~ "foobar"

"foobar" =~ /foo/

Simple Character Test

/a/.test("JavaScript Rules");  // true

/z/.test("JavaScript Rules");  // false

Testing versus Matching

true/false vs. matched groups

/a/.test("JavaScript Rules");   // true
"JavaScript Rules".match(/a/);  // ["a"]

"JavaScript Rules".match(/z/);  // null

Where's the other "a"?

"JavaScript Rules".match(/a/);  // ["a"]

We need to make it global!

"JavaScript Rules".match(/a/g);  // ["a", "a"]

The "g" here is called a "flag", the "global" flag.

Other Flags

"JavaScript Rules".match(/r/);  // ["r"]

// global
"JavaScript Rules".match(/r/g);  // ["r"]

Case Insensitive

// case-insensitive AND global
"JavaScript Rules".match(/r/ig);  // ["r", "R"]

Multiple Characters

Any Character (.)
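
A quick sketch of the dot in JavaScript. The sample string and variable names are illustrative, not from the original slides.

```javascript
// The dot matches any single character (except a line break by default).
const dotMatches = "The fat cat sat".match(/.at/g);
console.log(dotMatches); // ["fat", "cat", "sat"]

// The dot must consume a character, so "at" on its own does not match.
const bareAt = /.at/.test("at");
console.log(bareAt); // false
```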

Classes, Ranges, and Repetition

[ ], ?, +, *, { }
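
A minimal sketch of each symbol above, one at a time. The patterns and test strings are made up for illustration.

```javascript
// [ ] is a character class (here with a range), { } is an exact count.
const zipOk = /[0-9]{5}/.test("Zip: 20500"); // true

// ? makes the preceding item optional (zero or one).
const color = /colou?r/;
const bothSpellings = [color.test("color"), color.test("colour")]; // [true, true]

// + means one or more, * means zero or more.
const plusMatch = "gooooal".match(/go+al/); // ["gooooal"]
const starAllowsZero = /go*al/.test("gal"); // true
const plusNeedsOne = /go+al/.test("gal");   // false
```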

Negation

[^]
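
A small sketch of a negated class, assuming the same sample word used elsewhere in the deck. Variable names are illustrative.

```javascript
// [^...] matches any single character NOT listed in the class.
const nonVowels = "JavaScript".match(/[^aeiou]/gi);
console.log(nonVowels); // ["J", "v", "S", "c", "r", "p", "t"]

// A string made only of digits has nothing for [^0-9] to match.
const hasNonDigit = /[^0-9]/.test("12345"); // false
```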

Escaping Characters

\ . ? * + - | [ ] ( ) { } ^ $
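
A sketch of why escaping matters. Note that `escapeRegex` is a hypothetical helper written for this example, not a built-in function.

```javascript
// Unescaped, the dot matches any character, so this is too permissive.
const loose = /3.50/.test("3x50");   // true
// Escaped, it only matches a literal dot.
const strict = /3\.50/.test("3x50"); // false
const price = /\$3\.50/.test("Total: $3.50"); // true

// Hypothetical helper: escape special characters in arbitrary text
// so it can be used as a literal pattern.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\-]/g, "\\$&");
}

const unescapedFails = new RegExp("1+1=2").test("1+1=2");            // false
const escapedWorks = new RegExp(escapeRegex("1+1=2")).test("1+1=2"); // true
```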

Grouping and Alternation

( | )
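
A minimal sketch of a group with alternation inside it. The pattern and strings are illustrative.

```javascript
const pet = /I like (cats|dogs)/;
const petMatch = "I like dogs the most".match(pet);
// Index 0 is the whole match, index 1 is the first group.
console.log(petMatch[0]); // "I like dogs"
console.log(petMatch[1]); // "dogs"

// Without the group, alternation splits the WHOLE pattern:
// "I like cats" OR "dogs".
const ungrouped = /I like cats|dogs/.test("dogs"); // true
const grouped = pet.test("dogs");                  // false
```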

Group Matching Caution

Creating a matched group adds processing time!
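
One way to avoid that cost is a non-capturing group, written `(?: )`. This sketch only shows the difference in what gets captured; it does not measure timing.

```javascript
// Both groups are captured and returned.
const capturing = "foobar".match(/(foo)(bar)/);
console.log(capturing.length);    // 3: whole match, "foo", "bar"

// (?:foo) still groups, but is not captured.
const nonCapturing = "foobar".match(/(?:foo)(bar)/);
console.log(nonCapturing.length); // 2: whole match, "bar"
```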

Replacement

This differs from engine to engine!

"I like {{fav}} the most".replace(/\{\{fav\}\}/, "dogs");

"I like dogs the most"

"jordan@jordankasper.com"
    .replace(/([a-z]+)@([a-z.]+)/i, "https://$2/users/$1");

"https://jordankasper.com/users/jordan"
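
A few more replacement cases, sketched for JavaScript only (other engines use different syntax). The strings are made up for illustration.

```javascript
// Without the global flag, only the first match is replaced.
const firstOnly = "a-b-c".replace(/-/, "+");   // "a+b-c"
const everyOne = "a-b-c".replace(/-/g, "+");   // "a+b+c"

// $1 and $2 refer to the first and second matched groups.
const swapped = "Kasper, Jordan".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped); // "Jordan Kasper"
```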

Whitespace & Shorthands

Shorthand Example
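
A sketch of the common shorthands in JavaScript: `\d` (digit), `\w` (word character), `\s` (whitespace), and the uppercase forms that negate them. The sample strings are illustrative.

```javascript
// \d is a digit
const phone = /\d{3}-\d{4}/.test("Call 555-1234"); // true

// \s is whitespace (spaces, tabs, line breaks)
const words = "regex is  fun".split(/\s+/);        // ["regex", "is", "fun"]

// \w is a letter, digit, or underscore
const wordChars = "snake_case 42!".match(/\w+/g);  // ["snake_case", "42"]

// \D is anything that is NOT a digit
const digitsOnly = "a1b2".replace(/\D/g, "");      // "12"
```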

Anchors

^ and $
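
A minimal sketch of anchors: `^` pins the match to the start, `$` to the end. The test strings are illustrative.

```javascript
const anywhere = /foo/.test("a foobar"); // true
const atStart = /^foo/.test("a foobar"); // false
const atEnd = /bar$/.test("a foobar");   // true

// Both anchors together validate the WHOLE string.
const wholeOk = /^[0-9]{5}$/.test("20500");       // true
const wholeFail = /^[0-9]{5}$/.test("zip 20500"); // false

// With the "m" (multiline) flag, ^ and $ also match at line breaks.
const lines = "one\ntwo".match(/^[a-z]+$/gm);     // ["one", "two"]
```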

Efficiency Warnings

Watch out for:

Too many capturing groups (use non-capturing groups)
Large source texts (use anchors)
Unstructured data (like HTML)

HTML Warning

HTML is not well structured. Be careful!
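
A sketch of how this goes wrong, assuming two tiny made-up HTML snippets. The `*?` here is the lazy (shortest match) form of `*`.

```javascript
// Sibling tags: a greedy .* runs to the LAST closing tag.
const siblings = "<b>one</b> and <b>two</b>";
const greedyText = siblings.match(/<b>(.*)<\/b>/)[1];
console.log(greedyText); // "one</b> and <b>two"

// Nested tags: a lazy .*? stops at the FIRST closing tag,
// which belongs to the inner element.
const nested = "<div><div>inner</div></div>";
const lazyText = nested.match(/<div>(.*?)<\/div>/)[1];
console.log(lazyText); // "<div>inner"
```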

Resources

jordankasper.com/regex-101

Playground: regex101.com
Various Engine Info: Wikipedia
Reference: regular-expressions.info
Diagrammer: regexper.com
Dead Tree Version: "Mastering Regular Expressions"
Fun: regexcrossword.com

Look Arounds

They are not matches, but assertions.

Positive Look Ahead

/(the)(?=\sfat)/i

The fat cat sat on the mat
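
A sketch of the pattern above run in JavaScript. Only the first "The" matches, and " fat" is checked but not consumed. Variable names are illustrative.

```javascript
const lookaheadText = "The fat cat sat on the mat";
const followedByFat = lookaheadText.match(/(the)(?=\sfat)/i);
console.log(followedByFat[0]);    // "The"
console.log(followedByFat.index); // 0

// The look ahead is an assertion, so " fat" is not replaced.
const replacedAhead = lookaheadText.replace(/(the)(?=\sfat)/i, "A");
console.log(replacedAhead); // "A fat cat sat on the mat"
```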

Negative Look Ahead

/(the)(?!\sfat)/i

The fat cat sat on the mat
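
The same sentence with the negative form, sketched in JavaScript. This time the first "The" is skipped because " fat" follows it.

```javascript
const negativeText = "The fat cat sat on the mat";
const notFollowedByFat = negativeText.match(/(the)(?!\sfat)/i);
console.log(notFollowedByFat[0]);    // "the"
console.log(notFollowedByFat.index); // 19 (the "the" before "mat")
```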

Look Behind

Same idea, but looking backward instead of forward.

Not supported in all languages!
(notably: JavaScript before ES2018)

Look Behind

/(?<=the\s)([a-z]at)/i

The fat cat sat on the mat

We also have negative look behind with: (?<!)
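
A sketch of both forms, assuming a JavaScript engine with look behind support (ES2018 or later). Variable names are illustrative.

```javascript
const lookbehindText = "The fat cat sat on the mat";

// Positive look behind: "-at" words that come right after "the ".
const afterThe = lookbehindText.match(/(?<=the\s)([a-z]at)/gi);
console.log(afterThe); // ["fat", "mat"]

// Negative look behind: "-at" words NOT right after "the ".
const notAfterThe = lookbehindText.match(/(?<!the\s)([a-z]at)/gi);
console.log(notAfterThe); // ["cat", "sat"]
```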

/^[Rr]eg(ular\s)?[Ee]x(p|pressions?)?$/
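
For fun, a sketch that runs the pattern above against a few spellings. The list of names is made up for this example.

```javascript
const regexName = /^[Rr]eg(ular\s)?[Ee]x(p|pressions?)?$/;

const names = ["regex", "Regexp", "RegEx", "Regular Expression", "regular expressions"];
const allMatch = names.every((name) => regexName.test(name));
console.log(allMatch); // true

// The anchors reject anything extra at the end.
const pluralFails = regexName.test("regexes");
console.log(pluralFails); // false
```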
