I think most of us would agree that:
"Regex" is one of those concepts in computer science that, no matter how many times you practice it, you may still have to go back to your notes or Google it.
What are regex/regular expressions?
A regular expression defines a search pattern for strings. The abbreviation for regular expression is regex. The search pattern can be anything from a simple character, a fixed string or a complex expression containing special characters describing the pattern. The pattern defined by the regex may match one or several times or not at all for a given string.
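To make "may match one or several times or not at all" concrete, here is a minimal sketch using Java's `java.util.regex.Pattern` and `Matcher`. The class name `FindAllDemo` and the helper `findAll` are made up for illustration; they are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: collects every non-overlapping match of the regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {          // find() moves on to the next match, if there is one
            matches.add(m.group()); // group() returns the text that was matched
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("cat", "cat"));            // [cat]       one match
        System.out.println(findAll("cat", "cat concat dog")); // [cat, cat]  several matches
        System.out.println(findAll("cat", "dog"));            // []          no match at all
    }
}
```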
🔥 Common matching symbols:
Regular Expression | Description |
---|---|
. | Matches any single character (in Java, line terminators are excluded by default). |
^regex | Finds regex that must match at the beginning of the line. |
regex$ | Finds regex that must match at the end of the line. |
[abc] | Set definition, can match the letter a or b or c. |
[abc][vz] | Set definition, can match a or b or c followed by either v or z. |
[^abc] | When a caret appears as the first character inside square brackets, it negates the pattern. This pattern matches any character except a or b or c. |
[a-d1-7] | Ranges: matches a letter between a and d and figures from 1 to 7, but not d1. |
X\|Z | Finds X or Z. |
XZ | Finds X directly followed by Z. |
$ | Checks if a line end follows. |
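The symbols in the table above can be tried out with a small sketch like the following. The class `MatchingSymbolsDemo` and the helper `found` are hypothetical names for this example; `found` uses `Matcher.find()`, so it reports a match anywhere in the input rather than requiring the whole string to match.

```java
import java.util.regex.Pattern;

class MatchingSymbolsDemo {
    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("h.t", "hat"));            // true:  . stands for the 'a'
        System.out.println(found("^Hello", "Hello world")); // true:  "Hello" is at the beginning
        System.out.println(found("^world", "Hello world")); // false: "world" is not at the beginning
        System.out.println(found("world$", "Hello world")); // true:  "world" is at the end
        System.out.println(found("[abc][vz]", "xbz"));      // true:  b followed by z
        System.out.println(found("[^abc]", "abc"));         // false: every character is excluded
        System.out.println(found("[a-d1-7]", "x9d"));       // true:  d falls in the range a-d
        System.out.println(found("cat|dog", "hotdog"));     // true:  the "dog" alternative matches
    }
}
```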
🔥 Meta characters:
Regular Expression | Description |
---|---|
\d | Any digit, short for [0-9] |
\D | A non-digit, short for [^0-9] |
\s | A whitespace character, short for [ \t\n\x0b\r\f] |
\S | A non-whitespace character |
\w | A word character, short for [a-zA-Z_0-9] |
\W | A non-word character [^\w] |
\S+ | One or more non-whitespace characters |
\b | Matches a word boundary where a word character is [a-zA-Z0-9_] |
🎯 [NOTE]: These meta characters share their first letter with what they represent, e.g., digit, space, word, and boundary. The uppercase versions match the opposite.
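Here is a rough sketch of the meta characters in action. All the names (`MetaCharactersDemo`, `maskDigits`, `words`, `keepWordChars`, `replaceWholeWord`) are invented for this example; only `String.replaceAll`, `String.split`, `Pattern.quote`, and `Matcher.quoteReplacement` come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetaCharactersDemo {
    // \d matches any digit, so every digit becomes '#'.
    static String maskDigits(String input) {
        return input.replaceAll("\\d", "#");
    }

    // \s+ matches a run of whitespace, so this splits on spaces, tabs, and newlines.
    static String[] words(String input) {
        return input.trim().split("\\s+");
    }

    // \W matches any non-word character; removing them keeps letters, digits, and '_'.
    static String keepWordChars(String input) {
        return input.replaceAll("\\W", "");
    }

    // \b on both sides restricts the match to whole words.
    static String replaceWholeWord(String input, String word, String replacement) {
        String regex = "\\b" + Pattern.quote(word) + "\\b";
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Room 101"));                           // Room ###
        System.out.println(Arrays.toString(words("  one two\tthree ")));      // [one, two, three]
        System.out.println(keepWordChars("snake_case-1!"));                   // snake_case1
        System.out.println(replaceWholeWord("cat concat cat", "cat", "dog")); // dog concat dog
    }
}
```

Note how `\b` leaves "concat" alone: there is no word boundary between the "n" and the "c".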
🔥 Quantifiers:
A quantifier defines how often an element can occur. The symbols ?, *, + and {} are quantifiers.
Regular Expression | Description |
---|---|
* | Occurs zero or more times, is short for {0,} |
+ | Occurs one or more times, is short for {1,} |
? | Occurs zero or one time; short for {0,1}. |
{X} | Occurs exactly X times; {} specifies how many times the preceding element repeats. |
{X,Y} | Occurs between X and Y times (inclusive). |
*? | A ? after a quantifier makes it a reluctant quantifier. It tries to find the smallest match, so the match ends at the earliest point where the rest of the pattern can succeed. |
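The difference between a greedy and a reluctant quantifier is easiest to see side by side. The sketch below is only an illustration; `QuantifierDemo` and `firstMatch` are hypothetical names, and `firstMatch` returns `null` when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match of the regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("<.*>", "<b>bold</b>"));  // <b>bold</b>  greedy: as much as possible
        System.out.println(firstMatch("<.*?>", "<b>bold</b>")); // <b>          reluctant: as little as possible
        System.out.println(firstMatch("\\d{3}", "12345"));      // 123          exactly three digits
        System.out.println(firstMatch("\\d{2,4}", "12345"));    // 1234         between two and four digits
        System.out.println(firstMatch("colou?r", "color"));     // color        the 'u' is optional
        System.out.println(firstMatch("ab*", "a"));             // a            zero b's is fine for *
        System.out.println(firstMatch("ab+", "a"));             // null         + needs at least one b
    }
}
```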
🎯 [NOTE] - The backslash \ is an escape character in Java strings, which means it has a predefined meaning in Java source code. You have to write a double backslash \\ to get a single backslash in the string. If you want to define \w, you must write \\w in your regex string. If you want to match a backslash as a literal, you have to type \\\\, as \ is also an escape character in regular expressions.
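The double-escaping rule is easier to trust once you see it run. This is a small sketch under the assumption of a Windows-style path as sample data; `BackslashDemo`, `isWord`, and `toForwardSlashes` are names invented for the example.

```java
class BackslashDemo {
    // The Java literal "\\w+" produces the regex \w+ (one or more word characters).
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    // The Java literal "\\\\" produces the regex \\, which matches one literal backslash.
    static String toForwardSlashes(String path) {
        return path.replaceAll("\\\\", "/");
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\notes.txt";        // the actual string is C:\temp\notes.txt
        System.out.println(path);                   // C:\temp\notes.txt
        System.out.println(isWord("abc_123"));      // true
        System.out.println(isWord("abc 123"));      // false: the space is not a word character
        System.out.println(toForwardSlashes(path)); // C:/temp/notes.txt
    }
}
```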
🔥 Predefined methods on String for processing regular expressions.
Method | Description |
---|---|
s.matches("regex") | Evaluates if "regex" matches s. Returns true only if the WHOLE string can be matched. |
s.split("regex") | Creates an array with substrings of s divided at each occurrence of "regex". "regex" is not included in the result. |
s.replaceFirst("regex", "replacement") | Replaces the first occurrence of "regex" with "replacement". |
s.replaceAll("regex", "replacement") | Replaces all occurrences of "regex" with "replacement". |
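The four methods above can be compared in one short sketch. The wrapper class `StringRegexMethodsDemo` and its helper names are made up for this example; the `String` methods they call are the real JDK ones.

```java
import java.util.Arrays;

class StringRegexMethodsDemo {
    // matches() succeeds only if the WHOLE string fits the regex.
    static boolean isAllDigits(String s) {
        return s.matches("\\d+");
    }

    // split() cuts at every comma and drops the whitespace around it.
    static String[] splitOnCommas(String s) {
        return s.split("\\s*,\\s*");
    }

    // replaceFirst() touches only the first run of digits.
    static String maskFirstNumber(String s) {
        return s.replaceFirst("\\d+", "#");
    }

    // replaceAll() touches every run of digits.
    static String maskAllNumbers(String s) {
        return s.replaceAll("\\d+", "#");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2024"));                       // true
        System.out.println(isAllDigits("year 2024"));                  // false: "year " is not digits
        System.out.println(Arrays.toString(splitOnCommas("a, b ,c"))); // [a, b, c]
        System.out.println(maskFirstNumber("a1b22c"));                 // a#b22c
        System.out.println(maskAllNumbers("a1b22c"));                  // a#b#c
    }
}
```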
Practice your Regex Skills here
Cheatsheet
Some resources gathered from the discussions panel:
regex101
regexr
ihateregex.io
Thank you, Madza and Jakeer, for the links.
Some of my other posts:
Concept | link |
---|---|
Java Access Modifiers | goto Article |
Java Generics | goto Article |
Java Regex | goto Article |
Java Streams Api | goto Article |
Please leave a ❤️ if you liked this post!
A 🦄 would be great!
And feel free to let me know in the discussions if you think I missed something.
HAVE A GOOD DAY!
Top comments (9)
To me regex has always been regex101.com and regexr.com 🤣🤣
ihateregex.io
Have you tried this
Thanks for it
Thanks dude
that one too, yes :)
Thank you for this. Very helpful.
What regex? There's more different syntaxes for regular expressions than I can count; something that works in perl might not work in vim; grep allows selecting one of several regex flavours, etc.
Interesting article. A nice regex visualizer / tester: extendsclass.com/regex-tester.html
Salam,
Thanks, helpful