DEV Community

Ridwan Ajibola


🚀 Human-Regex: Write Readable Regular Expressions Like English

Created by Ridwan Ajibola

Sick of trying to understand those confusing regex patterns? Let's change that.

// Before: Cryptic regex for password validation  
const passwordRegex = /^(?=.*\d)(?=.*[!@#$%^&*])(?=.*[a-zA-Z]).{8,}$/;

// After: Human-readable with human-regex  
const humanPasswordRegex = createRegex()
  .startAnchor()
  .hasDigit()          // (?=.*\d)
  .hasSpecialChar()    // (?=.*[!@#$%^&*])
  .hasLetter()         // (?=.*[a-zA-Z])
  .anyCharacter()      // .
  .atLeast(8)          // {8,}
  .endAnchor()
  .toRegExp();
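
To see what the three lookaheads in the "Before" pattern actually enforce, here is a small sketch that runs the plain regex against a few sample strings. It uses only the built-in RegExp, so it runs without the library; the sample passwords are made up for illustration.

```javascript
// Plain RegExp copied from the "Before" snippet; no library required.
const passwordRegex = /^(?=.*\d)(?=.*[!@#$%^&*])(?=.*[a-zA-Z]).{8,}$/;

// Hypothetical sample inputs, chosen to trip one rule each.
const samples = [
  "abc123!x",  // digit + special + letter, 8 chars
  "abcdefgh",  // no digit, no special character
  "12345678!", // no letter
  "a1!",       // too short
];

const results = samples.map((s) => passwordRegex.test(s));
console.log(results); // [ true, false, false, false ]
```

Each `(?=...)` lookahead checks one requirement without consuming input, and `.{8,}` then enforces the minimum length.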

🔗 GitHub | 📦 npm


Why I Built This

Regex is powerful, but let’s be honest: patterns like /^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$/ look like someone fell asleep on their keyboard. When I first learned JavaScript, regex was my #1 frustration – and every developer I asked shared this pain.

So I asked: What if we could write regex in plain English?

That is how Human-Regex, created by Ridwan Ajibola, was born: a utility library that turns regex patterns into chainable, self-documenting code.


How It Works

Traditional Regex → Human-Regex

| Traditional Regex | Human-Regex Equivalent |
| --- | --- |
| /\d/ | .digit() |
| /[a-zA-Z]/ | .letter() |
| /(?=.*\d)/ | .hasDigit() |
| ^ / $ | .startAnchor() / .endAnchor() |

const emailRegex = createRegex()
  .startAnchor()
  .word().oneOrMore()        // [a-zA-Z0-9_]+
  .literal('@')              // @
  .word().oneOrMore()        // [a-zA-Z0-9_]+
  .literal('.')              // \. (an escaped, literal dot)
  .letter().atLeast(2)       // [a-zA-Z]{2,}
  .endAnchor()
  .toRegExp();

No more guessing what [a-zA-Z0-9_] or {2,} means. The code explains itself.
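
If you are curious how a chainable builder like this can work under the hood, here is a minimal sketch. It is a hypothetical, stripped-down class (`MiniRegexBuilder` is an invented name), not the actual human-regex source: each method appends a regex fragment and returns `this`, which is all that chaining needs.

```javascript
// Hypothetical mini builder for illustration only; NOT the human-regex source.
class MiniRegexBuilder {
  constructor() {
    this.parts = []; // regex fragments, joined at the end
  }
  add(fragment) {
    this.parts.push(fragment);
    return this; // returning `this` is what makes calls chainable
  }
  startAnchor() { return this.add("^"); }
  endAnchor() { return this.add("$"); }
  word() { return this.add("\\w"); }
  letter() { return this.add("[a-zA-Z]"); }
  literal(text) {
    // Escape regex metacharacters so the text is matched verbatim.
    return this.add(text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  }
  oneOrMore() { return this.add("+"); }
  atLeast(n) { return this.add(`{${n},}`); }
  toRegExp(flags = "") {
    return new RegExp(this.parts.join(""), flags);
  }
}

const emailLike = new MiniRegexBuilder()
  .startAnchor()
  .word().oneOrMore()
  .literal("@")
  .word().oneOrMore()
  .literal(".")
  .letter().atLeast(2)
  .endAnchor()
  .toRegExp();

console.log(emailLike.source);                   // ^\w+@\w+\.[a-zA-Z]{2,}$
console.log(emailLike.test("user@example.com")); // true
console.log(emailLike.test("user@example"));     // false
```

Note that this simplified pattern rejects valid addresses such as `first.last@example.com`, because `\w` does not include a dot. Treat it as a teaching example, not a production email validator.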


Key Features

✅ Readable Syntax
Methods like .hasDigit() and .startAnchor() make patterns self-documenting.

✅ Chainable Design
Build complex patterns step-by-step, just like writing a sentence.

✅ Lightweight
Just 1.0 kB gzipped (1.4 kB in the newer version).

✅ Open Source
MIT licensed. Contributions welcome!


Get Started in 60 Seconds

Install the library:

   npm install human-regex

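Build your first regex:

// Assumes createRegex is a named export of the package.
import { createRegex } from "human-regex";

const urlRegex = createRegex()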
  .startAnchor()
  .protocol()     // https?://
  .www().optional() // (www\.)?
  .word().oneOrMore()
  .literal('.')
  .tld()          // com|org|net
  .toRegExp();
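
For reference, here is the raw pattern that the inline comments above suggest, written as a plain RegExp. This is an assumption based on those comments (`https?://`, `(www\.)?`, `com|org|net`), not output captured from the library, and the sample URLs are made up.

```javascript
// Pattern assembled by hand from the comments in the builder snippet.
// Assumption: the builder emits roughly these fragments, in this order.
const urlPrefix = /^https?:\/\/(www\.)?\w+\.(com|org|net)/;

console.log(urlPrefix.test("https://www.example.com")); // true
console.log(urlPrefix.test("http://example.org/path")); // true (no end anchor)
console.log(urlPrefix.test("https://example.io"));      // false ("io" is not listed)
console.log(urlPrefix.test("ftp://example.com"));       // false (wrong protocol)
```

Two things stand out: without `.endAnchor()` the pattern only checks the start of the string, and a TLD outside the short `com|org|net` list will not match.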

Why This Matters

Onboarding: New developers understand regex logic at a glance.

Maintenance: No more regex archaeology to update old code.

Collaboration: Teams spend less time decoding patterns.


Join the Movement

Regex doesn’t have to be scary. Try Human-Regex and:
⭐ Star the GitHub repo; it really helps! 🚀
🐞 Report bugs or request features
💡 Contribute code or documentation

Let’s make regex accessible to everyone!

Top comments (23)

Peter Vivo

Good work, but I am missing a capitalLetter() function: as written, the password example can be passed without any capital letter.

If you try to build a harder regexp example, you are sure to find a few more missing functions.

Check this code: github.com/Pengeszikra/flogon-gala...
It is the markdown parser part of this game: dev.to/pengeszikra/javascript-grea... Try to recreate those regexps with your module and you will find what is missing.
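
The gap described here is easy to reproduce with plain RegExp. The sketch below compares the article's password pattern with a stricter variant that adds an uppercase lookahead. The stricter pattern is an illustration only and does not correspond to any confirmed human-regex method.

```javascript
// Pattern from the article: any letter satisfies the letter rule.
const articlePassword = /^(?=.*\d)(?=.*[!@#$%^&*])(?=.*[a-zA-Z]).{8,}$/;

// Illustrative stricter variant: separate lowercase and uppercase lookaheads.
const strictPassword = /^(?=.*\d)(?=.*[!@#$%^&*])(?=.*[a-z])(?=.*[A-Z]).{8,}$/;

console.log(articlePassword.test("abc123!x")); // true (no capital needed)
console.log(strictPassword.test("abc123!x"));  // false
console.log(strictPassword.test("Abc123!x"));  // true
```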

Ridwan Ajibola

Good! I will make sure to include the missing methods in the new release.

Thanks

Manuchehr

Skill issues to be honest

Ridwan Ajibola

Thanks

Ridwan Ajibola

I’d appreciate it if you could star the repo.

Mohammed Samgan Khan

this is cool man, like really cool...

Ridwan Ajibola

Thanks a lot @msamgan! Your contributions are really appreciated. If you find this project useful, it would mean a lot if you could star the repo, it helps others discover it too!

Ridwan Ajibola

Thanks a lot @msamgan, I’d appreciate it if you could star the repo.

Wizard

Man, this is just amazing, gonna start using this, big applause

Ridwan Ajibola

Really appreciate it. I’d love it if you could star the repo; it helps a lot!

Mohamed Nabous • Edited

Finally, I won't get the excuse from my peers that regex is too hard!!

Great work!!!

Ridwan Ajibola

I’d really appreciate it if you could star the repo—it would be a huge help!

Thanks!

Adesina Abdulraheem

Wow! This is a great discovery and an insight into the developer's world!

Ridwan Ajibola

Thanks

Ridwan Ajibola

I’d appreciate it if you could star the repository

hafiz abdullahi

Great work, sir.

Ridwan Ajibola

Thanks

TextBrew

Like this project!

Paweł bbkr Pabian • Edited

I see a bug / inconsistency:

If .digit() maps to \d, then it corresponds to the Decimal_Number Unicode property. For example (I'm not familiar with JS, so I'll use Raku):

$ raku -e 'say "1๖" ~~ /\d+/;'   # DIGIT ONE and THAI DIGIT SIX codepoints
「1๖」

$ raku -e 'say "1๖" ~~ /<:Decimal_Number>+/;' # same character class
「1๖」

$ raku -e 'say "1๖" ~~ /<:digit>+/;' # also the same
「1๖」


Then, to be consistent, .letter() should map to the Letter Unicode property, not a-z. For example:

$ raku -e 'say "aت" ~~ /<:Letter>+/';    #  LATIN SMALL LETTER A and ARABIC LETTER TEH
「aت」

You should not make implicit ASCII / non-ASCII assumptions, where one method works differently than the other sibling.
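
For JavaScript readers, here is how the same distinction looks in that language. In JavaScript, `\d` matches only ASCII `0-9`, with or without the `u` flag; Unicode-aware matching requires property escapes such as `\p{Nd}` and `\p{L}` together with the `u` flag. The snippet uses only built-in RegExp and does not call the library; the inputs mirror the Raku examples above, written as escapes.

```javascript
const thaiSix = "1\u0E56"; // DIGIT ONE + THAI DIGIT SIX
const arabicTeh = "a\u062A"; // LATIN SMALL LETTER A + ARABIC LETTER TEH

const asciiDigits = /^\d+$/;         // \d is [0-9] in JavaScript
const unicodeDigits = /^\p{Nd}+$/u;  // Decimal_Number property
const asciiLetters = /^[a-zA-Z]+$/;
const unicodeLetters = /^\p{L}+$/u;  // Letter property

console.log(asciiDigits.test(thaiSix));      // false
console.log(unicodeDigits.test(thaiSix));    // true
console.log(asciiLetters.test(arabicTeh));   // false
console.log(unicodeLetters.test(arabicTeh)); // true
```

So in JavaScript specifically, `\d` and `[a-zA-Z]` are both ASCII-only, while the Unicode property escapes are the explicit opt-in.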

Another bug you have is anchoring:

$ perl -E 'say "match" if "a\n" =~ /a$/' # oops!
match

Token $ means end of logical string. What you are probably looking for is \z:

$ perl -E 'say "match" if "a" =~ /a\z/'
match

$ perl -E 'say "match" if "a\n" =~ /a\z/'
$  # no match, most likely expected result
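
For comparison, here is how `$` behaves in JavaScript, which has no `\z`. Without the `m` flag, `$` matches only at the very end of the input, so a trailing newline is not ignored. With the `m` flag, `$` also matches before a line terminator. The lookahead `(?![\s\S])` is one common way to assert the absolute end of input regardless of flags. This is plain RegExp behavior, not a claim about what the library emits.

```javascript
const endsWithA = /a$/;              // no m flag: $ is end of input only
const endsLineWithA = /a$/m;         // m flag: $ also matches before "\n"
const strictEnd = /a(?![\s\S])/m;    // absolute end of input, even with m

console.log(endsWithA.test("a"));       // true
console.log(endsWithA.test("a\n"));     // false
console.log(endsLineWithA.test("a\n")); // true
console.log(strictEnd.test("a\n"));     // false
console.log(strictEnd.test("a"));       // true
```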

I don't want to discourage you, but I really dislike those "regex to human" modules. They make code crazy error-prone, because, as I have just shown, you don't see explicitly what you are matching. Things get worse when you are working on a multi-language stack and you want to exchange your PCRE regexps with someone using another language. Basically all the "Why This Matters" points are just the opposite: new developers will not understand regexes better, there will be more archaeology because you will need to decipher an additional layer of abstraction, and collaboration will be more difficult.

My advice would be to stick directly (or at least closely) to Unicode properties. Drop the ambiguous letter() method and add Uppercase_Letter(), mapping directly to the Lu property. Then build modifiers on top of that, like Uppercase_Letter('ascii') or Uppercase_Letter('script'=>'Latin'). Otherwise this will be a false friend: a module that is supposed to make your life easier but introduces weird errors and security risks because it hides too many assumptions under the hood.

Ridwan Ajibola

This is explicitly for JavaScript, not for other languages.

Paweł bbkr Pabian • Edited

Sure. I pointed out universal issues. Imagine a developer joining a project that uses this module. If they already have regular expression experience, this interface will be confusing, because your assumption of what "letter" or "endAnchor" means is completely different from what those things mean in terms of Unicode properties and the PCRE standard.

The same goes for "tld". Your module does not match TLDs. It only matches what you consider to be a TLD: exactly 3 items out of 1589 currently known TLDs, so right out of the box it has a 99.81% failure rate.

I'm not trying to be mean, I'm just saying that pseudo-standards or partially implemented specs are universally bad and sooner or later backfire in every project.

Jonas Scholz

I never really understood what's so hard about regex. I think if you just learn the grammar once, you don't really forget it anymore, and it's perfectly understandable. Nice idea anyway :)

Ridwan Ajibola

Thanks Jonas