Yuan Gao

Posted on Dec 19, 2020 • Edited on Dec 22, 2020

Advent of Code 2020: Day 19 abusing PEG grammar in Python the way it's not supposed to

#adventofcode #python

Another quick one today.

The Challenge Part 1

Link to challenge on Advent of Code 2020 website

Today's challenge provides a list of rules to parse and apply to messag... wait why does this seem so familiar? Oh, it's because the way the challenge describes rules is exactly the same way PEG grammars are written, which we just used in Day 18 (and Day 07, and Day 04).

Essentially the challenge wants you to implement expression grammar, the exact thing we were writing to solve previous day's challenges whenever we used PEG.

An example of the rules given was:

0: 4 1 5
1: 2 3 | 3 2
2: 4 4 | 5 5
3: 4 5 | 5 4
4: "a"
5: "b"

PEG

So this got me thinking... these rules look a LOT like PEG, what if we just turned it into PEG? The particular flavour of PEG that Parsimonious uses would look like this:

RULE0 = RULE4 RULE1 RULE5
RULE1 = ((RULE2 RULE3) / (RULE3 RULE2))
RULE2 = ((RULE4 RULE4) / (RULE5 RULE5))
RULE3 = ((RULE4 RULE5) / (RULE5 RULE4))
RULE4 = "a"
RULE5 = "b"

So, we can just write a function that does this conversion!

import re
def convert(line):
    line = line.strip().replace(":", " =")
    line = re.sub(r"(\d+)", r"RULE\1", line)
    line = re.sub(r"= (.*) \| (.*)$", r"= ((\1) / (\2))", line)
    return line

Now, we can convert the challenge rules to PEG, and have Parsimonious use it:

from parsimonious.grammar import Grammar

data = open("input.txt").read().splitlines()
rules = data[:137]

rules.sort(key = lambda x: int(x.split(":")[0]))
converted = "\n".join(map(convert, rules))
grammar = Grammar(converted)

With grammar now containing the challenge rules, we can run it on inputs:

grammar.parse("aaaabb")

If this is successful, it returns a bunch of node information. If there's a string that isn't valid, it raises a ParseError

So we just need to wrap this in a function that catches this ParseError and returns True if successful, False if not:

from parsimonious import ParseError
def parse(line):
    try:
        grammar.parse(line)
    except ParseError:
        return False
    return True

That's it! The whole code is:

from parsimonious.grammar import Grammar
from parsimonious import ParseError
import re

def convert(line):
    line = line.strip().replace(":", " =")
    line = re.sub(r"(\d+)", r"RULE\1", line)
    line = re.sub(r"= (.*) \| (.*)$", r"= ((\1) / (\2))", line)
    return line

data = open("input.txt").read().splitlines()
rules, messages = data[:137], data[138:]

rules.sort(key = lambda x: int(x.split(":")[0]))
converted = "\n".join(map(convert, rules))
grammar = Grammar(converted)

def parse(line):
    try:
        grammar.parse(line)
    except ParseError:
        return False
    return True

print("sum", sum(map(parse, messages)))

Surprisingly short for this late. But perhaps we did something we weren't supposed to. Sneaky.

The Challenge Part 2

In the second part of the challenge, a couple rules are switched over to create loops. Since we've sorted the rules anyway, swapping them out is easy:

rules[8] = "8: 42 | 42 8"
rules[11] = "11: 42 31 | 42 11 31"

Finally, there's a slight issue with how PEG deals with recursive matching, and it's a similar problem to the one discussed in the challenge text itself, and so we just have to insert a little manual workaround on rule 31 due to the way it tries to consume the entire message and fail to return.

The entire code looks like this:

from parsimonious.grammar import Grammar
from parsimonious import ParseError
import re

def convert(line):
    line = line.strip().replace(":", " =")
    line = re.sub(r"(\d+)", r"RULE\1", line)
    line = re.sub(r"= (.*) \| (.*)$", r"= ((\1) / (\2))", line)
    return line

data = open("input.txt").read().splitlines()
rules, messages = data[:137], data[138:]

rules.sort(key = lambda x: int(x.split(":")[0]))
rules[8] = "8: 42 | 42 8"
rules[11] = "11: 42 31 | 42 11 31"
converted = "\n".join(map(convert, rules))
grammar = Grammar(converted)

def parse(line):
    try:
        grammar.parse(line)
    except ParseError:
        if "RULE31" in repr(err):
            return True
        return False
    return True

print("sum", sum(map(parse, messages)))

I'm not too happy with the RULE31 hack, I wonder if other PEG grammars or EBNF parsers would have the same issue. However, the solution was quick, and I didn't have to write any parsers by hand.

Onward!

DEV Community

Advent of Code 2020: Day 19 abusing PEG grammar in Python the way it's not supposed to

The Challenge Part 1

PEG

The Challenge Part 2

Top comments (0)

Read next

Building SaaS Faster with Ercas for SaaS: A Template for Indie Hackers

Trusted publishing ‐ It has never been easier to publish your python packages

Navigating the world of Harry Potter with Knowledge Graphs

Transform Any Image into a Sketch with Python 🚀