DEV Community

Peter Strøiman


Go-DOM - A headless browser written in Go.

Having too little to do sometimes results in a crazy idea, and this time it was to write a headless browser in Go, with a full DOM implementation and JavaScript support, by embedding the V8 engine.

It all started while I was writing an HTMX application; the need to test it made me curious whether there was a pure Go implementation of a headless browser.

Searching for "go headless browser" only returned results about automating a headless browser, i.e. using a real browser like Chrome or Firefox in headless mode.

But nothing in pure Go.

So I started building one.

Why A Headless Browser in Go?

It may seem silly: a hand-written headless browser will never behave exactly like a real browser, so it cannot verify that your application works correctly in all the browsers you have decided to support. Nor does it give you nice features such as screenshots of the application when things stop working.

So why then?

The Need for Speed!

To work in an effective TDD loop, tests must be fast. Slow test execution discourages TDD, and you lose the efficiency benefits a fast feedback loop provides.

Browser automation adds severe overhead to this type of verification, so such tests are typically written after the code. They then no longer help you write the correct implementation; they are reduced to an after-the-fact maintenance burden that only occasionally detects a bug before your paying customers do.

The goal is to create a tool that supports a TDD process. To be usable, it needs to run in-process.

It needs to be written in Go.

Less Flaky Tests

Having the DOM in-process makes it possible to write better wrappers on top of the DOM, which can provide a less erratic interface for your tests, like testing-library does for JavaScript.

Rather than depending on CSS classnames, element IDs, or DOM structure, you write your tests in a user-centric language, like this.

Type "me@example.com" in the textbox that has the label, "Email"

Or in hypothetical code.

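// Hypothetical API. Method and field names are capitalised because
// `type` is a reserved word in Go.
testing.GetElement(Query{
  Role: "textbox",
  // The accessibility "name" of a textbox _is_ the label
  Name: "Email",
}).Type("me@example.com")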

This test doesn't care if the label is implemented as <label for="...">, <input aria-label="Email">, or <input aria-labelledby="email-label">.

This decouples verification of behaviour from UI changes, but it does enforce that the text "Email" is associated with the input field in an accessible way. This couples the test to how users interact with the page, including those who rely on screen readers.

This achieves the most important aspect of TDD: writing tests coupled to concrete behaviour. [1]

Although it's probably technically possible to write the same tests for an out-of-process browser, the benefit of native code is essential for the kind of random access to the DOM that these helpers most likely need.
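To illustrate why such a query is insensitive to the markup, here is a minimal sketch. It assumes a hypothetical in-memory `Element` type and a heavily simplified accessible-name lookup (aria-labelledby with a single id, then aria-label, then a matching label element). None of these names come from go-dom; they only show the kind of random DOM access such a helper needs.

```go
package main

import "fmt"

// Element is a hypothetical, minimal DOM node used only for this sketch.
type Element struct {
	Tag      string
	Attrs    map[string]string
	Text     string
	Children []*Element
}

// walk calls fn for e and every descendant, depth first.
func walk(e *Element, fn func(*Element)) {
	fn(e)
	for _, c := range e.Children {
		walk(c, fn)
	}
}

// byID returns the first element in the tree with the given id, or nil.
func byID(root *Element, id string) *Element {
	var found *Element
	walk(root, func(e *Element) {
		if found == nil && e.Attrs["id"] == id {
			found = e
		}
	})
	return found
}

// AccessibleName resolves a simplified accessible name for an input:
// aria-labelledby (single id only), then aria-label, then <label for="...">.
func AccessibleName(root, input *Element) string {
	if ref := input.Attrs["aria-labelledby"]; ref != "" {
		if target := byID(root, ref); target != nil {
			return target.Text
		}
	}
	if name := input.Attrs["aria-label"]; name != "" {
		return name
	}
	name := ""
	if id := input.Attrs["id"]; id != "" {
		walk(root, func(e *Element) {
			if name == "" && e.Tag == "label" && e.Attrs["for"] == id {
				name = e.Text
			}
		})
	}
	return name
}

// GetTextbox returns the first <input> whose accessible name equals name.
func GetTextbox(root *Element, name string) (*Element, error) {
	var match *Element
	walk(root, func(e *Element) {
		if match == nil && e.Tag == "input" && AccessibleName(root, e) == name {
			match = e
		}
	})
	if match == nil {
		return nil, fmt.Errorf("no textbox named %q", name)
	}
	return match, nil
}

func main() {
	// Two different ways of labelling the same field.
	variants := []*Element{
		{Tag: "form", Children: []*Element{
			{Tag: "label", Attrs: map[string]string{"for": "email"}, Text: "Email"},
			{Tag: "input", Attrs: map[string]string{"id": "email"}},
		}},
		{Tag: "form", Children: []*Element{
			{Tag: "input", Attrs: map[string]string{"aria-label": "Email"}},
		}},
	}
	for _, form := range variants {
		_, err := GetTextbox(form, "Email")
		fmt.Println("found:", err == nil)
	}
}
```

The same `GetTextbox` call succeeds for both variants, so a test written against it survives a change from one labelling technique to another.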

An example: JavaScript

To exemplify the type of test, I will use a similar example from a JavaScript project, also an application using HTMX. The test verifies a general login flow, starting from a request for a page that requires authentication.

It's a bit long, as I've combined all setup and helper code in one test function here.

it("Redirects to /local after a successful login", async () => {
  // Setup - stub the authentication, and create a stubbed user
  // using a test helper
  sinon
    .stub(auth, "authenticate")
    .withArgs({
      email: "jd@example.com",
      // matchPassword helper is used, as passwords are wrapped in a class
      // preventing accidental disclosure in logs, console out, etc.
      password: matchPassword("s3cret"),
    })
    .resolves(
      auth.AuthenticateResult.success(createUser({ firstName: "John" })),
    );
  const url = `http://127.0.0.1:${port}/auth/login?redirectUrl=%2Flocal`;
  // Request private page. This _should_ generate a redirect
  const wrapper = await DOMWrapper.open(url); // Just a helper around jsdom
  const browser = wrapper.browser;
  // Once HTMX is ready, it emits an `htmx:load` event. Then verify that it was 
  // correctly redirected.
  await wrapper.waitFor("htmx:load");
  expect(wrapper.url.pathname).to.equal("/auth/login");
  // Use testing-library to fill out and submit the form
  let screen = wrapper.screen;
  const username = screen.getByRole("textbox", { name: "Email" });
  const password = screen.getByLabelText("Password");
  await userEvent.type(username, "jd@example.com");
  await userEvent.type(password, "s3cret"); // password has no role
  // Wait for a new `htmx:load` event, while clicking the submit button
  // at the same time.
  await wrapper.runAndWaitFor(
    ["htmx:load"],
    userEvent.click(screen.getByRole("button", { name: "Sign in" })),
  );
  // After the new page has been loaded, verify that the username
  // is displayed (i.e. the stubbed user is used), and the correct
  // URL is used.
  screen = testingLibrary.within(browser.window.document.body);
  const heading = screen.getByRole("heading", { level: 1 });
  expect(heading.innerHTML).to.equal("Hi, John");
  expect(wrapper.url.pathname).to.equal("/local");
});

In simple terms the test does the following:

  1. Stub out the authentication function, simulating a successful response.
  2. Request a page that requires authentication.
  3. Verify that the browser redirects to the login page, and the browser URL is updated. [2]
  4. Fill out the form with the expected values, and submit.
  5. Verify that the browser redirects to the originally requested page, and it shows information for the stubbed user.

Internally, the test starts an HTTP server. Because this runs in the test process, mocking and stubbing of business logic is possible. The test uses jsdom to communicate with the HTTP server; jsdom both parses the HTML response into a DOM and executes client-side script in a sandbox that has been initialised, e.g. with window as the global scope. [3]

This enables writing tests of the HTTP layer where validating the contents of the response is not enough; in this case, verifying that the response is processed by HTMX as intended.

But apart from waiting for some HTMX events, so as not to proceed too early (or too late), the test doesn't actually care about HTMX. In fact, if I remove HTMX from the form, resorting to classical redirects, the test still passes.

(If I remove the HTMX <script> tag completely, the test will time out waiting for HTMX events.)

Speed? Check!

While the previous test is a little slower than desired, it's reasonably fast, typically completing in 150-180ms. This is far too slow for the majority of the test suite, but it's fast enough to serve as a feedback loop while working on that particular feature.

Tests like this are not part of a normal TDD run. They are run when I work on that feature, or before committing to ensure nothing broke, which is a completely normal way to deal with "slow tests".

Potential Speed Improvements

The JavaScript example uses a real HTTP server started on a random port. The server runs in the process of the test runner, which is why we can stub and mock business logic.

In Go, HTTP requests are handled by an http.Handler, making it very easy to consume the HTTP handling logic without actually launching an HTTP server.

This is something the go-dom code already handles, and the test suite currently runs in zero milliseconds, rounded to the nearest millisecond. [4]
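As a rough sketch of what consuming an http.Handler without a server looks like, the following uses only the standard library's net/http/httptest package. The `newHandler` and `fetch` functions and the route are hypothetical, made up for illustration; they are not taken from go-dom.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newHandler builds a hypothetical application handler for this sketch.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/auth/login", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html")
		fmt.Fprint(w, "<html></html>")
	})
	return mux
}

// fetch invokes the handler directly; no listener or port is involved.
func fetch(h http.Handler, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	res := rec.Result()
	defer res.Body.Close()
	body, _ := io.ReadAll(res.Body)
	return res.StatusCode, string(body)
}

func main() {
	status, body := fetch(newHandler(), "/auth/login")
	fmt.Println(status, body)
}
```

The request never touches the network stack, which is where the speed difference compared to a server on a random port comes from.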

Mocking and Parallel Tests?

The ability to run tests in parallel depends only on your code's ability to run in parallel. As go-dom can consume an http.Handler, each test can create its own handler, each with different dependencies replaced by test doubles as fits the individual test.

This allows you to test the HTTP layer as a whole; using stubbed business logic.
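A minimal sketch of that idea, assuming a hypothetical `Authenticator` dependency and a `NewHandler` constructor that accepts it (both invented for this example). Each caller builds its own handler around its own stub, so nothing is shared between concurrent runs.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// Authenticator is a hypothetical business-logic dependency of the HTTP layer.
type Authenticator interface {
	Authenticate(email, password string) bool
}

// stubAuth is a test double that always returns a fixed result.
type stubAuth struct{ ok bool }

func (s stubAuth) Authenticate(email, password string) bool { return s.ok }

// NewHandler wires the HTTP layer to whichever Authenticator it is given.
func NewHandler(auth Authenticator) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/auth/login", func(w http.ResponseWriter, r *http.Request) {
		if auth.Authenticate(r.FormValue("email"), r.FormValue("password")) {
			http.Redirect(w, r, "/local", http.StatusSeeOther)
			return
		}
		w.WriteHeader(http.StatusUnauthorized)
		fmt.Fprint(w, "invalid credentials")
	})
	return mux
}

// login posts credentials straight to the handler and returns the status code.
func login(h http.Handler) int {
	body := strings.NewReader("email=jd%40example.com&password=s3cret")
	req := httptest.NewRequest(http.MethodPost, "/auth/login", body)
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

// runConcurrently gives every stub its own handler and runs them in parallel.
func runConcurrently(stubs []stubAuth) []int {
	codes := make([]int, len(stubs))
	var wg sync.WaitGroup
	for i, s := range stubs {
		wg.Add(1)
		go func(i int, s stubAuth) {
			defer wg.Done()
			codes[i] = login(NewHandler(s))
		}(i, s)
	}
	wg.Wait()
	return codes
}

func main() {
	fmt.Println(runConcurrently([]stubAuth{{ok: true}, {ok: false}}))
}
```

In a real test suite the goroutines would be replaced by separate tests that each call `t.Parallel()`; the point is only that the handler and its test doubles are created per test rather than shared.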

Current State of the Project?

Close to nothing is implemented; the current state is the result of about one and a half days' work. I have a basic streaming tokeniser that can consume an HTTP response stream, which is passed to a parser that returns a Node.

The code can currently process the string <html></html> (no spaces allowed yet) into an HTMLHtmlElement.

Next steps are

  • Improve the parser just slightly, and get a few more element types implemented
  • Embed the V8 engine, addressing the primary uncertainties: how Go objects are made accessible to JavaScript, and how Go code can inspect the result of mutations from JavaScript code.

Future of the Project?

This will most likely die :(

I am not even working on a Go project where this would be valuable (I was working on a Node.js project). It was the joy of seeing how jsdom served as a feedback loop for the authentication flow that sparked a stupid idea that was fun to pursue. And as a person with ADHD, this is a typical pattern for me: I start something that is fun and work on it until something else hits the radar and intrigues me.

Unless ...

Other developers think this is a good idea and want to help build this.

I believe that such a tool would be extremely helpful for any Go project combining server-side rendering with client-side scripts, including HTMX-based applications.

The project is found here: https://github.com/stroiman/go-dom


  1. The goal of TDD is not to write unit tests. That is an extremely common, but utterly incorrect, belief.

  2. The important part is not that we are redirected, but that the browser history has the correct entries, providing sensible behaviour for the browser's back/forward functionality. The test should really have verified the contents of history, or perhaps even actively used the navigation API to go back and forth. As such, the test could be improved with regard to describing expected behaviour from the user's point of view.

  3. Writing a headless browser in JavaScript has an unfair advantage, as the simulated DOM objects are already valid JavaScript objects. In Go, extra work is necessary to allow client-side scripts to mutate the DOM, and to make the outcome accessible in test code.

  4. As reported by Ginkgo. This doesn't include the overhead of building and launching, which adds a noticeable but very short delay.
