Meursyphus

"Performance Has Improved!" Can You Really Prove It?

Every frontend developer has experienced this situation:
"I optimized this part, and it definitely feels faster!"
"Hmm... I don't really notice much difference in my environment."

When you open Chrome DevTools and try to measure performance:

  • The hassle of opening the tools and clicking Record every time
  • Measurement environments that vary with browser cache, CPU state, and network conditions
  • Having to share performance improvement results with colleagues through screenshots
  • Difficulty tracking whether performance degrades over time

As a result, we end up saying performance has improved based on "feeling."
However, true performance optimization should be provable with objective metrics.

In this article, we'll explore automated performance measurement methods to:

  1. Collect objective performance data in a consistent environment
  2. Track performance changes over time
  3. Establish performance metrics that the entire team can share

💡 Pro Tip: While Chrome DevTools' Performance panel is a powerful tool, it has limitations with manual measurement. Through automation, we can overcome these limitations while still utilizing DevTools' analysis capabilities.

Scientific Experimentation: Reliable Performance Measurement Methods

Did you know that the traces behind Chrome DevTools' Performance panel can be captured programmatically?
With Playwright, you can drive the browser and automate these performance measurements.

Getting Started with Automated Performance Measurement Using Playwright

import { test } from '@playwright/test';

test('Capture performance traces', async ({ page, browser }) => {
    // Start Chrome trace recording
    await browser.startTracing(page, {
        path: `./traces/${formatDate(new Date())}.json`
    });

    // Load the page to measure
    await page.goto('http://localhost:4173/performance-test');

    // Mark the start, perform the action, then mark the end and measure
    await page.evaluate(() => window.performance.mark('Perf:Started'));
    await page.click('button#start');
    await page.evaluate(() => {
        window.performance.mark('Perf:Ended');
        window.performance.measure('overall', 'Perf:Started', 'Perf:Ended');
    });

    // End measurement and save the trace
    await browser.stopTracing();
});
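The formatDate helper above isn't part of Playwright and isn't shown in this post; a minimal sketch of what such a helper might look like (any file-name-safe timestamp works):

// Hypothetical helper: produce a file-name-safe timestamp for trace file names
function formatDate(date: Date): string {
    return date.toISOString().replace(/[:.]/g, '-'); // e.g. 2024-01-01T12-00-00-000Z
}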

However, measuring just once isn't enough.

💡 A Habit from Physics Lab Days
"Remember, everyone! One measurement is no measurement!"

This is a habit ingrained from writing physics lab reports. 😄
I still vividly remember that measurement error is inversely proportional
to the square root of the number of repeated measurements.

Scientific Approach to Performance Measurement

Applying principles learned from physics experiments to web performance measurement:

  1. Importance of Repeated Measurements

    • Standard error decreases by 1/√n as the number of measurements (n) increases
    • With 50 repetitions, error rate reduces to about 1/7 compared to a single measurement
  2. Strict Control of Variables

    • Just like controlling temperature and humidity in a lab
    • Essential to control browser cache, memory state, CPU load
    • Must consider V8 engine's JIT compiler optimizations

Complete Code for Reliable Measurements

test('Scientifically measure performance', async () => {
    const results = [];
    const REPEAT_COUNT = 50;  // Sufficient repetitions for reliability

    for(let i = 0; i < REPEAT_COUNT; i++) {
        // Control variables: fresh browser environment each time
        const browser = await chromium.launch({
            args: ['--no-sandbox', '--disable-dev-shm-usage']
        });
        const page = await browser.newPage();

        await browser.startTracing(page, {
            path: `./traces/trace_${i}.json`
        });

        // Performance measurement
        const result = await measurePerformance(page);
        results.push(result);

        await browser.stopTracing();
        await browser.close();  // Reset browser state
    }

    // Statistical analysis of results
    const stats = calculateStats(results);
    console.log(`
        Average: ${stats.mean}ms
        Standard Deviation: ${stats.stdDev}ms
        95% Confidence Interval: ${stats.confidenceInterval}ms
    `);
});
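measurePerformance and calculateStats are helper functions not shown here. As one possible sketch (an assumption, not the exact implementation), calculateStats could compute the mean, standard deviation, and 95% confidence interval like this:

// Hypothetical helper: basic statistics over the collected timing samples
function calculateStats(samples: number[]) {
    const n = samples.length;
    const mean = samples.reduce((sum, v) => sum + v, 0) / n;
    const variance = samples.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
    const stdDev = Math.sqrt(variance);
    // 95% confidence interval half-width: 1.96 standard errors (normal approximation)
    const confidenceInterval = 1.96 * (stdDev / Math.sqrt(n));
    return {
        mean: mean.toFixed(1),
        stdDev: stdDev.toFixed(1),
        confidenceInterval: confidenceInterval.toFixed(1)
    };
}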

The Pitfall of the JIT Compiler: Code Gets Faster with Repeated Execution

Here's an interesting case I encountered:

// ❌ Incorrect measurement method: repeated measurements without browser restart
test('Without browser restart', async ({ page }) => {
    const results = [];
    for(let i = 0; i < 50; i++) {
        const startTime = performance.now();
        await runPerformanceTest(page);
        results.push(performance.now() - startTime);
    }
    console.log('Execution time changes:', results);
});

The results showed:

  • First 10 runs average: 450ms
  • Middle 10 runs average: 380ms
  • Last 10 runs average: 320ms

Why does this happen?
It's because the V8 engine's JIT compiler learns execution patterns and optimizes the code. In other words, the same code gets faster with repeated execution.

// ✅ Correct measurement method: restart browser each time
test('With browser restart', async () => {
    const results = [];
    for(let i = 0; i < 50; i++) {
        const browser = await chromium.launch();
        const page = await browser.newPage();

        const startTime = performance.now();
        await runPerformanceTest(page);
        results.push(performance.now() - startTime);

        await browser.close();
    }
    console.log('Execution time changes:', results);
});

The results showed:

  • First 10 runs average: 445ms
  • Middle 10 runs average: 442ms
  • Last 10 runs average: 448ms

💡 Pro Tip:
JIT compiler optimizations do occur in production environments.
However, the goal of performance measurement is to obtain a baseline value,
so we should measure in an environment without these optimizations.

A physics major turned frontend developer applying experimental methodology...
Who would have thought? 😄 Now, let's dive deeper into analyzing Chrome Trace Reports.

Analyzing Chrome Trace Reports

The Challenge: Flitter's Performance Optimization

I was developing the Flitter visualization framework and felt the need for performance optimization. Flitter uses SVG and Canvas to visualize dynamic data, providing various animations and interactions. While these features enrich data representation, they also cause performance overhead.

For example, when data changes, requestAnimationFrame is called repeatedly to redraw SVG elements. Although we had optimized the callback to run only once, we hadn't realized that calling requestAnimationFrame itself has a cost. This wasn't noticeable in normal cases, but it caused performance degradation when a complex UI was re-rendered repeatedly (e.g., during mouse dragging).

To solve this issue, we initially tried to track the execution time of specific functions and optimize the animation routines. However, based on subjective impressions alone, we couldn't confidently prove that our optimizations were actually effective.

"We need a systematic way to record JavaScript execution time and scientifically verify optimization results!"

This conclusion led us to adopt automated performance measurement using Chrome Trace and Playwright.

Collecting Performance Data with Chrome Trace Reports

Chrome Trace Reports record all major events that occur while the browser executes JavaScript code, allowing us to see how much CPU and memory each event consumes. This is particularly useful for analyzing JavaScript function calls and execution times.

Why Use Chrome Trace Reports?

The biggest advantage of Chrome Trace Reports is being able to see the entire function call stack and execution time of each function at a glance. This allows quick identification of specific functions or repetitive code that burden performance. Especially in visualization frameworks like Flitter that interact with multiple elements, finding specific routines that cause performance load is crucial, and Chrome Trace is very helpful in solving these issues.

Automating Chrome Trace with Playwright

Playwright is a useful tool for automating performance measurements in browser environments and can collect performance data in conjunction with Chrome Trace Reports. With Playwright, you can easily save Trace Reports that track events occurring in the browser. The example code below shows how to record performance data from a specific point and save it as a JSON file.

test('Capture performance traces and save JSON file when diagram is rendered', async ({ page, browser }) => {
    // Start Chrome trace recording
    await browser.startTracing(page, {
        path: `./performance-history/${formatDate(new Date())}.json`
    });

    // Navigate to the page to measure performance
    await page.goto('http://localhost:4173/performance/diagram');

    // Mark the start of performance measurement
    await page.evaluate(() => window.performance.mark('Perf:Started'));
    await page.click('button');
    await page.waitForSelector('svg');
    await page.evaluate(() => window.performance.mark('Perf:Ended'));
    await page.evaluate(() => window.performance.measure('overall', 'Perf:Started', 'Perf:Ended'));

    // End measurement and save results
    await browser.stopTracing();
});

Analyzing CpuProfile Events

Although we can generate a JSON-formatted Trace Report using Playwright and Chrome Trace, analyzing this file directly is challenging. The report contains all events that occurred during execution, but it's difficult to intuitively understand each function's execution time or call count. We need to preprocess the data to extract meaningful information.

CpuProfile Events and Preprocessing

CpuProfile events are recorded in the disabled-by-default-v8.cpu_profiler category and track JavaScript function CPU usage. Each event is recorded with a time interval and includes samples and timeDeltas fields, which can be used to infer each function's execution time.

To calculate each function's total execution time, we need to preprocess the data by:

  • Iterating through samples and timeDeltas to calculate each node's total execution time
  • Constructing a node hierarchy based on parent-child relationships and aggregating child node execution times to their parents
  • Extracting function-level execution times and visualizing or identifying optimization points

This preprocessing step provides a clear understanding of each function's resource usage and helps identify performance bottlenecks.
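For reference, the analyzer implemented below only relies on the following shape within each ProfileChunk event (a simplified subset of the fields found in a real trace):

// Simplified shape of the ProfileChunk trace events consumed below (field subset)
interface ProfileChunkEvent {
    name: 'ProfileChunk';
    args: {
        data: {
            cpuProfile: {
                nodes: Array<{
                    id: number;
                    parent?: number;                      // id of the parent node
                    callFrame: { functionName: string };  // function this node represents
                }>;
                samples: number[];                        // node ids, one entry per sample
            };
            timeDeltas: number[];                         // microseconds elapsed per sample
        };
    };
}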

Implementing the ChromeTraceAnalyzer Class

To automate the analysis of Trace Reports, we'll implement the ChromeTraceAnalyzer class. This class will take Trace data as input, calculate each function's execution time, and provide performance metrics.

ChromeTraceAnalyzer Class

class ChromeTraceAnalyzer {
    nodes: any[];

    constructor(trace: any) {
        this.setConfig(trace);
    }

    // Return the execution time of a specific function (in ms)
    getDurationMs(name: string): number {
        if (!this.nodes) throw new Error('nodes is not initialized');
        const result = this.nodes.find((node) => node.callFrame.functionName === name);
        return result ? result.duration / 1000 : 0; // Convert to milliseconds
    }

    // Set up Trace data and calculate function execution times
    setConfig(trace: any) {
        const { traceEvents } = trace;

        // Filter 'ProfileChunk' events
        const profileChunks = traceEvents.filter((entry) => entry.name === 'ProfileChunk');

        // Extract CpuProfile nodes and sample data
        const nodes = profileChunks.map((entry) => entry.args.data.cpuProfile.nodes).flat();
        const sampleTimes = {};

        // Aggregate sample execution times
        profileChunks.forEach((chunk) => {
            const { cpuProfile: { samples }, timeDeltas } = chunk.args.data;

            samples.forEach((id, index) => {
                const delta = timeDeltas[index];
                sampleTimes[id] = (sampleTimes[id] || 0) + delta;
            });
        });

        // Build node objects, each starting with its own sampled (self) execution time
        this.nodes = nodes.map((node) => ({
            id: node.id,
            parent: node.parent,
            callFrame: node.callFrame,
            children: [],
            duration: sampleTimes[node.id] || 0
        }));

        // Establish parent-child relationships and aggregate execution times
        const nodesMap = new Map();
        this.nodes.forEach((node) => {
            nodesMap.set(node.id, node);
        });

        this.nodes
            .sort((a, b) => b.id - a.id)
            .forEach((node) => {
                if (!node.parent) return;
                const parentNode = nodesMap.get(node.parent);
                if (parentNode) {
                    parentNode.children.push(node);
                    parentNode.duration += node.duration;
                }
            });
    }
}
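As a quick usage sketch (the file name here is just an example), a trace JSON saved by one of the earlier tests can be loaded and queried directly:

import { readFileSync } from 'fs';

// Load a previously saved trace report and query a function's execution time
const trace = JSON.parse(readFileSync('./performance-history/2024-11-01.json', 'utf8'));
const analyzer = new ChromeTraceAnalyzer(trace);
console.log(`draw: ${analyzer.getDurationMs('draw')}ms`);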

Repeated Measurements for Reliability

Although we can analyze Trace data using ChromeTraceAnalyzer, a single measurement is not enough to evaluate performance. JavaScript execution performance can vary due to environmental changes, so we need to repeat measurements multiple times to obtain reliable data.

Repeated Measurement Code Example

test('Capture analyzed trace when diagram is rendered with multiple runs', async () => {
    const COUNT = 10;
    const duration = {
        timestamp: Date.now(),
        runApp: 0,
        mount: 0,
        draw: 0,
        layout: 0,
        paint: 0,
    };

    for (let i = 0; i < COUNT; i++) {
        const browser = await chromium.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('http://localhost:4173/performance/diagram');

        // Start tracing
        await browser.startTracing(page, {});
        await page.evaluate(() => window.performance.mark('Perf:Started'));
        await page.click('button');
        await page.waitForSelector('svg');
        await page.evaluate(() => window.performance.mark('Perf:Ended'));
        await page.evaluate(() => window.performance.measure('overall', 'Perf:Started', 'Perf:Ended'));

        // Extract Trace data and analyze
        const trace = JSON.parse((await browser.stopTracing()).toString('utf8'));
        const analyzer = new ChromeTraceAnalyzer(trace);
        duration.runApp += analyzer.getDurationMs('runApp') / COUNT;
        duration.mount += analyzer.getDurationMs('mount') / COUNT;
        duration.draw += analyzer.getDurationMs('draw') / COUNT;
        duration.layout += analyzer.getDurationMs('layout') / COUNT;
        duration.paint += analyzer.getDurationMs('paint') / COUNT;

        await browser.close();
    }

    console.log('**** Average Execution Time ****');
    console.log(`runApp: ${duration.runApp}ms`);
    console.log(`mount: ${duration.mount}ms`);
    console.log(`draw: ${duration.draw}ms`);
    console.log(`layout: ${duration.layout}ms`);
    console.log(`paint: ${duration.paint}ms`);
    console.log('********************************');
});

Visualizing Performance Data

By repeating measurements and analyzing Trace data, we can obtain reliable performance metrics. To better understand performance changes over time, we can visualize the data using Flitter.

(Figure: Flitter-rendered stacked bar chart of average execution time per function across measurement dates)

Flitter-Based Stacked Bar Chart

We created a stacked bar chart using Flitter to display the average execution time of each function over multiple runs. This chart helps us identify performance bottlenecks and track changes in execution time.

Chart Configuration

  • X-axis: Measurement dates (track performance changes over time)
  • Y-axis: Average execution time (ms)
  • Stacked bars: Each function's execution time (color-coded)
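
To put measurement dates on the X-axis, each run's averaged results need to be persisted along with their timestamp. As one possible approach (the file path and schema here are assumptions, not part of Flitter), the duration object from the test above could simply be appended to a JSON history file:

import { promises as fs } from 'fs';

// Hypothetical helper: append one run's averaged durations to a history file
// that the chart later reads (path and schema are assumptions)
async function appendToHistory(duration: Record<string, number>) {
    const path = './performance-history/history.json';
    let history: Record<string, number>[] = [];
    try {
        history = JSON.parse(await fs.readFile(path, 'utf8'));
    } catch {
        // First run: no history file yet
    }
    history.push(duration);
    await fs.writeFile(path, JSON.stringify(history, null, 2));
}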

Performance Changes Over Time

By analyzing the chart, we can see how performance changes over time:

  • Date-based performance changes: We can track how performance changes on each measurement date.
  • Function-level execution time: We can identify which functions consume the most resources and optimize them accordingly.

The Flitter library can serve as an efficient visualization solution in situations where performance optimization matters. You can check the related code on GitHub, and if you found this helpful, please give the project a star! If there's interest in the project, we plan to keep developing it and sharing performance improvements and new features.

GitHub: https://github.com/meursyphus/flitter/blob/latest/packages/test/tests/tracking-performance.test.ts

Docs: Flitter Docs

Thank you for reading!
