ka li

Posted on Jan 17

From Pain to Gain: Building a Browser-Based Log Analysis Tool

#android #tooling #webdev #programming

After countless frozen editors and crashed terminals while analyzing mobile app logs, I built something different. Here's the journey.

The Challenge

As mobile developers, we know these scenarios all too well:

500MB log files freezing our editors
Nested zip files from QA teams
Multiple keywords to track simultaneously
What should be quick analysis turning into hours of frustration

Browser to the Rescue

Instead of fighting with traditional tools, I explored modern browser capabilities:

File System Access API for large files
Web Workers for background processing
IndexedDB for efficient data storage
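
As a rough illustration of how these pieces could fit together, the sketch below reads a file in fixed-size slices and splits it into lines without holding the whole file in memory. It is an assumption about one workable approach, not LogDog's actual code: `LineSplitter`, `readLines`, and the 4 MB default chunk size are hypothetical names and values. A `File` obtained through the File System Access API is a `Blob`, so `slice()` and `arrayBuffer()` apply to it directly. In practice a loop like this would run inside a Web Worker so the UI thread stays free.

```javascript
// Hypothetical helper: turns a sequence of text chunks into complete lines.
class LineSplitter {
  constructor() {
    this.remainder = ''; // partial line carried over from the previous chunk
  }

  // Returns the complete lines in this chunk and keeps the unfinished tail.
  push(chunk) {
    const parts = (this.remainder + chunk).split('\n');
    this.remainder = parts.pop(); // the last piece has no newline yet
    return parts.map(line => line.replace(/\r$/, ''));
  }

  // Call once after the final chunk to get the last line, if any.
  flush() {
    const last = this.remainder;
    this.remainder = '';
    return last ? [last.replace(/\r$/, '')] : [];
  }
}

// Hypothetical driver: reads a Blob/File slice by slice and emits lines.
async function readLines(blob, onLine, chunkSize = 4 * 1024 * 1024) {
  const decoder = new TextDecoder('utf-8');
  const splitter = new LineSplitter();
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    // stream: true keeps multi-byte characters intact across slice boundaries
    const text = decoder.decode(buffer, { stream: true });
    splitter.push(text).forEach(onLine);
  }
  splitter.push(decoder.decode()).forEach(onLine); // flush any decoder state
  splitter.flush().forEach(onLine);
}
```

The carry-over in `remainder` is the important part: a slice boundary almost never falls exactly on a newline, so the tail of one chunk has to be joined with the head of the next before it counts as a line.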

Smart Multi-Pattern Highlighting

One of the key challenges in log analysis is highlighting multiple patterns simultaneously. Traditional approaches often fail when dealing with overlapping patterns or nested highlights. Here's how we solved it:

// Smart highlighting with style stacking
function highlightText(content, patterns) {
  const events = [];

  // Collect all matches and their styles
  // Traditional approach of replacing patterns one by one would break
  // when patterns overlap, as earlier highlights would be treated as content
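  patterns.forEach(({pattern, style}) => {
    const regex = new RegExp(pattern, 'gi');
    let match;
    while ((match = regex.exec(content)) !== null) {
      // A zero-length match (e.g. from `a*`) never advances lastIndex;
      // step past it manually or this loop never terminates
      if (match[0].length === 0) {
        regex.lastIndex++;
        continue;
      }
      events.push(
        { pos: match.index, type: 'start', style },
        { pos: regex.lastIndex, type: 'end', style }
      );
    }
  });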

  // Sort events by position and type
  // Critical: end events must come before start events at same position
  // to properly handle cases like: |end1|end2|start3|start4|
  events.sort((a, b) => {
    if (a.pos !== b.pos) return a.pos - b.pos;
    // Same position and same type: report a tie so the comparator stays consistent
    if (a.type === b.type) return 0;
    return a.type === 'end' ? -1 : 1;
  });

  // Apply styles with stacking support
  // Using a stack-like structure (activeStyles) to track all active styles
  // at any given position, enabling proper style inheritance
  let activeStyles = [];
  let result = '';
  let lastPos = 0;

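  events.forEach(event => {
    const segment = content.slice(lastPos, event.pos);
    if (segment) {
      // Merge all active styles for overlapping regions
      // When two styles set the same property, the most recently started one wins
      const styles = activeStyles.reduce((acc, style) => ({...acc, ...style}), {});
      // Escape the raw log text so characters like < and & cannot inject markup
      const safe = escapeHtml(segment);
      result += activeStyles.length
        ? `<span style="${escapeHtml(styleToString(styles))}">${safe}</span>`
        : safe;
    }

    // Maintain the stack of active styles
    if (event.type === 'start') {
      activeStyles.push(event.style);
    } else {
      // Remove a single instance so two patterns sharing one style object
      // do not cancel each other out
      const i = activeStyles.lastIndexOf(event.style);
      if (i !== -1) activeStyles.splice(i, 1);
    }

    lastPos = event.pos;
  });

  // Append the text after the last match; without this the tail is dropped
  result += escapeHtml(content.slice(lastPos));

  return result;
}

// Minimal escaping for text nodes and attribute values
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, ch => (
    { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }[ch]
  ));
}

// styleToString is not shown in the original post; this version assumes
// style objects use CSS property names as keys, e.g. { 'background-color': 'yellow' }
function styleToString(styles) {
  return Object.entries(styles)
    .map(([prop, value]) => `${prop}: ${value}`)
    .join('; ');
}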

Why This Approach?

Traditional highlighting methods have several limitations:

Sequential Processing Problem
- Traditional: Process patterns one after another, replacing text with HTML
- Issue: Earlier highlights become part of the content, breaking later patterns
- Our Solution: Track all matches first, then process them together
Style Stacking Challenge
- Traditional: Can't properly handle overlapping highlights
- Issue: Later highlights override earlier ones
- Our Solution: Stack styles in overlapping regions
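Performance Concerns
- Traditional: Multiple passes through the text, each replacing content
- Issue: O(n*m) scanning where n is text length and m is pattern count, and n grows with every pass as markup is inserted and the string is copied
- Our Solution: Matching still runs once per pattern, but always against the original text; sorting the k match events costs O(k log k) and rendering is a single pass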
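
To make the sequential processing problem concrete, here is a small sketch of the naive approach. `naiveHighlight` is a hypothetical function written for this illustration, not code from LogDog. The second pattern, `span`, matches inside the markup that the first replacement inserted, so the output is corrupted.

```javascript
// Naive approach: replace one pattern at a time, feeding each result
// into the next replacement.
function naiveHighlight(content, patterns) {
  return patterns.reduce(
    (html, { pattern, color }) =>
      html.replace(
        new RegExp(pattern, 'gi'),
        match => `<span style="color: ${color}">${match}</span>`
      ),
    content
  );
}

const broken = naiveHighlight('error in span', [
  { pattern: 'error', color: 'red' },
  { pattern: 'span', color: 'blue' },
]);

// The second pass also rewrites the "span" inside the <span> and </span>
// tags produced by the first pass, so the tag names themselves get wrapped.
console.log(broken);
console.log(broken.startsWith('<<span')); // true
```

Collecting match positions against the untouched text first, and only then emitting HTML, avoids this because no pattern ever sees generated markup.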

This implementation ensures:

Correct handling of overlapping patterns
Proper style inheritance in overlapped regions
Maintainable and efficient processing
Well-formed HTML output: highlight spans never overlap or nest, as long as the log text itself is HTML-escaped before it is inserted

The Virtual List Challenge

While developing LogDog, I encountered an interesting limitation in existing virtual list solutions. After testing more than 10 popular virtual list components, I found they all hit a wall when dealing with truly massive datasets - typically around 2^24 rows (about 16.7 million entries).

The root cause? Most virtual list implementations rely on the browser's native scrollbar and require the entire dataset upfront as an array. This approach fails spectacularly when analyzing gigabyte-sized log files.

I've implemented a different approach that breaks through this limitation. While the complete solution deserves its own article (coming soon!), the key insight was reimagining how virtual lists handle scrolling and data sourcing.
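
The details are left for the follow-up article, so the following is only a guess at the general shape of such a design, not the author's implementation. It assumes a custom scrollbar that reports its position as a fraction between 0 and 1, and a data source that returns rows on demand instead of a full array. `visibleRange` and `makeRowSource` are hypothetical names.

```javascript
// Map a scrollbar fraction (0 = top, 1 = bottom) to the rows to render.
// Nothing here depends on a pixel height for the whole list, so the row
// count is not tied to how tall a single DOM element can be.
function visibleRange(scrollFraction, totalRows, viewportRows) {
  const maxStart = Math.max(0, totalRows - viewportRows);
  const clamped = Math.min(1, Math.max(0, scrollFraction));
  const start = Math.round(clamped * maxStart);
  return { start, end: Math.min(totalRows, start + viewportRows) };
}

// Data source that produces rows on demand instead of holding an array.
// `getRow` is a stand-in for whatever fetches a line, e.g. a lookup in an
// index of byte offsets into the file.
function makeRowSource(totalRows, getRow) {
  return {
    totalRows,
    rows(start, end) {
      const out = [];
      for (let i = start; i < end; i++) out.push(getRow(i));
      return out;
    },
  };
}

// 100 million rows, 40 visible at a time; only those 40 are materialized.
const source = makeRowSource(100_000_000, i => `line ${i + 1}`);
const { start, end } = visibleRange(0.5, source.totalRows, 40);
console.log(start, end);                     // 49999980 50000020
console.log(source.rows(start, end).length); // 40
```

With this shape, memory use depends on the viewport size rather than on the number of rows in the file.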

Stay tuned for a deep dive into:

Breaking the 2^24 limit
Custom scrolling implementation
Efficient data streaming
Memory management techniques

Results

In real-world usage:

Load time: <1s for most files
Memory usage: ~100MB for 1GB log file
Search speed: Real-time for most patterns

Key Learnings

The development process taught me that:

Modern browsers are surprisingly capable
Proper chunking is crucial for large file handling
UI responsiveness requires careful architecture
Smart memory management makes all the difference

Try It Yourself

I've made this tool available online:

🔧 Use it now: LogDog

What's Next?

I'm working on:

More built-in coloring rules for various analysis tasks
Performance optimization patterns
Latency analysis templates

Share your log analysis challenges in the comments! What patterns would you find most useful for your analysis tasks?

If you found this helpful, give LogDog a try and let me know your thoughts!

DEV Community
