After countless frozen editors and crashed terminals while analyzing mobile app logs, I built something different: LogDog, a log viewer that runs in the browser. Here's the journey.
The Challenge
As mobile developers, we know these scenarios all too well:
- 500MB log files freezing our editors
- Nested zip files from QA teams
- Multiple keywords to track simultaneously
- What should be quick analysis turning into hours of frustration
Browser to the Rescue
Instead of fighting with traditional tools, I explored modern browser capabilities:
- File System Access API for large files
- Web Workers for background processing
- IndexedDB for efficient data storage
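To make the large-file bullet concrete, here is a minimal sketch of the line-splitting step that a chunked reader needs. It assumes the file arrives as a series of `Uint8Array` chunks (for example from `File.slice()` or a stream) and that lines end with `\n`. `ChunkLineSplitter` is a hypothetical name for illustration, not LogDog's actual code.

```javascript
// Hypothetical helper: turns a sequence of byte chunks into complete lines
// without ever holding the whole file in memory.
class ChunkLineSplitter {
  constructor() {
    this.decoder = new TextDecoder('utf-8');
    this.remainder = ''; // unfinished line carried over from the previous chunk
  }

  // Feed one chunk (Uint8Array); returns the lines completed by this chunk.
  push(bytes) {
    // stream: true keeps a multi-byte character that straddles two chunks intact
    const text = this.remainder + this.decoder.decode(bytes, { stream: true });
    const lines = text.split('\n');
    this.remainder = lines.pop(); // the last piece may be an unfinished line
    return lines;
  }

  // Call once after the final chunk to flush whatever is left.
  flush() {
    const tail = this.remainder + this.decoder.decode();
    this.remainder = '';
    return tail ? [tail] : [];
  }
}

// Usage: the chunk boundary falls inside the three-byte "✓" character.
const demoBytes = new TextEncoder().encode('a\nb✓c\nd');
const splitter = new ChunkLineSplitter();
const demoLines = [
  ...splitter.push(demoBytes.slice(0, 4)),
  ...splitter.push(demoBytes.slice(4)),
  ...splitter.flush(),
];
console.log(demoLines); // [ 'a', 'b✓c', 'd' ]
```

The same object could sit inside a Web Worker so the splitting never blocks the UI thread. Windows-style `\r\n` endings are not handled in this sketch.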
Smart Multi-Pattern Highlighting
One of the key challenges in log analysis is highlighting multiple patterns simultaneously. Traditional approaches often fail when dealing with overlapping patterns or nested highlights. Here's how we solved it:
// Smart highlighting with style stacking
// Helpers assumed by highlightText (the post does not show them).
// styleToString expects CSS property names as keys, e.g. 'background-color'.
const escapeHtml = text =>
  text.replace(/[&<>"]/g, ch => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' }[ch]));
const styleToString = styles =>
  Object.entries(styles).map(([prop, value]) => `${prop}: ${value}`).join('; ');

function highlightText(content, patterns) {
  const events = [];
  // Collect all matches and their styles first.
  // Replacing patterns one by one would break when patterns overlap,
  // because earlier highlight markup would be treated as content.
  patterns.forEach(({ pattern, style }) => {
    const regex = new RegExp(pattern, 'gi');
    let match;
    while ((match = regex.exec(content)) !== null) {
      if (match[0] === '') {
        // A zero-length match never advances lastIndex; step past it
        // to avoid an infinite loop, and do not highlight it.
        regex.lastIndex++;
        continue;
      }
      events.push(
        { pos: match.index, type: 'start', style },
        { pos: regex.lastIndex, type: 'end', style }
      );
    }
  });
  // Sort events by position and type.
  // Critical: end events must come before start events at the same position
  // to properly handle cases like: |end1|end2|start3|start4|
  events.sort((a, b) => {
    if (a.pos !== b.pos) return a.pos - b.pos;
    if (a.type === b.type) return 0; // keep the comparator consistent
    return a.type === 'end' ? -1 : 1;
  });
  // activeStyles tracks every style in effect at the current position
  const activeStyles = [];
  let result = '';
  let lastPos = 0;
  events.forEach(event => {
    const segment = content.slice(lastPos, event.pos);
    if (segment) {
      // Merge all active styles for overlapping regions (later ones win on conflicts)
      const styles = activeStyles.reduce((acc, style) => ({ ...acc, ...style }), {});
      // Escape the raw log text so characters like < cannot break the markup
      result += activeStyles.length
        ? `<span style="${styleToString(styles)}">${escapeHtml(segment)}</span>`
        : escapeHtml(segment);
    }
    if (event.type === 'start') {
      activeStyles.push(event.style);
    } else {
      // Remove a single occurrence, so patterns sharing one style object still stack
      const index = activeStyles.lastIndexOf(event.style);
      if (index !== -1) activeStyles.splice(index, 1);
    }
    lastPos = event.pos;
  });
  // Append the text that follows the last match
  return result + escapeHtml(content.slice(lastPos));
}
Why This Approach?
Traditional highlighting methods have several limitations:
- Sequential Processing Problem
  - Traditional: Process patterns one after another, replacing text with HTML
  - Issue: Earlier highlights become part of the content, breaking later patterns
  - Our Solution: Track all matches first, then process them together
- Style Stacking Challenge
  - Traditional: Can't properly handle overlapping highlights
  - Issue: Later highlights override earlier ones
  - Our Solution: Stack styles in overlapping regions
- Performance Concerns
  - Traditional: Multiple passes through the text, each replacing content
  - Issue: O(n*m) scanning where n is text length and m is pattern count, plus a full rebuild of an ever-growing string on every pass
  - Our Solution: Scan the original text once per pattern to collect matches, sort the events, then render the output in a single pass
This implementation ensures:
- Correct handling of overlapping patterns
- Proper style inheritance in overlapped regions
- Maintainable and efficient processing
- Safe HTML output without breaking existing tags
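The sequential processing problem described above is easy to reproduce. The sketch below is a deliberately naive highlighter written only to show the failure mode. `naiveHighlight` and its `color` field are hypothetical names for this illustration.

```javascript
// Naive approach: replace each pattern in turn, feeding the HTML produced
// by one pass into the next pass as if it were plain log text.
function naiveHighlight(content, patterns) {
  return patterns.reduce(
    (html, { pattern, color }) =>
      html.replace(
        new RegExp(pattern, 'gi'),
        match => `<span style="color: ${color}">${match}</span>`
      ),
    content
  );
}

const logLine = 'request span=42 failed';
const broken = naiveHighlight(logLine, [
  { pattern: 'failed', color: 'red' },
  { pattern: 'span', color: 'blue' }, // also matches the <span> tags from pass one
]);

// The log line contains "span" once, yet three blue wrappers were inserted.
const blueWrappers = broken.split('color: blue').length - 1;
console.log(blueWrappers); // 3
console.log(broken.includes('<<span')); // true: the markup is now malformed
```

Collecting every match against the untouched text before emitting any markup avoids this, because no pattern ever sees generated HTML.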
The Virtual List Challenge
While developing LogDog, I encountered an interesting limitation in existing virtual list solutions. After testing more than 10 popular virtual list components, I found they all hit a wall when dealing with truly massive datasets - typically around 2^24 rows (about 16.7 million entries).
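The root cause? Most virtual list implementations rely on the browser's native scrollbar and require the entire dataset upfront as an array. A native scrollbar needs a spacer element as tall as the whole list, and browsers cap how tall an element can be, so this approach fails spectacularly when analyzing gigabyte-sized log files.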
I've implemented a different approach that breaks through this limitation. While the complete solution deserves its own article (coming soon!), the key insight was reimagining how virtual lists handle scrolling and data sourcing.
Stay tuned for a deep dive into:
- Breaking the 2^24 limit
- Custom scrolling implementation
- Efficient data streaming
- Memory management techniques
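Until that deep dive is available, here is one way to picture a scrollbar that does not depend on pixel height. This is an assumption for illustration, not the LogDog implementation: the scroll position is treated as a fraction between 0 and 1, and rows come from a callback instead of an array. `visibleWindow`, `renderWindow`, and the shape of `source` are hypothetical.

```javascript
// Map a scrollbar fraction (0..1) to the window of rows that should be drawn.
function visibleWindow(scrollFraction, totalRows, viewportRows) {
  const clamped = Math.min(1, Math.max(0, scrollFraction));
  const maxFirstRow = Math.max(0, totalRows - viewportRows);
  const firstRow = Math.round(clamped * maxFirstRow);
  return { firstRow, count: Math.min(viewportRows, totalRows - firstRow) };
}

// Rows are fetched on demand, so the full dataset never has to exist as an array.
function renderWindow(source, scrollFraction, viewportRows) {
  const { firstRow, count } = visibleWindow(scrollFraction, source.totalRows, viewportRows);
  return source.getRows(firstRow, count);
}

// Usage with a stand-in data source of 100 million rows (well past 2^24).
const demoSource = {
  totalRows: 100_000_000,
  getRows: (start, count) => Array.from({ length: count }, (_, i) => `line ${start + i}`),
};
console.log(renderWindow(demoSource, 1, 3));
// [ 'line 99999997', 'line 99999998', 'line 99999999' ]
```

Because the row index is computed arithmetically, the limit becomes the precision of JavaScript numbers instead of the maximum element height.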
Results
In real-world usage:
- Load time: <1s for most files
- Memory usage: ~100MB for 1GB log file
- Search speed: Real-time for most patterns
Key Learnings
The development process taught me that:
- Modern browsers are surprisingly capable
- Proper chunking is crucial for large file handling
- UI responsiveness requires careful architecture
- Smart memory management makes all the difference
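As one illustration of the chunking and memory points, a viewer can keep only the byte offset where each line starts and read the line text on demand. The sketch below is an assumption about how such an index might look, not a description of LogDog's internals. `LineOffsetIndex` is a hypothetical name, and lines are assumed to end with `\n`.

```javascript
// Records where each line starts, chunk by chunk, instead of storing the lines.
class LineOffsetIndex {
  constructor() {
    this.offsets = [0]; // byte offset at which each line begins
    this.bytesSeen = 0; // total bytes scanned so far
  }

  // Scan one chunk (Uint8Array) for newline bytes.
  addChunk(bytes) {
    for (let i = 0; i < bytes.length; i++) {
      if (bytes[i] === 0x0a) this.offsets.push(this.bytesSeen + i + 1);
    }
    this.bytesSeen += bytes.length;
  }

  // A trailing newline does not start a new line.
  get lineCount() {
    const last = this.offsets[this.offsets.length - 1];
    return last === this.bytesSeen ? this.offsets.length - 1 : this.offsets.length;
  }

  // Byte range [start, end) of line n, excluding its newline.
  lineRange(n) {
    const start = this.offsets[n];
    const end = n + 1 < this.offsets.length ? this.offsets[n + 1] - 1 : this.bytesSeen;
    return { start, end };
  }
}

// Usage: index a small buffer in two chunks, then read line 1 by its range.
const indexBytes = new TextEncoder().encode('alpha\nbeta\ngamma\n');
const lineIndex = new LineOffsetIndex();
lineIndex.addChunk(indexBytes.subarray(0, 8));
lineIndex.addChunk(indexBytes.subarray(8));
const { start, end } = lineIndex.lineRange(1);
console.log(lineIndex.lineCount); // 3
console.log(new TextDecoder().decode(indexBytes.subarray(start, end))); // beta
```

With a real file, the same range could be passed to `File.slice(start, end)` so that only the visible lines are ever decoded.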
Try It Yourself
I've made this tool available online:
- 🔧 Use it now: LogDog
What's Next?
I'm working on:
- More built-in coloring rules for various analysis tasks
- Performance optimization patterns
- Latency analysis templates
Share your log analysis challenges in the comments! What patterns would you find most useful for your analysis tasks?
If you found this helpful, give LogDog a try and let me know your thoughts!