Optimizing Line Chart Performance with LTTB Algorithm

When working with large datasets, rendering all points in a line chart can cause significant performance issues. For example, plotting 50,000 data points directly can overwhelm the browser and make the chart unresponsive. Tools like amCharts and ApexCharts struggle with such datasets, while ECharts performs better but still isn't optimized for extremely large datasets.

To address this, I combined ECharts with the Largest Triangle Three Buckets (LTTB) algorithm. LTTB is a downsampling algorithm that reduces data points while preserving the visual structure of the dataset.

In this post, I'll explain how the LTTB algorithm works step by step with an example of reducing 10 points to 4. For each step, I'll include the relevant code, explanations, and resulting data transformations.

LTTB Algorithm Code
Here's the full code for the LTTB algorithm:

interface Point {
  x: number;
  y: number;
}

export function lttb(data: Point[], threshold: number): Point[] {
  const dataLength = data.length;
  if (threshold >= dataLength || threshold === 0) {
    return data; // No downsampling needed
  }

  const sampled: Point[] = [];
  let sampledIndex = 0;
  const bucketSize = (dataLength - 2) / (threshold - 2);
  let a = 0; // Start point
  let nextA = 0;

  sampled[sampledIndex++] = data[a]; // Add the first point

  for (let i = 0; i < threshold - 2; i++) {
    // Average the points of the NEXT bucket; this average becomes the
    // third vertex of every candidate triangle in the current bucket.
    let avgX = 0;
    let avgY = 0;
    let avgRangeStart = Math.floor((i + 1) * bucketSize) + 1;
    let avgRangeEnd = Math.floor((i + 2) * bucketSize) + 1;
    avgRangeEnd = avgRangeEnd < dataLength ? avgRangeEnd : dataLength;
    const avgRangeLength = avgRangeEnd - avgRangeStart;

    for (; avgRangeStart < avgRangeEnd; avgRangeStart++) {
      avgX += data[avgRangeStart].x;
      avgY += data[avgRangeStart].y;
    }

    avgX /= avgRangeLength;
    avgY /= avgRangeLength;

    // Candidate range: the points of the current bucket
    let rangeOffs = Math.floor((i + 0) * bucketSize) + 1;
    const rangeTo = Math.floor((i + 1) * bucketSize) + 1;
    const pointAX = data[a].x;
    const pointAY = data[a].y;

    let maxArea = -1;

    for (; rangeOffs < rangeTo; rangeOffs++) {
      // Twice the triangle area; the constant 1/2 is dropped because it
      // doesn't affect which candidate is the maximum.
      const area = Math.abs(
        (pointAX - avgX) * (data[rangeOffs].y - pointAY) -
        (pointAX - data[rangeOffs].x) * (avgY - pointAY)
      );
      if (area > maxArea) {
        maxArea = area;
        nextA = rangeOffs;
      }
    }

    sampled[sampledIndex++] = data[nextA]; // Add the most important point
    a = nextA;
  }

  sampled[sampledIndex++] = data[dataLength - 1]; // Add the last point

  return sampled;
}
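Before the detailed walkthrough below, here's a quick smoke test; the synthetic sine-plus-noise series and the threshold of 1,000 are made up purely for illustration:

// Illustrative smoke test: downsample 50,000 synthetic points to 1,000.
const raw: Point[] = Array.from({ length: 50_000 }, (_, i) => ({
  x: i,
  y: Math.sin(i / 500) * 100 + Math.random() * 10,
}));

const reduced = lttb(raw, 1_000);
console.log(reduced.length); // 1000

Because every kept point is the one that carves out the largest triangle in its bucket, peaks and troughs survive the reduction.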

Example: Reducing 10 Points to 4 Points
We'll reduce the following dataset of 10 points to 4 points using the LTTB algorithm:

Original Dataset

const data = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 3, y: 15 },
  { x: 4, y: 25 },
  { x: 5, y: 30 },
  { x: 6, y: 20 },
  { x: 7, y: 40 },
  { x: 8, y: 35 },
  { x: 9, y: 45 },
  { x: 10, y: 50 }
];
const threshold = 4; // Reduce to 4 points

Step 1: Add the First Point

sampled[sampledIndex++] = data[a]; // Add the first point

✅ The algorithm always starts by adding the first data point, so the downsampled series begins exactly where the original does.

📊 Resulting Data:

sampled = [
  { x: 1, y: 10 }
];

Step 2: Calculate the Bucket Size

const bucketSize = (dataLength - 2) / (threshold - 2);

✅ The bucket size determines how many data points fall into each "bucket." In this case:

\text{bucketSize} = \frac{\text{dataLength} - 2}{\text{threshold} - 2} = \frac{10 - 2}{4 - 2} = 4

So each of the two middle buckets contains 4 data points; the first and last points form single-point buckets of their own.
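One subtlety the next steps rely on: in each iteration, the loop selects a point from the current bucket but averages the points of the following bucket. The little sketch below (a debugging aid, not part of the algorithm) reproduces the loop's index math for this example:

// Debugging aid: reproduce the loop's index ranges for dataLength = 10, threshold = 4.
const dataLength = 10;
const threshold = 4;
const bucketSize = (dataLength - 2) / (threshold - 2); // 4

for (let i = 0; i < threshold - 2; i++) {
  const pickStart = Math.floor(i * bucketSize) + 1;      // rangeOffs
  const pickEnd = Math.floor((i + 1) * bucketSize) + 1;  // rangeTo
  const avgStart = Math.floor((i + 1) * bucketSize) + 1; // avgRangeStart
  const avgEnd = Math.min(Math.floor((i + 2) * bucketSize) + 1, dataLength);
  console.log(`i=${i}: pick from indices [${pickStart}, ${pickEnd}), average over [${avgStart}, ${avgEnd})`);
}
// i=0: pick from indices [1, 5), average over [5, 9)   -> points 2-5 vs. average of points 6-9
// i=1: pick from indices [5, 9), average over [9, 10)  -> points 6-9 vs. the last point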

Step 3: First Bucket (Points 2–5)

let avgX = 0;
let avgY = 0;
let avgRangeStart = Math.floor((i + 1) * bucketSize) + 1;
let avgRangeEnd = Math.floor((i + 2) * bucketSize) + 1;
avgRangeEnd = avgRangeEnd < dataLength ? avgRangeEnd : dataLength;
const avgRangeLength = avgRangeEnd - avgRangeStart;

for (; avgRangeStart < avgRangeEnd; avgRangeStart++) {
  avgX += data[avgRangeStart].x;
  avgY += data[avgRangeStart].y;
}

avgX /= avgRangeLength;
avgY /= avgRangeLength;

✅ For the first bucket (candidate points 2–5), the algorithm computes the average point of the next bucket (points 6–9). This average, not the current bucket's, serves as the third vertex of each candidate triangle. It is found by summing the x and y values and dividing by the number of points:

\text{avgX} = \frac{6 + 7 + 8 + 9}{4} = 7.5

\text{avgY} = \frac{20 + 40 + 35 + 45}{4} = 35

Step 4: Find the Most Significant Point

let rangeOffs = Math.floor((i + 0) * bucketSize) + 1;
const rangeTo = Math.floor((i + 1) * bucketSize) + 1;
const pointAX = data[a].x;
const pointAY = data[a].y;

let maxArea = -1;

for (; rangeOffs < rangeTo; rangeOffs++) {
  const area = Math.abs(
    (pointAX - avgX) * (data[rangeOffs].y - pointAY) -
    (pointAX - data[rangeOffs].x) * (avgY - pointAY)
  );
  if (area > maxArea) {
    maxArea = area;
    nextA = rangeOffs;
  }
}

✅ The algorithm identifies the point in the current bucket (points 2–5) that forms the largest triangle with the previously selected point, (1, 10), and the average of the next bucket, (7.5, 35).

For this bucket, (2, 20) produces the largest area, as the worked values below show.
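Plugging each candidate into the area expression from the code, with A = (1, 10) and the next-bucket average (7.5, 35), gives (these are twice the true triangle areas, which doesn't matter for the comparison):

\text{area}_{(2, 20)} = \lvert (1 - 7.5)(20 - 10) - (1 - 2)(35 - 10) \rvert = 40

\text{area}_{(3, 15)} = \lvert (1 - 7.5)(15 - 10) - (1 - 3)(35 - 10) \rvert = 17.5

\text{area}_{(4, 25)} = \lvert (1 - 7.5)(25 - 10) - (1 - 4)(35 - 10) \rvert = 22.5

\text{area}_{(5, 30)} = \lvert (1 - 7.5)(30 - 10) - (1 - 5)(35 - 10) \rvert = 30

📊 Updated Data: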

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 }
];

Step 5: Second Bucket (Points 6–9)

The algorithm repeats the process for the second bucket (points 6–9). This is the last iteration, so the averaging range is clamped to the end of the data and the "next bucket" contains only the final point:

\text{avgX} = 10

\text{avgY} = 50

Using the previously selected point (2, 20) and this average, the doubled areas for points 6–9 work out to 120, 10, 60, and 10, so the most significant point in this bucket is (6, 20).

📊 Updated Data:

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 6, y: 20 }
];

Step 6: Add the Last Point

sampled[sampledIndex++] = data[dataLength - 1]; // Add the last point

✅ The algorithm always ends by adding the last point, mirroring Step 1, so the downsampled series ends exactly where the original does.

📊 Final Data:

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 6, y: 20 },
  { x: 10, y: 50 }
];
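As an end-to-end check, calling the function on the example dataset reproduces exactly this output:

console.log(lttb(data, 4));
// [ { x: 1, y: 10 }, { x: 2, y: 20 }, { x: 6, y: 20 }, { x: 10, y: 50 } ]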

Conclusion
By combining ECharts with the LTTB algorithm, I was able to downsample my dataset efficiently and improve chart performance significantly. If you work with large datasets, this method is a great way to balance performance and visual accuracy. Let me know if you've tried this approach or if you have other techniques to share!
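For completeness, here's roughly how the downsampled output plugs into ECharts; a minimal sketch, assuming a container element with id "chart" and reusing lttb, Point, and the raw series from the snippets above:

// Minimal integration sketch: downsample first, then chart only the reduced series.
import * as echarts from 'echarts';

const chart = echarts.init(document.getElementById('chart') as HTMLDivElement);
const reduced = lttb(raw, 1_000);

chart.setOption({
  xAxis: { type: 'value' },
  yAxis: { type: 'value' },
  series: [
    {
      type: 'line',
      showSymbol: false, // don't draw a symbol at every point
      data: reduced.map((p) => [p.x, p.y]),
    },
  ],
});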
