Attila Večerek

Effective Pragmatism: Observability

Just like in the previous post, let's start with the definition.

Observability is the ability to understand the internal state or condition of a complex system based solely on knowledge of its external outputs, specifically its telemetry.1

There are a few key terms we need to understand:

  • internal state or condition: Is a particular service up and running? Are requests being processed, queued, or dropped? Is the service operating as expected?
  • external outputs and telemetry: logs, metrics, and traces are the three pillars of observability and the main types of telemetry. They are considered external outputs because applications emit this data, for example, by writing it to the file system or sending it elsewhere over the network.

Observability provides businesses with deep visibility into their systems' inner workings and overall state. The importance of observability varies depending on a company's stage of development, with different priorities emerging at each phase.

A gradient of increasing complexity based on a company's development stage, listing startups, scale-ups, corporates, enterprises, and mega-corporations or conglomerates, and the various reasons they might prioritize observability

  1. Startup: Observability is moderately important at this stage. The primary focus is on rapid development, but it's still crucial to identify and resolve critical errors quickly to maintain early customer trust. Observability tools and manual code instrumentation help engineers debug issues in production efficiently.
  2. Scale-up: As the company grows, rapid scaling becomes the top priority. In addition to quick debugging, observability tools help prevent production issues as more customers are onboarded. They also assist in detecting scaling challenges, such as slow database queries. The ability to generate telemetry data easily from application code becomes essential.
  3. Corporate, enterprise, and mega-corp: At this level, observability is a mission-critical software quality. Downtime can cost millions of dollars per hour, leading to financial loss, reputational damage, and eroded customer trust. Observability helps prevent downtime by providing key signals for automated recovery actions (self-healing capabilities). Additionally, it enhances cost efficiency by offering insights into resource utilization (CPU, RAM, network bandwidth, etc.) across individual workloads.

Observability presents one major challenge for businesses: cost. The key is finding the right balance between cost and insight.

A common mistake companies make is becoming overly reliant on a single vendor, which can lead to vendor lock-in2. Depending on the observability provider, businesses may have limited control over their expenses. For example, if all logs are sent directly to a vendor, charges will apply for every log line and trace ingested into their system. In such cases, the only available cost-control mechanism may be adjusting the sampling rate, which reduces indexing costs but does not eliminate ingestion fees.

OpenTelemetry3 is a vendor-neutral framework for emitting metrics, traces, and logs. It supports head sampling4, allowing us to decide whether to sample or discard a trace or span before the data is ingested by a vendor.
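
A minimal sketch of what head sampling looks like with the OpenTelemetry JavaScript SDK (the 10% ratio here is illustrative):

import { TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base"
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node"

// Sample roughly 10% of traces. The decision is made when a trace starts
// (the "head"), so discarded traces never reach the vendor and never
// incur ingestion fees.
const provider = new NodeTracerProvider({
  sampler: new TraceIdRatioBasedSampler(0.1)
})

provider.register()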

Effect natively integrates with OpenTelemetry, providing built-in APIs for emitting logs5, metrics6, and traces7.
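
As a small illustration, all three types of telemetry can be produced from a single pipeline using only built-in APIs (a sketch; the metric name and log message are made up):

import { Effect, Metric } from "effect"

// A counter metric, incremented on each request
const requestCount = Metric.counter("request_count")

const handleRequest = Effect.gen(function* () {
  yield* Effect.log("handling request")   // log
  yield* Metric.increment(requestCount)   // metric
}).pipe(Effect.withSpan("handleRequest")) // trace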

In the industry, it is common to use separate libraries for each telemetry type. Additionally, different dependencies within a project may rely on different sub-dependencies for emitting telemetry data. Each dependency may also configure telemetry output differently — for example, some may emit unstructured logs, while others produce structured logs in various formats. Inconsistent telemetry data is more difficult to analyze. Furthermore, applications can end up with multiple redundant dependencies scattered throughout the dependency tree. This leads to:

  • An increased attack surface, making the system more vulnerable to security risks.
  • Additional boilerplate, complicating seamless telemetry integration.
  • A higher maintenance burden, adding unnecessary complexity and overhead.

Commonly used libraries include, for example, winston and pino for logs, prom-client for metrics, and dd-trace for traces.

In this article, I will focus on tracing to highlight the benefits of using Effect, but the same advantages apply across all types of telemetry.

Tracing

In my opinion, traces are the most universal form of telemetry. A single trace can reveal:

  • How long an operation takes
  • Which processes are executed within the operation
  • Which process is the bottleneck
  • Which process failed
  • And, thanks to trace-log correlation8, possibly why

Traces can also serve as the foundation for derived metrics, allowing us to visualize request and error rates, latency percentiles, and other key performance indicators. Given their depth of insight, it's no surprise that traces are often at the forefront of a company's observability strategy and the primary telemetry type used in Application Performance Monitoring (APM)9.
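
To make the point about trace-log correlation concrete: in Effect, logs emitted inside a span are associated with that span, so a failed span can carry its own explanation. A minimal, hypothetical sketch:

import { Effect } from "effect"

// The log below is emitted while the "persistArticle" span is active,
// so it is correlated with the span; if the span fails, the trace also
// shows what the code was doing at the time.
const persistArticle = Effect.gen(function* () {
  yield* Effect.log("inserting article row")
  // ... the actual database call would go here
}).pipe(Effect.withSpan("persistArticle"))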

Let's revisit our example from the previous post about creating an article. First, let's see how we would add manual instrumentation using a conventional tracing client.

const createArticle = (input: unknown, currentUser: User, tracer: Tracer) => {
  return tracer.trace("createArticle", { tags: { input, currentUser } }, async (span) => {
    try {
      authorize(currentUser, ["READ_CATEGORY", "WRITE_ARTICLE"], tracer)
      const { categoryId, ...article } = parseInput(input)
      await enforceCategoryExists(categoryId, tracer)
      await enforceArticleLimitInCategory(categoryId, tracer)
      const persistedArticle = await persistArticle(article, tracer)

      return { status: 201, article: persistedArticle }
    } catch (e) {
      if (e instanceof /* ??? */) {} // which error classes can reach this catch block? The type system can't tell us.
    }
  })
}

These are the key issues with this approach:

  1. Nested callbacks reduce readability

    • Wrapping the entire function body inside tracer.trace introduces deep nesting.
    • When overused, this reduces code readability, which is a crucial software quality that directly impacts developer productivity.
  2. Manual propagation of the tracer instance

    • We must explicitly pass the tracer instance to the instrumented function.
    • If we want all underlying actions to be instrumented, we must continue passing it down. In frontend development, this pattern is often called prop drilling10.

While this may not seem like a major issue in an isolated example, consider a large-scale codebase with hundreds or thousands of files. Repetitive manual instrumentation like this compounds over time, leading to reduced engineering productivity.

There’s a better way — the Effect way. 🚀

const createArticle = (input: unknown, currentUser: User) => Effect.gen(function* () {
  // happy path
}).pipe(
  Effect.withSpan("createArticle", { attributes: { input, currentUser } })
)

In Effect, manual instrumentation works through composition. We simply add a call to Effect.withSpan11 within the pipeline, following the happy path logic. Additionally, there's no need to manually pass a reference to a tracer — Effect maintains a reference to the tracer client for us, eliminating unnecessary boilerplate.

Let's visualize12 an example trace of executing createArticle.

A screenshot showing the createArticle span and all of its child spans, including the span attributes.

Notice that input and currentUser are listed under the attributes of the createArticle span. But what if we wanted currentUser to be added to all spans, not just the one being annotated? With the conventional approach, we would need to pass the tracer reference alongside every annotation. However, with Effect, we can simply use Effect.annotateSpans to achieve this seamlessly.

const createArticle = (input: unknown, currentUser: User) => Effect.gen(function* () {
  // happy path
}).pipe(
  Effect.withSpan("createArticle", { attributes: { input } }),
  Effect.annotateSpans({ currentUser }),
)

Run example

Here's the result after this change:

A screenshot showing the createArticle span and all of its child spans. It demonstrates that the currentUser annotation is present in all spans.

As you can see, currentUser is now present in all spans.

Why is this useful? Observability platforms index spans and their attributes, making them easy to search and filter. By annotating all spans with a user ID, for example, we can track user-specific patterns, identify performance bottlenecks affecting specific users, and uncover valuable clues while troubleshooting various issues.
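
For example, to index only the user's id rather than the entire object, we could annotate with a single key (a hypothetical variant of the example above):

const createArticle = (input: unknown, currentUser: User) => Effect.gen(function* () {
  // happy path
}).pipe(
  Effect.withSpan("createArticle", { attributes: { input } }),
  // annotate every span in the pipeline with just the user's id
  Effect.annotateSpans("userId", currentUser.id),
)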

Exporting Traces

In the previous examples, we demonstrated how traces are generated. The next step is to show how to export13 them.

Using the conventional approach, a tracer must be instantiated and passed down to all functions that require manual instrumentation. This often comes with critical constraints. For example, Datadog's Node.js tracing library must be imported and initialized before any other module for its auto-instrumentation feature to work14.

Almost no one gets this right on the first try, leading to long debugging sessions and a significant hit on developer productivity when setting up telemetry. This requirement also impacts project structure.

// tracer.ts
import tracer from "dd-trace"

export default tracer.init()

// index.ts
import tracer from "./tracer.js"

The tracer must be initialized in a separate file to prevent hoisting and ensure the correct load order. Then, in index.ts, the tracer must be imported before any other module.

An alternative approach is to initialize the tracer inside the main function and dynamically import the rest of the project, but this adds even more complexity and overhead.
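
Such a setup could look like this (a sketch; server.js and startServer are hypothetical placeholders):

// index.ts
import tracer from "dd-trace"

async function main() {
  // initialize the tracer before any instrumented module is loaded
  tracer.init()
  // only then dynamically import the rest of the project
  const { startServer } = await import("./server.js")
  await startServer()
}

void main()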

Unlike conventional approaches, Effect simplifies the setup process.

Exporting traces to the IDE

If you're a VS Code user, you can even view traces directly within your IDE. All you need to do is install the Effect DevTools15 extension and provide the DevTools layer as a dependency when executing your program.

import { DevTools } from "@effect/experimental"
import { NodeRuntime, NodeSocket } from "@effect/platform-node"
import { Effect, Layer } from "effect"

const DevToolsLayer = DevTools.layerWebSocket().pipe(
  Layer.provide(NodeSocket.layerWebSocketConstructor)
)

const program = Effect.void

program.pipe(
  Effect.provide(DevToolsLayer),
  NodeRuntime.runMain
)

The above example shows the simplest program (Effect.void) being run. The focal points of this example are:

  1. the construction of the DevToolsLayer dependency,
  2. its injection into the program using Effect.provide.

We will explore dependency injection in more detail in the upcoming post on testability.

The main takeaway is that Effect allows complete flexibility in structuring our projects. However, it is important to note that this is not a direct comparison to conventional approaches — Effect does not attempt to auto-instrument third-party libraries. Instead, we achieve the same benefits by embracing Effect-first libraries that generate telemetry for us.

This marks a fundamental shift in the JavaScript ecosystem and how libraries are designed. Traditionally, observability has been treated as an afterthought, leaving the responsibility of instrumentation to tracing libraries. This creates friction, as tracing libraries must continuously adapt to changes in the libraries they instrument. For full coverage, every library would require a dedicated auto-instrumentation module for each tracing client — Datadog, Sentry, OpenTelemetry, and others. This approach is difficult to scale and demands significant ongoing maintenance.

Effect challenges the status quo by flipping this responsibility. Instead of relying on external tracing libraries, each library takes full ownership of its own telemetry. Effect then provides a unified platform, ensuring that all telemetry data — whether traces, logs, or metrics — is exported consistently according to the application's configuration.

This approach is clearly beneficial for application developers, but what about library authors? Why should they take ownership of their library’s telemetry?

For one, it lowers the barrier to contribution. Open-source libraries used within commercial products often face challenges when debugging issues, as reproducing certain problems may require replicating part of the commercial environment — including sensitive customer data. However, if the library owns its telemetry, contributors can provide relevant telemetry samples in issue descriptions without exposing sensitive information. This can:

  • Help pinpoint the root cause more quickly.
  • Provide insights into how to reproduce the issue.
  • Reveal instrumentation gaps for further investigation.

Additionally, when a library owns its telemetry, users can more easily locate relevant sections of the source code for debugging, modification, or further exploration. This enhances transparency and improves the overall experience of contributing back to libraries.

One obvious challenge with this approach is cost control. Applications should have full control over which libraries emit telemetry and when. With Effect’s built-in configuration management, libraries can easily regulate their telemetry emission. Even if a library does not explicitly provide such configuration options, OpenTelemetry's head sampling can serve as a fallback solution, ensuring cost efficiency without sacrificing observability.
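
As an illustration of what such a configuration option could look like, a library might gate its span emission behind a config flag. Everything here, the flag name and the helper, is hypothetical:

import { Config, Effect } from "effect"

// A flag the library reads from the environment, defaulting to enabled
const tracingEnabled = Config.boolean("MYLIB_TRACING_ENABLED").pipe(
  Config.withDefault(true)
)

// Wraps an effect in a span only when tracing is enabled
const withOptionalSpan = (name: string) =>
  <A, E, R>(self: Effect.Effect<A, E, R>) =>
    Effect.gen(function* () {
      const enabled = yield* tracingEnabled
      return yield* (enabled ? Effect.withSpan(self, name) : self)
    })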

Exporting traces to a collector

Exporting traces to an OpenTelemetry collector is just as straightforward. Simply replace DevTools with NodeSdk and use the official OpenTelemetry libraries to configure the span processor.

import { NodeSdk } from "@effect/opentelemetry"
import { NodeRuntime } from "@effect/platform-node"
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base"
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"
import { Effect, Layer } from "effect"

const NodeSdkLayer = NodeSdk.layer(() => ({
  resource: { serviceName: "example" },
  spanProcessor: new BatchSpanProcessor(new OTLPTraceExporter())
}))

const program = Effect.void

program.pipe(
  Effect.provide(NodeSdkLayer),
  NodeRuntime.runMain
)

Additionally, we can switch between the two approaches based on the runtime environment:

// imports, DevToolsLayer, NodeSdkLayer

const RuntimeEnvConfig = Config.literal("development", "staging", "production")

const TelemetryLayer = Effect.gen(function* () {
  const env = yield* RuntimeEnvConfig("RUNTIME_ENV")

  return env === "development"
    ? DevToolsLayer
    : NodeSdkLayer
}).pipe(Layer.unwrapEffect)

// program

program.pipe(
  Effect.provide(TelemetryLayer),
  NodeRuntime.runMain
)

Summary

Let's recap the key takeaways about observability:

  • Systems generate external outputs — telemetry — such as traces, logs, and metrics.
  • Telemetry provides observability, allowing us to understand system behavior and state.
  • Observability helps businesses debug production issues, detect errors and performance bottlenecks before customers do, reduce costs through insights into resource utilization, and prevent costly downtime.
  • Businesses often generate large volumes of telemetry data, which can be expensive.
  • A common challenge is vendor lock-in, which exacerbates costs by making it difficult, time-consuming, and expensive to switch providers.
  • OpenTelemetry introduces vendor-neutral observability.
  • Effect natively integrates with OpenTelemetry, improving readability and eliminating prop drilling.
  • In the current JavaScript ecosystem, popular libraries are instrumented separately by each tracing client. This approach does not scale and creates maintenance overhead.
  • With Effect, libraries take full ownership of their telemetry, while Effect provides a unified platform that ensures consistent telemetry production across all dependencies. This eliminates the scalability and maintenance challenges in both the ecosystem and our applications.
  • Effect enables developers to visualize telemetry data directly within the IDE. This improves developer productivity because telemetry data can be verified before changes are deployed to an environment with an OTEL collector.

Observability is a core feature of Effect, not an afterthought. Combined with its robust error management capabilities, it provides a comprehensive approach to error detection and prevention.

In the previous post, I briefly touched on dependency management. In this post, I explained how Effect maintains an internal reference to the tracer, which is closely tied to its dependency management system. In the next post, we’ll explore testability, diving deeper into how Effect handles dependencies and the advantages this approach offers to both application and library developers.


  1. What is observability? 

  2. Vendor lock-in 

  3. OpenTelemetry 

  4. Head sampling 

  5. Effect logging 

  6. Effect metrics 

  7. Effect tracing 

  8. OpenTelemetry log correlation 

  9. Application Performance Monitoring 

  10. What is prop drilling in React? 

  11. Effect.withSpan 

  12. Visualizing traces 

  13. Trace exporters 

  14. Datadog: Import and initialize the tracer 

  15. Effect DevTools 
