I've recently been helping out with a support request from a client and it prompted me to write about logging and error handling. I noticed that the logs for this application were of no help to me and so I set out to make some quick improvements. It made me think about error handling and logging and the things that I have learned over the years.
In this blog I'll be sharing a list of things you can do to improve your error handling and logging. This list is very opinionated and some of these topics may resonate more with you than other things. Through a series of code snippets and small refactors, I'm going to improve error handling and explain why the refactor is an improvement.
The service connector
We're going to start with a small snippet of code, consisting of two methods:
public void ProcessBusinesses()
{
var data = GetFromExternalSource();
if (data.Length == 0) return;
SaveData(data);
}
private string GetFromExternalSource()
{
try
{
var response = _httpClient.GetAsync("api/businesses").Result;
if (!response.IsSuccessStatusCode) return null;
return response.Content.ReadAsStringAsync().Result;
}
catch (Exception e)
{
_logger.LogError(e, "Something went wrong while fetching data from external service");
return null;
}
}
In this code snippet, we may observe the following flaws:
- Inproper use of async/await
- Inconsistent error flows
- Error swallowing
- Insufficient management of expectations
This snippet of code is responsible for logs like this:
Inproper use of async/await
The code snippet makes several calls to .Result
. This is a bad practice for multiple reasons: your code may experience deadlocks and failures are obscured with AggregateException
s. Instead of calling .Result
, we should make our methods async, like this:
// ๐ Replace return type with 'async Task'
public async Task ProcessBusinessesAsync()
{
// ๐ await the result from external source
var data = await GetFromExternalSourceAsync();
if (data.Length == 0) return;
await SaveDataAsync(data);
}
// ๐ Replace return type with 'async Task<string>'
private async Task<string> GetFromExternalSourceAsync()
{
try
{
// ๐ await the task from 'GetAsync'
var response = await _httpClient.GetAsync("api/businesses");
if (!response.IsSuccessStatusCode) return null;
// ๐ await the task from 'ReadAsStringAsync'
return await response.Content.ReadAsStringAsync();
}
catch (Exception e)
{
_logger.LogError(e, "Something went wrong while fetching data from external service");
return null;
}
}
This is better for the following reasons:
- Without calls to
.Result
, you should have a reduced risk of deadlocks - Should
.GetAsync
or.ReadAsStringAsync
throw an exception, it will no longer be wrapped in anAggregateException
when it's caught. You'll get more clear errors in your logs.
But what if you really can't make the public method async for some reason? I recommend that you go async as far as possible, but make sure that exceptions cannot propagate from an async context into a sync context. That is: a call to .Result
should never throw an exception. You should use a different communication device to communicate failures. More on that later.
Inconsistent error flows and error swallowing
The code sample uses an HttpClient
. Connections to external resources may fail for several reasons. What happens when the SSL certificate isn't valid on the external resource? What happens if the external resource returns an error with HTTP 500 statuscode? We can improve the code snippet by using a consistent error flow and immediately also reduce the complexity like this:
private async Task<string> GetFromExternalSourceAsync()
{
try
{
var response = await _httpClient.GetAsync("api/businesses");
// ๐ replace 'IsSuccessStatusCode' with '.EnsureSuccessStatusCode()'
response.EnsureSuccessStatusCode();
return await response.Content.ReadAsStringAsync();
}
catch (Exception e)
{
_logger.LogError(e, "Something went wrong while fetching data from external service");
return null;
}
}
This is better for the following reasons:
- An HTTP 500 response from the resource now results in an exception and will therefore be logged. We are no longer at risk of accidentally swallowing the error.
- Errors are now handled in a single flow, reducing the complexity of this code.
Insufficient management of expectations
I've written before about managing expectations with nullable reference types. You might find it interesting:
Let's take a look at the signature of GetFromExternalSource
:
private async Task<string> GetFromExternalSourceAsync()
The method does not tell anything about how it deals with errors. We can therefore only assume that whatever the method returns is part of the happy flow. I prefer to deal with this by either handling the error or rethrowing.
Rethrow the error
The easiest way to improve is by just rethrowing the error by changing the code as follows:
private async Task<string> GetFromExternalSourceAsync()
{
try
{
var response = await _httpClient.GetAsync("api/businesses");
response.EnsureSuccessStatusCode();
return await response.Content.ReadAsStringAsync();
}
catch (Exception e)
{
_logger.LogError(e, "Something went wrong while fetching data from external service");
// ๐ replace 'return null' with 'throw'
throw;
}
}
By doing this, we gain the following benefits:
- The method only returns data in the happy flow
- Consumer code can not obscure the error with transient failures
Handle the error
Alternatively, you may choose to deal with the error instead of rethrowing it. You might want to do this if your code has no access or knowledge about your logging solution for example. In the example, we ideally handle the error in the consumer code. To improve our code, we first need to introduce a communication device for failures:
public class Result
{
private readonly Exception _exception;
public const string InvalidOperationTemplate = "Cannot get {0} when {1} is {2}.";
protected Result(bool success, Exception exception = null)
{
Success = success;
_exception = exception;
}
public bool Success { get; }
public Exception Exception => !Success ? _exception : throw new InvalidOperationException(string.Format(InvalidOperationTemplate, nameof(Exception), nameof(Success), true));
public static Result CreateSuccess()
=> new Result(true);
public static Result CreateFailure(Exception e)
=> new Result(false, e);
public static Result<TValue> CreateSuccess<TValue>(TValue value)
=> new Result<TValue>(true, value);
public static Result<TValue> CreateFailure<TValue>(Exception e)
=> new Result<TValue>(false, default, e);
public static async Task<Result<TValue>> ExecuteSafelyAsync<TValue>(Func<Task<TValue>> func)
{
if (func == null) throw new ArgumentNullException(nameof(func));
try
{
return CreateSuccess(await func());
}
catch (Exception e)
{
return CreateFailure<TValue>(e);
}
}
public static async Task<Result> ExecuteSafelyAsync(Func<Task> func)
{
if (func == null) throw new ArgumentNullException(nameof(func));
try
{
await func();
return CreateSuccess();
}
catch (Exception e)
{
return CreateFailure(e);
}
}
}
public class Result<TValue> : Result
{
private readonly TValue _value;
internal Result(bool success, TValue value, Exception exception = null)
: base(success, exception)
{
if (success && value.Equals(default(TValue))) throw new ArgumentNullException(nameof(value));
_value = value;
}
public TValue Value => Success ? _value : throw new InvalidOperationException(string.Format(InvalidOperationTemplate, nameof(Value), nameof(Success), false), Exception);
}
The above code snippet introduces a Result
class. This class can either hold a return value or an exception. It also holds a .Success
property from which we can read whether or not the operation succeeded. This type also contains a convenience utility ExecuteSafelyAsync
, which automatically returns the result of a method as an instance of Result
and automatically catches errors for you so you don't have to do that yourself everywhere.
We might then update GetFromExternalSourceAsync
as follows:
// ๐ Instead of 'string', we return 'Result<string>'
private Task<Result<string>> GetFromExternalSourceAsync()
{
// ๐ Instead of try/catch, we wrap the method inside a 'Result' object
return Result.ExecuteSafelyAsync(() =>
{
var response = await _httpClient.GetAsync("api/businesses");
response.EnsureSuccessStatusCode();
return await response.Content.ReadAsStringAsync();
});
}
Finally, we might change the consumer code to deal with the error:
public async Task ProcessBusinessesAsync()
{
var result = await GetFromExternalSourceAsync();
if (!result.Success)
{
DealWithError(result.Exception);
return;
}
if (result.Value.Length == 0) return;
await SaveDataAsync(result.Value);
}
Now if you're really fancy, you could even apply the monad pattern (with some artistic liberty) and change the code to look something like this:
public async Task<Result> ProcessBusinessesAsync()
{
// ๐ Get the initial result object
var externalResult = await GetFromExternalSourceAsync();
// ๐ Using the monad pattern, you can either continue execution or handle the error
var saveResult = await externalResult.BindAsync(SaveDataAsync, LogErrorAndThrow);
// ๐ The monad pattern allows you to specify how you want to handle failure for each step.
var emailResult = await saveResult.BindAsync(SendEmailAsync, LogErrorAndContinue);
return emailResult;
}
The implementation of the monad pattern is not specified here. You can read more about monads in this article.
If you choose this method, you might benefit like this:
- The signature of
GetFromExternalSourceAsync
communicates potential failure clearly - Consumer code has freedom to deal with errors however they want
- Inproper use of the
Result
object causes an exception that encapsulates the original error, so it takes more effort to hide the failure
Conclusion
We've seen that improper handling of failure may obscure errors. By properly using async/await, using consistent error flows and managing expectations, we can generate more relevant error logs and more dynamically handle failure. After several small refactors, the same failures now generate the following logs:
That is a great improvement, in my opinion.
The API endpoint
We're moving on to a new example snippet. The following snippet is an action on an API controller endpoint:
[HttpPost]
public ActionResult UpdateCalendar(CalendarItem item)
{
try
{
bool success = _calendarService.Update(item);
if (!success)
{
throw new InvalidOperationException($"Calendar item {item.Name} cannot be updated");
}
return Ok();
}
catch (Exception e)
{
_logger.LogError(e);
return new HttpStatusCodeResult(HttpStatusCode.InternalServerError);
}
}
In this code snippet, we may observe the following flaws:
- Logging of potentially sensitive data
- Usage of errors in happy flows
- Obscuring of failure in tracing
- Not fully using the framework
- Not logging for reproduction
This snippet of code may produce logs that look like this:
Logging of potentially sensitive data
Logging personal information is considered a security risk and should therefore be avoided. If your logs accidentally leak, somebody might be able to use the logs to obtain personal information. As you can see in the log screenshot: We are able to see what sort of calendar items users have. We can fix this by simply replacing the name of the calendar item with its ID:
// ๐ Notice that 'item.Name' is replaced with 'item.Id'
throw new InvalidOperationException($"Calendar item {item.Id} cannot be updated");
Not only does this make your logs more secure, it also makes it easier for yourself to identify which exact calendar item in your database is problematic.
Usage of errors in happy flows
Notification fatigue is when you get so many notifications that you are no longer able to distinguish what is important and what isn't. If you throw too many errors and your error graph looks like this:
Then knowing what really needs attention and what doesn't becomes really difficult. It is therefore important to prevent exceptions in happy flows.
In our case, we are missing an important step in this controller: model validation. If we add model validation, we can prevent users from doing invalid operations and potentially significantly reduce the amount of useless errors in our logs:
[HttpPost]
public ActionResult UpdateCalendar(CalendarItem item)
{
// ๐ Ensure there is body content, because modelstate can be true if the body is empty
if (item == null)
return BadRequest("Body content is required");
// ๐ Add model validation to reduce failures due to malformed input
if (!ModelState.IsValid)
return BadRequest(ModelState);
try
{
bool success = _calendarService.Update(item);
if (!success)
{
throw new InvalidOperationException($"Calendar item {item.Id} cannot be updated");
}
return Ok();
}
catch (Exception e)
{
_logger.LogError(e);
return new HttpStatusCodeResult(HttpStatusCode.InternalServerError);
}
}
This change will give you the following benefits:
- The amount of error logs might significantly reduce
- The
BadRequest
return type automatically ensures that your clients receive a semantically correct HTTP 400 response.
Not fully using the framework and obscuring failures in traces
A service like Azure Application Insights can give you detailed insight in failures in your application. Our example code, however, does not make good use of the framework. Application Insights marks any response with HttpStatusCodeResult
as successful, so even if our code were to raise errors, your Application Insights wouldn't be able to tell you. The code may be improved by simply removing the try/catch statement:
[HttpPost]
public ActionResult UpdateCalendar(CalendarItem item)
{
if (item == null)
return BadRequest("Body content is required");
if (!ModelState.IsValid)
return BadRequest(ModelState);
// ๐ The try/catch is removed
bool success = _calendarService.Update(item);
if (!success)
{
throw new InvalidOperationException($"Calendar item {item.Id} cannot be updated");
}
return Ok();
}
This change makes your application better in several ways:
- The framework knows how to deal with exceptions and can return a semantically correct HTTP 500 response automatically
- Your exceptions will show up in tracing software as a failure, instead of as a success
- Your framework might automatically forward unhandled exceptions to your logging framework, so error logging in your entire application is handled in one single location, rather than for each endpoint individually.
Not logging for reproduction
You may notice that the code throws a generic exception when success is false. An exception like that obscures the actual problem and makes it more difficult for you to find out the underlying cause of the issue. The application might be improved like this:
[HttpPost]
public ActionResult UpdateCalendar(CalendarItem item)
{
if (item == null)
return BadRequest("Body content is required");
if (!ModelState.IsValid)
return BadRequest(ModelState);
// ๐ The success variable is removed and the calendar service is modified to throw exceptions upon failure
_calendarService.Update(item);
return Ok();
}
This change makes the code better for several reasons:
- The action method is more simple
- You've potentially reduced the amount of error logs by removing an exception that does not tell you what is wrong
- Exceptions from the service layer provide more accurate details on the nature of the failure and are therefore more reproducable
Conclusion
Making good use of the tools inside your framework allows you to write more straightforward code and takes the work of error handling out of your hands. After these refactor steps, our logs might look more like this:
The visible benefits have convinced me that understanding the framework is a worthwhile exercise.
Additional tips
Reduce clutter in Application Insights
Without configuration, I personally find that the failure overview of Application Insights doesn't really convey the right message. Especially on public websites, the amount of 404 not found results is very high. I prefer to mark 404 Not Found results as success in application insights.
I also discovered that Umbraco uses HTTP 402 to signal the frontend of the backoffice that 2fa validation is required. I do not need to see this in Application Insights, because it is expected behaviour and it clutters the view. For that reason, I wrote a small component that marks these responses as success in application insights:
public class IgnoreClientErrorsTelemetryInitializer
: ITelemetryInitializer
{
public void Initialize(ITelemetry telemetry)
{
switch (telemetry)
{
/* Why is 404 response set as success?
*
* 404 responses have a multitude of causes, but 99.99% of them are not relevant to us.
* Application insights however, still registers them as failures, obscuring more relevant errors in the logs.
* By marking 404 responses as success, we no longer see not-found pages among "failures"
*/
case RequestTelemetry request when request.ResponseCode == "404":
request.Success = true;
break;
/* Why is 402 response set as success?
*
* HTTP 402 stands for "Payment required". Umbraco deliberately uses this statuscode to signal
* the frontend that a 2fa verification is required.
* Since this is expected behaviour and we're not interested in this as a "failure",
* we do not wish to see it among the "failures" in application insights
*/
case RequestTelemetry request when request.ResponseCode == "402":
request.Success = true;
break;
}
}
}
Feel free to use it.
It's OK to ignore certain errors
Raygun allows me to ignore errors if I think they're not relevant. Especially when bots scan an application, it tends to cause a lot of errors that are not really relevant for me. I ignore them so I don't see them in my overview anymore. That allows me to give my attention to the errors that actually impact my users.
Final thoughts
Good error handling and logging is a difficult challenge. Putting effort into improving your logs is in my opinion a good exercise though as it helps you understand where the weaknesses are in your application and allows you to quickly respond to issues. I continue to learn new things about error handling and logging and still hope to grow more.
That is all I wanted to share. Thank you very much for reading and I'll see you in my next blog! ๐
Top comments (0)