Ilya

Posted on • Originally published at ilya-chumakov.com

Nested validation in .NET

In this blog's opening post, I discuss the problem of validating nested Data Transfer Objects in modern .NET. Nesting simply means that the root object can reference other DTOs, which in turn can reference others and so on, potentially forming a cyclic graph of unknown size. For each node in the graph, its data properties are validated against a quite typical rule set: nullability, range, length, regular expressions etc.
And for DTO types, let's declare the following conventions:

  • It may have DataAnnotation attributes, including custom ones.
  • It may implement IValidatableObject.
  • It should avoid third-party dependencies if possible.

You may have guessed that the graph is the tricky part. Indeed, the built-in DataAnnotations Validator doesn't do nested validation by design, and this has been the default behaviour for decades. But the fix is trivial, right? Just implement any kind of graph traversal with cycle detection! Well, yes and no. In this post, I compare popular third-party libraries that support nested validation. Looking ahead, there is a big performance difference even among robust production-ready solutions.
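To make the limitation concrete, here is a minimal sketch of the built-in validator ignoring a nested object. The Order and Customer types and the BuiltInValidatorDemo helper are hypothetical names invented for this illustration; only Validator.TryValidateObject and the attributes come from System.ComponentModel.DataAnnotations.

```csharp
#nullable enable
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical DTOs, for illustration only.
public class Order
{
    [Required]
    public Customer? Customer { get; set; }
}

public class Customer
{
    [Required]
    public string? Email { get; set; }
}

public static class BuiltInValidatorDemo
{
    // Runs the built-in validator against the root object and returns its errors.
    public static List<ValidationResult> Validate(object root)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(
            root, new ValidationContext(root), results, validateAllProperties: true);
        return results;
    }
}
```

Calling BuiltInValidatorDemo.Validate(new Order { Customer = new Customer() }) reports no errors: the [Required] check on Customer passes because the reference is not null, and the missing Email inside it is never inspected.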

There are many ways to define validation rules in .NET, each with its own advantages and disadvantages. For example:

  • Attributes: explicit, useful for OpenAPI document generation.
  • IValidatableObject: more flexible yet still self-contained.
  • External: This is a jack of all trades. It leaves DTOs clean and provides maximum flexibility (FluentValidation is the best example of this approach).
  • Manual validation: the most naive approach, it simply has inlined if clauses without declaring validation rules at all. As a result, it gives unbeatable performance at the cost of scalability, and it doesn't apply to a graph of unknown length/topology. Later it is used as a benchmark baseline.

To finish this long intro and save everyone's time, let me highlight what is not covered in this article:

  • ASP.NET Model Validation. Although it comes with full support for DataAnnotations attributes, it is still an inseparable part of a large and complex framework that deals with both server-side applications and Web APIs, ModelState, backward compatibility between versions, etc. It is a topic that undoubtedly deserves its own article.
  • IOptions<T> validation. Ironically, with the arrival of [ValidateObjectMembers] and [ValidateEnumeratedItems] in .NET 8, OptionsBuilder<TOptions> now supports validation of nested options. And there are now at least 3 different validation algorithms shipped with ASP.NET.

What is validation?

Let's say we're processing a user's registration email address. What should we check?

  • The address should be in the correct format. This is validation.
  • The address domain should not be on our blacklist. This is a business rule.
  • The address should be unique in our database. This is a business rule.

What is the difference? Validation is a pure function. It is deterministic (same input - same output) and has no side effects. That's why looking for a domain in a list is not validation: such lists are subject to change, so they're not deterministic. A good rule of thumb for mere enterprise developers like me:

  • Validation: self-contained (we only need the data from the DTO itself)
  • Business rule: anything that touches mutable data (database, API, file system etc.)
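As a rough sketch of this split, the two kinds of email checks could be kept in separate types. EmailValidation, RegistrationRules and the isDomainBlocked delegate are hypothetical names used only for this example; the delegate stands in for whatever mutable storage holds the blacklist.

```csharp
#nullable enable
using System;
using System.ComponentModel.DataAnnotations;

public static class EmailValidation
{
    // Validation: a pure function of its input, same answer on every call.
    public static bool HasValidFormat(string? email) =>
        !string.IsNullOrWhiteSpace(email) && new EmailAddressAttribute().IsValid(email);
}

public sealed class RegistrationRules
{
    // Assumption: the lookup hits mutable state (database, API, file), so results can change over time.
    private readonly Func<string, bool> _isDomainBlocked;

    public RegistrationRules(Func<string, bool> isDomainBlocked) =>
        _isDomainBlocked = isDomainBlocked;

    // Business rule: expects an address that has already passed format validation.
    public bool IsDomainAllowed(string email)
    {
        string domain = email[(email.LastIndexOf('@') + 1)..];
        return !_isDomainBlocked(domain);
    }
}
```

The first type needs no setup in tests, while the second needs a stub for the lookup, which is exactly the testing difference described below.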

And my advice is: don't mix them up. Validate your input before the control flow even reaches your business domain, just like ASP.NET does with model binding. Regardless of the application architecture, in many cases you actually want to fail fast on invalid or malicious input and avoid unnecessary allocation of your scoped and transient services. Then there is testing: covering pure functions with tests is trivial. Well, at least it is way easier to do separately than mocking a database and a couple of APIs for an all-at-once validator. Put some effort into the quality of the data coming into your domain, and you'll get clearer and more concise domain logic.

To go deeper, please read Mark Seemann's Validation and business rules post, discussing the topic in great detail. Let me say a few things about the libraries under consideration, and we can finally get on with the benchmarking.

DataAnnotationsValidator

Our first contender is the DataAnnotationsValidator.NETCore package. It is long dead and has performance issues, so I strongly recommend against using it. However, this library illustrates well the idea behind many home-made solutions:

  • Reflection to read metadata.
  • Recursive depth-first search for traversing a graph.
  • A hash set for cycle detection.
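The three ingredients above can be sketched in a few dozen lines. This is not the source of the DataAnnotationsValidator.NETCore package, just a naive illustration of the same idea; Node and NaiveGraphValidator are hypothetical names, and the leaf-type check is a deliberate simplification.

```csharp
#nullable enable
using System;
using System.Collections;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Reflection;

// Hypothetical DTO used only to exercise the sketch; Next can form a cycle.
public class Node
{
    [Required]
    public string? Name { get; set; }
    public Node? Next { get; set; }
    public List<Node> Children { get; } = new();
}

public static class NaiveGraphValidator
{
    public static List<ValidationResult> Validate(object root)
    {
        var results = new List<ValidationResult>();
        // Reference equality, so DTOs that override Equals don't confuse cycle detection.
        var visited = new HashSet<object>(ReferenceEqualityComparer.Instance);
        Visit(root, results, visited);
        return results;
    }

    // Recursive depth-first search over the object graph.
    private static void Visit(object? node, List<ValidationResult> results, HashSet<object> visited)
    {
        if (node is null || IsLeaf(node.GetType()) || !visited.Add(node)) return;

        // Validates attributes and IValidatableObject on this node only.
        Validator.TryValidateObject(node, new ValidationContext(node), results, validateAllProperties: true);

        // Reflection on every visit: simple, but slow without caching.
        foreach (PropertyInfo property in node.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!property.CanRead || property.GetIndexParameters().Length > 0) continue;
            if (IsLeaf(property.PropertyType)) continue;

            object? value = property.GetValue(node);
            if (value is IEnumerable items)
            {
                foreach (object? item in items) Visit(item, results, visited);
            }
            else
            {
                Visit(value, results, visited);
            }
        }
    }

    // Simplification: value types and strings are treated as leaves.
    private static bool IsLeaf(Type type) => type.IsValueType || type == typeof(string);
}
```

Note that the reported member names carry no path prefix (such as Children[3].Name), so a real implementation would also need to track the current path while descending.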

MiniValidation

Alive and well-designed, MiniValidation offers a smooth experience in nested validation. While implementing a similar depth-first search for visiting a DTO graph, it adds metadata caching to the mix, resulting in much better performance.
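The general idea of metadata caching can be sketched as follows. This is an assumption about the technique in general, not MiniValidation's actual code; PropertyCache and GetNestedCandidates are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

public static class PropertyCache
{
    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache = new();

    // Reflection runs once per type; later calls return the cached array.
    public static PropertyInfo[] GetNestedCandidates(Type type) =>
        Cache.GetOrAdd(type, static t => t
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead
                        && p.GetIndexParameters().Length == 0
                        && !p.PropertyType.IsValueType
                        && p.PropertyType != typeof(string))
            .ToArray());
}
```

A traversal would then call PropertyCache.GetNestedCandidates(node.GetType()) instead of GetProperties on every visited node, paying the reflection cost once per DTO type rather than once per object.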

FluentValidation

FluentValidation is undoubtedly the most popular third-party validation library on .NET. It is a robust choice if you need clean POCOs or multiple validation maps per type. However, its performance may surprise you.

Benchmark: DataAnnotation and FluentValidation

Our first benchmark is to validate a fairly typical DataAnnotation-marked DTO, containing both a single nested object and a collection of them (each is expected to be validated):

public class Parent
{
    [Range(1, 9999)]
    public int Id { get; set; }

    [Required(AllowEmptyStrings = false)]
    [StringLength(12, MinimumLength = 12)]
    public string? Name { get; set; }

    [Required]
    public Child? Child { get; set; }

    [Required]
    public List<Child> Children { get; init; } = new(0);
}

public class Child : IChild
{
    [Required]
    public DateTime? ChildCreatedAt { get; set; }

    [AllowedValues(true)]
    public bool ChildFlag { get; set; }
}

Of course, FluentValidation has no use for these attributes, so its validators are created separately while repeating the same rules:

public class ParentValidator : AbstractValidator<Parent>
{
    public ParentValidator()
    {
        RuleFor(x => x.Id).InclusiveBetween(1, 9999);
        RuleFor(x => x.Name).NotEmpty().Length(min: 12, max: 12);
        RuleFor(x => x.Child).NotNull().SetValidator(new ChildValidator());
        RuleForEach(x => x.Children).NotNull().SetValidator(new ChildValidator());
    }
}
public class ChildValidator : AbstractValidator<Child>
{
    public ChildValidator()
    {
        RuleFor(x => x.ChildCreatedAt).NotNull();
        RuleFor(x => x.ChildFlag).Equal(true);
    }
}

Finally, the Manual benchmark uses explicit if checks and serves as a baseline. Each benchmark is run against the same Parent collection. Here are the results for different collection sizes:

| Method | Size | Mean | Allocated | Alloc Ratio |
|---|---|---|---|---|
| Manual | 100 | 3 μs | 34 KB | 1 |
| MiniValidation | 100 | 162 μs | 427 KB | 12 |
| DataAnnotationsValidator | 100 | 302 μs | 614 KB | 17 |
| FluentValidation | 100 | 314 μs | 946 KB | 27 |
| Manual | 1000 | 33 μs | 343 KB | 1 |
| MiniValidation | 1000 | 1586 μs | 4260 KB | 12 |
| DataAnnotationsValidator | 1000 | 3084 μs | 6150 KB | 17 |
| FluentValidation | 1000 | 3300 μs | 9586 KB | 27 |
| Manual | 10000 | 342 μs | 3437 KB | 1 |
| MiniValidation | 10000 | 16237 μs | 42619 KB | 12 |
| DataAnnotationsValidator | 10000 | 31223 μs | 61480 KB | 17 |
| FluentValidation | 10000 | 32364 μs | 95911 KB | 27 |

Well, DataAnnotationsValidator is expectedly bad, but FluentValidation... is even worse in both time and space! At first I thought there was a bug (there was not). Then I did my best to look for FluentValidation settings that might help to optimise its performance (there weren't any, except "fail fast", see below). The overall result distribution remains the same.
But look at MiniValidation! The same algorithm, but optimised for performance, gives a quite impressive 2x boost over DataAnnotationsValidator.
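For reference, a Manual baseline of this kind boils down to inlined checks. The sketch below is my assumption of what such code could look like, not the benchmark's actual source (that lives in the repository linked at the end); the DTOs are trimmed, attribute-free copies of the ones above, and ManualValidator is a hypothetical name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Trimmed copies of the article's DTOs, without attributes.
public class Child
{
    public DateTime? ChildCreatedAt { get; set; }
    public bool ChildFlag { get; set; }
}

public class Parent
{
    public int Id { get; set; }
    public string? Name { get; set; }
    public Child? Child { get; set; }
    public List<Child> Children { get; init; } = new(0);
}

public static class ManualValidator
{
    // Returns the paths of the members that failed a check.
    public static List<string> Validate(Parent parent)
    {
        var errors = new List<string>();

        if (parent.Id is < 1 or > 9999) errors.Add("Id");
        if (string.IsNullOrWhiteSpace(parent.Name) || parent.Name.Length != 12) errors.Add("Name");

        if (parent.Child is null) errors.Add("Child");
        else ValidateChild(parent.Child, "Child", errors);

        for (int i = 0; i < parent.Children.Count; i++)
        {
            ValidateChild(parent.Children[i], $"Children[{i}]", errors);
        }

        return errors;
    }

    private static void ValidateChild(Child? child, string path, List<string> errors)
    {
        if (child is null) { errors.Add(path); return; }
        if (child.ChildCreatedAt is null) errors.Add($"{path}.ChildCreatedAt");
        if (!child.ChildFlag) errors.Add($"{path}.ChildFlag");
    }
}
```

There is no reflection, no metadata and no graph traversal here, which explains the gap in the table, but every new DTO or rule means more hand-written code.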

Benchmark: IValidatableObject

As you probably know, IValidatableObject is an alternative to explicit DataAnnotations attributes, with all the validation logic encapsulated within DTOs. This benchmark uses the same validation rules, but implemented in the Validate method, so it's all about traversing a graph and calling Validate at each node. FluentValidation is not on the list this time.

public class ChildValidatableObject : IValidatableObject
{
    public DateTime? ChildCreatedAt { get; set; }
    public bool ChildFlag { get; set; }

    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        if (ChildCreatedAt == null)
        {
            yield return new ValidationResult("foo error message #2", 
                new[] { nameof(ChildCreatedAt) });
        }

        if (ChildFlag == false)
        {
            yield return new ValidationResult("foo error message #3", 
                new[] { nameof(ChildFlag) });
        }
    }
}

| Method | Size | Mean | Allocated | Alloc Ratio |
|---|---|---|---|---|
| Manual with IVO.Validate call | 100 | 21 μs | 109 KB | 1.00 |
| MiniValidation + IVO | 100 | 59 μs | 199 KB | 1.82 |
| DataAnnotationsValidator + IVO | 100 | 151 μs | 442 KB | 4.04 |
| Manual with IVO.Validate call | 1000 | 206 μs | 1093 KB | 1.00 |
| MiniValidation + IVO | 1000 | 565 μs | 1992 KB | 1.82 |
| DataAnnotationsValidator + IVO | 1000 | 1511 μs | 4421 KB | 4.04 |
| Manual with IVO.Validate call | 10000 | 2141 μs | 10937 KB | 1.00 |
| MiniValidation + IVO | 10000 | 6608 μs | 19921 KB | 1.82 |
| DataAnnotationsValidator + IVO | 10000 | 16254 μs | 44219 KB | 4.04 |

Again, MiniValidation wins by an even larger margin. Now let's merge the results and look at the overall performance (values rounded for readability):

| Method | Size | Mean | Allocated |
|---|---|---|---|
| Manual | 10000 | 342 μs | 3437 KB |
| Manual with IVO.Validate call | 10000 | 2141 μs | 10937 KB |
| MiniValidation + IVO | 10000 | 6608 μs | 19921 KB |
| MiniValidation | 10000 | 16237 μs | 42619 KB |
| DataAnnotationsValidator + IVO | 10000 | 16254 μs | 44219 KB |
| DataAnnotationsValidator | 10000 | 31223 μs | 61480 KB |
| FluentValidation | 10000 | 32364 μs | 95911 KB |

You may notice that MiniValidation + IValidatableObject gives the best results of all the third-party libraries.

Benchmark: Fail fast

And yet FluentValidation has a feature that the other competitors lack: CascadeMode.Stop. It's flexible and can be set at different levels (rule, class, global):

public class FailfastChildValidator : AbstractValidator<Child>
{
    public FailfastChildValidator()
    {
        ClassLevelCascadeMode = CascadeMode.Stop;
        //All the rules are declared as usual
        //...
    }
}

| Method | Size | Mean | Allocated |
|---|---|---|---|
| FluentValidation + Fail Fast | 10000 | 9012 μs | 38556 KB |
| FluentValidation | 10000 | 32364 μs | 95911 KB |

Of course, the fail-fast version is much faster. Most of the time I prefer the full validation report, but fail-fast is an option worth mentioning when talking about performance.

Summary

In this post I discussed the problem of validating nested objects in .NET. Since the built-in DataAnnotations validator doesn't traverse complex properties, we have to rely on third-party libraries for this. I explained the difference between validation and business rules, and why this is important.

As for the benchmark results:

  • The MiniValidation library shows the best overall performance.
  • FluentValidation, despite its popularity, is generally about 2x slower than MiniValidation. There are some faster alternatives, such as Validot, but I would like to leave the burden of benchmarking to its maintainers.

And don't get me wrong. If you want to decouple your rules from DTOs and get a simple, stable and production-tested solution, just take FluentValidation, because its performance difference is negligible in many cases. If you need self-describing DTOs, stick with MiniValidation. And for performance-driven code, inline your checks where possible.

The obvious next step in the development of general-purpose validation libraries is, of course, the adoption of Roslyn source generators. A validation generator such as this one could potentially eliminate the performance gap between general-purpose validation libraries and inlined validation. In fact, all the necessary technology already ships with .NET, so stay tuned for news!

All the code from the article is available on GitHub: https://github.com/ilya-chumakov/PaperSource.DtoGraphValidation.
