Code Generating Code

Code that generates code can sometimes be difficult to follow, but it provides huge benefits in saving time and making your end code more maintainable. Today I want to talk a little bit about when and why you would use code that generates code, and what your options are in C#.

There are a few common scenarios where code generating code can be a powerful solution. One is repeating similar logic, such as method signatures. Because C# has no variadic generics, every generic type parameter must be declared explicitly, so a common pattern is to offer the same method as a series of overloads with an increasing number of type parameters. For instance, I recently wrote a utility that generates strong hash codes from the variables on a type. The method signatures look like this:
int Combine<T1, T2>(T1 value1, T2 value2)
all the way up to
int Combine<T1, ... T16>(T1 value1, ... T16 value16)
I could have written this instead as
int Combine(params object[] values)
which I also created for more than 16 parameters, but I wanted to avoid unnecessary allocations in common scenarios (the params array itself, plus boxing of any value types). By using code generation I was able to write this method once and have it generate the other 15 iterations of it automagically.
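The generator for those overloads can be quite small. The following is a minimal sketch, not the utility described above: the names CombineOverloadWriter and HashUtility are hypothetical, and the 17/31 multiply-and-add hash in the emitted body is an assumed placeholder for whatever combining logic you prefer.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: builds the source text for Combine<T1..Tn> overloads.
public static class CombineOverloadWriter
{
    public static string Generate(int minArity, int maxArity)
    {
        var sb = new StringBuilder();
        sb.AppendLine("public static partial class HashUtility");
        sb.AppendLine("{");

        for (int arity = minArity; arity <= maxArity; arity++)
        {
            int[] indices = Enumerable.Range(1, arity).ToArray();

            // "T1, T2, ..." and "T1 value1, T2 value2, ..."
            string typeParams = string.Join(", ", indices.Select(i => $"T{i}"));
            string parameters = string.Join(", ", indices.Select(i => $"T{i} value{i}"));

            sb.AppendLine($"    public static int Combine<{typeParams}>({parameters})");
            sb.AppendLine("    {");
            sb.AppendLine("        unchecked");
            sb.AppendLine("        {");
            sb.AppendLine("            int hash = 17; // assumed seed, for illustration only");
            foreach (int i in indices)
            {
                // One combining step per parameter; calling GetHashCode on a
                // generic parameter does not box value types.
                sb.AppendLine($"            hash = hash * 31 + (value{i} == null ? 0 : value{i}.GetHashCode());");
            }
            sb.AppendLine("            return hash;");
            sb.AppendLine("        }");
            sb.AppendLine("    }");
        }

        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Calling CombineOverloadWriter.Generate(2, 16) returns the source text for all fifteen overloads, which a T4 Template or a Source Generator could then emit into the project.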

(Note: I discovered in hindsight that there is now a System.HashCode type that does the same thing and actually looks very similar. Although I will be removing this type from my code I will continue to use this technique for similar scenarios.)

Code generating code is also useful in scenarios where you need to create some boilerplate when you add new types to your program. For example, if you are writing your own data access or serialization you may need to generate a mapper for each of your DTOs. You could put this on the other programmers on your team: remember to write the mapper when they create a new DTO, remember to update it when they update the DTO, write it correctly, and update all of the mappers when the mapping code changes. Or you could write a code generator that centralizes the logic and ensures that it stays up to date with the rest of your application.
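As a rough illustration of the mapper case, here is a sketch that reads the public properties of a DTO and emits a mapper class for it. Everything named here is hypothetical: CustomerDto is a made-up DTO, MapperWriter is a made-up helper, and mapping to a dictionary is only an assumed stand-in for whatever your data access or serialization layer needs. It also assumes top-level, non-generic DTO types.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Text;

// Hypothetical DTO used only for this example.
public sealed class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical helper: emits the source of a mapper class for one DTO type.
public static class MapperWriter
{
    public static string Generate(Type dtoType)
    {
        // Sort by name so the generated output is stable between runs.
        var properties = dtoType
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead)
            .OrderBy(p => p.Name, StringComparer.Ordinal);

        var sb = new StringBuilder();
        sb.AppendLine($"public static class {dtoType.Name}Mapper");
        sb.AppendLine("{");
        sb.AppendLine($"    public static System.Collections.Generic.Dictionary<string, object> ToDictionary({dtoType.FullName} dto)");
        sb.AppendLine("    {");
        sb.AppendLine("        return new System.Collections.Generic.Dictionary<string, object>");
        sb.AppendLine("        {");
        foreach (PropertyInfo property in properties)
        {
            // One line per property, so adding a property to the DTO
            // updates the mapper the next time the generator runs.
            sb.AppendLine($"            [\"{property.Name}\"] = dto.{property.Name},");
        }
        sb.AppendLine("        };");
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

MapperWriter.Generate(typeof(CustomerDto)) returns the text of a CustomerDtoMapper class. Because the mapping logic lives in one place, changing how mappers work means changing the generator once rather than editing every mapper by hand.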

The last common scenario I will discuss is the reflection-based registration logic that usually runs at application startup, especially if you are using a container. While startup is usually the best place to handle slow reflection-based logic, it is not ideal for the boot time of your application. As we move towards smaller and smaller apps that are spun up and down frequently in the cloud, it is important to optimize startup time, even more so if there may be no instances of your app running until a request comes in. Instead of performing this reflection at run time, it can be performed in advance at compile time to create code that manually registers all relevant types. This buys you the best of both worlds: the convenience and maintainability of automatic registration with the performance of manually registering everything.
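To make the idea concrete, the sketch below does the reflection scan once and writes out the explicit registration code that the application would otherwise have to discover at startup. The names are all hypothetical: IGreeter and Greeter are example types, RegistrationWriter is a made-up helper, and the register delegate is an assumed stand-in for your container's registration call rather than any real container API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical service and implementation used only for this example.
public interface IGreeter { string Greet(string name); }
public sealed class Greeter : IGreeter
{
    public string Greet(string name) => "Hello, " + name;
}

// Hypothetical helper: emits explicit registrations for the given types.
public static class RegistrationWriter
{
    public static string Generate(IEnumerable<Type> candidates)
    {
        // Pair each concrete class with the non-generic interfaces
        // declared in the same assembly as the class.
        var lines = candidates
            .Where(t => t.IsClass && !t.IsAbstract && !t.IsGenericTypeDefinition)
            .SelectMany(t => t.GetInterfaces()
                .Where(s => s.Assembly == t.Assembly && !s.IsGenericType)
                .Select(s => $"        register(typeof({CSharpName(s)}), typeof({CSharpName(t)}));"))
            .OrderBy(line => line, StringComparer.Ordinal);

        var sb = new StringBuilder();
        sb.AppendLine("public static class GeneratedRegistrations");
        sb.AppendLine("{");
        sb.AppendLine("    public static void RegisterAll(System.Action<System.Type, System.Type> register)");
        sb.AppendLine("    {");
        foreach (string line in lines)
        {
            sb.AppendLine(line);
        }
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }

    // Nested types use '+' in reflection names but '.' in C# source.
    private static string CSharpName(Type type) =>
        (type.FullName ?? type.Name).Replace('+', '.');
}
```

Passing typeof(Greeter).Assembly.GetTypes() to RegistrationWriter.Generate produces a GeneratedRegistrations class whose RegisterAll method contains one plain register call per service and implementation pair, so the running application does no scanning at all.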

In C# you now have two major flavors of code generation: T4 Templates and Source Generators. I will not give a comprehensive explanation of both technologies and how to use them here, but rather focus on the scenarios where I have found each to be useful.

T4 Templates have been around for some time, and are useful in scenarios where you would like to be able to readily see the code and use it with IntelliSense elsewhere in your application. T4 Templates are generally configured to run every time you save the template file, and they generate a .cs or similar file with a matching name. The nice thing about having the generated file is that you can inspect and debug the code right in your application as though it were any other code file. The one gotcha is that you should never manually update the generated file, since your changes are overwritten the next time the template runs; instead, always update the template to create the generated code you want. The last thing I want to discuss with T4 Templates is that they run individually in a sandboxed environment. This makes them run faster and avoids referencing other generated types, but it also means that you must manually reference any libraries to include in that sandbox, and that there may be some other quirks, like the T4 Template using a different version of C# than the rest of your application.

Source Generators are a newer technology for code generation, and have a slightly different use case than T4 Templates. Instead of creating code files directly in your project, Source Generators run as part of compilation: they inspect the project's syntax and types and add new source to the compilation, which ends up in the output assembly. By default no code file is written into your project, so the generated code is harder to see than with T4, although recent versions of Visual Studio can show it under the generator's entry in the Analyzers node and can offer IntelliSense for generated types. This technique of code generation is less of a means of automating what you could write by hand and is really more of a way to move the performance hit of reflection to compile time instead of run time. Source Generators are usually placed in their own project where you can import the libraries they need and control the version of C#. One scenario where you need to be careful with Source Generators, though, is referencing your main project from the generator project (not uncommon, since you are replacing reflection in the main project). This can force you to build twice on every build: once to make the build available to the Source Generators, and then again to incorporate the generated source into the output assembly.
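For a sense of what a Source Generator looks like, here is a minimal sketch of one that emits registration code from the types in the project being compiled. It assumes a separate generator project (typically targeting netstandard2.0) that references the Microsoft.CodeAnalysis.CSharp NuGet package. RegistrationGenerator, GeneratedRegistrations, and the register delegate are hypothetical names chosen for illustration, and feeding the whole compilation into the generator is the simplest approach rather than the most efficient one.

```csharp
// Assumes a reference to the Microsoft.CodeAnalysis.CSharp NuGet package.
using System.Collections.Generic;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class RegistrationGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Simple but coarse: this callback re-runs whenever the compilation changes.
        context.RegisterSourceOutput(context.CompilationProvider, (spc, compilation) =>
        {
            var sb = new StringBuilder();
            sb.AppendLine("public static class GeneratedRegistrations");
            sb.AppendLine("{");
            sb.AppendLine("    public static void RegisterAll(System.Action<System.Type, System.Type> register)");
            sb.AppendLine("    {");

            foreach (INamedTypeSymbol type in TopLevelTypes(compilation.Assembly.GlobalNamespace))
            {
                if (type.TypeKind != TypeKind.Class || type.IsAbstract || type.IsStatic || type.IsGenericType)
                {
                    continue;
                }

                foreach (INamedTypeSymbol service in type.AllInterfaces)
                {
                    // Only register interfaces declared in the project being compiled.
                    if (!SymbolEqualityComparer.Default.Equals(service.ContainingAssembly, compilation.Assembly))
                    {
                        continue;
                    }

                    string serviceName = service.ToDisplayString(SymbolDisplayFormat.FullyQualifiedFormat);
                    string typeName = type.ToDisplayString(SymbolDisplayFormat.FullyQualifiedFormat);
                    sb.AppendLine($"        register(typeof({serviceName}), typeof({typeName}));");
                }
            }

            sb.AppendLine("    }");
            sb.AppendLine("}");

            // The hint name identifies the generated file within the compilation.
            spc.AddSource("GeneratedRegistrations.g.cs", SourceText.From(sb.ToString(), Encoding.UTF8));
        });
    }

    // Walks namespaces recursively; nested types are skipped in this sketch.
    private static IEnumerable<INamedTypeSymbol> TopLevelTypes(INamespaceSymbol ns)
    {
        foreach (INamedTypeSymbol type in ns.GetTypeMembers())
        {
            yield return type;
        }
        foreach (INamespaceSymbol child in ns.GetNamespaceMembers())
        {
            foreach (INamedTypeSymbol type in TopLevelTypes(child))
            {
                yield return type;
            }
        }
    }
}
```

Because the generator reads the types from the compilation it is given, this particular sketch does not need a project reference back to the main project. Setting the EmitCompilerGeneratedFiles MSBuild property to true in the consuming project writes the generated file to disk if you want to inspect it.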

I discussed here several common scenarios for using code generating code, and the two major flavors of it in C#, T4 Templates and Source Generators. By giving a high level overview of each I hope to have provided you with some guidance on which path is best to explore for your current project, or just inspired you to learn a little bit more about each of them.

Happy coding!
