Optimizing Jint part 2: Tools of the trade

I think performance optimization is a great example of a true engineering task: there should always be numbers and evidence. You cannot claim that something has improved unless you can show hard numbers for reduced run time or reduced memory usage (ideally, of course, both). Otherwise it's just the good old "feels faster".

In this post we'll examine some tools and checklists for doing performance analysis.


Reducing memory allocations and speeding things up

I'm not going to bore you with the dull old details of generational garbage collection; there are a lot of good resources for learning about it. What we need to keep in mind, however, is that every allocation has its cost. You should keep allocations in generation 0, where they are the cheapest, but beyond that you should minimize allocations altogether. Every CPU cycle used in garbage collection is a cycle taken away from your program logic.

So speed optimization does not consist only of the usual algorithm and library usage tuning, but also of making sure that the CLR is not burdened with unnecessary memory allocations.
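To make this concrete, here's a minimal, contrived sketch (the type and method names are made up for illustration): the LINQ version allocates on every call because List&lt;int&gt;'s value-type enumerator gets boxed behind the IEnumerable&lt;int&gt; interface, while the plain loop allocates nothing.

using System.Collections.Generic;
using System.Linq;

public static class AllocationSketch
{
    // Allocates on every call: Any() enumerates via IEnumerable<int>,
    // which boxes List<int>'s struct enumerator onto the heap.
    public static bool HasNegativeLinq(List<int> values)
        => values.Any(v => v < 0);

    // Allocation-free: plain indexing, everything stays on the stack.
    public static bool HasNegativeLoop(List<int> values)
    {
        for (var i = 0; i < values.Count; i++)
        {
            if (values[i] < 0)
            {
                return true;
            }
        }
        return false;
    }
}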

The tools

So what tools are there for investigating such problems? Well, there's a lot to choose from. For me the go-to tools are JetBrains dotTrace, Visual Studio's Performance Profiler and of course the one and only BenchmarkDotNet. For you hardcore people there's always PerfView, but I've always felt more comfortable with the tools that have easier, more user-friendly UIs to work with.

When you start to care about every single instruction the compiler emits and the virtual machine needs to run, SharpLab can be invaluable. You can also use it to test new, yet-to-be-released compiler features.

There's also an interesting contender that's free, CodeTrack, but I haven't played with it much yet. I'm going to give examples of problems found using these tools, but I'm not going to delve too deep into how to use them; give their respective product documentation a go.

Running and analyzing: a small checklist

Always, always run binaries that were built in release mode. Set your IDE to the Release configuration, set Web.config to debug=false and check any other platform-specific things you can think of.
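For classic ASP.NET, the relevant Web.config part looks something like this (a sketch; it goes inside the configuration element):

<system.web>
  <!-- debug="true" disables many JIT optimizations; never measure with it on -->
  <compilation debug="false" />
</system.web>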

BenchmarkDotNet is kind enough to give an error when you try to benchmark using a debug build. When building .NET Core apps, remember to supply the configuration switch on the command line.

> dotnet build -c Release
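The same applies when running the app directly through the CLI:

> dotnet run -c Release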

Stop all the extra software you have running, including real-time virus scanning etc. For me the checklist is stopping everything in the Windows notification area, down in the bottom-right corner (Docker, virus scanner, Skype etc.). Naturally you shouldn't have your IDEs or other heavy machinery running either.

Make sure you are running with a consistent CPU speed. Laptops are especially nasty to work with: make sure you are running the high-performance power plan, with processor C-states disabled if possible. I usually even switch off the second monitor, just in case, to give the GPU some breathing room.

Creating a small console application that targets the wanted runtimes usually pays off. Just create a minimal repro that stresses the part of the system under optimization. I tend to target both .NET Core and the full .NET Framework to ensure the full framework doesn't spring a nasty surprise. Generally, if you get things optimized on the full framework, they should be at least as fast on .NET Core. A sketch of such a project file follows.
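A minimal multi-targeting project file could look something like this (the framework monikers and package version here are assumptions; pick the ones you actually need):

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- One target framework per runtime you want to measure -->
    <TargetFrameworks>net461;netcoreapp2.1</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.11.5" />
  </ItemGroup>
</Project>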

JetBrains dotTrace

The easiest way is to start with Timeline profiling. Get to know the tool and use filtering to find interesting code paths. You are usually interested in the parts that allocate the most memory or take the most time. Categories also give you hints about which sub-system is taking the time or memory (Collections, String, LINQ etc.).

After you have discovered the bottleneck(s), you can switch to sampling and more accurate (but slower) methods. In the end, line-by-line profiling will give you the best insight, but it is slow to run for bigger test runs.

Visual Studio Performance Profiler

This does require a beefier Visual Studio license, but it has its perks. I generally try it now and then to check whether I've missed something. It's slower to use, but it gathers a lot of information. PerfView is working its magic behind the scenes.

BenchmarkDotNet

They have great documentation, just read it; I've got little to add. The next posts will mostly use BenchmarkDotNet to highlight differences in speed and memory usage depending on the code.

Parameterized benchmarks are nice: you can see how N affects the outcome. They are especially handy when you are battling between choosing a list-based or hash-based collection, as the winner always depends on the use case; see the sketch below.
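A minimal parameterized benchmark comparing lookups in a list versus a hash set could look like this (a sketch; the class and method names are made up for illustration):

using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
public class ContainsBenchmarks
{
    private List<int> _list;
    private HashSet<int> _hashSet;

    // BenchmarkDotNet runs every benchmark once per parameter value
    [Params(10, 100, 10_000)]
    public int N;

    [GlobalSetup]
    public void Setup()
    {
        _list = new List<int>();
        _hashSet = new HashSet<int>();
        for (var i = 0; i < N; i++)
        {
            _list.Add(i);
            _hashSet.Add(i);
        }
    }

    [Benchmark(Baseline = true)]
    public bool ListContains() => _list.Contains(N - 1);

    [Benchmark]
    public bool HashSetContains() => _hashSet.Contains(N - 1);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ContainsBenchmarks>();
}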

Running your benchmark against both the full framework and .NET Core can give you nice insights into how .NET Core outperforms the old world (it usually does); a sketch of that follows.
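One way to target both runtimes in a single run is job attributes on the benchmark class (a sketch; the exact attribute and moniker names vary between BenchmarkDotNet versions, this assumes a version that has RuntimeMoniker):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

// One job per runtime; the results table shows them side by side
[SimpleJob(RuntimeMoniker.Net472)]
[SimpleJob(RuntimeMoniker.NetCoreApp31)]
public class CrossRuntimeBenchmarks
{
    [Benchmark]
    public string BuildString() => string.Join(",", "a", "b", "c");
}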

Preparing Jint for optimization journey

The first two pull requests I made to Jint were:

Once we have the benchmarks in place, we can start to measure and optimize; if the benchmark results get considerably better, let's open a pull request! Communication with the project maintainer is always easier when you have numbers to back you up. Running a hand-rolled stopwatch loop isn't good proof, as there can be a lot of variance based on the environment.

Next we can finally start discussing performance pitfalls and their remedies.
