Optimizing Jint part 3: Banishing LINQ
So you want your code go fast? Just remove all occurrences of
using System.Linq;
and make the code compile again without it and be done with!
In all seriousness, LINQ is a powerful tool, but it also has it's pitfalls. It really makes a difference if you use LINQ in the wrong places, one being the hot paths of a Javascript interpreter.
Next I'm going to show examples of better performing code - it's neither prettier nor more idiomatic than the LINQ counterpart. So be warned. You should again weigh your use case and whether anything here is really needed.
Jint optimization: usage of LINQ in Jint
Jint used LINQ in some places and made the code quite easy to read. Jint being an interpreter, it made things unnecessarily slow however. When scripts are run there can be millions of iterations in loops, choosing the correct way of looping and collecting data can make a big difference.
Removing LINQ is quite the easy optimization for libraries that have hot paths and intensive calls.
Removing LINQ is quite the easy optimization for libraries that have hot paths and intensive calls.
What's wrong with LINQ
LINQ is a powerful abstraction that allows you to create idiomatic and logical constructs, usually easy to read code. It however, does this with the price of performance. Every abstraction you add, adds performance penalty.
Here's a classic example, that brings out most of the bad things you can think of. Let's filter and transform an array using LINQ and then compare that with boring old procedural c# code that's somewhat optimized for the case and is not using anything fancy.
So the LINQ version is quite idiomatic and gets the job done in much less lines of code, what about the numbers?
I'm deliberately using .NET Framework here, the LINQ version will run a lot faster under .NET Core, but the full framework is still the case we all need to consider as a runtime environment.
Can you spot the difference? There's a closure allocation because outer context is being referenced (the forbidden variable), the ToArray has trouble determining the wanted size and grows with the flow. Then there a are those little things like .Where taking a IEnumerableI'm deliberately using .NET Framework here, the LINQ version will run a lot faster under .NET Core, but the full framework is still the case we all need to consider as a runtime environment.
Context-capturing closures
LINQ over arrays
It might not be directly evident, but LINQ needs to create a wrapper to allow you use the LINQ methods when you access an array. Array does not directly implement IEnumerable so tricks are needed. Let's examine how foreach and LINQ-based functions to return whether array of strings contains a non-null element differs.
LINQ against an array
The LINQ version produces famous <>c classes that you might have seen in task traces. So looking at the following example you can see how the machinery to hold state is produced by the compiler.Regular loop against an array
Compare the earlier to a version that doesn't use LINQ, see how much less code and state to handle.Don't optimize too early (or too late)
I don't believe too much in "don't optimize early" mentality. I believe in:
Understand the use case, common usage patterns, input variance and size - and then optimize or skip optimization accordingly.If the code is in hot path, called a lot and affects the overall system performance - just make it fast to begin with.
There are however those many places where writing LINQ makes code a lot cleaner and idiomatic. When you know performance won't be an issue, just endorse it. It's a good tool as long as you know its limits.
Things to remember and consider
- Non-LINQ version is usually/always faster
- LINQ can capture context and throw memory allocations through the roof
- Prefer concrete (collection) types when using LINQ extensions
- Array might be a bad source in hot paths as it requires a wrapper
Comments
Post a Comment