Benchmarking C#/.Net Direct3D 11 APIs vs native C++

This is a summary of an old post found on the internet about benchmarking Direct3D 11.

If you work with a managed language like C# and care about performance, you probably know that, even though the Microsoft CLR JIT is quite efficient, managed code carries a significant cost compared with a pure C++ implementation. If you don’t know much about this cost, you have probably heard an average overhead of around 15-20% quoted for managed languages. If you have looked at it closely, you know that, depending on the case, a calculation-intensive managed application is more often x2 or even x3 slower than its C++ counterpart.

In this post, I present the results of a micro-benchmark that measures the cost of calling a native Direct3D 11 API from a C# application through several managed APIs: SharpDX, SlimDX and Windows API Code Pack.

The Managed (C#) to Native (C++) interop cost

When a managed application needs to call a native API, it needs to:

  • Marshal the method/function arguments from the managed world to the unmanaged world
  • Switch the CLR from managed execution to an unmanaged environment (change exception handling, stack trace state, etc.)
  • Call the native method
  • Marshal the output arguments and results back from the unmanaged world to the managed one
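
The four steps above can be pictured with a small model written in plain C++. This is only an analogy, not how the CLR is implemented: `native_sum`, `ManagedResult` and `call_native_sum` are hypothetical names invented for the illustration, and the "managed" side is simply a richer C++ type that the C-style function cannot consume directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "Native" side: a plain C-style API (hypothetical, for illustration only).
// Sums `count` floats and writes the result through an output pointer.
// Returns 0 on success, -1 on invalid arguments.
extern "C" int native_sum(const float* values, std::size_t count, float* out_sum) {
    if (out_sum == nullptr || (values == nullptr && count != 0)) return -1;
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    *out_sum = total;
    return 0;
}

// Stand-in for the result type seen by the "managed" caller.
struct ManagedResult {
    bool ok;
    float sum;
};

// Wrapper that mirrors the four steps of an interop call.
ManagedResult call_native_sum(const std::vector<double>& managed_values) {
    // Step 1: marshal arguments into the layout the native API expects
    // (here: copy the doubles into a contiguous float buffer).
    std::vector<float> buffer(managed_values.begin(), managed_values.end());

    // Step 2: a managed runtime would switch to unmanaged execution here
    // (exception handling, stack state). Plain C++ has no equivalent step.

    // Step 3: call the native function.
    float native_out = 0.0f;
    const int status = native_sum(buffer.data(), buffer.size(), &native_out);

    // Step 4: marshal the output argument and return code back.
    return ManagedResult{status == 0, native_out};
}
```

Even in this toy version, steps 1 and 4 copy data on every call, which is the kind of per-call overhead a micro-benchmark of small API calls ends up measuring.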

To perform a native call from a managed language, there are currently three options:

  • Using P/Invoke, the default interop mechanism in C#, which takes care of all the previous steps. P/Invoke comes at a huge cost, however, when you have to pass structures, arrays by value, strings, etc.
  • Using a C++/CLI assembly that performs hand-written marshaling to the native C++ methods. This is the approach used by SlimDX, Windows API Code Pack and XNA.
  • Using the SharpDX technique, which generates all the marshaling and interop code at compile time, in a structured and consistent way, using CLR bytecode instructions that C# does not expose and that are usually only available from C++/CLI.

The benchmark was ported to:

  • C++, using raw native calls and Direct3D11 API
  • SharpDX, using Direct3D11 running under Microsoft .NET CLR 4.0 and under Mono 2.10 (tested with LLVM both on and off). SharpDX is the only one of these managed APIs able to run under Mono.
  • SlimDX, using Direct3D11 running under Microsoft .NET CLR 4.0. SlimDX is “NGENed”, meaning that it is compiled to native code when you install it.
  • Windows API Code Pack 1.1, using Direct3D11 running under Microsoft .NET CLR 4.0
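
The benchmark source is not reproduced in this summary, so the following is only a minimal sketch of how a per-call time in milliseconds might be measured in the native C++ variant. `draw_sequence_stub` and `average_ms_per_call` are hypothetical names; a real run would issue the Direct3D 11 drawing calls where the stub increments a counter.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for the drawing sequence under test.
// A real benchmark would issue Direct3D 11 calls here instead.
inline void draw_sequence_stub(volatile unsigned& sink) {
    sink = sink + 1;  // volatile write so the loop is not optimized away
}

// Runs `fn` `iterations` times and returns the average time per call
// in milliseconds, measured with a monotonic clock.
template <typename Fn>
double average_ms_per_call(Fn&& fn, std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) fn();
    const auto stop = clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

A call such as `average_ms_per_call([&] { draw_sequence_stub(sink); }, 100000)` averages over many iterations, which matters here because a single call takes well under a microsecond and cannot be timed reliably on its own.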

Results

You can see the raw results in the following table. Time is measured for the simple drawing sequence (inside the loop for(i) nbEffects). Lower is better. The ratios on the right indicate how much slower the tested API is compared to the C++ one. For example, SharpDX in x86 mode runs 1.52 times slower than its pure C++ counterpart.

Direct3D11 Simple Bench               x86 (ms)   x64 (ms)   x86-ratio   x64-ratio
Native C++ (MSVC VS2010)              0.000386   0.000262   x1.00       x1.00
Managed SharpDX (1.3 MS .Net CLR)     0.000585   0.000607   x1.52       x2.32
Managed SlimDX (June 2010 – Ngen)     0.000945   0.000886   x2.45       x3.38
Managed SharpDX (1.3 Mono-2.10)       0.002404   0.001872   x6.23       x7.15
Managed Windows API CodePack 1.1      0.002551   0.003219   x6.61       x12.29
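
The ratio columns are simply each measured time divided by the native C++ time for the same platform. As a small worked check, the sketch below recomputes two of the x86 ratios from the timings in the table; the helper names `slowdown_ratio`, `round2` and `print_example_ratios` are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Slowdown factor of a measured time relative to a baseline time.
double slowdown_ratio(double measured_ms, double baseline_ms) {
    assert(baseline_ms > 0.0);
    return measured_ms / baseline_ms;
}

// Rounds to two decimals, the precision used in the table.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}

// Prints the x86 ratios for two rows, using the timings from the table.
void print_example_ratios() {
    const double native_x86 = 0.000386;   // Native C++ (MSVC VS2010)
    const double sharpdx_x86 = 0.000585;  // Managed SharpDX (MS .Net CLR)
    const double slimdx_x86 = 0.000945;   // Managed SlimDX (Ngen)
    std::printf("SharpDX x86: x%.2f\n", slowdown_ratio(sharpdx_x86, native_x86));
    std::printf("SlimDX  x86: x%.2f\n", slowdown_ratio(slimdx_x86, native_x86));
}
```

The same helper can compare two managed rows directly, for instance `slowdown_ratio(0.000945, 0.000585)` for SlimDX against SharpDX on x86, which is a different question from the ratio against native C++.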

The associated graphs compare the results for both the x86 and x64 platforms:

The results are pretty self-explanatory, although a few interesting facts are worth highlighting:

  • Managed Direct3D API calls are much slower than native API calls, ranging from x1.52 to x12.29 depending on the API and platform you are using.
  • SharpDX provides the fastest managed Direct3D API, only x1.52 (x86) to x2.32 (x64) slower than C++. The next-fastest managed API, SlimDX, takes roughly 1.5 to 1.6 times as long per call.
  • All other managed Direct3D APIs are significantly slower, ranging from x2.45 to x12.29.
  • Running this benchmark with SharpDX on Mono 2.10 is x6 to x7 slower than native C++, which makes it roughly x3 to x4 slower than SharpDX on the Microsoft JIT (!)