Can C# Ever Match C++ for Speed?
Can C# beat C++ for speed of execution? If you asked me that question, I’d generally say “No.” But, interested in testing the idea, I recently took a program written in C++, converted it to C#, and compared the two. The C# version ran twice as fast, something I wasn’t expecting.

Before we plunge into the details of the program, let’s ask whether there are areas in which C# could outpace its competition. Recent upgrades and feature additions have made C++ a more powerful platform. Before C++11, for example, the language had no move semantics: classes had only copy constructors, and temporary objects were copied. With C++11, classes gained move constructors and move assignment operators, making the language more efficient if you use move semantics. When it comes to strings, C++ also has an advantage, in that a C++ char is typically 8 bits, whereas C# uses 16-bit UTF-16 chars that are twice as large. But there’s also no reason why C# (or other platforms such as the Java Virtual Machine, for that matter) can’t be faster than C++ with the right optimizations.
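To make the move-semantics point concrete, here is a minimal sketch of a class with both a copy constructor and a C++11 move constructor. HandBatch, its counters, and demoCopyVersusMove are hypothetical names invented for this illustration; none of them come from the poker program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration only; it is not part of the poker program.
class HandBatch {
public:
    explicit HandBatch(std::size_t n) : hands_(n, "ASKSQSJSTS") {}

    // Copy constructor: duplicates every string held in the vector.
    HandBatch(const HandBatch& other) : hands_(other.hands_) { ++copyCount; }

    // Move constructor (C++11): takes over the other vector's buffer instead of copying it.
    HandBatch(HandBatch&& other) noexcept : hands_(std::move(other.hands_)) { ++moveCount; }

    std::size_t size() const { return hands_.size(); }

    static int copyCount;  // number of copy constructions so far
    static int moveCount;  // number of move constructions so far

private:
    std::vector<std::string> hands_;
};

int HandBatch::copyCount = 0;
int HandBatch::moveCount = 0;

// Returns true if the copy left the source intact and the move emptied it.
bool demoCopyVersusMove() {
    HandBatch original(1000);
    HandBatch copied(original);            // copy: 1000 strings are duplicated
    assert(original.size() == 1000);       // the source is untouched by a copy
    HandBatch moved(std::move(original));  // move: the buffer changes owner
    return copied.size() == 1000 && moved.size() == 1000 && original.size() == 0;
}
```

The copy duplicates all 1,000 strings, while the move only transfers ownership of the vector's storage, which is why returning or passing large temporaries became cheaper with C++11.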

Poker Evaluation

I had written a program in C++11 to demonstrate std::async. It did so by evaluating a five-card poker hand to see whether it was decent (i.e., anything from a pair up to a royal flush). I'd previously created a text file with one million randomly generated poker hands, with each card defined as two characters (such as AS for Ace of Spades, TD for Ten of Diamonds, etc.). The code read each hand from the file, 10 characters per line, and wrote another text file with the result of each evaluation; it took 3.5 seconds to do all million.

Ironically, the asynchronous version that used std::async (and std::future) took 20 percent longer to run, presumably because the extra overhead of fetching the future value was significant compared to the evaluation time. After adding a one-millisecond delay to each evaluation and processing 20,000 cards, it took 35 seconds single-threaded but only 3.5 seconds asynchronously.

The code below creates a PokerHand object from the string str, which the lambda expression captures by value. MaxThreads is a constant defined as 12 for my computer, which is a six-core machine with hyper-threading. (If you want the full C++ code for this, it's hosted on GitHub.)
[csharp]
std::array<std::future<std::string>, MaxThreads - 1> futures;
auto count = 0;
while (count < MaxThreads - 1) {
    // Stop when there are no more hands to read.
    if (!std::getline(filein, str)) break;
    rowCount++;
    // Evaluate this hand on another thread; str is captured by value.
    futures[count++] = std::async([str] {
        PokerHand pokerhand(str);
        auto result = EvaluateHand(pokerhand);
        return pokerhand.GetResult(result);
    });
    if (count == MaxThreads - 1) {
        // The batch is full: collect the results in order, then start a new batch.
        for (auto& e : futures) {
            fileout << e.get() << std::endl;
        }
        count = 0;
    }
}
// Write out any results left in a final, partially filled batch.
for (auto i = 0; i < count; i++) {
    fileout << futures[i].get() << std::endl;
}
[/csharp]

The synchronous version uses the simple loop below; a compiler directive, Multi, selects which version is built:

[csharp]
while (std::getline(filein, str)) {
    PokerHand pokerhand(str);
    auto result = pokerhand.EvaluateHand();
    pokerhand.WriteResult(fileout, result);
    rowCount++;
}
[/csharp]

I translated the code more or less line-for-line into C#, and, to my surprise, it only took 1.7 seconds to run. That’s twice as fast on the same hardware, without using async. So either I’m rubbish at writing C++ (quite possibly!) or there’s another explanation.

In order to save my C++ reputation, I decided to investigate, looking first at file I/O. Each time round the loop, both versions created an instance of the PokerHand class, then called the EvaluateHand() method. I commented out the bulk of the code from rowCount++ on down to the last brace and added this line immediately after the std::getline line:

[csharp]
fileout << str << std::endl;
[/csharp]
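For anyone who wants to repeat the I/O-only measurement, the stripped-down loop can be wrapped in a small function and timed with std::chrono. This is a sketch under assumptions, not the benchmarked code: copyLines, timeCopySeconds, and checkCopyLines are hypothetical names. The flushEachLine switch is there because std::endl writes a newline and also flushes the stream, so passing false shows whether per-line flushing contributes to the C++ time. That variant was not measured for this article.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies every line from `in` to `out` and returns the number of lines copied.
// flushEachLine = true uses std::endl (newline plus flush), as in the snippet above;
// false writes '\n' and leaves flushing to the stream buffer.
std::size_t copyLines(std::istream& in, std::ostream& out, bool flushEachLine) {
    std::string str;
    std::size_t rowCount = 0;
    while (std::getline(in, str)) {
        if (flushEachLine) {
            out << str << std::endl;
        } else {
            out << str << '\n';
        }
        ++rowCount;
    }
    return rowCount;
}

// Runs copyLines once and returns the elapsed wall-clock time in seconds.
double timeCopySeconds(std::istream& in, std::ostream& out, bool flushEachLine) {
    const auto start = std::chrono::steady_clock::now();
    copyLines(in, out, flushEachLine);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// In-memory sanity check: both modes must produce identical output.
void checkCopyLines() {
    const std::string text = "ASKSQSJSTS\n2H3D5S9CKD\n";
    std::istringstream inA(text), inB(text);
    std::ostringstream outA, outB;
    const std::size_t rowsA = copyLines(inA, outA, true);
    const std::size_t rowsB = copyLines(inB, outB, false);
    assert(rowsA == 2 && rowsB == 2);
    assert(outA.str() == text && outB.str() == text);
}
```

To time real files, open a std::ifstream on cards.txt and a std::ofstream on results.txt, pass them to timeCopySeconds, and run it once with each flag value on the same input.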

With the evaluation code commented out, the C++ loop is just a simple "read from the in stream" then "write to the out stream." Running the C++ version took 2.39 seconds for all million rows. When I did the same using the StreamReader and StreamWriter classes in C#, it took just 0.11 seconds, over twenty times faster. Clearly the C++ stream I/O classes std::ifstream and std::ofstream, at least as I used them on Windows, are quite slow in comparison to those .NET stream classes. This is the C# code:

[csharp]
// Requires: using System; using System.Diagnostics; using System.IO;
var stw = new Stopwatch();
stw.Start();
// The using blocks flush and close both files on exit, so no explicit Close() calls are needed.
using (var sw = new StreamWriter(@"results.txt"))
{
    using (var sr = new StreamReader(@"cards.txt"))
    {
        while (!sr.EndOfStream)
        {
            var str = sr.ReadLine();
            sw.WriteLine(str);
        }
    }
}
stw.Stop();
Console.WriteLine(@"Took {0:0.00} seconds", stw.ElapsedMilliseconds / 1000.0);
[/csharp]

Deducting those I/O times from the totals means the rest of the C++ code took 3.5 - 2.39 = 1.11 seconds and the rest of the C# code took 1.7 - 0.11 = 1.59 seconds. So, file I/O aside, the C++ code was clearly executing faster.

To investigate further, I reverted the code, then commented out the calls to EvaluateHand() in both versions, instead assigning the HighCard enum value. Comparing the run times with and without those calls, I found that the million EvaluateHand() calls took only 0.08 seconds in C++ and 0.17 seconds in C#. That's fast enough to do 12,500,000 evaluations per second in C++ and about 6 million in C#.
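EvaluateHand() itself isn't listed in this article. As a rough idea of the kind of work being timed, here is a minimal sketch of an evaluator for the 10-character hand format described earlier. It is not the code that was benchmarked. The HandRank enum (apart from the HighCard name), rankIndex, evaluateHand, and isDecent are hypothetical, and the suit letters C, D, H, and S are an assumption extrapolated from the AS and TD examples.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Categories from weakest to strongest. Only HighCard is named in the article;
// the other names are assumptions.
enum class HandRank {
    Invalid, HighCard, Pair, TwoPair, ThreeOfAKind, Straight,
    Flush, FullHouse, FourOfAKind, StraightFlush, RoyalFlush
};

// Maps a rank character to 0..12 (2 is 0, Ace is 12), or -1 if it is not a rank.
int rankIndex(char c) {
    static const std::string ranks = "23456789TJQKA";
    const std::size_t pos = ranks.find(c);
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

// Evaluates a 10-character hand such as "ASKSQSJSTS" (rank, then suit, per card).
// Duplicate cards are not detected, apart from a rank appearing five times.
HandRank evaluateHand(const std::string& text) {
    if (text.size() != 10) return HandRank::Invalid;

    std::array<int, 13> rankCount{};  // how many cards of each rank
    bool flush = true;
    for (std::size_t i = 0; i < 10; i += 2) {
        const int r = rankIndex(text[i]);
        const char suit = text[i + 1];
        if (r < 0 || std::string("CDHS").find(suit) == std::string::npos) {
            return HandRank::Invalid;
        }
        ++rankCount[r];
        if (suit != text[1]) flush = false;  // compare with the first card's suit
    }

    // Largest and second-largest group sizes, e.g. 3 and 2 for a full house.
    std::array<int, 13> sorted = rankCount;
    std::sort(sorted.begin(), sorted.end(), std::greater<int>());
    const int first = sorted[0];
    const int second = sorted[1];
    if (first > 4) return HandRank::Invalid;  // a deck has only four cards per rank

    // Straight: five distinct ranks spanning exactly five values, or A-2-3-4-5.
    bool straight = false;
    int low = 0;
    int high = 12;
    if (first == 1) {
        while (rankCount[low] == 0) ++low;
        while (rankCount[high] == 0) --high;
        assert(low <= high);  // five cards were counted, so both scans stop
        const bool wheel = rankCount[12] && rankCount[0] && rankCount[1] &&
                           rankCount[2] && rankCount[3];
        straight = (high - low == 4) || wheel;
    }

    if (straight && flush) {
        // low == 8 means the lowest card is a ten, so the run is T-J-Q-K-A.
        return low == 8 ? HandRank::RoyalFlush : HandRank::StraightFlush;
    }
    if (first == 4) return HandRank::FourOfAKind;
    if (first == 3 && second == 2) return HandRank::FullHouse;
    if (flush) return HandRank::Flush;
    if (straight) return HandRank::Straight;
    if (first == 3) return HandRank::ThreeOfAKind;
    if (first == 2 && second == 2) return HandRank::TwoPair;
    if (first == 2) return HandRank::Pair;
    return HandRank::HighCard;
}

// "Decent" in the sense used earlier: a pair or better.
bool isDecent(HandRank rank) { return rank >= HandRank::Pair; }
```

Each call does a handful of array updates and one 13-element sort, which is consistent with evaluation costing far less than the file I/O around it.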

Conclusion

This was an empirical comparison of whole programs, rather than a timing of specific instructions, which I was initially tempted to do. It's quite surprising that there is a 20x difference in streaming performance. While C# code is unlikely to outperform equivalent C++ code, in some scenarios it can be very fast. For example, I recently wrote a C# utility to do a frequency analysis, counting how many times every word occurred in a 46 MB text file; it took just five seconds on an older PC.
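The timings quoted throughout this article can be rechecked with a few lines of arithmetic. The helpers below are a small sketch for doing so; the function names are hypothetical, and the inputs are the measurements reported above.

```cpp
#include <cassert>

// Time left for everything except stream I/O, in seconds.
double nonIoSeconds(double totalSeconds, double ioSeconds) {
    assert(totalSeconds >= ioSeconds);
    return totalSeconds - ioSeconds;
}

// Calls per second for `calls` calls that took `seconds` in total.
double perSecond(double calls, double seconds) {
    assert(seconds > 0.0);
    return calls / seconds;
}

// How many times slower `slowSeconds` is than `fastSeconds`.
double slowdown(double slowSeconds, double fastSeconds) {
    assert(fastSeconds > 0.0);
    return slowSeconds / fastSeconds;
}
```

With the reported numbers, nonIoSeconds(3.5, 2.39) is about 1.11 and nonIoSeconds(1.7, 0.11) is about 1.59. perSecond(1e6, 0.08) is about 12.5 million and perSecond(1e6, 0.17) is about 5.9 million. slowdown(2.39, 0.11) is roughly 21.7, which is the "20x" streaming difference.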