Speed Test 2: Comparing C++ Compilers on Windows
In our previous article, we compared a few C++ compilers on Linux. This time we're going to perform a similar set of tests for Windows.

One notable aspect of using C++ on Windows is that you're typically encouraged to use an IDE such as Microsoft Visual Studio, or a competing IDE such as Embarcadero's RAD Studio (which rose from the ashes of the old Borland products, Delphi and C++ Builder). The Intel compiler can be used in a standalone fashion, but it also integrates into Visual Studio (though not the free Express versions of Visual Studio); there are also some very nice, free IDEs such as CodeLite and Code::Blocks. But to test the compilers themselves, I removed the IDE variable and focused only on the command line.

Like all things Windows, it can get costly doing C++ development in this environment. However, there are a few notable exceptions:
  • The free and open-source Cygwin system includes a build of the g++ compiler, which I'm including in these tests.
  • The free and open-source MinGW also includes a build of the g++ compiler, which I'm including in these tests.
  • Microsoft makes its Express versions of Visual Studio available for free, including the C++ version. While these are greatly scaled-down, the underlying compiler is the same as the one used in the premium versions of Visual Studio, and I'm including the premium version in these tests.
  • Embarcadero, which produces a product we're comparing today called RAD Studio, also has a free compiler called Borland C++ 5.5. I'll have more to say about this product shortly. Short version: Skip it. It's worthless. But the compiler that ships with RAD Studio proved to be quite impressive.
For the premium compilers, today I'm testing:
  • Intel C++ Compiler, as part of their Parallel Studio XE 2013
  • Microsoft C++ Compiler, as part of Visual Studio
  • Embarcadero C++ 6.70 Compiler that ships with RAD Studio and C++ Builder.
In all cases I'm using the 64-bit versions of the compilers. But let's be clear about something: When I say 64-bit compiler, I'm talking about the generated code. The compiler itself may or may not be a 64-bit application.

A Short Aside About Installing Embarcadero's RAD Studio

Many of the names in the C++ Builder portion of the RAD Studio installation options take me back to the golden age of computers, when the web was brand new and few people had yet heard the letters “www.” I'm talking about the mid 1990s, when I used things such as Interbase and the Borland Database Engine, as well as the VCL controls. These are all available today in the RAD Studio installer, apparently left around to gum up the machinery after Borland dumped its dead dinosaurs on Embarcadero’s front porch, when the latter bought the aging technology known as C++ Builder.

The installer proudly proclaims that you can build powerful, amazing software with (and I quote) “blazing native performance.” But to use the installer, you have to let it first install Microsoft JSharp Runtime 2.0, which was released six years ago in 2007, and later killed off by Microsoft. Oh, and if you're interested in using this proud RAD Studio tool for distributed programming with this newfangled thing called “The Internet,” it even supports CORBA, the Common Object Request Broker Architecture. I last fussed with CORBA in 1998. Just hearing the name is a flashback to the ’90s, when I sat in a lawn chair at Lollapalooza, drank beer, and listened to brand-new bands named Smashing Pumpkins and Green Day.

And let's not forget that Embarcadero also offers a free C++ compiler for Windows called BCB32. The download includes an installer that was copyrighted in 2000 by Inprise; the accompanying text brags that it's a powerful “ANSI” compiler, that it's “the high performance foundation and core technology of Embarcadero’s award-winning C++Builder product line,” and that it includes an “ANSI/ISO Standard Template Library.” I'm not going to review this standalone compiler here. Instead I can show it to my son and explain what life was like in the olden days.

One odd little side-note before we move on: I was checking out the C++11 features available in Embarcadero's C++ compiler. In one of their blogs, the company says, “C++11 support by BCC64 is based on Clang 3.1; for more information, see http://clang.llvm.org/cxx_status.html.” And indeed, there was a clang executable in their bin directory. I tried to run it, but it would just freeze for a long time before finally doing anything. I'm not sure what was happening there; as such, I used the other one I downloaded separately.

LLVM

There's a semi-official build of the LLVM compiler that integrates with Visual Studio. I installed it and nothing would compile within Visual Studio; I received errors when it tried to compile Visual Studio's own header files. But from the command line, I was able to set up the path and other environment variables, and after that it worked just fine.

The Tests

The last time I published tests of C++ compilers, some people asked why I would bother testing compile times. So let's address that right now. First, I personally have no real use for the compile times, and clearly some readers didn’t, either. But some organizations do, for various reasons. If they determine that two compilers meet all their needs and are essentially equal in every aspect, they will probably consider other factors to help them decide which one to settle on. Price is one possible factor. But if they're compiling thousands of files daily, they may want to make sure the build completes in a reasonable timeframe. As we saw in my last article, the differences in compile times weren't huge, but they were still there. So I'm mostly offering the timing data here for completeness’ sake.

In order to test how long a process takes, we ideally want to turn off extraneous software and services that might interfere with the process. With Linux, you can turn off most services and still have a command line. With Windows, well, that's nearly impossible: shutting down these “essential” services will usually halt the system. Fortunately, we can get around the problem by using a system with multiple cores. Windows doesn't offer a simple way to reserve a core exclusively for a particular process, but if you have, for example, a quad core with hyperthreading, there's a good chance your time-consuming process will get its own core. Also, by letting the usual system processes do their thing while we test, we end up with numbers that will be more realistic on an actual development machine.

In each case, I did a full compile to build an executable, and noted the file size. Then I cleaned and did a “compile only,” recording the time as I did so. I repeated the compile-only step three times, and recorded all three times. In a couple of instances, when something strange happened (such as the compile taking an extraordinarily long time, probably because some other process interfered), I considered it a statistical outlier and didn't include it in the results, instead doing an additional timing to replace the outlier.

To measure the time, I tried to find an equivalent to the Linux time command. There isn't really one in Windows. Cygwin includes one, but I didn't want to try to get all these compilers configured under Cygwin. Instead, I used Windows PowerShell. It includes Measure-Command, which measures how long the command passed to it runs. But unlike the Linux time command, it doesn't provide details on how much of that time is actual processor time, versus time waiting for the operating system (and so on). Instead, it seems to just measure how long it takes from the time the command launches until it finishes.

I used the same huge C++ file as in the previous article. I won't repeat the explanation; instead you can read about it there.

Finally, you'll notice I didn't include a test with the multicore library Threading Building Blocks. For the tests today, I'm only examining the compile times and the sizes of the object files and executable files. Also, I plan to look at the generated assembly code for the optimizations, and compare how the compilers do there. I've done quite a bit of work with assembly code, so I could analyze the generated SIMD code and determine which compilers offer the best support for vectorization, including whether they add code to the final executable that detects the SIMD features of the host processor and runs the appropriate code path.

Embarcadero Compilers

Although I poked fun at some of the tools that come with Embarcadero's RAD Studio, I will say that ultimately I was impressed with this compiler in terms of speed and features. I didn't do a full test of its C++11 features, but I did test some features such as lambda functions (not for this article, but just as a side project out of curiosity). And to my surprise, the compiler handled the C++11 features just fine. The original Borland had a history of building compilers that were known for their speed and for features that were modern for their time. While RAD Studio might seem a bit silly from today’s perspective, the underlying compiler holds up that tradition. (Note, however, that the C++11 features were only present in the 64-bit compiler, not the 32-bit compiler.) Here are the results:
Embarcadero bcc64 (no optimization; the default is to not include debug information)
  • Command line: Measure-Command { bcc64 -S test4.cpp }
  • Total milliseconds for compile only (first try): 3134.0215
  • Total milliseconds for compile only (second try): 2936.2765
  • Total milliseconds for compile only (third try): 2941.5167
  • Object file size: 4,663,371 bytes
  • Final executable file size: 770,657 bytes

Embarcadero bcc64 (full optimization, with command-line switch -O3)
  • Command line: Measure-Command { bcc64 -S -O3 test4.cpp }
  • Total milliseconds for compile only (first try): 1812.0551
  • Total milliseconds for compile only (second try): 1823.2381
  • Total milliseconds for compile only (third try): 1870.9476
  • Object file size: 416 bytes
  • Final executable file size: 59,602 bytes

LLVM clang

The clang compiler, as usual, did quite well in the tests. It wasn't as fast as bcc64, but with optimizations turned on it produced small executable files.
clang (no optimization, with command-line switch -O0)
  • Command line: Measure-Command { clang -c -O0 test4.cpp }
  • Total milliseconds for compile only (first try): 5957.0582
  • Total milliseconds for compile only (second try): 5880.9037
  • Total milliseconds for compile only (third try): 5851.7289
  • Object file size: 1,271,660 bytes

clang (full optimization, with command-line switch -O3)
  • Command line: Measure-Command { clang -c -O3 test4.cpp }
  • Total milliseconds for compile only (first try): 4996.9006
  • Total milliseconds for compile only (second try): 4828.7302
  • Total milliseconds for compile only (third try): 4818.8122
  • Object file size: 143 bytes
  • Final executable file size: 3335 bytes

64-bit MinGW

The MinGW compiler's appeal is that it links with the Microsoft runtime libraries built into Windows, meaning you don't have to ship any extra DLLs with your final program (and it’s also free, which is always good). After building with MinGW's g++ compiler, you can check the final executable using Visual Studio's dumpbin program, passing in the /imports option. Doing so shows that the executable links at runtime with msvcrt.dll. (There's also an interesting commentary on their site about the state of free compilers in the Windows world. I encourage you to check it out.) Also, the 64-bit version is maintained separately; I used a distribution found on SourceForge. Once it was installed, I pointed PowerShell's path to the necessary executables.
g++ under 64-bit MinGW (no optimization, with command-line switch -O0)
  • Command line: Measure-Command { x86_64-w64-mingw32-g++.exe -O0 -c test4.cpp -o test4.o }
  • Total milliseconds for compile only (first try): 12232.63
  • Total milliseconds for compile only (second try): 12147.5664
  • Total milliseconds for compile only (third try): 12179.9838
  • Object file size: 3,443,685 bytes
  • Final executable file size: 2,112,966 bytes

g++ under 64-bit MinGW (full optimization, with command-line switch -O3)
  • Command line: Measure-Command { g++ -c -O3 test4.cpp }
  • Total milliseconds for compile only (first try): 5114.4662
  • Total milliseconds for compile only (second try): 5164.4459
  • Total milliseconds for compile only (third try): 5100.8012
  • Object file size: 864 bytes
  • Final executable file size: 258,462 bytes

64-bit Cygwin

The g++ available with Cygwin works similarly to that of MinGW except that, instead of linking with the runtime library that comes with Windows, it links with Cygwin's own runtime library, cygwin1.dll (which, again, you can verify using the dumpbin utility). In order to use this compiler, I used PowerShell and set the path to point to the distribution's own g++ compiler.
64-bit g++ under Cygwin (no optimization)
  • Command line: Measure-Command { g++ -c -O0 test4.cpp }
  • Total milliseconds for compile only (first try): 14410.9207
  • Total milliseconds for compile only (second try): 14457.1481
  • Total milliseconds for compile only (third try): 14427.838
  • Object file size: 3,443,685 bytes
  • Final executable file size: 1,917,623 bytes

64-bit g++ under Cygwin (full optimization)
  • Command line: Measure-Command { g++ -c -O3 test4.cpp }
  • Total milliseconds for compile only (first try): 6485.9999
  • Total milliseconds for compile only (second try): 6486.0892
  • Total milliseconds for compile only (third try): 6439.0032
  • Object file size: 864 bytes
  • Final executable file size: 61,696 bytes

Microsoft C++ Compiler

This compiler can be run at the command line, although Microsoft clearly expects that most people will be using it from within an IDE, particularly Visual Studio. The command-line program is called cl.exe. To tell it not to link, pass it the /c option. There are several optimization levels, including “minimize space,” “maximize speed” and “maximum optimizations.” With both the minimize-space and maximum-optimization settings, the compiled object file for the test file was very large, much larger than those produced by the other compilers. The final executable with maximum optimization was only a bit larger than the other final executables, but still larger nevertheless. (Remember, although I'm not using any #includes to keep things like iostream out of the picture, there's still a basic runtime that gets linked in, which includes, for example, a start function that calls main.)
Microsoft cl with maximum optimization (Note: Testing with the “minimize space” optimization resulted in the same file sizes for both the object file and the executable file.)
  • Command line: Measure-Command { cl /Ox /c /nologo test4.cpp }
  • Total milliseconds for compile only (first try): 9378.7025
  • Total milliseconds for compile only (second try): 9444.0398
  • Total milliseconds for compile only (third try): 9536.6226
  • Object file size: 815,724 bytes
  • Final executable file size: 48,128 bytes

Microsoft cl with no optimization
  • Command line: Measure-Command { cl /c /nologo test4.cpp }
  • Total milliseconds for compile only (first try): 7889.2335
  • Total milliseconds for compile only (second try): 7762.6916
  • Total milliseconds for compile only (third try): 7749.9278
  • Object file size: 986,117 bytes
  • Final executable file size: 126,976 bytes

Intel Compiler

The Intel compiler occasionally “calls home” to an Intel-owned website to check licensing information. When it does so, it prints out a message about when the current license expires. I didn't use the results when that happened, since the check adds time and would skew the timing results. The Intel compiler also offers several options for optimizing speed, one for limiting code size, and one for, as the help states, enabling “speed optimizations, but [disabling] some optimizations which increase code size for small speed benefit.” For my test program with a huge number of templates, none of these optimizations differed in the resulting executable file size, which makes sense since they're mostly focused on speed. We'll test them separately.
Intel icl compiler with no optimization
  • Command line: Measure-Command { .icl /Od /c test4.cpp }
  • Total milliseconds for compile only (first try): 5974.4083
  • Total milliseconds for compile only (second try): 6008.3199
  • Total milliseconds for compile only (third try): 6058.1437
  • Object file size: 2,391,263 bytes
  • Final executable file size: 175,616 bytes

Intel icl compiler with full optimization
  • Command line: Measure-Command { .icl /Ox /c test4.cpp }
  • Total milliseconds for compile only (first try): 13064.7322
  • Total milliseconds for compile only (second try): 13085.5804
  • Total milliseconds for compile only (third try): 13117.1896
  • Object file size: 685 bytes
  • Final executable file size: 63,488 bytes

A Question for Readers

When I did the tests for the first article, I noticed something odd, and some readers noticed it too. For some compilers, compiling with full optimization was faster than compiling without it. I saw that this time around, as well: both the bcc64 and g++ compilers exhibited this behavior. To be frank, I'm not sure why this would happen. Optimization requires some sophisticated algorithms that analyze the code, so it seems it should take longer to compile when optimization is turned on. The generated files with optimization turned off are larger, but not so much so that it would take that much longer to write them. I'm not sure about the root cause of this situation, and I would love to hear from readers who might have honest suggestions for why this might be happening, especially readers who might have worked for a compiler company. (I'll talk to some people I know and invite them to comment.)

A Final Note

I want to be clear about one point: we can't really compare these tests to the Linux tests in the earlier article, because we're in a completely different environment. Not only am I on a different operating system, I'm actually on a slower machine: this is just a quad core AMD unit. (With the previous article, some readers mentioned that I didn't include all the specs for the computer; I did this time, but I want to be clear that my goal here is to show how the compilers performed relative to each other. Obviously, if you run these on a faster machine, you'll get shorter times. And further, the specs of the computer certainly shouldn't affect the final size of the executables.)

Image: wrangler/Shutterstock.com