Coding Challenge: Prove Your Factorial Fluency

This is the first coding challenge in a regular series and runs through October. We’re looking for you to write a computer program in C#, C++ or Java to solve a programming problem. If you win, you get not only bragging rights, but a genuine Dice T-Shirt, too (!).

employment competitionTo enter, you must submit just source code, not executable binaries. That way if there’s a problem with your code, we may be able to fix it and of course by compiling and running it on my PC, we can ensure that it’s fair. It also avoids the issue of running potentially dangerous code on a PC.

The deadline for this Coding Challenge is Oct. 31, 2013.

The Problem

Consider all the nine digit integers where each digit 1-9 occurs only once. The lowest of these numbers is 123456789, the next is 123456798 and the highest is 987654321. There are 362,880 such numbers if I have my maths correct. It’s 9! (9 Factorial) i.e., 362,880 different numbers.

If the numbers are ordered starting from the highest to the lowest, you have to write a program that returns the 100,000th in this sequence. So the first five are these:

1) 987654321

2) 987654312

3) 987654231

4) 987654213

5) 987654132

If you follow this sequence starting with 1 – 987654321 then 2 – 987654312, what number is the 100,000th in the sequence?

Fastest Correct Answer Wins

To make it more interesting, you should also write your code to run as fast as possible. Speed is the criteria for winning, so we’re looking for the fastest correct answer. You should include timing code in your program and it should time with an accuracy of 1 microsecond. All three languages can time code: Follow these links for timing code in C++, C# and Java.

The upper limit on execution time is 15 minutes. If it takes any longer then you need to rewrite it or start again. Any program that takes longer than 15 minutes on the contest PC will be disqualified.

If your solution is so fast that it runs in under a tenth of a second, please loop it multiple times so the total elapsed time is around five seconds. Then divide the five seconds by the number of times around the loop to give the average time per solution. Eg if it runs 75 times in 5 seconds then it’s taking 1/15th of a second, or approx 0.06666 seconds.

The Output

Your program should output two lines as shown below, the answer followed by the time to do one iteration. (The value below is not the correct answer!)


4 minutes, 32.678 seconds

Submitting Code and Compiling

Please submit your entry zipped up as an attachment to this email address and please don’t include any executable binaries such as exes or dlls.

The competition PC is an i7 950 with 4 cores doubled to 8 by hyper threading and 24 GB of ram. You are allowed to use multithreading/multitasking etc.

For compiling C++ and C# I’ll use Visual Studio 2012. Please include all the project and solution files such as .suo, etc., so I can open the solution and build it. Code built with earlier versions (VS 2003, VS 2005, VS 2008 and VS 2010) are all acceptable.

If you use GCC/G++ please let me know and include your preferred command line to optimize it. I use MinGW to build that on my Windows PC. All programs are compiled in Release mode, so please specify any optimization flags for GCC/G++ or specify them in the solution.

For Java, I have Java 7 installed on my PC. If you use any third party open source libraries that need to be installed, please let me know.

Again, the prizes are glory and a t-shirt. At the end of the competition we’ll publish the last version of each entry on SourceForge and highlight a winner. Be sure to include your name and website in the source code comments. By entering this contest you agree to allow Dice to publish your source code with credit to you as the copyright owner.

Submitting Newer Entries

It’s acceptable to submit newer versions right up until the deadline. If you’ve sent in several, then the most recent will be used.

The Results

Will be posted here and on SourceForge. I may periodically update this during the month as new entries come in. Soon after the deadline, the final results will be posted.

Updated October 3 to correct example error.

40 Responses to “Coding Challenge: Prove Your Factorial Fluency”

  1. @Steven: My take on it is that since the contest ends on the 31st, there’s not a lot of rush required. Now, granted, I’m going to be hovering over this site pretty much on the Monday after that weekend to see if I won, but I’m not going to worry about lack of news until then.

  2. Thomas, the Mischievious Optimizer

    Wouldn’t the winning code be the one that just outputs the correct answer (multi-threaded so it takes up as much of the computer’s resources as possible)? There’s nothing in the rules that says the answer needs to be calculated by the program… which makes this a challenge to just figure out the most optimal way to “return” the correct 9 digit number… (how stand-alone does the “program” need to be? Can I just time how long it takes to store the result to a variable, or do I need to actually call into another program/function to have it return results…? What if the compiler decides to optimize everything out, since it’s just storing the same value into the same variable?).

    Sounds to me like this will be like the challenge to create the “smallest self-replicating program,” which was won by the team that submitted a “null file”…

    • Richard Wolf

      Hmm…I was actually thinking along similar lines…I was able to do a paper run of my algorithm and come up with the answer (which I am very confident is correct). Wag that I am, I thought about just submitting a cout << the_answer; type solution…but I figure that goes against the spirit of the context. I figured that, in the spirit of the contest, it should be possible to modify the code so-as-to produce the nth highest possibility (that is, you could get the 50,000th highest number or the 100,000th just as easily). My code (Xcode on a MacBook Air/Core i7) is running in about 0.00003 sec. My feeling is that brainier (i.e. most) other entrants will easily surpass that. 🙂

      • David Lotts

        @Richard, I took the same approach, but yours is 10 times faster.
        Mine runs 0.000511 seconds
        averaged over 10,000,000 runs.
        I’m betting yours is C++ ?

        I did it in Java in Eclipse on Ubuntu on a 1.9GHz AMD Phenom(tm) II P840 Triple-Core Processor. (Which I believe is much slower than an i7)
        Maybe I still have a chance!

        • Richard Wolf

          @David Yes, mine is C++…well, really it’s straight-up C that I modified just enough so that it would compile as C++. I get 5,000,000 iterations in about 3.72 seconds in my development environment (Xcode 5, LLVM on a MacBook Air (2011), Core i7). I think we’re all on the same wavelength about how this should be solved…recursion is more elegant, but since is one of those tail recursion situations, so I went recursion-less. If I were brainier, I might be able to kill a subtraction or something…squeeze a bit more out of it…but I don’t think there’s much more that can be done. I’m really looking forward to seeing everyone’s take on this.

    • @ Thomas I do wonder if I should turn in a “solution” that simply assigns the answer along with a switch to turn on my general solution. I would hate to miss out on the t-shirt because I didn’t cheat. Hmm… If I knew which graphics card he uses I could write a massively parallel solution for the 5 second timing test that would finish trillions of calculations.

  3. @Thomas, I hope that doesn’t win because it just wrong. It doesn’t really solve the problem.
    @judges. What the requirement for thread do you have to use threads in the solution or can you just create 8 separate and have then each solve it x number of times. I.e. are we timing the solution time for the algorithm or how many times can I run in 5 secs.

    • @Richard, I was able to run my Microsoft Excel VBA code in 8.13e-6 seconds (614,391 times in 5 seconds) on a Windows 7 Core i5 2.27 GHz computer with 4 GB of RAM.

      Sadly, VBA doesn’t allow multithreading, or I might be able to do more optimizations (more than the Mischievous’s suggestion of just trying to run it as many times as possible to increase the count) and maybe get it down further. 🙁

      @Daren, I think it’s because the judge doesn’t want to install who knows how many programs…

      @Steven, I have the feeling that he’s going for the timing of the solution, not how many times it can run in 5 seconds, although it isn’t specified in the contest rules.

      @Vasan, Doesn’t Java have an option where the threads can be put on different cores? And are context-switches always required when the thread returns results to other threads/the main thread, or could it just pass a message so it’s quicker than a context-switch?

      • @Thomas – I checked on multi-threading into multiple-cores, and it appears that there may be some options of tuning the JVM on Solaris. Found it on this link: Pretty confusing reading, but worth trying on Solaris.
        On context-switches – in a multi-core system each core run independently, and context switches should happen only when co-ordination takes place. I guess that depending on your algorithm for this problem, you may get some advantage (skeptical, since this problem has a lot of dependencies between the digits).
        FYI: I got a result of 250ns (0.000,000,25). Hopefully the answer is correct. However on this scale, computation on timings is probably way off. But stop watch showed a correlation. (i7-2670QM quad-core, 2.2GHZ,windows 7-64 bit, 8GB laptop)

        • Richard Wolf

          Wow, 250 nanoseconds? The best I can get is roughly 620 ns. I can only get that if I average across 5,000,000 iterations in 2 to 3 seconds (I suspect this is because I’m seeing hidden optimizations over the large number of iterations). My processor is slower than yours (1.8 GHz)…still, it’s not enough to make up the difference in our speeds. My bet is that that t-shirt is yours. 🙂

  4. @All – Why is multi-threading critical here? Since this is CPU bound, the only advantage of multi-threading would be if all cores were used. I am not sure that each thread will be assigned to separate cores from ‘one’ program (in Java/C++ ..). I believe that windows runs different tasks/separate programs on the individual cores, but not the sub tasks, which would be threads.
    If it does, you may have some advantage, however, balance this against the context-switching overhead. IMHO that multi-threading in this CPU bound task, is not going to be of much help in the Windows O/S

    • Richard Wolf

      I agree…I think the amount of resources needed just to setup threading for this problem would probably make any threading solution run more slowly than a straight-up solution. Threading is an awesome thing…but in this case, I think it’s a red herring. 🙂

  5. Steven Heithoff

    Problem with Java timer code. The code you sugguested has a bug in it. If you call the getElapsedTimeSecs function it will round down to the nearest second. Ie if it takes 6.9 seconds it will return 6. So I would suggest it be changed to this below.

    It should be changed to
    public double getElapsedTimeSecs() {
    double elapsed;
    if (running) {
    elapsed = ((System.currentTimeMillis() – startTime) / 1000);
    else {
    elapsed = ((stopTime – startTime) / 1000);
    return elapsed;

    if you are using it this function.

  6. Well I did some testing with threading. With my I7 laptop. I was able to thread 8 times. Resulted in net 3-4 times has many solutions. So when though I’m just having the system do extra calcs. I’m not using threads to complete a indiviual solution. Just to do more of them faster. So if threading is not restricted we all most likely could implement and get another 3 or 4 times solutions.

    • Richard Wolf

      Hmm…interesting take on threading. I think if you’re just solving this *once*, then threading doesn’t make any sense. But if you’re trying to get the maximum runs in any five second interval, then it might (probably would) be worth the overhead. As I noted in an earlier comment, I’m getting about 5,000,000 iterations in 3.72 seconds on a single thread. If I multiplied that, roughly, by 4 or 8, I’d get 20 to 40 million results in five seconds. Not sure if that is in the spirit of the contest…but I’d still bow to anyone who won that way. 🙂

  7. Wondering Richard

    I have it down to 170 ns on my 5 year old computer. I suspect that I an getting close to the end of the optimizations as I am debating if we actually need to add things or would an XOR work…

  8. Richard Vasquez

    After seeing numbers like 620ns, 250ns, and 170ns, I was really starting to feel bad about what I had originally thought was a pretty spiffy C++ 680us result which beat hands down my C# 4ms project.

    So I cracked open some of my books and started reading and experimenting.

    It’s not that I mind getting beaten, and I’m sure there’s possibly someone submitting a light speed implementation right now that’ll beat me. I just didn’t want to get beat by a few orders of magnitude. 🙂

    Non-premature optimization resulted in the following:

    Intel Xeon W3505@2.53GHz => 34 to 36 nanoseconds, no threading or multiple processes.

    • Richard Wolf

      Yeah, the best time I could get was 430 ns. I felt pretty good about that…and there’s no way I’m going to get it down an order of magnitude without going hard-core and flipping bits. I’m not going to win, but if I can (fingers crossed) get in the top five or ten, I’ll be happy.

      I kind of look at it this way…I put my best coding foot forward because I figure, hey, maybe some employers are reading this…it’s a good way to show my coding chops, such as they are. I think this contest is a tremendously clever move on Dice’s part…it’s fun for us, gives us a chance to show our mad skillz, and is a great way for employers to get a good look at what we can do. Fiendishly clever…good people at!

      • Steven Heithoff

        Did I do my math right? The fastest solution on update is 2.4 nano seconds or on a 3ghzish processer 6-10 instructions/cpu cycles?. Seems I going to have to study up that solution for next time.

        • Richard Wolf

          Yeah, when I got mine under a microsecond…hitting 600/700 ns, I figured, “Hey, I’m down to a few hundred machine instructions…that’s gotta be pretty close to the maximum performance you can get.” So if someone’s got it down to under 100 ns on a 3GHz machine…well, that person is looking at, maximum, one/two hundred machine instructions? We’re talking hard-core there. Of course a piece of this is that we’re supposed to iterate over 5 seconds…so, dunno what tricks folks are using to cram the maximum number of computations over the greatest number of threads/cores, going for the lowest average time/computation. Either way you cut this, color me impressed. 🙂

        • Richard Vasquez

          Yeah, I saw that number, counted the zeroes and just plotzed. I submitted an entry earlier this month, and if that one shows up on the update, I can scale it to my 680us, and then hopefully apply it to my 35ns answer to compare.

          I’ve been thinking on it off and on since I saw that number and I have no idea how it’s done that quickly. I’m definitely looking forward to learning something new from that.

          I went and unrolled my loops and coded in some lookup tables, and could not make it faster. At least not without LLVM. That turned out to be an amusing, albeit useless, experiment.

          • Thomas Gauthier

            @Richard, I’m looking forward to learning something new from the winner too, but it seems to me like the mischievous optimizer might have given away at least one of the winner’s tricks…

            @David, could you check how long the top 3 winning solutions take if the programs are modified so they are single threaded and output/calculate the results for all positions in the sequence- not just 100,000? That might be enlightening…

          • Richard Vasquez

            @Thomas: For what it’s worth, I’d say that if the time was the result of threading, it’s legit, but I too would like to see what the result would be if the search index was changed. I’m fine with that as I have some variable set to 100000 which could be changed to something else, then rerun, so not worried.

            If it _did_ turn out to be a “mischievous optimization”, I’d definitely not be happy.

  9. Richard Wolf

    Any word on the results? I’m positive I lost…but I”m hoping I got into the top ten. No biggie in the Great Scheme of Things…but I confess I super curious to see how I did. 🙂