A NoSQL Alternative: Disk Folders

by David Bolton Apr 23, 2014 4 min read

One of the joys of programming is that there are often many different ways to accomplish the same goal. That comes in handy when you’re not feeling too confident about a given solution. Consider this problem I recently faced. I was choosing between a relational database or NoSQL for a game server I was working on. Whichever I chose would contain a couple dozen tables and would need to support up to 100,000 users. My real-time needs were very modest, especially compared to say Candy Crush Saga or Farmville, which have millions of users and millions of dollars to throw at the problem – I just needed to be able to receive updates and later do a lot of processing. I was leaning toward NoSQL as the solution but that was going to mean learning a new technology, and I didn’t want to spend tons of time on it. Click here to find game developer jobs. Unsatisfied with both options, I started to look for alternatives, which got me wondering about using folders. How many files can you have in a folder? Could I use a hierarchical structure of folders to hold the files? A bit of digging found a couple of interesting facts about Windows NTFS folders. As it turns out, folders can hold more than 10,000 files; you’ll just want to disable the 8.3 naming (left over from the 16 bit days) to improve performance.

I went with 100,000 folders organized as 100 folders numbered 0-99, each containing 1,000 folders numbered 0-999. I wrote a quick Delphi program to test out the structure. It created the 100,000 folders in 55 seconds, about 500 microseconds per folder. I put a 20 line text file in each one, which took a bit longer, about 365 seconds for all 100,000 folders. Finally, I tested choosing 1,000 folders at random, reading the text file, appending a line then writing it back. That took 4.4 seconds or about 4 μs per read/update/write. I rewrote the program in C# using File.ReadAllLines() combined with the LINQ .ToLines() to read the file into a List<String> in memory; I used File.WriteAllLines() to write it back. (By default, ReadAllLines() returns an array of strings which is not as convenient as a List<String>.) This method, below, takes about 4 seconds to run – about as long as the Delphi version.

As .NET sits on Windows, deep down they’re likely using the same code. Note that I built folder and file paths using Path.Combine(). That’s so the code can be ported to Linux, running under Mono without changing lots of front slashes to backslashes. On Mono, I just changed the constructor path from c:\dice.com\folders to /var/temp/dice.com/folders. $folderPath = Path.Combine(@"C:\", "dice.com", "folders");$

to:

I ran the same test in Mono Develop version 4 (available as part of Ubuntu LTS 14.04, it’s still in alpha but very usable) on a seven-year-old PC with 2 GB of RAM. Creating the folders and writing to the text files took 46,559 μs and 597,465 μs respectively. What was really slow though, was the 1,000 random reads/append/updates, which took about 40,776 μs each, 10 times the Windows speed! I can't explain this but it doesn't seem very good. With those speeds, using a file system in place of a database on Linux is not an option. Keep in mind, however, this was an alpha version; it may prove worthwhile retesting on a non-alpha version with newer hardware.

Next Time

In general, using folders is quite slow. In fact, an old programming method from 25 years ago (back when I was writing Turbo Pascal!) may be a lot quicker. Back in the those days, before databases were in common use, I'd store game information in a map. I did this by using binary access to a file and a chain of file pointers. So instead of having 100,000 folders, I'd have an index file with 100,000 records, each holding a file pointer into a data file holding all the text. We'll explore that method further in a future article. The C# source and project files from this test are available on SourceForge.

A NoSQL Alternative: Disk Folders

Next Time

Related Stories

A NoSQL Alternative: Disk Folders

Next Time

Related Stories

David Bolton

A NoSQL Alternative: Disk Folders

Next Time

Related Stories

A NoSQL Alternative: Disk Folders

Next Time

Related Stories

David Bolton

Related Articles

Does Web Game Scalability Demand NoSQL?

Testing Couchbase as a NoSQL Server

NoSQL Embraced by Database Pros