When it comes to storing vast amounts of data, your options are fairly limited. You can use a “traditional” hard drive with spinning disks or a newer solid-state drive (SSD), or, if you want to go truly old-school, print out all that information on paper and store it somewhere safe.
Now a team of scientists has offered another possibility for data storage: DNA.
Headed by Nick Goldman of the European Bioinformatics Institute (EBI), the group has coded data onto deoxyribonucleic acid using a cipher based on its four component nucleotides (G, A, T, C). It’s an update of a method devised by George Church of Harvard Medical School, who coded information into DNA as binary, using the bases adenine or cytosine to represent zeroes and guanine or thymine to represent ones.
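To make the binary mapping concrete, here is a minimal Python sketch of a Church-style scheme as described above: A or C stands for 0, G or T stands for 1. The function names are hypothetical, and the rule used to pick between the two candidate bases (avoid repeating the previous base) is an assumption made for illustration, not a detail taken from either team's published method.

```python
ZERO_BASES = "AC"  # either base can represent a 0 bit
ONE_BASES = "GT"   # either base can represent a 1 bit


def text_to_bits(text: str) -> str:
    """Turn text into a string of 0s and 1s (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode_bits(bits: str) -> str:
    """Map each bit to a base: A or C for 0, G or T for 1.

    Assumption for this sketch: of the two candidate bases, take the
    first one that differs from the previous base, so the same letter
    never appears twice in a row.
    """
    bases = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        choices = ZERO_BASES if bit == "0" else ONE_BASES
        previous = bases[-1] if bases else None
        bases.append(choices[0] if choices[0] != previous else choices[1])
    return "".join(bases)


def decode_bases(dna: str) -> str:
    """Recover the bit string: A/C become 0, G/T become 1."""
    bits = []
    for base in dna:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    strand = encode_bits(text_to_bits("Hi"))
    print(strand)
    print(bits_to_text(decode_bases(strand)))
```

Because two bases are available for every bit, the encoder has some freedom in which letter it writes, while the decoder only needs to know which pair a base belongs to.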
Church’s binary system, while capable of the task, was “hard for sequencing machines to read and led to errors,” according to a new article about the quest in Nature. The Goldman team’s solution apparently sidesteps those issues. Both methods, however, are expensive—around $12,400 per megabyte, the EBI scientists told the journal.
As a proof of concept, Goldman’s team coded 154 of Shakespeare’s sonnets into DNA, but the technique can deal with much larger datasets: Nature suggested that the 90 petabytes of data stored by CERN could fit within 41 grams of DNA—a significant weight and size reduction from the 100 tape drives currently tasked with holding that information.
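For a sense of scale, the figures quoted above can be combined in a few lines of Python. This is back-of-the-envelope arithmetic only: it assumes decimal units (1 petabyte = 10^9 megabytes) and, unrealistically, that the quoted per-megabyte cost would scale linearly to the full CERN archive.

```python
# Figures quoted in the article
CERN_PETABYTES = 90         # data stored by CERN
DNA_GRAMS = 41              # DNA said to be able to hold it
COST_PER_MEGABYTE = 12_400  # US dollars, per the EBI scientists

# Assumption: decimal units, 1 PB = 10**9 MB
MEGABYTES_PER_PETABYTE = 10**9

# Storage density implied by the two CERN figures
petabytes_per_gram = CERN_PETABYTES / DNA_GRAMS

# Cost of the whole archive if the per-megabyte price scaled linearly
total_megabytes = CERN_PETABYTES * MEGABYTES_PER_PETABYTE
total_cost_usd = total_megabytes * COST_PER_MEGABYTE

print(f"Density: about {petabytes_per_gram:.1f} PB per gram of DNA")
print(f"Cost at the quoted rate: about ${total_cost_usd:.3e}")
```

The density works out to roughly 2.2 petabytes per gram, while the naive cost estimate lands above a quadrillion dollars, which goes some way toward explaining why the technique remains a proof of concept.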
And unlike hard drives, which generally last a few years at most, DNA can keep for millennia under the right conditions. (Ötzi the Iceman, for example, still contained readable DNA after more than 50 centuries entombed in a glacier.)
That doesn’t mean companies are going to rush to store their data in DNA anytime soon. For one thing, businesses often need to retrieve and analyze data in a matter of minutes or hours, something not possible with this technique. But as an archival method for massive datasets, DNA may someday come in useful.