Using DNA as a Digital Information Storage Medium

Researchers in bioengineering and genetics at Harvard’s Wyss Institute have figured out how to store 1,000 times more data on strands of DNA than was previously possible, packing approximately 700 terabits of binary code into a gram of liquid DNA solution[1]. They demonstrated this by storing 96 bits of information on each synthesized strand of DNA, assigning each of the four bases a value of 1 or 0. The A and C bases were assigned 0, while their complementary bases, T and G, were assigned 1. The DNA was then sequenced to retrieve the binary code and make the information readable.
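The base-to-bit mapping described above can be sketched in a few lines of Python. This is an illustration only, not the researchers' software: the function names are hypothetical, and choosing at random between the two bases allowed for each bit is an assumption of the sketch rather than a detail given in this article.

```python
import random

# Mapping taken from the text: A or C encodes 0, T or G encodes 1.
BASES_FOR_BIT = {"0": "AC", "1": "TG"}
BIT_FOR_BASE = {"A": "0", "C": "0", "T": "1", "G": "1"}


def text_to_bits(text):
    """Convert text to a string of '0'/'1' characters (8 bits per UTF-8 byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits; expects a multiple of 8 bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode_bits(bits, rng=None):
    """Write each bit as one base.

    Assumption of this sketch: the choice between the two bases that
    represent a bit is made at random.
    """
    if rng is None:
        rng = random.Random()
    return "".join(rng.choice(BASES_FOR_BIT[bit]) for bit in bits)


def decode_bases(sequence):
    """Read a sequenced strand back into a bit string."""
    return "".join(BIT_FOR_BASE[base] for base in sequence)


if __name__ == "__main__":
    bits = text_to_bits("DNA storage!")  # 12 bytes, i.e. 96 bits
    strand = encode_bits(bits)
    print(strand)
    print(bits_to_text(decode_bases(strand)))
```

Because two bases stand for each bit, many different strands decode to the same 96 bits, which leaves the writer free to choose among them.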

To demonstrate this technology, the researchers encoded an HTML book of more than 50,000 words, along with images and JavaScript, into about 55,000 159-nt oligonucleotides. Oligonucleotides are short DNA strands synthesized in the laboratory; each of these is 159 nucleotides (nt) long. While this may seem like a time-consuming and expensive way to store data, the strands do not need to be read in any particular order. Each oligonucleotide carries its own address, allowing the information to be put back in order after sequencing[2].
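The addressing idea can be illustrated with a minimal Python sketch that works on bit strings rather than bases. The 96-bit payload comes from the text; the address width, the zero padding of the last block, and the function names are assumptions made for illustration.

```python
import random

PAYLOAD_BITS = 96   # data bits per strand, as stated in the text
ADDRESS_BITS = 19   # assumed address width, chosen for illustration


def split_into_strands(bits):
    """Cut a bit string into records of the form address + payload."""
    # Pad the last block with zeros so every strand carries PAYLOAD_BITS.
    padded = bits + "0" * (-len(bits) % PAYLOAD_BITS)
    strands = []
    for index, start in enumerate(range(0, len(padded), PAYLOAD_BITS)):
        address = format(index, f"0{ADDRESS_BITS}b")
        strands.append(address + padded[start:start + PAYLOAD_BITS])
    return strands


def reassemble(strands, total_bits):
    """Sort records by address, join the payloads, and drop the padding."""
    ordered = sorted(strands, key=lambda s: int(s[:ADDRESS_BITS], 2))
    payload = "".join(s[ADDRESS_BITS:] for s in ordered)
    return payload[:total_bits]


if __name__ == "__main__":
    rng = random.Random()
    message = "".join(rng.choice("01") for _ in range(1000))
    pool = split_into_strands(message)
    rng.shuffle(pool)  # strands come back from sequencing in no fixed order
    print(reassemble(pool, len(message)) == message)
```

Since every record carries its own position, the pool can be read in any order and still be reassembled, at the cost of a few address bits per strand.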

This approach has a number of advantages over previous DNA information storage techniques. Each base stores only one bit of information rather than two, which makes the encoded sequences easier to read. Additionally, attaching an address to each strand eliminates the costly and time-consuming step of assembling these tiny molecules into long chains. The DNA was synthesized in vitro rather than in vivo, mitigating cloning and stability issues. Finally, the process uses next-generation methods for synthesizing and sequencing the DNA strands, reducing the cost of storing the encoded information by a factor of approximately 100,000 relative to first-generation DNA information storage techniques.



This next-generation form of DNA information storage increases data density, improves the stability of the storage medium, and significantly reduces the cost of storing information in DNA.





Benefit Summary: 

Unlike conventional magnetic storage, DNA is not confined to a planar layer (mass DNA storage can take a number of shapes), is stable on the order of millennia instead of centuries, and is not sensitive to environmental factors. This makes DNA storage a potentially disruptive technology for the semiconductor and integrated circuit industry. Additionally, the materials that DNA strands are made from are much less resource-intensive than traditional semiconductor materials, creating potential benefits in resource conservation and environmental health.


Risk Summary: 

The researchers who developed this technology identified three potential risk areas in DNA information storage. First, the technology poses potential risks in cryptography, since DNA can encode computer viruses. Second, DNA can be used to encode and transport select viral agents that could be harmful to humans or other organisms. Finally, DNA used for storage could potentially be incorporated into living organisms in the wild. The DNA would be unlikely to replicate or hybridize with biologically active organisms, but the technology would likely need to conform to EPA, FDA, or USDA regulations because of these potential risks.
