Recent advances suggest there may be a new way to store the exploding amount of computer data and keep it pristine for centuries rather than decades.
Computer data have been depicted as microscopic magnetic smudges, electric charges and even Lilliputian patterns of dots that reflect laser beams. Data may ultimately move into the fabric of life itself — encoded in the organic molecules that are strung together like pearls to form strands of DNA.
In two recent experiments, a team of computer scientists at the University of Washington and Microsoft, and a separate group at the University of Illinois, have shown that DNA molecules can be the basis for an archival storage system potentially capable of storing all of the world’s digital information in roughly 9 liters of solution, about the amount of liquid in a case of wine.
The new research demonstrates that specific digital files can be retrieved from a potentially vast pool of data.
The new storage technology would also be capable of keeping immense amounts of information safely for a millennium or longer, researchers said.
It would also address a glaring Achilles’ heel of microelectronic data-storage systems: Magnetic disks, tape and even optical storage systems can safely hold information for only a few decades at most.
The raw storage capacity of DNA is staggering compared with even the most advanced electronic or magnetic storage systems. It is theoretically possible to store an exabyte of information, if it were coded into DNA, in the volume of a grain of sand. An exabyte is roughly equivalent to 200 million DVDs.
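The DVD comparison can be checked with a few lines of arithmetic. This is a rough sketch that assumes a decimal exabyte (10^18 bytes) and a single-layer DVD of about 4.7 gigabytes; neither figure is given in the research itself.

```python
# Rough check of the "200 million DVDs" comparison.
# Assumptions (not stated by the researchers): decimal units and
# a single-layer DVD holding about 4.7 gigabytes.
EXABYTE_BYTES = 10**18
DVD_BYTES = 4.7e9

dvds_per_exabyte = EXABYTE_BYTES / DVD_BYTES
print(f"About {dvds_per_exabyte / 1e6:.0f} million DVDs per exabyte")
```

Under those assumptions the result lands a little above 200 million, in line with the rough figure quoted.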
In nature, DNA molecules carry the genetic instructions that govern the development and function of living organisms.
The cost of sequencing, or “reading,” the genetic code is falling faster than the cost of computer memory, and technologists are beginning to make progress in rapidly synthesizing strands composed of arbitrary sequences of nucleotides, the small organic molecules that are the basic DNA building blocks. Short synthetic strands of this kind are known as oligonucleotides.
Computer scientists say they believe that as costs of sequencing and creating synthetic DNA continue to fall, it will soon be possible to create a new class of hybrid storage systems.
“In the last year, it suddenly hit us that this fusion of computer technology and biology will be where future advances come from,” said Douglas M. Carmean, a Microsoft researcher who had been a leading designer of microprocessor chips at Intel.
The link between the two fields dates back to the start of interactive computing. The first true personal computer, known as the LINC, was designed by Wesley A. Clark in 1961 for biomedical researchers.
“Information technology has helped biotech in the past,” said Luis Ceze, a University of Washington computer scientist and one of the designers of the new DNA storage system. “Now biotech has to pay back.”
Early signs of a possible convergence of computing and biology can be found in a visit to a cramped laboratory in the basement of the Paul G. Allen Center for Computer Science & Engineering on the University of Washington campus.
It is crammed with equipment more readily found in a biology laboratory — a desktop DNA sequencing system and a separate machine that is used to amplify fragments of DNA by making billions of precise copies.
Together, the two machines form a prototype of a data-archiving approach that could come into wider use in as little as five years.
The researchers note that it could be used by Hollywood studios and modern hospitals that need long-term storage for digitized movies as well as X-ray and MRI images.
Previous experiments, performed in 2012 at Harvard University and in 2013 at the European Bioinformatics Institute in Hinxton, England, showed that it was possible to store data files in DNA and then read the information back in digital form.
The Harvard group received international attention for storing billions of copies of “Regenesis,” a book written by Harvard geneticist George Church and Ed Regis.
The research teams from the University of Illinois and from the UW and Microsoft have built on that work by storing information in DNA form and then retrieving a specific file from the data.
The Illinois scientists were able to encode parts of the Wikipedia pages of six universities, then select and edit parts of the text written in DNA corresponding to three of the colleges.
The University of Washington and Microsoft researchers decided that because of the vast potential storage capacity of DNA, it would be better used for simply storing data rather than rewriting it. They were able to store four small image files and then retrieve them independently with just a single error.
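The general idea of writing files as DNA and pulling one back out of a mixed pool can be illustrated with a toy sketch. It is not either team's actual method: the two-bits-per-base mapping, the one-byte file key and the function names below are all assumptions made for illustration, and the sketch ignores the error correction and chemistry a real system would need.

```python
# Toy illustration only; the mapping and the key-tag scheme are assumptions.
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, two bits per base."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def store(pool: list, key: bytes, data: bytes) -> None:
    """Add a strand to the pool, prefixed with a one-byte key tag."""
    assert len(key) == 1, "this sketch assumes fixed one-byte keys"
    pool.append(bytes_to_dna(key) + bytes_to_dna(data))


def retrieve(pool: list, key: bytes) -> list:
    """Return the payloads of all strands whose tag matches the key."""
    tag = bytes_to_dna(key)
    return [dna_to_bytes(s[len(tag):]) for s in pool if s.startswith(tag)]


pool = []
store(pool, b"\x01", b"cat")
store(pool, b"\x02", b"dog")
print(retrieve(pool, b"\x02"))
```

In this sketch the key tag plays the role of an address: only strands that begin with the matching tag are decoded, so one file can be read without decoding the rest of the pool.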
In addition to refining the computerized reassembly techniques, the research groups are continuing to work on improving the basic storage technology.
“We have scaled up our 2012 work about a hundredfold,” Church said. His laboratory is working in collaboration with Technicolor S.A., a French company that has a large business in digital data and film archiving.
“The big issue is lowering the cost by another thousandfold, which is our current focus,” he added.
The Harvard laboratory is trying to encode and retrieve “A Trip to the Moon,” a 1902 French silent film.
The University of Washington and Microsoft researchers have partnered with Twist Bioscience, a San Francisco startup that has developed a semiconductor-based system that accelerates the production of custom DNA strands in which digital data can be encoded.
The scientists acknowledge that their current bottleneck is in the ability to write the information in DNA, but they say they expect that technology to begin to improve rapidly in the near future.
“It is absolutely about the technology and miniaturizing the scale of the reaction” used to create synthetic DNA, said Emily Leproust, the chief executive of Twist.
Although it is snaillike in retrieval speed compared with electronic and magnetic memory, DNA will be far better in both the amount of data it can store and the length of time it can preserve it.
“DNA is a remarkable media for long-term storage,” said Karin Strauss, a Microsoft computer architect. “All you have to do is keep it cold and dry.”