Here’s an interesting article from New Scientist about long-term personal data storage. The key idea is that while it will become easier and cheaper to store a practically unlimited amount of data, we also need better ways of organising, retrieving and presenting that data.
“Last week New Scientist pondered the fragility of digital data stores over the very long term, in the event of a civilisation-wide calamity. But anyone worried about civilisation’s chances would do well to look to their own data stores first.
Most of us today are blithely heading for our own personal data disasters. We generate and store vast volumes of information, but few of us really look after it.
“Benign neglect” is how Cathy Marshall of Microsoft Research Silicon Valley in Mountain View, California, describes the way most people treat their personal archives of digital material. It’s a view formed by spending time with computer users to find out how much people value their accumulated data, how they try to protect it and whether they’ve succeeded.
Most people adopt what she dubs “the infinite U-Store-It” approach, accumulating data haphazardly on various computers, gadgets, removable disks and online services. “If you’ve ever looked inside a U-Store-It you’ll realise why this is a bad idea,” she says. “People don’t realise what they have, they just save everything and when they do clean up they don’t do it systematically.”
When asked, people typically say they value their data a lot. But they lose it nonetheless, more from disorganisation than from a technological catastrophe such as a hard disk failure, Marshall has found. Data can fall prey to online services or ISPs closing accounts or changing their policies, to lost logins, or simply to our forgetting what we have and where we keep it, in physical or virtual space.
Web services – “cloud” computing – are becoming the home for much of our data: for example, people often store their photos on Flickr or business contacts on LinkedIn. Giving stewardship of our data to a third party in the cloud could be a way to keep it safe from both disaster and disorganisation.
For example, computer scientists led by Ethan Miller at the University of California, Santa Cruz, are developing hardware for storage services designed to look after data that you have yet to create.
Their plan, dubbed Pergamum, is to use low-power storage “bricks” that can each make 1 terabyte of data available instantly over the web while using just 2 watts of power – roughly the same as a pair of computer speakers.
The bricks contain digital storage and processors to manage that store and coordinate with other bricks. They can be connected together to make as large a store as is necessary with very little effort, and are designed to prevent future obsolescence: they connect using standard network switches to allow today’s bricks, which are built around hard disks, to work smoothly with tomorrow’s flash-based bricks, or those containing storage formats as yet unknown.
But as well as developing cheaper, more cavernous digital U-Store-Its, we need help to explore, organise and rediscover forgotten, perhaps decades-old data.
Software developed at the library of Stanford University, California, to record stories of pioneers of early computing suggests how this might be done. The Self Archiving Legacy Toolkit can recognise places, names and other organising concepts in a person’s digital “papers”, such as emails, letters and research reports. It then creates a branching “mind map” linking items by people, places or ideas that they have in common, forming an interactive digest of a person’s life.
Such a tool could be of use to any of us now that diverse, disorganised digital archives are becoming the norm.”
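To make the “mind map” idea a little more concrete, here is a rough sketch of how items might be linked by the people, places or ideas they share. This is not how the Self Archiving Legacy Toolkit actually works, which the article doesn’t describe. The function name `build_mind_map`, the file names and the concept labels are all made up for illustration, and the hard part (recognising names and places in the text) is assumed to have been done already, with each item simply hand-tagged.

```python
from collections import defaultdict
from itertools import combinations


def build_mind_map(items):
    """Link archive items that share a concept.

    `items` maps an item name to the set of concepts (people, places,
    ideas) found in it. The tags are assumed to be supplied by hand or
    by some separate recognition step; nothing is extracted here.
    """
    # Index: concept -> every item that mentions it.
    by_concept = defaultdict(set)
    for name, concepts in items.items():
        for concept in concepts:
            by_concept[concept].add(name)

    # Edges: pair of items -> the concepts they have in common.
    links = defaultdict(set)
    for concept, names in by_concept.items():
        for a, b in combinations(sorted(names), 2):
            links[(a, b)].add(concept)

    return dict(by_concept), dict(links)


# Hypothetical, hand-tagged archive used purely as an example.
archive = {
    "email_1975.txt": {"Alice", "Stanford"},
    "report_1980.txt": {"Alice", "compilers"},
    "letter_1982.txt": {"Stanford", "compilers"},
}

by_concept, links = build_mind_map(archive)

for (a, b), shared in sorted(links.items()):
    print(f"{a} <-> {b}: {', '.join(sorted(shared))}")
```

Even a simple index like this lets you start from one person or place and walk to every related item, which is roughly the kind of rediscovery the article is calling for.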