Everett Dirksen (if you are asking who, check out the ever useful Wikipedia) might have said “A million here, a billion there and pretty soon you’re talking real records.”
You might want to look at a new book by James Gleick – The Information: A History, a Theory, a Flood.
The flood of information humanity is now exposed to presents new challenges, Gleick says. Because we retain more of our information now than at any previous point in human history, it takes much more effort to delete or remove unwanted information than to accumulate it. This is the ultimate entropy cost of generating additional information, and the answer that slays Maxwell’s Demon.
Gleick’s website is interesting: http://around.com/
If only because of what he says of Twitter (in fact a lot more, especially if you like a diagram – see below).
Anyway, back to the topic. I previously estimated the number of digital records created by the VPS in a year (let’s call the year 2011) as being somewhere around 1.5 billion.
Traditionally archivists have identified that somewhere between one and five percent of the total records created would end up being permanently retained. Using the lower 1% figure indicates that 150 million digital records created each year should be retained as permanent archives. Sure, you could reasonably argue that the digital regime results in more records and that fewer of these are going to be kept for the long term, but as you’re not giving me any contrary evidence I’m sticking with my numbers. And anyway, my argument is that the number of records that need to be managed for the long term (and have the same attributes and management regime as permanent archives) is even higher in the digital world. So if anything 150 million is an underestimate.
It is important to note that the relationship of digital records to VEOs is not always (in fact, frequently not) one to one. Current evidence suggests that on average the ratio of digital records to VEOs is about 30 to 1 (30 records per file VEO).
On this basis 2011 will result in approximately 5 million VEOs for transfer to the Digital Archive at some stage after the records become inactive.
At the moment PROV’s Digital Archive has an ingest capacity of 200,000 to 500,000 VEOs per year. Taking a working figure of 250,000 from within that range, the gap between capacity and what is required is substantial, and the cumulative un-transferred number grows each year. On these figures, after 10 years the number of un-transferred VEOs would be c. 50 million. Sure, the figures are disputable/disreputable, but not the magnitude of the numbers.
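For anyone who wants to play with the numbers, here is a rough back-of-the-envelope sketch in Python. It starts from the 150 million permanent records per year quoted above, and it assumes (my simplification, nothing official) that both the creation rate and the ingest rate stay flat every year. The function and parameter names are made up for illustration.

```python
def untransferred_veos(permanent_records_per_year, records_per_veo,
                       ingest_per_year, years):
    """Return (VEOs created per year, backlog after `years`).

    Assumes constant creation and ingest rates, which is a
    simplification for illustration only.
    """
    # Number of VEOs produced each year from the permanent records.
    veos_per_year = permanent_records_per_year // records_per_veo
    # Yearly shortfall between VEOs created and VEOs ingested.
    yearly_gap = max(veos_per_year - ingest_per_year, 0)
    return veos_per_year, yearly_gap * years


veos_per_year, backlog = untransferred_veos(
    permanent_records_per_year=150_000_000,  # figure used in the post
    records_per_veo=30,                      # average records per VEO
    ingest_per_year=250_000,                 # working ingest capacity
    years=10,
)
print(veos_per_year)  # 5000000
print(backlog)        # 47500000
```

Swap in 500,000 for the ingest capacity and the ten-year backlog is still 45 million, so the order of magnitude does not move much.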
While these numbers are massive, there is a lot of comfort in considering that ICT capacity is growing much faster. The following is from Wikipedia:
“Driven by areal density doubling every two to four years since their invention, hard disk drives have changed in many ways. A few highlights include:
- Capacity per HDD increasing from 3.75 megabytes to 3 terabytes or more, about a million times larger.
- Physical volume of HDD decreasing from 68 ft³ or about 2,000 litre (comparable to a large side-by-side refrigerator), to less than 20 ml (1.2 in³), a 100,000-to-1 decrease.
- Weight decreasing from 2,000 lbs (~900 kg) to 48 grams (~0.1 lb), a 20,000-to-1 decrease.
- Price decreasing from about US$15,000 per megabyte to less than $0.0001 per megabyte ($100/1 terabyte), a greater than 150-million-to-1 decrease.
- Average access time decreasing from over 100 milliseconds to a few milliseconds, a greater than 40-to-1 improvement.
- Market application expanding from mainframe computers of the late 1950s to most mass storage applications including computers and consumer applications such as storage of entertainment content.”
The challenge for government archivists and record managers is to recognise, acknowledge and communicate what is required and then to work with ICT specialists to deliver an outcome that ensures digital records are preserved for the long term.