Author: Public Record Office Victoria
Emails are a vital part of doing business and are considered public records under the Public Records Act 1973. They enable the exchange of ideas and the enactment of decisions, and support collaboration across an increasingly dispersed workforce. In government, emails also provide evidence essential for accountability and need to be preserved as public records into the future.
The problem
Emails should not be disposed of until their value and content are known. In the course of their work, however, public sector employees can generate hundreds of thousands of emails, including emails that do not need to be captured. The large volume of emails in even a single email account can make it difficult to identify those that need to be kept.
Over twenty years of routine backup has resulted in an unwieldy backlog of Victorian Government emails, including 67,000 tapes and 28 petabytes of content. Accessing and retrieving emails for analysis or as evidence of decisions can be difficult, expensive and time-consuming. This compromises the Government’s reputation for transparency and accountability.
The proof of concept
We've been working with the Victorian Government technology provider, CenITex, on a project to make the Lotus Notes email stores more accessible and better managed. The Lotus Notes Proof of Concept (PoC) is the first step.
The PoC involved exploring the use of an eDiscovery tool to review and facilitate disposal of large volumes of emails, including:
• An initial assessment to quantify and qualify a sample email data set
• Identifying duplicates within the data set
• Identifying low value versus high value records within the data set
• Assigning contextual information to the de-duplicated set
• A manual review of results to determine the level of accuracy.
Of the 4.6 million emails in the sample, we found that 43% were duplicates and 7% were of low value.
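To put those percentages in absolute terms, the short calculation below converts them to approximate email counts. It assumes that both percentages are measured against the full 4.6 million sample (the summary does not say so explicitly) and that the published figures are rounded, so the counts are indicative only. Under that assumption the two categories together come to about half the sample, which is consistent with the "up to 50%" figure reported in the findings.

```python
# Figures quoted in the article; treated here as rounded values.
SAMPLE_SIZE = 4_600_000
DUPLICATE_PCT = 43   # assumed to be a share of the full sample
LOW_VALUE_PCT = 7    # assumed to be a share of the full sample

# Integer arithmetic avoids floating-point rounding noise.
duplicates = SAMPLE_SIZE * DUPLICATE_PCT // 100
low_value = SAMPLE_SIZE * LOW_VALUE_PCT // 100
disposal_candidates = duplicates + low_value
disposal_pct = disposal_candidates * 100 // SAMPLE_SIZE

print(f"Duplicates:          {duplicates:,}")
print(f"Low value:           {low_value:,}")
print(f"Disposal candidates: {disposal_candidates:,} ({disposal_pct}% of sample)")
```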
How we did it
Our goal was to reduce the volume of the email backlog in an authorised way, which in the Victorian Government means in line with Retention and Disposal Authorities (RDAs).
First, the tool was used to identify duplicate emails within the sample. To identify low value emails in the remaining sample, we reviewed a list of email domains to find those that could reasonably be expected to produce irrelevant, non-business-related emails. The top results, which included common subscription emails and Google Alerts, were selected and saved as filters. The presence of Fwd: in the subject line was also used as a filter.
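The two steps described above, de-duplication followed by domain and subject-line filtering, can be illustrated with a minimal sketch. This is not how the eDiscovery tool works internally, which the summary does not describe. The `Email` class, the `LOW_VALUE_DOMAINS` list, the choice to hash sender, subject and body, and the use of the sender's domain are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    sender: str
    subject: str
    body: str


# Hypothetical filter list; the domains actually used in the PoC are not given.
LOW_VALUE_DOMAINS = {"newsletter.example.com", "alerts.example.com"}


def fingerprint(msg: Email) -> str:
    """Hash the normalised sender, subject and body so identical copies collide."""
    parts = (msg.sender.strip().lower(), msg.subject.strip(), msg.body.strip())
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


def deduplicate(messages):
    """Keep the first copy of each message; return (unique, duplicates)."""
    seen, unique, duplicates = set(), [], []
    for msg in messages:
        key = fingerprint(msg)
        if key in seen:
            duplicates.append(msg)
        else:
            seen.add(key)
            unique.append(msg)
    return unique, duplicates


def is_low_value(msg: Email) -> bool:
    """Flag mail from a filtered domain or with 'Fwd:' in the subject line."""
    domain = msg.sender.rsplit("@", 1)[-1].strip().lower()
    return domain in LOW_VALUE_DOMAINS or "fwd:" in msg.subject.lower()


sample = [
    Email("a@agency.example", "Budget", "Approved"),
    Email("A@agency.example ", "Budget", "Approved"),  # same message, different casing
    Email("news@newsletter.example.com", "Weekly digest", "Top stories"),
    Email("b@agency.example", "Fwd: lunch", "See below"),
]
unique, duplicates = deduplicate(sample)
flags = [is_low_value(m) for m in unique]
```

Flagged emails here are only candidates for disposal. Any actual disposal would still need to be authorised under the relevant RDA and checked by manual review, as in the PoC.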
Next, we tried a second approach on the sample, searching the remaining emails for key search terms.
In a third approach, we applied additional contextual information to the emails so that they could be grouped by area of responsibility within the organisation. This grouping allows us to assess and prioritise the emails to be kept long term.
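A key-term search of this kind can be sketched as follows. The search terms used in the PoC are not listed, so the terms below are hypothetical, and the whole-word, case-insensitive matching rule is an assumption rather than a description of the tool.

```python
import re


def find_terms(text, terms):
    """Return the terms that occur in text as whole words, ignoring case."""
    hits = []
    for term in terms:
        # re.escape keeps punctuation in a term from being read as regex syntax.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


# Hypothetical search terms, for illustration only.
terms = ["raffle", "out of office"]
hits = find_terms("Automatic reply: Out of Office until Monday", terms)
no_hits = find_terms("Minutes of the steering committee", terms)
```

Whole-word matching is used so that a short term does not match inside a longer, unrelated word.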
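One way to picture this step is to tag each email record with an area of responsibility and then group the records by that tag. The sketch below assumes the area can be looked up from the mailbox that held the email. The summary does not say what contextual information was applied or how, so the `AREA_BY_MAILBOX` mapping and the field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from mailbox to area of responsibility.
AREA_BY_MAILBOX = {
    "finance.officer@agency.example": "Finance",
    "hr.adviser@agency.example": "Human Resources",
}


def tag_area(record, area_by_mailbox, default="Unassigned"):
    """Return a copy of the record with an added 'area' field."""
    tagged = dict(record)
    tagged["area"] = area_by_mailbox.get(record["mailbox"].lower(), default)
    return tagged


def group_by_area(records):
    """Group tagged records into a dict keyed by area."""
    groups = defaultdict(list)
    for record in records:
        groups[record["area"]].append(record)
    return dict(groups)


records = [
    {"mailbox": "finance.officer@agency.example", "subject": "Invoice approval"},
    {"mailbox": "HR.Adviser@agency.example", "subject": "Recruitment panel"},
    {"mailbox": "unknown@agency.example", "subject": "Hello"},
]
tagged = [tag_area(r, AREA_BY_MAILBOX) for r in records]
groups = group_by_area(tagged)
```

Records that cannot be matched fall into an "Unassigned" group so that nothing is silently dropped before review.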
The findings
The eDiscovery tool allowed us to identify emails eligible for disposal, and to assess and prioritise the remaining emails, with between 98% and 100% accuracy. Up to 50% of the sample was identified for potential disposal. The tool also allowed us to apply additional metadata to every email in the set, making emails easier to identify at a high level and facilitating future decisions about retention.
An eDiscovery tool may be used to help agencies reduce their email backlogs and unlock greater value from their email assets, though a larger sample of manual testing is recommended before implementing disposal. Note that an eDiscovery tool may be beyond the means of smaller agencies that nonetheless struggle with similar email backlog issues. An investigation into email backup for smaller agencies, and potential testing of free, open source solutions, is recommended.
For more information, download our proof of concept summary as a PDF below:
If you'd like further information about this project, feel free to contact Julie McCormack, Senior Manager, Government Services, at Julie.McCormack@prov.vic.gov.au.