Background
Emails are a vital part of doing business and are considered public records under the Public Records Act 1973. Emails enable the exchange of ideas and the enactment of decisions, and support collaboration across an increasingly dispersed workforce. In government, some emails also provide evidence essential for accountability and should be preserved as public records into the future.
Since the late 1990s, the Victorian Government (VG) has used the Lotus Notes (LN) email application as a principal communication tool both internally and externally. Because key actions and decisions of public officers are captured in email, it is a primary repository of VG records.
In their current proprietary format and accumulated (online and Linear Tape-Open) storage volumes, emails can be difficult, expensive and time consuming to access and retrieve for analysis and as evidence of decisions. As a result, their value as an information source cannot be fully realised.
This compromises the VG’s reputation for transparency and accountability, poses a risk to current administration and creates a potential gap in the documented memory of Victoria.
About the project
Public Record Office Victoria (PROV) is undertaking a project to develop and test solutions to appropriately capture, store, appraise and dispose of LN email accumulations.
The project is being undertaken as a series of stages, outlined below.
Stage 1: Proof of Concept (PoC), 2017-18
For Stage 1, PROV undertook a PoC with CenITex to test an eDiscovery tool on a sample set of 4.6 million LN emails from a VG Department.
The PoC focused on disposal outcomes and included the following tasks:
- Conducting an initial assessment to quantify and qualify a sample email data set
- Identifying duplicates within the data set (we found that 43% were duplicates)
- Identifying low value/non-public records within the data set by analysing domain names
- Manually reviewing results to determine accuracy
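
The duplicate and domain checks listed above can be illustrated with a minimal Python sketch. It is not the eDiscovery tool used in the PoC, which is not described here. It assumes the emails have already been exported from LN to plain RFC 5322 text, treats two messages as duplicates when their sender, subject, date and whitespace-normalised body match, and uses a hypothetical `LOW_VALUE_DOMAINS` list. All function names are illustrative.

```python
import hashlib
from email import message_from_string
from email.utils import parseaddr

# Hypothetical list of sender domains treated as low value (assumption).
LOW_VALUE_DOMAINS = {"newsletter.example.com", "promo.example.org"}


def body_text(msg):
    """Concatenate the text/plain parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)


def fingerprint(msg):
    """Hash the fields assumed to identify a duplicate email."""
    sender = parseaddr(msg.get("From", ""))[1].lower()
    subject = " ".join((msg.get("Subject") or "").split()).lower()
    date = (msg.get("Date") or "").strip()
    body = " ".join(body_text(msg).split())
    key = "\x1f".join([sender, subject, date, body])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def deduplicate(raw_messages):
    """Keep the first message seen for each fingerprint."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        fp = fingerprint(msg)
        if fp not in seen:
            seen.add(fp)
            unique.append(msg)
    return unique


def sender_domain(msg):
    """Return the lower-cased domain of the From address."""
    address = parseaddr(msg.get("From", ""))[1]
    return address.rpartition("@")[2].lower()


def is_low_value(msg):
    """Flag a message whose sender domain is on the low-value list."""
    return sender_domain(msg) in LOW_VALUE_DOMAINS
```

Flags produced this way would still need the manual review described in the last task, since a domain list cannot judge the content of an individual email.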
For more information, please download our Stage 1 PoC summary report.
Stage 2, 2019-20
For Stage 2, we used a collection of our own LN emails (approx. 1.2 million emails) and explored approaches for:
- De-duplicating emails (as with the PoC, we found that over 40% in the sample were duplicates)
- Threading emails to preserve email conversations and reduce the overall number of email records
- Identifying non-public records using email header analysis and domains
- Converting emails into VERS Encapsulated Objects (VEOs)
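
As an illustration of the threading approach listed above, the following Python sketch groups messages into conversations using the standard `Message-ID`, `In-Reply-To` and `References` headers. It is only one possible approach and not necessarily the method used in Stage 2. It assumes the emails are available as RFC 5322 text, and the function name and the synthetic key for messages without a `Message-ID` are hypothetical.

```python
import re
from collections import defaultdict
from email import message_from_string

# Matches identifiers of the form <id@host> inside header values.
ID_PATTERN = re.compile(r"<[^<>\s]+>")


def group_into_threads(raw_messages):
    """Group messages that reference each other into threads."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    parsed = []
    for index, raw in enumerate(raw_messages):
        msg = message_from_string(raw)
        own = ID_PATTERN.findall(msg.get("Message-ID") or "")
        # Synthetic key when Message-ID is missing (an assumption).
        key = own[0] if own else f"<missing-{index}>"
        find(key)
        linked = ID_PATTERN.findall(
            (msg.get("References") or "") + " " + (msg.get("In-Reply-To") or "")
        )
        for other in linked:
            union(key, other)
        parsed.append((key, msg))

    threads = defaultdict(list)
    for key, msg in parsed:
        threads[find(key)].append(msg)
    return list(threads.values())
```

Each resulting thread could then be handled as a single record, which is how threading reduces the overall number of email records.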
A key outcome of Stage 2 was confirmation that the Lotus Notes Storage Facility (NSF) format is not a sustainable format for email records. This means that the backlog of LN emails in VG will need to be managed soon, before they become completely obsolete.
For more information, please download our Stage 2 summary report.