Perth's major cultural and government institutions are sitting on millions of duplicate digital images, and the decisions made in the next six to twelve months will determine whether the state keeps paying for redundant storage year after year or finally clears the backlog threatening to stall several high-profile digitisation programs.
The problem is not abstract. The State Library of Western Australia on Francis Street, which holds more than 1.3 million digitised photographs, maps and documents, has been running deduplication audits since February 2026 after discovering that a 2023 integration with the Trove national discovery platform created duplicate records for roughly 18 per cent of its photographic collection. The State Records Office of Western Australia, based in the Alexander Library Building precinct, is facing a similar reckoning as its cloud migration to Microsoft Azure — contracted at approximately $4.2 million over three years — moves into its second phase.
Why does this matter right now? The Cook Labor government's $12.5 billion infrastructure pipeline, combined with record volumes of Metronet documentation and a surge of AUKUS-related defence paperwork flowing through the HMAS Stirling naval base in Rockingham, has sharply increased the volume of official imagery entering state systems. Planning photographs, environmental impact scans, and heritage surveys for the Bayswater and Midland rail precincts alone generated an estimated 400,000 individual image files in 2025. Without a clear deduplication policy, those files multiply as copies spread across departments.
The Fork in the Road
Institutions now face three broad options. The first is automated deduplication using perceptual hashing tools — software that identifies visually identical images regardless of filename or metadata — which the City of Perth piloted quietly across its Forrest Place and Perth Cultural Centre event photography archives in late 2025. The City found a 22 per cent duplication rate across 60,000 images, freeing roughly 1.4 terabytes of storage. The second option is manual curatorial review, preferred by archivists at the Western Australian Museum's Collections and Research Centre in Welshpool for historically sensitive material, but prohibitively slow at scale. The third is doing nothing, which carries its own cost: cloud storage fees for redundant data, degraded search results for researchers, and compounding problems as AI-assisted cataloguing tools increasingly struggle with near-identical duplicates.
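For readers curious how the first option works in practice, the sketch below shows one simple form of perceptual hashing, an "average hash", written in plain Python. It is an illustration only, not the City of Perth's tooling: the function names, the 8-by-8 grid size and the match threshold of 5 bits are all assumptions chosen for the example, and images are represented as grids of greyscale values rather than loaded from real files.

```python
from typing import List

Grid = List[List[int]]  # greyscale pixels (0-255), one inner list per row


def downscale(pixels: Grid, size: int = 8) -> Grid:
    """Shrink a greyscale grid to size x size by averaging blocks of pixels.

    Assumes the image is at least `size` pixels wide and high.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            out_row.append(sum(block) // len(block))
        small.append(out_row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [value for row in downscale(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_probable_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Flag two images as likely duplicates if their fingerprints nearly match.

    The threshold is an illustrative assumption, not a recommended setting.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= threshold


if __name__ == "__main__":
    # Synthetic test images: a gradient, a half-resolution copy, and its negative.
    original = [[2 * x + 2 * y for x in range(64)] for y in range(64)]
    half_size = [[original[2 * y][2 * x] for x in range(32)] for y in range(32)]
    negative = [[252 - value for value in row] for row in original]

    print(is_probable_duplicate(original, half_size))  # resized copy
    print(is_probable_duplicate(original, negative))   # visually different image
```

Because the fingerprint is built from the picture itself, a renamed or resized copy produces the same or nearly the same bits, while a genuinely different picture does not. Production tools typically use more robust variants of the same idea, which is one reason archivists still want a manual check on sensitive material.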
The financial stakes are real. At current WA government cloud storage contract rates — approximately $0.023 per gigabyte per month under the whole-of-government arrangement negotiated by the Department of Finance in 2024 — holding a single unnecessary terabyte costs around $276 a year. Across a state apparatus storing petabytes of imagery, the cumulative waste runs well into six figures annually.
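The arithmetic behind that figure can be checked in a few lines. The sketch below uses the quoted rate of $0.023 per gigabyte per month and assumes a decimal terabyte of 1,000 gigabytes, which is what the $276 figure implies. The function name is illustrative, and the 1,000-terabyte case is a hypothetical to show scale, not a reported figure.

```python
RATE_PER_GB_MONTH = 0.023  # dollars per gigabyte per month, as quoted above
GB_PER_TB = 1000           # assumption: decimal terabytes
MONTHS_PER_YEAR = 12


def annual_storage_cost(terabytes: float,
                        rate_per_gb_month: float = RATE_PER_GB_MONTH) -> float:
    """Yearly cost in dollars of holding the given volume, rounded to cents."""
    return round(terabytes * GB_PER_TB * rate_per_gb_month * MONTHS_PER_YEAR, 2)


if __name__ == "__main__":
    # One redundant terabyte, the 1.4 TB the City of Perth freed, and a
    # hypothetical 1,000 TB (one petabyte) of redundant imagery.
    for volume_tb in (1, 1.4, 1000):
        print(f"{volume_tb} TB: ${annual_storage_cost(volume_tb):,.2f} a year")
```

On these assumptions a single redundant petabyte costs about $276,000 a year, which is how the waste reaches six figures once duplicates are counted across agencies.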
What Happens Next
The State Library's deduplication audit is due to report findings to the Minister for Culture by September 2026. That report is expected to recommend a hybrid model: automated hashing for bulk photographic collections, with manual review reserved for items flagged as culturally sensitive under the Aboriginal Heritage Act 1972 (the Aboriginal Cultural Heritage Act 2021 was repealed in 2023). Any agency-wide policy framework would likely be coordinated through the Office of Digital Government, which released its Digital Strategy 2025–2028 in March.
For organisations outside the government sector — including the Perth Festival archive held by the Perth Theatre Trust and the photographic collections maintained by Curtin University's Special Collections in Bentley — the decisions are less structured. Curtin's library staff confirmed in May 2026 that they were evaluating three commercial deduplication platforms, with a procurement decision expected before the end of the 2026 calendar year.
Researchers and institutions navigating this process should register with the Trove Data Partner program before August 15, the next intake deadline, to access deduplication guidance developed by the National Library of Australia. For WA agencies specifically, the Department of Finance's ICT procurement team is accepting expressions of interest for a whole-of-government deduplication panel arrangement — a contract that, if established, would standardise the process and reduce costs for every public body from the City of Fremantle to the Pilbara Development Commission.
The backlog will not clear itself. The tools exist, the costs of inaction are measurable, and the policy window is open. Whether institutions move before the next round of digitisation grants closes in October is the question everyone in the sector is now asking.