Western Australia's public sector is sitting on what is estimated to be tens of thousands of duplicate digital images across its agency databases, a problem that has quietly inflated storage costs, slowed content workflows and undermined public records integrity at organisations stretching from the City of Perth's King Street offices to the State Records Office in Alexander Drive, Midland.
The issue has sharpened this year as several WA government departments migrate legacy content into new cloud-based asset management systems, a process required under the state's Digital Strategy 2025–2028. When agencies move old file servers into centralised repositories, duplicate images — the same photograph catalogued under two file names, three timestamps or four department subfolders — surface in bulk. Migration teams at some departments have reportedly found duplication rates above 30 percent in unstructured image libraries, according to general findings published by the Australian Digital Alliance in its 2025 records management review, though specific agency figures have not been made public.
What the Data Actually Shows
The scale of duplication in digital image libraries is not unique to Perth, but WA's rapid infrastructure expansion has made it acute. Metronet alone has generated thousands of site-progress photographs since construction began on the Morley-Ellenbrook Line — images captured by contractors, engineers, communications teams and drone operators, frequently uploaded to multiple platforms without deduplication protocols in place. The project spans more than 21 kilometres of new rail corridor and involves at least a dozen separate contractor organisations, each managing their own documentation systems.
In practical terms, storing a duplicate image costs the same as storing the original. Cloud storage on enterprise platforms used by WA government agencies typically runs between $0.02 and $0.05 per gigabyte per month, depending on redundancy tier. A library of 500,000 images — not unreasonable for a major infrastructure agency across a five-year build — can consume several terabytes. If 30 percent of those files are duplicates, an agency is effectively paying for storage it gains nothing from, month after month.
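To put a rough figure on that, the arithmetic can be sketched in a few lines of Python. The 500,000-image library, the 30 percent duplication rate and the $0.02 to $0.05 price range come from the figures above; the 8 MB average file size is purely an assumption chosen so the library lands at "several terabytes", and the function name is illustrative.

```python
def monthly_duplicate_cost(image_count, avg_image_mb, duplicate_rate, price_per_gb_month):
    """Estimate the monthly cost of storing duplicate images.

    Uses decimal units (1 GB = 1000 MB), as cloud providers commonly do.
    """
    total_gb = image_count * avg_image_mb / 1000
    duplicate_gb = total_gb * duplicate_rate
    return duplicate_gb * price_per_gb_month


IMAGE_COUNT = 500_000   # library size quoted in the article
AVG_IMAGE_MB = 8.0      # ASSUMPTION: average file size, not from the article
DUPLICATE_RATE = 0.30   # duplication rate quoted in the article

low = monthly_duplicate_cost(IMAGE_COUNT, AVG_IMAGE_MB, DUPLICATE_RATE, 0.02)
high = monthly_duplicate_cost(IMAGE_COUNT, AVG_IMAGE_MB, DUPLICATE_RATE, 0.05)
print(f"Duplicate storage cost: ${low:.2f} to ${high:.2f} per month")
```

Under those assumptions the library occupies about 4 TB, of which 1.2 TB is duplicated, costing roughly $24 to $60 a month. Larger average file sizes, such as uncompressed drone imagery, would scale the figure proportionally.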
The City of Stirling, which manages one of Perth's largest municipal communications archives, has been piloting an automated deduplication tool since March 2026 as part of a broader digital asset management overhaul. The program runs hash-matching algorithms — essentially digital fingerprints — against the council's entire image repository. Early internal benchmarks, referenced in the council's March 2026 ordinary meeting agenda, suggested the tool identified duplicates at a rate that could reduce active storage volume by roughly a quarter, though the council has not published final figures.
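The council has not said which algorithm its tool uses, but the general idea of hash matching can be illustrated with a minimal sketch. The example below assumes SHA-256 over the raw file bytes and a hypothetical list of image extensions; the function names are illustrative and do not describe the City of Stirling's software.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# ASSUMPTION: the set of extensions treated as images is illustrative only.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root, extensions=IMAGE_EXTENSIONS):
    """Group image files under `root` by content hash.

    Returns a dict mapping each digest to the sorted list of paths that
    share it, keeping only digests with more than one file.
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            groups[file_digest(path)].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items() if len(paths) > 1}
```

A cryptographic hash of this kind only catches byte-for-byte copies, whatever their file name or folder. Images that have been resized, cropped or re-saved at a different quality produce different hashes and would need a similarity-based technique such as perceptual hashing to detect.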
Why It Matters Beyond Storage Bills
The financial cost is real but secondary. The more significant problem is what duplicate images do to public records. Under the State Records Act 2000, WA agencies have obligations around the authenticity and accessibility of their records. When the same image exists in four locations under different metadata (different dates, different descriptions, different access permissions), it becomes unclear which version is the authoritative record. That ambiguity carries extra weight in the AUKUS context, where defence-related imagery associated with projects at HMAS Stirling on Garden Island is subject to additional classification and record-keeping scrutiny.
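The metadata problem can be made concrete with a small sketch. It assumes each stored copy has already been fingerprinted with a content hash and is described by a hypothetical record carrying a location, capture date, description and access level; none of these field names come from any agency's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """HYPOTHETICAL catalogue entry for one stored copy of an image."""
    digest: str       # content hash identifying the underlying image
    location: str     # where this copy is stored
    captured: str     # catalogued capture date
    description: str  # catalogued description
    access: str       # catalogued access permission


def conflicting_fields(records):
    """Report metadata fields that disagree between copies of the same image.

    Returns {digest: {field_name: sorted list of distinct values}} for every
    digest whose copies carry more than one value in at least one field.
    """
    by_digest = defaultdict(list)
    for record in records:
        by_digest[record.digest].append(record)

    report = {}
    for digest, copies in by_digest.items():
        if len(copies) < 2:
            continue  # a single copy cannot conflict with itself
        conflicts = {}
        for field_name in ("captured", "description", "access"):
            values = {getattr(copy, field_name) for copy in copies}
            if len(values) > 1:
                conflicts[field_name] = sorted(values)
        if conflicts:
            report[digest] = conflicts
    return report
```

A report of this kind does not decide which copy is authoritative. It only shows a records officer where the copies disagree, so that the decision can be made and documented by a person.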
Perth-based digital archiving firm Datasphere Group, which works with local government and resources sector clients in the CBD and Osborne Park, has described the deduplication process as a necessary precondition before any serious AI-assisted content search can be deployed. Duplicate images confuse training datasets, inflate search result counts and produce false matches when image recognition tools are applied to large libraries.
For agencies and organisations beginning or continuing digital migration this financial year, the practical advice from records management practitioners is consistent: run a deduplication audit before migrating, not after. Cleaning a 200,000-image archive before upload takes days; cleaning it after it has been ingested, indexed and cross-referenced across multiple systems can take months. The City of Perth's digital asset team, based at its Hay Street administrative centre, has scheduled a full repository audit for the September 2026 quarter, making it one of the first metropolitan councils to formally schedule the process on the public record.
The numbers are not glamorous, but they add up fast.