The Daily Perth

Perth's Duplicate Image Problem: The Numbers Exposing a Hidden Drain on the City's Digital Archive

A growing body of data reveals how duplicated and mismatched images are quietly costing Perth institutions time, storage budget, and public trust.

By Perth News Desk · Published 5 July 2026, 4:51 am

Updated 5 July 2026, 12:32 pm

Perth's public-sector digital archives are sitting on a problem measured in terabytes. Across agencies from the City of Perth's planning portal on Barrack Street to the State Records Office of Western Australia, the accumulation of duplicate images (identical or near-identical files stored under different filenames) has reached a scale that now registers in IT budget reviews.

The trigger for closer scrutiny is timing. The WA government's 2025–26 budget allocated additional capital to digitise heritage collections and support Metronet station documentation, pushing the volume of imagery flowing into state systems to levels not previously encountered in a single financial year. When intake accelerates, so does duplication — and so does the cost of fixing it.

What the Data Actually Shows

Industry benchmarks from digital asset management research, including studies published by the Chartered Institute of Library and Information Professionals, suggest that between 20 and 40 percent of files in unmanaged image repositories are exact or near-exact duplicates. Apply even the conservative 20 percent figure to a mid-sized government archive holding 500,000 image assets, and roughly 100,000 files are redundant: consuming server capacity, slowing search retrieval, and creating version-control headaches for staff.

Storage costs vary sharply depending on infrastructure, but enterprise-grade cloud storage in Australia is broadly priced in the range of $25 to $50 per terabyte per month for managed services. A repository carrying 30 percent unnecessary duplication across, say, 10 terabytes of image data holds about 3 terabytes of redundant files, which translates to roughly $75 to $150 per month in pure waste. That is before factoring in the labour hours staff spend manually checking which version of a file is authoritative.
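The arithmetic behind that estimate can be reproduced in a few lines. This is a minimal sketch using only the figures quoted above (10 terabytes, 30 percent duplication, $25 to $50 per terabyte per month); the function name is illustrative, and the annual figure is simply the monthly figure multiplied by twelve.

```python
def monthly_duplicate_cost(total_tb, duplicate_fraction, price_per_tb_month):
    """Monthly storage spend attributable to duplicate data, in dollars."""
    return total_tb * duplicate_fraction * price_per_tb_month


# Figures quoted in the article: 10 TB of images, 30% duplicated,
# managed storage priced at $25 to $50 per TB per month.
low = monthly_duplicate_cost(10, 0.30, 25)
high = monthly_duplicate_cost(10, 0.30, 50)

print(f"${low:.0f} to ${high:.0f} per month")            # $75 to $150 per month
print(f"${low * 12:.0f} to ${high * 12:.0f} per year")   # $900 to $1800 per year
```

Swapping in an organisation's own storage volume and contract price gives a first-pass estimate; labour costs are not modelled here.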

The City of Stirling, which manages one of the largest local government digital records programmes in Western Australia, has been among the councils quietly grappling with this. The volume of site imagery generated through development applications along Wanneroo Road and the Scarborough foreshore redevelopment has compounded the problem since 2023.

Replacement Without a Plan Creates New Problems

The instinct when duplicate images are detected is to delete or overwrite. But rushed replacement without a deduplication protocol can break metadata chains — the indexed links that tell a records system which image belongs to which file, permit, or heritage listing. That breakage is itself a compliance risk under the State Records Act 2000 (WA), which mandates the integrity of official records.
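One way to picture a deduplication protocol that keeps those links intact is to repoint every record at a single retained copy before anything is removed. The sketch below is illustrative only: the record IDs, file paths, and the rule for choosing which copy to keep are assumptions, not a description of any agency's system.

```python
def plan_deduplication(duplicate_groups, record_links):
    """Plan a cleanup without deleting anything.

    duplicate_groups: list of lists of file paths judged to be the same image.
    record_links: dict mapping a record ID (permit, listing, file) to an image path.
    Returns (updated_links, removable_paths).
    """
    redirect = {}
    for group in duplicate_groups:
        # Assumption: keep the alphabetically first path as the retained copy.
        canonical = sorted(group)[0]
        for path in group:
            if path != canonical:
                redirect[path] = canonical

    # Repoint every record at the retained copy first.
    updated_links = {rec: redirect.get(path, path) for rec, path in record_links.items()}

    # Only paths that no record references any longer are candidates for removal.
    still_referenced = set(updated_links.values())
    removable = sorted(p for p in redirect if p not in still_referenced)
    return updated_links, removable


# Hypothetical example data.
groups = [["scans/a.jpg", "uploads/a_copy.jpg"]]
links = {"DA-001": "uploads/a_copy.jpg", "HL-007": "scans/a.jpg"}

updated, removable = plan_deduplication(groups, links)
print(updated)
print(removable)  # ['uploads/a_copy.jpg']
```

The point of the ordering is that the metadata is updated before any file is touched, so no record is left pointing at a file that no longer exists.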

The WA State Records Office updated its digital recordkeeping guidance in 2024, but the practical implementation of image-specific deduplication policy remains uneven across local government authorities. Libraries and archives that have adopted dedicated digital asset management platforms, tools that use perceptual hashing to flag visually similar images even when filenames differ, report significant reductions in manual review time. Perceptual hashing generates a compact numerical fingerprint from an image's visual content rather than from its raw file bytes, so two differently compressed versions of the same photograph produce identical or near-identical fingerprints and are flagged as duplicates even though their file sizes differ.
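To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the "average hash", written in plain Python. It is not the algorithm of any particular platform mentioned here. It assumes the image has already been decoded into a grid of greyscale values at least 8 pixels on each side (a real tool would use an imaging library for that step), and a half-size copy stands in for a re-encoded version of the same picture.

```python
def average_hash(pixels, hash_size=8):
    """Average hash of a greyscale image given as a list of rows of 0-255 values."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(hash_size):
        y0, y1 = gy * height // hash_size, (gy + 1) * height // hash_size
        for gx in range(hash_size):
            x0, x1 = gx * width // hash_size, (gx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: is this cell brighter than the image-wide average?
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def halve(pixels):
    """Half-size copy made by averaging each 2x2 block of pixels."""
    return [
        [(pixels[y][x] + pixels[y][x + 1] + pixels[y + 1][x] + pixels[y + 1][x + 1]) / 4
         for x in range(0, len(pixels[0]), 2)]
        for y in range(0, len(pixels), 2)
    ]


# Synthetic 32x32 test image: a smooth diagonal gradient.
original = [[x * 4 + y * 3 for x in range(32)] for y in range(32)]
resized = halve(original)                                # same picture, a quarter of the pixels
negative = [[255 - v for v in row] for row in original]  # a visibly different image

print(hamming(average_hash(original), average_hash(resized)))   # 0
print(hamming(average_hash(original), average_hash(negative)))  # 64
```

In practice a small non-zero distance is also treated as a match; the cut-off is a tuning choice rather than a fixed rule.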

The Perth-based technology sector, particularly firms clustered around the Technology Park precinct in Bentley, has seen growing demand for this kind of remediation work from both state government and resources companies managing enormous photographic catalogues from mine sites and offshore infrastructure.

For organisations beginning to audit their own holdings, the practical starting point is a baseline count: total image assets, file size distribution, and the date range of ingestion. Running a perceptual hash comparison against any repository over 10,000 files without automated tooling is not a realistic proposition — the manual labour alone would run to weeks. Open-source tools exist, but enterprise environments in WA are increasingly turning to vendors who can integrate deduplication into existing records management platforms rather than running it as a separate process.
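The baseline count described above can be gathered with a short script. This is a rough sketch under stated assumptions: the list of image extensions is illustrative, and each file's last-modified time is used as a stand-in for its ingestion date, which a records system may track separately.

```python
import statistics
from datetime import datetime, timezone
from pathlib import Path

# Assumption: these extensions cover the image formats of interest.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def baseline(root):
    """Summarise image count, size distribution, and date range under a folder."""
    files = [
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    if not files:
        return {"count": 0}

    stats = [p.stat() for p in files]
    sizes = sorted(s.st_size for s in stats)
    mtimes = [s.st_mtime for s in stats]  # stand-in for ingestion date

    def as_date(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()

    return {
        "count": len(sizes),
        "total_bytes": sum(sizes),
        "median_bytes": statistics.median(sizes),
        "largest_bytes": sizes[-1],
        "earliest": as_date(min(mtimes)),
        "latest": as_date(max(mtimes)),
    }
```

Calling `baseline()` on the top-level folder of a repository returns a dictionary that can be recorded before any cleanup and compared against the same figures afterwards.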

The next test for WA's digital recordkeeping infrastructure will come as Metronet station documentation accelerates through 2026 and 2027, with construction photography and planning imagery arriving from Ellenbrook, Morley, and Yanchep simultaneously. Getting deduplication policy settled before that wave arrives is the practical priority — the cost of sorting it afterwards will be considerably higher.

About this article

Published by The Daily Perth

This article was produced by The Daily Perth editorial desk and covers news in Perth. See our editorial standards for how we use AI.