Perth's cultural institutions are sitting on tens of thousands of duplicate digital images — redundant files clogging servers, inflating storage costs and slowing down public access to civic and heritage records. The State Library of Western Australia on Francis Street has acknowledged the scale of the problem internally, and the City of Perth's digital asset management systems are among those under review as archivists push for a citywide deduplication strategy before the end of the 2026 financial year.
The timing matters. With Metronet station construction generating thousands of new site photographs monthly and AUKUS-related infrastructure documentation expanding rapidly at HMAS Stirling on Garden Island, the volume of image data entering government repositories has surged. Institutions that might once have managed their archives with a small team and a filing cabinet are now dealing with petabyte-scale storage problems that other global cities confronted a decade ago.
What Singapore and Amsterdam Got Right
Singapore's National Heritage Board completed a system-wide deduplication overhaul in 2023, cutting its image repository from roughly 4.2 million files to just under 2.9 million after removing exact and near-duplicate records. The project, which ran for 14 months, used perceptual hashing, a technique that flags visually identical or near-identical images even when file sizes or formats differ. Amsterdam's Stadsarchief ran a comparable program between 2021 and 2022, reducing storage costs by an estimated 31 percent across its municipal photography holdings, according to figures published by the archive itself.
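For readers curious how perceptual hashing works in principle, the sketch below shows one common variant, the average hash. It is an illustration only and not the method used in Singapore or Amsterdam, which the published accounts do not detail. The function names, the 8x8 grid size and the distance threshold of 5 are assumptions chosen for the example, and images are represented as plain grids of grayscale values rather than decoded files.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, 0-255 (assumed input format)


def downscale(pixels: Grid, size: int = 8) -> Grid:
    """Shrink a grid to size x size by averaging blocks of pixels.

    Assumes the image is at least `size` pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in downscale(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their fingerprints differ in few bits.

    The threshold of 5 is an illustrative assumption, not a standard value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Because the fingerprint depends on the overall pattern of light and dark rather than on the exact bytes, a resized or re-saved copy of a photograph tends to produce the same or a very similar hash, while a genuinely different image produces one that differs in many bits.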
Toronto took a different approach. The City of Toronto's Open Data team, which manages civic image assets alongside documentary records, embedded deduplication checks directly into its upload pipeline from 2022 onward, preventing duplicates at the point of ingestion rather than cleaning them up retrospectively. That pipeline model is now cited regularly at archival conferences as the low-friction option for cities still building their digital infrastructure.
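The general idea of checking at the point of ingestion can be sketched in a few lines. This is a hypothetical illustration, not Toronto's actual pipeline: the `IngestIndex` class and its `ingest` method are invented for the example, the index lives in memory rather than in a database, and a SHA-256 content hash only catches byte-for-byte identical files, not resized or re-encoded copies.

```python
import hashlib
from typing import Dict, Optional


class IngestIndex:
    """Hypothetical in-memory index mapping content digests to stored asset IDs."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}

    def ingest(self, asset_id: str, data: bytes) -> Optional[str]:
        """Register an upload unless identical bytes are already held.

        Returns the ID of the existing asset if the upload is a duplicate,
        otherwise records the new asset and returns None.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is not None:
            return existing  # duplicate: caller can reject or link to the original
        self._by_digest[digest] = asset_id
        return None


# Example usage with made-up asset IDs and file contents
index = IngestIndex()
first = index.ingest("photo-001", b"site photograph bytes")
second = index.ingest("photo-002", b"site photograph bytes")
print(first, second)
```

The appeal of this arrangement is that each upload costs one hash and one lookup, so the archive never accumulates a backlog that has to be cleaned up later.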
Perth has not yet adopted any of those three models at scale. The Western Australian State Records Office, based in the Barrack Street precinct in the CBD, operates under the State Records Act 2000 and has published guidance on digital asset management, but a mandatory deduplication standard for image files does not yet exist across state agencies. Individual institutions are making their own calls, with inconsistent results.
Local Institutions Trying to Fill the Gap
The Fremantle City Council has been among the more proactive local governments. Its Heritage and Culture directorate has been trialling automated image review software across its collection of approximately 80,000 digitised photographs of the West End heritage precinct and surrounding suburbs since late 2025. The project is connected to a broader push to make Fremantle's photographic archive publicly searchable online by mid-2027.
The Art Gallery of Western Australia on James Street Mall in Perth's cultural centre has also moved toward a centralised digital asset management platform in the past two years, a shift that archivists say naturally reduces duplication as a by-product of better cataloguing. The gallery declined to provide specific figures on storage reduction when contacted by The Daily Perth.
Across these efforts, the common thread is that deduplication is rarely the primary goal — it emerges as a consequence of better systems built for other reasons. That contrasts sharply with Singapore's approach, where deduplication was the explicit brief from the start, with a dedicated budget line and a fixed completion date.
For Perth's institutions, the practical path forward likely runs through the state government's Digital Strategy 2025–2030, which includes provisions for shared infrastructure across agencies. Archivists and records managers who want to push deduplication onto that agenda have until the next policy review cycle, expected in the first quarter of 2027, to make the case. In the meantime, the storage bills keep growing — and so does the backlog.