Sunday, 1 February 2026

Retrieval-augmented generation for natural language art provenance searches in the Getty Provenance Index

Mathew Henrickson* [1], Eric Atwell [1], John Stell [1], Mark Westgarth [2], Dibyadyuti Roy [2], Noorhan Abbas [1]

https://www.academia.edu/academia-ai-and-applications/2/1/10.20935/AcadAI8122

This study presents a prototype retrieval-augmented generation (RAG) framework for art provenance research, focusing on the Getty Provenance Index German Sales dataset. The prototype addresses the challenges posed by fragmented and multilingual archival data, as well as the limitations of traditional metadata-based search tools. By enabling flexible natural language queries in multiple languages, the framework lets researchers search the Getty Provenance Index without knowing specific object metadata. Using a sample of 10,000 records to test the concept, and later an extended sample of 100,000 records, we explore a RAG prototype that aims to improve both the efficiency and accessibility of provenance searches, and we find encouraging results for both specific and exploratory research scenarios. The framework emphasises transparency, suggesting a scalable and practically oriented approach for historians and cultural heritage professionals working with complex art market archives.
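To make the retrieve-then-generate idea concrete, here is a minimal sketch of the retrieval and prompt-building steps of a generic RAG pipeline. It is not the authors' implementation. The records are invented placeholders rather than real Getty Provenance Index entries, the names `RECORDS`, `retrieve` and `build_prompt` are hypothetical, and a simple bag-of-words cosine similarity stands in for whatever retriever the prototype actually uses. A multilingual system would presumably need multilingual embeddings instead.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder records, NOT real Getty Provenance Index entries.
RECORDS = [
    {"id": "rec-1", "text": "Landscape with river, oil on canvas, sold at auction in Berlin 1935"},
    {"id": "rec-2", "text": "Portrait of a woman, watercolour, sold at auction in Munich 1931"},
    {"id": "rec-3", "text": "Still life with flowers, oil on panel, sold at auction in Berlin 1938"},
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, records, k=2):
    """Return up to k records most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(r["text"]))), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored[:k] if score > 0]


def build_prompt(query, hits):
    """Assemble a prompt that asks a language model to answer from cited records only."""
    context = "\n".join(f"[{r['id']}] {r['text']}" for r in hits)
    return (
        "Answer using only the records below and cite their ids.\n\n"
        f"Records:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "watercolour portrait sold in Munich"
    print(build_prompt(question, retrieve(question, RECORDS, k=1)))
```

Keeping record ids in the prompt is one simple way to support the kind of transparency the abstract mentions, since every generated statement can be traced back to a retrieved record. The generation step itself (sending the prompt to a language model) is left out of the sketch.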
