EMA: Brazilian Cultural Heritage Image Dataset - Towards AI-based metadata annotation of digital collections

Publication preview

Abstract

Metadata annotation in digital collections is typically carried out by several specialized professionals. It is a complex, labor-intensive, and time-consuming activity, which leads to human error, high costs, and difficulty retrieving information. Recent advances in artificial intelligence, particularly Deep Learning techniques, have shown potential for visual recognition and interpretation of objects in images. In this context, the present work introduces EMA, a Brazilian cultural heritage image dataset with over 11,000 labeled images of objects from seventeen Brazilian museums. The EMA dataset is a contribution towards the development of automated metadata annotation tools. The paper also presents baseline ResNet50 results for the dataset, with a recognition rate of over 86%.
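To make the reported figure concrete, the sketch below shows one common way a recognition rate can be computed: as top-1 accuracy, the fraction of images whose predicted label matches the annotated label. This is an assumption for illustration; the abstract does not state the exact metric definition. The function name `recognition_rate` and the example labels are hypothetical and are not taken from the EMA dataset.

```python
from typing import Sequence


def recognition_rate(true_labels: Sequence[str], predicted_labels: Sequence[str]) -> float:
    """Return the fraction of predictions that match the annotated labels.

    Assumption: "recognition rate" is read here as plain top-1 accuracy.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have the same length")
    if len(true_labels) == 0:
        raise ValueError("at least one labeled example is required")
    # Count positions where the predicted class equals the annotated class.
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for five images (not real EMA categories or results).
true_labels = ["chair", "painting", "sculpture", "coin", "chair"]
predicted_labels = ["chair", "painting", "coin", "coin", "chair"]

rate = recognition_rate(true_labels, predicted_labels)
print(f"Recognition rate: {rate:.1%}")
```

In this toy example four of the five predictions match, so the script reports 80.0%. A rate of over 86% on the dataset would, under the same assumption, mean that more than 86 of every 100 evaluated images received the correct label.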

Explore the Details

For more details, watch the presentation video and read the full article, which was presented at the International Conference on Digital Preservation in 2022.

Project Featured on University TV

The project was showcased on the university’s internal TV channel. The feature included an interview where we discussed the project’s development, its objectives, and the impact it aims to create in our field of research. You can watch the full interview here: TV Unicamp.