Towards a multimodal classification of cultural heritage
Abstract
Humanity has always learned from its past experiences, and for many reasons. National heritage is a valuable way to discover and access a nation's history. As a result, these priceless cultural items receive special attention and require special care. Since the wide adoption of digital technologies, documenting, storing, and exhibiting cultural heritage assets have become more affordable and reliable. The resulting digital records are used in several applications: researchers have seen the opportunity to use digital heritage recordings for virtual exhibitions, link discovery, and long-term preservation. Unfortunately, many cultural assets are overlooked because their history or descriptive information (metadata) is missing. The classic solution for labeling these assets is to consult cultural heritage specialists. Institutions often ship their valuable assets to these specialists and must wait several months, or even years, in the hope of getting an answer. This carries multiple risks, such as loss of or damage to the assets, and is a major concern for heritage institutions around the world. Recent studies report that only 10 percent of the world's heritage is exhibited in museums. The remaining 90 percent is stored in museum archives, mainly because of damage or a lack of metadata. After a deep analysis of the current situation, our team surveyed the state-of-the-art technologies that could overcome this problem. New machine learning and deep learning techniques such as Convolutional Neural Networks (CNNs) are radically changing image and big data classification. Large technology companies such as Google, Apple, and Microsoft are pushing the use of these artificial intelligence techniques to explore their enormous databases and repositories in order to better serve their users.
In this contribution, we present a classification approach that aims to play the role of a digital cultural heritage expert using a machine learning model and deep learning techniques. The system has two main stages. The first is the learning stage, in which the system receives a large dataset of labeled data as input. This data consists mainly of images of different cultural heritage assets organized into categories. This step is critical, as the data must be descriptive and coherent. The second is the classification stage, in which the system receives an image of an unlabeled asset and extracts its relevant visual features, such as shapes, edges, colors, and fine details such as text. The system then analyzes these features and predicts the missing metadata, for example the category, the year, or the region. The first tests are giving promising results. Our team aims to improve these results further using a multimodal machine learning model. Such models rely on multiple learning sources (text, videos, sound recordings, images) at the same time. Research progress shows that this technique gives very accurate predictions.
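The two stages described above can be illustrated with a minimal sketch. It is not the system presented in this contribution: instead of a CNN it uses a simple nearest-centroid classifier, and it assumes that each image has already been reduced to a short feature vector. The function names, the three-number descriptors, and the category labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import dist


def train(labeled):
    """Learning stage: average the feature vectors of each category into one centroid."""
    groups = defaultdict(list)
    for features, label in labeled:
        groups[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def classify(centroids, features):
    """Classification stage: return the category whose centroid is nearest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical descriptors (assumed to be e.g. roundness, edge density, mean hue).
training_set = [
    ([0.9, 0.2, 0.1], "pottery"),
    ([0.8, 0.3, 0.2], "pottery"),
    ([0.1, 0.9, 0.7], "manuscript"),
    ([0.2, 0.8, 0.8], "manuscript"),
]

model = train(training_set)                       # learning stage
predicted = classify(model, [0.85, 0.25, 0.15])   # classification stage
print(predicted)
```

In a real pipeline the hand-written descriptors would be replaced by features learned from the images, and the predicted label would be one of several metadata fields (category, year, region) rather than a single category.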
DOI/handle
http://hdl.handle.net/10576/30211
Collections
- Computer Science & Engineering [2402 items]