Metadata: A Scientific Jargon
Image: Library drawers with name tags (image by Jan Antonin Kolar on Unsplash)

Metadata has been a trending word in the scientific community for as long as data has existed. In this series of articles, we will touch on various aspects of metadata, including its history, definition, importance, and implementation. The aim of this article is to understand this crucial (yet often overlooked) topic.

A royal memoir

Herbert George Wells once said, “Human history in essence is the history of ideas”, and we have reason to believe that metadata is one such great idea!

Metadata has always been part of our lives without our even realizing it. Although the term “metadata” belongs to the digital age, its origins can be traced back to Zenodotus, the first librarian of the Great Library of Alexandria, a colossal library built in the third century BCE in ancient Egypt [1]. At its peak, this monumental library housed around 900,000 papyrus scrolls [1]. Managing such an immense volume of information was a formidable task, so Zenodotus devised a clever method: he and his team attached a small tag to the end of each scroll. These tags held details about the author, title, and subject of each work. This made it easy to return materials to their designated spots and spared library users the hassle of unrolling each scroll to find out what it contained. This ingenious system marked the first known use of metadata [1]. The term “metadata” itself, however, came into the picture much later, after many developments in the data field.
In 1967, Stuart McIntosh and David Griffel coined the term “metadata” while unveiling ADMINS, a computerized data management system used to organize, process, and rearrange bibliographic information about handbooks [2]. They described metadata as a category of questions with distinct semantic characteristics and multiple tangible instances.
Metadata management has developed considerably since the term came into existence. From the beginning, the library, archive, and museum (LAM) communities have dealt extensively with metadata. During the 1970s, Machine-Readable Cataloging (MARC) spread rapidly [3], especially among the LAM communities. The MARC standard included a robust metadata schema, an efficient exchange standard, and a detailed encoding and data storage system [4]. Having originated in conventional bibliographic control within the LAM communities, metadata has grown significantly alongside the development and broadening reach of the Internet. It now encompasses principles and practices for describing resources, applicable to both digital and non-digital materials.

Metadata in the modern world

In this digital era, data holds real value. Metadata serves as a lifeline for data by providing the context, governance, and organization needed to make data meaningful, trustworthy, and accessible. Today, the simplest definition of metadata, “data about data”, is widely known and accepted worldwide. As simple as it sounds, defining metadata is far more complex than these three words suggest. The challenge lies in the diversity of definitions, formats, standards, and contexts, which results in a lack of clarity about the true nature of metadata [5]. The rise of dispersed data repositories on the internet created a demand for standardized methods to describe resources. Consequently, various metadata standards have emerged since the 1990s, such as the Dublin Core Metadata Element Set (DCMES or DC), Metadata Object Description Schema (MODS), Encoded Archival Description (EAD), and Learning Object Metadata (LOM), among others [6]. With so many standards in existence, the effort to harmonize resource description through metadata standards ended up introducing a different kind of heterogeneity rather than resolving the interoperability issue [5].
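To make the heterogeneity point concrete, here is a minimal Python sketch. It describes the same resource (reference [5]) once with Dublin Core element names and once with MODS-style element paths, then translates between them with a small crosswalk. The flattened path notation, the `MODS_TO_DC` mapping, and the `crosswalk` helper are simplifying assumptions made for illustration; they are not part of either standard.

```python
# The same resource described under two standards (simplified for illustration).
dublin_core_record = {
    "title": "Understanding the Nature of Metadata: Systematic Review",
    "creator": "H. Ulrich",
    "date": "2022",
}

# MODS nests its elements; the "parent/child" keys are a flattening
# assumed here only to keep the example short.
mods_like_record = {
    "titleInfo/title": "Understanding the Nature of Metadata: Systematic Review",
    "name/namePart": "H. Ulrich",
    "originInfo/dateIssued": "2022",
}

# Hypothetical crosswalk: which MODS-style path corresponds to which DC element.
MODS_TO_DC = {
    "titleInfo/title": "title",
    "name/namePart": "creator",
    "originInfo/dateIssued": "date",
}


def crosswalk(record, mapping):
    """Rename fields according to mapping.

    Returns (mapped, unmapped): fields without a known counterpart are
    kept separately instead of being silently dropped.
    """
    mapped, unmapped = {}, {}
    for field_name, value in record.items():
        if field_name in mapping:
            mapped[mapping[field_name]] = value
        else:
            unmapped[field_name] = value
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = crosswalk(mods_like_record, MODS_TO_DC)
    print(mapped == dublin_core_record)
    print(unmapped)
```

Both records carry the same information, yet neither can be read by a tool that expects the other's field names without a mapping like this one. Every additional standard adds more mappings to maintain, which is the interoperability cost described above.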

Importance of metadata

It is widely accepted that the human mind is drawn to objects that are well described and well categorized. Our auto-filtering mind generally screens out anything that is poorly described or poorly structured. For example, when we visit a supermarket and find a new product, we instantly turn to the back of the package, which is free of “marketing beautification” and focuses on what’s inside. A product whose description aligns with our needs comes home with us! That’s the power of metadata. In simple terms, we understand and acknowledge something better when detailed and systematic information, called “metadata”, is attached to it.
Metadata serves various use cases in different research areas, such as data identification, data classification, data retrieval, and dataset validation [7]. It acts as a vital bridge between raw data and meaningful insights, providing essential context, structure, and organization to vast datasets. Metadata streamlines data discovery, retrieval, and utilization processes by cataloging details such as data origin, format, and relationships. Moreover, it ensures data quality, consistency, and compliance with regulatory standards, fostering trust and reliability in data-driven decision-making. From facilitating efficient search and analysis to enabling seamless integration and interoperability across systems, metadata plays a foundational role in optimizing resource allocation, enhancing user experiences, and driving innovation in diverse domains.
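As a rough illustration of the retrieval and validation use cases, the Python sketch below keeps a tiny in-memory catalog of dataset descriptions and queries the descriptions rather than the data itself. The `DatasetMetadata` fields, the sample entries, and the helper functions are all hypothetical and only meant to show the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical catalog entry: origin, format, and relationships."""
    identifier: str
    title: str
    origin: str
    file_format: str
    keywords: List[str] = field(default_factory=list)
    related_to: List[str] = field(default_factory=list)


# Invented sample entries, used only for illustration.
catalog = [
    DatasetMetadata("ds-001", "Soil moisture readings", "Field station A",
                    "CSV", ["soil", "moisture"]),
    DatasetMetadata("ds-002", "Soil sample photos", "Field station A",
                    "JPEG", ["soil", "images"], related_to=["ds-001"]),
    DatasetMetadata("ds-003", "Survey responses", "", "CSV", ["survey"]),
]

REQUIRED_FIELDS = ("identifier", "title", "origin", "file_format")


def find_by_keyword(entries, keyword):
    """Retrieval: return identifiers of entries tagged with the keyword."""
    wanted = keyword.lower()
    return [e.identifier for e in entries
            if wanted in (k.lower() for k in e.keywords)]


def missing_fields(entry):
    """Validation: return the names of required fields that are empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(entry, name)]


if __name__ == "__main__":
    print(find_by_keyword(catalog, "soil"))
    for entry in catalog:
        print(entry.identifier, missing_fields(entry))
```

Neither function opens a data file: the search and the quality check run entirely on the descriptions, which is what makes metadata useful for discovery and validation at scale.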
Now that we are acquainted with metadata, we are ready to delve deeper into its nature, standards, and applications. The next article will explore topics including metadata types, different metadata standards, metadata management, and much more. Stay tuned!

Authored by Sanghmitra Banerjee

Disclaimer
This blog post does not represent the views of the website maintainer; it reflects the views of its authors.

References

[1] H. Phillips, "The Great Library of Alexandria?," Library Philosophy and Practice, vol. 2010, 30 September 2010.
[2] S. D. McIntosh and D. M. Griffel, "ADMINS: for computer based library management," Cambridge, Mass.: Massachusetts Institute of Technology, Center for International Studies, 1967.
[3] M. Seikel and T. Steele, "How MARC Has Changed: The History of the Format and Its Forthcoming Relationship to RDA," Technical Services Quarterly, vol. 28, no. 3, pp. 322-334, 2011.
[4] E. T. Mitchell, "Metadata Developments in Libraries and Other Cultural Heritage Institutions," Library Technology Reports, ALA TechSource, vol. 49, pp. 5-10, July 2013.
[5] H. Ulrich, A. Kock-Schoppenhauer, N. Deppenwiese, R. Gött, J. Kern, M. Lablans, R. Majeed, M. Stöhr, J. Stausberg, J. Varghese, M. Dugas and J. Ingenerf, "Understanding the Nature of Metadata: Systematic Review," Journal of Medical Internet Research (JMIR), vol. 24, no. 1, 2022.
[6] J. S. Oh and O. N. Park, "Topics and Trends in Metadata Research," Journal of Information Science Theory and Practice, vol. 6, no. 4, pp. 39-53, 2018.
[7] Joint Task Force Transformation Initiative, "Managing Information Security Risk: Organization, Mission, and Information System View," 2011. [Online]. Available: https://csrc.nist.gov/publications/detail/sp/800-39/final. [Accessed 19 March 2024].
