Metadata Field Guide
Purpose of the Field Guide
The Metadata Field Guide is designed to provide an easy guide to which metadata content should be collected while a researcher is at a field site. This saves researchers time and money by ensuring they neither spend more time at a site than necessary nor need to return a second time to collect missing data. Clear organization from the beginning also helps streamline the metadata generation process once researchers have returned to their offices or labs.
This site is subsequently broken into three pages of more detailed information that will help researchers understand, organize, and think about their data: Metadata Matters, Best Practices, and Thinking About Data.
This page provides an overview of metadata that expands on the information provided in the field guide. It explains what metadata is, what function it serves, and why it matters to researchers, then provides an overview of common metadata standards.
What is Metadata?
Metadata is a combination of keywords, descriptions and other pertinent information about data that allows for discovery, understanding and use. It is the way researchers and organizations document relevant information about the creation, content, quality and editing process of data, among other characteristics. Simply put, it is data about data. It can be either embedded within the data files themselves or stored as a separate text file.
What is it for?
Think of an online library database. These search engines allow us to look for articles or books by their titles, keywords, specific words within their titles or content, authors, type of document, publication date, and language. We can even combine parameters to narrow the results, for example: peer-reviewed articles about hurricanes published after 2010. All of this information and more is linked to the document itself. Because of this, when we search for specific terms, we find all documents that meet the criteria. This information is considered metadata, and without it, it is nearly impossible to access the documents of interest.
The same concept applies when we create an online profile, be it on a social media portal or for registration on a website. We typically provide some basic information about ourselves: name and/or username, date of birth, email, address, phone number. This is information that the platform creators consider pertinent to have about each user, or that will be displayed on our personal page so others get to know us. This is our personal metadata, the way we introduce ourselves to the cyber-world.
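As a rough illustration of the combined search described above, the following Python sketch filters a small, made-up catalog by its metadata. The records, the field names, and the `search` helper are all hypothetical and are not taken from any real library system.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Metadata describing one document (field names are illustrative)."""
    title: str
    year: int
    peer_reviewed: bool
    keywords: set = field(default_factory=set)


# Hypothetical catalog entries, invented for illustration only.
CATALOG = [
    Record("Hurricane track forecasting", 2014, True, {"hurricanes", "forecasting"}),
    Record("Coastal storm surge notes", 2008, True, {"hurricanes", "storm surge"}),
    Record("Hurricane season blog digest", 2016, False, {"hurricanes"}),
    Record("Drought monitoring methods", 2019, True, {"drought"}),
]


def search(catalog, keyword, after_year, peer_reviewed=True):
    """Return the records that satisfy every metadata criterion at once."""
    return [
        r for r in catalog
        if keyword in r.keywords
        and r.year > after_year
        and r.peer_reviewed == peer_reviewed
    ]


# Peer-reviewed articles about hurricanes published after 2010.
for r in search(CATALOG, "hurricanes", 2010):
    print(r.title, r.year)
```

Only the first entry satisfies all three criteria. Without the year, keyword, and review-status fields, none of these records could be told apart by a query.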
When we fill out a metadata form, we are answering relevant questions about our data, so that other users understand its contents and characteristics, and are able to find it. Depending on the metadata standard used and the data documentation requirements of your organization, some of these questions may be dropped and others added. We introduce some of the existing metadata standards below.
Why Does it Matter?
Metadata is essential to help other users find the data in archives, to understand the data’s contents and its creation process, to keep track of any updates and applications, and to allow future reproducibility of the data. We cannot stress enough the importance of documenting relevant information about the data that is being created or edited. Metadata is the biography and the history of data. Without it, we cannot assess with confidence its validity and accuracy.
As originators of data, we should know the answers to the questions mentioned above, but would we be able to answer such questions for data created by someone else? We might be able to figure out what type of information is being represented and other basic details, but without metadata we would not know the purpose for which the data was created, how to contact the person and/or organization responsible for the data, or the quality standards under which the data was created, among other key facts.
Many researchers raise a red flag when acquiring data with no metadata because using such data can call into question the validity of the research project itself. Some use the saying “garbage in, garbage out” to refer to the use of inaccurate or error-riddled data in analysis and research, because it automatically damages their results. We would not go as far as to call data without metadata garbage, but the lack of metadata does affect the way the data is perceived, read and processed.
How can we know if the data comes from a reliable source? Was the format changed at some point? What equipment was used to collect the data, and at what resolution? Who can we contact to ask these questions and how can we contact them?
If metadata is documented properly, we have immediate access to the information that answers all of these questions and more. Creating data can take a great deal of time and energy, and including metadata ensures that other users get the most out of that effort.
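One way to make this concrete is to check a metadata record against the questions raised above before the data is shared. The sketch below is illustrative only: the field names and the `missing_fields` helper are assumptions chosen to mirror those questions, and they do not come from any metadata standard.

```python
# Hypothetical field names, one per question: where the data came from,
# whether its format changed, how it was collected, and whom to contact.
REQUIRED_FIELDS = [
    "source",
    "format_history",
    "equipment",
    "resolution",
    "contact_name",
    "contact_email",
]


def missing_fields(record, required=REQUIRED_FIELDS):
    """List the required fields that are absent or blank in the record."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# An invented, partially documented record.
record = {
    "source": "Example University field survey",
    "equipment": "Handheld GPS receiver",
    "resolution": "",
    "contact_name": "A. Researcher",
}

print(missing_fields(record))
```

Here the check reports `format_history`, `resolution`, and `contact_email` as undocumented, which are exactly the gaps a later user of the data would otherwise have to guess at.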
Metadata can be stored following the guidelines of different formats, generally known as metadata standards. These are the different types of templates available for us to input information about data. Depending on the type of data and the requirements of the organization producing it (or the entity for which it is produced), metadata can be represented using standards that include FGDC CSDGM (Federal Geographic Data Committee’s Content Standard for Digital Geospatial Metadata), ISO 19115/19115-2/19139 (Standard for Digital Geospatial Metadata), and Dublin Core.
FGDC CSDGM is the US Federal geospatial metadata standard, although for new metadata it has been replaced by ISO standards such as ISO 19115-2, given ISO’s capacity to capture more detailed information. Dublin Core records general data documentation and is widely applied in library environments.
For the most part, these standards pertain to the description of geospatial data. However, any type of data, geospatial or not, should be documented.
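To give a sense of what a record in one of these standards can look like, here is a minimal Python sketch that writes a handful of Dublin Core elements as XML. The element names used (`title`, `creator`, `date`, `format`, `language`) belong to the Dublin Core element set and the namespace URI is the standard one, but the sample values, the `metadata` wrapper element, and the `dublin_core_xml` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_xml(fields):
    """Serialize a dict of Dublin Core element names and values as XML."""
    # "metadata" is an arbitrary wrapper chosen for this sketch.
    root = ET.Element("metadata")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample values describing a non-geospatial data file.
example = {
    "title": "Stream temperature readings, example site",
    "creator": "A. Researcher",
    "date": "2015-06-01",
    "format": "text/csv",
    "language": "en",
}

print(dublin_core_xml(example))
```

A record like this can travel alongside the data as a separate text file, which is one of the two storage options described earlier on this page.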