Answered By: Metadata and Discovery ULS Last Updated: Aug 19, 2024
Controlled vocabularies and encoding schemes can improve the consistency and shareability of your metadata. A controlled vocabulary is "an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing and searching" (Patricia Harpring). To put it simply: "Find out what they call it, then select it" (Allen Smith). There are many types of controlled vocabularies, created and maintained by professionals in the relevant subject area. A well-known example is the Library of Congress Subject Headings (LCSH). This controlled vocabulary allows users to browse and search entire library collections that are unified through the terms assigned by the Library of Congress. Using a controlled vocabulary in your research works in a similar way, allowing other researchers to locate relevant materials quickly. This in turn increases the findability of your research and promotes consistency in the field.
Controlled vocabularies are not limited to topics; they can also cover names and places. For example, many libraries use the Library of Congress Name Authority Files (NAF) along with LCSH. If a system is organized using NAF, a search for materials related to Mos Def will also return materials related to Yasiin Bey and Dante Terrell Smith, because all three names refer to the same person. Likewise, consistent maintenance and application of controlled vocabularies to data sets means that the name P. Diddy is attached to only one person, and that each of his other names leads back to that same person.
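The name example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how NAF or any library system is actually implemented: the record identifier "person-001", the AUTHORITY_RECORDS table, and the resolve and search functions are all hypothetical names made up for this sketch.

```python
# Hypothetical authority table: one record identifier, several name forms.
# "person-001" is an invented identifier, not a real NAF record number.
AUTHORITY_RECORDS = {
    "person-001": ["Mos Def", "Yasiin Bey", "Dante Terrell Smith"],
}

# Build a lookup from each normalized name form to its record identifier.
NAME_INDEX = {
    name.casefold(): record_id
    for record_id, names in AUTHORITY_RECORDS.items()
    for name in names
}


def resolve(name):
    """Return the record identifier for a name form, or None if it is unknown."""
    normalized = " ".join(name.split()).casefold()
    return NAME_INDEX.get(normalized)


def search(catalog, name):
    """Return titles of items whose creator resolves to the same record as `name`."""
    record_id = resolve(name)
    if record_id is None:
        return []
    return [title for title, creator in catalog if resolve(creator) == record_id]


# Placeholder catalog entries, used only to show the behavior.
catalog = [
    ("Item A", "Mos Def"),
    ("Item B", "Yasiin Bey"),
    ("Item C", "Someone Else"),
]
print(search(catalog, "Dante Terrell Smith"))
```

Because every name form points to a single identifier, a search on any one of the names finds the items entered under the others.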
Controlled vocabularies help to keep your data organized so that you and others can understand it. To ensure that the system you are using can understand the data as well, it is important to be aware of any encoding schemes required. An encoding scheme is “a set of specific definitions that describe the philosophy used to represent character data” (Compart). An encoding scheme tells you exactly how to enter your data so that the system can use it accordingly. Depending on the encoding scheme used, this could mean including specific characters or spaces in relation to the term being used or entering dates in specific formats.
One commonly used encoding scheme is ISO 8601 (Data elements and interchange formats - Information interchange - Representation of dates and times), a global standard for numeric date and time formats. The purpose of this standard is to provide a clear method of representing dates and times, particularly when data are transferred between countries with different conventions for writing them numerically (Wikipedia). For example, “3-8-2015” may be interpreted as “March 8, 2015” in the United States, but as “August 3, 2015” in Europe. The standard allows universal understanding by defining the order in which date and time information is presented: year, then month, then day. Using the standard, “March 8, 2015” would be represented as “2015-03-08”, and “August 3, 2015” would be represented as “2015-08-03”.
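As a rough illustration, the ambiguous date above can be converted to ISO 8601 with Python's standard datetime module. The function name to_iso_8601 and its day_first parameter are invented for this sketch; the key point is that you have to know which convention the original date was written in before you can convert it.

```python
from datetime import date, datetime


def to_iso_8601(text, day_first):
    """Parse a numeric date such as '3-8-2015' and return it as YYYY-MM-DD.

    day_first=False reads the text as month-day-year (United States usage);
    day_first=True reads it as day-month-year (common European usage).
    """
    fmt = "%d-%m-%Y" if day_first else "%m-%d-%Y"
    return datetime.strptime(text, fmt).date().isoformat()


print(to_iso_8601("3-8-2015", day_first=False))  # 2015-03-08
print(to_iso_8601("3-8-2015", day_first=True))   # 2015-08-03

# An ISO 8601 date string can be read back without guessing the field order.
parsed = date.fromisoformat("2015-03-08")
print(parsed.month)  # 3
```

Once a date is stored as "2015-03-08", any system that follows the standard reads it the same way.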
The Library’s Metadata and Discovery Unit provides metadata advice and support. To ask a question, set up a consultation, or learn more about our services, contact us.