DataBasics Blog and Information Bulletins

HOW TO Create Your Own Controlled Vocabulary

Controlled vocabulary and metadata are important terms to consider when implementing a DAM as they define how your users will search and retrieve your digital assets. Digital assets that you can’t find are not your assets.

Keywording, or tagging, identifies the distinctive elements of an asset so that it can be easily found and retrieved. Keywords are best managed through a controlled vocabulary.

Have you ever searched for oranges and got everything from sunsets to flowers, just because ORANGE was used to identify both the fruit and the colour? Or have assets been tagged variously with DOG, PUPPY or CANINE instead of just DOGS, forcing the user to think of every alternative keyword or miss out on images of puppies and canines when searching for images of dogs? Synonym support is critical in a good vocabulary because it points the user to the preferred term, i.e. DOGS in this case.
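As a rough illustration of synonym support, the sketch below maps each variant keyword to a single preferred term before searching. The table contents, asset names and function names are all hypothetical and not taken from any particular DAM product.

```python
# Hypothetical synonym table: every variant points to one preferred term.
PREFERRED_TERMS = {
    "dog": "DOGS",
    "dogs": "DOGS",
    "puppy": "DOGS",
    "puppies": "DOGS",
    "canine": "DOGS",
    "canines": "DOGS",
}

# Hypothetical assets, each tagged only with preferred terms.
ASSETS = {
    "beach_walk.jpg": {"DOGS", "BEACHES"},
    "litter.jpg": {"DOGS"},
    "sunset.jpg": {"SUNSETS"},
}


def preferred_term(keyword):
    """Return the preferred term for a keyword, or None if it is unknown."""
    return PREFERRED_TERMS.get(keyword.strip().lower())


def search(assets, query):
    """Return the names of assets tagged with the query's preferred term."""
    term = preferred_term(query)
    if term is None:
        return []
    return sorted(name for name, tags in assets.items() if term in tags)


print(search(ASSETS, "puppy"))
```

Because every variant resolves to DOGS, a search for "puppy" or "canine" returns the same assets as a search for "dogs".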

The key to effective search and retrieval is control of these keywords. A controlled vocabulary provides consistency in the way groups of words are presented to users. A specifically selected group of words can be set up to identify different uses of a term, for example: SWIMMING POOLS, POOL TABLES, POOLS. A controlled vocabulary with synonym support is also valuable for handling common misspellings and alternate spellings, so that typing errors do not hide assets from a search.

To build your controlled vocabulary, you will need two things: the keywords and a structure. The controlled vocabulary guru David Riecks recommends, in his Keyword Guidelines, developing answers to the Who, What, Why, When, Where and How questions of the image.

Structure

But first, you need to take a look at your whole collection and think about how someone else would describe what they see and how they would search for it. Start by creating categories or subject headings that fit your line of business. You can assign any number of assets to a category, and any asset can appear in any number of categories. This creates a network of relations between assets.
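To make the "network of relations" concrete, here is a minimal sketch of a many-to-many link between assets and categories. The Catalogue class, its method names and the sample assets are assumptions made for illustration only.

```python
from collections import defaultdict


class Catalogue:
    """Hypothetical store linking assets and categories in both directions."""

    def __init__(self):
        self._assets_by_category = defaultdict(set)
        self._categories_by_asset = defaultdict(set)

    def assign(self, asset, category):
        """Place an asset in a category; an asset may sit in many categories."""
        self._assets_by_category[category].add(asset)
        self._categories_by_asset[asset].add(category)

    def assets_in(self, category):
        return sorted(self._assets_by_category.get(category, set()))

    def categories_of(self, asset):
        return sorted(self._categories_by_asset.get(asset, set()))

    def related_assets(self, asset):
        """Return other assets that share at least one category with this one."""
        related = set()
        for category in self._categories_by_asset.get(asset, set()):
            related |= self._assets_by_category[category]
        related.discard(asset)
        return sorted(related)


catalogue = Catalogue()
catalogue.assign("park_opening.jpg", "Parks")
catalogue.assign("park_opening.jpg", "Events")
catalogue.assign("festival_poster.pdf", "Events")
catalogue.assign("festival_poster.pdf", "Marketing")

print(catalogue.related_assets("park_opening.jpg"))
```

The two assets are related only because they share the Events category, which is the kind of connection a single folder structure cannot express.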

You can structure your collection any way you think would benefit your organisation’s use of assets. You can have categories based on the asset type: InDesign, PDF, PowerPoint, etc. Or based on relevance: streets, parks, urban elements, infrastructure, facilities, projects, marketing, advertisements, events, publications, etc. You can organise by date or by work processes e.g. In Progress; Awaiting Approval; Approved, and so on.

As you start building your keyword vocabulary, you’ll know how deep your hierarchies need to be and how to break them down. Once you have picked some categories, start adding keywords in a hierarchical structure with broader terms at the top and more specific terms further down. For example:

Streets->
Grafton->
Sheridan-> Signs
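One way to hold such a hierarchy is a simple table of each term and its broader term. The sketch below assumes a particular reading of the example (Grafton and Sheridan sit under Streets, and Signs sits under Sheridan); the function names are hypothetical.

```python
# Narrower term -> broader term. The nesting is an assumed reading of the
# example above: Streets > Grafton, and Streets > Sheridan > Signs.
BROADER = {
    "Grafton": "Streets",
    "Sheridan": "Streets",
    "Signs": "Sheridan",
}


def path_to_root(term):
    """Return the term followed by each broader term up to the top level."""
    path = [term]
    while path[-1] in BROADER:
        path.append(BROADER[path[-1]])
    return path


def narrower_terms(term):
    """Return every term below the given term, at any depth."""
    found = []
    for child, parent in BROADER.items():
        if parent == term:
            found.append(child)
            found.extend(narrower_terms(child))
    return found


print(path_to_root("Signs"))
print(narrower_terms("Streets"))
```

Tagging an asset with a specific term can then imply its broader terms, and a search on a broad term can be widened to include everything beneath it. This flat table assumes each keyword appears only once in the hierarchy.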

Choosing your keywords

When developing your keywords, keep in mind that even with a controlled vocabulary in place, asset naming is open to interpretation so keywords should generally relate to an asset’s description. It is also important to include proper names and corporate names when trying to identify assets.

To create a uniform approach, an effective way to set up your vocabulary is to follow Riecks’ guidelines by developing answers to the Who, What, Why, When, Where and How questions of a document, image or file. It is not so much what you are finding when you search, but what you AREN’T finding when you search.

The Who, What, Why, When, Where and How of an image

How would you tag this image?

Golden Rules for Choosing Keywords

Be relevant, be consistent, be specific and be descriptive:

  • Use the plural form only, unless the singular is spelled differently from the start of the plural, as with “baby” and “babies”; in that case include both.
  • Avoid ambiguous words and homonyms. Distinguish between them by giving each meaning its own term, such as “oranges” for the fruit and “orange” for the colour.
  • Don’t use nicknames, figures of speech or slang, like “coffin nail” for “cigarette”.
  • Think about the context and purpose of the image: an image of a building could be keyworded “neoclassic”, “baroque” or “gothic”, or for a different purpose it could have keywords like “family home”, “church” or “apartment building”.

A controlled vocabulary adds power and effectiveness to your organisation’s processes by removing confusion and ambiguity, and applying consistency. It is not enough just to create a controlled vocabulary; you need to use it and keep it current. One of the biggest challenges for the success of your DAM is resistance to entering data manually. However, you don’t need to start from scratch. There are established lists that have been tested over time and can be found online:

  • JISC Directory of Metadata Vocabularies: This directory provides details of more than 70 vocabulary sources. It categorises the various types of vocabularies as: Thesauri, Subject Headings, Word Lists, Authority Lists and Classification Schemes.
  • Getty Vocabularies: The Getty vocabularies contain structured terminology for art, architecture, decorative arts and other material culture, archival materials, visual surrogates, and bibliographic materials.
  • Photo-Keywords: A range of specialised tab-indented keyword lists that can be added to the ‘Photo-Keywords.com’ Hierarchical Image Keyword Catalog, or used on their own in Lightroom and other photo editing programs.
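For readers curious how a tab-indented keyword list encodes a hierarchy, the following sketch reads one into keyword and parent pairs. It assumes one keyword per line with one tab per level; the sample list and function name are hypothetical, and real lists may use extra markup that this sketch ignores.

```python
def parse_keyword_list(text):
    """Parse tab-indented lines into (keyword, parent) pairs.

    Assumes one keyword per line and one leading tab per hierarchy level.
    """
    pairs = []
    stack = []  # stack[i] is the most recent keyword seen at depth i
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        depth = len(line) - len(line.lstrip("\t"))
        keyword = line.strip()
        del stack[depth:]  # drop keywords at this depth or deeper
        parent = stack[-1] if stack else None
        pairs.append((keyword, parent))
        stack.append(keyword)
    return pairs


# Hypothetical sample list, not taken from any published catalogue.
SAMPLE = "Streets\n\tGrafton\n\tSheridan\n\t\tSigns\nParks\n"

for keyword, parent in parse_keyword_list(SAMPLE):
    print(keyword, "<-", parent)
```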

You can also purchase an established Controlled Vocabulary Keyword Catalogue.
Written by Linda Rouse, Information Manager, DataBasics
www.databasics.com.au

 

More Resources

DataBasics Blog – Digital Asset Management and Controlled Vocabulary

David Riecks Controlled Vocabulary Website

Wikipedia

Marine Metadata

Palgrave Macmillan Journal on Vocab

Association for Information Science and Technology Bulletin

Digital Library for Earth System Education
