Introduction to Knowledge Organisation Systems for the Digital Humanities: Why are Knowledge Organisation Systems Important?

Remember the last time you couldn't find a piece of information, a book, or a dataset? One common reason is that many information search systems, such as web search engines, rely on purely automatic matching between the word(s) you typed in the search box and the words in the documents in their database. (Here 'document' is used broadly to refer to any information resource in an information system, such as a book, map, dataset or image; 'information search system' covers web search engines as well as online library, archive or museum catalogues.) However, words can have many meanings, which is why a search for 'jaguar' may return results for jaguar the car, jaguar the animal, jaguar the aircraft engine, and so on. Conversely, you may use a search word that a document does not use, even though the document's word and your search term refer to the same concept (e.g. you wrote 'cinema' and the website uses the term 'movie theatre').

This is where knowledge organisation systems (KOS) come in. They address these problems and help ensure that the results you get are relevant. They can do this because they:

Disambiguate homonyms, which are words like 'jaguar' that look the same but mean different things, making it possible for the user to find only information resources about the specific meaning of 'jaguar' they are looking for;
Bring together related words such as synonyms, which are different words that refer to the same concept (e.g. 'cinema' and 'movie theatre'), making it possible to find all relevant information resources, not just those that happen to use the exact words of the user's search query.
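
The two functions listed above can be illustrated with a minimal sketch in Python. It assumes a toy KOS stored as a dictionary; the concept identifiers, the labels, and the helper names lookup and expand are all hypothetical, invented for illustration rather than taken from any real vocabulary or library.

```python
# Toy KOS: each concept has an ID, one preferred label and alternative labels.
# All identifiers and labels are illustrative assumptions.
CONCEPTS = {
    "c1": {"pref": "jaguar (animal)", "alt": ["jaguar", "panthera onca"]},
    "c2": {"pref": "Jaguar (car)", "alt": ["jaguar"]},
    "c3": {"pref": "cinema", "alt": ["movie theatre", "movie theater"]},
}


def lookup(term):
    """Return the IDs of every concept with a label that matches the term."""
    t = term.strip().lower()
    return sorted(
        cid
        for cid, concept in CONCEPTS.items()
        if t == concept["pref"].lower()
        or t in (alt.lower() for alt in concept["alt"])
    )


def expand(concept_id):
    """Return all labels of a concept, so a search can match any synonym."""
    concept = CONCEPTS[concept_id]
    return [concept["pref"]] + concept["alt"]


# A homonym matches several concepts; the user can be asked to pick one.
print(lookup("jaguar"))

# A synonym leads to a single concept, whose labels can all be searched.
print(expand(lookup("movie theatre")[0]))
```

In this sketch, a search for 'jaguar' matches two concepts, so the system could ask the user which one is meant, while a search for 'movie theatre' is expanded to also cover 'cinema'.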

KOS can also provide suggestions to end users, such as more specific terms if the user initially used a term that retrieved too many results. For example, if one writes 'table', the search system could ask "Do you mean coffee table, dining table, card table, etc.?", thus narrowing the results down to the most relevant ones. Conversely, if too few hits are found, broader terms such as 'furniture' could be offered. Because of these benefits, institutions such as libraries and museums that wish to provide quality search functions to their users rely on KOS.
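
A minimal Python sketch of this kind of suggestion is shown here. It assumes a toy hierarchy in which each term points to its broader term; the terms, the thresholds too_many and too_few, and the helper names narrower and suggest are hypothetical choices made only for illustration.

```python
# Toy hierarchy: each term maps to its broader term (None at the top).
# Terms and thresholds are illustrative assumptions.
BROADER = {
    "furniture": None,
    "table": "furniture",
    "coffee table": "table",
    "dining table": "table",
    "card table": "table",
}


def narrower(term):
    """Return the terms that sit directly below the given term."""
    return sorted(t for t, parent in BROADER.items() if parent == term)


def suggest(term, hit_count, too_many=100, too_few=5):
    """Suggest narrower terms for too many hits, a broader term for too few."""
    if hit_count > too_many:
        return narrower(term)
    if hit_count < too_few:
        parent = BROADER.get(term)
        return [parent] if parent else []
    return []


print(suggest("table", hit_count=2500))  # narrower terms for 'table'
print(suggest("table", hit_count=2))     # broader term for 'table'
```

A real system would choose its thresholds from experience with its own collection; the values here only mark where the two kinds of suggestion switch.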

However, KOS are expensive to produce, manage and apply. This is why many search systems rely on other methods. Some use semi-automatic approaches, also known as machine-aided indexing, while still using KOS. Others are fully automatic, like commercial search engines. Still others rely on user-added tags; this is known as social tagging (see, for example, Goodreads).

What are the pros and cons of each of these approaches? What kinds of KOS are relevant for cultural heritage (CH) institutions and Digital Humanities (DH) research projects? How can these approaches be combined to best benefit the user? This course provides an introductory overview of KOS, including social tagging and automatic approaches, with a focus on the search and retrieval of documents across a range of cultural heritage institutions, digital collections, and digital humanities projects.

After completing this course, you will understand the importance of applying KOS in any CH or DH context, and you will know where to look for the KOS most appropriate for your needs. You will also understand the pros and cons of social tagging and automatic approaches, and will thus be able to choose a balanced implementation of professional, social and automatic approaches. While this course draws most of its examples from knowledge organisation for the performing arts, newspaper archives, and cultural heritage (the three focus areas of the Dimpah project), the knowledge acquired will be widely transferable.

Last modified: Monday, 15 May 2023, 11:22 AM