4.2 Models for data: triples and RDF

4.2.1 Triples

RDF, the Resource Description Framework, is the standard data model for LOD. It is defined by the World Wide Web Consortium (W3C) and has been widely adopted. The model is based on triples (see 4.1.1 The Semantic Web) and uses URIs to identify the subject, predicate and object of each statement, which together form a directed graph. RDF can be written in several serialisation formats, such as RDF/XML, Turtle and JSON-LD.

"This is a manuscript."


This statement is a triple (like those seen in 4.1 Why use LOD for ancient written artefacts?). Such statements, consisting of a 'Subject', a 'Predicate' and an 'Object', are called "triples". This is a very practical way to model knowledge, because you can extend the statement with similar ones, such as "The manuscript is in London" or "The manuscript is called BL Orient 716". You can use the same kind of statement to describe other ancient written artefacts, such as papyri, or the variants in an edition.
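As a minimal sketch of this idea, the three statements quoted above can be held as plain (subject, predicate, object) tuples. Python is used here only for illustration. The class name Triple, the function describe and the labels "this", "is a", "is in" and "is called" are placeholders chosen for this example; they are not URIs from any vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Placeholder labels standing in for real identifiers (assumption for illustration).
statements = [
    Triple("this", "is a", "manuscript"),
    Triple("this", "is in", "London"),
    Triple("this", "is called", "BL Orient 716"),
]


def describe(subject: str, triples: list[Triple]) -> dict[str, str]:
    """Collect everything stated about one subject as predicate -> object."""
    return {t.predicate: t.object for t in triples if t.subject == subject}


description = describe("this", statements)
print(description)
```

Adding knowledge about the manuscript only requires appending another tuple to the list; the structure of the existing statements does not change.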

“This” may be a URI identifying a concept or a real thing: the subject of our statement. Suppose you want to write the statement “Paris. Bibliothèque nationale de France, Département des manuscrits, Supplément grec 607 is a manuscript.” as an RDF triple, using URIs for Linked Open Data. You would then have the following:

The Subject of our statement, Paris. Bibliothèque nationale de France, Département des manuscrits, Supplément grec 607, has a URI in Biblissima: https://data.biblissima.fr/entity/Q77950 (other URIs are also available for it, e.g., https://gallica.bnf.fr/ark:/12148/btv1b8593585j, so you can create multiple triples). The Predicate “is a” translates into a URI from the RDF vocabulary: http://www.w3.org/1999/02/22-rdf-syntax-ns#type, abbreviated as rdf:type. “Manuscript” is in this case a URI identifying a class, for example the one from Biblissima (https://w3id.org/bibma/Manuscript). This results in the following triple:

Subject: https://data.biblissima.fr/entity/Q77950
Predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Object: https://w3id.org/bibma/Manuscript
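To illustrate how such a triple looks in one of the serialisation formats, the sketch below writes it out as a Turtle statement using nothing but string formatting. The function name to_turtle is made up for this example, and the code only handles triples made of three URIs (no literals, prefixes or escaping). For real data a dedicated RDF library would be a safer choice.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

subject = "https://data.biblissima.fr/entity/Q77950"
obj = "https://w3id.org/bibma/Manuscript"


def to_turtle(s: str, p: str, o: str) -> str:
    """Write one triple of three URIs as a single Turtle statement."""
    # Turtle wraps full URIs in angle brackets and ends a statement with " .".
    # The keyword "a" is Turtle shorthand for rdf:type.
    predicate = "a" if p == RDF_TYPE else f"<{p}>"
    return f"<{s}> {predicate} <{o}> ."


line = to_turtle(subject, RDF_TYPE, obj)
print(line)
# <https://data.biblissima.fr/entity/Q77950> a <https://w3id.org/bibma/Manuscript> .
```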


This is just one example of a triple. Because any URI can be the subject, predicate or object of another statement, the model can quickly expand in all directions. Finding or defining concepts with URIs and modelling sets of triples is not an easy task, so the key point to keep in mind is to keep the model as simple as possible.
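The way a set of triples grows into a directed graph can be sketched as follows. Only the first triple comes from the text above. The namespace https://example.org/ and the terms heldBy, locatedIn, library and city are invented for this illustration and do not belong to any real vocabulary.

```python
from collections import defaultdict

EX = "https://example.org/"  # hypothetical namespace, for illustration only
MANUSCRIPT = "https://data.biblissima.fr/entity/Q77950"

triples = [
    (MANUSCRIPT,
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "https://w3id.org/bibma/Manuscript"),
    # The next two statements use made-up URIs under EX.
    (MANUSCRIPT, EX + "heldBy", EX + "library"),
    # The object of the previous statement is the subject of this one.
    (EX + "library", EX + "locatedIn", EX + "city"),
]

# Index outgoing edges: subject -> list of (predicate, object).
outgoing = defaultdict(list)
for s, p, o in triples:
    outgoing[s].append((p, o))


def reachable(start: str) -> set[str]:
    """Return every node that can be reached from start by following triples."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for _, target in outgoing.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


nodes = reachable(MANUSCRIPT)
print(sorted(nodes))
```

Each additional statement about the library or the city would enlarge the set of nodes reachable from the manuscript, which is why a small model is easier to keep under control.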

Using existing, shared URIs that are relevant for a specific domain greatly helps the interchange of information. For instance, if Homer is named in this manuscript, you can refer to him with the VIAF identifier for Homer or opt for the Wikidata URI, https://www.wikidata.org/wiki/Q6691. Within a dataset, it is definitely recommended to be consistent in the use of identifiers!
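One possible way to enforce this consistency is to map every alternative identifier to the one preferred in the dataset before the triples are stored. The sketch below assumes a hypothetical statement that the manuscript mentions Homer. The predicate URI and the VIAF value are placeholders, not real identifiers, and the function name normalise is made up for this example.

```python
# Preferred identifier for Homer in this dataset (the Wikidata URI from the text).
HOMER_WIKIDATA = "https://www.wikidata.org/wiki/Q6691"
# Placeholder only, NOT a real VIAF URI: replace it with the actual identifier.
HOMER_VIAF_PLACEHOLDER = "https://example.org/viaf/homer-placeholder"
# Hypothetical predicate, for illustration only.
MENTIONS = "https://example.org/mentions"

MANUSCRIPT = "https://data.biblissima.fr/entity/Q77950"

# Map every alternative identifier to the one preferred in this dataset.
PREFERRED = {HOMER_VIAF_PLACEHOLDER: HOMER_WIKIDATA}

triples = [
    (MANUSCRIPT, MENTIONS, HOMER_VIAF_PLACEHOLDER),
    (MANUSCRIPT, MENTIONS, HOMER_WIKIDATA),
]


def normalise(rows):
    """Replace alternative identifiers and drop statements that become duplicates."""
    result = []
    for row in rows:
        unified = tuple(PREFERRED.get(term, term) for term in row)
        if unified not in result:
            result.append(unified)
    return result


consistent = normalise(triples)
print(consistent)
```

After normalisation the two statements collapse into one, because they said the same thing about the same person under two different identifiers.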

