In this short blog post, I’ll explain two useful concepts in RDF (Resource Description Framework): blank node (BNode) skolemization and deskolemization.

Introduction

RDF is a way to describe data, and it sometimes includes blank nodes (also called BNodes). These are ‘unnamed’ entities, useful when we want to represent related aspects of an entity without giving each one its own identifier. Related data is often split into separate resources in RDF, similar to how data is normalized into different tables in a relational database.

For example, Jan, a person, can have a name (a literal) and family details (represented with a blank node). Jan’s job, email, and website details can each live in their own blank node as well. Instead of repeating related data (e.g., family and job info) inside the person’s RDF entry, you represent it as separate blank nodes and link them back to the main entity. Blank nodes help avoid duplication and offer a flexible way to represent complex, structured information.
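To make the Jan example concrete, here is a minimal Python sketch of how such data breaks down into triples. It is only an illustration: the ex: property names, the blank node labels _:b1 and _:b2, and the values "Anna" and "Data engineer" are made up, and a term is treated as a blank node simply because it starts with "_:".

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# Assumption for this sketch: any term starting with "_:" is a blank node.
# The ex: names, labels, and values below are hypothetical.
jan_triples = [
    ("ex:Jan", "foaf:name", '"Jan"'),           # a literal value
    ("ex:Jan", "ex:family", "_:b1"),            # link to an unnamed family resource
    ("_:b1", "ex:spouse", '"Anna"'),
    ("ex:Jan", "ex:job", "_:b2"),               # link to an unnamed job resource
    ("_:b2", "ex:title", '"Data engineer"'),
]


def describe(node, triples):
    """Return the (predicate, object) pairs attached to one node."""
    return [(p, o) for s, p, o in triples if s == node]


# The family details hang off the blank node, not off Jan directly.
print(describe("_:b1", jan_triples))
```

Jan is the only named resource here. The family and job details are reachable only by following the links from Jan to the blank nodes.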

However, some believe that blank nodes can cause issues when comparing, sharing, versioning, or updating data, because they don’t have a fixed or globally unique identifier. That’s where skolemization comes in. Skolemization is the process of replacing each blank node with a unique identifier (IRI). This helps track and manage those nodes across datasets, especially when integrating data from different sources. It is like giving each ‘nameless’ person a unique ID so they don’t get mixed up later.

Deskolemization, on the other hand, turns those IRIs back into blank nodes. You might do this if you want to work with anonymous data again. In this post, we’ll start by looking at how to insert data with blank nodes, and then explore how skolemization and deskolemization help manage these resources.

Insert Data

Let’s walk through an example of adding data to a semantic model and applying skolemization. We’ll begin by creating a model for the process and then insert some individual persons with basic properties like name, email, and affiliation. This RDF data describes the metadata of the “Person” ontology, including its contributors, editors, affiliations, and licensing information. Please note, this data is fake and manually created.

begin sem_apis.create_sem_model('skolemization',null,null,network_owner=>'MARYAMSAJJADIAN',network_name=>'RDF_NETWORK'); end;
/
begin
  sem_apis.update_model('skolemization',
  'prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix dc11: <http://purl.org/dc/elements/1.1/>
prefix dc: <http://purl.org/dc/terms/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix ns0: <http://purl.org/vocab/vann/>
prefix ns1: <http://www.w3.org/2001/02pd/rec54#>
prefix org: <http://www.w3.org/ns/org#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix schema: <http://schema.org/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
insert data {

<https://oracle.nl/ns/persoon>
    dc:contributor [
        schema:affiliation [
            foaf:homepage <http://vlm.nl> ;
            foaf:name "VLM"
        ] ;
        a foaf:person ;
        foaf:firstName "Ann" ;
        foaf:lastName "Stevens" ;
        foaf:mbox <mailto:Ann.Stevens@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://digipolis.nl> ;
            foaf:name "Data science"
        ] ;
        a foaf:person ;
        foaf:firstName "Matt" ;
        foaf:lastName "Mostmans" ;
        foaf:mbox <mailto:matt.Mostmans@google.com>
    ], [
        schema:affiliation [
            foaf:homepage <https://www.lne.nl/> ;
            foaf:name "LNE"
        ] ;
        a foaf:person ;
        foaf:firstName "Kris" ;
        foaf:lastName "Peirelinck" ;
        foaf:mbox <mailto:kris.peirelinck@google.com>
    ], [
        schema:affiliation [
            foaf:homepage <http://binnenland.nl/> ;
            foaf:name "ABC"
        ] ;
        a foaf:person ;
        foaf:firstName "Williem" ;
        foaf:lastName "Devroey" ;
        foaf:mbox <mailto:Williem.devroey@google.nl>
    ] ;
    dc:issued "2018-10-04"^^xsd:date ;
    dc:license <https://domaim.nl/sites/default/files/documenten/ict-egov/licenties/hergebruik/modellicentie_gratis_hergebruik_v1_0.html> ;
    dc:mediator [
        foaf:homepage <https://oracle.nl> ;
        foaf:mbox <mailto:oslo@oracle.com> ;
        foaf:name "Data science"
    ] ;
    dc:title "Person"@en, "Persoon"@nl ;
    ns0:preferredNamespaceUri "https://data.oracle.nl/ns/persoon" ;
    a owl:Ontology ;
    rdfs:label "Person"@en, "Persoon"@nl ;
    ns1:editor [
        schema:affiliation [
            foaf:homepage <http://idlab.nl> ;
            foaf:name "imec"
        ] ;
        a foaf:person ;
        foaf:firstName "Eric" ;
        foaf:lastName "Mans" ;
        foaf:mbox <mailto:eric.mans@ug.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://idlab.nl> ;
            foaf:name "imec"
        ] ;
        a foaf:person ;
        foaf:firstName "Vali" ;
        foaf:lastName "Bouserie" ;
        foaf:mbox <mailto:vali.bouserie@oracle.com>
    ], [
        schema:affiliation [
            foaf:homepage <https://oracle.nl/> ;
            foaf:name "AGO"
        ] ;
        a foaf:person ;
        foaf:firstName "Gis" ;
        foaf:lastName "Martens" ;
        foaf:mbox <mailto:gis.martens@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://www.DBDigitaleTransformatie.netherlands.nl/> ;
            foaf:name "DG Digitale Transformatie"
        ] ;
        a foaf:person ;
        foaf:firstName "lei" ;
        foaf:lastName "DHondt" ;
        foaf:mbox <mailto:liesbet.dhondt@oracle.com>
    ], [
        schema:affiliation [
            foaf:homepage <http://digipolis.nl> ;
            foaf:name "Digipolis Gent"
        ] ;
        a foaf:person ;
        foaf:firstName "Katrien" ;
        foaf:lastName "liefde" ;
        foaf:mbox <mailto:Katrien.liefde@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://cipalstaad.nl> ;
            foaf:name "Cipal staad"
        ] ;
        a foaf:person ;
        foaf:firstName "meli" ;
        foaf:lastName "Sajadian" ;
        foaf:mbox <mailto:meli.Sajadian@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://geosolutions.nl> ;
            foaf:name "Geosolutions"
        ] ;
        a foaf:person ;
        foaf:firstName "Marike" ;
        foaf:lastName "Cahy" ;
        foaf:mbox <mailto:marike.cahy@oracle.com>
    ], [
        schema:affiliation [
            foaf:homepage <https://oracle.nl/> ;
            foaf:name "HR"
        ] ;
        a foaf:person ;
        foaf:firstName "Henk" ;
        foaf:lastName "Vander" ;
        foaf:mbox <mailto:henk.vander@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://oracle.nl> ;
            foaf:name "HR"
        ] ;
        a foaf:person ;
        foaf:firstName "Wouter" ;
        foaf:lastName "Beynaerts" ;
        foaf:mbox <mailto:Wouter.beynaerts@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <http://oracle.nl> ;
            foaf:name "HR"
        ] ;
        a foaf:person ;
        foaf:firstName "Karin" ;
        foaf:lastName "De Vreese" ;
        foaf:mbox <mailto:Karin.devreese@oracle.nl>
    ], [
        schema:affiliation [
            foaf:homepage <https://oracle.nl/> ;
            foaf:name "Netherlands website"
        ] ;
        a foaf:person ;
        foaf:firstName "Maryam" ;
        foaf:lastName "Sajjadian" ;
        foaf:mbox <mailto:Maryam.Sajjadian@oracle.com>
    ], [
        schema:affiliation [
            foaf:homepage <http://www.oracle.com> ;
            foaf:name "Graph "
        ] ;
        a foaf:person ;
        foaf:firstName "Juses" ;
        foaf:lastName "Demol" ;
        foaf:mbox <mailto:Juses.demol@oracle.com>
    ] .
}',  
       network_owner=>'MARYAMSAJJADIAN',
   network_name=>'RDF_NETWORK',
   options=>' CLOB_UPDATE_SUPPORT=T '
);
end;

Now you can check for blank nodes in the subject and object positions by running a SELECT query over the model that filters triples with isBlank(?subject) || isBlank(?object).
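The idea behind that check can be sketched in a few lines of Python. This is only an illustration outside the database, not the query itself: the blank node labels _:b1 and _:b2 are hypothetical (Oracle generates its own), and a term counts as blank simply because it starts with "_:".

```python
def is_blank(term):
    """Sketch of isBlank(): in this toy model, blank nodes start with '_:'."""
    return term.startswith("_:")


def blank_nodes(triples):
    """Collect every blank node seen in subject or object position."""
    found = set()
    for s, _p, o in triples:
        if is_blank(s):
            found.add(s)
        if is_blank(o):
            found.add(o)
    return found


# A few triples shaped like the inserted data; labels are made up.
sample = [
    ("<https://oracle.nl/ns/persoon>", "dc:contributor", "_:b1"),
    ("_:b1", "foaf:firstName", '"Ann"'),
    ("_:b1", "schema:affiliation", "_:b2"),
    ("_:b2", "foaf:name", '"VLM"'),
]

print(sorted(blank_nodes(sample)))
```

Note that the same blank node can show up as an object in one triple and as a subject in another, which is why both positions need to be handled later.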

Skolemization

As mentioned earlier, skolemization is the process of replacing blank nodes (anonymous resources) with IRIs, making the data easier to query and reference across systems and domains. In this example, we assume our domain is Oracle, so the skolem IRIs start with the well-known prefix https://oracle.com/.well-known/genid/. The DELETE and INSERT clauses are used together to update RDF data. In the example below, the SPARQL update passed to Oracle’s sem_apis.update_model procedure finds all triples with a blank node in the subject position and replaces that subject with a skolem IRI built with the BIND(), CONCAT(), and STRAFTER() functions:

begin
    sem_apis.update_model('skolemization',
'
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

DELETE {
?subject ?predicate ?object .
}
INSERT {
?convertedURI ?predicate ?object
}
WHERE {
?subject ?predicate ?object .
FILTER (isBLANK(?subject))
BIND(URI(CONCAT("https://oracle.com/.well-known/genid/", STRAFTER(STR(?subject), "_:"))) AS ?convertedURI)
}
',
      options=>'STREAMING=F',
      match_options=> 'ALLOW_DUP=T',
      network_owner=>'MARYAMSAJJADIAN',
      network_name=>'RDF_NETWORK'
    );
end;

Next, repeat the same process for blank nodes in the object position. The script below converts blank nodes in the object position to IRIs with the same prefix, making the RDF data more interoperable and easier to handle:

begin
    sem_apis.update_model('skolemization',
'
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

DELETE {
?subject ?predicate ?object .
}
INSERT {
?subject ?predicate ?convertedURI
}
WHERE {
?subject ?predicate ?object .
FILTER (isBLANK(?object))
BIND(URI(CONCAT("https://oracle.com/.well-known/genid/", STRAFTER(STR(?object), "_:"))) AS ?convertedURI)
}
',
      options=>' STREAMING=F',
      match_options=> 'ALLOW_DUP=T ',
      network_owner=>'MARYAMSAJJADIAN',
      network_name=>'RDF_NETWORK'
    );
end;
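The string logic of the two updates above can be sketched in plain Python. This is a simplified illustration, not Oracle's implementation: triples are tuples of strings, the label _:b1 is hypothetical, and both positions are handled in one pass instead of two separate updates.

```python
GENID_PREFIX = "https://oracle.com/.well-known/genid/"


def skolemize_term(term):
    """Mirror URI(CONCAT(prefix, STRAFTER(STR(?x), "_:"))) for one term."""
    if term.startswith("_:"):
        # Drop the "_:" marker and append the rest of the label to the prefix.
        return "<" + GENID_PREFIX + term[2:] + ">"
    return term  # IRIs and literals are left untouched


def skolemize(triples):
    """Replace blank nodes in subject and object position with skolem IRIs."""
    return [(skolemize_term(s), p, skolemize_term(o)) for s, p, o in triples]


before = [
    ("<https://oracle.nl/ns/persoon>", "dc:contributor", "_:b1"),
    ("_:b1", "foaf:firstName", '"Ann"'),
]
after = skolemize(before)
for triple in after:
    print(triple)
```

Because the skolem IRI is derived from the blank node label, the same node gets the same IRI whether it appears as a subject or as an object, so the link between the ontology and the contributor survives the rewrite.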

Check the result in the SPARQL editor using the following query:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT distinct ?subject
WHERE
  { ?subject ?predicate ?object .
  FILTER ( isIRI(?subject) && STRSTARTS(STR(?subject), "https://oracle.com/.well-known/genid/") )
    }

Deskolemization

Deskolemization is the process of converting skolem IRIs back into blank nodes (anonymous nodes). This can be especially helpful when you’re exporting RDF data to systems or formats that expect blank nodes, or when you want the data to feel less “machine-generated” and more intuitive for humans. The script below handles skolem IRIs in the subject position:

begin
    sem_apis.update_model('skolemization',
'
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

DELETE {
?subject ?predicate ?object .
}
INSERT {
?bnode ?predicate ?object
}
WHERE {
?subject ?predicate ?object .
FILTER(isIRI(?subject) && STRSTARTS(STR(?subject), "https://oracle.com/.well-known/genid/"))
BIND(STRAFTER(STR(?subject), "/.well-known/genid/") AS ?localName)
BIND(BNODE(?localName) AS ?bnode)  
}
',
      options=>' STREAMING=F',
      match_options=> 'ALLOW_DUP=T ',
      network_owner=>'MARYAMSAJJADIAN',
      network_name=>'RDF_NETWORK'
    );
end;

To repeat this process for objects instead of subjects, adapt the script like this:

begin
    sem_apis.update_model('skolemization',
'
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

DELETE {
  ?subject ?predicate ?object .
}
INSERT {
  ?subject ?predicate ?bnode
}
WHERE {
  ?subject ?predicate ?object .
  FILTER(isIRI(?object) && STRSTARTS(STR(?object), "https://oracle.com/.well-known/genid/"))
  BIND(STRAFTER(STR(?object), "/.well-known/genid/") AS ?localName)
  BIND(BNODE(?localName) AS ?bnode)
}
',
    options=>' STREAMING=F',
    match_options=> 'ALLOW_DUP=T ',
    network_owner=>'MARYAMSAJJADIAN',
    network_name=>'RDF_NETWORK'
    );
end;
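The reverse direction can be sketched the same way in Python. Again, this is only an illustration under simplifying assumptions: triples are tuples of strings, the genid local name b1 is hypothetical, and the sketch reuses the local name as the blank node label, which a real triple store is not obliged to do when it creates new blank nodes.

```python
GENID_PREFIX = "https://oracle.com/.well-known/genid/"


def deskolemize_term(term):
    """Turn a skolem IRI back into a blank node label; leave other terms alone."""
    iri_start = "<" + GENID_PREFIX
    if term.startswith(iri_start) and term.endswith(">"):
        # Keep only the part after /.well-known/genid/, like STRAFTER().
        local_name = term[len(iri_start):-1]
        return "_:" + local_name
    return term


def deskolemize(triples):
    """Replace skolem IRIs in subject and object position with blank nodes."""
    return [(deskolemize_term(s), p, deskolemize_term(o)) for s, p, o in triples]


skolemized = [
    ("<https://oracle.nl/ns/persoon>", "dc:contributor",
     "<https://oracle.com/.well-known/genid/b1>"),
    ("<https://oracle.com/.well-known/genid/b1>", "foaf:mbox",
     "<mailto:Ann.Stevens@oracle.nl>"),
]
restored = deskolemize(skolemized)
for triple in restored:
    print(triple)
```

Only IRIs that start with the full genid prefix are converted. Other IRIs, including the ontology IRI and the mailto: address, pass through unchanged.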

Conclusion

Skolemization and deskolemization are useful techniques for managing RDF data, and which one you prefer depends on the context. Skolemization helps make data easier to query and share by giving blank nodes unique IRIs, which is especially helpful when working across different systems and domains. Deskolemization, on the other hand, restores blank nodes when anonymity is preferred or required by certain applications or formats.

Further study