This document proposes a serialization of RDF graphs as S-expressions.
Design goals
- Represent all RDF concepts and syntax
- A syntax that is valid and looks natural in both Common Lisp and Scheme
- Readable serialization, especially for people used to reading S-expressions
- Unambiguously parseable
- (read)'able by Scheme and/or Common Lisp processors
Nodes
A node is represented by a symbol containing a URI.
|http://www.example.org/bob|
A namespaced node is a dotted pair of two symbols: the local part first, then the namespace abbreviation:
(subject . dc) (name . foaf)
If the namespace abbreviation is null, it stands for the default namespace:
(bob)
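As a rough illustration of how a consumer might resolve the three node forms to full URIs, here is a minimal Python sketch. The `Sym` class, the `expand_node` function, and the Python stand-ins for S-expressions (a 2-tuple for a dotted pair, a list for a proper list) are assumptions made for this example and are not part of the proposal. It also assumes a prefix table and a default namespace are already known.

```python
class Sym(str):
    """A Lisp symbol; plain str is reserved for string literals."""


# Assumed namespace table for the example (not defined by the proposal).
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}
DEFAULT = "http://www.example.org/"


def expand_node(node, prefixes, default=None):
    """Return the full URI string for a node written in any of the three forms."""
    if isinstance(node, Sym):
        # |http://www.example.org/bob| : the symbol already holds the URI.
        return str(node)
    if isinstance(node, tuple) and len(node) == 2:
        # (local . prefix) : look up the abbreviation, append the local part.
        local, prefix = node
        return prefixes[str(prefix)] + str(local)
    if isinstance(node, list) and len(node) == 1:
        # (local) : null abbreviation, so use the default namespace.
        if default is None:
            raise ValueError("no default namespace declared")
        return default + str(node[0])
    raise ValueError(f"not a node: {node!r}")


print(expand_node(Sym("http://www.example.org/bob"), PREFIXES, DEFAULT))
print(expand_node((Sym("subject"), Sym("dc")), PREFIXES, DEFAULT))
print(expand_node([Sym("bob")], PREFIXES, DEFAULT))
```

Checking for `Sym` first matters in this sketch because `Sym` subclasses `str`, which keeps symbols and string literals distinguishable.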
Blank nodes
Blank nodes are represented by a list of one or more predicate-object lists.
(((creator . dc) "Evan Prodromou") ((title . dc) "S-expression RDF serialization"))
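Note that the parts of a blank node are proper lists, while namespaced nodes with an explicit abbreviation are improper lists (dotted pairs). The default-namespace form, such as (bob), is a proper list of one symbol.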
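The proper-versus-improper distinction can be turned into a small test, and a blank node can then be flattened into ordinary statements. The following Python sketch shows one way this might look. The names `Sym`, `is_blank_node`, and `expand_blank_node`, the `_:b1`-style labels, and the Python stand-ins (a 2-tuple for a dotted pair, a list for a proper list) are all assumptions for illustration. It also assumes each predicate-object list has exactly two elements.

```python
import itertools


class Sym(str):
    """A Lisp symbol; plain str is reserved for string literals."""


def is_blank_node(x):
    """True for a non-empty proper list whose parts are all two-element proper lists."""
    return (
        isinstance(x, list)
        and len(x) > 0
        and all(isinstance(part, list) and len(part) == 2 for part in x)
    )


def expand_blank_node(bnode, counter):
    """Give the blank node a fresh label and return (label, statements)."""
    label = f"_:b{next(counter)}"  # hypothetical labelling scheme
    # Statements follow the proposal's order: predicate, subject, object.
    return label, [[pred, label, obj] for pred, obj in bnode]


bnode = [
    [(Sym("creator"), Sym("dc")), "Evan Prodromou"],
    [(Sym("title"), Sym("dc")), "S-expression RDF serialization"],
]
label, statements = expand_blank_node(bnode, itertools.count(1))
for statement in statements:
    print(statement)
```

Objects that are themselves blank nodes are left untouched here; a fuller version would recurse into them.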
Literals
Literals are strings.
"foo" "bar" "bletch"
Typed literals
Typed literals are pairs of a string plus a type. The type is a symbol.
("1999-08-16" . |http://www.w3.org/2001/XMLSchema#date|)
Types can be abbreviated like node URIs:
("1999-08-16" . (date . xsd))
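To make the literal forms concrete, here is a hedged Python sketch that splits a plain or typed literal into its lexical form and datatype URI. `Sym`, `is_string`, `split_literal`, and the prefix table are hypothetical names introduced for this example; a dotted pair is modelled as a 2-tuple. What tells a typed literal apart from a namespaced node in this model is that its first element is a string rather than a symbol.

```python
class Sym(str):
    """A Lisp symbol; plain str is reserved for string literals."""


XSD = "http://www.w3.org/2001/XMLSchema#"


def is_string(x):
    """True for a string literal, false for a symbol."""
    return isinstance(x, str) and not isinstance(x, Sym)


def split_literal(x, prefixes):
    """Return (lexical form, datatype URI or None) for a literal."""
    if is_string(x):
        return x, None  # plain literal such as "foo"
    if isinstance(x, tuple) and len(x) == 2 and is_string(x[0]):
        lexical, dtype = x
        if isinstance(dtype, tuple):
            # Abbreviated type such as (date . xsd).
            local, prefix = dtype
            return lexical, prefixes[str(prefix)] + str(local)
        return lexical, str(dtype)  # full type URI held in a symbol
    raise ValueError(f"not a literal: {x!r}")


prefixes = {"xsd": XSD}
print(split_literal("foo", prefixes))
print(split_literal(("1999-08-16", Sym(XSD + "date")), prefixes))
print(split_literal(("1999-08-16", (Sym("date"), Sym("xsd"))), prefixes))
```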
Statements
Statements are lists with three elements, representing the predicate, subject, and object respectively.
(|http://www.example.org/loves| |http://www.example.org/bob| |http://www.example.org/fishing|)
If there's a default namespace, this could be said more compactly as:
((loves) (bob) (fishing))
Graphs
A graph is a list of statements:
(((loves . ex) (bob . ex) (fishing . ex)) ((creator . dc) |http://www.example.com/| "Bob Reynolds"))
Graphs can be abbreviated in three ways. First, multiple statements with the same predicate and subject can be combined into a single list. So this graph:
(((loves) (bob) (fishing)) ((loves) (bob) (databases)) ((loves) (bob) (ice-skating)))
could be abbreviated as:
(((loves) (bob) (fishing) (databases) (ice-skating)))
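Undoing this first abbreviation is mechanical, as the following Python sketch suggests. The function names `expand_objects` and `expand_graph` are made up for this example, and nodes are shown as one-element lists of plain strings standing in for forms like (loves), since the code treats nodes as opaque values. The sketch assumes every statement in the graph uses either the plain three-element form or this multi-object form.

```python
def expand_objects(statement):
    """Turn (pred subj obj1 obj2 ...) into one three-element statement per object."""
    pred, subj, *objects = statement
    if not objects:
        raise ValueError("a statement needs at least one object")
    return [[pred, subj, obj] for obj in objects]


def expand_graph(graph):
    """Expand every statement of a graph, preserving order."""
    return [triple for statement in graph for triple in expand_objects(statement)]


abbreviated = [[["loves"], ["bob"], ["fishing"], ["databases"], ["ice-skating"]]]
for statement in expand_graph(abbreviated):
    print(statement)
```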
Second, if there is more than one statement with the same predicate, they can be combined, such that:
(((met) (harry) (sally)) ((met) (randolph) (his-doom)))
becomes:
(((met) ((harry) (sally)) ((randolph) (his-doom))))
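The reverse of this second abbreviation can be sketched in Python as follows. `expand_shared_predicate` is a hypothetical name, and nodes are again one-element lists of plain strings standing in for forms like (met). The sketch assumes the caller already knows the statement uses the shared-predicate form. Telling that form apart from a statement whose subject is a blank node is not addressed here.

```python
def expand_shared_predicate(statement):
    """Turn (pred (subj1 obj1) (subj2 obj2) ...) into three-element statements."""
    pred, *pairs = statement
    triples = []
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError(f"expected a (subject object) list, got {pair!r}")
        subj, obj = pair
        triples.append([pred, subj, obj])
    return triples


combined = [["met"], [["harry"], ["sally"]], [["randolph"], ["his-doom"]]]
for statement in expand_shared_predicate(combined):
    print(statement)
```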
Finally, a graph can use namespaces to abbreviate nodes. For each namespace, the graph must contain a special "@prefix" statement. With two arguments, the statement maps a namespace abbreviation to a URL prefix. With one argument, it sets the default namespace.
((@prefix "dc" "http://purl.org/dc/elements/1.1/") (@prefix "http://evan.prodromou.name/") ((creator . dc) (Rdf_serialization_to_s-expressions) "Evan Prodromou"))
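One plausible way to process these declarations is to make a first pass over the graph that separates "@prefix" statements from ordinary ones. The Python sketch below does this for the example graph. `Sym` and `split_prefixes` are hypothetical names, and the Python stand-ins (a 2-tuple for a dotted pair, a list for a proper list) are assumptions of this example rather than part of the proposal.

```python
class Sym(str):
    """A Lisp symbol; plain str is reserved for string literals."""


def split_prefixes(graph):
    """Return (prefix table, default namespace or None, remaining statements)."""
    prefixes, default, statements = {}, None, []
    for statement in graph:
        head = statement[0]
        if isinstance(head, Sym) and head == "@prefix":
            args = statement[1:]
            if len(args) == 2:
                abbreviation, url = args  # maps an abbreviation to a URL prefix
                prefixes[abbreviation] = url
            elif len(args) == 1:
                default = args[0]  # a single argument sets the default
            else:
                raise ValueError(f"malformed @prefix statement: {statement!r}")
        else:
            statements.append(statement)
    return prefixes, default, statements


graph = [
    [Sym("@prefix"), "dc", "http://purl.org/dc/elements/1.1/"],
    [Sym("@prefix"), "http://evan.prodromou.name/"],
    [(Sym("creator"), Sym("dc")),
     [Sym("Rdf_serialization_to_s-expressions")],
     "Evan Prodromou"],
]
prefixes, default, statements = split_prefixes(graph)
print(prefixes)
print(default)
print(len(statements))
```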
Canonical version
The canonical version of a serialized graph uses none of the above abbreviations.
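As a sketch of what emitting the canonical form might involve, the Python below writes already-expanded statements using only full URI symbols, strings, and typed-literal pairs. `Sym`, `write_term`, and `write_canonical` are names invented for this example, and the one-statement-per-line layout is a choice made here, not something the proposal specifies. Blank nodes and symbols containing a `|` character are not handled.

```python
class Sym(str):
    """A Lisp symbol; plain str is reserved for string literals."""


def write_term(term):
    """Serialize a full-URI symbol, a string literal, or a typed-literal pair."""
    if isinstance(term, Sym):
        return f"|{term}|"
    if isinstance(term, tuple):
        # Typed literal: (lexical form, datatype symbol).
        lexical, dtype = term
        return f"({write_term(lexical)} . {write_term(dtype)})"
    if isinstance(term, str):
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"cannot serialize {term!r}")


def write_canonical(statements):
    """Write a graph as a list of three-element statements, one per line."""
    lines = ["(" + " ".join(write_term(t) for t in s) + ")" for s in statements]
    return "(" + "\n ".join(lines) + ")"


EX = "http://www.example.org/"
statements = [[Sym(EX + "loves"), Sym(EX + "bob"), Sym(EX + "fishing")]]
print(write_canonical(statements))
```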
Open questions
- It would be useful to let typed literals for numbers be written as plain S-expression integers and floats.
- Deal with RDF collections, especially lists!
- Make sure things are unambiguous. Blank node syntax is especially scary.
- Prefix notation is lovely and Lisp-y, but it makes lists of properties of a single subject kind of a hassle.
- Postfix notation for namespace abbreviations makes for easy-to-read versions of stuff in the default namespace but is in the opposite order of "normal" namespacing.
- Compare with KIF.
tags: rdf lisp scheme sexps s-expressions serialization kif




