rdf-rdfa

HTML+RDFa 1.1 and XHTML+RDFa 1.1 parser classes for the ZuzuScript rdf distribution.

This trial distribution provides:

HtmlRdfaParser (rdf/parser/html_rdfa) — implements HTML+RDFa 1.1 - Second Edition, parsing with html/parser so real-world tag soup is fine.
XhtmlRdfaParser (rdf/parser/xhtml_rdfa) — implements XHTML+RDFa 1.1 - Third Edition, parsing with std/data/xml; input is assumed to be well-formed XML.
RdfaCoreParser (rdf/parser/rdfa_core) — generic RDFa Core 1.1 processing for arbitrary XML, without any HTML or XHTML host-language rules.
CurieExpander (rdf/parser/rdfa_core) — a standalone class for expanding CURIEs, SafeCURIEs, and terms to full IRIs using the RDFa initial context, custom prefixes, a default vocabulary, and a base IRI.

All parsers support the optional RDFa features: @role (per the Role Attribute spec), property copying via rdfa:copy/rdfa:Pattern, and vocabulary expansion (RDFa Core 1.1 §10) behind the vocab_expansion option.

Example

from rdf/parser/html_rdfa import HtmlRdfaParser;

let quads := ( new HtmlRdfaParser() ).parse_string("""
<!DOCTYPE html>
<html>
  <body vocab="http://schema.org/">
    <div typeof="Person">
      <span property="name">Alice</span>
    </div>
  </body>
</html>
""", base: "http://example.com/doc");

Parsers compose the RdfParser trait from rdf/parser and support parse_string, parse_file, parse_lines, and parse_chunks. They accept the same base and into options as the main RDF parsers, plus:

vocab_expansion (Boolean) — apply RDFa vocabulary entailment over the vocabularies named by @vocab.
vocab_loader (Function) — fn (String iri) -> Array returning the quads of a vocabulary document; defaults to fetching over HTTP and parsing by content type.
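To give a feel for what vocabulary entailment adds, here is a minimal, language-neutral sketch in Python. It is not this distribution's implementation: it covers only rdfs:subPropertyOf and rdfs:subClassOf, treats triples as plain tuples of strings, and the function name expand_vocabulary and the http://example.com/vocab# vocabulary are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBPROPERTY = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def expand_vocabulary(doc_triples, vocab_triples):
    """Return doc_triples plus the triples entailed by the vocabulary axioms.

    Simplified: only subPropertyOf and subClassOf are applied.
    """
    # Index the vocabulary axioms: child -> set of direct parents.
    sub_property = {}
    sub_class = {}
    for s, p, o in vocab_triples:
        if p == RDFS_SUBPROPERTY:
            sub_property.setdefault(s, set()).add(o)
        elif p == RDFS_SUBCLASS:
            sub_class.setdefault(s, set()).add(o)

    # Worklist loop: newly inferred triples are re-examined, so chains of
    # axioms are followed until nothing new can be derived.
    result = set(doc_triples)
    pending = list(result)
    while pending:
        s, p, o = pending.pop()
        inferred = [(s, parent, o) for parent in sub_property.get(p, ())]
        if p == RDF_TYPE:
            inferred += [(s, p, parent) for parent in sub_class.get(o, ())]
        for triple in inferred:
            if triple not in result:
                result.add(triple)
                pending.append(triple)
    return result


# Hypothetical document triples and vocabulary axioms.
doc = {
    ("http://example.com/doc#alice", RDF_TYPE, "http://example.com/vocab#Student"),
    ("http://example.com/doc#alice", "http://example.com/vocab#nickname", "Al"),
}
vocab = {
    ("http://example.com/vocab#Student", RDFS_SUBCLASS, "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.com/vocab#nickname", RDFS_SUBPROPERTY, "http://xmlns.com/foaf/0.1/nick"),
}
expanded = expand_vocabulary(doc, vocab)
```

In this sketch the vocab argument plays the role of whatever a vocab_loader would return for the IRI named by @vocab; the expanded set contains the original triples plus one inferred type and one inferred property.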

CurieExpander can also be used on its own:

from rdf/parser/rdfa_core import CurieExpander;

let expander := ( new CurieExpander() ).with_prefixes({ ex: "http://example.com/" });
say( expander.expand("ex:thing") );     # http://example.com/thing
say( expander.expand("foaf:name") );    # initial context prefix
say( expander.expand("[dc:title]") );   # SafeCURIE
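For readers unfamiliar with CURIE expansion, the idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the CurieExpander implementation: the function name expand_curie is hypothetical, only two initial-context prefixes are included, and the attribute-specific rules of RDFa Core are collapsed into one function.

```python
from urllib.parse import urljoin

# A tiny subset of the RDFa 1.1 initial context, for illustration only.
INITIAL_CONTEXT = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}


def expand_curie(value, prefixes=None, vocab=None, base=None):
    """Expand a CURIE, SafeCURIE, or term to a full IRI (simplified)."""
    # Custom prefixes override the initial context.
    mapping = {**INITIAL_CONTEXT, **(prefixes or {})}

    # A SafeCURIE is a CURIE wrapped in square brackets.
    safe = value.startswith("[") and value.endswith("]")
    if safe:
        value = value[1:-1]

    prefix, sep, reference = value.partition(":")
    if sep and prefix in mapping:
        return mapping[prefix] + reference
    if safe:
        # An unresolvable SafeCURIE is ignored, never read as an IRI.
        return None
    if not sep and vocab is not None:
        # A bare term is resolved against the default vocabulary.
        return vocab + value
    # Otherwise treat the value as an IRI, resolved against the base if given.
    return urljoin(base, value) if base else value
```

For example, expand_curie("ex:thing", prefixes={"ex": "http://example.com/"}) yields http://example.com/thing, and expand_curie("name", vocab="http://schema.org/") yields http://schema.org/name.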

Test suite

tests/rdfa-testsuite.zzs runs the official RDFa 1.1 test suite for the html5 and xhtml1 host languages, plus the rdfa1.1-vocab vocabulary expansion tests. All applicable tests pass. The run is author-gated: it executes only when AUTHOR_TESTING=1 is set, and it is skipped when the fixtures are absent.

Fixtures are vendored into tests/rdfa11/ from <https://github.com/rdfa/rdfa.github.io>. To (re-)vendor them:

cd tests
# Start from a fresh shallow clone; git clone fails if the target already exists
rm -rf /tmp/rdfa-site
git clone --depth 1 https://github.com/rdfa/rdfa.github.io.git /tmp/rdfa-site
mkdir -p rdfa11
cp /tmp/rdfa-site/test-suite/manifest.jsonld rdfa11/
cp -r /tmp/rdfa-site/test-suite/test-cases/rdfa1.1/html5 rdfa11/
cp -r /tmp/rdfa-site/test-suite/test-cases/rdfa1.1/xhtml1 rdfa11/
# Remove the renamed targets first; otherwise cp -r nests the copy inside an existing directory
rm -rf rdfa11/vocab-html5 rdfa11/vocab-xhtml1
cp -r /tmp/rdfa-site/test-suite/test-cases/rdfa1.1-vocab/html5 rdfa11/vocab-html5
cp -r /tmp/rdfa-site/test-suite/test-cases/rdfa1.1-vocab/xhtml1 rdfa11/vocab-xhtml1
mkdir -p rdfa11/vocabs
cp /tmp/rdfa-site/vocabs/rdfa-test.html rdfa11/vocabs/

The vocabulary-expansion tests use the vendored copy of the http://rdfa.info/vocabs/rdfa-test# vocabulary (itself an HTML+RDFa document) through a file-backed vocab_loader, so no network access is needed.
