RDFStore


<TR><TH></TH><TH>st_num --></TH><TH>0</TH> <TH>1</TH> <TH>2</TH> <TH>3</TH> <TH>4</TH> <TH>5</TH> <TH>6</TH> <TH>7</TH> <TH>8</TH> <TH>...</TH></TR>
<TR><TH>KEY</TH><TH COLSPAN="12"></TH></TR>
<TR><TD>s0=s3</TD><TD></TD><TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD><TD>s0=s3 is (directly and indirectly) connected to all the other statements</TD></TR>
<TR><TD>s1=s2</TD><TD></TD><TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD><TD>s1=s2 is not connected to st(0)</TD></TR>
</TABLE>

<BR>

Similarly we can generate the P_CONNECTIONS and O_CONNECTIONS tables to map in the connections to the other two statement components: <BR>
<BR>


<TABLE BORDER="0">

<TR>

<TD VALIGN="top">
<TABLE BORDER="1">
<TR><TH COLSPAN="13">P_CONNECTIONS hash table</TH></TR>
<TR><TH></TH><TH></TH><TH COLSPAN="10">VALUE</TH></TR>
<TR><TH></TH><TH>st_num --></TH><TH>0</TH> <TH>1</TH> <TH>2</TH> <TH>3</TH> <TH>4</TH> <TH>5</TH> <TH>6</TH> <TH>7</TH> <TH>8</TH> <TH>...</TH></TR>
<TR><TH>KEY</TH><TH COLSPAN="12"></TH></TR>
<TR><TD>p0</TD><TD></TD><TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>p1</TD><TD></TD><TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>p2</TD><TD></TD><TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>p3</TD><TD></TD><TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
</TABLE>
</TD>

<TD VALIGN="top">
<TABLE BORDER="1">
<TR><TH COLSPAN="13">O_CONNECTIONS hash table</TH></TR>
<TR><TH></TH><TH></TH><TH COLSPAN="10">VALUE</TH></TR>
<TR><TH></TH><TH>st_num --></TH><TH>0</TH> <TH>1</TH> <TH>2</TH> <TH>3</TH> <TH>4</TH> <TH>5</TH> <TH>6</TH> <TH>7</TH> <TH>8</TH> <TH>...</TH></TR>
<TR><TH>KEY</TH><TH COLSPAN="12"></TH></TR>
<TR><TD>o0</TD><TD></TD><TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>o1</TD><TD></TD><TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>o2</TD><TD></TD><TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
<TR><TD>o3</TD><TD></TD><TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>000</TD></TR>
</TABLE>
</TD>

</TR>
</TABLE>
<BR>

By using the three additional tables above it is then possible to run RDQL (or XPath) queries very efficiently, spanning different connected statements, on the same storage, or even on different storages distributed over the Web as RDF/XML (or stored in some ...
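<BR>
As an illustration only (a minimal sketch, not RDFStore's actual implementation), finding the statements connected to several components at once reduces to a bitwise AND over the corresponding connection rows. The Python below copies two rows from each of the S_CONNECTIONS and P_CONNECTIONS tables above; the helper names to_mask, statements and connected_to_all are hypothetical and introduced only for this sketch.<BR>

```python
# Rows copied from the example tables; index = st_num (0..9).
S_CONNECTIONS = {
    "s0=s3": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "s1=s2": [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
}
P_CONNECTIONS = {
    "p0": [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "p1": [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
}


def to_mask(bits):
    """Pack a row of 0/1 flags into an integer; bit n stands for st_num n."""
    mask = 0
    for st_num, bit in enumerate(bits):
        if bit:
            mask |= 1 << st_num
    return mask


def statements(mask):
    """List the statement numbers whose bit is set in the mask."""
    result = []
    st_num = 0
    while mask:
        if mask & 1:
            result.append(st_num)
        mask >>= 1
        st_num += 1
    return result


def connected_to_all(*rows):
    """Statements connected to every given component (bitwise AND of rows)."""
    mask = to_mask(rows[0])
    for row in rows[1:]:
        mask &= to_mask(row)
    return statements(mask)


if __name__ == "__main__":
    # Statements connected to both subject s1=s2 and predicate p0.
    print(connected_to_all(S_CONNECTIONS["s1=s2"], P_CONNECTIONS["p0"]))
```

Packing each row into one integer keeps the intersection a single AND per component, which is the reason a bitmap layout suits this kind of query.<BR>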

Even if the graph is extremely large it is generally possible to store the above (sparse) matrix efficiently by making use of some specific properties. We will briefly discuss these and the hybrid run-length/variable-length encoding algorithm used by...

<H2>The compression algorithm</H2>

Both the graph and the free-text word index are relatively sparsely populated, which makes simple compression possible. The bit arrays used in each can grow to very significant sizes; on the order of several, if not tens of, page multiples. Comb...
Initially a Run Length Encoding method was used, with two small optimizations. The first optimization was early termination: if the remainder of a row contained only zeros, those were simply not listed explicitly. The second optimizati...
The first issue is that certain values, such as a reference to a schema or a common property, are disproportionately over-represented, by several orders of magnitude (e.g. the rdf:type property or contextual information). Secondly certain other values; suc...
So for this reason a variant of the Variable Run Length encoding is used along with part of the above RLE method. This method is still applicable to the word indexing but adds the ability to recognize short patterns; and code the patterns which occur...
At this point in time (de-)compression is such that the storage volumes are reasonable, that transfer volumes are manageable and we do not expect to give priority to work in this area. However we expect to examine this issue again and will be looking...
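<BR>
The run-length encoding with early termination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder shipped with RDFStore: the (value, run length) pair format and the function names rle_encode and rle_decode are hypothetical, and neither the second optimization nor the variable-length pattern coding is modelled.<BR>

```python
def rle_encode(bits):
    """Run-length encode a row of 0/1 flags as (value, run_length) pairs.

    Early termination: a trailing run of zeros is not emitted at all.
    """
    end = len(bits)
    while end > 0 and bits[end - 1] == 0:
        end -= 1  # skip the all-zero remainder of the row

    runs = []
    i = 0
    while i < end:
        j = i
        while j < end and bits[j] == bits[i]:
            j += 1
        runs.append((bits[i], j - i))
        i = j
    return runs


def rle_decode(runs, length):
    """Rebuild a row of the given length; the omitted tail is all zeros."""
    bits = []
    for value, count in runs:
        bits.extend([value] * count)
    bits.extend([0] * (length - len(bits)))
    return bits


if __name__ == "__main__":
    row = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    encoded = rle_encode(row)
    print(encoded)
    print(rle_decode(encoded, len(row)) == row)
```

Because the decoder pads with zeros up to the known row length, an entirely empty row encodes to an empty list.<BR>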

<H2>Conclusion: RDFStore</H2>

RDFStore <a href="#47">[47]</a> is a Perl/C toolkit to process, store, retrieve and manage RDF; it consists of a programming API, streaming RDF/XML and N-Triples parsers and a generic hashed data storage which implements the indexing algorithm as des...
<BR>
RDFStore has been successfully used for the development of several Semantic Web applications <a href="#16a">[16a]</a><a href="#16b">[16b]</a><a href="#16c">[16c]</a> and others which read/write and query RDF descriptions using RDQL.

<H2>References</H2>

<a name="1">[1]</a> "A Relational Model of Data for Large Shared Data Banks", E.F. Codd, Communications of the ACM, Vol. 13, No. 6, June 1970, pp. 377-387. <a href="http://www.acm.org/classics/nov95/toc.html">http://www.acm.org/classics/nov95/toc.htm...
<a name="2">[2]</a> P. Buneman, S. Davidson, G. Hillebrand and D. Suciu, "A query language and optimization techniques for unstructured data". In SIGMOD, San Diego, 1996<BR>
<a name="3">[3]</a> S. Abiteboul, D. Quass, J. McHugh, J. Widom and J. Wiener "The lorel query language for semistructured data" 1996 ftp://db.stanford.edu/pub/papers/lorel96.ps<BR>
<a name="4">[4]</a> Dan Brickley, R.V. Guha "RDF Vocabulary Description Language 1.0: RDF Schema" <a href="http://www.w3.org/TR/rdf-schema/">http://www.w3.org/TR/rdf-schema/</a><BR>
<a name="5">[5]</a> Grady Booch "Object-Oriented Analysis and Design with Applications" p. 71-72<BR>
<a name="6">[6]</a> Aimilia Magkanaraki, Sofia Alexaki, Vassilis Christophides, Dimitris Plexousakis "Benchmarking RDF Schemas for the Semantic Web" <a href="http://139.91.183.30:9090/RDF/publications/iswc02.PDF">http://139.91.183.30:9090/RDF/publication...
<a name="7">[7]</a> S. Abiteboul "Querying Semi-Structured Data" 1997 <a href="http://citeseer.nj.nec.com/abiteboul97querying.html">http://citeseer.nj.nec.com/abiteboul97querying.html</a><BR>
<a name="8">[8]</a> S. Abiteboul and Victor Vianu, "Queries and Computation on the Web" 1997 <a href="http://citeseer.nj.nec.com/abiteboul97queries.html">http://citeseer.nj.nec.com/abiteboul97queries.html</a><BR>
<a name="9">[9]</a> Dan Suciu, "An overview of semistructured data" <a href="http://citeseer.nj.nec.com/160105.html">http://citeseer.nj.nec.com/160105.html</a><BR>
<a name="10">[10]</a> Graham Klyne, 13-Mar-2002 "Circumstance, provenance and partial knowledge - Limiting the scope of RDF assertions" <a href="http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html">http://www.ninebynine.org/RDFNotes/UsingCon...
<a name="11">[11]</a> John F. Sowa, "Knowledge Representation:  Logical, Philosophical, and Computational Foundations", Brooks Cole Publishing Co., ISBN  0-534-94965-7<BR>
<a name="12">[12]</a> Graham Klyne, 18 October 2000 "Contexts for RDF Information Modelling" <a href="http://public.research.mimesweeper.com/RDF/RDFContexts.html">http://public.research.mimesweeper.com/RDF/RDFContexts.html</a><BR>
<a name="13">[13]</a> Seth Russell, 7 August 2002 "Quads" <a href="http://robustai.net/sailor/grammar/Quads.html">http://robustai.net/sailor/grammar/Quads.html</a><BR>
<a name="14">[14]</a> T. Berners-Lee, Dan Connolly "Notation3" <a href="http://www.w3.org/2000/10/swap/doc/Overview.html">http://www.w3.org/2000/10/swap/doc/Overview.html</a><BR>
<a name="15">[15]</a> Dave Beckett, "Contexts Thoughts" <a href="http://www.redland.opensource.ac.uk/notes/contexts.html">http://www.redland.opensource.ac.uk/notes/contexts.html</a><BR>
<a name="16a">[16a]</a> Asemantics S.r.l. "Image ShowCase (ISC)" <a href="http://demo.asemantics.com/biz/isc/">http://demo.asemantics.com/biz/isc/</a><BR>
<a name="16b">[16b]</a> Asemantics S.r.l. "Last Minute News (LMN)" <a href="http://demo.asemantics.com/biz/radio/">http://demo.asemantics.com/biz/radio/</a><BR>
<a name="16c">[16c]</a> Asemantics S.r.l. "The News Blender (NB)" <a href="http://demo.asemantics.com/biz/lmn/nb/">http://demo.asemantics.com/biz/lmn/nb/</a><BR>
<a name="17">[17]</a> GINF <a href="http://www-diglib.stanford.edu/diglib/ginf/">http://www-diglib.stanford.edu/diglib/ginf/</a><BR>
<a name="18">[18]</a> Jena <a href="http://www.hpl.hp.com/semweb/">http://www.hpl.hp.com/semweb/</a><BR>
<a name="19">[19]</a> Algae <a href="http://www.w3.org/1999/02/26-modules/User/Algae-HOWTO.html">http://www.w3.org/1999/02/26-modules/User/Algae-HOWTO.html</a><BR>
<a name="20">[20]</a> RDFSuite <a href="http://139.91.183.30:9090/RDF/">http://139.91.183.30:9090/RDF/</a><BR>
<a name="21">[21]</a> Wraf <a href="http://wraf.org/RDF-Service/doc/html/wraf.html">http://wraf.org/RDF-Service/doc/html/wraf.html</a><BR>
<a name="22">[22]</a> PARKA-DB <a href="http://www.cs.umd.edu/projects/plus/Parka/parka-db.html">http://www.cs.umd.edu/projects/plus/Parka/parka-db.html</a><BR>
<a name="23">[23]</a> RDFGateway <a href="http://www.intellidimension.com/pages/site/products/rdfgateway.rsp">http://www.intellidimension.com/pages/site/products/rdfgateway.rsp</a><BR>
<a name="24">[24]</a> 3Store <a href="http://sourceforge.net/projects/threestore/">http://sourceforge.net/projects/threestore/</a><BR>
<a name="25">[25]</a> TAP <a href="http://tap.stanford.edu/">http://tap.stanford.edu/</a><BR>
<a name="26">[26]</a> Inkling <a href="http://swordfish.rdfweb.org/rdfquery/">http://swordfish.rdfweb.org/rdfquery/</a><BR>
<a name="27">[27]</a> RubyRDF <a href="http://www.w3.org/2001/12/rubyrdf/intro.html">http://www.w3.org/2001/12/rubyrdf/intro.html</a><BR>
<a name="28">[28]</a> 4RDF <a href="http://Fourthought.com/">http://Fourthought.com/</a><BR>
<a name="29">[29]</a> Sesame <a href="http://sesame.aidministrator.nl/">http://sesame.aidministrator.nl/</a><BR>
<a name="30">[30]</a> KAON REVERSE <a href="http://kaon.semanticweb.org/alphaworld/reverse/view">http://kaon.semanticweb.org/alphaworld/reverse/view</a><BR>
<a name="31">[31]</a> D2R <a href="http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm">http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm</a><BR>
<a name="32">[32]</a> DBVIEW <a href="http://www.w3.org/2000/10/swap/dbork/dbview.py">http://www.w3.org/2000/10/swap/dbork/dbview.py</a><BR>
<a name="33">[33]</a> Virtuoso <a href="http://www.openlinksw.com/virtuoso/">http://www.openlinksw.com/virtuoso/</a><BR>
<a name="34">[34]</a> Federate <a href="http://www.w3.org/2003/01/21-RDF-RDB-access/">http://www.w3.org/2003/01/21-RDF-RDB-access/</a><BR>
<a name="35">[35]</a> Triple querying with SQL <a href="http://www.picdiary.com/triplequerying/">http://www.picdiary.com/triplequerying/</a><BR>
<a name="36">[36]</a> Squish-to-SQL <a href="http://rdfweb.org/2002/02/java/squish2sql/intro.html">http://rdfweb.org/2002/02/java/squish2sql/intro.html</a><BR>
<a name="37">[37]</a> Jena2 Database interface <a href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/jena/jena2/doc/DB/index.html">http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/jena/jena2/doc/DB/index.html</a><BR>
<a name="38">[38]</a> Mapping Semantic Web data with RDBMSes <a href="http://www.w3.org/2001/sw/Europe/reports/scalable_rdbms_mapping_report/">http://www.w3.org/2001/sw/Europe/reports/scalable_rdbms_mapping_report/</a><BR>
<a name="39">[39]</a> BerkeleyDB Sleepycat <a href="http://www.sleepycat.com/">http://www.sleepycat.com/</a><BR>
<a name="40">[40]</a> rdfdb <a href="http://www.guha.com/rdfdb/">http://www.guha.com/rdfdb/</a><BR>
<a name="41">[41]</a> Redland <a href="http://www.redland.opensource.ac.uk/">http://www.redland.opensource.ac.uk/</a><BR>
<a name="42">[42]</a> rdflib <a href="http://rdflib.net/">http://rdflib.net/</a><BR>
<a name="43">[43]</a> DAML DB <a href="http://www.daml.org/2001/09/damldb/">http://www.daml.org/2001/09/damldb/</a><BR>
<a name="44">[44]</a> ODP Search <a href="http://dmoz.org/ODPSearch/">http://dmoz.org/ODPSearch/</a><BR>
<a name="45">[45]</a> Dave Beckett "RDF/XML Syntax Specification (Revised)" <a href="http://www.w3.org/TR/rdf-syntax-grammar/">http://www.w3.org/TR/rdf-syntax-grammar/</a><BR>
<a name="46">[46]</a> Jan Grant, Dave Beckett "RDF Test Cases" <a href="http://www.w3.org/TR/rdf-testcases/">http://www.w3.org/TR/rdf-testcases/</a><BR>
<a name="47">[47]</a> Alberto Reggiori, Dirk-Willem van Gulik, RDFStore, <a href="http://rdfstore.sourceforge.net">http://rdfstore.sourceforge.net</a><BR>
<a name="48">[48]</a> Graham Klyne, Jeremy J. Carroll "Resource Description Framework (RDF): Concepts and Abstract Syntax" <a href="http://www.w3.org/TR/rdf-concepts/">http://www.w3.org/TR/rdf-concepts/</a><BR>
<a name="49">[49]</a> Patrick Hayes "RDF Semantics" <a href="http://www.w3.org/TR/rdf-mt/">http://www.w3.org/TR/rdf-mt/</a><BR>
<a name="50">[50]</a> Miller L., Seaborne A., Reggiori A. "Implementations of SquishQL, a simpler RDF Query Language", 1st International Semantic Web Conference, Sardinia, 2002<BR>
<a name="51">[51]</a> Jon Orwant, Jarkko Hietaniemi, John Macdonald "Mastering Algorithms with Perl", 1st Edition, August 1999, ISBN 1-56592-398-7, p. 287<BR>
<a name="52">[52]</a> Unicode Caseless Matching <a href="http://www.unicode.org/unicode/reports/tr21/#Caseless_Matching">http://www.unicode.org/unicode/reports/tr21/#Caseless_Matching</a><BR>
<a name="53">[53]</a> Robert MacGregor and In-Young Ko "Representing Contextualized Data using Semantic Web Tools" <a href="http://km.aifb.uni-karlsruhe.de/ws/psss03/proceedings/macgregor-et-al.pdf">http://km.aifb.uni-karlsruhe.de/ws/psss03/proceedin...


