Reading more RDF serialization formats for SANSA-RDF
SANSA RDF is a library for reading RDF files into Spark. SANSA RDF Reader extends the io package of SANSA RDF to read the N-Quads, Turtle and RDF/XML serialization formats of RDF.
This package reads N-Quads, Turtle and RDF/XML files and loads them into Spark RDDs, DataFrames and GraphX Graphs.
The main application class is sansa_rdf.App.
The application requires one application argument: the path to the input RDF file (for example, data/stw.rdf).
To run the application on a standalone Spark cluster
Build the application with Maven
cd /path/to/application
mvn clean package
Submit the application to the Spark cluster
spark-submit \
--class sansa_rdf.App \
--master spark://spark-master:7077 \
target/RDF_Reader-1.0-SNAPSHOT.jar \
/data/input
To run a single reader on its own, replace the value of --class with one of sansa_rdf.io.NQuadReader, sansa_rdf.io.TurtleReader or sansa_rdf.io.XmlReader.
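As a sketch of the class substitution described above, the following hypothetical helper script (not part of the project) maps a format name to one of the three reader classes and prints the matching spark-submit command instead of running it. The format names nquads, turtle and rdfxml, the function names reader_class and submit_command, and the pairing of XmlReader with RDF/XML are assumptions made for illustration. The master URL, jar path and input path are copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the project.

# Map an (assumed) format name to one of the reader classes listed above.
reader_class() {
  case "$1" in
    nquads) echo "sansa_rdf.io.NQuadReader" ;;
    turtle) echo "sansa_rdf.io.TurtleReader" ;;
    rdfxml) echo "sansa_rdf.io.XmlReader" ;;  # assumption: XmlReader handles RDF/XML
    *) echo "unknown format: $1" >&2; return 1 ;;
  esac
}

# Print (do not execute) the spark-submit command for a format and an input path.
submit_command() {
  reader=$(reader_class "$1") || return 1
  printf 'spark-submit --class %s --master %s %s %s\n' \
    "$reader" "spark://spark-master:7077" \
    "target/RDF_Reader-1.0-SNAPSHOT.jar" "$2"
}

submit_command turtle /data/input
# prints: spark-submit --class sansa_rdf.io.TurtleReader --master spark://spark-master:7077 target/RDF_Reader-1.0-SNAPSHOT.jar /data/input
```

Printing the command keeps the script safe to try on a machine without Spark. To submit for real, run the printed line on a host where spark-submit is on the PATH.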