forked from perlrdf/RDF-LinkedData
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
215 lines (158 loc) · 7.53 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
RDF::LinkedData - A Linked Data server implementation
DESCRIPTION
This module is used to create a Linked Data server that can
serve RDF data out of an RDF::Trine::Model. It will look up URIs in
the model and do the right thing (known as the 303 dance) and mint
URLs for that, as well as content negotiation. Thus, you can
concentrate on URIs for your things, you need not be concerned about
minting URLs for the pages to serve it. In addition, optional modules
can provide other important functionality: Cross-origin resource
sharing, VoID description, cache headers, SPARQL Endpoint, Triple
Pattern Fragments, etc. As such, it encompasses a fair share of
Semantic Web best practices, but possibly not in a very flexible Big
Data manner.
INSTALLATION
On Debian and derivatives, such as Ubuntu, this module can be
installed with all its dependencies using
apt-get install librdf-linkeddata-perl
as root or using sudo.
To install the most recent module, it is likely that you already have
the cpan tool installed. Then just run it on the command line. If you
don't have it, see http://www.cpan.org/modules/INSTALL.html
Then, in the cpan tool, type
install RDF::LinkedData
The relevant scripts and modules will be install to different paths
depending on your system. To use it, you need to find the script
linked_data.psgi, e.g. using locate.
CONFIGURATION
*Quick setup for a demo*
One-liner
It is possible to make it run with a single command line, e.g.:
PERLRDF_STORE="Memory;path/to/some/data.ttl" plackup -host localhost script/linked_data.psgi
This will start a server with the default config on localhost on port
5000, so the URIs you're going serve from the file data.ttl will have
to have a base URI http://localhost:5000/.
Using perlrdf command line tool
A slightly longer example requires App::perlrdf, but sets up a
persistent SQLite-based triple store, parses a file and gets the
server with the default config running:
export PERLRDF_STORE="DBI;mymodel;DBI:SQLite:database=rdf.db"
perlrdf make_store
perlrdf store_load path/to/some/data.ttl
plackup -host localhost script/linked_data.psgi
*Configuration*
To configure the system for production use, create a configuration
file rdf_linkeddata.json that looks something like:
{
"base_uri" : "http://localhost:5000/",
"store" : {
"storetype" : "Memory",
"sources" : [ {
"file" : "/path/to/your/data.ttl",
"syntax" : "turtle"
} ]
},
"endpoint": {
"html": {
"resource_links": true
}
},
"cors": {
"origins": "*"
},
"void": {
"pagetitle": "VoID Description for my dataset"
},
"expires" : "A86400" ,
"fragments" : {
"fragments_path" : "/fragments" ,
"allow_dump_dataset" : 0
}
}
In your shell set
export RDF_LINKEDDATA_CONFIG=/to/where/you/put/rdf_linkeddata.json
If the linked_data.psgi script was installed in /usr/local/bin, go:
plackup /usr/local/bin/linked_data.psgi --host localhost --port 5000
The endpoint-part of the config sets up a SPARQL Endpoint. This requires
the RDF::Endpoint module, which is recommended by this module. To
use it, it needs to have some config, but will use defaults.
It is also possible to set an expires time. This needs
Plack::Middleware::Expires and uses Apache mod_expires syntax, in the
example above, it will set an expires header for all resources to
expire after 1 day of access. It is strongly recommended that this is
used, as it can potentially speed up access to resources that aren't
accessed frequently considerably, and take load off your server.
The cors-part of the config enables Cross-Origin Resource
Sharing, which is a W3C Recommendation for relaxing security
constraints to allow data to be shared across domains. In most cases,
this is what you want when you are serving open data, but in some
cases, notably intranets, this should be turned off by removing this
part.
The void-part generates some statistics and a description of the
dataset, using RDF::Generator::Void. It is strongly recommended to
install and run that, but it can take some time to generate, so you
may have to set the detail level.
Finally, fragments add support for Triple Pattern Fragments, a
work-in-progress, initiated by http://linkeddatafragments.org/
It is a more lightweight but less powerful way to query RDF data than
SPARQL. If you have this, it is recommended to have CORS enabled and
required to have at least a minimal VoID setup.
It is also worth noting that an environment variable LOG_ADAPTER can
be set to send log statements to the console. I recommend installing
Log::Any::Adapter::Screen, which can be used by setting this variable
like LOG_ADAPTER=Screen. Its documentation details more environment
variables that can be used to control log levels, etc.
This module now also contains some facilities to support read-write
hypermedia RDF, as well as the possibility to subclass RDF::LinkedData
itself. The read-write functionality is provided by a separate module,
RDF::LinkedData::RWHypermedia, but is highly experimental at this
point.
*Production server setup*
In addition to the configuration above, a production system should set
up a real Web server to run the Plack script. There are many ways to
do this (as Plack provides an elegant separation of concerns between
developers and system administrators).
To set this up under Apache, put this in the host configuration:
<Location />
SetHandler perl-script
PerlResponseHandler Plack::Handler::Apache2
SetEnv RDF_LINKEDDATA_CONFIG /to/where/you/put/rdf_linkeddata.json
PerlSetVar psgi_app /usr/local/bin/linked_data.psgi
</Location>
<Perl>
use Plack::Handler::Apache2;
$ENV{RDF_LINKEDDATA_CONFIG}='/to/where/you/put/rdf_linkeddata.json';
Plack::Handler::Apache2->preload("/usr/local/bin/linked_data.psgi");
</Perl>
<Location ~ "^/(dumps|js/|css/|favicon.ico)">
SetHandler default-handler
</Location>
Note that in some environments, for example if the Plack server
is dynamically configured and/or behind a proxy server, the server
may fail to bind to the address you give it as hostname. In this case,
it is wise to allow the server to bind to any public IP address,
i.e. set the host name to 0.0.0.0.
AUTHOR
Kjetil Kjernsmo, "<[email protected]>"
FUTURE
As the primary author of this module is now working on the Solid
Project, see https://solidproject.org/ , he isn't very interested in
further development of this module, as a Solid server would supersede
this module and allow much more advanced use cases.
BUGS
Please report any bugs using github
<https://github.com/perlrdf/RDF-LinkedData/issues>
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc RDF::LinkedData
The perlrdf IRC channel is the right place to seek help and discuss this module:
irc://irc.perl.org/#perlrdf
ACKNOWLEDGEMENTS
This module was started by Gregory Todd Williams "<[email protected]>"
for RDF::LinkedData::Apache, but has been almost totally rewritten.
COPYRIGHT & LICENSE
Copyright 2010 Gregory Todd Williams
Copyright 2010 ABC Startsiden AS
Copyright 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 Kjetil Kjernsmo
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.