[![latest_version](https://img.shields.io/crates/v/pschema-rs)](https://crates.io/crates/pschema-rs)
[![documentation](https://img.shields.io/docsrs/pschema-rs/latest)](https://docs.rs/pschema-rs/latest/pschema_rs/)

- `pschema-rs` is a Rust library that provides a Pregel-based schema validation algorithm for generating subsets of data
+ `pschema-rs` is a Rust library that provides a Pregel-based schema validation algorithm for generating subsets of data
from Wikidata. It is designed to be efficient, scalable, and easy to use, making it suitable for a wide range of applications
that involve processing large amounts of data from Wikidata.

## Features

- - **Pregel-based schema validation**: `pschema-rs` uses the Pregel model, a graph-based computation model, to perform
- schema validation on Wikidata entities. This allows for efficient and scalable processing of large datasets.
+ - **Pregel-based schema validation**: `pschema-rs` uses the Pregel model, a graph-based computation model, to perform
+ schema validation on Wikidata entities. This allows for efficient and scalable processing of large datasets.

- **Rust implementation**: `pschema-rs` is implemented in Rust, a systems programming language known for its performance,
- memory safety, and concurrency features. This ensures that the library is fast, reliable, and safe to use.
+ memory safety, and concurrency features. This ensures that the library is fast, reliable, and safe to use.

- - **Wikidata subset generation**: `pschema-rs` provides functionality to generate subsets of data from Wikidata based on
- schema validation rules. This allows users to filter and extract relevant data from Wikidata based on their specific
- requirements.
+ - **Wikidata subset generation**: `pschema-rs` provides functionality to generate subsets of data from Wikidata based on
+ schema validation rules. This allows users to filter and extract relevant data from Wikidata based on their specific
+ requirements.

- - **Customizable validation rules**: `pschema-rs` allows users to define their own validation rules using a simple and
- flexible syntax. This makes it easy to customize the schema validation process according to the specific needs of a given
- application.
+ - **Customizable validation rules**: `pschema-rs` allows users to define their own validation rules using a simple and
+ flexible syntax. This makes it easy to customize the schema validation process according to the specific needs of a given
+ application.

- **Easy-to-use API**: `pschema-rs` provides a user-friendly API that makes it easy to integrate the library into any Rust
- project. The API provides a high-level interface for performing schema validation and generating Wikidata subsets, with
- comprehensive documentation and examples to help users get started quickly.
+ project. The API provides a high-level interface for performing schema validation and generating Wikidata subsets, with
+ comprehensive documentation and examples to help users get started quickly.

## Installation

To use `pschema-rs` in your Rust project, you can add it as a dependency in your `Cargo.toml` file:

```toml
[dependencies]
- pschema = "0.0.2"
+ pschema = "0.0.4"
```

## Usage

- Here's an example of how you can use `pschema-rs` to perform schema validation and generate a subset of data from Wikidata.
- Note that what we are doing here is first, defining the `ShapeExpression` we want the algorithm to validate. Next, we import
- the Wikidata entities from a file. Note that the import methods we have defined create an edge DataFrame, and as such, we
- need to call to the function `GraphFrame::from_edges(edges)`, which will build the GraphFrame from the imported edges. Lastly,
- by calling `PSchema::new(start).validate(graph)`, we will both construct the `PSchema` algorithm provided the `ShapeExpression`
- we have defined, first, and create the subset of the graph, second. Then, we print the results. Note that we can also export
- the results to a file. See the [examples](https://github.com/angelip2303/pschema-rs/tree/main/examples) for more information.
-
- ```rust
- use pregel_rs::graph_frame::GraphFrame;
- use pschema_rs::backends::duckdb::DuckDB;
- use pschema_rs::backends::Backend;
- use pschema_rs::pschema::PSchema;
- use pschema_rs::shape::shex::Shape;
- use pschema_rs::shape::shex::NodeConstraint;
- use pschema_rs::shape::shex::TripleConstraint;
- use wikidata_rs::id::Id;
-
- fn main() -> Result<(), String> {
-     // Define validation rules
-     let start = Shape::TripleConstraint(TripleConstraint::new(
-         "City",
-         u32::from(Id::from("P31")),
-         NodeConstraint::Value(u32::from(Id::from("Q515"))),
-     ));
-
-     // Load Wikidata entities
-     let edges = DuckDB::import("./examples/from_duckdb/3000lines.duckdb")?;
-
-     // Perform schema validation
-     match GraphFrame::from_edges(edges) {
-         Ok(graph) => match PSchema::new(start).validate(graph) {
-             Ok(result) => {
-                 println!("Schema validation result:");
-                 println!("{:?}", result);
-                 Ok(())
-             }
-             Err(error) => Err(error.to_string()),
-         },
-         Err(error) => Err(format!("Cannot create a GraphFrame: {}", error)),
-     }
- }
-
- ```
-
- You could also run one of the examples to check how this library works:
-
- ```sh
- cargo build
- cargo run --example from_duckdb
- ```
-
- Or follow the guidelines explained in [examples/from_uniprot](https://github.com/angelip2303/pschema-rs/tree/main/examples/from_uniprot)
- where a more detailed use-case is shown.
-
- For more information on how to define validation rules, load entities from Wikidata, and process subsets of data, refer
- to the documentation.
+ TBD

## Related projects
@@ -114,11 +58,11 @@ the Free Software Foundation, either version 3 of the License, or

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org/licenses/>.
+ along with this program. If not, see <https://www.gnu.org/licenses/>.

**By contributing to this project, you agree to release your
contributions under the same license.**