Problem querying PhySH subject headings using SPARQL: not all rows are returned #149

@nloyola

ARC2 does not return all rows when querying the PhySH RDF file for disciplines: it returns only 12 of the 18 disciplines.

If you go to the PhySH page, you can see that there are 18 concepts listed under Discipline. PhySH provides an RDF file for download at their GitHub page here:

https://github.com/physh-org/PhySH

I have taken the RDF file and made it available over HTTP here:

http://nloyola.asuscomm.com:8000/physh.rdf

I'm using the following script to query the disciplines:

<?php
require 'vendor/autoload.php';

$options = getopt("ld");

$config = array(
    /* db */
    'db_host' => 'localhost',
    'db_name' => 'physh_rdf',
    'db_user' => 'user',
    'db_pwd' => 'secret',

    /* store name (= table prefix) */
    'store_name' => 'physh_store',
);

$store = ARC2::getStore($config);

if (!$store->isSetUp()) {
    $store->setUp();
}

if (array_key_exists('l', $options)) {
    $store->query('LOAD <http://nloyola.asuscomm.com:8000/physh.rdf>');
}

if (array_key_exists('d', $options)) {
    $store->dump();
    exit(0);
}

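// Print any errors recorded by the store and stop; otherwise pass the result through.
function queryCheckError($store, $result) {
    if ($store->getErrors()) {
        print_r($store->getErrors());
        exit(1); // non-zero exit status signals failure
    }
    return $result;
}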

function getDisciplines($store) {
    $q = '
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX physh: <https://doi.org/10.29172/>
PREFIX physh_rdf: <https://physh.org/rdf/2018/01/01/core#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
   ?s ?p physh_rdf:Discipline .
   ?s dcterms:title ?title .
   #?s physh_rdf:prefLabel ?label .
   #?s dcterms:description ?description .
}
';

    return queryCheckError($store, $store->query($q));
}

function showResult($result) {
    $rows = $result['result']['rows'];
    $numRows = count($rows);
    print("rows: {$numRows}\n");

    print(json_encode($rows, JSON_PRETTY_PRINT) . "\n");

    //print_r($result);
    // foreach ($rows as $k => $v) {
    //     print($k . ": " . json_encode($v, JSON_PRETTY_PRINT) . "\n");
    // }
}

$result = getDisciplines($store);
showResult($result);

When I run this script, it returns only 12 rows.

If I run the same SPARQL query with the Rasqal RDF Query Library, 19 rows are returned. I believe the extra row corresponds to the root entry.
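
To narrow down which entries are missing, the two result sets could be compared by subject URI. The following is only a rough sketch: `distinctSubjects` and `missingSubjects` are hypothetical helpers, not part of ARC2, and it assumes each row is an associative array with an `s` key (matching the `?s` variable in the query above). The `urn:example:` values are placeholders, not real PhySH identifiers; the expected list would have to be filled in by hand from the other engine's output.

```php
<?php
// Hypothetical helper: collect the distinct values bound to ?s
// in the rows of a SELECT result (rows as in $result['result']['rows']).
function distinctSubjects(array $rows): array {
    $subjects = [];
    foreach ($rows as $row) {
        if (isset($row['s'])) {
            $subjects[$row['s']] = true; // array keys de-duplicate the URIs
        }
    }
    return array_keys($subjects);
}

// Hypothetical helper: subjects that appear in $expected but not in $rows.
function missingSubjects(array $rows, array $expected): array {
    return array_values(array_diff(array_unique($expected), distinctSubjects($rows)));
}

// Placeholder data standing in for the ARC2 rows and for the
// subject URIs reported by another engine.
$rows = [
    ['s' => 'urn:example:a', 'title' => 'A'],
    ['s' => 'urn:example:b', 'title' => 'B'],
];
$expected = ['urn:example:a', 'urn:example:b', 'urn:example:c'];

print_r(missingSubjects($rows, $expected));
```

In the script above, `$rows` would be `$result['result']['rows']`. Knowing exactly which subjects are dropped may show whether they share something, such as a particular title or a second matching predicate.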

Note that I'm using MariaDB as my database server, with PHP 8.1.12 running on Debian 11.

Any help with this issue is greatly appreciated.
