Update or synchronize AERS linksets and datasets #13

Open
stain opened this issue Oct 21, 2015 · 3 comments


stain commented Oct 21, 2015

As pointed out in the support portal by Andrea Splendiani, some of the strange AERS URIs in the loaded AERS dataset don't exist in the corresponding AERS-Drugbank linkset.

Example, from the dataset faers-of-2012-generated-on-2012-07-09.nt:

<http://aers.data2semantics.org/resource/drug/ATARAX-P_%28HYDROXYZINE_HYDROCHLORIDE%29_SYRUP> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://aers.data2semantics.org/vocab/Drug> .
<http://aers.data2semantics.org/resource/drug/ATARAX-P_%28HYDROXYZINE_HYDROCHLORIDE%29_SYRUP> <http://www.w3.org/2000/01/rdf-schema#label> "ATARAX-P (HYDROXYZINE HYDROCHLORIDE) SYRUP" .
<http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://aers.data2semantics.org/vocab/Drug> .
<http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://aers.data2semantics.org/vocab/Drug> .
<http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> <http://www.w3.org/2000/01/rdf-schema#label> "ATARAX-P                           /00058402/" .
<http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> <http://www.w3.org/2000/01/rdf-schema#label> "ATARAX-P                           /00058402/" .
<http://aers.data2semantics.org/resource/involvement/8046146/1018523970> <http://aers.data2semantics.org/vocab/drug> <http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> .
<http://aers.data2semantics.org/resource/involvement/8046146/1018523970> <http://www.w3.org/2000/01/rdf-schema#label> "Involvement of ATARAX-P                           /00058402/ in #8046146" .
<http://aers.data2semantics.org/resource/involvement/8130835/1018840039> <http://aers.data2semantics.org/vocab/drug> <http://aers.data2semantics.org/resource/drug/ATARAX-P___________________________%2F00058402%2F> .
<http://aers.data2semantics.org/resource/involvement/8130835/1018840039> <http://www.w3.org/2000/01/rdf-schema#label> "Involvement of ATARAX-P                           /00058402/ in #8130835" .
<http://aers.data2semantics.org/resource/involvement/8152200/1018970802> <http://aers.data2semantics.org/vocab/drug> <http://aers.data2semantics.org/resource/drug/ATARAX-P_%28HYDROXYZINE_HYDROCHLORIDE%29_SYRUP> .
<http://aers.data2semantics.org/resource/involvement/8152200/1018970802> <http://www.w3.org/2000/01/rdf-schema#label> "Involvement of ATARAX-P (HYDROXYZINE HYDROCHLORIDE) SYRUP in #8152200" .

Yet the linkset does not contain any ATARAX-P, just a few variations of ATARAX:

<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAX>.
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAX_____________________________%2F00058402%2F>.
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAXOID>.
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAX_____________________________%2F00058403%2F>.
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAX_%2F00595201%2F>.

This is probably because the linkset was created on 2013-05-06, while the AERS dump faers-of-2012-generated-on-2012-07-09.nt was made on 2012-07-09, and the identifiers in AERS probably changed in between.
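
A minimal sketch of how the mismatch could be checked across the whole dump, assuming both files are read as plain text lines. The helper names `drug_uris` and `missing_from_linkset` are hypothetical, and the regular expression only looks for AERS drug URIs in angle brackets rather than parsing N-Triples properly:

```python
import re

# Matches any AERS drug URI written inside angle brackets.
DRUG_URI = re.compile(r"<(http://aers\.data2semantics\.org/resource/drug/[^>]+)>")


def drug_uris(lines):
    """Collect every AERS drug URI mentioned in an iterable of text lines."""
    found = set()
    for line in lines:
        found.update(DRUG_URI.findall(line))
    return found


def missing_from_linkset(dataset_lines, linkset_lines):
    """Return the drug URIs used in the dataset but absent from the linkset."""
    return sorted(drug_uris(dataset_lines) - drug_uris(linkset_lines))


# Two lines from the dataset and two from the linkset, as quoted in this issue.
dataset = [
    '<http://aers.data2semantics.org/resource/drug/ATARAX-P_%28HYDROXYZINE_HYDROCHLORIDE%29_SYRUP> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://aers.data2semantics.org/vocab/Drug> .',
    '<http://aers.data2semantics.org/resource/involvement/8152200/1018970802> <http://aers.data2semantics.org/vocab/drug> <http://aers.data2semantics.org/resource/drug/ATARAX-P_%28HYDROXYZINE_HYDROCHLORIDE%29_SYRUP> .',
]
linkset = [
    '<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAX>.',
    '<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00557> skos:exactMatch <http://aers.data2semantics.org/resource/drug/ATARAXOID>.',
]

for uri in missing_from_linkset(dataset, linkset):
    print(uri)
```

With these sample lines the only URI reported is the ATARAX-P ... SYRUP one. For the real files, the lists would be replaced by open file handles.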

Note that http://aers.data2semantics.org/ is currently down.


stain commented Oct 21, 2015

Note that the identifiers in AERS (particularly the linkset) are dubious in themselves, with many duplicates, like:

ZOPICLONE
ZOPICLONE__________________________%28ZOPICLONE
ZOPICLONE_________________________%28ZOPICLONE%29
ZOPICLONE_______________________%28ZOPICLONE%29
ZOPICLONE____________________%28ZOPICLONE%29
ZOPICLONE_____________%28ZOPICLONE%29
ZOPICLONE____________%28ZOPICLONE%29
ZOPICLONE__________%28ZOPICLONE%29
ZOPICLONE_________%28ZOPICLONE%29
ZOPICLONE_____%28ZOPICLONE%29
ZOPICLONE_%28ZOPICLONE%29
ZOPICLONE%28ZOPICLONE%29

Unquoting the %-escaping you get:

ZOPICLONE
ZOPICLONE__________________________(ZOPICLONE
ZOPICLONE_________________________(ZOPICLONE)
ZOPICLONE_______________________(ZOPICLONE)
ZOPICLONE____________________(ZOPICLONE)
ZOPICLONE_____________(ZOPICLONE)
ZOPICLONE____________(ZOPICLONE)
ZOPICLONE__________(ZOPICLONE)
ZOPICLONE_________(ZOPICLONE)
ZOPICLONE_____(ZOPICLONE)
ZOPICLONE_(ZOPICLONE)
ZOPICLONE(ZOPICLONE)
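
The unquoting step can be reproduced with Python's standard `urllib.parse.unquote`. The `normalise` helper below is a hypothetical addition, not something the dataset does: it assumes that underscores stand for spaces and that runs of them carry no meaning, just to show how many of these variants would collapse into one name:

```python
from urllib.parse import unquote


def normalise(local_name):
    """Unquote a URI local name and collapse underscore runs into single spaces.

    Assumption: underscores are padding for spaces in the original label.
    """
    text = unquote(local_name).replace("_", " ")
    return " ".join(text.split())


# A few of the local names listed above.
variants = [
    "ZOPICLONE_____%28ZOPICLONE%29",
    "ZOPICLONE_%28ZOPICLONE%29",
    "ZOPICLONE%28ZOPICLONE%29",
]

for name in variants:
    print(unquote(name), "->", normalise(name))
```

Under that assumption the padded variants all reduce to "ZOPICLONE (ZOPICLONE)", while the variant with no separator at all stays distinct.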

Most of these ___ variants are not used in faers-of-2012-generated-on-2012-07-09.nt, but those that are used follow a similar pattern, where the name seems to correspond to a direct transformation of the rdfs:label.
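
A guess at what that transformation looks like, consistent with the examples quoted in this issue: every space becomes an underscore and the result is percent-encoded. The actual generator code is not known here, and `label_to_local_name` is a hypothetical name:

```python
from urllib.parse import quote


def label_to_local_name(label):
    """Turn an rdfs:label into a URI local name (assumed transformation).

    Each space becomes one underscore, then everything except unreserved
    characters (letters, digits, "_", ".", "-", "~") is percent-encoded.
    """
    return quote(label.replace(" ", "_"), safe="")


label = "ATARAX-P (HYDROXYZINE HYDROCHLORIDE) SYRUP"
print("http://aers.data2semantics.org/resource/drug/" + label_to_local_name(label))
```

Because every space maps to its own underscore, labels that are padded with a different number of spaces end up as different URIs, which would explain the many near-duplicates.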


stain commented Oct 21, 2015

This VoID for the linkset proposes the SPARQL query

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
CONSTRUCT {
  ?drugbank_uri skos:closeMatch ?aers_uri
} WHERE {
  ?drugbank_uri <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugs> .
  ?aers_uri skos:closeMatch ?drugbank_uri
}

which I assume was run at http://aers.data2semantics.org/ - those skos:closeMatch relations don't exist in any of the other RDF data.

Perhaps @antonisloizou or @rsiebes have the details on how this linkset should be updated or why the URIs are this weird?


stain commented Oct 22, 2015

Probably delay this fix until 2.1
