
fqixiang/automated-systematic-review-datasets

 
 


Automated Systematic Review Datasets

This repository is used to collect, preprocess and share systematic review datasets for the Automated Systematic Review project.

Datasets

The Cohen et al. (2006) datasets are listed first, ordered alphabetically by topic, followed by the remaining datasets.

| Reference | Topic | Sample Size | Inclusion | Link | License |
|---|---|---|---|---|---|
| Cohen et al., 2006 | ACEInhibitors | 2544 | 1.61% | source | NA |
| Cohen et al., 2006 | ADHD | 851 | 2.35% | source | NA |
| Cohen et al., 2006 | Antihistamines | 310 | 5.16% | source | NA |
| Cohen et al., 2006 | Atypical Antipsychotics | 1120 | 13.04% | source | NA |
| Cohen et al., 2006 | Beta Blockers | 2072 | 2.03% | source | NA |
| Cohen et al., 2006 | Calcium Channel Blockers | 1218 | 8.21% | source | NA |
| Cohen et al., 2006 | Estrogens | 368 | 21.74% | source | NA |
| Cohen et al., 2006 | NSAIDS | 393 | 10.43% | source | NA |
| Cohen et al., 2006 | Opiods | 1915 | 0.78% | source | NA |
| Cohen et al., 2006 | Oral Hypoglycemics | 503 | 27.04% | source | NA |
| Cohen et al., 2006 | Proton Pump Inhibitors | 1333 | 3.83% | source | NA |
| Cohen et al., 2006 | Skeletal Muscle Relaxants | 1643 | 0.55% | source | NA |
| Cohen et al., 2006 | Statins | 3465 | 2.45% | source | NA |
| Cohen et al., 2006 | Triptans | 671 | 3.58% | source | NA |
| Cohen et al., 2006 | Urinary Incontinence | 327 | 12.23% | source | NA |
| Van de Schoot et al., 2018 | PTSD | 5783 | 0.66% | source | |
| Wahono, 2015 | Software Defect Detection | 7002 | 0.89% | source | Creative Commons Attribution 4.0 International |
| Hall et al., 2012 | Software Fault Prediction | 8911 | 1.17% | source | Creative Commons Attribution 4.0 International |
| Radjenović et al., 2013 | Software Fault Prediction | 6000 | 0.80% | source | Creative Commons Attribution 4.0 International |
| Kitchenham et al., 2010 | Software Engineering | 1704 | 2.58% | source | Creative Commons Attribution 4.0 International |
| Bannach-Brown et al., 2019 | Animal Model of Depression | 1993 | 14.0% | source | Creative Commons Attribution 4.0 International |
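
As a rough illustration of how to read the table, the sketch below estimates the number of included records per dataset. It assumes that the Inclusion column is the percentage of records in the sample that were labelled as included. The function name `estimated_inclusions` is hypothetical, and because the percentages are rounded, the results are estimates only.

```python
def estimated_inclusions(sample_size: int, inclusion_pct: float) -> int:
    """Estimate the number of included records from one table row."""
    return round(sample_size * inclusion_pct / 100)


# A few rows copied from the table: (topic, sample size, inclusion %).
rows = [
    ("ACEInhibitors", 2544, 1.61),
    ("ADHD", 851, 2.35),
    ("PTSD", 5783, 0.66),
]

for topic, size, pct in rows:
    print(f"{topic}: about {estimated_inclusions(size, pct)} included records")
```

The low counts relative to the sample sizes show how imbalanced these datasets are.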

How it works

Collecting and preprocessing data

The folder datasets/ has one subfolder for each Systematic Review dataset. Each of these subfolders is a small project of its own and contains code and a README.md. The scripts in each dataset folder create a subfolder named output/ with the result of the data collection.
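
The layout described above can be inspected with a few lines of Python. This is a minimal sketch, assuming the layout datasets/<dataset name>/output/ and that it is run from the repository root. The helper `list_outputs` is hypothetical and not part of this repository.

```python
from pathlib import Path


def list_outputs(root: Path) -> dict[str, list[str]]:
    """Map each dataset subfolder to the file names found in its output/ folder."""
    outputs = {}
    for dataset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        output_dir = dataset_dir / "output"
        if output_dir.is_dir():
            files = sorted(f.name for f in output_dir.iterdir())
        else:
            files = []  # the dataset scripts have not been run yet
        outputs[dataset_dir.name] = files
    return outputs


root = Path("datasets")
if root.is_dir():
    for name, files in list_outputs(root).items():
        print(name, files or "(no output/ yet)")
```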

Dataset formats

The [Automated Systematic Review](https://github.com/msdslab/automated-systematic-review) software accepts several file formats, such as RIS and CSV. The datasets in this project are stored in one of these formats.

RIS files

RIS files are used by digital libraries, like IEEE Xplore, Scopus and ScienceDirect. Citation managers Mendeley and EndNote support the RIS format as well. For simulation, we use an additional RIS tag with the letters LI (Label included).
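
To illustrate the LI tag, the sketch below reads a small, made-up RIS record and extracts its label. The record contents and the "1"/"0" encoding of LI are assumptions made for illustration. The parser is a hand-rolled minimal one that ignores continuation lines; it is not the one used by the software.

```python
import re

# Hypothetical record; the "1" value for LI is an assumed encoding of "included".
SAMPLE_RIS = """\
TY  - JOUR
TI  - An example title
AB  - An example abstract.
LI  - 1
ER  -
"""

# A RIS line is a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of {tag: [values]} dicts, one per record."""
    records, current = [], {}
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if not match:
            continue  # blank and continuation lines are skipped in this sketch
        tag, value = match.groups()
        if tag == "ER":  # "end of reference" closes the current record
            records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value.strip())
    return records


records = parse_ris(SAMPLE_RIS)
labels = [record.get("LI", [None])[0] for record in records]
print(labels)
```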

CSV files

For CSV files, the software accepts a set of predetermined labels in line with the ones used in RIS files. The most commonly used ones are: "id", "authors", "date", "title", "keywords" and "abstract". To indicate labelling decisions, one can use "included" or "label_included".
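
A minimal sketch of such a CSV file and of reading its labels with the Python standard library follows. The two records are made up, the 1/0 label encoding is an assumption, and the helper `read_labels` is hypothetical.

```python
import csv
import io

# Hypothetical two-record file; 1 = included, 0 = excluded (assumed encoding).
SAMPLE_CSV = """\
id,title,abstract,included
1,First example title,First example abstract.,1
2,Second example title,Second example abstract.,0
"""

LABEL_COLUMNS = ("included", "label_included")


def read_labels(handle):
    """Return (row, label) pairs, using whichever label column is present."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    label_col = next((c for c in LABEL_COLUMNS if c in header), None)
    if label_col is None:
        raise ValueError(f"expected one of {LABEL_COLUMNS} in the header")
    return [(row, int(row[label_col])) for row in reader]


pairs = read_labels(io.StringIO(SAMPLE_CSV))
n_included = sum(label for _, label in pairs)
print(f"{n_included} of {len(pairs)} records included")
```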

In general, the following column names are recognized (based on https://pypi.org/project/RISparser/):

first_authors
secondary_authors
tertiary_authors
subsidiary_authors
abstract
author_address
accession_number
authors
custom1
custom2
custom3
custom4
custom5
custom6
custom7
custom8
caption
call_number
place_published
date
name_of_database
doi
database_provider
end_page
end_of_reference
edition
id
number
alternate_title1
alternate_title2
alternate_title3
journal_name
keywords
file_attachments1
file_attachments2
figure
language
label
note
type_of_work
notes
abstract
number_of_Volumes
original_publication
publisher
year
reviewed_item
research_notes
reprint_edition
version
issn
start_page
short_title
primary_title
secondary_title
tertiary_title
translated_author
title
translated_title
type_of_reference
unknown_tag
url
volume
publication_year
access_date

The custom tag is:

label_included
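
Before preparing a CSV file, it can help to check its header against the recognized names. The sketch below does this for a small subset of the list above plus the two label columns; the set `RECOGNIZED` is deliberately incomplete and the function name is hypothetical. Matching is assumed to be exact and case-sensitive.

```python
# Subset of the recognized column names, plus the label columns.
RECOGNIZED = {
    "id", "authors", "date", "title", "keywords", "abstract",
    "doi", "year", "url", "included", "label_included",
}


def unrecognized_columns(header):
    """Return the header names that are not in the recognized set, in order."""
    return [name for name in header if name.strip() not in RECOGNIZED]


header = ["id", "title", "abstract", "label_included", "Abstract Note"]
print(unrecognized_columns(header))
```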

Contact and contributors

Contact details can be found at the Automated Systematic Review project page.

