pridepy: A Python package to download and search data from PRIDE database

Python client library for the PRIDE REST API

Installation

From PyPI

To install, simply use pip:

$ pip install --upgrade pridepy

From Source

First, clone the repository on your local machine and then install the package using pip:

$ git clone https://github.com/PRIDE-Archive/pridepy
$ cd pridepy
$ pip install .

Install with setup.py:

$ git clone https://github.com/PRIDE-Archive/pridepy
$ cd pridepy
$ python setup.py sdist bdist_wheel 
$ pip install dist/pridepy-{version}.tar.gz

Examples

Download all the raw files from a dataset (e.g. PXD012353). Warning: raw files are generally large, so the download may take some time depending on the number and size of the files.

The -p option of the download commands selects the transfer protocol (default: ftp):

ftp: FTP protocol
aspera: Aspera protocol
globus: PRIDE Globus endpoint (the data is downloaded through HTTPS)

$ pridepy download-all-public-raw-files -a PXD012353 -o /Users/yourname/Downloads/foldername/ -p aspera

Download a single file by name:

$ pridepy download-file-by-name -a PXD022105 -o /Users/yourname/Downloads/foldername/ -f checksum.txt -p globus

NOTE: When -p globus is used, files are currently fetched from Globus URLs over HTTPS, not through the Globus protocol. For more information about Globus, see the Globus documentation.
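For scripted use, the download command above can be assembled from Python. The following is a minimal sketch, not part of pridepy: the helper build_download_command is hypothetical, and it relies only on the subcommand and the -a, -o, and -p options shown in the examples above.

```python
import shlex

# Protocol names taken from the -p option described in this README.
PROTOCOLS = ("ftp", "aspera", "globus")


def build_download_command(accession, output_dir, protocol="ftp"):
    """Return the argument list for download-all-public-raw-files.

    Hypothetical helper for illustration; not part of pridepy itself.
    """
    if protocol not in PROTOCOLS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    return [
        "pridepy", "download-all-public-raw-files",
        "-a", accession,
        "-o", output_dir,
        "-p", protocol,
    ]


# Build the command from the example above and show it as a shell line.
cmd = build_download_command(
    "PXD012353", "/Users/yourname/Downloads/foldername/", "aspera"
)
print(shlex.join(cmd))
```

With pridepy installed and on PATH, the list can be passed to subprocess.run(cmd, check=True) to start the download. Validating the protocol before launching avoids starting a long transfer with a mistyped option.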

Search projects with keywords and filters

$ pridepy search-projects-by-keywords-and-filters --keyword accession:PXD012353

Search files with filters

$ pridepy get-files-by-filter --filter fileCategory.value==RAW

Stream the metadata of all projects as JSON and write it to a file

$ pridepy stream-projects-metadata -o all_pride_projects.json

Stream the metadata of all files as JSON and write it to a file. A project accession can be given as an optional parameter to restrict the output to that project

$ pridepy stream-files-metadata -o all_pride_files.json
OR
$ pridepy stream-files-metadata -o PXD005011_files.json -a PXD005011
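Once a metadata file has been written, it can be inspected from Python. The sketch below assumes (this README does not confirm it) that the output file holds a single JSON document, either an array of records or one object. The helpers records_from_text and load_metadata are hypothetical and not part of pridepy.

```python
import json
from pathlib import Path


def records_from_text(text):
    """Parse JSON text and always return a list of records."""
    data = json.loads(text)
    # Assumption: the document is either an array of records or a single object.
    return data if isinstance(data, list) else [data]


def load_metadata(path):
    """Read a metadata file written by one of the stream-* commands."""
    # Reads the whole file into memory; very large files may need a streaming parser.
    return records_from_text(Path(path).read_text(encoding="utf-8"))


# Toy input standing in for a real metadata file; the "id" key is made up.
print(len(records_from_text('[{"id": 1}, {"id": 2}]')))
```

For a real file, load_metadata("PXD005011_files.json") would return the records written by the second command above, provided the assumption about the file layout holds.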

This Python CLI tool is built with the Click module and already provides detailed usage instructions for each command. To avoid redundancy and clutter in this README, consult the usage instructions directly from the CLI. Use the command below to list the available commands:

$ pridepy --help
Usage: pridepy [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  download-all-public-raw-files   Download all public raw files...
  download-file-by-name           Download a single file from a...
  get-files-by-filter             get paged files :return:
  get-files-by-project-accession  get files by project accession...
  get-private-files               Get private files by project...
  get-projects                    get paged projects :return:
  get-projects-by-accession       get projects by accession... 
  stream-files-metadata           Stream all files metadata in...
  stream-projects-metadata        Stream all projects metadata...

NOTE

Please make sure you are using Python 3, not Python 2.7.
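A script that wraps pridepy can check the interpreter version up front. This is a minimal sketch using only the standard library; the function name is_supported_python is hypothetical, and the only requirement it encodes is the one stated above (Python 3 rather than Python 2.7).

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True when the interpreter's major version is at least 3."""
    return version_info[0] >= 3


# Stop early with a clear message instead of failing later on Python 2.
if not is_supported_python():
    raise SystemExit("Python 3 is required")
```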

White paper

A white paper is available here. It can be built as a PDF using pandoc:

$ docker run --rm --platform linux/amd64 -v /Users/yperez/work/pridepy/paper/:/data -w /data openjournals/inara:latest paper.md -p -o pdf

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Citation

Selvakumar Kamatchinathan, Suresh Hewapathirana, Chakradhar Bandla, Juan Antonio Vizcaíno, Yasset Perez-Riverol. (2021, January 28). pridepy: A Python package to download and search data from PRIDE database (Version v0.0.3).
