Merge pull request #84 from ukaea/nathan/remove_ingestion_files
Remove a whole bunch of stuff.
NathanCummings authored Oct 23, 2024
2 parents 47af94f + 36d4d8e commit a2408ee
Showing 68 changed files with 0 additions and 274,900 deletions.
2 changes: 0 additions & 2 deletions .flake8

This file was deleted.

50 changes: 0 additions & 50 deletions README.md
@@ -123,23 +123,6 @@ The data path will be along the lines of `~/fair-mast/tests/mock_data/mi

This will run some unit tests for the REST and GraphQL APIs against a testing database, created from the data in `--data-path`.
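A minimal sketch of such a test run, assuming pytest is the runner (the `--data-path` option is taken from the description above; the test directory and mock-data path are assumptions):

```bash
# Hypothetical invocation: pytest as the runner and the tests/mock_data/mini
# location are assumptions, not confirmed by this README.
python -m pytest tests/ --data-path tests/mock_data/mini
```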

### Uploading Data to the Minio Storage

First, follow the [instructions to install the minio client](https://min.io/docs/minio/linux/reference/minio-mc.html) tool.

Next, configure the endpoint location. The development minio installation runs at `localhost:9000` with the following default username and password:

```bash
mc alias set srv http://localhost:9000 minio99 minio123;
```
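If the `mast` bucket does not yet exist on the development server, it can be created first with the standard `mc mb` command (the bucket name `mast` is taken from the copy command below):

```bash
# Create the target bucket on the aliased endpoint.
mc mb srv/mast
```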

Then you can copy data to the bucket using:

```bash
mc cp --recursive <path-to-data> srv/mast
```
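To verify the upload, you can list the bucket contents recursively with the standard minio client:

```bash
# Recursively list everything under the mast bucket.
mc ls --recursive srv/mast
```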


### Production Deployment

To deploy to production, run the compose setup below to start the postgres, FastAPI, and minio containers. This will also start an nginx proxy and ensure HTTPS is fully set up:
@@ -161,43 +144,10 @@ docker compose --env-file dev/docker/.env.dev -f dev/docker/docker-compose.yml
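The diff truncates the compose invocation above; a plausible full command simply appends the usual `up -d` subcommand (an assumption, since the original line is cut off):

```bash
# `up -d` is a guess at the truncated tail of the command shown above.
docker compose --env-file dev/docker/.env.dev -f dev/docker/docker-compose.yml up -d
```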

**Note** that every time you destroy volumes, the production server will mint a new certificate for HTTPS. Let's Encrypt currently limits this to [5 per week](https://letsencrypt.org/docs/duplicate-certificate-limit/).

You'll need to download and ingest the production data like so:

```bash
mkdir -p data/mast/meta
rsync -vaP <CSD3-USERNAME>@login.hpc.cam.ac.uk:/rds/project/rds-sPGbyCAPsJI/archive/metadata data/
```

```bash
docker exec -it mast-api python -m src.api.create /code/data/index
```

## Building Documentation

See the guide to building documentation [here](./docs/README.md)

## Ingestion to S3

The following section details how to ingest data into the S3 storage on Freia with UDA.

1. SSH onto Freia and set up a local development environment following the instructions above.
2. Parse the metadata for all signals and sources for a list of shots with the following command:

```sh
mpirun -n 16 python3 -m src.archive.create_uda_metadata data/uda campaign_shots/tiny_campaign.csv
```

This will create the metadata for the tiny campaign. You may do the same for full campaigns such as `M9`.
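For example, a full-campaign run might look like the following (assuming a corresponding shot list such as `campaign_shots/M9.csv` exists; the filename is an assumption based on the tiny-campaign example):

```sh
# Hypothetical M9 shot-list path, mirroring the tiny_campaign.csv invocation.
mpirun -n 16 python3 -m src.archive.create_uda_metadata data/uda campaign_shots/M9.csv
```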

3. Run the ingestion pipeline by submitting the following job:

```sh
qsub ./jobs/freia_write_datasets.qsub campaign_shots/tiny_campaign.csv s3://mast/level1/shots
```

This will submit a job to the Freia job queue that will ingest all of the shots in the tiny campaign and push them to the S3 bucket.
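To monitor the submitted job, the usual batch-scheduler tooling applies (a sketch, assuming Freia's `qsub` queue ships the standard `qstat` companion command):

```sh
# Show the current user's queued and running jobs.
qstat -u $USER
```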




