To Consider: Comprehensive Final Enhancements for Project Efficiency and Maintainability #32

### Some tasks to consider for the remaining time

- [x] Implement Elasticsearch Scrolling for Pagination
  - Add pagination for large datasets using Elasticsearch scrolling for @quemeb.
  - Consider creating endpoints like scrollAnnotations, ScrollSNPsByChromosome, and ScrollSnpsById.
  - Research and implement API security, possibly using an API Guard annotation.
  - Make the scrollId an optional parameter and extend the Snp class to return a scrollId.
  - More detail is given in the implementation flow section below.
- [x] Automate Purge of Downloads Folder
  - Develop a cron job or equivalent to regularly clear the downloads folder.
- [x] Enhance Test Coverage
  - Ensure test coverage includes fields like VEP_refseq_PANTHER_GO_SLIM_cellular_component_list_id.
  - Add these values to your sample data to ensure comprehensive testing.
- [ ] Dynamic Column Handling
  - Implement functionality to test variable column loading, allowing columns to be added or removed dynamically.
  - This will start from your schema generation code.
- [ ] API Documentation
  - Research and implement a tool equivalent to Swagger for documenting APIs, including descriptions, required parameters, and optional parameters.
- [x] Code Documentation
  - If time allows, enhance code documentation using docstrings.
  - Reference: https://testdriven.io/blog/documenting-python/
- [x] Standardize Coding Conventions
  - Ensure consistent naming conventions across the codebase.
  - Choose and enforce a standard naming convention (preferably snake_case for Python). Right now the names are mixed: sometimes it is GetSNPsByChromosome and sometimes it is search_by_chromosomes.
- [x] Good Error Messages
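
For the downloads purge task above, here is a minimal sketch of what the cleanup step could look like. The function name `purge_downloads`, the 24-hour cutoff, and the idea of deleting by file age are assumptions for illustration, not something the project already defines.

```python
import time
from pathlib import Path


def purge_downloads(folder, max_age_seconds=24 * 3600, now=None):
    """Delete regular files in `folder` older than `max_age_seconds`.

    Returns the sorted names of the files that were deleted.
    """
    now = time.time() if now is None else now
    deleted = []
    for path in Path(folder).iterdir():
        # Only plain files are removed; subdirectories are left alone.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```

A cron entry or an in-app scheduler could call this function once a day. Deleting by age rather than wiping the folder avoids removing a file that a user is still downloading.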

### Implementation flow idea: Scrolling in Elasticsearch

Scrolling in Elasticsearch allows you to retrieve large numbers of results from a query in multiple batches without the cost of deep pagination. It's suitable for processing large datasets that exceed typical pagination limits.

When a scroll query is initiated, Elasticsearch provides a scroll_id that you use to fetch the next batch of results. This scroll_id acts like a cursor pointing to a specific place in the dataset.
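
The cursor behaviour described above can be sketched as a small generator. It assumes `client` behaves like the official Python Elasticsearch client, whose `search` and `scroll` methods accept the keyword arguments shown. The function name, batch size, and keep-alive value are illustrative choices.

```python
def iter_scroll_batches(client, index, query, batch_size=1000, keep_alive="1m"):
    """Yield lists of hits until the scroll is exhausted."""
    # The first request opens the scroll and returns the first batch.
    resp = client.search(index=index, query=query, size=batch_size, scroll=keep_alive)
    while resp["hits"]["hits"]:
        yield resp["hits"]["hits"]
        # The scroll_id is the cursor; always pass on the most recent one.
        resp = client.scroll(scroll_id=resp["_scroll_id"], scroll=keep_alive)
```

An empty batch signals that there are no more results, which is why the loop stops there.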

**Making scrollId an Optional Parameter:**

- Modify the endpoint that triggers the scrolling query to accept a scrollId as an optional query parameter.
- If a scrollId is provided, the API should continue fetching results from where the last batch ended.
- If no scrollId is provided, the API should start a new scroll session and return the initial batch of results along with a new scrollId.
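
A rough sketch of the three points above, as the function an endpoint could delegate to. Again `client` is assumed to behave like the official Python Elasticsearch client. The name `fetch_batch`, the response keys, and the choice to return `None` as the scrollId once the results run out are assumptions, not existing project code.

```python
def fetch_batch(client, index, query, scroll_id=None, batch_size=1000, keep_alive="1m"):
    """Return one batch of results plus the scrollId for the next call."""
    if scroll_id is None:
        # No scrollId supplied: start a new scroll session.
        resp = client.search(index=index, query=query, size=batch_size, scroll=keep_alive)
    else:
        # scrollId supplied: continue from where the last batch ended.
        resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
    hits = resp["hits"]["hits"]
    return {
        "results": [hit["_source"] for hit in hits],
        # Assumption: an empty batch means the scroll is finished.
        "scrollId": resp["_scroll_id"] if hits else None,
    }
```

The caller keeps passing back the returned scrollId until it comes back as `None`.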

**Extending the Snp Class:**
Subclass the Snp class to include a property that can return a scrollId associated with a query session.
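
One possible shape for this, assuming the model can be expressed as a dataclass. The `Snp` class here is only a stand-in with made-up fields, since the real class lives in the project. `ScrollableSnp` and `scroll_id` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snp:
    # Stand-in for the project's real Snp class; these fields are hypothetical.
    id: str
    chromosome: str
    position: int


@dataclass
class ScrollableSnp(Snp):
    # Cursor for the scroll session this record was fetched in, if any.
    scroll_id: Optional[str] = None
```

Keeping `scroll_id` optional with a default of `None` means existing code that builds plain results does not need to change.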

**API and Code Adjustments:**

- Adjust the API's logic to manage the lifecycle of a scroll session, including the expiration of scrollIds. The keep-alive is set through the `scroll` parameter on each request (the Elasticsearch documentation examples use `1m`) and is renewed by each subsequent scroll call.
- Implement error handling for cases when an expired or invalid scrollId is received.
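
A sketch of how the error handling could be layered on top of the scroll call. `InvalidScrollIdError` and `continue_scroll` are hypothetical names. The exception types raised by the client are passed in as a parameter so the sketch runs without the library. With the official Python client this would presumably be `elasticsearch.NotFoundError`, but that mapping is an assumption worth verifying.

```python
class InvalidScrollIdError(Exception):
    """API-level error for a scrollId that is expired, unknown, or malformed."""


def continue_scroll(client, scroll_id, missing_context_errors, keep_alive="1m"):
    """Fetch the next batch, translating client errors into a clear API error.

    `missing_context_errors` is a tuple of exception types the client raises
    when the scroll context no longer exists (assumed, see the note above).
    """
    if not isinstance(scroll_id, str) or not scroll_id:
        raise InvalidScrollIdError("scrollId must be a non-empty string")
    try:
        return client.scroll(scroll_id=scroll_id, scroll=keep_alive)
    except missing_context_errors as exc:
        raise InvalidScrollIdError(
            "scrollId is expired or invalid; start a new scroll without a scrollId"
        ) from exc
```

The endpoint can then catch `InvalidScrollIdError` in one place and return a readable message telling the caller to restart the scroll.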

tagging @akshala @huaiyumi 
