Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR adds scripts to enable bulk upload of CKAN site instance metadata to the POSE Ecosystem catalog. The implementation provides a CSV-based workflow for creating and updating site entries in the catalog.
- Adds a Python upload script with CSV parsing, API interaction, and validation logic
- Includes a template CSV file with example site data for different site types
- Provides comprehensive documentation covering field definitions, usage instructions, and troubleshooting
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| sites-upload/upload_sites.py | Main upload script handling CSV processing, CKAN API calls, field validation, and error handling |
| sites-upload/sites_metadata_template.csv | Template CSV with example site entries demonstrating various field configurations |
| sites-upload/config_sites.py | Configuration file for CKAN instance URL and API key |
| sites-upload/README.md | Documentation covering CSV format, field descriptions, usage instructions, and troubleshooting |
Comments suppressed due to low confidence (2)
sites-upload/upload_sites.py:114
- Variable response is not used.
response = action('organization_show', {'id': org_name})
sites-upload/upload_sites.py:24
- File is opened but is not closed.
open(os.path.abspath(file_dict.get('path')), 'rb'),
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| data_dict['upload'] = ( | ||
| file_dict.get('file_name'), | ||
| open(os.path.abspath(file_dict.get('path')), 'rb'), | ||
| 'application/octet-stream' | ||
| ) | ||
| fields = dict(data_dict) | ||
| m = MultipartEncoder(fields=fields) | ||
| r = requests.post( | ||
| ckanConfig.get('url') + '/api/action/' + action, | ||
| data=m, | ||
| headers={ | ||
| 'content-type': m.content_type, | ||
| 'Authorization': ckanConfig.get('api_key') | ||
| } | ||
| ) |
There was a problem hiding this comment.
File handle is never closed, leading to a resource leak. The opened file should be closed after the request completes. Consider using a context manager or explicitly closing the file after the request.
| data_dict['upload'] = ( | |
| file_dict.get('file_name'), | |
| open(os.path.abspath(file_dict.get('path')), 'rb'), | |
| 'application/octet-stream' | |
| ) | |
| fields = dict(data_dict) | |
| m = MultipartEncoder(fields=fields) | |
| r = requests.post( | |
| ckanConfig.get('url') + '/api/action/' + action, | |
| data=m, | |
| headers={ | |
| 'content-type': m.content_type, | |
| 'Authorization': ckanConfig.get('api_key') | |
| } | |
| ) | |
| file_path = os.path.abspath(file_dict.get('path')) | |
| with open(file_path, 'rb') as f: | |
| data_dict['upload'] = ( | |
| file_dict.get('file_name'), | |
| f, | |
| 'application/octet-stream' | |
| ) | |
| fields = dict(data_dict) | |
| m = MultipartEncoder(fields=fields) | |
| r = requests.post( | |
| ckanConfig.get('url') + '/api/action/' + action, | |
| data=m, | |
| headers={ | |
| 'content-type': m.content_type, | |
| 'Authorization': ckanConfig.get('api_key') | |
| } | |
| ) |
|
|
||
| try: | ||
| response = action('organization_show', {'id': org_name}) | ||
| return True |
There was a problem hiding this comment.
The response object is assigned but never checked. If the action() call raises no exception but returns a failed response (e.g., status code 404), this will incorrectly return True. Should check response status or success in the returned JSON.
| return True | |
| if response.status_code == 200: | |
| json_data = response.json() | |
| if json_data.get('success'): | |
| return True | |
| logger.warning(f"Organization {org_name} not found or API call failed: status={response.status_code}, response={response.text}") | |
| return False |
| print(r.json()) | ||
| print("\n") |
There was a problem hiding this comment.
Using print() statements for output is inconsistent with the logging framework used throughout the rest of the script. Should use logger.info() or logger.debug() instead for consistent logging.
| print(r.json()) | |
| print("\n") | |
| logger.info(r.json()) | |
| logger.info("\n") |
|
|
||
| Use decimal degrees format: | ||
| - Latitude: -90 to 90 (negative = South) | ||
| - longitude: -180 to 180 (negative = West) |
There was a problem hiding this comment.
Corrected capitalization of 'longitude' to match the consistent capitalization of 'Latitude' in line 159.
| - longitude: -180 to 180 (negative = West) | |
| - Longitude: -180 to 180 (negative = West) |
| ``` | ||
| Error: 403 Forbidden | ||
| ``` | ||
| **Solution:** Check API key in `config.py` |
There was a problem hiding this comment.
The filename reference is incorrect. The actual configuration file is named 'config_sites.py', not 'config.py'. This will confuse users trying to troubleshoot authentication errors.
| **Solution:** Check API key in `config.py` | |
| **Solution:** Check API key in `config_sites.py` |
| """ | ||
| Perform CKAN API action with optional file upload | ||
| """ | ||
| fields = data_dict |
There was a problem hiding this comment.
Add new scripts to upload ckan sites to the catalog