Open
Description
Problem
The current implementation of query_sql
in cratedb_mcp/__main__.py
attempts to restrict queries to read-only SELECT statements but uses a weak validation mechanism that can be bypassed.
As demonstrated in test_query_sql_permitted_exploit()
, the current validation only checks for the substring 'select' in the query string, which allows SQL injection attacks where the word 'select' is included in the values of an INSERT statement or other data modification queries.
Suggested Solution
Replace the naive string check with a more robust validation using a dedicated SQL parser library like sqlparse
:
import sqlparse
def query_sql(query: str) -> Dict[str, Any]:
# Parse the SQL statement
parsed = sqlparse.parse(query.strip())
# Check if it's a SELECT statement
if not parsed or parsed[0].get_type().upper() != 'SELECT':
raise ValueError('Only SELECT statements are allowed')
# Rest of the function...
Benefits of using sqlparse
:
- Properly handles SQL syntax
- Correctly identifies statement types
- More resilient against bypass attempts
- Handles edge cases like comments and whitespace
Implementation Steps
- Add
sqlparse
as a dependency inpyproject.toml
- Modify
query_sql
to usesqlparse
for statement validation - Update tests to verify the improved validation works correctly
References
- Identified in PR: Boilerplate: Add software tests and CI configuration #9
- Discussion: Boilerplate: Add software tests and CI configuration #9 (comment)
sqlparse
documentation: https://sqlparse.readthedocs.io/
Metadata
Metadata
Assignees
Labels
No labels