Skip to content

Conversation

@mahmoudnasser1561
Copy link

What

  • Extract the InfluxQL DatabaseClient implementation into openwisp_monitoring/db/backends/influx_base.py.
  • Refactor the InfluxDB backend client (openwisp_monitoring/db/backends/influxdb/client.py) into a thin subclass.
  • No behavior changes expected; tests and ./run-qa-checks pass.

Why

  • The existing client logic is tightly coupled to the global TIMESERIES_DB configuration dict (import-time global) openwisp_monitoring/db/backends/init.py.

  • That makes future backends (e.g. InfluxDB 2.x in InfluxQL compatibility mode) awkward because they would either:
    duplicate the entire client implementation, or rely on mutating the global dict (hard to test and error-prone).

This refactor keeps current behavior but makes the client reusable by allowing instance-level configuration overrides via:

def __init__(self, db_name=None, options=None, **config):
    # config overrides (if provided), else fallback to TIMESERIES_DB
    ...

This enables:

  • passing backend-specific connection settings without touching global state,
  • passing OPTIONS directly to the client (UDP settings today; hard to add new backend support later),
  • adding influxdb2 backend reusing the same InfluxQL implementation with minimal duplication.

Checklist

  • I have read the OpenWISP Contributing Guidelines.
  • I have manually tested the changes proposed in this pull request.
  • I have written new test cases for new code and/or updated existing tests for changes to existing code.
    (refactor-only; existing test suite covers behavior)
  • I have updated the documentation.

Reference to Existing Issue

Closes #731

Description of Changes

This PR is a refactor-only change: it moves the InfluxQL client logic into a shared base class and keeps the InfluxDB backend client as a thin subclass. The goal is to reduce coupling to the global TIMESERIES_DB settings and avoid duplication when introducing an InfluxDB 2.x backend in InfluxQL compatibility mode.

Screenshot

@coderabbitai
Copy link

coderabbitai bot commented Jan 25, 2026

📝 Walkthrough

Walkthrough

This PR extracts the InfluxQL-compatible TSDB client logic into a new reusable class BaseInfluxDatabaseClient (openwisp_monitoring/db/backends/influx_base.py) and refactors the existing InfluxDB backend client to be a thin subclass that inherits from it (openwisp_monitoring/db/backends/influxdb/client.py). The base class implements connection/configuration, database lifecycle (create/drop, retention policies), write and batch-write paths (UDP/TCP routing, timestamp handling), read/query construction (dynamic InfluxQL SELECTs, validation, grouping, top-field extraction), and utility helpers. The change preserves backward compatibility and enables reuse for other InfluxQL-compatible backends.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and clearly summarizes the main change: extracting the InfluxQL base client implementation into a separate module.
Description check ✅ Passed The description covers the main changes, rationale, and closure of issue #731, though testing and documentation checkboxes are not fully completed.
Linked Issues check ✅ Passed The PR successfully implements all acceptance criteria from issue #731: extracts BaseInfluxDatabaseClient with InfluxQL functionality, refactors the InfluxDB client into a thin subclass with instance-level config override support via init, maintains backward compatibility, and achieves no behavior changes.
Out of Scope Changes check ✅ Passed All changes are directly aligned with issue #731 objectives: extracting the base client, refactoring the InfluxDB subclass, and supporting instance-level configuration overrides.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@openwisp_monitoring/db/backends/influx_base.py`:
- Around line 491-507: The block that handles existing "GROUP BY" assumes
re.search(self._group_by_time_tag_regex, query) returns a match and directly
uses matches.group(1), which will raise when the query has GROUP BY with tags
only (no time()); update the logic in the method containing this block to check
whether matches is not None before calling matches.group(1) and handle the
tag-only case (e.g., skip stripping time(), treat group_by_fields as empty, or
proceed without replacing the time function) so that when strip=True and there's
a GROUP BY without time() the code does not attempt to access group 1; refer to
the variables/methods _group_by_time_tag_regex, matches, group_by_fields and the
surrounding GROUP BY handling branch to locate where to add the guard and
alternative flow.
- Around line 120-166: The dbs cached_property currently reads udp_port from
self.options and performs arithmetic without validating type, causing a
TypeError if a string is provided; update the dbs logic (and any other place
reading udp_port such as use_udp if relevant) to normalize and validate udp_port
once by converting self.options.get("udp_port", 8089) to int (e.g., port =
int(self.options.get("udp_port", 8089))) and explicitly catch/raise a clear
ValueError on invalid input so the two InfluxDBClient instances use the
validated integer port (use that port and port + 1 for the short client) and
fail fast with a helpful message if conversion fails.
- Around line 295-381: The read() and _get_where_query() code interpolate
identifiers and string literals directly into InfluxQL; fix by adding a small
escaping helper (e.g., _escape_influx_value or reuse/extend _clean_value) that
backslash-escapes single quotes in string literals and a helper to quote/escape
identifiers (measurement/retention/field/tag keys) with double quotes; then
update read(): wrap retention policy and measurement in the identifier-quoting
helper when building from_clause, ensure field names inserted into fields list
are quoted with the identifier helper (while keeping DISTINCT()/COUNT() wrappers
intact), and pass tag values through the string-escaping helper when
constructing conditions; also update _get_where_query() to use the same helpers
for tag values and identifiers so all interpolated values are safely escaped.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0566f7 and 8594af9.

📒 Files selected for processing (2)
  • openwisp_monitoring/db/backends/influx_base.py
  • openwisp_monitoring/db/backends/influxdb/client.py
🧰 Additional context used
🧬 Code graph analysis (2)
openwisp_monitoring/db/backends/influxdb/client.py (1)
openwisp_monitoring/db/backends/influx_base.py (1)
  • BaseInfluxDatabaseClient (31-584)
openwisp_monitoring/db/backends/influx_base.py (2)
openwisp_monitoring/utils.py (1)
  • retry (17-35)
openwisp_monitoring/db/exceptions.py (1)
  • TimeseriesWriteException (1-2)
🪛 Ruff (0.14.13)
openwisp_monitoring/db/backends/influx_base.py

42-70: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


71-71: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


73-82: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


170-173: Avoid specifying long messages outside the exception class

(TRY003)


240-240: Do not catch blind exception: Exception

(BLE001)


247-247: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


319-321: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


319-321: Avoid specifying long messages outside the exception class

(TRY003)


334-336: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


334-336: Avoid specifying long messages outside the exception class

(TRY003)


349-349: Possible SQL injection vector through string-based query construction

(S608)


374-377: Avoid specifying long messages outside the exception class

(TRY003)


389-389: Loop control variable measurement not used within loop body

Rename unused measurement to _measurement

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Python==3.12 | django~=5.2.0
  • GitHub Check: Python==3.13 | django~=5.2.0
  • GitHub Check: Python==3.12 | django~=5.1.0
  • GitHub Check: Python==3.12 | django~=4.2.0
  • GitHub Check: Python==3.10 | django~=4.2.0
  • GitHub Check: Python==3.13 | django~=5.1.0
  • GitHub Check: Python==3.11 | django~=4.2.0
  • GitHub Check: Python==3.10 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=5.1.0
🔇 Additional comments (9)
openwisp_monitoring/db/backends/influx_base.py (8)

42-105: Config override precedence is clear.
The explicit db_name/optionsconfigTIMESERIES_DB fallback order is easy to follow and preserves backward compatibility.


106-119: Create/drop database wrappers look good.
The retry-wrapped helpers are simple and consistent.


167-294: Core query/write helpers look consistent.
Operator validation, retention-policy handling, and batching logic are cohesive and readable.


383-421: List/retention/delete helpers look solid.
Result reshaping and retention-policy helpers are straightforward.


424-446: Validation + param normalization looks fine.
The forbidden-word check and parameter cleanup are clear and contained.


456-474: Chart query assembly is straightforward.
The composition of fields, grouping, and timezone suffixing is easy to follow.


517-553: Field substitution helpers look good.
The placeholder expansion and field aliasing logic are clean.


554-583: Top-fields selection looks fine.
The aggregation and ordering steps are straightforward.

openwisp_monitoring/db/backends/influxdb/client.py (1)

1-5: Thin subclass refactor looks good.
The backend stays minimal while delegating logic to the shared base.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 120 to 133
@cached_property
def db(self):
"""Returns an ``InfluxDBClient`` instance."""
return self.dbs["default"]

@cached_property
def dbs(self):
dbs = {
"default": InfluxDBClient(
self.host,
self.port,
self.user,
self.password,
self.db_name,
use_udp=self.options.get("udp_writes", False),
udp_port=self.options.get("udp_port", 8089),
),
}
if self.options.get("udp_writes", False):
# When using UDP, InfluxDB allows only using one retention policy
# per port. Therefore, we need to have different instances of
# InfluxDBClient.
dbs["short"] = InfluxDBClient(
self.host,
self.port,
self.user,
self.password,
self.db_name,
use_udp=self.options.get("udp_writes", False),
udp_port=self.options.get("udp_port", 8089) + 1,
)
dbs["__all__"] = InfluxDBClient(
self.host,
self.port,
self.user,
self.password,
self.db_name,
)
else:
dbs["short"] = dbs["default"]
dbs["__all__"] = dbs["default"]
return dbs

@cached_property
def use_udp(self):
return self.options.get("udp_writes", False)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate udp_port before arithmetic.
When udp_writes is enabled, udp_port is incremented (+ 1). If overrides pass it as a string (common from env/config), this will raise a TypeError at runtime. Normalize to int once and fail fast with a clearer error.

🛠️ Suggested fix
 `@cached_property`
 def dbs(self):
+    udp_port = self.options.get("udp_port", 8089)
+    try:
+        udp_port = int(udp_port)
+    except (TypeError, ValueError):
+        raise ValueError("udp_port must be an integer")
     dbs = {
         "default": InfluxDBClient(
             self.host,
             self.port,
             self.user,
             self.password,
             self.db_name,
             use_udp=self.options.get("udp_writes", False),
-            udp_port=self.options.get("udp_port", 8089),
+            udp_port=udp_port,
         ),
     }
     if self.options.get("udp_writes", False):
         # ...
         dbs["short"] = InfluxDBClient(
             self.host,
             self.port,
             self.user,
             self.password,
             self.db_name,
             use_udp=self.options.get("udp_writes", False),
-            udp_port=self.options.get("udp_port", 8089) + 1,
+            udp_port=udp_port + 1,
         )
🤖 Prompt for AI Agents
In `@openwisp_monitoring/db/backends/influx_base.py` around lines 120 - 166, The
dbs cached_property currently reads udp_port from self.options and performs
arithmetic without validating type, causing a TypeError if a string is provided;
update the dbs logic (and any other place reading udp_port such as use_udp if
relevant) to normalize and validate udp_port once by converting
self.options.get("udp_port", 8089) to int (e.g., port =
int(self.options.get("udp_port", 8089))) and explicitly catch/raise a clear
ValueError on invalid input so the two InfluxDBClient instances use the
validated integer port (use that port and port + 1 for the short client) and
fail fast with a helpful message if conversion fails.

Comment on lines +295 to +348
def read(self, key, fields, tags, **kwargs):
extra_fields = kwargs.get("extra_fields")
since = kwargs.get("since")
order = kwargs.get("order")
limit = kwargs.get("limit")
distinct_fields = kwargs.get("distinct_fields", [])
count_fields = kwargs.get("count_fields", [])
rp = kwargs.get("retention_policy")
where = kwargs.get("where", [])
precision = kwargs.get("precision", "s")

# Ensure fields is a list (in case it's passed as a string)
if isinstance(fields, str):
fields = [fields]
else:
fields = list(fields)

# Apply DISTINCT to fields
if distinct_fields:
for field in distinct_fields:
try:
field_index = fields.index(field)
fields[field_index] = f"DISTINCT({field})"
except ValueError:
raise self.client_error(
f'Cannot perform DISTINCT on "{field}" as it is not present in the fields list.'
)

# Ensure COUNT(DISTINCT) if both DISTINCT and COUNT are applied
if field in count_fields:
count_fields[count_fields.index(field)] = f"DISTINCT({field})"

# Apply COUNT to fields
if count_fields:
for field in count_fields:
try:
field_index = fields.index(field)
fields[field_index] = f"COUNT({field})"
except ValueError:
raise self.client_error(
f'Cannot perform COUNT on "{field}" as it is not present in the fields list.'
)

# Handle extra fields and ensure proper formatting
if extra_fields and extra_fields != "*":
fields = fields + extra_fields
elif extra_fields == "*":
fields = ["*"]

# Construct SELECT clause
fields_clause = ", ".join(fields)

# Construct the FROM clause with retention policy if provided
from_clause = f"{rp}.{key}" if rp else key
q = f"SELECT {fields_clause} FROM {from_clause}"

conditions = []
for field, op, value in where:
op = self._clean_operator(op)
value = self._clean_value(value)
conditions.append(f"{field} {op} {value}")
# Add conditions (time and tags)
if since:
timestamp = self._get_timestamp(since)
conditions.append(f"time >= '{timestamp}'")
if tags:
conditions.append(
" AND ".join(["{0} = '{1}'".format(*tag) for tag in tags.items()])
)
if conditions:
conditions = "WHERE %s" % " AND ".join(conditions)
q = f"{q} {conditions}"
if order:
# InfluxDB only allows ordering results by time
if order == "time":
order = "time ASC"
elif order == "-time":
order = "time DESC"
else:
raise self.client_error(
f'Invalid order "{order}" passed.\nYou may pass "time" / "-time" to get '
"result sorted in ascending /descending order respectively."
)
q = f"{q} ORDER BY {order}"
if limit:
q = f"{q} LIMIT {limit}"
return list(self.query(q, precision=precision).get_points())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find call sites of .read() method (excluding tests)
echo "=== Call sites of .read() method ==="
rg -nP -C3 '\.read\s*\(' --type=py -g '!**/tests/**' | head -100

echo ""
echo "=== Examining the influx_base.py file structure ==="
wc -l openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== Lines around 448-455 (likely _get_where_query) ==="
sed -n '445,460p' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 9021


🏁 Script executed:

#!/bin/bash
# Check if there's input validation/escaping functions
echo "=== Searching for escape/validation functions ==="
rg -nP 'def.*escape|def.*clean|def.*validate|def.*sanitize' openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== Full read method to understand parameter handling ==="
sed -n '295,381p' openwisp_monitoring/db/backends/influx_base.py | head -100

Repository: openwisp/openwisp-monitoring

Length of output: 3810


🏁 Script executed:

#!/bin/bash
# Look for _get_where_query method
echo "=== Searching for _get_where_query method ==="
rg -nP 'def _get_where_query' openwisp_monitoring/db/backends/influx_base.py -A 10

echo ""
echo "=== Check for any cleaning/escaping of values ==="
rg -nP '_clean_' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 781


🏁 Script executed:

#!/bin/bash
# Get the implementation of _clean_value and _clean_operator
echo "=== _clean_value and _clean_operator implementations ==="
sed -n '167,195p' openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== views.py context around chart.read() call ==="
sed -n '85,110p' openwisp_monitoring/views.py

echo ""
echo "=== How tags are passed in models.py ==="
sed -n '495,510p' openwisp_monitoring/monitoring/base/models.py

Repository: openwisp/openwisp-monitoring

Length of output: 2984


🏁 Script executed:

#!/bin/bash
# Check for any InfluxQL escaping or validation utilities
echo "=== Search for escaping functions in the codebase ==="
rg -i 'escape|quote' openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== Search for input validation patterns ==="
sed -n '424,445p' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 1047


🏁 Script executed:

#!/bin/bash
# Check where metric.key and tags come from - are they user-controlled?
echo "=== Check what metric.key and tags are (from models) ==="
rg -nP 'self\.key\s*=' openwisp_monitoring/monitoring/base/models.py | head -20

echo ""
echo "=== Check if tags are from user input ==="
sed -n '455,580p' openwisp_monitoring/db/backends/influx_base.py | grep -A5 -B5 'tags'

Repository: openwisp/openwisp-monitoring

Length of output: 310


🏁 Script executed:

#!/bin/bash
# Check where metric.tags come from
echo "=== Check metric.tags definition ==="
rg -nP 'def tags|self\.tags' openwisp_monitoring/monitoring/base/models.py | head -30

echo ""
echo "=== Check Metric model for tag handling ==="
sed -n '1,100p' openwisp_monitoring/monitoring/base/models.py | grep -A10 'class.*Metric'

echo ""
echo "=== Check if views.py filters are user-controlled ==="
sed -n '1,100p' openwisp_monitoring/views.py | head -50

Repository: openwisp/openwisp-monitoring

Length of output: 2577


🏁 Script executed:

#!/bin/bash
# Look for where inputs come from - check query parameters
echo "=== Check view parameters and inputs ==="
rg -nP 'request\.(GET|POST|query_params)' openwisp_monitoring/views.py -B2 -A2

echo ""
echo "=== Check if key/metric name can be user input ==="
rg -nP 'key.*=.*request' openwisp_monitoring/ -r py

echo ""
echo "=== Check _OPERATORS to understand what's validated ==="
rg -nP '_OPERATORS\s*=' openwisp_monitoring/db/backends/influx_base.py -A10

Repository: openwisp/openwisp-monitoring

Length of output: 1489


🏁 Script executed:

#!/bin/bash
# Check the full path from view to database - is there any input sanitization?
echo "=== Check what parameters are passed through chart.read ==="
sed -n '456,500p' openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== Check if chart configuration is user-defined ==="
rg -nP 'class.*Chart' openwisp_monitoring/monitoring/base/models.py | head -5

Repository: openwisp/openwisp-monitoring

Length of output: 1983


🏁 Script executed:

#!/bin/bash
# Check where metric.tags comes from - implementation details
echo "=== Full tags property implementation ==="
sed -n '260,280p' openwisp_monitoring/monitoring/base/models.py

echo ""
echo "=== Check if Chart configuration is user-editable ==="
rg -nP 'class AbstractChart' openwisp_monitoring/monitoring/base/models.py -A 30 | head -50

echo ""
echo "=== Check if tags can be user input ==="
rg -nP 'tags.*=' openwisp_monitoring/monitoring/base/models.py | head -20

Repository: openwisp/openwisp-monitoring

Length of output: 2566


🏁 Script executed:

#!/bin/bash
# Check for any API endpoints that allow custom keys or metric names
echo "=== Check device API for chart/metric endpoints ==="
sed -n '260,280p' openwisp_monitoring/device/api/views.py

echo ""
echo "=== Look for any user-controllable metric creation ==="
rg -nP 'key.*request|tags.*request' openwisp_monitoring/ -t py

Repository: openwisp/openwisp-monitoring

Length of output: 1187


🏁 Script executed:

#!/bin/bash
# Let's trace what happens with the 'where' parameter in read()
echo "=== Check how 'where' parameter is passed to read() ==="
rg -nP '\.read\(' openwisp_monitoring/monitoring/base/models.py -B3 -A3

echo ""
echo "=== Check chart.read() to see where parameters come from ==="
rg -nP 'def read' openwisp_monitoring/monitoring/base/models.py -A 5 | head -40

Repository: openwisp/openwisp-monitoring

Length of output: 1864


🏁 Script executed:

#!/bin/bash
# Final check: look at the actual SQL queries being constructed
echo "=== Check if any direct query injection is possible ==="
rg -nP 'where.*=' openwisp_monitoring/monitoring/base/models.py | head -15

echo ""
echo "=== Check the key constraint more carefully ==="
sed -n '60,120p' openwisp_monitoring/monitoring/base/models.py

Repository: openwisp/openwisp-monitoring

Length of output: 2412


🏁 Script executed:

#!/bin/bash
# Check where retention_policy comes from in read() calls
echo "=== Check retention_policy usage ==="
rg -nP 'retention_policy' openwisp_monitoring/monitoring/base/models.py -B2 -A2 | head -40

echo ""
echo "=== Check if retention_policy is user input ==="
rg -nP 'retention_policy.*request' openwisp_monitoring/ -t py

echo ""
echo "=== Final check: look at _clean_value to confirm it doesn't escape quotes ==="
sed -n '176,186p' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 2173


🏁 Script executed:

#!/bin/bash
# Check if there's any quote escaping mechanism anywhere
echo "=== Search for any quote escaping ==="
rg -i "escape.*quote|quote.*escape|replace.*quote|sanitize" openwisp_monitoring/db/backends/influx_base.py

echo ""
echo "=== Check if field names from where clause are safe ==="
sed -n '1000,1010p' openwisp_monitoring/monitoring/base/models.py

echo ""
echo "=== Summary check: are tag values ever directly from user input? ==="
rg -nP 'tags\s*=.*request' openwisp_monitoring/ -t py

Repository: openwisp/openwisp-monitoring

Length of output: 691


🏁 Script executed:

#!/bin/bash
# Check if tag values can contain special characters by examining how they're set
echo "=== Check how main_tags and extra_tags are set ==="
rg -nP 'main_tags\s*=' openwisp_monitoring/ -t py -B2 -A2 | head -50

echo ""
echo "=== Check if there's admin interface for editing tags ==="
fd -t f admin.py openwisp_monitoring/ -x grep -l 'main_tags\|extra_tags' {}

echo ""
echo "=== Check API endpoints for metric creation/update ==="
rg -nP 'class.*Metric.*ViewSet|class.*Metric.*APIView' openwisp_monitoring/ -t py

Repository: openwisp/openwisp-monitoring

Length of output: 4064


🏁 Script executed:

#!/bin/bash
# Verify if JSON values could contain quotes
echo "=== Verify that JSONField values are not validated against special chars ==="
python3 << 'EOF'
import json
# JSONField stores arbitrary JSON, which can include quotes in string values
test_tags = {
    "device": "router'1",  # single quote
    "interface": 'eth"0',  # double quote
    "location": "'; DROP TABLE metrics; --"
}
print("Example dangerous tag values JSON can hold:")
print(json.dumps(test_tags, indent=2))
print("\nWhen interpolated without escaping:")
for k, v in test_tags.items():
    print(f"{k} = '{v}'")  # This would create invalid InfluxQL
EOF

echo ""
echo "=== Check the actual vulnerability in _get_where_query ==="
sed -n '448,454p' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 158


🏁 Script executed:

#!/bin/bash
# Check if there's an API to modify tags
echo "=== Check for API endpoints that could modify metrics ==="
fd -t f 'serializers.py' openwisp_monitoring/ -x grep -l 'Metric'

echo ""
echo "=== Look for admin.py that exposes metric editing ==="
fd -t f 'admin.py' openwisp_monitoring/ | head -5

echo ""
echo "=== Final verification: does _clean_value escape quotes? ==="
sed -n '167,186p' openwisp_monitoring/db/backends/influx_base.py

Repository: openwisp/openwisp-monitoring

Length of output: 1191


🏁 Script executed:

#!/bin/bash
# Comprehensive analysis of the vulnerability
python3 << 'EOF'
# Analyze the vulnerability
print("=== VULNERABILITY ANALYSIS ===\n")

print("1. INTERPOLATION WITHOUT ESCAPING:")
print("   Line 348: f\"{rp}.{key}\" - rp and key directly interpolated")
print("   Line 366: \"{0} = '{1}'\".format(*tag) - tag values directly interpolated")
print("   Line 453: f\"{field} = '{item}'\" - items directly interpolated")
print()

print("2. _clean_value() DOES NOT ESCAPE:")
print("   Only handles datetime conversion, returns non-datetime values as-is")
print("   Does NOT add quotes or escape special characters")
print()

print("3. _clean_operator() ONLY VALIDATES AGAINST ALLOWLIST:")
print("   Validates from _OPERATORS list")
print("   Does NOT escape field names")
print()

print("4. TAG SOURCES:")
print("   main_tags: JSONField from database model")
print("   extra_tags: JSONField from database model")
print("   object_id: CharField from database")
print("   These can theoretically contain single quotes if set via admin or API")
print()

print("5. EXAMPLE INJECTION:")
tag_with_quote = "device'; DROP TABLE *.* FROM *; --"
dangerous_query = f"SELECT value FROM traffic WHERE device = '{tag_with_quote}'"
print(f"   If tag contains: {tag_with_quote}")
print(f"   Query becomes: {dangerous_query}")
print(f"   This breaks the InfluxQL query syntax")
print()

print("6. RISK LEVEL:")
print("   - Current call sites appear safe (metrics/tags from database)")
print("   - But NO escaping means future code changes could introduce vulnerability")
print("   - Defense-in-depth: code should always escape values")
EOF

Repository: openwisp/openwisp-monitoring

Length of output: 1266


Add escaping for InfluxQL field/tag values and retention policy names.

The read() method (lines 295–381) and _get_where_query() method (lines 448–454) interpolate measurement names, field names, tag values, and retention policy names directly into InfluxQL without escaping. While _clean_value() validates operators and converts datetimes, it does not escape special characters like single quotes. Tag values originate from JSONField columns (main_tags, extra_tags) which can store arbitrary strings, and the admin interface allows modifying these values.

Even though current call sites retrieve metrics from the database (appearing trusted), the absence of escaping violates defense-in-depth principles. If tag values contain single quotes (e.g., via admin edits), queries will break. To prevent future vulnerabilities, apply proper InfluxQL escaping: backslash-escape single quotes in tag/field values, or use parameterized query methods if the InfluxDB client supports them.

Also applies to: lines 448–454 (_get_where_query())

🧰 Tools
🪛 Ruff (0.14.13)

319-321: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


319-321: Avoid specifying long messages outside the exception class

(TRY003)


334-336: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


334-336: Avoid specifying long messages outside the exception class

(TRY003)


349-349: Possible SQL injection vector through string-based query construction

(S608)


374-377: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In `@openwisp_monitoring/db/backends/influx_base.py` around lines 295 - 381, The
read() and _get_where_query() code interpolate identifiers and string literals
directly into InfluxQL; fix by adding a small escaping helper (e.g.,
_escape_influx_value or reuse/extend _clean_value) that backslash-escapes single
quotes in string literals and a helper to quote/escape identifiers
(measurement/retention/field/tag keys) with double quotes; then update read():
wrap retention policy and measurement in the identifier-quoting helper when
building from_clause, ensure field names inserted into fields list are quoted
with the identifier helper (while keeping DISTINCT()/COUNT() wrappers intact),
and pass tag values through the string-escaping helper when constructing
conditions; also update _get_where_query() to use the same helpers for tag
values and identifiers so all interpolated values are safely escaped.

Comment on lines +491 to +474
if "GROUP BY" not in query.upper():
query = f"{query} {group_by}"
else:
# The query could have GROUP BY clause for a TAG
if group_by:
# The query already contains "GROUP BY", therefore
# we remove it from the "group_by" to avoid duplicating
# "GROUP BY"
group_by = group_by.replace("GROUP BY ", "")
# We only need to substitute the time function.
# The resulting query would be "GROUP BY time(<group_by>), <tag>"
query = re.sub(self._time_regex, group_by, query)
else:
# The query should not include the "GROUP by time()"
matches = re.search(self._group_by_time_tag_regex, query)
group_by_fields = matches.group(1)
if len(group_by_fields.split(",")) > 1:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Guard against GROUP BY without time() when stripping.
When strip=True and the query has GROUP BY without a time() clause, re.search(...) returns None, and matches.group(1) will raise. A small guard avoids runtime errors for tag-only grouping.

🛡️ Suggested guard
-                matches = re.search(self._group_by_time_tag_regex, query)
-                group_by_fields = matches.group(1)
+                matches = re.search(self._group_by_time_tag_regex, query)
+                if not matches:
+                    return query
+                group_by_fields = matches.group(1)
🤖 Prompt for AI Agents
In `@openwisp_monitoring/db/backends/influx_base.py` around lines 491 - 507, The
block that handles existing "GROUP BY" assumes
re.search(self._group_by_time_tag_regex, query) returns a match and directly
uses matches.group(1), which will raise when the query has GROUP BY with tags
only (no time()); update the logic in the method containing this block to check
whether matches is not None before calling matches.group(1) and handle the
tag-only case (e.g., skip stripping time(), treat group_by_fields as empty, or
proceed without replacing the time function) so that when strip=True and there's
a GROUP BY without time() the code does not attempt to access group 1; refer to
the variables/methods _group_by_time_tag_regex, matches, group_by_fields and the
surrounding GROUP BY handling branch to locate where to add the guard and
alternative flow.

Move the InfluxQL DatabaseClient implementation into a shared base class and make the InfluxDB backend client a thin subclass.

This refactor paves the way for adding an InfluxDB 2.x backend in InfluxQL compatibility mode without duplicating the client implementation.

No behavior changes expected; tests and ./run-qa-checks pass.
@mahmoudnasser1561 mahmoudnasser1561 force-pushed the gsoc/extract-influxql-base branch from 8594af9 to 5db9f66 Compare January 25, 2026 18:50
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@openwisp_monitoring/db/backends/influx_base.py`:
- Around line 92-128: The dbs cached_property currently reads configuration from
the global TIMESERIES_DB causing instance-level overrides to be ignored; update
the dbs method to use instance attributes (e.g., self.host, self.port,
self.user, self.password, self.options) instead of TIMESERIES_DB and derive
udp_writes/udp_port from self.options, while still constructing InfluxDBClient
instances (default, short, __all__) and using self.db_name so per-instance
configs passed via __init__ take effect.
- Around line 68-71: Update the InfluxBase backend to accept instance-level
overrides by changing the __init__ signature to __init__(self, db_name=None,
options=None, **config) and store instance attributes (e.g., self.host,
self.port, self.user, self.password, self.options) defaulting to TIMESERIES_DB
values when not provided; then modify the dbs property (property name: dbs) to
read from these instance attributes (self.host, self.port, self.user,
self.password, self.options) instead of directly accessing the global
TIMESERIES_DB so that per-instance configuration passed via __init__ is honored.
♻️ Duplicate comments (3)
openwisp_monitoring/db/backends/influx_base.py (3)

327-330: Tag values interpolated without escaping.

Tag values from the tags dict are interpolated directly into InfluxQL using "{0} = '{1}'".format(*tag). If a tag value contains a single quote, this will break the query or potentially allow injection. This concern was raised in a prior review.


415-421: _get_where_query also lacks escaping for item values.

Similar to the read() method, values in items are interpolated directly without escaping single quotes, which could break queries or allow injection if values contain special characters.


470-481: Guard against GROUP BY without time() when stripping.

When strip=True and the query has GROUP BY without a time() clause, re.search(...) returns None, and matches.group(1) will raise AttributeError. This was flagged in a prior review.

🧹 Nitpick comments (3)
openwisp_monitoring/db/backends/influx_base.py (3)

207-214: Chain exception for better traceability.

When re-raising TimeseriesWriteException, chain it to preserve the original stack trace.

♻️ Suggested fix
-            raise TimeseriesWriteException
+            raise TimeseriesWriteException from exception

354-357: Rename unused loop variable measurement.

The measurement variable from tuple unpacking is not used in the loop body.

♻️ Suggested fix
-        for (measurement, tag), group_points in result.items():
+        for (_measurement, tag), group_points in result.items():

282-303: Chain exceptions when re-raising in DISTINCT/COUNT handling.

When catching ValueError and raising self.client_error, chain the exception to preserve context.

♻️ Suggested fix
                 except ValueError:
                     raise self.client_error(
                         f'Cannot perform DISTINCT on "{field}" as it is not present in the fields list.'
-                    )
+                    ) from None

Apply similarly to the COUNT block at lines 300-303.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8594af9 and 5db9f66.

📒 Files selected for processing (2)
  • openwisp_monitoring/db/backends/influx_base.py
  • openwisp_monitoring/db/backends/influxdb/client.py
🧰 Additional context used
🧬 Code graph analysis (2)
openwisp_monitoring/db/backends/influxdb/client.py (1)
openwisp_monitoring/db/backends/influx_base.py (2)
  • db (88-90)
  • BaseInfluxDatabaseClient (25-551)
openwisp_monitoring/db/backends/influx_base.py (2)
openwisp_monitoring/utils.py (1)
  • retry (17-35)
openwisp_monitoring/db/exceptions.py (1)
  • TimeseriesWriteException (1-2)
🪛 Ruff (0.14.13)
openwisp_monitoring/db/backends/influx_base.py

26-54: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


55-55: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


57-66: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


137-140: Avoid specifying long messages outside the exception class

(TRY003)


207-207: Do not catch blind exception: Exception

(BLE001)


214-214: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


286-288: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


286-288: Avoid specifying long messages outside the exception class

(TRY003)


301-303: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


301-303: Avoid specifying long messages outside the exception class

(TRY003)


316-316: Possible SQL injection vector through string-based query construction

(S608)


341-344: Avoid specifying long messages outside the exception class

(TRY003)


356-356: Loop control variable measurement not used within loop body

Rename unused measurement to _measurement

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Python==3.13 | django~=5.1.0
  • GitHub Check: Python==3.12 | django~=5.2.0
  • GitHub Check: Python==3.10 | django~=4.2.0
  • GitHub Check: Python==3.10 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=4.2.0
  • GitHub Check: Python==3.13 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.1.0
  • GitHub Check: Python==3.12 | django~=5.1.0
  • GitHub Check: Python==3.11 | django~=5.2.0
🔇 Additional comments (4)
openwisp_monitoring/db/backends/influx_base.py (3)

1-23: Imports and module setup look good.

Standard library and Django imports are well-organized. The @retry decorator import from openwisp_monitoring.utils integrates with the existing retry infrastructure.


216-260: Write methods are well-structured.

The _get_timestamp, write, and batch_write methods properly handle timestamp formatting and organize data points by database and retention policy before writing.


521-550: _get_top_fields implementation is correct.

The method properly queries for summed fields, filters out None values, sorts by value, and returns the top N field names with the sum_ prefix removed.

openwisp_monitoring/db/backends/influxdb/client.py (1)

1-11: Clean minimal subclass for backward compatibility.

The DatabaseClient class correctly inherits from BaseInfluxDatabaseClient and serves as a thin wrapper to maintain backward compatibility with existing imports and the backend loader. The docstring clearly explains its purpose.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +68 to +71
def __init__(self, db_name=None):
self._db = None
self.db_name = db_name or TIMESERIES_DB["NAME"]
self.client_error = InfluxDBClientError
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

__init__ signature doesn't match stated PR objectives.

The PR description and linked issue #731 explicitly state the goal is to support "instance-level configuration overrides via an __init__(self, db_name=None, options=None, **config) signature." However, the current implementation only accepts db_name.

Combined with the dbs property (lines 92-128) reading directly from the global TIMESERIES_DB, this means instance-level overrides for host, port, user, password, and OPTIONS aren't actually supported yet.

🛠️ Suggested signature to align with PR objectives
-    def __init__(self, db_name=None):
-        self._db = None
-        self.db_name = db_name or TIMESERIES_DB["NAME"]
-        self.client_error = InfluxDBClientError
+    def __init__(self, db_name=None, options=None, **config):
+        self._db = None
+        self.db_name = db_name or TIMESERIES_DB["NAME"]
+        self.host = config.get("host") or TIMESERIES_DB["HOST"]
+        self.port = config.get("port") or TIMESERIES_DB["PORT"]
+        self.user = config.get("user") or TIMESERIES_DB["USER"]
+        self.password = config.get("password") or TIMESERIES_DB["PASSWORD"]
+        self.options = options if options is not None else TIMESERIES_DB.get("OPTIONS", {})
+        self.client_error = InfluxDBClientError

Then update dbs to use self.host, self.port, etc. instead of TIMESERIES_DB[...].

🤖 Prompt for AI Agents
In `@openwisp_monitoring/db/backends/influx_base.py` around lines 68 - 71, Update
the InfluxBase backend to accept instance-level overrides by changing the
__init__ signature to __init__(self, db_name=None, options=None, **config) and
store instance attributes (e.g., self.host, self.port, self.user, self.password,
self.options) defaulting to TIMESERIES_DB values when not provided; then modify
the dbs property (property name: dbs) to read from these instance attributes
(self.host, self.port, self.user, self.password, self.options) instead of
directly accessing the global TIMESERIES_DB so that per-instance configuration
passed via __init__ is honored.

Comment on lines +92 to +128
@cached_property
def dbs(self):
dbs = {
"default": InfluxDBClient(
TIMESERIES_DB["HOST"],
TIMESERIES_DB["PORT"],
TIMESERIES_DB["USER"],
TIMESERIES_DB["PASSWORD"],
self.db_name,
use_udp=TIMESERIES_DB.get("OPTIONS", {}).get("udp_writes", False),
udp_port=TIMESERIES_DB.get("OPTIONS", {}).get("udp_port", 8089),
),
}
if TIMESERIES_DB.get("OPTIONS", {}).get("udp_writes", False):
# When using UDP, InfluxDB allows only using one retention policy
# per port. Therefore, we need to have different instances of
# InfluxDBClient.
dbs["short"] = InfluxDBClient(
TIMESERIES_DB["HOST"],
TIMESERIES_DB["PORT"],
TIMESERIES_DB["USER"],
TIMESERIES_DB["PASSWORD"],
self.db_name,
use_udp=TIMESERIES_DB.get("OPTIONS", {}).get("udp_writes", False),
udp_port=TIMESERIES_DB.get("OPTIONS", {}).get("udp_port", 8089) + 1,
)
dbs["__all__"] = InfluxDBClient(
TIMESERIES_DB["HOST"],
TIMESERIES_DB["PORT"],
TIMESERIES_DB["USER"],
TIMESERIES_DB["PASSWORD"],
self.db_name,
)
else:
dbs["short"] = dbs["default"]
dbs["__all__"] = dbs["default"]
return dbs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

dbs reads from global state, preventing instance-level overrides.

This property hardcodes reads from TIMESERIES_DB global rather than instance attributes. If __init__ is updated to accept config overrides, this property should use self.host, self.port, self.user, self.password, and self.options instead.

🤖 Prompt for AI Agents
In `@openwisp_monitoring/db/backends/influx_base.py` around lines 92 - 128, The
dbs cached_property currently reads configuration from the global TIMESERIES_DB
causing instance-level overrides to be ignored; update the dbs method to use
instance attributes (e.g., self.host, self.port, self.user, self.password,
self.options) instead of TIMESERIES_DB and derive udp_writes/udp_port from
self.options, while still constructing InfluxDBClient instances (default, short,
__all__) and using self.db_name so per-instance configs passed via __init__ take
effect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor InfluxQL TSDB client to reduce global config coupling and enable InfluxDB 2.x compatibility backend

1 participant