Skip to content

[#570] Prevent segfault when accessing scalar config values with dotted keys #580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Userfrom1995
Copy link
Contributor

Previously, querying a scalar configuration value (e.g., "port" or "max_connections") with a dotted key (like "port.a") would cause a segmentation fault. This happened because the JSON API returned a scalar value cast as a pointer, which was then incorrectly treated as a JSON object.

This commit adds a defensive check:
if ((uintptr_t)configuration_js < 0x1000)
to ensure that only valid JSON object pointers are passed to the iterator logic. If a scalar or invalid pointer is detected, the code now gracefully returns an error instead of crashing.

Before Changes:

base@LAPTOP-409T8FMM:~/pgagroal/build/src$ ./pgagroal-cli conf get port.a
AddressSanitizer:DEADLYSIGNAL
=================================================================
==511894==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000929 (pc 0x7f4f2bda0d06 bp 0x7fffb21ca380 sp 0x7fffb21ca350 T0)
==511894==The signal is caused by a READ memory access.
==511894==Hint: address points to the zero page.
    #0 0x7f4f2bda0d06 in pgagroal_json_iterator_create /home/base/pgagroal/src/libpgagroal/json.c:234
    #1 0x55dd97c3297c in get_config_key_result /home/base/pgagroal/src/cli.c:1695
    #2 0x55dd97c304fe in process_get_result /home/base/pgagroal/src/cli.c:1393
    #3 0x55dd97c2f8c7 in conf_get /home/base/pgagroal/src/cli.c:1265
    #4 0x55dd97c2dbb8 in main /home/base/pgagroal/src/cli.c:810
    #5 0x7f4f2b39e1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0x7f4f2b39e28a in __libc_start_main_impl ../csu/libc-start.c:360
    #7 0x55dd97c28ed4 in _start (/home/base/pgagroal/build/src/pgagroal-cli+0x11ed4) (BuildId: e996bc25e118f998a6d30b30cf98bc58ed976b2f)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/base/pgagroal/src/libpgagroal/json.c:234 in pgagroal_json_iterator_create
==511894==ABORTING

After Changes :

base@LAPTOP-409T8FMM:~/pgagroal/build/src$ ./pgagroal-cli conf get port.a
Error
2025-06-30 14:18:01 TRACE network.c:428 pgagroal_disconnect: fd=3

@fluca1978
Copy link
Collaborator

Do we have a better approach to understand if a result is an invalid object?

@Userfrom1995
Copy link
Contributor Author

I’m not sure if it's necessarily better, but I can think of a few more possible approaches to handle this:

  • Type Checking During Iteration
    Iterate over the parent JSON and check each value’s type before accessing it. This avoids trying to iterate or dereference scalar values. ( the one that i just pushed ).

  • Wrap Scalars in a Uniform JSON Structure
    Adjust the JSON API response so even scalar values are wrapped in a consistent structure (e.g., { "type": "scalar", "value": 42 }). This makes type checks safer and more predictable.

  • Refactor the CLI Logic
    Update the CLI’s get logic to explicitly handle scalars vs. objects, avoiding generic iteration and handling each case appropriately.

@jesperpedersen
Copy link
Collaborator

@Jubilee101 Cc:

@fluca1978
Copy link
Collaborator

* Wrap Scalars in a Uniform JSON Structure
  Adjust the JSON API response so even scalar values are wrapped in a consistent structure (e.g., { "type": "scalar", "value": 42 }). This makes type checks safer and more predictable.

This is what we should go for, because it makes the interface consistent (everything is an object) and stable.
Thoughts?

@Jubilee101
Copy link
Collaborator

This is what we should go for, because it makes the interface consistent (everything is an object) and stable.
Thoughts?

Second this. The value type system is designed on the assumption that users know what types they had put in when they try reading it out. So all the outcome should perhaps be json.

@Userfrom1995
Copy link
Contributor Author

I explored the idea of modifying the implementation to wrap scalar values before returning them from pgagroal_json_get(), so that we always return a JSON object rather than a raw scalar. Initially, this sounded like a clean way to handle things more uniformly and avoid issues like the segmentation fault we saw.

But after looking into it more deeply, I think this approach would introduce unnecessary complexity.

Right now, the code assumes that we know what we’re putting in and getting out — which is why we do things like:

id = (int32_t)pgagroal_json_get(header, MANAGEMENT_ARGUMENT_COMMAND);

If we start wrapping scalars, we’d need to:

  • Change all such direct casts.
  • Add utility functions to unwrap the actual values from the wrapper.
  • Handle additional memory management for the wrapper objects.

That alone would touch a lot of code and introduce risk.

There is a second approach I considered — modifying the pgagroal_json_put logic so that we always wrap scalars when inserting them into the JSON. That way, we ensure that the structure is consistent for all fields. Here's a comparison between the current and modified JSON structures:


(1) Current JSON Structure:

{
  "Header": {
    "ClientVersion": "2.0.0",
    "Command": 3,
    "Compression": 0,
    "Encryption": 0,
    "Output": 0,
    "Timestamp": "20250715184659"
  },
  "Request": {}
}

(2) Modified JSON Structure (wrapped scalars):

{
  "Header": {
    "ClientVersion": {
      "value": "2.0.0"
    },
    "Command": {
      "value": 3
    },
    "Compression": {
      "value": 0
    },
    "Encryption": {
      "value": 0
    },
    "Output": {
      "value": 0
    },
    "Timestamp": {
      "value": "20250715191154"
    }
  },
  "Outcome": {
    "Error": {
      "value": 2
    },
    "Status": {
      "value": false
    }
  },
  "Request": {}
}

This modified format keeps things consistent and simplifies value extraction logic across the board.

However, it still comes with some drawbacks:

  • The JSON becomes more verbose and harder to read.
  • We’d still need utility functions to extract values from the value field.
  • It would require refactoring all put and get calls across the codebase — which is a big and risky change.

In the end, this feels like a large change to fix a relatively small issue (a segmentation fault), which already has a simpler workaround (like the iterative approach I proposed earlier). So from a practicality standpoint, I’d suggest sticking with the current implementation.

That said, if we do want to move forward with modifying the JSON APIs, we should first have a more detailed discussion:

  • Should we modify the get path, the put path, or both?
  • How big is the impact across the codebase?
  • Is there a cleaner or smaller alternative we’re overlooking?

If we decide to go this route, it should definitely be done in a separate PR, as it touches core behavior.

Let me know your thoughts — I’m happy to help implement the change if we agree on a direction.

@jesperpedersen
Copy link
Collaborator

@Userfrom1995 I don't like it - it shouldn't be necessary to wrap all JSON values

@jesperpedersen
Copy link
Collaborator

@Userfrom1995 What is important is to create a chapter / appendix in the developer that describes the format (name/type) and so on

@fluca1978
Copy link
Collaborator

@Userfrom1995 this is an excellent and detailed analysis, and while I think wrapping is still the right path to get things done, it requires too many changes so far, and probably has also a runtime impact we don't want for a few scalar values.
Therefore, we should step back to the iterating approach you proposed.
Do you think it is worth providing a pgagroal_json_get_scalar defensive function to clearly indicate we want to extract a single value out of JSON?

@jesperpedersen
Copy link
Collaborator

@Jubilee101 will look at it - so hold

@jesperpedersen
Copy link
Collaborator

@Userfrom1995 See #592

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants