Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(esp_http_client): Fix host header for IPv6 address literal (IDFGH-14640) #15388

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

thelazt
Copy link

@thelazt thelazt commented Feb 13, 2025

An IPv6 IP that occurs in the 'Host:' header of an HTTP request must be enclosed in square brackets

Description

When performing an HTTP request to an IPv6 address via URL, for example

esp_http_client_config_t config = {
    .url = "http://[2001:db8::1]/path",
};

the HTTP request header of the ESP IDF will contain

Host: 2001:db8::1

This IPv6 format is invalid and webservers will refuse to process the request.
The correct host in the header would be

Host: [2001:db8::1]

Background

HTTP RFC 7230 Section 5.4 defines the Host header field as:

Host = uri-host [ ":" port ] ; Section 2.7.1

and explicitly references Section 2.7.1, which specifies that host needs to conform to RFC 3986 Section 3.2.2.

The RFC 3986 Section 3.2.2 mandates that, a host identified by an Internet Protocol literal address, version 6 or later, needs to be enclosed in square brackets within an http URI:

host        = IP-literal / IPv4address / reg-name
IP-literal = "[" ( IPv6address / IPvFuture  ) "]"

Therefore, an IPv6 occuring in the Host header of an HTTP request must be enclosed by square brackets.

Fix

The attached solution will test for a : character in the _get_host_header helper function to determine if the host is an IPv6 address, and in such a case create the header hostname with the address in square brackets.
This is a very simple approach and effectively the same check as GNU Wget uses.

I have also considered alternatives:
One would be testing for an valid IPv6 address (e.g., using inet_pton), however, this would probably add overhead and additional dependencies (lwip).

Another solutions would employ the parser to track if the URL contains an IPv6 literal.
While this would be a rather clean solution, it would require to changes components/http_parser/http_parser.c, in particular tracking the occurance of s_http_host_v6_start.
Such a solution would be similar to Curl, which sets an IPv6 IP flag.
However, for ESP IDF such a solution would mean to adjust the HTTP parser (which, to the best of my knowledge, is an outdated version originating from the now unmaintained Node.js HTTP parser).
Furthermore, it would not handle the case of providing config->host (instead of config->url) for the esp_http_client_init call.

Related

There have been similar issues in other projects, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=45891

Testing

Without the fix, a request to a lighttpd) server fails.
With the fix, it succeeds.
Using network package capturing, the fixed host header can be seen:
esp_http_client_ipv6_pcap


Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • All CI checks (GH Actions) pass.
  • Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • Code is well-commented, especially in complex areas.
  • Git history is clean — commits are squashed to the minimum necessary.

An IPv6 IP that occurs in the 'Host:' header of an HTTP request must be enclosed
in square brackets (RFC3986 section 3.2.2).

Searches for ':' in the host string to efficiently determine if the host is an
IPv6 IP address.
Copy link

github-actions bot commented Feb 13, 2025

Messages
📖 🎉 Good Job! All checks are passing!

👋 Hello thelazt, we appreciate your contribution to this project!


📘 Please review the project's Contributions Guide for key guidelines on code, documentation, testing, and more.

🖊️ Please also make sure you have read and signed the Contributor License Agreement for this project.

Click to see more instructions ...


This automated output is generated by the PR linter DangerJS, which checks if your Pull Request meets the project's requirements and helps you fix potential issues.

DangerJS is triggered with each push event to a Pull Request and modify the contents of this comment.

Please consider the following:
- Danger mainly focuses on the PR structure and formatting and can't understand the meaning behind your code or changes.
- Danger is not a substitute for human code reviews; it's still important to request a code review from your colleagues.
- To manually retry these Danger checks, please navigate to the Actions tab and re-run last Danger workflow.

Review and merge process you can expect ...


We do welcome contributions in the form of bug reports, feature requests and pull requests via this public GitHub repository.

This GitHub project is public mirror of our internal git repository

1. An internal issue has been created for the PR, we assign it to the relevant engineer.
2. They review the PR and either approve it or ask you for changes or clarifications.
3. Once the GitHub PR is approved, we synchronize it into our internal git repository.
4. In the internal git repository we do the final review, collect approvals from core owners and make sure all the automated tests are passing.
- At this point we may do some adjustments to the proposed change, or extend it by adding tests or documentation.
5. If the change is approved and passes the tests it is merged into the default branch.
5. On next sync from the internal git repository merged change will appear in this public GitHub repository.

Generated by 🚫 dangerJS against cc33a32

@github-actions github-actions bot changed the title fix(esp_http_client): Fix host header for IPv6 address literal fix(esp_http_client): Fix host header for IPv6 address literal (IDFGH-14640) Feb 13, 2025
@espressif-bot espressif-bot added the Status: Opened Issue is new label Feb 13, 2025
@safocl
Copy link
Contributor

safocl commented Feb 13, 2025

and additional dependencies (lwip).

isn't lwip part of the network stack for esp32?
https://github.com/espressif/esp-idf/tree/master/components/lwip

@thelazt
Copy link
Author

thelazt commented Feb 14, 2025

isn't lwip part of the network stack for esp32?

Yes, the function in question is implemented in components/lwip/lwip/src/api/sockets.c - my comment referred to the includes of the esp_http_client.c file, but this point may indeed be rather negligible.

My main concern with this alternative approach is whether the higher overhead of inet_pton over strchr is worth it, since this call is made on every HTTP request:

if (host != NULL && host[0] != '[' && strchr(host, ':') != NULL)

vs

char tmp[INET6_ADDRSTRLEN];
if (inet_pton(AF_INET6, host, tmp))

I cannot see any valid real-world cases where this would make a difference, and since GNU Wget uses the same strchr check, I think this simpler and leaner approach is sufficient.

@safocl
Copy link
Contributor

safocl commented Feb 14, 2025

since this is a static function -- maybe it's worth checking somewhere higher in the API?

@safocl
Copy link
Contributor

safocl commented Feb 14, 2025

esp_http_client_config_t config = {
.url = "http://[2001:db8::1]/path",
};

if config with a "http://\[2001:db8::1\]/path" url string?

perhaps it is simply necessary to change the documentation, indicating that the user should monitor the valid value of this entity

@thelazt
Copy link
Author

thelazt commented Feb 14, 2025

if config with a "http://\[2001:db8::1\]/path" url string?

perhaps it is simply necessary to change the documentation, indicating that the user should monitor the valid value of this entity

This does not solve the problem:
Since the URL http://\[2001:db8::1\]/path is invalid, it will not pass the project's HTTP parser: http_parser_parse_url (inside the esp_http_client_set_url call) will return an error and the HTTP client initialization will fail.

For the (valid) URL http://[2001:db8::1]/path, the HTTP parser will return the following values:

  • UF_SCHEMA = http
  • UF_HOST = 2001:db8::1
  • UF_PATH = /path

The value of UF_HOST will be assigned to client->connection_info.host -- it is crucial that the value does not contain square brackets, since it is also used for establishing the transport connection.
We have also verified this part on the ESP: We get errors in the log and no packets recorded by tcpdump.

But there are additional problems along the way:
Even if we extend the HTTP parser by introducing a new HTTP parser URL field that contains the IPv6 address including the square brackets (we have implemented this as well), we have to store this host header value in an additional field in connection_info_t (requiring additional memory to be allocated just to cover the case of an IPv6 address as host).

Furthermore, this will not solve the case of calling the HTTP client with host and path instead of URL, as no HTTP parser will be called.

With

esp_http_client_config_t config = {
    .host = "2001:db8::1",
    .path = "/path"
};

the _get_host_header will not contain the square brackets.

If we use

esp_http_client_config_t config = {
    .host = "[2001:db8::1]",
    .path = "/path"
};

instead, the HTTP header would be correct, but this config will not succeed in establishing a connection to the host (because esp_transport does not handle hosts with square brackets).

since this is a static function -- maybe it's worth checking somewhere higher in the API?

With the findings from above, I am convinced that there is no appropriate place to integrate this somewhere higher in the API.
Also, I don't think there is any way to send a proper HTTP request using an IPv6 address with the current ESP IDF, so tweaking the documentation will not prevent this problem.

Instead, this is a bug in the code that is probably unlikely to occur often (as some web servers may be quite forgiving of bad headers, and using IPv6 addresses for HTTP is not that common).
However, since we discovered the bug in real-world scenarios, and the proposed fix is quite simple and cheap (in terms of overhead), I think it is worth considering merging it.

By the way: To test the results of the HTTP parser, one can use the following snippet - with URLs to test as CLI parameters.

#include <stdio.h>
#include <string.h>
#include "http_parser.h"

#define xstr(s) str(s)
#define str(s) #s

int main(int argc, const char * argv[]) {
	printf("HTTP Parser version: %d.%d.%d\n", HTTP_PARSER_VERSION_MAJOR, HTTP_PARSER_VERSION_MINOR, HTTP_PARSER_VERSION_PATCH);
	for (int i = 1; i < argc; i++) {
		struct http_parser_url purl;
		const char * url = argv[i];
		printf("\nURL: '%s'", url);
		if (http_parser_parse_url(url, strlen(url), 0, &purl) != 0) {
			puts(" could not be parsed!");
			continue;
		} else {
			puts(":");
		}
#define printfield(NAME) do { \
		if (purl.field_set & (1 << NAME)) { \
			printf(" - " xstr(NAME) " = \"%.*s\"\n", purl.field_data[NAME].len, url + purl.field_data[NAME].off); \
		} \
	} while (0)
		printfield(UF_SCHEMA);
		printfield(UF_HOST);
		printfield(UF_PORT);
		printfield(UF_PATH);
		printfield(UF_QUERY);
		printfield(UF_FRAGMENT);
		printfield(UF_USERINFO);
	}
	return 0;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants