Description
AutoGPT uses a wrapper around the `requests` Python library, located in autogpt_platform/backend/backend/util/request.py. In this wrapper, redirects are specifically NOT followed for the first request. If the wrapper is used with `allow_redirects` set to True (which is the default), a redirect is not followed by the initial request; instead, the wrapper re-requests the new location itself. The code is shown below:
```python
# Perform the request with redirects disabled for manual handling
response = req.request(
    method,
    url,
    headers=headers,
    allow_redirects=False,
    *args,
    **kwargs,
)
if self.raise_for_status:
    response.raise_for_status()

# If allowed and a redirect is received, follow the redirect
if allow_redirects and response.is_redirect:
    if max_redirects <= 0:
        raise Exception("Too many redirects.")
    location = response.headers.get("Location")
    if not location:
        return response
    new_url = validate_url(urljoin(url, location), self.trusted_origins)
    if self.extra_url_validator is not None:
        new_url = self.extra_url_validator(new_url)
    return self.request(
        method,
        new_url,
        headers=headers,
        allow_redirects=allow_redirects,
        max_redirects=max_redirects - 1,
        *args,
        **kwargs,
    )
```
However, there is a fundamental flaw in manually re-requesting the new location: it does not account for security-sensitive headers that should not be sent cross-origin, such as the `Authorization` and `Proxy-Authorization` headers, and cookies.

For example, in autogpt_platform/backend/backend/blocks/github/_api.py, an `Authorization` header is set when retrieving data from the GitHub API. If GitHub suffered from an open redirect vulnerability (such as the made-up example of https://api.github.com/repos/{owner}/{repo}/issues/comments/{comment_id}/../../../../../redirect/?url=https://joshua.hu/), and the script could be coerced into visiting it with the `Authorization` header set, the GitHub credentials in that header would be leaked to https://joshua.hu/.
The standard `requests` library does not suffer from this vulnerability: when it follows a redirect, it strips the `Proxy-Authorization` and `Authorization` headers if the redirect crosses origins (different domain, protocol, or port). Cookies are likewise not blindly re-transmitted, as they follow the standard cookiejar format.
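Recent versions of `requests` expose this decision in `Session.should_strip_auth()`; it is an internal helper, so treat this as illustrative, but it makes the behavior easy to observe directly:

```python
import requests

s = requests.Session()

# Same origin: credentials survive the redirect
print(s.should_strip_auth("https://api.github.com/a", "https://api.github.com/b"))  # False

# Cross-origin: requests drops the Authorization header
print(s.should_strip_auth("https://api.github.com/a", "https://joshua.hu/"))  # True
```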
Proof of Concept
No matter how the `request()` wrapper function is used, whether via the aforementioned GitHub code in AutoGPT or when `request()` is invoked by the AI itself, any cross-origin redirect will leak whatever private information has been set in the headers or cookies. This could mean that users' secrets are leaked, or the server's.
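A minimal sketch of the difference, using a hypothetical endpoint https://vulnerable.example/redirect that 302s to an attacker-controlled origin (the wrapper itself is not invoked here; the first snippet just reproduces its manual re-request pattern):

```python
import requests

headers = {"Authorization": "Bearer SECRET"}

# The wrapper's pattern: the original headers dict is re-sent verbatim
# to whatever host the Location header points at.
resp = requests.get(
    "https://vulnerable.example/redirect", headers=headers, allow_redirects=False
)
if resp.is_redirect:
    # Authorization crosses origins here and is leaked
    requests.get(resp.headers["Location"], headers=headers)

# Built-in handling: requests follows the redirect itself and strips
# Authorization when the target is a different origin.
requests.get("https://vulnerable.example/redirect", headers=headers)
```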
Impact
Leak of auth headers and private cookies.

Description
AutoGPT is built with a wrapper around Python's `requests` library that hardens the application against SSRF. The code for this wrapper can be found in autogpt_platform/backend/backend/util/request.py.

The hostname of any requested URL is validated to ensure it does not resolve to a local IPv4 or IPv6 address. This can be seen below in the `validate_url()` function:
```python
def validate_url(url: str, trusted_origins: list[str]) -> str:
    # ....
    # Resolve all IP addresses for the hostname
    try:
        ip_addresses = {res[4][0] for res in socket.getaddrinfo(ascii_hostname, None)}
    except socket.gaierror:
        raise ValueError(f"Unable to resolve IP address for hostname {ascii_hostname}")

    if not ip_addresses:
        raise ValueError(f"No IP addresses found for {ascii_hostname}")

    # Block any IP address that belongs to a blocked range
    for ip_str in ip_addresses:
        if _is_ip_blocked(ip_str):
            raise ValueError(
                f"Access to blocked or private IP address {ip_str} "
                f"for hostname {ascii_hostname} is not allowed."
            )

    return url
```
However, this check is not sufficient: an attacker-controlled DNS server may initially respond with a non-blocked address carrying a TTL of 0. The initial resolution therefore appears benign, and `validate_url()` returns the URL as valid.

After `validate_url()` has successfully returned the URL, the URL is passed to the real `request()` function:
```python
def request(
    self,
    method,
    url,
    headers=None,
    allow_redirects=True,
    max_redirects=10,
    *args,
    **kwargs,
) -> req.Response:
    # Merge any extra headers
    if self.extra_headers is not None:
        headers = {**(headers or {}), **self.extra_headers}

    # Validate the URL (with optional extra validator)
    url = validate_url(url, self.trusted_origins)
    if self.extra_url_validator is not None:
        url = self.extra_url_validator(url)

    # Perform the request with redirects disabled for manual handling
    response = req.request(
        method,
        url,
        headers=headers,
        allow_redirects=False,
        *args,
        **kwargs,
    )
```
When the real `request()` function is called with the validated URL, it resolves the hostname again, because the TTL of 0 means the first answer was never cached. This second resolution may land in the blocked range. This type of attack is known as a DNS rebinding attack.
Proof of Concept
http://1u.ms/ offers a useful tool which we can use as a proof of concept:
```console
$ dig +short @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms
1.2.3.4
$ dig +short @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms
169.254.169.254
```
Note that we resolve the exact same name twice, yet two different records are returned. The TTL is visible in the full dig output:
```console
$ dig @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms

; <<>> DiG 9.10.6 <<>> @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7294
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms. IN A

;; ANSWER SECTION:
make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms. 0 IN A 169.254.169.254   # TTL is 0

;; Query time: 468 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Mar 25 03:00:26 AEST 2025
;; MSG SIZE  rcvd: 89

$ dig @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms

; <<>> DiG 9.10.6 <<>> @8.8.8.8 make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14205
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms. IN A

;; ANSWER SECTION:
make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms. 0 IN A 1.2.3.4   # TTL is 0

;; Query time: 492 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Mar 25 03:00:29 AEST 2025
;; MSG SIZE  rcvd: 89
```
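The same race can be reproduced from Python (a sketch; whether two back-to-back lookups actually return different answers depends on the local resolver honoring the zero TTL):

```python
import socket

host = "make-1.2.3.4-rebind-169.254-169.254-rr.1u.ms"

# First lookup: what validate_url() would see and approve
first = {res[4][0] for res in socket.getaddrinfo(host, None)}

# Second lookup: what requests performs internally; the TTL-0 answer was
# never cached, so the rebinding server is free to swap in a blocked address
second = {res[4][0] for res in socket.getaddrinfo(host, None)}

print(first, second)  # e.g. {'169.254.169.254'} {'1.2.3.4'}
```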
A simple fix: have the `validate_url()` function return the list of IP addresses it validated, rewrite the URL to replace the hostname with one of those validated addresses, and set the `Host` header on the request directly. This way you are assured of accessing http://1.2.3.4/ with the HTTP header `Host: example.com`, rather than letting `requests` perform its own DNS resolution for example.com. To ensure that HTTPS certificates are still checked against the proper hostname, you can use something like this:
```python
import ssl

import requests
from requests.adapters import HTTPAdapter
from urllib3.poolmanager import PoolManager


class HostHeaderSSLAdapter(HTTPAdapter):
    """Adapter that connects to an IP address but validates TLS for a different host."""

    def __init__(self, ssl_hostname, *args, **kwargs):
        self.ssl_hostname = ssl_hostname
        super().__init__(*args, **kwargs)

    def init_poolmanager(self, *args, **kwargs):
        context = ssl.create_default_context()
        kwargs["ssl_context"] = context
        kwargs["server_hostname"] = self.ssl_hostname  # This works for urllib3>=2
        self.poolmanager = PoolManager(*args, **kwargs)


# URL with the (already validated) IP address
url = "https://142.250.70.142/"

# Create a session and mount the custom adapter
session = requests.Session()
adapter = HostHeaderSSLAdapter("google.com")
session.mount("https://", adapter)

# Send the request with the proper Host header
headers = {"Host": "google.com"}
response = session.get(url, headers=headers, allow_redirects=False)

print("Status Code:", response.status_code)
print("Response Headers:", response.headers)
```
Impact
All SSRF protection is bypassable; depending on the deployment, this could allow querying local services or other previously blocked addresses.