Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix running Grav under Load Balancer with custom_base_url #3821

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

ViliusS
Copy link
Contributor

@ViliusS ViliusS commented Apr 24, 2024

Previous change broke running Grav under load balancer with custom_base_url set. All dynamic urls (pages, admin, etc.), returns 404. All static content is not affected.

dump($bits) just after this parse_url()
Original Grav 1.7.45 code accessed directly from http://localhost:8080/korteles/kodai

array:3 [▼
  "scheme" => "http"
  "host" => "domain.com"
  "path" => "/korteles/kodai"
]

Result - OK.

Grav 1.7.45 with this PR applied accessed directly from http://localhost:8080/korteles/kodai

array:1 [▼
  "path" => "/korteles/kodai"
]

Result - OK.

Original Grav 1.7.45 code accessed from load balancer https://mydomain.example.com/test/korteles/kodai (custom_base_url: /test)

array:3 [▼
  "scheme" => "http"
  "host" => "domain.comhttps"
  "path" => "//mydomain.example.com/korteles/kodai"
]

Result - 404 or even some nasty httpd cycles, depending on the configuration of load calancer.

Grav 1.7.45 with this PR applied accessed from load balancer https://mydomain.example.com/test/korteles/kodai (custom_base_url: /test)

array:1 [▼
  "path" => "/korteles/kodai"
]

Result - OK.

Load balancer is configured to rewrite /test/(.*) to /$1, and I ensured that it is reaching Grav under correct path. The patch probably also fixes #3511 .

Also, it's not a good idea to add 3rd party domain into host value, which could be accidently reused some time in the future.

…etgrav#2184]"

Previous change broke running Grav under Load Balancer with custom_base_url
set. Also, it's not a good idea to add 3rd party domain into host value,
which could be accidently reused some time in the future.

This (not counting CHANGELOG) reverts commit 0af3385.
@ViliusS
Copy link
Contributor Author

ViliusS commented Apr 24, 2024

P.S. I'm not sure why that one test case for ><script>alert</script>://localhost fails, but given FIXME comments in the code it probably never worked as it should. Since I don't know what it is testing against I did not touch it, but if you agree that that case is flawed I could just remove it or comment out leaving the same FIXME comments.

@rhukster
Copy link
Member

rhukster commented May 7, 2024

The domain.com domain is used only to force the domain to be populated so that the $uri an be parsed as a path, and it's there specifically to ensure that host header based attacks are not possible. This is also the reason for the script/localhost tests.

With your change, it makes Grav vulnerable again to host-based attacks as it then allows the poor parsing in PHP's parse_url() to be fooled by escaped script tags. The approach of using domian.com is simply to ensure that will not work. the host and scheme are not used after this point, that's why it's not an issue. However, i'm still not sure why your custom_base_url is getting tripped up. I think its something else going on.

@ViliusS
Copy link
Contributor Author

ViliusS commented May 7, 2024

But if host part is not used anywhere adding 'http://domain.com' is actually worse. For standard configuration it does populate host part and leaves path clean, however for load balancer case mentioned above I get:

Without patch calling https://mycustom.example.com/custom_base//www.fake.com/pagea/subpageb

array:3 [▼
  "scheme" => "http"
  "host" => "domain.comhttps"
  "path" => "//mycustom.example.com/www.fake.com/pagea/subpageb"
]

With patch calling https://mycustom.example.com/custom_base//www.fake.com/pagea/subpageb

array:3 [▼
  "scheme" => "https"
  "host" => "mycustom.example.com"
  "path" => "/www.fake.com/pagea/subpageb"
]

Nevertheless, I'm still not sure what you mean by "vulnerable to host based attacks". If you mean that by adding http://domain.com it will protect from SSRF type of attacks I don't think it will. There are multiple other techniques how to exploit parse_url() behaviour.
Reading through various mailing lists I see that these issues were fixed in PHP and cURL years ago https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-10397 though. Maybe original issue was tested just against those vulnerable versions?

On top of that, PHP documentation clearly states that parse_url() cannot be trusted for URL validation, especially relative URLs, and additional measures, such as filter_var(), should be added. Or better yet, just compare the beginning of the path/header/whatever to the current absolute URL of the website to make sure you are actually redirecting to yourself.

Thus, I still think adding domain which is controlled by a third-party to your code is not a proper fix. It is dangerous for lot of reasons big and small, e.g. https://medium.com/@maciej.pocwierz/how-an-empty-s3-bucket-can-make-your-aws-bill-explode-934a383cb8b1 At least it should have been http://domain.invalid or any other non routable registered domain.

@ViliusS
Copy link
Contributor Author

ViliusS commented May 7, 2024

P.S. I also found custom parseUrl() method in Grav code itself. I see that it is used in other places, but not here. Not sure what is the purpose of it, but maybe it can be reused for better results.

@rhukster
Copy link
Member

rhukster commented May 8, 2024

So I added a test to the current 'develop' branch:

    public function testCustomBase(): void
    {
        $current_base = $this->config->get('system.custom_base_url');
        $this->config->set('system.custom_base_url', '/test');
        $this->uri->initializeWithURL('https://mydomain.example.com:8090/test/korteles/kodai%20something?test=true#some-fragment')->init();

        $this->assertSame([
          "scheme" => "https",
          "host" => "mydomain.example.com",
          "port" => 8090,
          "user" => null,
          "pass" => null,
          "path" => "/korteles/kodai%20something",
          "params" => [],
          "query" => "test=true",
          "fragment" => "some-fragment",
        ], $this->uri->toArray());

        $this->config->set('system.custom_base_url', $current_base);
    }

**This passes as-is without your patch.. **

You can't simply dump the $bits variable as that's not the complete URI object.

I beleive the current functionality is working as expected. Are you expecting some different?

@ViliusS
Copy link
Contributor Author

ViliusS commented May 8, 2024

I'm not sure what should be put into initializeWithURL() but keep in mind, that https://mydomain.example.com/test/korteles/kodai gets rewritten by the load balancer to https://mydomain.example.com/korteles/kodai, only then processed by Grav.
If I dump $uri object before parse_url() part I get https://mydomain.example.com/korteles/kodai, so I expect that parse_url() will transfer that to path /korteles/kodai.

@ViliusS
Copy link
Contributor Author

ViliusS commented May 8, 2024

Maybe this will be clearer what I'm trying to do:
image

This basically serves as an automated infrastructure for different versions of the same Grav website to preview changes on infinite Git branches before these changes go into production. custom_base_url is basically IDs for different website branch. Setup is pretty simple, and as I said static elements (JS, CSS, Images) works just fine. It's just dynamic content which goes into redirect loop without my patch.

@ViliusS
Copy link
Contributor Author

ViliusS commented May 8, 2024

P.S. I have tested your changes in c97a0ff . It still doesn't work. I get 404 for all pages. Dumped objects in the screenshot is $uri before static::parseUrl() ant $bits after it:

image

@ViliusS
Copy link
Contributor Author

ViliusS commented Aug 4, 2024

Is there anything else I could do to progress this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

custom_base_url vs assets base-url and page lookup
2 participants