Allowed memory size of 512MB bytes exhausted during diff #209

Open
danielmarschall opened this issue Dec 19, 2023 · 8 comments

@danielmarschall
Contributor

danielmarschall commented Dec 19, 2023

Looking through my log files with the latest version, 2.8.3, I noticed the following problem:

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 135168 bytes) in diff_util.php on line 68

The URL is:
https://svn.viathinksoft.com/websvn/diff.php?path=/trunk/bin/webreader.phar&repname=vnag&rev=88

In case you need the SVN working copy:
svn co https://svn.viathinksoft.com/svn/vnag

It is a bit surprising because webreader.phar is a rather small file. However, it is a binary file, so I guess the diff util goes crazy?

@k10blogger
Member

See #128.
I did not get time to work on it: after I raised the memory limit the problem did not recur, so it was not prioritized.

@michael-o
Member

Stupid question: does using diff(1) make any difference, or is it just the processing of a unified diff that is problematic in PHP itself?

@danielmarschall
Contributor Author

@michael-o You mean the "diff" command in Linux? I think it does not allow binary files to be diffed. I guess the PHP script levenshtein2 takes too much memory because the files are too hard to diff: they are either binary or they differ too much.

@michael-o
Member

@danielmarschall
Contributor Author

@michael-o I don't know the internals of the algorithm, so I cannot help much.

About the PHAR files: I call them binary because they contain non-text control characters in between. To be fair, a large portion of a PHAR file is text, though.

I do not know why diff.php crashes and comp.php does not.

@danielmarschall
Contributor Author

@michael-o I wanted to ask if you have news about this. I regularly get "DOS" attacks by search engines that call these URLs.

@michael-o
Member

michael-o commented Nov 4, 2024

> @michael-o I wanted to ask if you have news about this. I regularly get "DOS" attacks by search engines that call these URLs.

No, I haven't, but a few points come to mind:

  • Mark the files as binary and exempt from diffs
  • Try diff(1) instead of Horde_Diff:

    websvn/include/diff_inc.php

    Lines 342 to 349 in eb6cf4e

    function do_diff($all, $ignoreWhitespace, $highlighted, $newtname, $oldtname, $newhlname, $oldhlname) {
        if ((!$ignoreWhitespace ? class_exists('Horde_Text_Diff') : class_exists('Horde_Text_Diff_Mapped'))
            && class_exists('Horde_Text_Diff_Renderer_Unified')) {
            return inline_diff($all, $ignoreWhitespace, $highlighted, $newtname, $oldtname, $newhlname, $oldhlname);
        } else {
            return command_diff($all, $ignoreWhitespace, $highlighted, $newtname, $oldtname, $newhlname, $oldhlname);
        }
    }
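
For the first point, one possible way to treat a file as binary is sketched below. This is only an illustration, not WebSVN code: `looks_binary()` and its `$sampleBytes` parameter are hypothetical names, and the heuristic (a NUL byte in the first few kilobytes) is an assumption about what should count as "binary". A PHAR with a long plain-text stub might need a larger sample before its non-text part shows up.

```php
<?php
// Hypothetical helper, not part of WebSVN: guess whether a file is binary
// by looking for a NUL byte in the first $sampleBytes bytes.
function looks_binary(string $path, int $sampleBytes = 8192): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        // Unreadable file: report "not binary" and leave error handling to the caller.
        return false;
    }

    $sample = fread($handle, $sampleBytes);
    fclose($handle);

    // Text files normally contain no NUL bytes; most binary formats do.
    return $sample !== false && strpos($sample, "\0") !== false;
}
```

A caller could run this check on both files before choosing a diff path and show a "binary file, no diff available" notice instead of diffing. That assumes `$newtname` and `$oldtname` in `do_diff()` are paths to the two files being compared, which the excerpt suggests but does not confirm.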
