Plugin processing leaves garbage at end of shrinked files #69

Open
RomanSek opened this issue May 14, 2015 · 2 comments


@RomanSek

There's an issue with how HTTrack handles responses from a plugin. When the postprocess callback makes a file smaller than the original, HTTrack accepts the pointer to the new file string but ignores the new length. This leaves garbage at the end of files and occasionally causes a crash (the processed file ends up with the same length as the unprocessed file).
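
The symptom can be reproduced outside HTTrack with a small, self-contained sketch. It assumes nothing about the engine's internals: `emit_bytes` and `shrink_and_emit` are hypothetical helpers invented for illustration, and the sample page content is made up. The point is only that writing a shrunk buffer with the stale length drags the old tail along.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a writer that trusts a length value.
 * Copies used_len bytes of buf into a fresh NUL-terminated string.
 * This is not HTTrack code. */
char *emit_bytes(const char *buf, size_t used_len) {
  char *out = malloc(used_len + 1);
  assert(out != NULL);
  memcpy(out, buf, used_len);
  out[used_len] = '\0';
  return out;
}

/* Simulates a postprocess step that shrinks a page in place, then emits it
 * with either the new length (honor_new_len != 0) or the stale one.
 * The caller frees the returned string. */
char *shrink_and_emit(int honor_new_len) {
  char page[] = "<html><!-- ad --><body>hi</body></html>";
  const char *shrunk = "<html><body>hi</body></html>";
  size_t old_len = strlen(page);
  size_t new_len = strlen(shrunk);

  /* Overwrite the front of the buffer; the old tail bytes stay in place. */
  memcpy(page, shrunk, new_len);

  return emit_bytes(page, honor_new_len ? new_len : old_len);
}
```

With `honor_new_len` set, the result is exactly the shrunk page. With the stale length, the result has the original size and ends with leftover bytes from the unprocessed page.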

It looks like this bug was introduced in version 3.48.12 (3.48.11 works as expected).

I can provide my plugin if it helps, but it's integrated with Python, so building it requires additional dependencies.

@RomanSek

RomanSek commented Apr 24, 2017

@xroche I think I fixed this issue in my plugin, but I want to make sure I handle changes correctly. This bug affects the postprocess callback.
Previously I used

*len = strlen(changedHtml);
*html = (char *) hts_realloc(*html, *len + 1);
strcpy(*html, changedHtml);

in the callback function. After version 3.48.11 this stopped working because a condition check was added to the engine code (seen in version 3.48.21, httrack/src/htsparse.c:3343): if (cAddr != TypedArrayElts(output_buffer)) {. When realloc didn't change the pointer address, the engine didn't use the plugin output.
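
To illustrate why an address-only comparison misses an in-place change, here is a simplified model. It is not HTTrack's source: `postprocess_fn`, `shrink_in_place`, and `length_after_callback` are hypothetical names, and the `compare_length_too` switch is just one conceivable way to detect the change, not necessarily what any actual fix does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback shape: may replace the buffer and/or its length. */
typedef void (*postprocess_fn)(char **html, size_t *len);

/* A callback that shrinks the content without moving the buffer, which is
 * what happens when a reallocation keeps the same address. */
void shrink_in_place(char **html, size_t *len) {
  static const char replacement[] = "<p>ok</p>";
  size_t n = sizeof replacement - 1;
  assert(n <= *len);
  memcpy(*html, replacement, n);
  *len = n;
}

/* Simplified model of the engine side (an assumption, not real code).
 * Returns the length the engine would go on to use after the callback. */
size_t length_after_callback(char *buf, size_t len, postprocess_fn cb,
                             int compare_length_too) {
  char *addr = buf;
  size_t new_len = len;

  cb(&addr, &new_len);

  if (addr != buf || (compare_length_too && new_len != len)) {
    return new_len; /* plugin output is adopted */
  }
  return len; /* same address: output treated as unchanged */
}
```

With the address-only check, the shrunk content is in the buffer but the old length is still used. Also comparing the length lets the model pick up the in-place change.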

I have now changed the plugin code to:

*len = strlen(changedHtml);
*html = hts_strdup(changedHtml);

I don't free the previous html address. Is this the preferred method? Let me know and I can prepare example plugin code.

I also noticed another issue: in the postprocess callback, the *html string no longer ends with a \0 character after the change, so the *len value is the only way to know the correct string length. Using strlen or strcpy gives incorrect results. Was this intended?
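
A defensive way to work with a buffer that may lack a terminator is to copy exactly *len bytes and terminate the copy yourself. A minimal sketch, assuming only standard C; `copy_with_length` is a hypothetical helper, not part of the HTTrack API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a NUL-terminated heap copy of the first len bytes of buf, or NULL
 * if allocation fails. Never reads past buf[len - 1], so it is safe on input
 * without a terminating \0. The caller frees the result. */
char *copy_with_length(const char *buf, size_t len) {
  char *copy;

  assert(buf != NULL);
  copy = malloc(len + 1);
  if (copy == NULL) {
    return NULL;
  }
  memcpy(copy, buf, len);
  copy[len] = '\0';
  return copy;
}
```

Inside a callback this would be called as `copy_with_length(*html, *len)`, after which strlen and friends are safe to use on the copy.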

@RomanSek

RomanSek commented May 4, 2017

Using hts_strdup caused a memory leak, so I reverted my plugin to the previous version, which uses hts_realloc.
I also created a PR with a fix.
