-
Notifications
You must be signed in to change notification settings - Fork 136
Switch from PhantomJS to headless Chrome #96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
domenic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this; I'm very excited!
I left some code style comments, plus a suggestion for overhauling the conversion process. To be clear, we used data URLs because that was the easiest way to get things between processes with PhantomJS. My understanding of the Puppetteer API is that it's much better, so we should use that and abandon the data URL business.
3/4/5.png are a bit worrying, although maybe it's the fonts installed on our different machines and that test should just use a common font instead...
We can get still sync by using process.spawnSync, just at one level removed. We write our own wrapper process that uses svg2png's async version, reading from stdin and writing to stdout, and then we spawnSync node with that file.
bin/svg2png-cli.js
Outdated
| "use strict"; | ||
| const fs = require("fs"); | ||
|
|
||
| const { EOL } = require("os"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use OS EOLs. \n everywhere is much preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. I was trying to be cross-platform friendly, but I guess it's a bit out-of-scope anyway. I'll change back.
lib/converter.js
Outdated
| .then(() => createPNGDataURL(page)); | ||
| }; | ||
|
|
||
| function createPNGDataURL(page) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the pipeline here is to draw the SVG to a canvas then get a data URL for a png and then base64 decode it into binary? That seems terrible :(.
Why can't we just use an <img src="svg.svg"> (or navigate Chromium to the SVG directly) and resize the window appropriately, then use Pupetteer's screenshot API?
We should also be able to avoid a lot (not all) of the SVG dimensions stuff if we can just use the input width and height to set the window width/height. Remember that per https://github.com/domenic/svg2png#exact-resizing-behavior the intent is "to stick as close to the behavior of loading a SVG file in your browser as possible"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with your suggestion of having <img src="svg.svg"> is that, when the API is directly passed a SVG buffer, we'd have to write it to a temporary file so that it could be imported in such a way. I initial ignored the screenshot approach as I assumed it would break for SVGs containing transparency, but I may have been wrong after looking at their docs.
The temporary file writing could impact performance, but I'm not sure how important that is for this library. I doubt anyone is using it in application "hot paths". I assume it's mostly used in build processes or ad-hoc commands. I could write it in a way that the API reads directly from a file when a string is passed, otherwise treats it as a buffer and writes it to a temporary file.
I'll investigate the screenshot approach further and see how I get on. However, I'm not convinced that this will work around the issue of external files in cases where a buffer is passed directly to the API, but I'll check that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In another project (using chromy), I've had success passing the SVG as a data-uri to open/navigate to; also screenshotting with svg selector works well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point, I would prefer to avoid temporary files. I'm not sure what the right path is then; data URL, or write the bytes directly to canvas? (via Buffer -> Uint8Array -> ArrayBuffer -> Uint8ClampedArray, none of which involve a copy, just wrapping new types around the same binary data)
lib/svg2png.js
Outdated
| options = parseOptions(options); | ||
|
|
||
| return cp.promise.then(processResult); | ||
| return puppeteer.launch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just use async/await. Also, please never use Promise.all in this fashion (i.e. with one thing that's a promise and one thing that's not, just so that you can pass them around).
I also see no reason why convert returns the browser that it was given. We can just keep using the same one throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was very tempted to use async/await but I held myself back since your README explicitly stated support for Node.js 6, which wouldn't be possible if we adopted async/await. If you're happy to drop support for Node.js 6, I can rewrite using async/await and it will be much more readable.
Without using async/await I was trying to avoid holding references/state outside of the promise flow as I felt that was an anti-pattern. If we're going to use async/await, there will be no need for these. Otherwise, I'll rewrite to store references at the function level to avoid passing references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, let's drop Node 6; this will be a major version bump anyway.
lib/svg2png.js
Outdated
|
|
||
| return Promise.all([page, page.setContent(svg)]); | ||
| }) | ||
| .then(([page]) => Promise.all([converter.convert(page, options), browser])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above please use async/await and never use Promise.all in this fashion.
test/integration.js
Outdated
| const rimraf = require("rimraf"); | ||
| const chai = require("chai"); | ||
| const chaiAsPromised = require("chai-as-promised"); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, please don't include these changes here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. I've disabled it in my IDE for this project now. Sorry about that.
test/integration.js
Outdated
|
|
||
| chai.use(chaiAsPromised); | ||
| const expect = require("chai").expect; | ||
| const expect = chai.expect; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or this change
| const path = require("path"); | ||
|
|
||
| const fileURL = require("file-url"); | ||
| const path = require("path"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or these
test/success-tests/rebaseline.js
Outdated
| /* eslint-disable no-console */ | ||
|
|
||
| const path = require("path"); | ||
| const { EOL } = require("os"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As before, no horrible Windows EOLs in my projects. Please revert all changes to this file.
|
I've taken a look at your comments and believe I have addressed them. I've reverted unwanted/accidental changes, changed to the However, I'm still facing issues with external files. I'm going to try and play around with using temporary files to better emulate what it's like when opened in the browser. I believe the issue I'm facing may be because the location of the default page is |
|
OK. I think I've fixed the issue with external files. A temporary HTML file is created and the SVG contents are written to it (along with some boilerplate) and I simply tell puppeteer to navigate to that page. Output now includes external files. However, I don't see any existing tests for remote files (e.g. I don't fancy adding a unit test that depends on a remote file though. That's just asking for trouble. Thoughts? I'll try to find some time soon to test this further, but I'm hoping you can continue your review. I've also left the examples within the readme as-is. I'll let you rewrite them if you feel that it's needed (e.g. in I'm also not very keen on adding the sync approach that you described. Seems very hacky and I'm not sure what benefit it has. Surely those wanting such a hack could do it themselves and we can keep our code lean and clean? As for the fonts; I believe they're different because Chromium is handling fonts correctly and PhantomJS was not, but I'm guessing here. |
|
Just did some testing and remote file references work as well. I'm happy with the code and functionality. Let me know if you find any issues or want anything else changed. |
lib/svg2png.js
Outdated
| const start = svg.indexOf("<svg"); | ||
|
|
||
| let html = HTML_PREFIX; | ||
| const baseUri = options.url ? options.url : "http://localhost/"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
url, not uri
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get this fixed.
I was actually wondering if the API would be simpler if we simply had a single base option. It might be good if this was exposed as a CLI option as I can imagine it'd be useful for converting files downloaded online. But that's a story for another PR, I guess.
|
I do want to get the .sync functionality back; it should be this library's job to encapsulate such ugliness, and not make users do it. I'm OK with the temporary file approach if it's truly the only way, but I would have thought Puppetteer would have had APIs for loading some content directly from a string, not just from a file. If that's possible that would be nice... if not, maybe we should open an issue on Puppetteer (while still merging this). The Any idea why 18/19.png changed? Not a problem (this stuff is weird), just curious if you happened to have any ideas. My hope is to merge this and apply any of my own tweaks over the weekend and do a release then. \o/ |
|
Would you mind if I omitted the Puppeteer does support setting HTML content of the page, however, it was failing to load external files when done this way. There is a request interceptor API that I was playing around with when I was trying the data URL approach but that wasn't working for some reason (was not being triggered for resources loaded within the SVG). That said, I can give it another shot now that we're using the screenshot approach as it could possible allow us to override requested URLs to be relative to the I might let you play around with the dimensions, if you don't mind. The viewport sizing and clip settings were needed to make it work properly though. All the file sizes have changed. Some increased and others decreased. Not sure why but no doubt simply because we're using a separate tool to create the PNG buffer. I'll look to fix the variable naming issue and take a quick stab at getting it working without the temp file. If you want me to include anything else in this PR (e.g. sync) please let me know. This was my first time using async functions, so I hope they're OK. Loved using them, but I'm not sure about my error handling. |
|
I renamed the variable you mentioned in your review. I also took a quick look at using the request interceptor API to dynamically resolve requests with relative URLs against the URL but my listener is still not being triggered for requests originating from within the SVG. When I load However, I was playing around with this in a linux VM (running on c9.io) instead of my main linux laptop and I noticed that some of the tests were failing. I believe that this is because of some font variations on the operating systems. I was surprised that one of these Arial and more surprised that another was Segoe UI. The VM doesn't appear to have Arial installed correctly (the characters don't look very Arial to me but are not the default fonts) but does have Segoe UI installed. I think that this is why some of the fonts in the success-tests I generated yesterday looked off. I'm guessing my linux laptop doesn't have Segoe UI installed so it was falling back to the default font. It might be good if the tests simply use serif/sans-serf unless the font is bundled (like the Roboto test) so that they are truly cross-platform and avoid such issues. After this has been merged, you might want to take some time to generate the success-tests images on your machine since you'll know best what the aim is. I think/hope that's my work here done 😄 |
|
I've just fixed the expected output for tests using the Segoe UI font. Let me know if there's anything else you want me to do. |
|
Any update on this? I'm not sure if there's any action expected on my part. |
|
Well, I don't plan to merge this until I have time to work on a sync version and investigate simplifications to the sizing logic, which could be a week or two. I'm surprised about all this talk of an interceptor API. I would have anticipated Puppetteer would have had a way to just set the contents from a string and set the URL at the same time, thus making URL resolution just work. |
FWIW, I did get around to testing data URLs with pupeteer's |
|
Personally, I feel like the PhantomJS approach was hacky by allowing a dummy URL to be used without navigation. Chrome's approach seems to be more inline with how browsers actually work. I don't see how data URLs would work with embedded files. |
|
would be nice to have this, currently other solutions such as sharp have cross-platform issues |
|
I had an immediate requirement, so I gave up and made my own library that does exactly this: https://github.com/NotNinja/convert-svg. |
|
Chromy is also quite convenient (just opening the SVG directly and screenshotting it), cf. https://github.com/OnetapInc/chromy/blob/master/examples/screenshot.js |
I think we all know that PhantomJS is past it. It's out-of-date and has so many bugs for rendering SVG images that they are simply not fixing. It's basically not fit for purpose anymore, at least not for svg2png. I have reimplemented svg2png using headless Chrome instead using the Chrome team's puppeteer. This library provides a great API that we can leverage to manipulate the DOM via DevTools integration and it downloads the latest Chromium build on installation, just like
phantomjs-prebuiltdoes.The primary caveat is that I don't see a way of supporting
svg2png.sync. I realise that this could be a deal-breaker, but perhaps the trade-off is worth the compromise. A major version bump would be needed, of course, as this is entirely a breaking change.I've not finished everything yet but I thought I'd open the PR early to get initial feedback on review. The last thing I need to fix could be a deal-breaker in itself if I can't get it working; I'm trying to get the document to be aware of the file's location so that relative HREF's are resolved correctly. Honestly, I'm a bit stumped at how this was working originally with PhantomJS. I'll try to find some time do this soon.
Also, I updated some documentation based on these changes, but I'm happy to revert these changes if you feel you'd rather do such things. No problem with me.
Finally, I've just noticed that my IDE has taken some liberties with import/require ordering. Do you want me to revert those changes or are you happy with them? I personally find them better, but I have it configured this way for my projects, and this project is yours, so I'm happy to revert them if you want me to.