Pagination on the Web is encapsulated magic and introduces usability cliffs #4

@LeaVerou

Currently, browsers implement functionality to paginate and print webpages, and provide limited control over this via the print media type and the break-* (or the older page-break-*) properties.
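For reference, the control that does exist today looks roughly like the sketch below. The selectors (`nav`, `.sidebar`, `h2`, `figure`) are hypothetical and only illustrate the kind of hints authors can give; the browser still decides where each page actually ends.

```css
/* Rules that apply only when the page is printed or paginated */
@media print {
  /* Hide navigation chrome that makes no sense on paper (hypothetical selectors) */
  nav,
  .sidebar {
    display: none;
  }

  /* Start each top-level section on a new page */
  h2 {
    break-before: page;
    page-break-before: always; /* older alias, kept for legacy engines */
  }

  /* Try not to strand a heading at the bottom of a page */
  h2,
  h3 {
    break-after: avoid;
  }

  /* Try to keep figures and tables in one piece */
  figure,
  table {
    break-inside: avoid;
  }
}
```

Note that these are only hints about where breaks may fall: nothing here tells the author which page an element ended up on.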

However, this is all magic encapsulated in the browser implementation, and even very basic functionality is impossible to implement, such as:

  • Creating a table of contents with actual page numbers
  • Printing the page number for a link that points to a local resource (e.g. a section in the document)
  • Printing a header or footer
  • Even leaders (the dotted lines that connect a table of contents entry to its page number) are incredibly difficult to do well with today’s CSS

There are no easy workarounds, even via JS: the pagination algorithm is entirely encapsulated UA magic, and authors have zero access to it. As a result, authors are forced to paginate content manually, which is how tools like Paged.js work. Given how complex pagination is, this is an incredibly fraught and slow process, especially for nontrivial layouts or modern CSS.

The use cases are not complex large projects such as document editors (though those definitely suffer too), but any nontrivial printing task, whether that is a content-heavy website that wants to also have a usable print style, or a document authored with web technologies.

Some proposals were outlined decades ago in the CSS Paged Media and CSS Generated Content specs:

  • A predefined page CSS counter that can be used wherever
  • Margin @-rules (e.g. @bottom-right) to print out page numbers and headers/footers
  • target-counter() to look up the value of a counter on a different element so that e.g. cross-references become target-counter(attr(href url), page)
  • A leader() function
  • A while back there was also a proposal for arbitrary paging, which was prototyped by some browsers but never gained wider traction.
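To make the first four proposals concrete, here is a minimal sketch of how they might combine for a table of contents with page numbers. It follows the syntax as drafted in those specs, so it should be read as illustrative: browsers do not implement all of it today, and the `nav.toc` selector and the header text are hypothetical.

```css
@page {
  /* Margin at-rule: running footer with the predefined page counter */
  @bottom-right {
    content: counter(page);
  }

  /* Margin at-rule: a static running header (placeholder text) */
  @top-center {
    content: "Document title";
  }
}

/* Table of contents entries: dotted leader, then the page number
   of the element that each link points to (hypothetical selector) */
nav.toc a::after {
  content: leader('.') target-counter(attr(href url), page);
}
```

With something like this, a link to `#chapter-2` in the table of contents would be followed by a row of dots and the number of the page on which `#chapter-2` is laid out, without the author ever computing that number.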

However, there are multiple issues at play here:

  1. Spec issues: These features have always been underspecified. Some are far too broad for their use cases (avoiding overfitting is generally a good thing, but not when it makes a feature infinitely harder to implement), and the specs have not been updated for years, so they are now out of date with the rest of CSS.
  2. Implementor issues: Very low browser implementor interest for anything print-related.

As a result, these features have only been implemented by PDF formatters such as PrinceXML or AntennaHouse, i.e. specialized CLI tools that take an HTML document (+ associated resources) and spit out a PDF document. While in heavy use in many industries, including big publishing houses, PDF formatters are not a general solution. First, they lack the resources of browsers, and thus their CSS implementations often lag years behind and are buggier than those of browsers. Having page numbers and using CSS that shipped in the last decade should not be mutually exclusive! But more importantly, there should not be a dichotomy about whether a document is only a website (and thus displayed in a browser) or only a printed document (and thus printed with a specialized print formatter). And there is something to be said about making printing with web technologies dependent on a non-web-platform technology such as PDF.

Clearly, UAs are not willing to implement these, or they’d have done it over a decade ago. We need to figure out if there is a minimum set of functionality that UAs would be willing to implement and that could lower the pain and fragility of these tasks. Even a JS API that gives access to the UA’s native pagination algorithm would be a huge improvement over the current state.
