Unfortunately the fix I implemented for this issue (i.e. scaling down the overflowing tables using font-size and a CSS transformation) is breaking two of the existing automated PDF export tests: largeExcelImport and largeTable. For the latter, at least, the outcome is that the last row of the table is only partially visible at the end of the print page. This is not caused directly by my fix: because the table is scaled down, more rows fit on the print page, and we hit a situation that is not covered by XWIKI-20741 and XWIKI-21043.
I debugged the issue in the paged.js code and the root cause is in the implementation of Chunker#findOverflow, the function used to detect when the content overflows the current print page. See https://github.com/pagedjs/pagedjs/blob/v0.4.3/src/chunker/layout.js#L640-L668 . The code uses the top left corner of the text (row/line) to determine whether a text node fits inside the current print page. But the text (row/line) might fit only partially, which is the issue I'm encountering.
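To illustrate the difference between the two checks, here is a simplified model in Python. It is not the paged.js code: it only looks at the vertical axis, and the names (Rect, overflows_by_top, overflows_by_bottom) and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Vertical extent of a text line or table row, in page coordinates."""
    top: float
    bottom: float


def overflows_by_top(line: Rect, page_bottom: float) -> bool:
    # "Top corner" rule: the line overflows only if it STARTS at or below
    # the bottom edge of the page content area.
    return line.top >= page_bottom


def overflows_by_bottom(line: Rect, page_bottom: float) -> bool:
    # "Bottom corner" rule: the line overflows as soon as it ENDS below
    # the bottom edge of the page content area.
    return line.bottom > page_bottom


page_bottom = 1000.0
last_row = Rect(top=990.0, bottom=1010.0)  # row straddling the page edge

print(overflows_by_top(last_row, page_bottom))     # False: row stays, gets cut off
print(overflows_by_bottom(last_row, page_bottom))  # True: row moves to next page
```

With the top-based rule the straddling row is kept on the current page, which matches the partially visible last row I see in largeTable.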
I tried patching paged.js to use the bottom right corner of the text (row/line) instead, which fixes my problem, but it causes 2 test failures on the paged.js side. After investigating the failures I realized that the reason they used the top left corner, instead of the bottom right corner, is to prevent an infinite loop of creating endless print pages. In some edge cases (e.g. if the print page size is small and the font-size is huge, for which paged.js has an automated test), with my fix, the text might not fit the print page and thus is moved to the next page in an endless loop. I suppose their argument is that if a text can start on a print page then it should stay on that print page, because most of the text has a decent font-size; if the font-size is big (e.g. because it's a heading) then maybe it needs a page break before.
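The trade-off can be reproduced with a toy paginator. Again this is only a sketch under my own assumptions, not how paged.js is implemented: paginate, require_full_fit and max_pages are hypothetical names, and max_pages exists only so the example terminates instead of looping forever.

```python
from typing import List, Optional


def paginate(line_heights: List[float], page_height: float,
             require_full_fit: bool, max_pages: int = 50) -> Optional[List[List[int]]]:
    """Distribute lines over pages; return None if the page limit is hit."""
    pages: List[List[int]] = []
    i = 0
    while i < len(line_heights):
        if len(pages) >= max_pages:
            return None  # without this guard the loop below would never end
        page: List[int] = []
        y = 0.0
        while i < len(line_heights):
            height = line_heights[i]
            if require_full_fit:
                fits = y + height <= page_height  # bottom-based rule
            else:
                fits = y < page_height            # top-based rule
            if not fits:
                break
            page.append(i)
            y += height
            i += 1
        pages.append(page)
    return pages


# Normal text: the top-based rule leaves the last line partially visible,
# the bottom-based rule moves it to the next page.
print(paginate([40, 40, 40], 100, require_full_fit=False))  # [[0, 1, 2]]
print(paginate([40, 40, 40], 100, require_full_fit=True))   # [[0, 1], [2]]

# A line taller than the page: the top-based rule still makes progress,
# the bottom-based rule keeps creating empty pages.
print(paginate([300], 100, require_full_fit=False))  # [[0]]
print(paginate([300], 100, require_full_fit=True))   # None
```

The top-based rule always places at least one line on a fresh page, so it cannot loop; the bottom-based rule has no such guarantee unless it special-cases an empty page.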
I tried to improve my fix in order to not break the existing paged.js tests but in the end I gave up because the code of Chunker#findOverflow function has been refactored on the main branch anyway. Unfortunately, the latest code has only been released as beta.
Next I tested my original fix with the latest version of Chrome, using "User's browser" as PDF generator, and it didn't show the problem. The automated tests (where the problem occurs) are using an older version of Chrome, 124, while locally I have Chrome 142 (where the problem doesn't occur). The reason we're using an older version of Chrome for running the tests is that there's no newer version provided by the Docker image we're using, https://hub.docker.com/r/zenika/alpine-chrome/tags .
So I looked for another Docker image that provides headless Chrome and I found https://hub.docker.com/r/femtopixel/google-chrome-headless/tags which provides the latest version. Moreover, this is maintained by a French guy from Paris, so I gave it a try. With a bit of research and some help from Copilot / Claude I managed to connect to it, but only when using the host Docker network mode. I couldn't connect to the headless Chrome when using the bridge network, which is used by the automated tests. I tried all sorts of things (suggested by Copilot / Claude) but couldn't make it work with the bridge network. In the end, searching on the web, I was hit by:
Basically, the recent versions of Chrome accept remote debugging only from localhost connections (for security reasons). This explains why it worked with host Docker network and not with the bridge network. In order to be able to connect remotely you need to have a (reverse) proxy in front of the headless Chrome, running inside the Chrome Docker container, that forwards incoming HTTP / WebSocket requests to Chrome on localhost.
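As a rough idea of what such a forwarder does, here is a minimal sketch in Python using asyncio. It is a plain TCP forwarder, not an HTTP-aware reverse proxy, and it is only an illustration: the function names and the ports (9223 for the externally visible port, 9222 for Chrome started with --remote-debugging-port=9222) are my assumptions. It doesn't rewrite the HTTP Host header, so it may not be sufficient on its own depending on how the client addresses the container.

```python
import asyncio


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # the peer went away; nothing more to forward
    finally:
        writer.close()


async def start_forwarder(listen_host: str, listen_port: int,
                          target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Listen on listen_host:listen_port and forward each connection to the target."""

    async def handle_client(client_reader, client_writer):
        try:
            upstream_reader, upstream_writer = await asyncio.open_connection(
                target_host, target_port)
        except OSError:
            client_writer.close()  # target (Chrome) is not reachable
            return
        # Forward both directions; this works for HTTP and WebSocket alike
        # because the bytes are passed through untouched.
        await asyncio.gather(
            pipe(client_reader, upstream_writer),
            pipe(upstream_reader, client_writer),
        )

    return await asyncio.start_server(handle_client, listen_host, listen_port)


async def main() -> None:
    # Hypothetical ports: expose 9223 on all interfaces, forward to Chrome
    # listening on localhost:9222 inside the same container.
    server = await start_forwarder("0.0.0.0", 9223, "127.0.0.1", 9222)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(main()) inside the Chrome container would start the forwarder; from Chrome's point of view every connection then comes from localhost.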
Looking for when this change happened in Chrome, I found version 126, which explains why zenika/alpine-chrome doesn't provide a more recent version of Chrome... and then finally I found https://github.com/jlandure/alpine-chrome/issues/253 (issue opened for zenika/alpine-chrome), which is still open.
So now I need to see how to set up the proxy in front of headless Chrome.