Hi,

Your code seems correct. The MediaWiki parser in XWiki may not support the full MediaWiki syntax; it has some limitations, and you may be hitting them. To help us improve it, it would be very helpful if you could create JIRA issues at http://jira.xwiki.org/browse/XRENDERING/component/12460 with sample input content so that we can reproduce each issue and fix it.

There's another solution that will work better for you: generate HTML from the Wikipedia page and then use the XWiki HTML parser to read it. I believe this parser is much more robust than the MediaWiki one.

Thanks
-Vincent

On 26 Feb 2016 at 09:25:23, alm wrote:
I am reposting this because I am not sure whether registered users on the XWiki-Users forum can see it. Could a registered XWiki-Users forum user kindly reply with a blank message to confirm?
I have the MediaWiki-format text of the Wikipedia article on Speed. I used the XWiki libraries to convert it, but the conversion to HTML5 is very poor. This is the code I used:
try {
    String str = FileUtils.readFileToString(new File("SpeedwikiText.txt"), "UTF-8");

    // XWIKI
    // Initialize Rendering components and allow getting instances
    EmbeddableComponentManager componentManager = new EmbeddableComponentManager();
    componentManager.initialize(this.getClass().getClassLoader());

    // Get the MediaWiki Parser
    Parser parser = componentManager.getInstance(Parser.class, "mediawiki/1.0");

    // Parse the content in mediawiki markup and generate an AST
    // (it's also possible to use a streaming parser for large content)
    XDOM xdom = parser.parse(new StringReader(str));

    // Generate HTML5 out of the XDOM (alternative renderer hint: "xhtml/1.0")
    WikiPrinter printer = new DefaultWikiPrinter();
    BlockRenderer renderer = componentManager.getInstance(BlockRenderer.class, "html/5.0");
    renderer.render(xdom, printer);

    // The result is now in the printer object
    FileUtils.writeStringToFile(new File("xwiki" + name + ".html"), printer.toString(), "UTF-8");
} catch (Exception e) {
    System.out.println(e.getMessage());
}
There are many issues:
Math symbols and formulas are not converted; the HTML output is something like v = \\frac{d}{t}, where is speed.
The top header (Speed) is missing.
Tables are not formatted properly.
The References section is incomplete.
Ideally, I would like the HTML to look exactly like it does on Wikipedia, including images, with the image links pointing to the images hosted on Wikimedia and displayed in the same locations as on Wikipedia. Is this achievable, and what do I need to change in the above code?
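
The alternative suggested in the reply (generate HTML from the Wikipedia page, then read it with the XWiki HTML parser) could start from something like the sketch below. It only covers the first half: obtaining the HTML that MediaWiki itself renders for an article, using the standard index.php "action=render" URL. The class name WikipediaRenderUrl, its method names, the User-Agent string, and the choice of en.wikipedia.org are assumptions made for illustration and are not part of XWiki. The returned string could then presumably be passed to parser.parse(new StringReader(html)) with an HTML parser hint in place of "mediawiki/1.0"; the exact hint depends on the XWiki Rendering version in use and is not shown here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of XWiki): fetches the article body as HTML
// rendered by MediaWiki itself, instead of converting wikitext locally.
final class WikipediaRenderUrl {

    // Assumption: the English Wikipedia; other wikis use a different host.
    private static final String BASE = "https://en.wikipedia.org/w/index.php";

    // Builds the "action=render" URL for an article title.
    static String renderUrl(String title) {
        // MediaWiki page names use underscores in place of spaces.
        String normalized = title.trim().replace(' ', '_');
        String encoded = URLEncoder.encode(normalized, StandardCharsets.UTF_8);
        return BASE + "?title=" + encoded + "&action=render";
    }

    // Downloads the rendered HTML (requires Java 11+ and network access).
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "xwiki-conversion-example/0.1") // illustrative value
                .GET()
                .build();
        HttpResponse<String> response = client.send(
                request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Only prints the URL; call fetch(...) to actually download the page.
        System.out.println(renderUrl("Speed"));
    }
}
```

Because the HTML comes from MediaWiki's own renderer, headings, tables and references should arrive already expanded, and image tags should point at the files hosted on Wikimedia. How closely the final output matches Wikipedia will still depend on what the XWiki HTML parser and the chosen renderer preserve, and on whether Wikipedia's stylesheets are applied to the result.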