On 04/29/2014 08:06 AM, ananthakrishna wrote:
Hello,
I want to load the content of a large number (millions) of text files into
XWiki pages. The text files contain Unicode Kannada characters (as well as
some English characters).
Is there a way to programmatically load the content of these text files into
XWiki pages? Also, will it work for millions of pages?
First, yes, I think there is a way to import some files as pages; it would look like:
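{{groovy}}
import org.apache.commons.io.IOUtils
// path of the text file on the machine that runs XWiki
File importFile = new File("/path/to/test.txt")
println "file exists at " + importFile.getPath() + " : " + importFile.exists()
// encoding of the text file; adapt this if the file is not UTF-8
def encoding="UTF-8"
def output = new StringWriter()
// read the whole file into a string; withStream closes the stream afterwards
new FileInputStream(importFile).withStream { input ->
  IOUtils.copy(input, output, encoding)
}
// target page: space "Sandbox", page name "ImportPage"
def doc = xwiki.getDocument("Sandbox","ImportPage")
doc.setContent(output.toString())
doc.save("imported from "+ importFile.getPath())
println "saved as [["+doc.getFullName()+"]]"
{{/groovy}}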
As a user with programming rights (like the default "Admin") you can paste
this snippet into some page; the import of the file is then triggered every
time you visit that page.
There are quite a few caveats, however:
- I guessed that the encoding is "UTF-8".
If your file is in a different encoding, you need to adapt the line
def encoding="UTF-8"
accordingly, i.e. use "UTF-16" or whatever it is.
- The content is imported as plain text, on the assumption that it is already
written in the right wiki syntax.
I have no idea whether this is even close to your case. If the text is in a
different format, you need to read it in a different way. If, e.g., the files
are in an office format, you would need to talk to the office importer instead.
In that case you can check the documentation for services.officeimporter
in the "scripting reference documentation":
http://platform.xwiki.org/xwiki/bin/view/SRD/Navigation?xpage=embed
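To illustrate the encoding caveat above, here is a minimal sketch in plain Groovy (it does not need XWiki). It writes a small sample file with Kannada and English text and reads it back twice, once with the matching encoding and once with a wrong one. The sample text and the temporary file are made up for the illustration; the wrong encoding "ISO-8859-1" is just an arbitrary example of a mismatch.

```groovy
// Create a small sample file containing Kannada and English text, stored as UTF-8.
// The Kannada word is written with Unicode escapes so the script itself stays ASCII.
File sample = File.createTempFile("kannada-sample", ".txt")
sample.deleteOnExit()
String original = "\u0C95\u0CA8\u0CCD\u0CA8\u0CA1 text"
sample.setText(original, "UTF-8")

// Reading with the encoding the file was written in gives back the original text
String good = sample.getText("UTF-8")
// Reading with a different encoding does not fail, it silently yields other characters
String bad = sample.getText("ISO-8859-1")

println "UTF-8 read matches      : " + (good == original)
println "ISO-8859-1 read matches : " + (bad == original)
```

The point is that a wrong encoding usually does not raise an error, so it is worth checking one imported page by eye before importing many files.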
I have to admit that I have no idea whether the import will work for millions
of pages.
If you have the "jetty-hsqldb" distribution with that "unzip it and it is ready
to run" charm, then I am quite sure it will not work, as "hsqldb" is a database
not suited to storing that much data.
In that case you might switch to a "real" database, like MySQL, Oracle or
PostgreSQL.
In any case I would recommend starting small ;)
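One way to "start small" could look like the sketch below: import only the first few files of a directory and see how that goes before raising the limit. This is an untested idea, not something I have run against a real wiki. The helper name importBatch, the rule "page name = file name without .txt" and the limit are all assumptions made up for the illustration. The actual saving is passed in as a closure, so the demo part runs in plain Groovy without XWiki.

```groovy
import java.nio.file.Files

// Import at most `limit` *.txt files from `dir`.
// `savePage` is a closure (pageName, text) that does the actual saving.
def importBatch(File dir, int limit, String encoding, Closure savePage) {
  def all = dir.listFiles()
  def files = all == null ? [] : all.toList()
      .findAll { it.isFile() && it.name.endsWith(".txt") }
      .sort { it.name }
  def batch = files.take(limit)
  batch.each { File f ->
    // assumption: the page name is the file name without the ".txt" suffix
    String pageName = f.name.substring(0, f.name.length() - 4)
    savePage(pageName, f.getText(encoding))
  }
  return batch.size()
}

// Inside an XWiki page the closure could use the same calls as the snippet above:
//   { name, text ->
//     def doc = xwiki.getDocument("Sandbox", name)
//     doc.setContent(text)
//     doc.save("batch import")
//   }

// Demo outside XWiki: three small files, a limit of two, "saving" into a map
File demoDir = Files.createTempDirectory("import-demo").toFile()
["a", "b", "c"].each { new File(demoDir, it + ".txt").setText("content of " + it, "UTF-8") }
def saved = [:]
int count = importBatch(demoDir, 2, "UTF-8") { name, text -> saved[name] = text }
println "imported " + count + " pages: " + saved.keySet()
```

With a limit you can check the first pages and the time they took, and only then decide whether importing everything this way is realistic.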
hope this helps,
Clemens