[xwiki-users] Searching workaround for HTML in title-field
Caleb James DeLisle
calebdelisle at lavabit.com
Sat Jun 12 17:17:23 UTC 2010
Ivan Levashew wrote:
> James DeLisle wrote:
>> This is not a problem only with the search field. It's a security policy that
>> XWiki allows its users to run script. In syntax 1.0 you are allowed to type
>> HTML (and thus script) into the document, in syntax 2.0 you can use HTML in
>> the document by invoking the HTML macro.
>> My opinion is that to prevent users from running script you would have to set up
>> an output filter such as Apache mod_filter and implement a policy which blocks all
>> script in the user-editable parts of the page.
> I have some brief experience with Jaxer. It is not maintained, but it is
> pretty usable if starting from scratch is OK for you. The notable difference
> is that most operations are performed on structured DOM trees as opposed
> to structureless strings as in, e.g., PHP. The engine is server-side Mozilla,
> so tricks like innerHTML work, but strings are the exception
> rather than the rule in the world of Jaxer.
We have something like that. We call it XDOM. It can be rendered into XHTML, PDF,
OpenOffice export, etc.
There is an issue open for adding a means to manipulate the XDOM using server-side script.
1. Lots of content (including all of the .vm templates) is in Syntax 1.0 which doesn't use the new rendering module.
2. Syntax 2.0 parsers contain lots of code and have bugs of their own.
IMO security code should be as small and as heavily reviewed as possible.
> I think sooner or later it will become evident that the next major revision of
> XWiki must use structured data instead of all that
> easy-to-forget-because-anyway-it-mostly-works escaping.
> This makes sense because, for instance, one might want to enable users
> to post to a common blog but prevent them from storing scripts. There is
> no "string noscript(string)" function. It is far beyond mere escaping.
> Even if such a function existed, parsing and serializing is CPU-costly.
I think you could strip all types of script invocation from a stream of XML without actually parsing the XML.
If you can't do it with sed, it's not worth doing ;)
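
A rough sketch of that idea, in Python rather than sed, purely to illustrate stripping script with regular expressions and no XML parser. The function name strip_script and the three patterns are assumptions made for this example; they are not XWiki code, and they cover only a few obvious vectors (script elements, on* event attributes, javascript: URLs in href/src).

```python
import re

# Hypothetical, deliberately small patterns; not a complete list of script vectors.
# Matches <script .../> as well as <script>...</script>.
SCRIPT_ELEMENT = re.compile(
    r"<script\b[^>]*/>|<script\b.*?</script\s*>",
    re.IGNORECASE | re.DOTALL,
)
# Matches event-handler attributes such as onclick="..." or onload=foo.
EVENT_ATTRIBUTE = re.compile(
    r"""\s+on\w+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)
# Matches javascript: URLs in href/src; group 1 keeps the attribute name and quote.
JS_URL = re.compile(
    r"""(\b(?:href|src)\s*=\s*["']?)\s*javascript:[^"'\s>]*""",
    re.IGNORECASE,
)


def strip_script(markup: str) -> str:
    """Remove the script vectors listed above without parsing the markup."""
    markup = SCRIPT_ELEMENT.sub("", markup)
    markup = EVENT_ATTRIBUTE.sub("", markup)
    markup = JS_URL.sub(r"\1#", markup)
    return markup


if __name__ == "__main__":
    print(strip_script('<p onclick="alert(1)">hi</p><script>alert(2)</script>'))
    print(strip_script('<a href="javascript:alert(1)">x</a>'))
```

Because nothing is parsed, the patterns cannot tell markup from text: a sentence containing something like " once = x" would lose that fragment, and encoded or obfuscated script would likely slip through. Treat it as a starting point for discussion, not a reviewed security filter.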