It would be great if you could add a code snippet on code.xwiki.org that describes this.

Thanks

-Vincent

On Mar 13, 2008, at 2:06 PM, Guillaume Lerouge wrote:

Hi Jim, thanks for the feedback.

I was able to quickly integrate our lab environment's Google Mini Appliance with XWiki today…
The set-up of the appliance was simple (after some experimentation on what to filter out to reduce redundancy and confusion):
1. List of URLs to crawl (e.g. http://hostname.domain:8080/xwiki)
2. List of patterns to follow (e.g. hostname.domain:8080/xwiki)
3. List of patterns to NOT crawl. I added the following to the default list:
   a. contains:?viewer=code
   b. contains:?format=rtf
   c. contains:?format=pdf
   d. contains:?tag=
   e. contains:?xpage=print
   f. contains:?rev=
I don't think there's a risk, but you may want to add "contains:delete", "contains:edit", "contains:inline" & "contains:?editor=" to your list... At worst it will make indexing faster.
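
For anyone who wants to sanity-check such an exclusion list before loading it into the appliance, here is a minimal sketch in Python. It assumes that a "contains:" rule behaves as a plain substring test against the full URL; that is my reading of the rules above, not confirmed appliance behaviour. The function name is_excluded and the sample URLs are made up for illustration.

```python
# Sketch only: models each "contains:" rule as a plain substring test.
# The real appliance may match differently; treat this as an assumption.

EXCLUDE_PATTERNS = [
    # Rules from the list in this thread
    "contains:?viewer=code",
    "contains:?format=rtf",
    "contains:?format=pdf",
    "contains:?tag=",
    "contains:?xpage=print",
    "contains:?rev=",
    # Additional rules suggested in the reply
    "contains:delete",
    "contains:edit",
    "contains:inline",
    "contains:?editor=",
]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if any 'contains:' rule matches the URL (hypothetical helper)."""
    for pattern in patterns:
        kind, _, needle = pattern.partition(":")
        if kind == "contains" and needle and needle in url:
            return True
    return False


if __name__ == "__main__":
    # Illustrative URLs only; hostname.domain is the placeholder used above.
    samples = [
        "http://hostname.domain:8080/xwiki/bin/view/Main/WebHome",
        "http://hostname.domain:8080/xwiki/bin/view/Main/WebHome?viewer=code",
        "http://hostname.domain:8080/xwiki/bin/edit/Main/WebHome",
    ]
    for url in samples:
        print(url, "->", "skip" if is_excluded(url) else "crawl")
```

Under this substring assumption, two things are worth keeping in mind: a broad rule such as "contains:edit" would also skip any page whose name happens to contain "edit", and a rule such as "contains:?rev=" only matches when that parameter comes first in the query string.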

Guillaume