Cool, Jim! Thanks for sharing this.

It would be great if you could add a code snippet on code.xwiki.org that describes this.

Thanks
-Vincent

On Mar 13, 2008, at 2:06 PM, Guillaume Lerouge wrote:

Hi Jim, thanks for the feedback.

I was able to quickly integrate our lab environment's Google Mini Appliance with XWiki today…

The set-up of the appliance was simple (after some experimentation on what to filter out to reduce redundancy and confusion):

1. List of URLs to crawl (e.g. http://hostname.domain:8080/xwiki)

2. List of patterns to follow (e.g. hostname.domain:8080/xwiki)

3. List of patterns to NOT crawl. I added the following to the default list:

   a. contains:?viewer=code
   b. contains:?format=rtf
   c. contains:?format=pdf
   d. contains:?tag=
   e. contains:?xpage=print
   f. contains:?rev=

I don't think there's a risk, but you may want to add "contains:delete", "contains:edit", "contains:inline" and "contains:?editor=" to your list... At worst it will make indexing faster.
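
For anyone who wants to sanity-check an exclusion list like the one above before starting a crawl, here is a rough Python sketch. It assumes that a "contains:" pattern is a plain, case-sensitive substring match against the full URL, which is my reading of the appliance's pattern syntax and not something confirmed in this thread. The function names and the sample page paths (/bin/view/Main/WebHome and so on) are made up for illustration.

```python
from typing import Iterable, List

# Exclusion patterns taken from this thread: the six added to the default
# list, plus the four extra ones suggested in the reply.
EXCLUDE_PATTERNS: List[str] = [
    "contains:?viewer=code",
    "contains:?format=rtf",
    "contains:?format=pdf",
    "contains:?tag=",
    "contains:?xpage=print",
    "contains:?rev=",
    "contains:delete",
    "contains:edit",
    "contains:inline",
    "contains:?editor=",
]

PREFIX = "contains:"


def is_excluded(url: str, patterns: Iterable[str] = EXCLUDE_PATTERNS) -> bool:
    """Return True if any 'contains:' pattern matches the URL.

    ASSUMPTION: 'contains:' means a case-sensitive substring test on the
    whole URL. Patterns without that prefix are ignored by this sketch.
    """
    for pattern in patterns:
        if pattern.startswith(PREFIX):
            needle = pattern[len(PREFIX):]
            if needle in url:
                return True
    return False


def filter_urls(urls: Iterable[str]) -> List[str]:
    """Keep only the URLs that no exclusion pattern matches."""
    return [u for u in urls if not is_excluded(u)]


if __name__ == "__main__":
    # Hypothetical sample URLs built on the host used in the example above.
    base = "http://hostname.domain:8080/xwiki"
    samples = [
        base + "/bin/view/Main/WebHome",
        base + "/bin/view/Main/WebHome?viewer=code",
        base + "/bin/view/Main/WebHome?xpage=print",
        base + "/bin/edit/Main/WebHome",
    ]
    for url in filter_urls(samples):
        print(url)
```

One thing this makes visible: under the substring assumption, a pattern such as "contains:?viewer=code" only matches when that parameter comes first in the query string, so a URL ending in "?language=en&viewer=code" would still be crawled. If that matters for your wiki, dropping the leading "?" from the pattern may be worth trying.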

Guillaume