Cool Jim! Thanks for sharing this.
It would be great if you could add a code snippet on code.xwiki.org
that describes this.
Thanks
-Vincent
On Mar 13, 2008, at 2:06 PM, Guillaume Lerouge wrote:
Hi Jim, thanks for the feedback.
I was able to quickly integrate our lab environment's Google Mini
Appliance with XWiki today…
The set-up of the appliance was simple (after some experimentation
on what to filter out to reduce redundancy and confusion):
1. List of URLs to crawl (e.g. http://hostname.domain:8080/xwiki)
2. List of patterns to follow (e.g. hostname.domain:8080/xwiki)
3. List of patterns to NOT crawl – I added the following to the
default list:
a. contains:?viewer=code
b. contains:?format=rtf
c. contains:?format=pdf
d. contains:?tag=
e. contains:?xpage=print
f. contains:?rev=
I don't think there's a risk, but you may want to add
"contains:delete", "contains:edit", "contains:inline" and
"contains:?editor=" to your list... At worst it will make indexing faster.
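
For what it's worth, here is a minimal Python sketch of how the exclusion
rules in this thread would act on XWiki URLs. It assumes "contains:" means
plain substring matching against the full URL, which is my reading of the
pattern syntax rather than something confirmed here. The is_excluded helper
and the sample URLs are made up for illustration only.

```python
# Do-not-crawl patterns collected from this thread.
# ASSUMPTION: "contains:<text>" excludes any URL that has <text> as a substring.
DO_NOT_CRAWL = [
    "contains:?viewer=code",
    "contains:?format=rtf",
    "contains:?format=pdf",
    "contains:?tag=",
    "contains:?xpage=print",
    "contains:?rev=",
    # Extra patterns suggested in the reply
    "contains:delete",
    "contains:edit",
    "contains:inline",
    "contains:?editor=",
]

PREFIX = "contains:"


def is_excluded(url, patterns=DO_NOT_CRAWL):
    """Hypothetical helper: True if any 'contains:' pattern matches the URL."""
    for pattern in patterns:
        if pattern.startswith(PREFIX) and pattern[len(PREFIX):] in url:
            return True
    return False


# Illustrative URLs only; the paths are examples, not taken from a real wiki.
BASE = "http://hostname.domain:8080/xwiki"
SAMPLES = [
    BASE + "/bin/view/Main/WebHome",
    BASE + "/bin/view/Main/WebHome?viewer=code",
    BASE + "/bin/view/Main/WebHome?xpage=print",
    BASE + "/bin/edit/Main/WebHome",
]

for url in SAMPLES:
    print("skip " if is_excluded(url) else "crawl", url)
```

Under that substring assumption, a pattern such as "contains:edit" would also
skip any page whose name happens to include "edit", and the "?"-prefixed
patterns only match when the parameter comes first in the query string.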
Guillaume