For my wiki, we used to have a crawler like this (sniffing out every link it
could find), but because of the time wasted retrieving pages and checking
access rights, I wrote small scripts to provide an indexing page to the
crawler.
This page lists every link that I WANT the crawler to fetch (filtered down to
what anonymous users can see), and it is regenerated regularly from a cron
job, so it is pre-computed rather than computed on each request.
It really helped; the crawler was dragging performance way down :)
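
For the curious, the idea looks roughly like this (a minimal sketch, not my
actual script; get_anonymous_page_urls is just a placeholder for whatever
query your wiki needs to list pages readable by unauthenticated users):

#!/usr/bin/env python3
# Sketch of the indexing-page generator, meant to run from cron, e.g.:
#   */30 * * * *  /usr/local/bin/build_crawler_index.py
import html

def get_anonymous_page_urls():
    # Placeholder: in a real setup, query the wiki database or API for
    # documents that anonymous users are allowed to read.
    return ["http://wiki.example.org/Main/WebHome",
            "http://wiki.example.org/Main/FAQ"]

def main():
    items = "\n".join('<li><a href="%s">%s</a></li>'
                      % (html.escape(url, quote=True), html.escape(url))
                      for url in get_anonymous_page_urls())
    page = "<html><body><ul>\n%s\n</ul></body></html>\n" % items
    # Pre-computed: written to a static file the crawler fetches directly,
    # so nothing is evaluated (and no rights are checked) at request time.
    with open("/var/www/wiki/crawler-index.html", "w") as f:
        f.write(page)

if __name__ == "__main__":
    main()

Point the crawler at the static file and it never has to discover (or be
denied) pages on its own.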