Hi Guys,
Would be great if you could update the existing wiki documentation with the information in
this thread since it wasn't good enough in the first place apparently :)
Thanks!
-Vincent
On Jan 8, 2012, at 4:49 PM, Kaya Saman wrote:
On 01/08/2012 03:53 PM, Guillaume Fenollar wrote:
Hi Kaya,
Yes, if you don't use a front-end web server (i.e. Apache or nginx), you should
put robots.txt directly into the /ROOT directory of Tomcat (provided it listens
on port 80). After that, you can simply test your setup by requesting
http://youdomain.org/robots.txt. If you can't reach it this way, bots won't
find it either.
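
The reachability test described above can also be scripted. This is only a minimal sketch, not part of the original advice: the helper names robots_url and robots_is_reachable and the host wiki.example.org are made up for illustration, and it assumes robots.txt should answer with HTTP 200 at the root of the site.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_root):
    """Return the robots.txt URL at the root of the given site."""
    # A leading slash makes urljoin discard any path in site_root.
    return urljoin(site_root, "/robots.txt")


def robots_is_reachable(site_root, opener=urlopen, timeout=10):
    """Return True if robots.txt answers with HTTP 200, else False.

    `opener` is injectable so the function can be tested offline;
    by default it is urllib.request.urlopen.
    """
    try:
        with opener(robots_url(site_root), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError (404, connection refused, DNS failure,
        # timeout) are all subclasses of OSError.
        return False
```

Calling robots_is_reachable("http://wiki.example.org/") with your own host name in place of the hypothetical one gives the same answer as opening the URL in a browser: if it returns False, crawlers will not see the file either.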
Thanks for the response Guillaume!
I found a site:
http://www.frobee.com/robots-txt-check
which actually tests the compliance of the robots.txt, and it seems mine are fine.
Concerning the disallow directives, it is your choice what you let the bots
index. My advice would be to make an inventory of the spaces
and actions you don't want indexed.
You could take this one as example:
http://cdlsworld.xwiki.com/robots.txt
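
To see how a set of Disallow rules would affect a crawler before publishing them, the rules can be checked locally with Python's standard urllib.robotparser. This is a sketch under assumptions: the rules in SAMPLE_ROBOTS_TXT are hypothetical (they are not copied from the robots.txt linked above or from the XWiki documentation), and the paths assume a wiki served under /xwiki/bin/.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the edit, export and delete actions for all
# bots while leaving the view action crawlable. Adjust to your inventory.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /xwiki/bin/edit/
Disallow: /xwiki/bin/export/
Disallow: /xwiki/bin/delete/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these sample rules, is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", "http://wiki.example.org/xwiki/bin/view/Main/WebHome") is True, while the same page under /xwiki/bin/edit/ is refused.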
I took a look at it and will compare it to the example from the XWiki site.
Finally, it's funny that you're asking whether bots could harass
your server, because almost everyone wants them (except for bad robots) to
come and index their websites :-)
Anyway, I don't think robots will account for a significant amount of traffic.
But the users who find your content through search engines will ;-) I
guess that's what you want.
It's not that I don't want things to be indexed or viewed, but I am getting a
strange issue on one of my XWiki sites: when I first load the site, i.e. start Tomcat, the
memory usage is really low (~600MB); then after a while the CPU starts working a little
(~10%) and the memory consumed by the process jumps to 1.6GB. There's not much on
that site to begin with; my wiki site has more information, images, etc. than
this site, which is my www site, yet the www site is consuming far more memory.
I'm not really sure how to even begin debugging. I have both webalizer and
awstats working on my reverse Squid proxy in front of Tomcat. So far awstats, which has been
working from the beginning (3rd Jan this year), shows nearly 9000 hits :-S, a lot of
which come from Googlebot.
That was my only issue.
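
One way to put a number on the Googlebot share, independently of awstats, is to count hits per user agent straight from the proxy's access log. The sketch below assumes the log is written in the common "combined" format, where each line ends with the quoted referer and the quoted user agent; whether the Squid setup described above logs in that format is an assumption. The function name count_hits_by_agent and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# In the combined log format each line ends with: "referer" "user-agent"
USER_AGENT_RE = re.compile(r'"[^"]*" "(?P<agent>[^"]*)"\s*$')

# Made-up sample lines, only to show the expected input shape.
SAMPLE_LINES = [
    '66.249.66.1 - - [08/Jan/2012:16:49:00 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [08/Jan/2012:16:50:00 +0000] "GET / HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; FreeBSD amd64) Firefox/9.0"',
]


def count_hits_by_agent(lines, keywords=("Googlebot",)):
    """Count hits per keyword found in the user agent; others go to 'other'."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # Skip lines that are not in the assumed format.
        agent = match.group("agent").lower()
        label = next((k for k in keywords if k.lower() in agent), "other")
        counts[label] += 1
    return counts
```

Passing an open log file instead of SAMPLE_LINES (a file object iterates line by line) gives a per-crawler hit count that can be compared against the times at which memory usage jumps.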
The URLs of both sites are here:
http://www.optiplex-networks.com
http://wiki.optiplex-networks.com
and footprints are shown here:
  PID JID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
51547  22 www       46  44    0  3545M  1590M ucond  1   6:04  0.00% java
28878  14 www       49  44    0  3544M   404M ucond  0   3:47  0.00% java
with JID 14 being the wiki. site and JID 22 being the www. site.
Regards,
Kaya
_______________________________________________
users mailing list
users(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/users