On 01/08/2012 03:53 PM, Guillaume Fenollar wrote:
Hi Kaya,
Yes, if you don't use a front-end web server (i.e. Apache or nginx), you should
put robots.txt directly into the /ROOT directory of Tomcat (if it listens
on port 80). After that, you can simply test your setup by trying to reach
http://youdomain.org/robots.txt. If you can't find it this way, bots won't
find it either.
Thanks for the response Guillaume!
I found a site:
http://www.frobee.com/robots-txt-check
which actually tests the compliance of robots.txt, and it seems mine are fine.
Concerning the disallow directives, it is your choice which content you let
the bots index. My advice would be to make an inventory of the spaces
and actions you don't want indexed.
You could take this one as an example:
http://cdlsworld.xwiki.com/robots.txt
I took a look at it and will compare it to the example on the XWiki site.
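As a rough illustration of such an inventory, the sketch below checks a set of Disallow rules offline with Python's standard urllib.robotparser. The rule paths, the example.org host and the is_allowed helper are assumptions made up for this example; they are not taken from the robots.txt files mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: these paths are assumptions, not copied from a real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /xwiki/bin/edit/
Disallow: /xwiki/bin/export/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.org is a placeholder host.
    for path in ("/xwiki/bin/view/Main/", "/xwiki/bin/edit/Main/WebHome"):
        url = "http://example.org" + path
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

Pasting the content of a real robots.txt into SAMPLE_ROBOTS_TXT should show which URLs a well-behaved bot would skip. Bad robots ignore the file altogether, so this only models compliant crawlers.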
Finally, it's funny you're asking whether bots could harass
your server, because almost everyone wants them (except for bad robots) to
come and index their websites :-)
Anyway, I don't think that robots could account for a significant amount of
traffic. But the users who find your content through search engines will ;-) I
guess that's what you want.
It's not that I don't want things to be indexed or viewed, but I am getting a
strange issue on one of my XWiki sites. Whenever I load the site, i.e. start
Tomcat, the memory usage is really low (~600MB); then after a while the CPU
starts working a little (~10%) and the memory consumed by the process jumps
up to 1.6GB. There's not much on that site to begin with: my wiki site has more
information, images, etc. than this site, which is my www site, yet the www
site is consuming far more memory.
I'm not really sure how to even begin debugging; I have both Webalizer and
AWStats running on my Squid reverse proxy in front of Tomcat. So far AWStats,
which has been running from the beginning (3rd Jan this year), shows nearly
9000 hits :-S, a lot of which come from Googlebot.
That was my only issue.
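To see how much of that traffic really comes from bots, a hit count per crawler can be sketched straight from the proxy's access log. This is a minimal sketch that assumes the log is written in Apache combined format, with the User-Agent as the last quoted field. Squid's native log format is different, so that part is an assumption. The sample lines, the KNOWN_BOTS list and the count_hits_by_client helper are made up for illustration.

```python
import re
from collections import Counter

# In combined log format the User-Agent is the last double-quoted field on the line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

# Illustrative list of crawler name fragments; extend as needed.
KNOWN_BOTS = ("Googlebot", "bingbot", "Baiduspider")

# Made-up sample lines (documentation IP addresses), not real log data.
SAMPLE_LINES = [
    '192.0.2.1 - - [08/Jan/2012:15:53:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.1 - - [08/Jan/2012:15:53:05 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.7 - - [08/Jan/2012:15:54:10 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20100101 Firefox/9.0"',
]


def count_hits_by_client(lines):
    """Count hits per known bot; everything else is grouped under 'other'."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # skip lines that do not end with a quoted field
        agent = match.group(1).lower()
        label = next((bot for bot in KNOWN_BOTS if bot.lower() in agent), "other")
        counts[label] += 1
    return counts


if __name__ == "__main__":
    for label, hits in count_hits_by_client(SAMPLE_LINES).most_common():
        print(label, hits)
```

To run it against a real log, pass an open file object instead of SAMPLE_LINES; the function only iterates over lines.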
The URLs of both sites are here:
http://www.optiplex-networks.com
http://wiki.optiplex-networks.com
and footprints are shown here:
  PID JID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
51547  22 www       46  44    0  3545M  1590M ucond  1   6:04  0.00% java
28878  14 www       49  44    0  3544M   404M ucond  0   3:47  0.00% java
with JID 14 being the wiki site and JID 22 being the www site.
Regards,
Kaya
_______________________________________________
users mailing list
users(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/users