Hi,
I'm experiencing frequent crashes that I think are related to Solr, and I'd
appreciate any advice on how to fix them.
I have a fairly large wiki (with several thousand attachments) running
XWiki 5.4.3 on Tomcat 7, and it goes down multiple times a day. Sometimes it
remains unresponsive for a few minutes and then comes back on its own; other
times it stays down until I restart Tomcat. Usage runs anywhere from a few
active users to several thousand, depending on time of day. Crashes happen
even when the number of users is low.
In catalina.out and the localhost_access logs, there almost always
appear to be Solr-related requests around the time it goes down.
Sometimes catalina.out shows a Java heap out-of-memory error, but other
times there are no errors in the logs at all; the site is just unresponsive.
I can usually reproduce the issue by running several searches back to back,
while browsing pages in another tab. While the searches are in progress,
the responsiveness of browsing pages steadily drops until it times out. It
usually only takes 5 or 10 searches before this happens.
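For anyone who wants to try the same thing without clicking through a
browser, the manual steps above could be scripted roughly like this. It is
only a sketch: the function names are made up for illustration, and the
search and page URLs are placeholders to be supplied by whoever runs it. It
fires searches back to back in one thread while timing ordinary page loads
in another.

```python
import threading
import time
from urllib.request import urlopen


def fetch(url, timeout=30):
    """Download a URL and discard the body; raises on errors or timeouts."""
    with urlopen(url, timeout=timeout) as resp:
        resp.read()


def timed(fetch_fn, url):
    """Return (succeeded, seconds) for a single request."""
    start = time.monotonic()
    try:
        fetch_fn(url)
        ok = True
    except Exception:
        ok = False
    return ok, time.monotonic() - start


def reproduce(search_url, page_url, searches=10, fetch_fn=fetch):
    """Run searches back to back in one thread while timing page loads."""
    done = threading.Event()

    def run_searches():
        try:
            for _ in range(searches):
                timed(fetch_fn, search_url)
        finally:
            done.set()  # always signal, so the main loop can stop

    worker = threading.Thread(target=run_searches)
    worker.start()

    page_timings = []
    while True:
        # Keep loading the page until the searches finish (at least once).
        page_timings.append(timed(fetch_fn, page_url))
        if done.is_set():
            break
    worker.join()
    return page_timings
```

Calling reproduce() with a real search URL and a real page URL returns a
list of (succeeded, seconds) pairs for the page loads. If the behaviour
matches what I see by hand, the seconds should climb steadily and the later
entries should come back as failures.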
The other thing that makes me suspect Solr is that even when the site
doesn't crash, while running a search in one tab, requests in the other tab
don't finish until the search is complete - it (subjectively, I know) feels
like other requests are suspended while Solr is thinking.
Possibly related: I occasionally also see JDBC connection pool errors in
the logs when it goes down. I have tried tweaking my connection pool size
settings, to no avail.
My current java memory settings are:
JAVA_OPTS="-Djava.awt.headless=true -Xms2500m -Xmx2500m -XX:PermSize=64m
-XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC"
I've tried heap sizes ranging from the 800m settings on xwiki.org to 3GB+,
and it doesn't seem to make a difference.
I have also reduced the cache settings from 3000 back to 100 in
xwiki.properties, which hasn't helped.
My hibernate settings are:
connection.pool.size = 100
statement_cache.size = 50
dbcp.maxActive = 100
dbcp.maxIdle = 10
dbcp.maxWait = 300000
MySQL max_connections is set to 290.
I have changed nearly all of those numbers at some point, but nothing seems
to help.
Any assistance is greatly appreciated - even just tips on how to
troubleshoot what's actually taking the site down, since the logs aren't
always useful.
thanks,
aaron