Hi
On Sunday, December 1, 2013, vincent(a)massol.net wrote:
Hi devs,
Yesterday xwiki.org crashed and I had configured it to take a heap dump.
I’ve done a quick analysis that I’m sharing here (I’ll continue to analyse):
Memory retained: 1GB
Did it crash with an OOM?
Main contenders:
1) Document cache: 178MB
This does not look that big to me
2) Lucene WeightedSpanTermExtractor: 166MB
This however seems important
3) IRCBot Threads: 165MB
This too
4) Velocity RuntimeInstance: 38MB
5) SOLR LRUCache (Lucene Document): 38MB
6) EM DefaultCoreExtensionRepository: 38MB
7) NamespaceURLClassLoader: 23MB
I’ve started analyzing some below.
1 - Document Cache Analysis
=======================
* There are 3552 XWikiDocument in memory for 195MB
* The document cache size is 2000 on xwiki.org
* Large documents (such as Test Reports) take 6MB each (XDOM caching)
* So if we had only large documents in the wiki the cache would need
2000*6MB = 12GB
* I don’t think this cache is memory aware, meaning it doesn’t free its
entries when memory is low
* 178MB for 2000 docs means average of 89K per document. Huge variation
between docs with big content and docs with no or small content
This means that when memory is low on xwiki.org it should be enough to
call a few pages with some large content to get an OOM.
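To make the arithmetic above concrete, here is a small back-of-the-envelope sketch. The three constants are the figures from the heap dump; the class name, the method names and the 100MB of free heap used in main are purely illustrative assumptions, not measurements.

```java
// Back-of-the-envelope sizing for the document cache, using the heap dump figures.
final class DocumentCacheSizing {
    static final int CACHE_ENTRIES = 2000;  // document cache size on xwiki.org
    static final long LARGE_DOC_MB = 6;     // a large document with its cached XDOM
    static final long RETAINED_MB = 178;    // memory retained by the document cache

    /** Worst case in MB: every cache slot holds a large document. */
    static long worstCaseMb() {
        return CACHE_ENTRIES * LARGE_DOC_MB;
    }

    /** Average size per cached document in KB (decimal units, 1MB = 1000KB). */
    static long averageKbPerDoc() {
        return RETAINED_MB * 1000 / CACHE_ENTRIES;
    }

    /** How many large documents fit in the given amount of free heap (in MB). */
    static long largeDocsToFill(long freeHeapMb) {
        return freeHeapMb / LARGE_DOC_MB;
    }

    public static void main(String[] args) {
        System.out.println("Worst case (MB): " + worstCaseMb());
        System.out.println("Average per doc (KB): " + averageKbPerDoc());
        // 100MB of free heap is a made-up figure, only to illustrate the risk.
        System.out.println("Large docs to fill 100MB: " + largeDocsToFill(100));
    }
}
```

With a hypothetical 100MB of free heap, fewer than twenty large pages would be enough to exhaust it, which is why an entry-count limit alone gives little protection.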
4 ideas to explore:
Idea 1: Use a cache that evicts entries when some max threshold is reached
** Infinispan doesn’t support this yet:
https://issues.jboss.org/browse/ISPN-863 and
https://community.jboss.org/thread/165951?start=0&tstart=0
** Guava seems to support size-based eviction with the ability to pass a
weight function:
http://code.google.com/p/guava-libraries/wiki/CachesExplained#Size-based_Ev…
Clearly using a cache that is size aware would be much better.
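To illustrate what size-aware eviction could look like, here is a minimal sketch in plain Java. It is not the Guava or Infinispan API and not how the XWiki cache is implemented: WeightBoundedCache and its weigher parameter are hypothetical names, and the weigher is assumed to return the same weight for a value every time it is called.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch: an LRU cache bounded by total weight, not by entry count.
final class WeightBoundedCache<K, V> {
    private final long maxWeight;
    private final ToLongFunction<V> weigher;
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long currentWeight;

    WeightBoundedCache(long maxWeight, ToLongFunction<V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        V previous = entries.put(key, value);
        if (previous != null) {
            currentWeight -= weigher.applyAsLong(previous);
        }
        currentWeight += weigher.applyAsLong(value);
        // Evict least recently used entries until the total weight fits the budget.
        // A single value heavier than the budget ends up evicted as well.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (currentWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            currentWeight -= weigher.applyAsLong(eldest.getValue());
            it.remove();
        }
    }

    synchronized long weight() {
        return currentWeight;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

For documents the weigher would have to estimate the retained size of the document plus its XDOM, which is the hard part and is left out here.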
Idea 2: Another idea is to use a distributed cache such as memcached
or elasticsearch. I wonder if the overhead of the network communication is
too high to make it interesting vs not caching the XDOM and rendering it
every time it’s needed.
I think this would not be fast enough, at least for some docs like the pref
doc. I have done this in a Google App Engine experiment and a local cache
was needed in addition to memcache, otherwise it would be slow.
Idea 3: Try to reduce even more how the XDOM is stored in memory
Indeed this looks big. It is similar to the issue of caching big attachments.
Initially I used a soft reference but real life experience showed that the
JVM would not drop them and OOM was still possible. This would be a good
solution if it also works for the XDOM cache
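For reference, the soft reference approach mentioned above can be sketched as follows. This is only an illustration under assumptions: SoftValueCache and the loader parameter are hypothetical names, not the attachment cache code, and as noted above this approach did not prevent OOM in practice.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: values are held through SoftReference so that the
// garbage collector is allowed to reclaim them when memory gets tight.
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, recomputing it if it was never loaded or was reclaimed. */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Two threads may both load the same key; the last write wins.
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of keys, including keys whose value may already have been reclaimed. */
    int size() {
        return entries.size();
    }
}
```

The loader stands in for whatever produces the expensive value, for example parsing the document content into an XDOM.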
Idea 4: Don’t cache the XDOM and render every time and use a title cache
for titles. Also do that for getting sections. I think they are the 2 main
use cases for getting the XDOM.
As a short term action, I’d recommend to immediately reduce the document
cache size from 2000 to 1000 on xwiki.org or double the heap memory.
Do both :)
2 - Lucene WeightedSpanTermExtractor Analysis
=====================================
I’m not sure what this is about yet but it looks strange.
* There is 166MB stored in the Map<String,AtomicReaderContext> of
WeightedSpanTermExtractor.
* That map contains 192 instances
* Example of map items: “doccontent_pt” (2.4MB), “title_ru” (1.8MB),
“title_ro” (1.8MB), etc
Any idea Marius?
3 - IRCBot Analysis
===============
* We use 3 IRCBot threads. They take 55MB each!
* The 55MB is taken by the ExecutionContext
* More precisely the 55MB are held in 77371
org.apache.velocity.runtime.parser.node.Node[] objects
I need to understand more why it’s so large since it doesn’t look normal.
I also wonder if it keeps increasing or not.
You should take a dump after a while, clearing the cache first, then do the
same a couple of hours later and compare the two
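The comparison could be as simple as diffing per-class byte totals from the two dumps. A rough sketch, assuming the totals have already been exported as class name to bytes maps (HeapHistogramDiff and growth are hypothetical names, and how the maps are produced from the dumps is left out):

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: compare two class histograms (class name -> bytes)
// taken a few hours apart and report which classes grew.
final class HeapHistogramDiff {
    /** Returns the growth in bytes per class, keeping only classes that grew. */
    static Map<String, Long> growth(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> entry : after.entrySet()) {
            // A class missing from the first histogram counts as 0 bytes.
            long delta = entry.getValue() - before.getOrDefault(entry.getKey(), 0L);
            if (delta > 0) {
                result.put(entry.getKey(), delta);
            }
        }
        return result;
    }
}
```

If the Velocity Node[] total shows up with a large positive delta between the two dumps, it keeps increasing; if not, it is more likely a one-time cost.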
Ludovic
5 - SOLR LRUCache Analysis
=======================
* It’s a map of 512 entries (Lucene Document objects). 512 is the cache size.
* Entries are instances of DocSlice
Looks ok and normal.
6 - EM DefaultCoreExtensionRepository Analysis
======================================
* 38MB in "Map<String, DefaultCoreExtension> extensions"
* 33MB in org.codehaus.plexus.util.xml.Xpp3Dom instances (44844 instances)
which I guess corresponds to the pom.xml of all our core extensions mostly.
Looks normal even though 33MB is quite a lot.
7 - NamespaceURLClassLoader Analysis
================================
* 23MB in org.eclipse.jgit.storage.file.WindowCache
* So this seems related to the XWiki Git Module used by the GitHubStats
application installed on dev.xwiki.org
This looks ok and normal according to
http://download.eclipse.org/jgit/docs/jgit-2.0.0.201206130900-r/apidocs/org…
Thanks
-Vincent
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs
--
Sent from Mobile