Ludovic Dubost wrote:
Hi Anca,
Great analysis. This gives a lot of good information to make the right
choice for XWiki Watch.
I suggest we run with 120000 articles to see the trend of query time for
all three methods.
You should run the tests again with indexes on feed_feedentry because
this can bring good improvements.
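
As a rough illustration of the index suggestion, here is a minimal sketch using Python's built-in sqlite3 module. Only the table name feed_feedentry comes from this thread; the columns (id, feed_id, title), the index name idx_feedentry_feed and the query are assumptions made up for the example, and the real Watch schema and database engine will differ. It only shows how one might check whether a filter on a feed is served by an index rather than a full table scan.

```python
import sqlite3


def plan_for_feed_query(with_index: bool) -> str:
    """Return SQLite's query plan for selecting the entries of one feed."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only the table name is taken from the thread.
    conn.execute(
        "CREATE TABLE feed_feedentry ("
        "id INTEGER PRIMARY KEY, feed_id INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO feed_feedentry (feed_id, title) VALUES (?, ?)",
        [(i % 50, f"article {i}") for i in range(5000)],
    )
    if with_index:
        # Hypothetical index on the column used in the WHERE clause.
        conn.execute(
            "CREATE INDEX idx_feedentry_feed ON feed_feedentry (feed_id)"
        )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT title FROM feed_feedentry WHERE feed_id = ?",
        (7,),
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail.
    return " / ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_feed_query(False))
    print("with index:   ", plan_for_feed_query(True))
```

With the index in place the plan should mention idx_feedentry_feed; without it SQLite falls back to scanning the whole table. Other databases expose the same information through their own EXPLAIN output.
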
The optimized query for clicking on "All" still seems slow (it
could get worse with more data).
We should also test a text search and a tag or keyword search.
All done. I tried to log the results as well as possible despite their
instability. Everything is available at
http://watch.xwiki.org/xwiki/bin/view/Development/SpeedingUp, along with some
details on how the optimisation was made and a couple of observations.
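
Since the timings vary from run to run, one way to log them is to repeat each action several times and record the spread rather than a single value. The sketch below is only an assumption about how such a measurement could be scripted, not how the tests on the page above were actually run; the function name time_action and the placeholder workload are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict


def time_action(action: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Run `action` several times and summarise wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    # Report the spread, since a single run can be misleading.
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder workload standing in for a real request,
    # such as clicking on the "All" group.
    summary = time_action(lambda: sum(range(200_000)), runs=7)
    print(summary)
```

Reporting minimum, median and maximum matches the way the results are described in this thread ("5 on average, from 3 to 7").
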
Overall, optimised SQL seems to scale pretty well, which, combined
with the stability, reliability and extensibility of this method, makes me +1,
again, for the optimised SQL.
Along with this, Cati and I have been working (designing / coding) on the new
Watch interface that would, among other things, localize the loading messages so
that the slow server requests become a lot more transparent to the user.
Happy coding,
Anca Luca
Ludovic
Guillaume Lerouge wrote:
Hi,
I've summarized Anca's findings below (great testing btw ;-), I bet we
should do this more often):
[snip]
Action (times in seconds) | Standard SQL | Lucene | Optimized SQL (OSQL) | Winner
Initial loading of the articles, newly started server | 30-40 | up to 20 (15-16) | around 10 | OSQL
Initial load of the interface, non-newly started server | ~15 | ~4-5 on average (but can go up to 10) | 7 | Lucene
Click on the All group | around 7-9 | 1 | 5 on average, from 3 to 7 | Lucene
Click on a feed with 1023 articles | 3 | from under a second to a couple of seconds | under a second (0.7-0.8) | OSQL
Pagination navigation | 2-3 | a second on average | 2-3 on average | Lucene
As we can see, Lucene still has the edge in three of the five cases, but
Optimized SQL comes close in most of them. As far as I understand this
issue, I'd advise going for SQL optimization instead of Lucene for the
following reasons:
- It is better suited to handle the highly structured data coming from
XWiki Watch
- It already offers good performance and could deliver even more if
fully implemented
- It moves in the right direction by bringing the XWiki Watch
distribution on par with other XWiki products (such as XWS & XEM) in terms
of code organization (client-side / server-side)
- The Lucene indexing engine integration with XWiki is still error-prone
- Lucene doesn't work for real-time actions that are used a lot in XWiki
Watch
Which is why, on the whole, SQL Optimization seems better than Lucene to me.
Please tell me if I've missed something.
Guillaume