Hello Vincent,
While I strongly believe that NoSQL-type storage is a fundamentally
good idea for storing activity streams, I suspect that the appeal of
ElasticSearch over Solr is mostly superficial.
Most analytics systems are indeed based on NoSQL storage, and
ElasticSearch and Solr are examples of it. Other analytics solutions
use bigger systems such as CouchDB and MongoDB. Almost all of them
optimize the storage for the chosen views.
My impression is that many people are excited by ElasticSearch because
it has fancy UIs, whereas Solr may be better optimized thanks to its
very effective caching. In both cases, creating an analytics system
involves designing a storage layout that makes the queries expected by
the views of the analytics system efficient, e.g. the row of page-view
counts over recent time periods. I would expect a Solr-based and an
ElasticSearch-based Stats module to differ very little.
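To illustrate what I mean by designing the storage for the view, here is
a minimal sketch in plain Python, independent of Solr or ElasticSearch.
The class name, the hourly bucket size and the page name are assumptions
made up for the example, not anything that exists in XWiki. The point is
only that the counters are keyed the way the view will read them.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class PageViewCounts:
    """Hypothetical store keeping one counter per (page, hour) bucket."""

    def __init__(self):
        self._counts = defaultdict(int)

    @staticmethod
    def _bucket(when):
        # Truncate the timestamp to the start of its hour.
        return when.replace(minute=0, second=0, microsecond=0)

    def record(self, page, when):
        # A write only increments the counter of the matching bucket.
        self._counts[(page, self._bucket(when))] += 1

    def recent(self, page, now, n_hours):
        # The view reads n_hours buckets, oldest first, by direct key lookup.
        end = self._bucket(now)
        buckets = [end - timedelta(hours=i) for i in range(n_hours - 1, -1, -1)]
        return [self._counts.get((page, b), 0) for b in buckets]


stats = PageViewCounts()
now = datetime(2015, 11, 21, 12, 1)
stats.record("Main.WebHome", now)
stats.record("Main.WebHome", now + timedelta(minutes=30))
stats.record("Main.WebHome", now - timedelta(hours=2))
row = stats.recent("Main.WebHome", now, 3)
```

Whichever engine is used, the same decision (which buckets, which keys)
has to be made up front, which is why I expect the two to end up similar.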
One thing that is crucial when using a stats system (and, I believe,
even when trying to tune the SQL-stored activity stream by doing fewer
writes) is that viewers should not expect a perfectly real-time view.
ElasticSearch and Solr behave the same way here: real time is only
"near real time". Alternatively, the real-time aspect (as done by Google
Analytics, for example) should be a completely separate view, which is
probably based on in-memory values.
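A rough sketch of that separation, again in plain Python and under my own
assumptions: `LiveCounter` and `DictStore` are hypothetical names, and
`DictStore` merely stands in for the indexed store. Hits are counted in
memory for the live view and pushed to the store in batches, so the
stored view lags behind until the next flush.

```python
import threading
from collections import Counter


class DictStore:
    """Stand-in for the indexed store (Solr, ElasticSearch, ...)."""

    def __init__(self):
        self.totals = Counter()

    def add(self, page, n):
        self.totals[page] += n


class LiveCounter:
    """In-memory counts for the live view, flushed to the store in batches."""

    def __init__(self, store):
        self._store = store
        self._live = Counter()     # read by the real-time view
        self._pending = Counter()  # not yet written to the store
        self._lock = threading.Lock()

    def hit(self, page):
        with self._lock:
            self._live[page] += 1
            self._pending[page] += 1

    def live(self, page):
        with self._lock:
            return self._live[page]

    def flush(self):
        # Swap the pending batch out under the lock, then write it outside.
        with self._lock:
            batch, self._pending = self._pending, Counter()
        for page, n in batch.items():
            self._store.add(page, n)


store = DictStore()
counter = LiveCounter(store)
counter.hit("Main.WebHome")
counter.hit("Main.WebHome")
before_flush = store.totals["Main.WebHome"]
counter.flush()
after_flush = store.totals["Main.WebHome"]
```

In a real setup the flush would run on a timer, and that interval is what
"near real time" means for the stored view.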
paul
PS: did you consider using HSQLDB for a part of this?
It is in-memory, and locks certainly hurt far less there.
Persistence should be somewhat decoupled...
PPS: schema evolution is never painless, even in a NoSQL system. If a
field needs to be merged or split, there is a price to it, whatever the
storage system.
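As a small illustration of that price, here is a sketch of splitting one
field into two. The field names (`author`, `authorWiki`, `authorUser`)
and the "wiki:User" format are invented for the example. Even without a
schema to alter, every stored document still has to be read and
rewritten once.

```python
def split_author(doc):
    # Hypothetical migration: split "author" ("wiki:User") into two fields.
    wiki, _, user = doc["author"].partition(":")
    migrated = {k: v for k, v in doc.items() if k != "author"}
    migrated["authorWiki"] = wiki
    migrated["authorUser"] = user
    return migrated


def migrate(docs):
    # Documents still in the old format are rewritten, the others are kept.
    return [split_author(d) if "author" in d else d for d in docs]


docs = [
    {"id": 1, "author": "xwiki:Admin"},
    {"id": 2, "authorWiki": "xwiki", "authorUser": "Paul"},
]
migrated = migrate(docs)
```

The alternative is to keep both formats and handle them at query time,
which moves the cost into every reader instead.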
vincent(a)massol.net
21 November 2015 12:01
Hi devs,
I think that for data that are both not critical and high volume we
should use ElasticSearch instead of saving them in our RDBMS.
So the idea would be to have an embedded ES in XWiki by default (using
the permanent directory to store its data) and admins could configure
XWiki to use a separate ES instance (very similar to what we do with
SOLR).
Whenever a user modifies/creates/deletes/does operations on
XObjects/etc, this is sent to ES.
The AS UI queries ES to display the data.
The Stats UI does the same.
Pros:
- scalability
- performance
- extensibility. It’s easy to evolve the schema in ES, and we can
easily have several formats (as was proven by the Active Installs code)
I’d like to start a POC in my “free” time.
WDYT?
Thanks
-Vincent
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs