On 08/03/2011 04:20 AM, Jerome Velociter wrote:
Hi Caleb,
This is exciting news!
On Tue, Aug 2, 2011 at 11:51 PM, Caleb James DeLisle
<calebdelisle(a)lavabit.com> wrote:
I have an instance of XWiki finally running on
Cassandra.
http://kk.l.to:8080/xwikiOnCassandra/
Cassandra is a "NoSQL" database. Unlike a traditional SQL database, it cannot do
advanced queries, but it can store data in a more flexible way, e.g. each row is like a
hashtable where additional "columns" can be added at will.
Do we have a clear view of what XWQL queries will/will not be
supported, or is it too soon to say?
XWQL is based on the idea of JPQL, and the schema I designed is the same as the
schema which the XWQL interpreter pretended to be using.
As soon as I have JPQL functioning, simple XWQL queries such as:
"SELECT doc.fullName FROM XWikiDocument as doc WHERE doc.author =
'XWiki.Admin'" will need only be changed to:
"SELECT doc.fullName FROM com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument as
doc WHERE doc.author = 'XWiki.Admin'"
and obviously, "where SQL" will be the same.
The traditional XWQL object querying notation: FROM XWikiDocument as doc,
doc.object(XWiki.XWikiComments) as cmt WHERE ...
maps over to the JPQL statement: FROM
com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument as doc, IN(doc.objects) cmt WHERE
cmt.className = 'XWiki.XWikiComments' AND ...
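
The two rewrites described above can be sketched as a small string transformation. This is only an illustration in Python, not the actual XWiki translator: the function name `xwql_to_jpql` and the regular expressions are hypothetical, and only the entity name and the `IN(doc.objects)` / `className` pattern come from the message. It handles simple queries whose WHERE clause is a plain conjunction and nothing more.

```python
import re

# Entity name quoted in the message; everything else here is a
# hypothetical illustration, not XWiki's real query translator.
PERSISTABLE_DOC = "com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument"

# Matches clauses such as "doc.object(XWiki.XWikiComments) as cmt".
OBJECT_CLAUSE = re.compile(r"(\w+)\.object\(([\w.]+)\)\s+as\s+(\w+)", re.IGNORECASE)


def xwql_to_jpql(xwql: str) -> str:
    """Rewrite a simple XWQL query into the JPQL form described above."""
    class_filters = []

    def replace_object(match):
        doc_alias, class_name, obj_alias = match.groups()
        # The object's class becomes an explicit condition in JPQL.
        class_filters.append(f"{obj_alias}.className = '{class_name}'")
        return f"IN({doc_alias}.objects) {obj_alias}"

    jpql = OBJECT_CLAUSE.sub(replace_object, xwql)
    # Swap the short entity name for the persistable class.
    jpql = re.sub(r"\bFROM\s+XWikiDocument\b", f"FROM {PERSISTABLE_DOC}",
                  jpql, flags=re.IGNORECASE)
    if class_filters:
        prefix = " AND ".join(class_filters)
        if re.search(r"\bWHERE\b", jpql, flags=re.IGNORECASE):
            # Prepend the class conditions to the existing WHERE clause.
            jpql = re.sub(r"\bWHERE\b", f"WHERE {prefix} AND", jpql,
                          count=1, flags=re.IGNORECASE)
        else:
            jpql = f"{jpql} WHERE {prefix}"
    return jpql
```

A WHERE clause containing OR would need parentheses around the original condition, which this sketch does not add.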
The most important feature of Cassandra is that multiple Cassandra nodes can be connected together
into potentially very large "swarms" of nodes which reside in different racks or
even data centers continents apart, yet all of them represent the same database.
Cassandra was developed by Facebook and their swarm was said to be over 200 nodes
strong.
In its application with XWiki, each node can have an XWiki engine sitting on top of
it and users can be directed to the geographically closest node or to the node which is
most likely to have a cache of the page which they are looking for.
Where a traditional cluster is a group of XWiki engines sitting atop a single MySQL
engine, this allows for a group of XWiki engines to sit atop a group of Cassandra engines
in a potentially very scalable way.
In a cloud setting, one would either buy access to a provided NoSQL store such as
Google's BigTable or set up a number of XWiki/Cassandra stacks in a less
managed cloud such as Rackspace's or Amazon's.
How it works:
XWiki objects in the traditional Hibernate based storage engine are persisted by breaking
them up into properties which are then joined again when the object is loaded.
A user object which has a name and an age will occupy a row in each of three tables:
the xwikiobjects table, the xwikistrings table, and the xwikiintegers table.
The object's metadata will be in the xwikiobjects table while the name will be in a
row in the xwikistrings table and the age, a number, will go in the xwikiintegers table.
The NoSQL/Datanucleus based storage engine does this differently: the same object only
occupies space in the XWikiDocument table, where it takes advantage of Cassandra's
flexibility by simply adding a new column for each property.
NOTE: this is not fully implemented yet; objects are still stored serialized.
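
To make the contrast concrete, here is a minimal in-memory sketch of the two layouts in Python. The table names come from the description above; the row fields (`id`, `name`, `value`), the column naming scheme `Class[number].property`, and all function names are assumptions made up for illustration, and, per the note above, the column-per-property layout is the intended design rather than what the store currently does.

```python
# Hypothetical model: plain lists and dicts stand in for tables and rows.

def store_relational(tables, object_id, class_name, properties):
    """Hibernate-style layout: one metadata row plus one row per
    property, placed in a table chosen by the property's type."""
    tables["xwikiobjects"].append({"id": object_id, "className": class_name})
    for name, value in properties.items():
        if isinstance(value, str):
            table = "xwikistrings"
        elif isinstance(value, int):
            table = "xwikiintegers"
        else:
            raise TypeError(f"unsupported property type for {name!r}")
        tables[table].append({"id": object_id, "name": name, "value": value})


def load_relational(tables, object_id):
    """Join the property rows back together when the object is loaded."""
    properties = {}
    for table in ("xwikistrings", "xwikiintegers"):
        for row in tables[table]:
            if row["id"] == object_id:
                properties[row["name"]] = row["value"]
    return properties


def store_wide_row(document_row, class_name, number, properties):
    """Wide-row layout: every property becomes an extra column on the
    document's own row, so no join is needed on load."""
    for name, value in properties.items():
        document_row[f"{class_name}[{number}].{name}"] = value
```

With a user object holding a name and an age, `store_relational` touches three tables while `store_wide_row` only adds two columns to the existing document row.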
What works
* Document storage
* Classes and Objects
* Attachments
* Links and Locks
* Basic querying with JDOQL
What doesn't work
* Querying inside of objects
* JPQL/XWQL queries
* Document history
* Permissions (requires unimplemented queries)
* The feature you want
I am interested in what the community thinks is the first priority: I can work on
performance, which will likely lead to patches being merged into master which will benefit
everyone
Do you mean global performance of XWiki, or something in a specific area?
FYI, in case you missed it, there was a mail by Paul
Libbrecht about a possible fine tuning of the Hibernate cache that could
boost performance ([xwiki-devs] hibernate cache optimization?).
To answer your question, as a member of the community I am interested in
the performance of XWiki with a large number of documents; I'd say both
generic performance improvements and experimental work on Cassandra
fit in this line :)
Cool. Cassandra itself should scale well; what we really need, IMO, is to run through the
line of execution, find out what takes the longest, and fix it.
Caleb
Jerome
or I can work on more features which will benefit
people who want to use XWiki as a traditional application wiki but use it on top of
Cassandra.
You can reply here or add comments to the wiki ;)
Caleb
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs