Hi Caleb,
Exciting news indeed. This looks great, and there seem to be quite a
few things already working.
I have one question concerning the mention of multiple XWiki nodes
connecting in different locations to multiple Cassandra nodes. This
would also require some tweaking of the XWiki Cache, or a new
"cluster" mode which allows WAN communication between instances.
Otherwise you could be editing or viewing an older version than what's
really in the Cassandra store.
Have you looked at this already? If you have touched the XWiki Cache,
maybe that's why you have performance issues. It is important to
cache the XWikiPreferences document, as it is requested very often. One
of the things I did in the Google Store work a while ago was to
have a special additional cache in the XWikiContext, which made
sure we didn't check on every access the MemCache that contained the
most recent version number of XWiki documents. This gave decent
performance: only the first access to a document in a given HTTP
request would trigger a version number verification.
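To make the idea concrete, here is a rough sketch of that per-request check, in Python rather than the actual XWiki Java code. All class and attribute names (SharedVersionCache, RequestContext, DocumentStore, verified) are made up for illustration and are not the real XWiki or Google Store APIs. The shared cache stands in for the MemCache holding the latest version numbers, and the request context plays the role of the XWikiContext.

```python
class SharedVersionCache:
    """Hypothetical stand-in for the shared cache of latest version numbers."""

    def __init__(self):
        self.versions = {}  # document name -> latest version number
        self.lookups = 0    # counts how often the shared cache is hit

    def latest_version(self, doc_name):
        self.lookups += 1
        return self.versions.get(doc_name)


class RequestContext:
    """Hypothetical per-request context (the role XWikiContext plays)."""

    def __init__(self):
        # Documents whose version was already verified during this request.
        self.verified = {}


class DocumentStore:
    """Loads documents, verifying the version at most once per request."""

    def __init__(self, shared_cache, backend):
        self.shared_cache = shared_cache
        self.backend = backend   # document name -> (version, content)
        self.local_cache = {}    # node-local document cache

    def load(self, context, doc_name):
        # Already verified in this request: skip the shared cache entirely.
        if doc_name in context.verified:
            return context.verified[doc_name]

        # First access in this request: compare versions exactly once.
        latest = self.shared_cache.latest_version(doc_name)
        cached = self.local_cache.get(doc_name)
        if cached is None or cached[0] != latest:
            # Local copy is missing or stale, reload it from the store.
            cached = self.backend[doc_name]
            self.local_cache[doc_name] = cached

        context.verified[doc_name] = cached
        return cached
```

With this shape, a page that reads XWikiPreferences many times costs one shared-cache lookup per request, and a stale local copy is still picked up at the start of the next request.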
I was wondering how you handle the queries used by the XWiki core and
the default XAR application? In the end I believe we need to move
core queries to XWQL to have compatibility across stores.
I saw that wiki macros don't seem to work. This must be because of
missing object queries.
In terms of priorities I believe the following are important:
- assessment of which default XE features are not working and what is
required to make them work (this would allow us to "define" what a
Cassandra XE version would be)
- basic XWQL querying with queries on objects
- history
- permissions
Also at some point:
- performance comparison with a very large number of documents / very high load
Great stuff in any case.
Ludovic
2011/8/2 Caleb James DeLisle <calebdelisle(a)lavabit.com>:
I have an instance of XWiki finally running on
Cassandra.
http://kk.l.to:8080/xwikiOnCassandra/
Cassandra is a "NoSQL" database. Unlike a traditional SQL database, it cannot do
advanced queries, but it can store data in a more flexible way, e.g. each row is like a
hashtable where additional "columns" can be added at will.
The most important feature of Cassandra is that multiple Cassandra nodes can be connected
together into potentially very large "swarms" of nodes which reside in different
racks or even data centers continents apart, yet all of them represent the same database.
Cassandra was developed by Facebook and their swarm was said to be over 200 nodes
strong.
In its application with XWiki, each node can have an XWiki engine sitting on top of
it and users can be directed to the geographically closest node or to the node which is
most likely to have a cache of the page which they are looking for.
Where a traditional cluster is a group of XWiki engines sitting atop a single MySQL
engine, this allows for a group of XWiki engines to sit atop a group of Cassandra engines
in a potentially very scalable way.
In a cloud setting, one would either buy access to a provided NoSQL store such as
Google's BigTable or set up a number of XWiki/Cassandra stacks in a less
managed cloud such as Rackspace's or Amazon's.
How it works:
XWiki objects in the traditional Hibernate based storage engine are persisted by breaking
them up into properties which are then joined again when the object is loaded.
A user object which has a name and an age will occupy a row in each of three tables, the
xwikiobjects table, the xwikistrings table, and the xwikiintegers table.
The object's metadata will be in the xwikiobjects table while the name will be in a
row in the xwikistrings table and the age, a number, will go in the xwikiintegers table.
The NoSQL/DataNucleus based storage engine does this differently: the same object only
occupies space in the XWikiDocument table, where it takes advantage of Cassandra's
flexibility by simply adding a new column for each property.
NOTE: this is not fully implemented yet, objects are still stored serialized.
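The difference between the two layouts can be sketched roughly as follows. This is an illustration in Python using plain dicts, not the actual Hibernate or DataNucleus code. The table names come from the description above, but the column names (id, className, name, value, fullName) and the "ClassName.property" column naming are assumptions made up for the example.

```python
def to_relational_rows(obj_id, class_name, properties):
    """Split one object across three tables, one row per property."""
    tables = {"xwikiobjects": [], "xwikistrings": [], "xwikiintegers": []}
    # Object metadata goes in its own row.
    tables["xwikiobjects"].append({"id": obj_id, "className": class_name})
    for name, value in properties.items():
        if isinstance(value, str):
            target = "xwikistrings"
        elif isinstance(value, int) and not isinstance(value, bool):
            target = "xwikiintegers"
        else:
            raise TypeError("unsupported property type: %r" % (value,))
        tables[target].append({"id": obj_id, "name": name, "value": value})
    return tables


def from_relational_rows(tables, obj_id):
    """Join the property rows back together when the object is loaded."""
    properties = {}
    for table in ("xwikistrings", "xwikiintegers"):
        for row in tables[table]:
            if row["id"] == obj_id:
                properties[row["name"]] = row["value"]
    return properties


def to_wide_row(doc_name, class_name, properties):
    """Store the same object in the document's row, one column per property."""
    row = {"fullName": doc_name}
    for name, value in properties.items():
        # Hypothetical column naming: "<class name>.<property name>".
        row["%s.%s" % (class_name, name)] = value
    return row
```

For a user object with a name and an age, the first layout yields three rows in three tables that must be joined on load, while the second yields a single row with two extra columns.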
What works
* Document storage
* Classes and Objects
* Attachments
* Links and Locks
* Basic querying with JDOQL
What doesn't work
* Querying inside of objects
* JPQL/XWQL queries
* Document history
* Permissions (requires unimplemented queries)
* The feature you want
I am interested in what the community thinks is the first priority. I can work on
performance, which will likely lead to patches being merged into master which will benefit
everyone,
or I can work on more features, which will benefit people who want to use XWiki as a
traditional application wiki but use it on top of Cassandra.
You can reply here or add comments to the wiki ;)
Caleb
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs
--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost