Re: [xwiki-devs] [proposal] Introduce a new persistence engine.

21 Jan 2011

On 01/20/2011 03:28 PM, Denis Gervalle wrote:
...
  On Thu, Jan 20, 2011 at 18:50, Caleb James DeLisle
&lt;calebdelisle(a)lavabit.com
  wrote: 
>
>
> On 01/20/2011 12:33 PM, Denis Gervalle wrote:
>> On Thu, Jan 20, 2011 at 17:15, Caleb James DeLisle <
> calebdelisle(a)lavabit.com
>>  wrote: >>
>>>
>>>
>>> On 01/20/2011 02:42 AM, Denis Gervalle wrote:
>>>> On Wed, Jan 19, 2011 at 19:54, Caleb James DeLisle <
>>> calebdelisle(a)lavabit.com
>>>>  wrote: >>>>
>>>>>
>>>>>
>>>>> On 01/19/2011 12:13 PM, Vincent Massol wrote:
>>>>>> Hi Caleb,
>>>>>>
>>>>>> I see you're excited, that's good! :)
>>>>>>
>>>>>> Some general comments:
>>>>>> * This looks more like a design for a transaction module than for
a
>>>>> persistence engine. I don't see anything related to persistence
in
> your
>>>>> proposal below. Your proposal could work on stuff others than
storage,
>>>>> right?
>>>>> Yes, this proposal only covers the transaction sub-module of the
>>>>> persistence engine. The so far
>>>>> un-proposed modules include xwiki-store-serialization,
>>>>> xwiki-store-filesystem and a legacy
>>>>> attachment storage module: xwiki-store-filesystem-attachments.
>>>>>
>>>>> Anything which requires transactions could use the
TransactionRunnable
>>>>> although I'm at a loss to
>>>>> think of anything other than storage which would require
transactions.
>>>>>
>>>>>> * I was expecting to see some Store/Storage/Persistence
interfaces
> with
>>>>> proposed APIs and explanation on how they could be implemented both
> with
>>>>> Hibernate and JCR for example. And the relationship with the proposed
>>> new
>>>>> Model defined.
>>>>> I don't like to propose an interface until I have tried to
implement
> it.
>>>>> Also I do not like to
>>>>> propose an implementation until I have tried to use it. At this point
>>> it's
>>>>> far enough off that I
>>>>> would rather wait than propose APIs blind.
>>>>>
>>>>> My experience with attachment store has shown that what we want is a
> set
>>> of
>>>>> functions which provide
>>>>> TransactionRunnables to do various things:
>>>>> aka:
>>>>> TransactionRunnable<T>
> getDocumentSaveTransactionRunnable(XWikiDocument
>>>>> toSave);
>>>>> In a hibernate implementation it would return
>>>>> TransactionRunnable<HibernateTransaction> and in a JCR
>>>>> it would return TransactionRunnable<JCRTransaction>.
>>>>>
>>>>> We cannot have APIs like this until TransactionRunnable is agreed
upon
>>>>> these will return instances
>>>>> of it.
>>>>>
>>>>>> * I was also expecting a strategy defined to migrate users from
the
>>>>> current implementation of the storage to the new one
>>>>> IMO we should change the persistence engine and implement the same
>>> schema,
>>>>> once the persistence
>>>>> engine is rebuilt, then we can consider modifying the schema. The
> schema
>>> is
>>>>> a specification, it may
>>>>> not be perfect but it is something to comply with. It is important to
> me
>>>>> that a we prove that a new
>>>>> persistence engine is able comply with existing specifications before
> we
>>>>> start designing new ones
>>>>> around it.
>>>>>
>>>>>>
>>>>>> I noticed some discussions between Denis and you on IRC about all
> this.
>>>>> Does you latest findings change the proposal below?
>>>>> Everything proposed still holds true but I did add 2 new features.
>>>>>
>>>>> 1. There is a way for a TransactionRunnable<DBTransaction> to
be
> passed
>>> an
>>>>> instance of DBTransaction
>>>>> using a new method called getContext().
>>>>>
>>>>> 2. There is a new class which serves what I believe is an edge use
> case.
>>>>> Suppose you want to define a TransactionRunnable (we will call it
>>>>> YourTransactionRunnable) which
>>>>> must run inside of a DBTransaction but it must _also_ run after an
>>> instance
>>>>> of MyTransactionRunnable.
>>>>> You can make MyTransactionRunnable a
>>>>> "ProvidingTransactionRunnable<DBTransaction,
MyInterface>" and
>>>>> then MyTransactionRunnable must run inside of a DBTransaction and we
>>> define
>>>>> YourTransactionRunnable
>>>>> as a TransactionRunnable<MyInterface>. This also allows
>>>>> MyTransactionRunnable to share information
>>>>> since YourTransactionRunnable.getContext() will provide an
>>> implementation
>>>>> of MyInterface. Of course
>>>>> this feature must be used with care as it provides the tools to write
>>>>> horrible constructs but IMO it
>>>>> is the type of feature which when you need it, there is no other way
>>>>> around.
>>>>>
>>>>
>>>> I just need to add that we have not been able to apply TR and PTR in
>>>> particular to a simple store with a JDBC connected database. From our
>>>> experience, the PTR cause more issue than it solve and we have to find
> a
>>>> better way to convey datas between transactions of a given transaction
>>>> chain. Data dependencies between transaction could not be easily solved
>>> at
>>>> compile time since there is many combinatory situation in real life.
>>>>
>>>> My suggestion would be to provide access to previous transaction using
> a
>>>> hash of Interfaces exposed by previous transaction. Checking
> availability
>>> of
>>>> needed interfaces could be done by transactions à preRun time which
> could
>>>> avoid the need of uselessly running the whole chain, if there is a
>>>> dependency problem.
>>>
>>> What you can do is make DBTransaction (the context) extend Map and then
> put
>>> things in that map. I
>>> really don't like that path though since is means that TR3 uses things
>>> provided by TR1 and doesn't
>>> declare that need. It breaks atomicity by sharing data between
>>> transactions, TR3 can't just be
>>> plugged in somewhere else where TR1 is not used. It also breaks breaks
>>> compile time safety, plugging
>>> TR3 in somewhere else (or reordering them) will compile but fail to run.
>>> That pattern doesn't really afford you any of the benefits of TR other
> than
>>> the try, catch safety
>>> so, to me, it makes sense in that case for the code for TR1, TR2, and
> TR3
>>> to all be inside of the
>>> same TransactionRunnable.
>>>
>>>>
>>>> I also doubt that the way of multiple transactions are executed in a
> same
>>>> single one is correct or useful. Would really prefer to see all
>>> transactions
>>>> run at a given level before going down the chain.
>>>
>>> I don't see this as making sense. My goal is to make sure when something
>>> goes wrong it fails as
>>> early as possible so there is the least amount to rollback.
>>>
>>> SaveDocumentRunnable
>>>  |
>>>  +--SaveObjectRunnable
>>>  |   |
>>>  |   +--SavePropertyRunnable EXCEPTION
>>>  |   +--SavePropertyRunnable <-- Doesn't run.
>>>  |
>>>  +--SaveObjectRunnable <-- doesn't run
>>>  |
>>>  etc. Doesn't run.
>>>
>>> I'm interested to know what is the rationale for wanting them to run one
>>> level then the next?
>>> IE: save document, then all objects then all properties.
>>>
>>>> This would provide the way
>>>> to bundle transaction (and even transaction chain) together by running
>>> them
>>>> in a single transaction. This would also helps not mixing dependencies
>>> since
>>>> currently there is an implicit availability of earlier sibling chain of
>>>> transactions that does not fit the idea of a context in evolution
> checked
>>> at
>>>> compile time and could provide unchecked implicit dependencies.
>>>
>>> I still don't quite follow, can you give a real world example?
>>>
>>
>> Well, let me base my real world example on your sample above.
>> Imagine that documents are linked to their objects using their primary
> key,
>> PK which is generated by the underlying database when you records the
>> document for the first time, and that we needs creating a new documents
> with
>> Objects in a single transaction. Your SaveObjectRunnable will need that
> PK
>> as a FK, how does it get it? Or said in another way, it needs a context
> that
>> provide the PK of the document for which it is saving the objects.
>
> If the object knows what document it belongs to couldn't you use:
> object.getDocument().getKey() ?
> I think allowing the document to know it's primary key is less ugly than
> passing everything around
> in a map.
>
 For sure, this is a contrive example which I try to use to exposed what I
 means. It is not perfect, just too simple in fact. 
It will become a real example if XWiki's Hibernate based storage is ported over.
...

  Alternatively you could use:
 UPDATE object SET object.foreignKey=(SELECT doc.id where doc.name =
 <object.getDoc().getName()>) or
 the like. It's harder on the DB but it will make each TR modular and
 atomic.

 But there is no assurance it will works since you have never say that the
 document should be created to save its objects. 
Indeed there wouldn't. I suppose the best solution is object.getDocumen().getKey().
...

  Well, lets insert a CreateDocumentRunnable before
it, that may have some
 lower TR to complete, like saving document and retrieving the generated  PK.
  How does this CreateDocumentRunnable is maybe a
sibling
 of SaveObjectRunnable provide the PK to it ? 
 I don't understand why the SaveObjectRunnable would be a sibling and of the
 SaveDocumentRunnable. If
 the Object is part of the document then shouldn't it be a child?

 No, SaveObjectRunnable will not be sibling of the SaveDocumentRunnable, it
 will be a child of it, for sure, but it could be a sibling of let say
 CreateDocumentRunnable which have itself some children for doing its own
 job.
   How do
you check
 that SaveObjectRunnable requirement to have that PK is fulfilled ? Have  you
  an idea of the structure needed for this example
? 
 IMO siblings should never depend on one another, if the TR needs something
 from the last TR then it
 really needs to be a child of it. this also has the benefit that you can
 pass around a TR for saving
 a document and it will have all TRs for saving each object and each
 property inside of it.

 To say it an other way, saving a document, could be divided in several
 pieces, and there is no assurance that you could always convert that in
 parent child relationships, especially when a third transaction depends on
 the result of 2 previous one that are not dependent from each other but are
 bundled together in a third one. That would be a perfect world if you could
 really expect that all dependencies are one to many, and never many to one.
    A
  /    \
 B    C
  \    /
    D
 A is composed of B and C, and D depends on A. 
How about this:
getDocumentSaveRunnable() returns:
A
|
+-B
+-C
and then if D requires A it can call:
D.runIn(A) and now you will have
A
|
+-B
+-C
+-D
But this pattern will fail if A adds more subTR's in it's preRun function.
Caleb
...

 Thank you, I really appreciate your review.

 Thanks ;), I really hope it could make it better !
 Denis
  > Caleb

>
>> Denis
>  
>  
>>>
>>>> To conclude, this is a very interesting proposal, that needs more
>>>> refinements before being used wildly in all situation requiring
>>>> transactional processing.
>>  
>>  >>>> Nice idea and good job Caleb !
>>  >>> Thanks :)
>>  >>> Caleb
>>  
>>>  >>>> Denis
>>>  
>>>  
>>>  
>>>>
>>>>> Caleb
>>>>  
>>>>>  >>>>>> Thanks
>>>>>> -Vincent
>>>>>
>>>>>> On Jan 10, 2011, at 2:15 PM, Caleb James DeLisle wrote:
>>>>>
>>>>>>> Hi,
>>>>>>> I have been working hard on filesystem attachments and I
found that
>>>>> synchronizing manual filesystem
>>>>>>> transactions with automatic database transactions was a
difficult
> job
>>>>> and I needed a new tool to do
>>>>>>> it. So I wrote what I am proposing to be the next XWiki
Persistence
>>>>> Engine.
>>>>>>
>>>>>>> I'll start off with the fun part of the proposal, I
have been
> calling
>>> it
>>>>> xwiki-store so far but I am
>>>>>>> so excited about the capabilities of this engine that I
don't think
> it
>>>>> does it justice to name it
>>>>>>> "store" after the place on the corner with milk and
eggs. I am
>>> proposing
>>>>> it be named "XWiki
>>>>>>> Persistence Engine", the directory will be renamed
> xwiki-persistence,
>>>>> the artifact name
>>>>>>> xwiki-core-persistence, and the package name
org.xwiki.persistence.
>>>>> Persistence is an attribute of
>>>>>>> castles, mountains and redwood trees which I think is fitting
for a
>>>>> conservatively designed storage
>>>>>>> engine.
>>>>>>
>>>>>>> Now a little explanation of what I'm so excited
about:
>>>>>>> The common and error prone way of saving things in the
database is
> to
>>>>> open a transaction, enter a
>>>>>>> try clause, do something then commit. If we catch an
exception, then
>>> we
>>>>> rollback.
>>>>>>> something like this:
>>>>>>
>>>>>>> begin transaction;
>>>>>>> try {
>>>>>>>   do something;
>>>>>>>   do something else;
>>>>>>>   commit;
>>>>>>> } catch (Any exception which may occur) {
>>>>>>>   rollback;
>>>>>>> }
>>>>>>
>>>>>>> There are 3 things which can go wrong. 1 we forget to
begin the
>>>>> transaction, 2 we forget to commit
>>>>>>> and 3 we do not rollback properly. What makes things worse is
often
>>> the
>>>>> database will "assume we
>>>>>>> meant to..." and things will work ok most of the time
which makes
>>> things
>>>>> much worse because bugs
>>>>>>> will hide very well.
>>>>>>
>>>>>>> My answer to this problem is a class called
TransactionRunnable. It
>>>>> provides a set of 5 empty
>>>>>>> methods to override: onPreRun(), onRun(), onCommit(),
onRollback(),
>>> and
>>>>> onComplete(). the exact
>>>>>>> circumstances under which each are called is documented in
the
> javadoc
>>>>> comments here:
>>>>>>
>>>>  
>>  >
http://svn.xwiki.org/svnroot/xwiki/contrib/sandbox/xwiki-store/xwiki-store-…
>>>>>>> I wrote TransactionRunnable twice, I wrote it, used it for
>>> attachments,
>>>>> then after having real
>>>>>>> experience as a user, I wrote it again.
>>>>>>
>>>>>>> To repeat our original example with TransactionRunnable
you might
> say
>>>>> this:
>>>>>>
>>>>>>> public class DoSomethingTransactionRunnable extends
>>> TransactionRunnable
>>>>>>> {
>>>>>>>   public void onRun()
>>>>>>>   {
>>>>>>>     do something;
>>>>>>>     do something else;
>>>>>>>   }
>>>>>>> }
>>>>>>
>>>>>>> Now we can use another TransactionRunnable which opens
and closes
> the
>>>>> transaction for us.
>>>>>>
>>>>>>> StartableTransactionRunnable transaction = new
>>>>> HibernateTransactionRunnable();
>>>>>>> new DoSomethingTransactionRunnable().runIn(transaction);
>>>>>>> transaction.start();
>>>>>>
>>>>>>> the runIn() function allows us to run one
TransactionRunnable inside
>>> of
>>>>> another. Supposing we wanted
>>>>>>> to reuse "do something else" in other places, we
can make it a
>>> separate
>>>>> TransactionRunnable and use
>>>>>>> the runIn() function to hook it to our
> DoSomethingTransactionRunnable
>>>>> ie:
>>>>>>
>>>>>>> public class DoSomethingTransactionRunnable extends
>>> TransactionRunnable
>>>>>>> {
>>>>>>>   public DoSomethingTransactionRunnable()
>>>>>>>   {
>>>>>>>     new DoSomethingElseTransactionRunnable().runIn(this);
>>>>>>>   }
>>>>>>> ..
>>>>>>
>>>>>>> The only limitations on running TransactionRunnables
inside of one
>>>>> another are they cannot run more
>>>>>>> than once and they cannot call themselves (this would be an
infinite
>>>>> loop).
>>>>>>
>>>>>>> This pattern makes each job which is done on storage
easily isolated
>>>>> and, as I have so far
>>>>>>> experienced, trivial to test. However, it still leaves the
> possibility
>>>>> that we might forget that
>>>>>>> DoSomethingTransactionRunnable must be run inside of a
hibernate
>>>>> transaction. I have devised a
>>>>>>> solution for this too. Using generics, I offered a means for
the
>>> author
>>>>> of a TransactionRunnable to
>>>>>>> communicate to the compiler what other TransactionRunnable
their
>>>>> runnable must be run in and without
>>>>>>> explicit casting or defining of an intermediary runnable,
this
>>>>> requirement cannot be violated or
>>>>>>> else it wouldn't compile!
>>>>>>
>>>>>>> Finally we have the issue of starting the runnable.
Who's to say I
>>> won't
>>>>> be tired one day and just
>>>>>>> write new DoSomethingTransactionRunnable().start() without
opening a
>>>>> transaction first? If
>>>>>>> DoSomethingTransactionRunnable cannot be safely run outside
of a
>>>>> transaction all it needs to do is
>>>>>>> not extend StartableTransactionRunnable and it won't have
any start
>>>>> function.
>>>>>>
>>>>>>> I have taken a multitude of very easy mistakes and
given the author
> of
>>> a
>>>>> TransactionRunnable the
>>>>>>> tools to make it very hard for the user to make them. Also,
since a
>>>>> TransactionRunnable has no
>>>>>>> reason to be very long (it can just branch off into another
> runnable)
>>>>> this will make testing and
>>>>>>> code review easy in the place where it is most important.
This part
> of
>>>>> the code is entirely generic
>>>>>>> and has no dependence on hibernate or anything else.
>>>>>>
>>>>>>> I propose we move:
>>>>>>> contrib/sandbox/xwiki-store/xwiki-store-transaction/
>>>>>>> to:
>>>>>>> platform/core/xwiki-persistence/xwiki-persistence-transaction
>>>>>>
>>>>>>> And I will propose moving each additional piece in the
coming days.
>>>>>>
>>>>>>> WDYT?
>>>>>>
>>>>>>> Caleb
>>>>>
>>>>>> _______________________________________________
>>>>>> devs mailing list
>>>>>> devs(a)xwiki.org
>>>>>> http://lists.xwiki.org/mailman/listinfo/devs
>>>>>  
>>>>  >>>>>
_______________________________________________
>>>>> devs mailing list
>>>>> devs(a)xwiki.org
>>>>> http://lists.xwiki.org/mailman/listinfo/devs
>>>>  
>>>  
>>>  
>>>
>>  >>>
_______________________________________________
>>> devs mailing list
>>> devs(a)xwiki.org
>>> http://lists.xwiki.org/mailman/listinfo/devs
>>  
>  
>  
>  

> _______________________________________________
> devs mailing list
> devs(a)xwiki.org
> http://lists.xwiki.org/mailman/listinfo/devs

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [xwiki-devs] [proposal] Introduce a new persistence engine.