A transaction is an atomic set of database operations (in this case, object operations). It either commits successfully, in which case all operations inside it are guaranteed to have completed, or it fails, in which case all operations performed since the beginning of the transaction are discarded.
BeanKeeper currently supports only user-managed transaction demarcation, which means you have to
tell the library when a transaction begins and when it commits. Even if you do not explicitly define transactions,
you use them implicitly, because each operation of the Store uses one. When you call save()
or remove(), the method creates a transaction for itself (if none is active) and uses it to execute the required
operation. If an error occurs inside these methods, the enclosing transaction is marked rollback-only. This
means the transaction can only roll back, whatever happens after that.
To explicitly use transactions, first you have to get the TransactionTracker from the Store:
TransactionTracker tt = getStore().getTransactionTracker();
Where getStore() is some method of the application which returns the singleton instance of the Store.
The TransactionTracker manages all transactions currently in the application. If you want to keep track of
transaction commit and rollback events, you can register listeners to this tracker, which will be notified on every
commit or rollback event. To do this (assuming tt is the TransactionTracker object):
tt.addListener(new MyTransactionListener());
After this call, MyTransactionListener is invoked each time a commit or rollback event is generated.
The TransactionListener interface has two methods, both straightforward:
void transactionCommited(Transaction transaction);
void transactionRolledback(Transaction transaction);
Note that these methods receive the transaction object that generated the event, but that transaction is already finished, so you cannot use it to execute operations. Also, if you get a new transaction from the tracker and execute database operations inside these methods, that transaction will not trigger the listener methods again; this avoids infinite recursion.
The TransactionTracker also hands out the Transaction objects used to mark the
beginning and end of a transaction. To get a transaction object, execute the following code:
Transaction tx = tt.getTransaction(TransactionTracker.TX_REQUIRED);
The parameter of this call specifies how to handle a transaction that may already be active in the same thread. It can be one of TX_REQUIRED, TX_NEW, or TX_OPTIONAL.
To understand the differences between these modes, note that each Transaction object is associated with the
thread it was created in. In other words, if a method allocates a transaction, each method called from that method
uses the allocated transaction implicitly (unless instructed otherwise). If you get a transaction using TX_REQUIRED,
you tell the tracker that an explicit transaction is required for the following operations. The tracker does the minimum
needed to fulfill your request: if a transaction is already active, because somebody further up the call stack has already requested
one, that transaction is used. If no transaction has been requested yet, a new one is allocated and used. Either way,
a Transaction object is always returned. Keep in mind that if you set this transaction to roll back, you may
also roll back all the operations executed by the calling methods further up the call stack. TX_REQUIRED is the most commonly used
mode of transaction allocation.
Using TX_NEW tells the tracker that a new transaction is to be allocated, even if a transaction is already active in the thread. This means that if you roll back this transaction, the enclosing transaction can still commit successfully. To make this possible, the tracker keeps a stack of transactions for each thread. When the new transaction finishes (either commits or rolls back), the previous transaction, which was interrupted when the new one was requested, becomes active again. This is not an embedded transaction: no parameters of the previous transaction are visible in the new transaction, and no modifications made by a possible parent transaction are visible either.
Using TX_OPTIONAL, you receive either the transaction currently active in the thread, or null if no transaction is currently active.
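The three modes can be pictured as operations on a per-thread stack. The following is only a minimal sketch of that idea under stated assumptions, not BeanKeeper's implementation: ToyTracker, ToyTx, the Mode enum and finish() are hypothetical stand-ins invented for illustration, and the nesting of begin() and commit() calls is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a transaction; NOT the BeanKeeper Transaction class.
final class ToyTx {
    final int id;
    ToyTx(int id) { this.id = id; }
}

// Toy model of the three allocation modes (assumed semantics, for illustration only).
final class ToyTracker {
    enum Mode { REQUIRED, NEW, OPTIONAL }

    // One stack of open transactions per thread.
    private final ThreadLocal<Deque<ToyTx>> stacks = ThreadLocal.withInitial(ArrayDeque::new);
    private final AtomicInteger ids = new AtomicInteger();

    ToyTx getTransaction(Mode mode) {
        Deque<ToyTx> stack = stacks.get();
        ToyTx active = stack.peek();                 // null if nothing is active in this thread
        if (mode == Mode.OPTIONAL) return active;    // never allocates
        if (mode == Mode.REQUIRED && active != null) return active; // reuse the caller's transaction
        ToyTx fresh = new ToyTx(ids.incrementAndGet());
        stack.push(fresh);                           // REQUIRED without an active one, or NEW
        return fresh;
    }

    // Ends the top-most transaction; the one below it (if any) becomes active again.
    void finish(ToyTx tx) {
        Deque<ToyTx> stack = stacks.get();
        if (stack.peek() == tx) stack.pop();
    }
}

final class ModesDemo {
    public static void main(String[] args) {
        ToyTracker tt = new ToyTracker();
        System.out.println(tt.getTransaction(ToyTracker.Mode.OPTIONAL) == null); // true
        ToyTx outer = tt.getTransaction(ToyTracker.Mode.REQUIRED);
        System.out.println(tt.getTransaction(ToyTracker.Mode.REQUIRED) == outer); // true
        ToyTx inner = tt.getTransaction(ToyTracker.Mode.NEW);
        System.out.println(inner == outer);                                       // false
        tt.finish(inner);
        System.out.println(tt.getTransaction(ToyTracker.Mode.OPTIONAL) == outer); // true
    }
}
```

In this model a transaction requested with NEW sits on top of the stack until it finishes, after which the interrupted one is active again, which matches the behaviour described above.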
Now that we have a Transaction object, we can use it to demarcate our transaction boundaries:
Transaction tx = tt.getTransaction(TransactionTracker.TX_REQUIRED);
tx.begin();
...operations...
tx.commit();
The above code demonstrates the basic usage of transaction demarcation. When you get a transaction from the tracker, you must
call begin() to start the transaction. Remember that this transaction may be the same one that one of our callers
already allocated, and most likely that code called begin() too. Do not worry: these transactions can handle
nested begin() calls, so you do not have to guess whether begin() was already called. Always call it.
Each begin() call must have its closing commit() or rollback(). If neither
is called after a begin(), the transaction becomes unbalanced, in the same way that parentheses in an expression can be unbalanced. This
leads to unexpected, unclosed, or uncommitted transactions, so it should be avoided. Most of the time the cause is an exception
that alters code execution, so that commit() is never reached. To guard against this, the following code is suggested:
Transaction tx = tt.getTransaction(TransactionTracker.TX_REQUIRED);
tx.begin();
try
{
   ...operations...
} catch ( ... ) {
   ... handling code...
   tx.markRollbackOnly();
} finally {
   tx.commit();
}
As you can see, in this code commit() is always executed, no matter what happens. Of course, if there is an error,
we do not want a commit() but a rollback(). To achieve this, we mark the
transaction as rollback-only in the exception handling code. This way, even though commit() is executed, the transaction rolls back when it is called.
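To see why the balanced begin()/commit() pattern works when calls are nested, consider the toy model below. It is a sketch under an assumption, not the library's code: CountingTx is a hypothetical class that counts nested begin() calls and decides the outcome only when the outermost commit() is reached. How BeanKeeper tracks nesting internally is not stated in this manual.

```java
// Hypothetical model of a transaction that counts nested begin() calls.
// It only illustrates the pattern; it is NOT the BeanKeeper Transaction class.
final class CountingTx {
    private int depth = 0;
    private boolean rollbackOnly = false;
    private String outcome = "open";

    void begin() { depth++; }

    void markRollbackOnly() { rollbackOnly = true; }

    void commit() {
        if (depth == 0) throw new IllegalStateException("commit() without begin()");
        depth--;
        // Only the outermost commit() decides the fate of the transaction.
        if (depth == 0) outcome = rollbackOnly ? "rolled back" : "committed";
    }

    String outcome() { return outcome; }
}

final class DemarcationDemo {
    // Runs the work inside a begin()/commit() pair, following the suggested pattern.
    static void inTransaction(CountingTx tx, Runnable work) {
        tx.begin();
        try {
            work.run();
        } catch (RuntimeException e) {
            tx.markRollbackOnly();   // commit() below will then roll back
        } finally {
            tx.commit();             // always reached, so begin() stays balanced
        }
    }

    public static void main(String[] args) {
        CountingTx ok = new CountingTx();
        inTransaction(ok, () -> inTransaction(ok, () -> { }));
        System.out.println(ok.outcome());      // committed

        CountingTx failed = new CountingTx();
        inTransaction(failed, () -> inTransaction(failed, () -> {
            throw new IllegalStateException("boom");
        }));
        System.out.println(failed.outcome());  // rolled back
    }
}
```

In the second run the inner block fails, marks the transaction rollback-only, and still calls commit(), so the outer commit() ends the whole transaction with a rollback.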
Transaction objects implement the Map interface. This is useful if you want to attach some information to each
transaction, for example the user who executes the transaction, or to pass information about the transaction to the
transaction event handler.
It is also useful for populating the transaction with important information so that lower
layers of code can extract it from the transaction object. For example, if the transaction is opened in a
Servlet that has just received a POST with some data, the servlet might put the User object, available from the Session,
into the transaction and then call the business logic to execute the required operation. The business logic normally could not determine
the User object by itself, because it is (rightfully so) separated from the presentation layer, but it can easily extract it
from the Transaction object it receives from the TransactionTracker. Note, however, that the Transaction
object is not meant for parameter passing between methods. You should add information to a Transaction object only if
that information is really about the transaction, not as a convenient way of passing parameters.
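The servlet scenario can be sketched as follows. Because the manual does not show the generic type parameters or recommended key names of the Transaction map, the sketch uses a plain java.util.Map as a stand-in; the key "user" and the methods attachUser() and auditLine() are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: a plain Map plays the role of the Transaction object here.
final class TxAttributesDemo {
    static final String USER_KEY = "user"; // hypothetical key name

    // Presentation layer: records who is running the transaction.
    static void attachUser(Map<String, Object> tx, String userName) {
        tx.put(USER_KEY, userName);
    }

    // Business layer: reads the user without knowing about sessions or servlets.
    static String auditLine(Map<String, Object> tx, String action) {
        Object user = tx.getOrDefault(USER_KEY, "unknown");
        return user + " performed " + action;
    }

    public static void main(String[] args) {
        Map<String, Object> tx = new HashMap<>();
        attachUser(tx, "alice");
        System.out.println(auditLine(tx, "save")); // alice performed save
    }
}
```

The user name describes the transaction itself (who ran it), which is the kind of information the paragraph above considers appropriate to store.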
In a multithreaded application, such as a web application, it is important to consider concurrency and parallel access to the database/persistence layer.
The first, trivial problem is making the persistence layer capable of handling calls in a thread-safe manner. This is ensured by the library itself and is not a problem for the caller. A more important problem is the transaction isolation level, which is offered as a configuration option in most application servers and databases. These levels are used to counteract some problems that arise when a database is used from multiple connections simultaneously:
A dirty read happens when a transaction reads data that another concurrent transaction has modified but not yet committed. The read is said to be dirty because the other transaction may still change the data again or roll back.
A non-repeatable read is a query that returns data, but when executed again in the same transaction may return a result in which some records have been modified. This may be caused by another transaction concurrently modifying the data selected by the query.
A phantom read happens when a query, executed again in the same transaction, returns a different set of results than before. This may be caused by a concurrent transaction that inserts or removes rows satisfying the query's where clause.
To counteract these problems, which usually make a system unstable or can cause the database or the application to become inconsistent, there are four transaction isolation levels:
TRANSACTION_READ_UNCOMMITTED: the lowest level, counteracts none of the above problems.
TRANSACTION_READ_COMMITTED: counteracts dirty reads.
TRANSACTION_REPEATABLE_READ: counteracts dirty reads and non-repeatable reads.
TRANSACTION_SERIALIZABLE: counteracts all problems.
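In JDBC these four levels are exposed as int constants on java.sql.Connection. The short sketch below maps each constant to the read problems it still allows; the class and method names are made up for this example, only the constants come from JDBC.

```java
import java.sql.Connection;

final class IsolationLevels {
    // Returns the read problems that the given JDBC isolation level still allows.
    static String allowedAnomalies(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED:
                return "dirty, non-repeatable, phantom";
            case Connection.TRANSACTION_READ_COMMITTED:
                return "non-repeatable, phantom";
            case Connection.TRANSACTION_REPEATABLE_READ:
                return "phantom";
            case Connection.TRANSACTION_SERIALIZABLE:
                return "none";
            default:
                throw new IllegalArgumentException("unknown isolation level: " + level);
        }
    }

    public static void main(String[] args) {
        System.out.println(allowedAnomalies(Connection.TRANSACTION_READ_COMMITTED)); // non-repeatable, phantom
        System.out.println(allowedAnomalies(Connection.TRANSACTION_SERIALIZABLE));   // none
    }
}
```

With a real connection, a level is selected through Connection.setTransactionIsolation(int); whether a given database honours every level depends on the driver and the database.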
There are, however, two serious problems with how databases implement these countermeasures. First, they are mostly implemented with locks. This means that a table or row may not be accessible during an update. If you want to counteract all problems, which is likely what everyone wants, it is not uncommon that whole tables must be locked during a transaction to ensure that all reads and writes are consistent. This can lead to serious performance degradation. Another problem with locks is that they can lead to dead-locks, in which case the whole application might hang. Some databases prevent this, but it is not guaranteed by default. To overcome these problems, one usually does not have the luxury of setting the isolation level to TRANSACTION_SERIALIZABLE, and must instead choose which statements, tables or rows are to be locked, so that only the really necessary ones are locked.
The second problem with these isolation levels is that they apply only within the scope of a single transaction. There are plans in the JDBC standard to overcome this limitation, but it's not there yet, and even that does not solve all issues. So imagine a simple web page with a customer listing, and say you have over 100,000 customers (hm.. if that's the case, consider donating to this project :). To list these customers on the page, we really don't want to hold all 100,000 customers in memory; we only list 30 on one page anyway. So we get a list from the database, but read only the first 30 and display them on the page. When the user presses the 'next' button, we want to read the next 30 records. We can do that safely, because we employ the TRANSACTION_SERIALIZABLE isolation level, right? Wrong! Most likely this second read will run in a different transaction than the first one, so our isolation level means nothing (well, next to nothing) in this case. We read the next 30 records, but we cannot guarantee that these 30 records are the same as they would have been if read in the first transaction. In other words, we might end up with some of the customers from screen one repeating, or we might skip some customers because somebody deleted some of the customers from the first screen while we were browsing through screen one.
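The skipped-customer case is easy to reproduce with an in-memory list standing in for the customer table. This is only an illustrative sketch: PagingDrift and page() are hypothetical names, and each call to page() plays the role of a separate transaction that reads by offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PagingDrift {
    // Reads one page by offset, the way a separate transaction per page would.
    static List<String> page(List<String> table, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, table.size());
        int to = Math.min(from + pageSize, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>(Arrays.asList("c1", "c2", "c3", "c4", "c5", "c6"));

        System.out.println(page(table, 0, 3)); // [c1, c2, c3]

        // Somebody deletes a customer from screen one while the user is looking at it.
        table.remove("c2");

        // The second page now starts one row later than intended: c4 is never shown.
        System.out.println(page(table, 1, 3)); // [c5, c6]
    }
}
```

An insert before the current page would have the opposite effect and repeat a customer from the previous screen.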
The good news is that you don't have to deal with any of these issues when using BeanKeeper, because it counteracts all of these
problems at the same time. When you get a List from one of the Store's find() methods,
you get a lazy list of all records. Lazy means that not all records are kept in memory, only a few dozen of them, but
despite this, the list is guaranteed never to change. This list can leave the transaction it was created in, because
it does not depend on traditional transaction isolation levels. The library uses versioning and keeps all versions of an object in the
database. When issuing a query, the library marks the resulting lazy list with the current timestamp, so all subsequent paging through that
result list selects only the versions of objects that were active when the query was created. This way, no locking is
performed, and lists are always insensitive to changes in the database. The same holds for transactions: once a transaction has started,
changes made by other transactions during its lifetime are not visible to it.
All this comes at the price of increased storage space (which is very cheap nowadays) and slightly more complex queries for the database to handle.
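The idea of timestamp-marked result lists can be illustrated with a small in-memory model. This is a sketch of the general versioning technique only, with hypothetical names (VersionedStore, snapshot(), page()) and a logical counter instead of a real timestamp; it does not reflect BeanKeeper's actual schema or queries.

```java
import java.util.ArrayList;
import java.util.List;

// Toy versioned store: rows are never overwritten, they only get an end timestamp.
final class VersionedStore {
    private static final long OPEN = Long.MAX_VALUE;

    private static final class Version {
        final String name;
        final long from;   // timestamp at which this version became active
        long to = OPEN;    // timestamp at which it was superseded or removed
        Version(String name, long from) { this.name = name; this.from = from; }
    }

    private final List<Version> versions = new ArrayList<>();
    private long clock = 0; // logical timestamp, advanced by every change

    void save(String name) { versions.add(new Version(name, ++clock)); }

    void remove(String name) {
        long now = ++clock;
        for (Version v : versions) {
            if (v.name.equals(name) && v.to == OPEN) v.to = now; // keep the old version
        }
    }

    // A result list only needs to remember the timestamp of its query.
    long snapshot() { return clock; }

    // Pages are resolved against the snapshot timestamp, not the current state.
    // (A real store would filter inside the query instead of scanning in memory.)
    List<String> page(long snapshot, int pageIndex, int pageSize) {
        List<String> visible = new ArrayList<>();
        for (Version v : versions) {
            if (v.from <= snapshot && snapshot < v.to) visible.add(v.name);
        }
        int from = Math.min(pageIndex * pageSize, visible.size());
        int to = Math.min(from + pageSize, visible.size());
        return new ArrayList<>(visible.subList(from, to));
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        for (int i = 1; i <= 6; i++) store.save("c" + i);

        long listing = store.snapshot();
        System.out.println(store.page(listing, 0, 3)); // [c1, c2, c3]

        store.remove("c2"); // concurrent changes after the query was created
        store.save("c7");

        System.out.println(store.page(listing, 1, 3));          // [c4, c5, c6]
        System.out.println(store.page(store.snapshot(), 0, 3)); // [c1, c3, c4]
    }
}
```

The second page of the old listing is unaffected by the removal and the insert, while a new query sees the current state, and no lock is taken at any point.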
Transactions of the library always behave as if the TRANSACTION_SERIALIZABLE isolation level were used on a traditional database. That is, to produce repeatable reads inside a transaction, the library consistently returns the same result list when the same query is executed again inside the same transaction. No external transaction will change the result of a query. You can think of it as if no other transactions were running during a transaction.
The following table sums up the differences and similarities between transactions allocated with each of the transaction types
available from the TransactionTracker:
| Type | Same visibility as parent | Shares parameters with parent | Commits separately | Can be null |
|---|---|---|---|---|
| TX_REQUIRED | yes | yes | no | no |
| TX_OPTIONAL | yes | yes | no | yes |
| TX_NEW | no | no | yes | no |
This document was generated by Robert Brautigam on November, 21 2009 using texi2html 1.78.