Distributed Polling Best Practice

MarkReservedValue

The first best practice is to use LogicalDeletePollingStrategy or DeletePollingStrategy with a unique MarkReservedValue on each polling node, and to set MaxTransactionSize.

NOTE: MarkReservedValue (and MarkUnReadValue, MarkReadFieldName) can be set for DeletePollingStrategy, just not in the wizard.
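The reserve-then-process idea behind MarkReservedValue can be illustrated with a small simulation. This is only a sketch, not the SQL that DBAdapter actually generates: SQLite stands in for the real database, and the table name `orders`, the status values `UNREAD`/`READ`, and the node values `RESERVED_NODE_A`/`RESERVED_NODE_B` are all hypothetical. Each node first stamps at most MaxTransactionSize unread rows with its own unique value in a separate transaction, then reads only the rows carrying that value.

```python
import sqlite3


def reserve_and_read(conn, reserved_value, max_transaction_size):
    """Reserve up to max_transaction_size unread rows, then return their ids."""
    # Transaction 1: stamp a limited batch with this node's unique value.
    with conn:
        conn.execute(
            "UPDATE orders SET status = ? WHERE id IN ("
            "  SELECT id FROM orders WHERE status = 'UNREAD'"
            "  ORDER BY id LIMIT ?)",
            (reserved_value, max_transaction_size),
        )
    # Later read: only rows reserved by this node are visible to it.
    rows = conn.execute(
        "SELECT id FROM orders WHERE status = ? ORDER BY id",
        (reserved_value,),
    ).fetchall()
    return [row[0] for row in rows]


def mark_processed(conn, reserved_value):
    """Logical delete: flip this node's reserved rows to the 'read' value."""
    with conn:
        conn.execute(
            "UPDATE orders SET status = 'READ' WHERE status = ?",
            (reserved_value,),
        )


# Hypothetical table with five unread rows (ids 1..5).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("UNREAD",)] * 5)
conn.commit()

node_a = reserve_and_read(conn, "RESERVED_NODE_A", 2)  # [1, 2]
node_b = reserve_and_read(conn, "RESERVED_NODE_B", 2)  # [3, 4]
```

Because each node filters on its own reserved value, two nodes never pick up the same row, which is why the value must be unique per polling node.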

SELECT FOR UPDATE

The second best practice is not to set MarkReservedValue, but to still set MaxTransactionSize.

Some benefits:

-a status column is not needed (when using DeletePollingStrategy)

-on a rollback no writes occur, so it is completely safe (with MarkReservedValue, rows are 'reserved' first in a separate transaction).

-the same process, without any modification, can be deployed to multiple nodes. Setting a unique MarkReservedValue can be an issue.

-select for update is ANSI-SQL so works well on any database.

Some downsides:

-concurrency/scalability is lower because the lock is held for a long time, which tends to serialize access to the table. If you did a synchronous post to BPEL, that could mean only one DBAdapter instance processing rows at a time. Either way, a single transaction is not realistic for either approach. If the time it takes for DBAdapter to poll rows and asynchronously post them to BPEL (the default) is much less than the time spent in the BPEL process, then this approach can still provide scalability.

-Load balancing and throughput are less predictable. DBAdapter uses select for update no wait by default, which causes some instances to get rows while others abort early and come away with nothing. As a result, the maximum throughput will be much less than MaxTransactionSize * #instances / pollingInterval. For crude load balancing, a plain select for update (which waits for the lock) would probably be better. To make this change, edit your toplink_mappings.xml and change the lock-mode element to:

<toplink:lock-mode>lock</toplink:lock-mode>

-There is a bug in 10.1.3.3: if you set MaxTransactionSize to anything other than "unlimited" and you are using LogicalDeletePollingStrategy, the for update clause is removed, leading to multiple nodes processing the same rows. The workaround is to edit oc4j-ra.xml and set platformClassName="DB2Platform". The only problem with this workaround is that on each polling interval, even though only MaxTransactionSize rows will be processed, all unprocessed rows will be selected and locked. So if #unprocessed rows >> MaxTransactionSize, you will see a big lock overhead. The rownum <= MaxTransactionSize clause was basically there just to reduce the locking overhead.

-Though select for update should work on any database, the one database it makes the least sense for is Oracle. On Oracle, pessimistic select for update locking is frowned on: one of Oracle's key differentiators is that writers do not block readers. Not only will this lower concurrency and introduce lock overhead, it will also interfere with other programs that are updating the table the DBAdapter is polling from. From the perspective of non-DBAdapter processes, select for update is more intrusive.
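The throughput ceiling mentioned in the load-balancing item above, MaxTransactionSize * #instances / pollingInterval, can be made concrete with a little arithmetic. The numbers below (10 rows per transaction, 4 instances, a 5-second polling interval) are made up for illustration, and the "one winner" case is an assumed worst case in which only a single instance obtains the lock on each poll while the others come away with nothing.

```python
def throughput_ceiling(max_transaction_size, instances, polling_interval_s):
    """Rows per second if every instance got a full batch on every poll."""
    return max_transaction_size * instances / polling_interval_s


# Ideal case: all 4 instances get 10 rows every 5 seconds.
ceiling = throughput_ceiling(10, 4, 5.0)     # 8.0 rows/s

# Assumed worst case with no wait: only one instance gets rows per poll.
one_winner = throughput_ceiling(10, 1, 5.0)  # 2.0 rows/s
```

Real throughput with select for update no wait would be expected to fall somewhere between these two figures, depending on how often instances collide.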

Tuning on a Single Node First

For a DBAdapter-intensive process, such as a database-to-database integration, performance can be improved by a factor of 10 or 100 just by tuning on a single JVM and scaling NumberOfThreads.

You can always add more nodes later, with MarkReservedValue, but try to get the most out of one node first. If the DBAdapter polling is mainly the initiator of a far more complex BPEL process, then you may want a truly distributed approach so that you can scale BPEL.

Oracle Fusion Middleware

Monday, November 8, 2010