Managing Oracle SOA Environment [10g/11g]: Error handling for JCA adapters

Oracle JCA adapters support error handling capabilities. In this post we will see what message rejection is and how rejected messages can be handled.

Message rejection: Any message that errors out before being posted to the SCA infrastructure is called a rejected message. A good example is a file adapter that translates text format to XML. If there are errors in translation, the message is rejected by the JCA framework.

We can handle such messages by using a mechanism called rejection handlers. This mechanism is supported via fault policies. Rejection handlers work only with synchronous processes.

The JCA framework categorizes errors into two types:

Retryable (which can be retried safely)
Non-Retryable (cannot be retried)

Messages are rejected if the error is non-retryable, for example a translation error.
They are retried if the error is retryable, for example a connection error.
Retryable errors are retried either indefinitely (the default behaviour, configured globally) or the number of times given by jca.retry.count in composite.xml; the message is rejected once that count is exhausted.

All rejected messages are stored in the rejected_message table of the SOA dehydration store.

Configuring message rejection handlers
Rejection handlers are defined in fault policies. Message rejection handlers come into the picture only when messages are rejected by the JCA framework.

If we do not configure rejection handlers, the default file-based rejection handler kicks in and all rejected messages are written to <domain_name>/rejmsgs/<managed_server>

Available rejection handlers
The following handlers can be defined in fault policies:

JMS queue (Rejected messages are written to configured JMS queue)
Web service (Configured WS will be called with rejected message)
Custom java (Java class will be executed)
File (Rejected messages are written to specified folder path)
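
As a hedged illustration of the File handler from the list above, a rejection policy in fault-policies.xml might look like the sketch below. The service name ReadFile, the policy id RejectedMessages, the target folder and the file name pattern are all placeholders chosen for this example, not values from a real composite.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <!-- "RejectedMessages" is a hypothetical policy id -->
  <faultPolicy version="2.0.1" id="RejectedMessages">
    <Conditions>
      <!-- The local name after rjm: is the inbound service name
           from composite.xml (ReadFile is assumed here) -->
      <faultName xmlns:rjm="http://schemas.oracle.com/sca/rejectedmessages"
                 name="rjm:ReadFile">
        <condition>
          <action ref="writeToFile"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <!-- File handler: writes each rejected message to the folder below -->
      <Action id="writeToFile">
        <fileAction>
          <location>/tmp/rejectedmsgs</location>
          <fileName>rejected_%ID%_%TIMESTAMP%.xml</fileName>
        </fileAction>
      </Action>
    </Actions>
  </faultPolicy>
</faultPolicies>
```

The policy would then be attached to the same (assumed) service in fault-bindings.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicyBindings version="2.0.1"
                     xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <!-- Binds the hypothetical RejectedMessages policy to the ReadFile service -->
  <service faultPolicy="RejectedMessages">
    <name>ReadFile</name>
  </service>
</faultPolicyBindings>
```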

JCA Retries

When an error is retryable, retries are driven either by the global JCA retry setting, which is indefinite by default, or by JCA properties in composite.xml.
The following JCA properties are used for this purpose in composite.xml:
jca.retry.count
jca.retry.interval
jca.retry.backoff
jca.retry.maxInterval

If we do not specify any JCA properties in composite.xml, the global JCA retry setting is used. If we specify them at the composite level, the composite-level JCA properties take precedence.
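
A minimal sketch of how these properties could be set on an inbound adapter binding in composite.xml. The service name ReadFile, the .jca file name and every value shown are assumptions for illustration; the interface element of the service is left out for brevity.

```xml
<service name="ReadFile">
  <binding.jca config="ReadFile_file.jca">
    <!-- number of retries before the message is rejected -->
    <property name="jca.retry.count">5</property>
    <!-- initial wait between retries, in seconds -->
    <property name="jca.retry.interval">1</property>
    <!-- growth factor applied to the interval after each retry -->
    <property name="jca.retry.backoff">2</property>
    <!-- upper bound for the growing interval, in seconds -->
    <property name="jca.retry.maxInterval">6</property>
  </binding.jca>
</service>
```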

We also have the option of changing the global JCA retry count from indefinite to a finite number using the System MBean Browser in the EM console. Follow the steps below to do this:

Right click on soa-infra, then select Administration, then System MBean Browser
Select oracle.as.soainfra.config, then AdapterConfig, then adapter
Change GlobalInboundJcaRetryCount to the required number

Please remember that this affects all JCA retries in the domain. Keeping it indefinite is also dangerous, because JCA will then retry indefinitely, resulting in one instance for each retry.

In this post I talk about some of the best practices we can think of while designing fault-policies.

JCA binding level retry execution within fault policy retries

If you are using retry actions on adapters with both JCA-level retries for the outbound direction and a retry action in the fault policy file for outbound failures, the JCA-level (or binding level) retries are executed within the fault policy retries.
Let's say you have configured the following JCA binding properties in composite.xml:

        <property name="jca.retry.count">2</property>
        <property name="jca.retry.interval">5</property>
        <property name="jca.retry.backoff">2</property>

and assume the fault policy file defines a retry action with three retries for remoteFault on the reference binding component.
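
For illustration, a retry action of this kind could look like the sketch below. The retryCount of 3 is chosen to match the three fault policy retries in the sequence this section describes; the retryInterval value is an assumption.

```xml
<Action id="ora-retry">
  <retry>
    <!-- three fault policy retries, matching the sequence in this section -->
    <retryCount>3</retryCount>
    <!-- seconds between fault policy retries (assumed value) -->
    <retryInterval>2</retryInterval>
  </retry>
</Action>
```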

Now, when a remote fault is returned for the reference binding component, the following retry sequence happens:

    * Fault policy retry 1:
          o JCA retry 1 (with 5 seconds interval)
          o JCA retry 2 (with 10 seconds interval)
    * Fault policy retry 2:
          o JCA retry 1 (with 5 seconds interval)
          o JCA retry 2 (with 10 seconds interval)
    * Fault policy retry 3:
          o JCA retry 1 (with 5 seconds interval)
          o JCA retry 2 (with 10 seconds interval)

As a reminder, if your intention is just retrying through fault-policies then do not use JCA level retries in composite.xml

Best practices in Configuring ora-retry fault policy

When you configure a fault policy to recover instances with the ora-retry action and the number of specified instance retries is exceeded, the instance is marked as open.faulted (in-flight state). The instance remains active and keeps running.
Marking instances as open.faulted ensures that no instances are lost, but it also means the retry is attempted again even after the retry count is exhausted, which is usually not what you want at that point.
The best practice here is to configure another fault handling action to follow the ora-retry action in the fault policy file, such as one of the following:
•    Configure an ora-human-intervention action to manually perform instance recovery from Oracle Enterprise Manager Fusion Middleware Control Console.
•    Configure an ora-terminate action to close the instance (mark it as closed.faulted) and never retry again.
For example
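
A minimal sketch of the ora-terminate option: the retryFailureAction element names the action to run once the retries are exhausted. The retry count and interval are illustrative values; to hand the instance to an administrator instead, the ref could point to ora-human-intervention.

```xml
<Actions>
  <Action id="ora-retry">
    <retry>
      <retryCount>2</retryCount>
      <retryInterval>2</retryInterval>
      <exponentialBackoff/>
      <!-- action chained after the last failed retry -->
      <retryFailureAction ref="ora-terminate"/>
    </retry>
  </Action>
  <!-- closes the instance (closed.faulted); no further retries -->
  <Action id="ora-terminate">
    <abort/>
  </Action>
  <!-- alternative target: leaves the instance for manual recovery -->
  <Action id="ora-human-intervention">
    <humanIntervention/>
  </Action>
</Actions>
```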

However, if you do not set an action to be performed after an ora-retry action in the fault policy file and the number of instance retries is exceeded, the instance remains marked as open.faulted, and recovery again attempts to handle the instance.
For example, if no action is defined in the following fault policy file after ora-retry:

<Action id="ora-retry">
       <retry>
          <retryCount>2</retryCount>
          <retryInterval>2</retryInterval>
          <exponentialBackoff/>
       </retry>
</Action>
The following actions are performed:
•    The invoke activity is attempted (using the above-mentioned fault policy code to handle the fault).
•    Two retries are attempted at increasing intervals (after two seconds, then after four seconds).
•    If all retry attempts fail, the following actions are performed:
o    A detailed fault error message is logged in the audit trail
o    The instance is marked as open.faulted (in-flight state)
o    The instance is picked up and the invoke activity is re-attempted
o    The fault is rethrown to the system fault handler
•    Recovery may also fail. In that case, the invoke activity is re-executed. Additional audit messages are logged.

Message rejection handlers

The messages that error out before being posted to the service infrastructure are referred to as rejected messages. For example, the Oracle File Adapter selects a file having data in text format and tries to translate it to XML format (using NXSD). If there is any error in the translation, this message is rejected and will not be posted to the target composite.

Rejected messages are stored in the database (in the rejected_message table) by default. If you do not configure a message rejection handler, the default rejection message handler takes over and stores them on the file system. This handler stores the payload and properties of the message on the file system at a predefined location in WLS_HOME. Currently, Oracle SOA Suite does not provide the capability to resubmit rejected messages; consequently, it is your responsibility to take care of the resubmission. Without a handler, rejected messages may go unnoticed, which is not good practice. I recommend configuring one to handle and resubmit them.

JCA adapters and fault-policies

In a nutshell, the standard practices for JCA error handling are:

Error type	Best practice
Inbound Retryable	Use JCA level retries in composite.xml
Inbound Non-retryable	Use fault-policies message rejection handlers
Outbound Retryable	Use JCA level retries in composite.xml
Outbound Non-retryable	Use fault-policies

Monday, November 11, 2013

JCA adapters and message rejection handling

Saturday, January 14, 2012

Best practices in configuring fault-policies
