Wednesday, October 29, 2025

#1099 - Detailed Error Dashboards in OCI Log Analytics

 

Introduction

I can easily create an OIC Errors dashboard with OCI Log Analytics.

The hardest part is deciding which widgets to include. Please treat my choices as a simple demo and extrapolate from them.

OIC Errors Breakdown  

I decided on the following widgets -
  • Aborted Instances by Integration
  • Error Breakdown by Action Type
  • Integrations with Faults
  • Connectivity Agent Errors by Action
  • Integrations with Assign Errors
  • Integrations with Mapping Errors
  • Invoke Errors by Action
  • Errored Sync Flows by Integration
Let's go through each of these...

Integrations with Mapping Errors

I'm starting with this widget, even though it's out of sequence, because it shows how I crafted the queries.

Here is the underlying activity stream log message -

{
  "datetime": 1761724999408,
  "logContent": {
    "data": {
      "actionType": "Mapper",
      "errorId": "u1I_CLSdEfCkHu9h7DM6qA",
      "eventId": "u1I_B7SdEfCkHu9h7DM6qA",
      "executedTime": "2025-10-29T08:03:19.408Z",
      "instanceId": "tGUtBrSdEfCH0M9O6IAqnQ",
      "integrationFlowIdentifier": "READWRITELARGEFI_1!01.00.0000",
      "message": "Error processing message in Mapper  ,error details : XPath expression failed to execute, error summary : Error during evaluation of XPath \"ora:doXSLTransformForDoc('resources/processor_342/resourcegroup_345/req_34ea3884757a4ce1b4a650b2f7e42c8a.xsl', $messagecontext_18, 'getFileRefFromFS', $messagecontext_312)\" Error at line 28, column 206, <fault><requestId>...</requestId><errorType>InternalServiceLimitError</errorType><origin>stagefile-service-564cd4c67f-wlzbs</origin><errorCode>SF1001</errorCode><faultName>{http://schemas.oracle.com/bpel/extension}runtimeFault</faultName><retriable>false</retriable><reason>Error occurred while processing activity readContent. Error : File size 150.20 MB exceeds maximum threshold size of 10.00 MB.Please make sure that the file size does not exceed threshold</reason><details>File size 150.20 MB exceeds maximum threshold size of 10.00 MB.Please make sure that the file size does not exceed threshold</details></fault>",
      "opcRequestId":...",
      "parentEventId": "uzic8bSdEfCE4DFbv7Kdwg",
      "projectCode": "...",
      "userId": "niall.commiskey@oracle.com"
    },
    "id": "bcdefa02-b49d-11f0-b9ec-798ce9fc19fc",
    "oracle": {
      "compartmentid": "...",
      "ingestedtime": "2025-10-29T08:03:22.042Z",
      "loggroupid": "...",
      "logid": "...",
      "tenantid": "..."
    },
    "source": "...",
    "specversion": "1.0",
    "time": "2025-10-29T08:03:19.408Z",
    "type": "com.oraclecloud.integration.integrationinstance.activitystream"
  },
  "regionId": "us-phoenix-1"
}
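To make the link between the raw log record and the query fields concrete, here is a minimal Python sketch that pulls out the values the widgets rely on. The mapping from raw keys to Log Explorer field names is my assumption, inferred from the queries in this post; only actionName surfacing as Action is stated explicitly. The helper name extract_fields is hypothetical, and the sample is a trimmed copy of the record above.

```python
import json

# Assumed mapping from raw activity stream keys to the field names used in
# the Log Explorer queries. Only actionName -> Action is stated in the post;
# the rest is inferred from the queries and may differ in your tenancy.
FIELD_MAP = {
    "integrationFlowIdentifier": "Integration",
    "instanceId": "Instance",
    "actionType": "Action Type",
    "actionName": "Action",
    "message": "Message",
}


def extract_fields(raw: str) -> dict:
    """Return the query-relevant fields of one activity stream record."""
    data = json.loads(raw)["logContent"]["data"]
    # Keys that are absent from the record come back as None.
    return {field: data.get(key) for key, field in FIELD_MAP.items()}


# Trimmed copy of the Mapper record shown above (message shortened).
sample = json.dumps({
    "datetime": 1761724999408,
    "logContent": {"data": {
        "actionType": "Mapper",
        "instanceId": "tGUtBrSdEfCH0M9O6IAqnQ",
        "integrationFlowIdentifier": "READWRITELARGEFI_1!01.00.0000",
        "message": "Error processing message in Mapper  ,error details : "
                   "XPath expression failed to execute",
    }},
})

fields = extract_fields(sample)
print(fields)
```

A record that lacks one of these keys simply yields None for that field, which is worth checking before you pick a group-by field for a widget.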

Here is the Log Explorer query - 

'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error processing message in Mapper%' | timestats distinctcount(Instance) as 'Flow Instances with Mapping Errors', distinctcount(Instance) by Integration
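If the distinctcount(Instance) by Integration part is unclear, the following Python sketch emulates the idea locally: each flow instance is counted once per integration, however many matching log records it produced. It is only an illustration of the semantics, not how Log Analytics evaluates the query, and it ignores the time bucketing that timestats adds. The records and the function name distinct_count_by are made up for the example.

```python
from collections import defaultdict


def distinct_count_by(records, count_field, group_field):
    """Count distinct values of count_field within each group_field value."""
    groups = defaultdict(set)
    for rec in records:
        groups[rec[group_field]].add(rec[count_field])
    return {group: len(values) for group, values in groups.items()}


# Hypothetical records: instance i1 logged two mapping errors.
records = [
    {"Integration": "INT_A", "Instance": "i1"},
    {"Integration": "INT_A", "Instance": "i1"},
    {"Integration": "INT_A", "Instance": "i2"},
    {"Integration": "INT_B", "Instance": "i3"},
]

counts = distinct_count_by(records, "Instance", "Integration")
print(counts)
```

A plain count would report three errors for INT_A here; the distinct count reports two affected flow instances, which is what the widget title promises.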

The Log Explorer view is as follows - 

Note the Group By Integration - 

I added this field from the Other field list -

Invoke Errors by Action


The query here is -

'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error processing message in Invoke%' | timestats distinctcount(Instance) as 'Flow Instances with Invoke Errors', distinctcount(Instance) by Action

Note the group by Action.

So why am I not grouping by Integration? Because the log message for this error does not include the integration identifier -

{
  "datetime": 1761722207212,
  "logContent": {
    "data": {
      "actionName": "write2File",
      "actionType": "Invoke",
      "endPointConnectionId": "AA_LOCALFILE",
      "endPointName": "write2File",
      "endPointType": "file",
      "errorId": "OwtBybSXEfCwkOV7JiyKKQ",
      "eventId": "OwtByLSXEfCwkOV7JiyKKQ",
      "executedTime": "2025-10-29T07:16:47.212Z",
      "instanceId": "jPvHSrSWEfCH0M9O6IAqnQ",
      "message": "Error processing message in Invoke write2File ,error details : oracle.cloud.cpi.common.core.CpiException: No response received within response time out window. Connectivity Agent may not be running, or temporarily facing connectivity issues to Oracle Integration Cloud Service. Please check the health of the Connectivity Agent in Agent Monitoring page. For additional insight into request eed027e0-c1f2-49c8-bc16-59417b0e9593 processing refer to the Connectivity Agent logs., error summary : oracle.cloud.cpi.common.core.CpiException: No response received within response time out window. Connectivity Agent may not be running, or temporarily facing connectivity issues to Oracle Integration Cloud Service. Please check the health of the Connectivity Agent in Agent Monitoring page. For additional insight into request eed027e0-c1f2-49c8-bc16-59417b0e9593 processing refer to the Connectivity Agent logs.",
      "opcRequestId":..",
      "parentEventId": "jhou_7SWEfCkHu9h7DM6qA",
      "userId": "niall.commiskey@oracle.com"
    },
    "id": "3c54b135-b497-11f0-b9ec-798ce9fc19fc",
    "oracle": {... 

The log field actionName surfaces in Log Explorer as Action, hence my use of it.
I could also group by the connection ID (endPointConnectionId, e.g. AA_LOCALFILE) or by endPointType (e.g. file or ftp).
 

Aborted Instances by Integration

You may want to read my post on monitoring aborted instances before continuing.

Underlying query is - 

'Log Source' = 'OCI Integration Activity Stream Logs' and 'Action Type' in (Abort) | timestats distinctcount(Instance) as 'Aborted Flow Instances', distinctcount(Instance) by Integration

Error Breakdown by Action Type

Underlying query is -
'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error%' | timestats distinctcount(Instance) as 'Flow Instances with Errors', distinctcount(Instance) by 'Action Type'

A note on 'receive': the errors here relate to sync integrations, where the error is logged against the reply to the receive action.

Integrations with Faults   

Here I am referring to execution of the Throw New Fault action in integrations.

Underlying query is - 
'Log Source' = 'OCI Integration Activity Stream Logs' and 'Action Type' in (Raise_new_error) | timestats distinctcount(Instance) as 'Flow Instances with Faults', distinctcount(Instance) by Integration

Connectivity Agent Errors by Action

Underlying query is - 
'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error%Connectivity Agent may not be running%' | timestats distinctcount(Instance) as 'Flow Instances with Connectivity Agent Errors', distinctcount(Instance) by Action
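The like patterns in these queries use % as a wildcard for any run of characters. The Python sketch below emulates that matching so you can try a pattern against a message before building a widget. It is an approximation under my own assumptions: it is case-sensitive and treats % as the only wildcard, which may not match Log Analytics exactly. The function name like is hypothetical; the message is a shortened copy of the Invoke record shown earlier.

```python
import re


def like(message: str, pattern: str) -> bool:
    """Emulate a like match where % stands for any run of characters."""
    parts = [re.escape(part) for part in pattern.split("%")]
    regex = re.compile(".*".join(parts), re.DOTALL)
    return regex.fullmatch(message) is not None


# Shortened copy of the message from the Invoke record earlier in the post.
message = (
    "Error processing message in Invoke write2File ,error details : "
    "oracle.cloud.cpi.common.core.CpiException: No response received within "
    "response time out window. Connectivity Agent may not be running, or "
    "temporarily facing connectivity issues to Oracle Integration Cloud Service."
)

agent_match = like(message, "%Error%Connectivity Agent may not be running%")
invoke_match = like(message, "%Error processing message in Invoke%")
mapper_match = like(message, "%Error processing message in Mapper%")
print(agent_match, invoke_match, mapper_match)
```

Note that this one message satisfies both the Connectivity Agent pattern and the Invoke pattern, so the same flow instance is counted in both widgets.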

Integrations with Assign Errors

Underlying query is -
'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error processing message in Assignment%' | timestats distinctcount(Instance) as 'Flow Instances with Assignment Errors', distinctcount(Instance) by Integration

Errored Sync Flows by Integration

Underlying query is - 
'Log Source' = 'OCI Integration Activity Stream Logs' and Message like '%Error in reply to%' and 'Action Type' = receive | timestats distinctcount(Instance) as 'Sync Flows with Errors', distinctcount(Instance) by Integration
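All of the queries above share one shape: a log source filter, a condition, a labelled distinct count and a group-by field. If you plan to add more widgets, a small helper like the following Python sketch can generate the query text for you to paste into Log Explorer. The function build_query is my own illustration, not part of any OCI SDK, and it does no escaping of quotes in its arguments.

```python
LOG_SOURCE = "OCI Integration Activity Stream Logs"


def build_query(condition: str, label: str, group_by: str) -> str:
    """Assemble a widget query in the shape used throughout this post."""
    return (
        f"'Log Source' = '{LOG_SOURCE}' and {condition} "
        f"| timestats distinctcount(Instance) as '{label}', "
        f"distinctcount(Instance) by {group_by}"
    )


# Reproduces the Integrations with Mapping Errors query.
mapping_query = build_query(
    "Message like '%Error processing message in Mapper%'",
    "Flow Instances with Mapping Errors",
    "Integration",
)

# Reproduces the Aborted Instances by Integration query.
aborted_query = build_query(
    "'Action Type' in (Abort)",
    "Aborted Flow Instances",
    "Integration",
)

print(mapping_query)
print(aborted_query)
```

Keeping the condition as free text means both the Message like filters and the 'Action Type' filters fit the same helper.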

Summa Summarum

The above are just some examples of what you can create in OCI Log Analytics. I hope they are of some use to you and give you a basis for creating your own.

Happy Monitoring!








