Sunday, September 7, 2025

#1084 OIC3: Monitoring performance of target endpoints

Introduction 

How can we monitor specific endpoints in OIC? Here I'm referring to outbound invokes to third-party services. 

For example, I have an integration that retrieves SR data from Fusion. This integration is being invoked by a call centre UI, so I need to monitor performance. Here's my simple SR retrieval integration - 

Here's the SR in Fusion - 

Monitoring using OCI Service Metrics 

The metric we need is - 

Check out the Dimensions - 
adapterIdentifier

As you've probably guessed, erp refers to the Fusion ERP adapter.


Here we see the GetServiceRequests endpoint.

You can choose from a variety of time intervals and statistics - 

Before looking at the graph, let's check the times in OIC Observability -

The first flow took ca. 17 secs, the second, just over 1 second.

Here is the Fusion ERP invoke stats for the first flow -

It took 16s 986ms for the getSR request to complete.



Now back to OCI Service Metrics; I've clicked Update Chart, so let's view it - 

I run the integration 10 times and now choose the Mean statistic. This will probably be the most popular; however, you may also be interested in Max etc. 

Here the mean is down to ca. 300 msecs.

Just to recap what I've covered - the metric Outbound Request Invocation Time can be used to give us an insight into overall erp adapter performance. This we get by adding the dimension adapterIdentifier and selecting erp.

We can also drill deeper, by adding the dimension outboundInvocationEndpointInformation and selecting our endpoint, in my case, the get service request.

We can select different stats - mean, max, sum etc.
Finally, we can create OCI Alarms, based on the query - 

This alarm can be set to fire on a condition of your choice. For example, in our call centre use case, the integration team have an internal SLA for retrieving SR data. This is set to 500 msecs. 
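
To make that condition concrete, here's a minimal Python sketch of how a threshold check over one-minute windows behaves. It's illustrative only: the sample durations are made up, breaching_minutes is a hypothetical helper, and a real OCI Alarm has further settings (trigger delay, for example) that this ignores.

```python
from collections import defaultdict

SLA_MS = 500  # internal SLA from the call centre example


def breaching_minutes(samples, threshold_ms=SLA_MS):
    """Return {minute: max_ms} for each one-minute bucket whose max exceeds the threshold.

    samples: iterable of (epoch_seconds, duration_ms) pairs.
    """
    max_per_minute = defaultdict(int)
    for epoch_seconds, duration_ms in samples:
        minute = epoch_seconds // 60  # index of the one-minute bucket
        max_per_minute[minute] = max(max_per_minute[minute], duration_ms)
    # Strictly greater than, so a value of exactly 500 does not breach.
    return {m: v for m, v in sorted(max_per_minute.items()) if v > threshold_ms}


# Hypothetical invoke durations, not measurements from this post.
samples = [(0, 320), (20, 280), (65, 950), (70, 290), (130, 450)]
print(breaching_minutes(samples))  # {1: 950}
```

Only the second minute breaches, even though most invokes in that minute were fast. That's the reason for choosing max rather than mean when the SLA applies to every single request.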

We have the choice of creating the alarm on the integration itself and/or the invoke of Fusion.

Let's carry on with our current query - 

Check out the query code editor - here you can amend the query -

OutboundRequestInvocationTime[1m]{resourceId = "yourOIC OCID", adapterIdentifier = "erp", outboundInvocationEndpointInformation = "Get ServiceRequests in ERP Cloud"}.grouping().max() > 500
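
If you generate such queries for several adapters or endpoints, a small helper keeps them consistent. The following is just a sketch that assembles the string above - build_mql is a hypothetical name, it makes no OCI calls, and "yourOIC OCID" remains a placeholder.

```python
def build_mql(metric, interval, dimensions, statistic, threshold=None):
    """Assemble an MQL expression from its parts (string building only)."""
    # Dimension filters are rendered as key = "value" pairs, in insertion order.
    filters = ", ".join(f'{key} = "{value}"' for key, value in dimensions.items())
    query = f"{metric}[{interval}]{{{filters}}}.grouping().{statistic}()"
    # An alarm query appends a comparison; a chart query omits it.
    if threshold is not None:
        query += f" > {threshold}"
    return query


alarm_query = build_mql(
    metric="OutboundRequestInvocationTime",
    interval="1m",
    dimensions={
        "resourceId": "yourOIC OCID",  # placeholder, as in the query above
        "adapterIdentifier": "erp",
        "outboundInvocationEndpointInformation": "Get ServiceRequests in ERP Cloud",
    },
    statistic="max",
    threshold=500,
)
print(alarm_query)
```

Dropping the endpoint dimension and the threshold gives the broader chart query for the erp adapter as a whole.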

 
I specify a destination; this will result in OIC admins receiving emails when the alarm fires.

Back in OCI Service Metrics - there is still one dimension I haven't covered.

The values shown are the statuses received in the query time frame. 

So you could also create alarms, based on the response status from Fusion.

Some folks note that OCI Service Metrics charts are very basic, which is one very good reason to look at OCI Log Analytics.

Creating Log Analytics Dashboard Widgets


This we will now do, based on our query. The best place to copy the query from is the alarm definition -
I edit an existing dashboard and add a query based widget.

Summa Summarum

OCI Service Metrics for OIC are very useful for ad hoc or programmatic monitoring of your integration flows. The value add of OCI Alarms & Log Analytics just makes it easier for you to get the insight you need, when you need it!

And speaking of alarms, mine has just fired; here's the mail I received -

Alarm status is also set to firing - 

Later, I receive an email, informing me the alarm has been deactivated - I wish my house alarm worked the same way! 

Now to adding more data to the email - what do we require here? Naturally, how long the invoke took (in msecs) and when the violation occurred.

This can be done by adding the following to the alarm body - 

I test again, to trigger the violation. Here's the email - 

The invoke took 3839 msecs.


Tuesday, September 2, 2025

#1083 Monitoring Integration Processing Time Statistics


Introduction

I originally began this post with the goal of explaining how best OIC users can go about monitoring asynchronous flows. However, it soon became clear to me that most of what I cover equally applies to synchronous flows.

Monitoring, or Observability as we term it in OIC, covers a large area. However, at the end of the day, OIC users have concrete questions they want answered; questions such as -
  • What is the average processing time for my async integration createOrder?
  • When do I see peaks?
  • What else is happening when those peaks occur?
  • Anomalies? e.g. scheduled job usually takes 30 minutes, now it's taking 1 hour.
  • I don't sit in front of a monitoring dashboard all day, so can I be alerted to such?  
The above is not an exhaustive list, so just see it as a starting point for this post. 

OCI Log Analytics nicely complements OIC Observability, and I will be covering how we can leverage both of these tools to answer such questions.

OIC Observability

The starting point, as usual, is a very simple integration -

async-processOrder is a dummy integration that does nothing more than wait a certain amount of seconds, before completing. The request payload field - waitInSecs - controls how long we wait.



I've run 11 flows with different values for waitInSecs - let's get an overview.

Note the default view, highlighting min, max, std. deviation and mean - 

I can un-check/check as required -

Note also the support for percentiles - 
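
For readers who want to sanity-check such figures offline, here's a small Python sketch that computes the same kinds of statistics. The durations are invented, not the values from my 11 flows, and I've used the population standard deviation and the "inclusive" percentile method; which variants the OIC chart uses isn't stated here, so treat both as assumptions.

```python
import statistics

# Hypothetical flow durations in seconds for 11 runs.
durations = [5, 5, 10, 10, 10, 15, 20, 20, 30, 30, 60]

summary = {
    "min": min(durations),
    "max": max(durations),
    "mean": statistics.mean(durations),
    "stdev": statistics.pstdev(durations),  # population std. deviation (assumed)
}

# quantiles(n=100) returns the 99 cut points P1..P99,
# so index p - 1 holds the p-th percentile.
cut_points = statistics.quantiles(durations, n=100, method="inclusive")
summary["p50"] = cut_points[49]
summary["p95"] = cut_points[94]

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note how a single slow flow (60 seconds) pulls the mean well above the median - one reason the percentile view is worth a look.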

I can also see the flow count breakdown over time - 


Now let's do a load test - starting point -

I run a load test from SoapUI and see the async queue building up -


The test completes, but requests are still queued -

Let's look at the figures above -

  • Received is 1967
  • Processed is 1773
  • Succeeded is 1773
That means 194 flows are either in progress or queued. 

I check my async concurrency limit in the OIC Observability Dashboard - 

As you can see, I have a limit of 50. This can, of course, be increased by adding more message packs to this instance.

So from the 194 "open" flows, I can safely say that at least 144 are still queued.
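
The arithmetic behind that statement, as a short Python sketch (open_flow_breakdown is just an illustrative helper name):

```python
def open_flow_breakdown(received, processed, concurrency_limit):
    """Split not-yet-processed flows into a running ceiling and a queued floor."""
    open_flows = received - processed
    # At most concurrency_limit flows can be in progress at once...
    max_running = min(open_flows, concurrency_limit)
    # ...so everything beyond that must still be waiting in the queue.
    min_queued = open_flows - max_running
    return open_flows, max_running, min_queued


open_flows, max_running, min_queued = open_flow_breakdown(1967, 1773, 50)
print(open_flows, max_running, min_queued)  # 194 50 144
```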

I run a couple more load tests and review the graph again - 

Note the max execution time here of ca 48 seconds - for the load test, the async integration has been invoked with the following payload - 

All the integration does is execute the Wait action; this load test had the request field waitInSecs set to 30. So we can say that this flow was in the queue for ca. 18 seconds before being popped.
Earlier tests had a lower value for this field, thus the variation in the graph.
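
Put as a tiny Python sketch, and valid only for this dummy integration where the Wait action is the sole work done (estimated_queue_seconds is a hypothetical helper):

```python
def estimated_queue_seconds(execution_secs, wait_in_secs):
    """Execution time not explained by the Wait action, floored at zero."""
    return max(0, execution_secs - wait_in_secs)


print(estimated_queue_seconds(48, 30))  # 18
```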

Naturally your integrations will contain orchestration logic along with invokes of services etc., so such statements won't be possible for you. 

But I hope you get the idea. 

 

Now to the final widget on the page - 

I've highlighted an icon on the left -

This can be dragged to decrease/increase the time interval - e.g. default view is for 1 day and I want to focus in on a particular part of that day. 




OCI Log Analytics

I have posted many times on OCI Log Analytics; I've even dropped referring to it as OCI Logging Analytics!

This time we will look at getting insight into the time taken to execute flows over a certain time window.

I execute the following integration - 


I ran with waitInSecs set to 30, 20 & 10 seconds.




Now to Log Explorer in OCI Log Analytics -
  
What we're looking at here is the result of a Time Taken Analysis query.

Here is the actual query I used and a BIG THANKS to my OCI Log Analytics colleague, Sreeji, for this!

'Log Source' = 'OCI Integration Activity Stream Logs' and Integration = ASYNCWITHWAIT | link 'OPC Request ID' | eventstats distinctcount(Instance) as Instances, distinctcount(Identifier) as Integrations, distinctcount('User ID') as Users, distinctcount('OCI Resource Name') as Environments | stats unique(Instance) as Instance, unique(Identifier) as 'Integration Id', unique(Endpoint) as Endpoint, unique('User ID') as User, unique('OCI Resource Name') as 'OCI Env' | extract field = 'Integration Id' '(?P<Integration>[^!]*)' | rename 'Group Duration' as 'Time Taken' | classify 'Start Time', 'Integration Id', Integration, 'Time Taken' as 'Flows Execution Time Analysis' | fields target = ui -Instances, -Integrations, -Users, -Environments
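
If the query looks daunting, the core idea is simple: the log records are linked by OPC Request ID, and each group's duration becomes the Time Taken. Here's a rough Python sketch of that idea, assuming the duration is the gap between the earliest and latest record in a group. The records and the helper name are invented for illustration; this is not how Log Analytics is implemented.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical activity stream records: (OPC Request ID, integration, timestamp).
records = [
    ("req-1", "ASYNCWITHWAIT", "2025-09-02T10:00:00"),
    ("req-1", "ASYNCWITHWAIT", "2025-09-02T10:00:30"),
    ("req-2", "ASYNCWITHWAIT", "2025-09-02T10:01:00"),
    ("req-2", "ASYNCWITHWAIT", "2025-09-02T10:01:10"),
    ("req-2", "ASYNCWITHWAIT", "2025-09-02T10:01:20"),
]


def time_taken_per_request(records):
    """Group records by request ID and return seconds between first and last record."""
    groups = defaultdict(list)
    for request_id, _integration, timestamp in records:
        groups[request_id].append(datetime.fromisoformat(timestamp))
    return {rid: (max(ts) - min(ts)).total_seconds() for rid, ts in groups.items()}


print(time_taken_per_request(records))  # {'req-1': 30.0, 'req-2': 20.0}
```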

Now this complex query will be productized in an out-of-the-box dashboard, but, of course, you can start using it now.

I run the integration again, this time specifying a wait of 120 seconds -







I now run another integration validateOrder 3 times - this is a sync integration.

I also run my async integration twice -

In Log Explorer, I delete the integration filter - 

e.g.

'Log Source' = 'OCI Integration Activity Stream Logs' and Integration = ASYNCWITHWAIT | link 'OPC Request ID' | eventstats ...
to 
'Log Source' = 'OCI Integration Activity Stream Logs' | link 'OPC Request ID' | eventstats ...
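
If you switch between the filtered and unfiltered forms often, the opening of the query can be templated. A minimal Python sketch follows; activity_stream_query is a hypothetical helper, and the trailing "..." stands for the rest of the query exactly as abbreviated above.

```python
def activity_stream_query(integration=None, rest="link 'OPC Request ID' | eventstats ..."):
    """Build the opening of the Log Explorer query, with an optional integration filter."""
    search = "'Log Source' = 'OCI Integration Activity Stream Logs'"
    if integration is not None:
        # Narrow the search to a single integration.
        search += f" and Integration = {integration}"
    return f"{search} | {rest}"


print(activity_stream_query("ASYNCWITHWAIT"))  # one integration
print(activity_stream_query())                 # all integrations
```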

I run the query and now see the stats for both integrations - 

Summa Summarum

As the 19th century American humourist, Seba Smith, was wont to say - "there are more ways than one to skin a cat" - rather morbid, but do check out his short story - The Money Diggers. Likewise, there are many ways to monitor/observe what's going on in OIC. I mention the following tools, when discussing this topic with customers -

  • OIC Observability
  • OCI Service Metrics for Integration
  • OCI Logging
  • OCI Dashboards
  • OCI Alarms
  • OCI Log Analytics

Here we looked at processing time statistics monitoring using OIC Observability and OCI Log Analytics. 

Please note, this post is not an exhaustive discourse on OIC monitoring, but I do hope it helps answer some of the questions posed in the introduction.