Introduction
Yet another post in the OIC for OCI AI Services series. Today we're looking at the AI Vision service. Firstly, what does this service offer?
Why begin with a picture of McSorley's? Because we'll use this image in some of the following invokes of the OCI AI Vision service.
What does AI Vision offer?
Net net, the service offers the following -
- Image Classification
- Text Detection
- Face Detection
- Object Detection
- Video Analysis
Let's look at the 3 basic steps when using AI Vision -
- Ingesting data - e.g. images from OCI Object Storage or anywhere else. You can use OIC to pull in data from virtually anywhere; we ship a native action for OCI Object Storage as well as a plethora of adapters.
- Understanding data - here's where AI Vision does its magic, recognising images, parsing text etc. OIC can easily invoke AI Vision; this is what we'll cover today.
- Using the Intel - here we take the result(s) from AI Vision and use them in our business processes. OIC is THE business process automation toolkit, so let's kick the tyres! The sketch after this list illustrates the flow end to end.
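To make those 3 steps concrete, here's a minimal sketch in Python using the OCI SDK (package oci). The bucket, object and compartment values are placeholders, and the AI Vision client and model class names are taken from the SDK docs, so treat this as an assumption-laden sketch rather than a recipe -
import base64
import oci

config = oci.config.from_file()  # standard ~/.oci/config

# 1. Ingest - pull the image in from OCI Object Storage (bucket/object names are placeholders)
os_client = oci.object_storage.ObjectStorageClient(config)
namespace = os_client.get_namespace().data
image_bytes = os_client.get_object(namespace, "my-bucket", "mcsorleys.jpg").data.content

# 2. Understand - send the image inline to AI Vision for object detection
vision = oci.ai_vision.AIServiceVisionClient(config)
result = vision.analyze_image(oci.ai_vision.models.AnalyzeImageDetails(
    features=[oci.ai_vision.models.ImageObjectDetectionFeature()],
    image=oci.ai_vision.models.InlineImageDetails(
        data=base64.b64encode(image_bytes).decode()),
    compartment_id="yourCompartment_ocid")).data

# 3. Use the intel - hand the findings on to the business process
for obj in result.image_objects:
    print(obj.name, obj.confidence)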
OCI AI Vision
Here is the Vision menu in OCI.
Object Detection
This feature allows one to identify objects and their location within an image along with a confidence score.
I try it out -
But you get the idea. I want to know what's going on in the image; AI Vision tells me that, assigning a degree of confidence to what it finds.
So how can we do this in OIC?
Now to the API docs for OCI AI Vision; here I find the API endpoints -
Now to the API for object detection -
The complete URL - https://vision.aiservice.us-phoenix-1.oci.oraclecloud.com/20220125/actions/analyzeImage
Request payload - the basic input here is the image for analysis, passed inline as a base64-encoded string.
The final request payload is as follows
{
"features": [
{
"featureType": "OBJECT_DETECTION"
}
],
"image": {
"source": "INLINE",
"data": "base64"
},
"compartmentId": "yourCompartment_ocid}}"
}
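Outside OIC, the same invoke can be sketched in a few lines of Python. This is purely illustrative - it assumes the OCI Python SDK is installed, a standard ~/.oci/config is in place, and mcsorleys.jpg is a placeholder file name -
import base64
import json
import requests
import oci

# Sign the request with the credentials from ~/.oci/config (assumption)
config = oci.config.from_file()
signer = oci.signer.Signer(
    tenancy=config["tenancy"],
    user=config["user"],
    fingerprint=config["fingerprint"],
    private_key_file_location=config["key_file"])

# Base64-encode the image and build the payload shown above
with open("mcsorleys.jpg", "rb") as f:  # placeholder image file
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "features": [{"featureType": "OBJECT_DETECTION"}],
    "image": {"source": "INLINE", "data": image_b64},
    "compartmentId": "yourCompartment_ocid"  # placeholder
}

url = "https://vision.aiservice.us-phoenix-1.oci.oraclecloud.com/20220125/actions/analyzeImage"
response = requests.post(url, json=payload, auth=signer)
print(json.dumps(response.json(), indent=2))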
The response payload is as follows -
{
"imageObjects": [{
"name": "Person",
"confidence": 0.98758954,
"boundingPolygon": {
"normalizedVertices": [{
"x": 0.6116622686386108,
"y": 0.584307074546814
}, {
"x": 0.6986929178237915,
"y": 0.584307074546814
}, {
"x": 0.6986929178237915,
"y": 0.9633761644363403
}, {
"x": 0.6116622686386108,
"y": 0.9633761644363403
}]
}
}, {
"name": "Chair",
"confidence": 0.984481,
"boundingPolygon": {
"normalizedVertices": [{
"x": 0.2508918046951294,
"y": 0.7415730953216553
}, {
"x": 0.32072916626930237,
"y": 0.7415730953216553
}, {
"x": 0.32072916626930237,
"y": 0.9100103378295898
}, {
"x": 0.2508918046951294,
"y": 0.9100103378295898
}]
}
}, {
"name": "Footwear",
"confidence": 0.9828044,
"boundingPolygon": {
"normalizedVertices": [{
"x": 0.5381702184677124,
"y": 0.9290227890014648
}, {
"x": 0.5808274149894714,
"y": 0.9290227890014648
}, {
"x": 0.5808274149894714,
"y": 0.9576336741447449
}, {
"x": 0.5381702184677124,
"y": 0.9576336741447449
}]
}
}, {
"name": "Person",
"confidence": 0.9810399,
"boundingPolygon": {
"normalizedVertices": [{
"x": 0.5125582814216614,
"y": 0.5717782378196716
}, {
"x": 0.5918540954589844,
"y": 0.5717782378196716
}, {
"x": 0.5918540954589844,
"y": 0.9574788808822632
}, {
"x": 0.5125582814216614,
"y": 0.9574788808822632
}]
}
}, {
"name": "Footwear",
"confidence": 0.97873676,
"boundingPolygon": {
"normalizedVertices": [{
"x": 0.5209354758262634,
"y": 0.9121176600456238
}, {
"x": 0.5540853142738342,
"y": 0.9121176600456238
}, {
"x": 0.5540853142738342,
"y": 0.9327118396759033
}, {
"x": 0.5209354758262634,
"y": 0.9327118396759033
}]
}
}],
"labels": null,
"ontologyClasses": [{
"name": "Chair",
"parentNames": ["Furniture"],
"synonymNames": []
}, {
"name": "Footwear",
"parentNames": ["Clothing"],
"synonymNames": []
}, {
"name": "Person",
"parentNames": [],
"synonymNames": []
}, {
"name": "Clothing",
"parentNames": [],
"synonymNames": []
}, {
"name": "Furniture",
"parentNames": [],
"synonymNames": []
}],
"imageText": null,
"objectProposals": null,
"detectedFaces": null,
"detectedLicensePlates": null,
"imageClassificationModelVersion": null,
"objectDetectionModelVersion": "2.0.3",
"textDetectionModelVersion": null,
"objectProposalModelVersion": null,
"faceDetectionModelVersion": null,
"licensePlateDetectionModelVersion": null,
"errors": []
}
I create the connection in OIC -
I only want to return a precis of the AI Vision response, so my trigger response has been defined as follows -
{
"imageObjects" : [ {
"name" : "Person",
"confidence" : 0.98758954
}, {
"name" : "Chair",
"confidence" : 0.984481
} ],
"ontologyClasses" : [ {
"name" : "Chair",
"parentNames" : [ "Furniture" ]
}, {
"name" : "Footwear",
"parentNames" : [ "Clothing" ]
} ]
}
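Just for comparison, the trimming the OIC mapper does here could be sketched in Python as follows, where response_json is the full analyzeImage response shown earlier -
import json

response = json.loads(response_json)  # the full analyzeImage response

precis = {
    "imageObjects": [
        {"name": o["name"], "confidence": o["confidence"]}
        for o in response.get("imageObjects") or []
    ],
    "ontologyClasses": [
        {"name": c["name"], "parentNames": c["parentNames"]}
        for c in response.get("ontologyClasses") or []
    ]
}
print(json.dumps(precis, indent=2))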
I complete the mapping and test -
McSorley's is an institution in New York, the oldest pub in the city, in the hands of the Irish up until this very day. They only serve 2 types of beer: a dark beer, which is rather unpalatable, and a lager, which is to everyone's taste. The beer is served in very small glasses, ergo you don't order 1, you order 4 - and if you're with me and the bauld Peter Meleady, 24.
Image Classification
According to the docs - Image classification assigns classes and confidence scores based on the scene and contents of an image.
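The request mirrors the object detection payload above; only the featureType changes -
{
    "features": [
        {
            "featureType": "IMAGE_CLASSIFICATION"
        }
    ],
    "image": {
        "source": "INLINE",
        "data": "base64"
    },
    "compartmentId": "yourCompartment_ocid"
}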
The tailored response to this API invoke is as follows -
Face Detection
As the name suggests, this detects faces and their X, Y positions in the image.
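Again the request payload has the same shape, this time with featureType set to FACE_DETECTION (per the API docs) -
{
    "features": [
        {
            "featureType": "FACE_DETECTION"
        }
    ],
    "image": {
        "source": "INLINE",
        "data": "base64"
    },
    "compartmentId": "yourCompartment_ocid"
}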
Text Detection
Let's try this out in OCI -
Just to note here, we need to set featureType to TEXT_DETECTION.
Here's the request payload for the API invoke -
{
"features": [
{
"featureType": "TEXT_DETECTION"
}
],
"image": {
"source": "INLINE",
"data": "base64"
},
"compartmentId": "yourCompartment_ocid}}"
}
I initially set the response payload to {}. I then run the integration in debug mode and copy and paste the JSON response shown in the activity stream.
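As a rough illustration of using that response, the detected lines of text could be pulled out as below. The imageText/lines field names come from the AI Vision API docs rather than from the integration above, so treat them as an assumption -
import json

response = json.loads(response_json)  # the TEXT_DETECTION response from the activity stream
image_text = response.get("imageText") or {}
for line in image_text.get("lines", []):  # each line carries text + confidence (per the docs)
    print(line["text"], line["confidence"])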
I configure the REST trigger to return only a subset of this data -
Video Analysis
The video analysis includes -
Summa Summarum
AI Vision is yet another cool AI service in the OCI stack. This post is just an introduction to the service, but I hope it has whetted your appetite!
Bon appétit!