Sequel to AI-Powered Order Structuring: Advancing to JSON Transformation

Following my earlier post on leveraging generative AI to convert informal order instructions into structured JSON, I’ve been developing a proof of concept with our IDP tool, Abbyy FlexiCapture.

I’ve transformed an order that arrived in the body of an email, with the text extracted during an FlexiCapture workflow:

“Can I order: Same red willington boots as last week, product code WEL1234, price was 15 pounds I think. A roll of hay, 3m deliver that to yard 25kg bag of rabbit feed, the one that costs 19.99 Order number is ord78465 and I wnat everything delivering to recepption apart from the hay.”

Into a structured JSON:

{

“Items”: [

    {

      “quantity”: 1,

      “product_code”: “WEL1234”,

      “product_description”: “Wellington Boots”,

      “unit_price”: 15,

      “pack_size”: “”,

      “item_size”: “”,

      “colour”: “red”,

      “order_reference”: “ord78465”,

      “required_date”: “”,

      “delivery_notes”: “to reception”

    },

    {

      “quantity”: 1,

      “product_code”: “”,

      “product_description”: “Roll of Hay”,

      “unit_price”: “”,

      “pack_size”: “”,

      “item_size”: “3m”,

      “colour”: “”,

      “order_reference”: “ord78465”,

      “required_date”: “”,

      “delivery_notes”: “deliver to yard”

    },

    {

      “quantity”: 1,

      “product_code”: “”,

      “product_description”: “Rabbit Feed”,

      “unit_price”: 19.99,

      “pack_size”: “25kg”,

      “item_size”: “”,

      “colour”: “”,

      “order_reference”: “ord78465”,

      “required_date”: “”,

      “delivery_notes”: “to reception”

    }

],

“Accuracy scores”: [

    {

      “quantity”: 1,

      “product_code”: 1,

      “product_description”: 0.8,

      “unit_price”: 0,

      “pack_size”: 0,

      “item_size”: 0.8,

      “colour”: 0.8,

      “order_reference”: 1,

      “required_date”: 0,

      “delivery_notes”: 1

    },

    {

      “quantity”: 1,

      “product_code”: 0,

      “product_description”: 1,

      “unit_price”: 1,

      “pack_size”: 0,

      “item_size”: 1,

      “colour”: 0,

      “order_reference”: 1,

      “required_date”: 0,

      “delivery_notes”: 1

    },

    {

      “quantity”: 1,

      “product_code”: 0,

      “product_description”: 1,

      “unit_price”: 1,

      “pack_size”: 1,

      “item_size”: 0,

      “colour”: 0,

      “order_reference”: 1,

      “required_date”: 0,

      “delivery_notes”: 1

    }

]

}

The initial output from ChatGPT looks like this:

{

“id”: “”,

“object”: “chat.completion”,

“created”: 1713265964,

“model”: “gpt-3.5-turbo-0125”,

“choices”: [

    {

      “index”: 0,

      “message”: {

        “role”: “assistant”,

        “content”: “{\n \”Items\”: [\n    {\n      \”quantity\”: 1,\n      \”product_code\”: \”WEL1234\”,\n      \”product_description\”: \”Wellington Boots\”,\n      \”unit_price\”: 15,\n      \”pack_size\”: \”\”,\n      \”item_size\”: \”\”,\n      \”colour\”: \”red\”,\n      \”order_reference\”: \”ord78465\”,\n      \”required_date\”: \”\”,\n      \”delivery_notes\”: \”delivering to reception\”\n    },\n    {\n      \”quantity\”: 1,\n      \”product_code\”: \”\”,\n      \”product_description\”: \”Roll of Hay\”,\n      \”unit_price\”: \”\”,\n      \”pack_size\”: \”\”,\n      \”item_size\”: \”3m\”,\n      \”colour\”: \”\”,\n      \”order_reference\”: \”ord78465\”,\n      \”required_date\”: \”\”,\n      \”delivery_notes\”: \”deliver that to yard\”\n    },\n    {\n      \”quantity\”: 1,\n      \”product_code\”: \”\”,\n      \”product_description\”: \”Rabbit Feed\”,\n      \”unit_price\”: 19.99,\n      \”pack_size\”: \”25kg\”,\n      \”item_size\”: \”\”,\n      \”colour\”: \”\”,\n      \”order_reference\”: \”ord78465\”,\n      \”required_date\”: \”\”,\n      \”delivery_notes\”: \”delivering to reception\”\n    }\n ],\n \”Accuracy scores\”: [\n    {\n      \”quantity\”: 1,\n      \”product_code\”: 1,\n      \”product_description\”: 1,\n      \”unit_price\”: 0.8,\n      \”pack_size\”: 0,\n      \”item_size\”: 0.8,\n      \”colour\”: 1,\n      \”order_reference\”: 1,\n      \”required_date\”: 0,\n      \”delivery_notes\”: 1\n    },\n    {\n      \”quantity\”: 1,\n      \”product_code\”: 1,\n      \”product_description\”: 1,\n      \”unit_price\”: 0,\n      \”pack_size\”: 0,\n      \”item_size\”: 1,\n      \”colour\”: 0,\n      \”order_reference\”: 1,\n      \”required_date\”: 0,\n      \”delivery_notes\”: 1\n    },\n    {\n      \”quantity\”: 1,\n      \”product_code\”: 0,\n      \”product_description\”: 1,\n      \”unit_price\”: 1,\n      \”pack_size\”: 1,\n      \”item_size\”: 0,\n      \”colour\”: 0,\n      \”order_reference\”: 1,\n      \”required_date\”: 0,\n      \”delivery_notes\”: 1\n    }\n ]\n}”

      },

      “logprobs”: null,

      “finish_reason”: “stop”

    }

],

“usage”: {

    “prompt_tokens”: 749,

    “completion_tokens”: 517,

    “total_tokens”: 1266

},

“system_fingerprint”: “”

}

Before using the output, some JSON parsing is needed.

I’ve learned that crafting the right prompt is key for accurate, consistent and usable results. My prompts specify:

- The AI’s role.
- A blank JSON template.
- A sample order.
- A JSON example for that order.
- How the AI should assess accuracy.

The terms I use are direct:

- “You are part of a processing solution…”
- “Format the output as JSON entries…”
- “Include these possible JSON nodes…”
- “An example structured output is…”
- “For this order text…”
- “The correct output would be…

Generative AI can struggle with creating usable JSON, so examples are crucial. It needs explicit instructions for consistency and an accuracy scoring system to avoid ‘hallucinations’. There’s a lot to learn, and I’m still on that journey. What are your strategies for ensuring your results are accurate and consistent?

Richard Fishburn – The Automation Advocate

Richard Fishburn – The Automation Advocate

2 Comment on this post

Join the conversation Cancel reply