Stop Struggling with Documents – Automate 95% of the Work with AI

Explore how generative document intelligence (GDI) powered by large language models (LLMs) is transforming intelligent document processing (IDP) for high-volume use cases like invoice automation. This video highlights key concepts such as document decomposition, AI-driven data extraction, prompt engineering, and flow orchestration to increase accuracy, reduce manual effort, and accelerate business processes.

Vinny LaRocca: Thanks for joining, everyone. We’re going to get started in a second here. As we go through the presentation, feel free to put any questions in the chat; we’ll get to them at the end. My name is Vin LaRocca. I’m the Director of Engineering for Lydonia Technologies. I’ve spent about a decade in AI and automation integration, and multiple years with various IDP platforms, one of which we have here today: Pienso. So I’ll turn it over to Dan to introduce himself.

Dan Leszkowicz: Hi, everyone. Dan Leszkowicz. I head up sales and alliances here at Pienso. I’ve been in the tech consulting and enterprise AI space for coming up on 16 or 17 years at this point. Pienso is focused on intelligent document processing, specifically doing IDP with the latest and greatest in AI: think of the large language models you see in headlines these days, applied to real-world use cases where people need to automate document processing at high volume. We’re going to take you through what that looks like today.

One of the best places to start is a basic understanding of what we mean when we say generative document intelligence. The way I like to describe it is to start by comparing it to what most folks on the phone are probably used to with respect to IDP. Any folks who have been through an IDP project in the past several years are probably familiar with the top half of the slide, and for folks who’ve really been deep in it, this slide might be triggering, because it brings back memories of the hassles of working with those tools. Typically, you’re stitching together a collection of tools, processes, and methodologies to get the outcome: take a document, say an invoice or a contract, and extract some information so you can feed it into a business process downstream.

Now, our approach is a little bit more streamlined; I’ll go into it on the next slide. Think of GDI as taking an LLM-native approach. Rather than having to stitch together these different tools in a DIY fashion, GDI is a turnkey solution that lets you handle that document variability in one streamlined flow.

Vinny LaRocca: Yeah. And for those who maybe aren’t as familiar with some of the methods up top, it typically requires a really significant lift to get this stuff up and running. Anybody who’s used these would be familiar with machine learning extraction. To set that up, we’re talking about manually annotating somewhere between 500 and a thousand documents, which obviously has a significant timeline associated with it, plus iteration depending on the results: adding more variations and more samples, testing to figure out where we’re actually at, and then maintenance and upkeep year over year. It doesn’t just go away because you built it. With machine learning, as your invoices change and you get new vendors, you need to adjust these models and add new variations and templates. So the lift just to keep it running over time is pretty significant as well.

Dan Leszkowicz: That’s actually a great point. I’m thinking we need to update the slide so that squiggly line just continues on forever and ever, because you’re never really out of the cycle. So, like I mentioned, GDI handles things a little bit differently. One of the key things to keep in mind with GDI is that, behind the scenes, we’re employing what I like to call a society of models, or a symphony of models, where different models accomplish different tasks within the overall process flow.

Breaking it apart and clicking into each chunk of the workload: instead of treating, say, a PDF invoice as a single PDF, throwing it at a model, asking questions, and hoping for the best, what we’ve found through work with our customers and through research is that it’s really important to break those documents down into their constituent pieces. I’ll use an invoice as the example, partly because a lot of folks on the phone are probably looking at automating invoice processing, and that’s what we’ll show in our demo today too. An invoice is not just a single monolithic document; there are different things going on within it. You’ve got some header information, most likely. There’s probably a line-item table that could have anywhere from one to hundreds of line items. Then there’s likely some payment information, maybe some footnotes, maybe even handwritten notes on top. One model is not going to be good at handling all of those things, or at least it’s not going to be best of breed at each. So the first thing we do is deconstruct the document into those constituent pieces so we can treat each one differently, each one as a first-class object in the workload. Then, with each of those deconstructed pieces, we take action using the LLMs via some really sophisticated prompt engineering that is part of this turnkey solution. We’re effectively asking the model to execute tasks on our behalf: pull out the invoice number, the payment date, and so on, from those deconstructed pieces.
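
To make that decompose-then-extract pattern concrete, here is a minimal Python sketch. The segment names, the prompt text, and the generic llm.complete() interface are illustrative assumptions, not Pienso’s actual implementation.

    # Minimal sketch of the decompose-then-extract pattern described above.
    # Segment names, prompts, and the llm.complete() interface are assumptions.

    INVOICE_SEGMENTS = ["header", "line_item_table", "payment_info", "footnotes"]

    EXTRACTION_PROMPTS = {
        "header": "From this invoice header, return the invoice number, "
                  "invoice date, and PO number as JSON.",
        "line_item_table": "Return every line item as a JSON object with "
                           "description, quantity, unit_price, and total_price.",
        "payment_info": "Return the payment terms and remit-to details as JSON.",
    }

    def segment_invoice(pdf_text: str) -> dict:
        """Split the raw document into its constituent regions. A real system
        would use layout analysis; this stub only shows the output shape."""
        return {name: pdf_text for name in INVOICE_SEGMENTS}

    def extract(pdf_text: str, llm) -> dict:
        # Each segment gets its own prompt (and potentially its own model),
        # rather than sending the whole PDF to one model and hoping.
        results = {}
        for name, text in segment_invoice(pdf_text).items():
            prompt = EXTRACTION_PROMPTS.get(name)
            if prompt:
                results[name] = llm.complete(prompt + "\n\n" + text)
        return results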

And then the final piece, really the secret to our sauce here, is what I’ll call flow engineering. Think of that as employing the society of models, where each model is good at a different thing. Once we’ve extracted the information we’re looking for, we pass it to a set of models that are very good at verifying information. Picture a model coming in over the top and independently verifying each extracted piece of information against the source document, and saying, “Thumbs up or thumbs down: does this look good enough to pass through to the business application, or does it require some sort of human-in-the-loop, manual verification?” When you put all of this together (extraction, prompt engineering, flow engineering), we call it a custom chain. I’ll refer to chains throughout the session today. Think of a chain as the fully baked tool that does all of these things for whatever your workload is: maybe it’s processing invoices, maybe it’s processing contracts, and so on.
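
To sketch that flow-engineering step in the same hypothetical Python: a second, verifier model independently checks each extracted field against the source text and routes anything questionable to human review. The prompt wording and verifier interface are, again, assumptions.

    # Hypothetical verification pass: a verifier model checks each extracted
    # field against the source document and flags failures for human review.

    def verify_field(field: str, value: str, source_text: str, verifier_llm) -> bool:
        prompt = (
            "Field: " + field + "\n"
            "Extracted value: " + value + "\n\n"
            "Source document:\n" + source_text + "\n\n"
            "Answer YES if the value is supported by the source, otherwise NO."
        )
        return verifier_llm.complete(prompt).strip().upper().startswith("YES")

    def run_chain(source_text: str, extracted: dict, verifier_llm) -> dict:
        verified, needs_review = {}, {}
        for field, value in extracted.items():
            if verify_field(field, str(value), source_text, verifier_llm):
                verified[field] = value        # thumbs up: straight through
            else:
                needs_review[field] = value    # thumbs down: human-in-the-loop
        return {"verified": verified, "needs_review": needs_review}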

So, with that, probably the best way to make this real, Vinny, if it sounds good to you, I think I’m going to bring up the demo and show folks what we’re looking at. 

Vinny LaRocca: Yeah, let’s see it. 

Dan Leszkowicz: Awesome. So I’ve pulled up a demo environment. This is all fabricated or open-source data, so you’ll see some fictitious documents here. The demo I’ve got queued up is for invoice processing; again, a very common workload that we help customers through. There’s an AP department, they process thousands of invoices a week or a month, and they want to automate it, specifically the data entry portion.

Now, this demo environment is a way for you to visualize what is happening behind the scenes. One thing I’ll mention, though, is that in a production scenario it’s even easier, because what we’re looking at right now is basically a window into the API. In production, GDI is stood up as an API. You send your documents to it via an RPA tool, Pienso works its magic and extracts the relevant pieces of information, and then we pass those results, typically as JSON, back to the RPA tool, where they end up in your line-of-business application. So, even more seamless. But this is a way to peek behind the curtain and see what’s happening.
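
As a rough illustration of that integration pattern, an RPA step calling a document-extraction API might look like the Python below. The endpoint URL, auth header, and response shape are hypothetical, not Pienso’s documented API.

    # Illustrative only: endpoint, auth scheme, and response shape are
    # hypothetical, not Pienso's documented API.

    import requests

    API_URL = "https://gdi.example.com/v1/extract"   # hypothetical endpoint

    def process_invoice(pdf_path: str, api_key: str) -> dict:
        with open(pdf_path, "rb") as f:
            resp = requests.post(
                API_URL,
                headers={"Authorization": "Bearer " + api_key},
                files={"document": f},
            )
        resp.raise_for_status()
        # The RPA tool would take this JSON and write it into the
        # line-of-business application.
        return resp.json()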

So I’m going to start by uploading a document so you can see this happening in real time. Let me get my document here, and I’m going to add it. You can see a pop-up. Like I mentioned, this is a fictitious document (it’s not from an actual purchase of wizard wands), but it’ll convey what we’re doing here. So I’m going to hit start. Generally, the layout is: the document sits on the left, and the pieces of information we pull show up on the right. You’ve got a side-by-side comparison, which makes it really easy to understand what accuracy and what straight-through processing we’re getting. So I’m going to hit start extraction. This will probably take 20, maybe 30 seconds, so I’m going to take the Julia Child approach, show you the turkey that’s already in the oven, and pull up one that recently finished. So again, a similar invoice with its extracted pieces of information, and as I go through it you can see what our invoice-processing chain picks up out of the box.

So, I won’t go through all of them, but you can see the invoice number, the invoice date, the PO number (a combination of letters and numbers, actually), the receiver and ship-to addresses if they happen to be different, the total amount, and then, all the way down here, all of the line-item information. You can see the description, quantity, unit price, and total price pulled for each of these line items. This line-item information ends up being really important. Historically, that is where a lot of IDP tools have fallen short, because tables are honestly very rarely this straightforward. What you often run into are pretty complex tables: nested tables, or tables where a header row cuts across a couple of pieces of information, and it’s really difficult to make sense of them. That’s where this generative document intelligence approach comes in handy compared to traditional ML, where it’s really difficult to train for all those permutations and templates.
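
For reference, the extracted output being walked through here might look something like the following. The field names and values are invented for illustration, not a documented schema.

    # Invented example of the output shape; not a documented schema.

    extracted = {
        "invoice_number": "INV-1042",
        "invoice_date": "2021-11-03",
        "po_number": "PO-7A41",
        "ship_to_address": "100 Example Way, Canton, MA",
        "total_amount": "1250.00",
        "line_items": [
            {"description": "Oak wand", "quantity": 2,
             "unit_price": "125.00", "total_price": "250.00"},
            {"description": "Standard spellbook", "quantity": 4,
             "unit_price": "250.00", "total_price": "1000.00"},
        ],
    }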

Vinny LaRocca: Yeah. And you know, with generative AI and LLMs in general, we’ve seen that this is sort of a problem area, so being able to pull tables is pretty unique.

Dan Leszkowicz: Yeah, and that is actually a perfect transition. You know, this is a demo environment. I’m showing some demo invoices, but I think it’s probably good to take a step back and say, what does that mean from a real-world perspective in terms of accuracy and STP? So, I’ll pop back to the slides for a sec for that. 
Yes, Vinny, do you want to give your thoughts here based on what you’ve seen? 

Vinny LaRocca: Yeah. So, first of all, I just want to explain the difference between accuracy and STP, because these can be confusing concepts if you’ve never seen them before. When we’re referring to accuracy, what we really mean is this: let’s say I have an invoice with 10 fields, and I get 9 out of 10 of those fields right. That’s 90% accuracy. But because I missed a field on that document, a person is going to have to look at it and make a correction. So, because I didn’t get 100% of that individual document, that’s 0% straight-through processing; the document still has to be looked at. Accuracy, in a sense, is a precursor to STP.
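
Here is that arithmetic as a tiny worked Python example, with counts made up to match the 9-out-of-10-fields scenario:

    # Accuracy vs. STP on a batch of documents.

    def accuracy_and_stp(docs):
        """Each doc is a dict: {"fields_total": int, "fields_correct": int}."""
        total_fields = sum(d["fields_total"] for d in docs)
        correct_fields = sum(d["fields_correct"] for d in docs)
        straight_through = sum(
            1 for d in docs if d["fields_correct"] == d["fields_total"]
        )
        return correct_fields / total_fields, straight_through / len(docs)

    # Ten invoices, 10 fields each, every one missing a single field:
    docs = [{"fields_total": 10, "fields_correct": 9}] * 10
    accuracy, stp = accuracy_and_stp(docs)
    print(accuracy, stp)   # 0.9 accuracy, but 0.0 STP: every doc needs a human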

So we see that you can actually have some pretty high accuracy rates alongside really, really poor STP rates, and STP is what you actually care about. That’s what’s going to drive your ROI. In some cases, depending on volume, every percentage point can be equivalent to thousands of invoices. So it’s really important to get STP as high as possible so you don’t need to be looking at these documents. Accuracy on its own isn’t a number worth benchmarking against, because it doesn’t really mean anything without the STP number. That’s just worth understanding and pointing out.
And I think you see that played out here, right? In this particular example, the generative extractor on the left side, in red, was really struggling with tables. What we found was, yes, it’s getting most of the other stuff, but because it could never get the complete set of fields, our STP was pretty low, and I’d still have to go look at half of my invoices and manually triage them. That doesn’t really help me from an automation perspective. Pienso, on the other hand, is getting STP results that, quite frankly, we didn’t see even after all that work with machine learning. We didn’t see STP this high, and we certainly haven’t seen it from any other GenAI tools yet. These are numbers that other IDP platforms haven’t even been able to sniff yet.

Dan Leszkowicz: Awesome. Thanks for that. Bringing it all together, there are a couple of things we’ve peppered into the conversation that we can be super explicit about too. Comparing traditional IDP with what we’re doing here with GDI, there are basically four pillars where I think the difference becomes very clear.

One of them is the higher coverage we’re able to get across different documents. When we’re solutioning with customers, they’re very rarely looking at a single template or a single structure. If you think about an AP department, they’re processing invoices from hundreds, potentially thousands, of different vendors, and each one is a different template.

That variability is usually where STP goes to die with traditional IDP tools. Because we’re taking this LLM-native approach and employing that society of models, where we can bring in the right expert for the right job, we’re able to handle that variability upfront without a massive training effort. We’ve also unpacked high accuracy, and you saw some of the results as they came through; they speak for themselves.

I’ll double down on what Vinny said, because I really agree that STP is more important. That is one of the confusing pieces in the market right now: a lot of IDP vendors will tout high accuracy, but when you double-click, it’s the situation Vinny mentioned, where there’s always one field that has to get a human touch. And my perspective is, well, if everything needs a human touch, what’s the point of the automation in the first place, right?

Vinny LaRocca: You’re paying for AI, you want an ROI on it, right? And if it hasn’t saved you any time, because you still have people looking through these documents, then why’d you do it in the first place? 

Dan Leszkowicz: Yeah, 100%. And the final piece here is the lower total cost of ownership, which you’ve probably been anticipating as we’ve talked through these things: less training, faster turnaround, faster time to deployment. One of the pieces underlying this pillar, though, is the way we price our solution. Most tools out there price based on some measure of volume. Usually it’s the number of documents processed or the number of pages, and some price based on the number of characters, i.e., tokens. We avoid that entirely. Instead, we price based on the task, or the use case. So, for example, processing invoices: one task, one use case. Processing broker commission statements: one task, one use case. And within that use case, we don’t need to care (and therefore you, as the customer, don’t need to care) what sort of volume you’re dealing with: how many pages per document, how many documents, and so on.

What that does, in addition to keeping the cost much lower, is smooth out that spend. You don’t have to worry about spikes if one month you have a high volume of documents.
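
As a back-of-the-envelope comparison of the two pricing models, with entirely made-up numbers:

    # Volume-based vs. task-based pricing; every number is invented.

    pages_per_month = 50_000
    price_per_page = 0.10       # hypothetical per-page rate
    flat_task_price = 3_000     # hypothetical flat monthly rate per use case

    volume_cost = pages_per_month * price_per_page   # grows (and spikes) with volume
    task_cost = flat_task_price                      # flat, regardless of volume

    print(volume_cost, task_cost)   # 5000.0 vs 3000; the gap widens with volume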

Vinny LaRocca: Yeah, between some of these IDP platforms and the way AI in general is being charged for, we’re getting to the point where you need, like, a Ph.D. in differential equations to figure out what this stuff is going to cost. That makes it really difficult to tell customers what kind of ROI they’re going to get, because there’s variability in it. You can take an average of the documents you expect annually, but that’s a rough, rough number, and if that changes, or you scale, all of a sudden your cost has gone up and you’re wondering whether it was worth it in the first place. This problem extends past IDP to AI in general, and Pienso is really solving for it here.

Dan Leszkowicz: So, that’s actually a perfect transition to the last slide we wanted to make sure we cover. Demos are great, and talking about generic STP and accuracy is great. But Vinny, can you give us your perspective on a couple of real-world success stories around document processing and what that means in terms of hard ROI, savings to the business, or improvements?

Vinny LaRocca: Yeah, for sure. Two examples here. On the left, we have a company with 800,000 invoices and 12,000 different vendors. 800,000 is a very large number of invoices compared to what we see on average, and many of those invoices run multiple pages. So again, you get into the scenario where most platforms are priced per page, and the cost to do this with some of those platforms would have been enormous. And 12,000 vendors: like we mentioned earlier, these bigger numbers are where we see a lot of these platforms start to fall apart, when you’re at scale and actually in production. We see a lot of bake-offs and POCs going on within companies when we start talking to them, and they’re often not taking into account the true scalability of some of these platforms. Even with all that variation and all that volume, we were able to get 90% straight-through processing out of this. On the right, we have a different use case and kind of the same story. The volume was a little bit lower, but there was still a large number of vendors.
And we saw 200% ROI and a 94% reduction in time on task. What we’re talking about there is not just the IDP piece in isolation, but the full end-to-end commission statement process; we were able to achieve basically 94% automation end to end. So we’re taking the bigger picture there and showing what the elimination of time truly was from deploying a solution like this.

Dan Leszkowicz: One of the things I find interesting is that both of these show a massive uptick in straight-through processing, less time on task, and so on, yet every organization has a different motivation for tackling a project like this. In some cases, there’s an outsourced vendor doing this manually today. In some cases, it’s onshore vendors, and obviously that’s even more cost-intensive, and the goal is to reduce that need, that headcount, wherever it sits.
Just as often, if not more often (in fact, this came up with an organization we were talking to earlier today), the goal isn’t to remove those headcounts; it’s to free those people up for higher-order tasks. Those are folks who have other things to do besides data entry and data validation, and freeing up hours and hours of their week lets them tackle things that are more helpful for the business, and honestly more fulfilling for them too. So that’s one of the motivations behind some of these hard ROI numbers.

Vinny LaRocca: Thanks for joining, everybody. Appreciate it. 

Dan Leszkowicz: Thanks, everyone. Thank you, Vinny. 
