This article is based on a presentation at IBA’s Spring 2023 Conference on NLP/NLG given by Scott Curtis, VP of Service Delivery at Deloitte Consulting.
Imagine this: VAP Semiconductor Corporation, a fictitious organization, receives questions about potential financial improprieties and irregularities in its bank reporting by some of its employees. An expert in electronic discovery is hired to investigate the claims and is given three data sets: five years of emails from 25 specific employees, five years of text message conversations between employees, and five terabytes (TB) of documents and spreadsheets from file servers.
There’s more data in those sets than you might initially think, pointed out Scott Curtis, Vice President of Service Delivery at Deloitte Consulting. A single email could have one person BCC’d and several others CC’d. A recipient could reply and drop someone from the thread, and two users could then keep exchanging messages beyond that. In a population like this, five years of email across 25 employees probably amounts to around half a million records, Curtis said, plus an unexpectedly wide breadth of data in the text messages and documents.
How might the expert analyze all of this collected data?
Curtis introduced this fictitious case study in his talk, “NLP: Reading Your Emails And Your Contacts,” at the Spring 2023 Analytics Conference on Natural Language Processing and Natural Language Generation. While other presenters at the conference explored the impact of NLP on biomedicine, pharmacy, and other corporate contexts, Curtis delved into how NLP can be used to analyze unstructured text data in the business world.
Twenty years ago, a trained reviewer in the “VAP Semiconductor Corporation” example might have gone through the messages one by one, searching by keyword with a program similar to an “automated Ctrl+F” command, Curtis said. He described the process as time-consuming, expensive, and laborious.
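To make the “automated Ctrl+F” idea concrete, here is a minimal Python sketch of a keyword search over a handful of messages. The sample messages, the keyword list, and the `keyword_search` helper are all hypothetical illustrations, not anything shown in the talk.

```python
import re

def keyword_search(messages, keywords):
    """Return the messages that contain any keyword, ignoring case."""
    # Build one alternation pattern; re.escape keeps keywords literal.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return [m for m in messages if pattern.search(m)]

# Hypothetical sample data for illustration only.
messages = [
    "Lunch on Friday?",
    "Please revise the balance sheet before the bank sees it.",
    "Income statement draft attached.",
]
hits = keyword_search(messages, ["balance sheet", "income statement"])
for hit in hits:
    print(hit)
```

A search like this only finds the exact words a reviewer thought to look for, which is part of why the manual approach was so laborious.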
Natural Language Processing (NLP) changed this, giving reviewers newer, less tedious ways to analyze data, including identifying certain concepts and patterns to focus on or eliminate. In this example, emails among the five employees in the financial reporting group about meeting up for lunch might not be relevant, but discussions about balance sheets or income statements are probably important, especially if the concept is found in emails with someone from an unrelated department.
“By training NLP models to identify certain patterns, we can identify all of the emails that match a concept, group them into a cluster, and apply those against the other hundreds of thousands of messages,” Curtis said. “Anything that matches a cluster that isn’t relevant, we can push to the side.”
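As a rough illustration of the workflow Curtis describes, the sketch below assigns each message to the closest “concept” and sets aside anything that lands in a concept marked as not relevant. It swaps in a deliberately simple technique (bag-of-words vectors compared by cosine similarity against a few reviewer-labeled seed messages) in place of a trained NLP model, and the seed messages, concept names, and threshold are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn text into lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def centroid(texts):
    """Sum the word counts of the seed texts for one concept."""
    total = Counter()
    for text in texts:
        total.update(vectorize(text))
    return total

# Hypothetical reviewer-labeled seed messages for each concept.
seeds = {
    "lunch": ["Want to grab lunch today?", "Lunch at noon near the office?"],
    "financials": [
        "Updated the balance sheet numbers.",
        "The income statement needs revising before the bank report.",
    ],
}
RELEVANT = {"financials"}  # assumption: only this concept matters to the review
centroids = {name: centroid(texts) for name, texts in seeds.items()}

def assign(message, threshold=0.1):
    """Return the best-matching concept, or None if nothing is close enough."""
    vec = vectorize(message)
    name, score = max(((n, cosine(vec, c)) for n, c in centroids.items()), key=lambda p: p[1])
    return name if score >= threshold else None

def triage(messages):
    """Split messages into those kept for review and those set aside."""
    keep, set_aside = [], []
    for message in messages:
        concept = assign(message)
        # Only messages matching a known non-relevant concept are set aside;
        # unmatched messages stay in the review pile.
        if concept is not None and concept not in RELEVANT:
            set_aside.append(message)
        else:
            keep.append(message)
    return keep, set_aside

keep, set_aside = triage([
    "Are we still on for lunch tomorrow?",
    "Can you adjust the balance sheet before the bank sees it?",
    "Server maintenance is scheduled for Sunday.",
])
print("keep:", keep)
print("set aside:", set_aside)
```

Keeping unmatched messages in the review pile is a conservative design choice for this sketch: a message is pushed to the side only when it clearly resembles something a reviewer has already judged irrelevant.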
Abilities and Advantages of NLP
One of the main goals of NLP, Curtis said, is to make significant improvements over the current process. Improvements may include decreasing time or money spent on a task, improving accuracy and confidence in a process, or reducing human involvement.
“If technology only offers a 10% gain, it might only be worth it on a really significant matter,” Curtis said. “But if it’s a 50% gain in the time it takes me to get to the right answer, or if it increases my confidence that I have the right answer, it’s worth considering.”
In his role at Deloitte, Curtis has focused on how analytics and technology can improve the work done in and around contracts for clients. In his second case study, Curtis described a fictional cycling company that needed to identify the active purchase agreements among more than 10,000 contracts. In this case, NLP could be used first to identify which contracts are purchase agreements, then to find what differences exist among the export clauses in those agreements.
“We could identify five different iterations of export clauses, put those into buckets, then show the client how many agreements exist in each bucket,” Curtis said. “We could say, ‘these are the ones that have the language that you said you want, so you don’t need to do anything.’ In the others, they would be able to determine with their legal department what actions they need to take.”
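The bucketing step can be sketched in a few lines of Python. This example groups clauses whose wording overlaps heavily (measured by Jaccard similarity on word sets) and then reports how many agreements fall in each bucket and whether the bucket matches the client’s preferred language. The contract IDs, clause texts, preferred wording, and 0.7 threshold are hypothetical, and word overlap is a simple stand-in for whatever model a real engagement would use.

```python
import re

def tokens(text):
    """Lowercase set of words in a clause."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Share of words the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def bucket_clauses(clauses, threshold=0.7):
    """Greedy grouping: a clause joins the first bucket whose
    representative (its first member) is similar enough."""
    buckets = []  # list of (representative_tokens, [contract_ids])
    for contract_id, text in clauses.items():
        toks = tokens(text)
        for representative, ids in buckets:
            if jaccard(toks, representative) >= threshold:
                ids.append(contract_id)
                break
        else:
            buckets.append((toks, [contract_id]))
    return buckets

# Hypothetical export clauses pulled from purchase agreements.
clauses = {
    "C-001": "Buyer shall comply with all applicable export control laws.",
    "C-002": "Buyer shall comply with all applicable export control laws and regulations.",
    "C-003": "Seller makes no representation regarding export restrictions.",
    "C-004": "Buyer shall comply with all applicable export control laws.",
}
# Hypothetical wording the client said it wants.
preferred = tokens("Buyer shall comply with all applicable export control laws.")

buckets = bucket_clauses(clauses)
for representative, ids in buckets:
    matches = jaccard(representative, preferred) >= 0.7
    status = "matches preferred language" if matches else "needs legal review"
    print(f"{len(ids)} agreement(s): {ids} -> {status}")
```

Greedy grouping depends on the order in which clauses arrive, so it is only a starting point; the useful output is the per-bucket count the client can take to its legal department.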
NLP and related technology can also support problem solving, Curtis said. For example, if a client from the fictional cycling company asked a follow-up question, NLP would make it easy to perform additional tests or searches on the documents to find other attributes that might be of interest.
Managing Expectations Around NLP
Finally, Curtis offered ways to navigate the “expectations gap” that can arise with clients when a company offers NLP technology. He often works with clients who understand that they need to use something under the broad umbrella of Artificial Intelligence (AI), and maybe even NLP specifically. They might understand the broad capabilities of NLP but expect to get results just by pushing the “magic AI button.”
Curtis explains to clients that they can use preexisting models, whether created by Deloitte, by a software company they’ve partnered with, or simply offered through Amazon Web Services (AWS). If a client doesn’t have the resources to create and train new models, AWS may still be able to give them 80% of the data they need with significant cost savings.
“Working with our clients around their expectations is a large part of what we do on these projects,” Curtis said. “We make sure that they can get what they’re looking for at the end of the project, and that we set their expectations accordingly for how it works.”