NLP and text mining have grown together in recent years.
The value of NLP in combination with text mining for business growth has become too hard to ignore.
Today I'll explain why Natural Language Processing (NLP) has become so popular in the context of Text Mining and in what ways deploying it can grow your business.
Before we get started, let's define both terms:
Text Analysis (a.k.a. Text Mining) definition: it's the process of understanding and sorting text, making it easier to manage. Text analysis could well be the last piece of the growth puzzle every business is trying to solve. After all, in the information-saturated era we live in, what could be more valuable than organising this information in a structured and meaningful way that we humans can understand?
Natural language processing (NLP) definition: it's a subfield of artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to understand, interpret and manipulate human language.
In this article, we will walk through a business case. We'll look at all the solutions and compare them, so that you can see why NLP takes text mining to the next level.
Tom is the Head of Customer Support at a successful product-based, mid-sized company. Tom works really hard to meet customer expectations and managed to increase the company's NPS in the last quarter. His product has a high rate of customer loyalty in a market filled with competent competitors. From Tom’s perspective, things are going well.
But suddenly, he starts to notice a higher volume of support tickets. Tom is really worried because he can't read every ticket himself to work out what caused the sudden spike.
At first, he goes the laborious route.
He decides to hire a data analyst. The analyst sifts through thousands of support tickets, manually tagging each one over the next month to try to identify a trend across them.
After about a month of thorough research, the analyst delivers a final report highlighting several grievances customers had about the product. Relying on this report, Tom goes to his product team and asks them to make the corresponding changes.
Afterwards, Tom sees an immediate decrease in the number of customer tickets. But the improvement still falls short of what Tom expected for the amount of money invested.
He also has the following concerns:
In a quest for alternative solutions, Tom begins looking for systems that can deliver results more quickly and also cater to his changing needs and queries. It doesn't take long before Tom realises that the solution he is looking for has to be technical. Only computational power can process hundreds of thousands of data units periodically and generate the insights he's looking for in a short span of time.
Having realised that, Tom reaches out to a software consultancy company.
Thanks to technology, their solution:
Tom’s manual queries are treated as a problem of identifying a keyword from the text. So for example if Tom wants to find out the number of times someone talks about the price of the product, the software firm writes a program to search each review/text sequence for the term “price”.
The main principle is that if a word appears in a text, the text can be assumed to be “about” that particular word.
E.g. "I like the product but it comes at a high price."
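As a rough illustration of the keyword approach described above, here is a minimal Python sketch. The function name `count_keyword_mentions` and the sample tickets are made up for this example; they are not the software firm's actual program.

```python
import re

def count_keyword_mentions(texts, keyword):
    """Count how many texts mention `keyword` as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return sum(1 for text in texts if pattern.search(text))

# Hypothetical support tickets, invented for illustration.
tickets = [
    "I like the product but it comes at a high price.",
    "The Price went up again this month.",
    "Delivery was quick and the packaging was fine.",
    "Honestly it is just too expensive for what you get.",
]

price_mentions = count_keyword_mentions(tickets, "price")
print(price_mentions)  # 2
```

Note that the last ticket is clearly about pricing but is not counted, because it never uses the word "price". That is the kind of gap a plain keyword search leaves.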
This approach is closely linked to the former one. Both operate on the principle of pattern identification, but they can only identify patterns that have been defined in advance.
More often than not, a text is not about one single word. For instance, in the example above ("I like the product but it comes at a high price"), the customer's grievance is the high price they’re having to pay.
So there is an inherent need to identify phrases in the text, as they tend to be more representative of the central complaint. These phrase patterns are what are referred to as rules.
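A rule set of this kind could be sketched as follows. The aspects, the phrase patterns and the `tag_aspects` helper are assumptions chosen for illustration; a real rule-based system would carry far more rules than this.

```python
import re

# Hypothetical hand-written rules: each aspect maps to a few phrase patterns.
RULES = {
    "pricing": [r"\bhigh price\b", r"\btoo expensive\b", r"\bcosts? too much\b"],
    "delivery": [r"\blate delivery\b", r"\barrived late\b"],
}

# Compile once so every ticket is checked against ready-made patterns.
COMPILED = {
    aspect: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for aspect, patterns in RULES.items()
}

def tag_aspects(text):
    """Return the sorted list of aspects whose rules match the text."""
    return sorted(
        aspect
        for aspect, patterns in COMPILED.items()
        if any(pattern.search(text) for pattern in patterns)
    )

print(tag_aspects("I like the product but it comes at a high price."))  # ['pricing']
print(tag_aspects("The parcel arrived late and it costs too much."))    # ['delivery', 'pricing']
print(tag_aspects("The app keeps crashing on login."))                  # []
```

The third ticket gets no tag at all: nobody wrote a rule for it, so the system cannot see it.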
Any system that uses these pattern rules to mine aspects from the text is called a rule-based system, and such systems have the following benefits:
These two principles have been the go-to text analytics methods for a long time. Most services in this domain rely mainly on the creation of rules.
Rule creation has been a win for Tom:
As with any good story, there's a catch. A few months down the line, Tom sees ticket numbers trending upwards again. He doesn’t understand: he has already iterated on the product based on his monitoring of customer feedback about prices, product quality and every other aspect his team deemed important.
Worried about the growth of his company, Tom seeks advice from an NLP scientist - Sarah. After a brief conversation with Sarah, Tom realises he’s been getting it all wrong...
In the context of Tom’s company, the incoming flow of data was high in volumes and the nature of this data was changing rapidly.
Rule-based methods lacked the robustness and flexibility to cater to the changing nature of this data.
Sarah further explains that although Tom was monitoring the data with respect to aspects he considered to be red flags (like pricing, size etc.), the red flags in the data were constantly changing and it’s almost impossible to move at the pace of the changing data using handcrafted rules.
Tom realises he was only seeing what he wanted in the data. He wasn’t really seeing what the data had to show.
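To make Sarah's point concrete, here is a deliberately crude sketch of letting the data surface new topics instead of checking predefined ones. It simply compares how many tickets mention each word in two time periods. This is only a stand-in for the statistical methods an NLP system would use, and the stopword list, sample tickets and `emerging_terms` function are all invented for this example.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "it", "and", "to", "of", "on", "my",
             "i", "after", "for", "in", "was", "but", "at", "too"}

def term_counts(texts):
    """Count, for each term, the number of texts it appears in."""
    counts = Counter()
    for text in texts:
        tokens = set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS
        counts.update(tokens)
    return counts

def emerging_terms(earlier, recent, min_recent=2):
    """Terms in at least `min_recent` recent texts that grew versus earlier."""
    before, after = term_counts(earlier), term_counts(recent)
    growth = {t: c - before.get(t, 0) for t, c in after.items() if c >= min_recent}
    return sorted((t for t, g in growth.items() if g > 0),
                  key=lambda t: (-growth[t], t))

earlier = ["The price is too high", "Price went up again", "Delivery was slow"]
recent = ["App crashes after the update", "Login fails after update",
          "The update broke login", "Price is high"]

print(emerging_terms(earlier, recent))  # ['update', 'login']
```

No one told the script to look for "update" or "login"; those terms come out of the data itself, which is exactly what handcrafted rules for pricing or size would never reveal.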
Sarah advises Tom to work with an NLP-powered Customer Experience Analytics company and explain his problems to them. And Tom does so.
A deep-tech AI company uses the power of Machine Learning & Statistics through NLP. The central idea revolves around:
If there is anything you can take away from Tom's story, it is that you should never settle for short-term, traditional solutions just because they seem like the safe approach. Being bold and trusting technology will pay off in both the short and the long term.
As most scientists would agree, the dataset is often more important than the algorithm itself. We at Sentisum have mastered using deep learning models and curating data to surface insights for our customers, and we do this not for one but for multiple tasks, such as Sentiment Analysis, Keyword Extraction, and many others.