Consumer Feedback

Why are text analytics and NLP NOT the answer?

With 90% of the world’s data created in the last two years, data is growing at a scary pace. There are now so many ways for consumers to share data and information that organizations everywhere need to analyze and deal with textual data. Obvious examples are customer service (returns, complaints), QA (failures, missing parts, packaging), product (popular features, negative reviews, competitive analysis) and market research (analyzing brands, products and sentiment).

With so much text to look into, it just makes sense to leverage technology to help you slice it into buckets and areas of interest. This is where Text Analytics and Natural Language Processing (NLP) come in.

 

So what are Text Analytics and NLP?

Text analytics (sometimes referred to as text data mining) is the process of deriving high-quality information from text. This is typically achieved by finding patterns and trends through means such as statistical pattern learning. Text analytics usually involves structuring the input text (usually parsing, along with adding some derived linguistic features, removing others, and inserting the result into a database), deriving patterns within the structured data, and finally evaluating and interpreting the results to make meaningful observations. Text analytics typically doesn’t deal with the semantics of the text; it is more about discovering patterns in it.

NLP is a component of text analytics that performs a special kind of linguistic analysis to help a machine “read” text. NLP is about understanding natural language, the language humans use to communicate. The input can be speech or text, and the main goal is to understand its semantic meaning.

NLP and text analytics are complementary: text mining typically uses NLP, because it makes sense to mine data once you understand it semantically.

 

How does NLP work?

First, the computer must understand what each word is. It tries to determine whether it’s a noun or a verb, whether it’s past or present tense, and so on. This is called Part-of-Speech (POS) tagging.
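
To make this concrete, here is a minimal POS-tagging sketch using the open-source NLTK toolkit. This is an illustration only; the article does not tie itself to any particular library.

```python
# Minimal POS-tagging sketch with NLTK (illustrative only).
import nltk

# One-time downloads of the tokenizer and tagger resources
# (exact resource names can vary slightly across NLTK versions).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "The battery died quickly after the last update"
tokens = nltk.word_tokenize(sentence)
print(nltk.pos_tag(tokens))
# Output pairs each word with a tag, e.g. DT = determiner, NN = noun,
# VBD = past-tense verb, RB = adverb.
```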

NLP systems also have a vocabulary and a set of grammar rules coded into the system. Modern NLP algorithms use statistical machine learning to apply these rules to the natural language and determine the most likely meaning behind what was said.

The end goal is to have the computer understand the meaning of what was said or written. This is challenging because some words may have several meanings (polysemy) and different words may have similar meanings (synonymy), but developers encode rules into their systems and train them to apply the rules correctly.
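
To see why this is hard, consider how many senses a single word can carry. A quick, purely illustrative way to inspect this is WordNet, a lexical database bundled with NLTK:

```python
# Illustrative sketch: listing the WordNet senses of "battery" to show polysemy.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

for synset in wn.synsets("battery"):
    print(synset.name(), "-", synset.definition())
# Prints several distinct senses: the electrical device, an artillery unit,
# the legal offence of battery, and so on - the machine has to pick the right one.
```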

 

So where is the problem?

The short answer: humans train NLP systems to “read” natural language. They put in a vocabulary and a set of rules so the software can look for those words as a way to figure out meaning. The problem is that language is constantly evolving, and younger people create new ways of expressing themselves around a topic that didn’t exist before. How can you train a machine to look for something that doesn’t exist yet? Once you realize there is a new way to talk about a topic, you need to bring back the experts to retrain the system to recognize the new keywords, which is time consuming and likely costly. In the end, it means you missed the bus: by the time you realize there is a new way to talk about something important to you, the train has already left the station and the meaning has passed you by.

Let’s pick an example. Let’s say we’re a smartphone brand and want to analyze what consumers are saying about our latest phone’s battery life. We can try to scan online reviews and search for variations of the word “battery”, but what happens if consumers use phrases such as “doesn’t last long enough” or “phone died on me in the middle of the work day”?
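
A plain keyword filter would miss those reviews entirely, while a semantic approach can score them as close in meaning to “battery life”. Here is a rough sketch using the sentence-transformers library; the model name and the similarity threshold are illustrative assumptions, not something prescribed by this article:

```python
# Hypothetical sketch: naive keyword search vs. semantic similarity on battery reviews.
# The model name and the 0.4 threshold are illustrative assumptions only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

query = "battery life"
reviews = [
    "Battery easily lasts two days",
    "Doesn't last long enough",
    "Phone died on me in the middle of the work day",
    "Great camera in low light",
]

# Keyword filter: only catches reviews that literally contain "battery".
keyword_hits = [r for r in reviews if "battery" in r.lower()]
print(keyword_hits)

# Semantic similarity: can also surface the paraphrased complaints.
scores = util.cos_sim(model.encode(query), model.encode(reviews))[0]
semantic_hits = [r for r, s in zip(reviews, scores) if s > 0.4]
print(semantic_hits)
```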

 

What’s the right way to do things?

With artificial intelligence and self-training algorithms, you can skip the person-training-machine steps, which limit the scope of what the machine can understand and are slow to respond, and move directly to a machine-training-machine scenario that grows to unlimited scale and responds immediately to any variation of a meaning.

 

Conclusion

Current NLP technologies rely on humans and thus are slow to set up, miss a lot of the meaning in texts and are slow to adapt. In a world where 90% of the data was created in the last two years, you can’t rely on humans or manual labor to figure things out.

The good news is that there is now enough data to get answers to your questions; all you need to do is analyze it. Revuze is an innovative technology vendor that addresses just this with the first self-training, fast-setup, low-touch solution that typically delivers 5-8X the data coverage of anything else, and it does it without humans.



Market Research in a world where 90% of the data was created recently

According to a recent report by IBM’s Marketing Cloud, 90% of the world’s data was created in the last two years! Isn’t it amazing? The world has grown 10X in data in 730 days, at a rate of 2.5 quintillion bytes of data a day! There are so many implications: where do you store all this data, how do you manage it, is there such a thing as too much data? What does it mean for the future? Will we see another 10X of data growth in two years?

How do you shift your market research initiatives to handle such data volumes?

 

Using old market research in a new world?

Market research used to be about data sampling, focus groups and surveys: in general, peeking through the market’s peephole and estimating overall market behavior and preferences based on a small group of individuals.

As data became more prevalent, businesses started to use systems that could search data, but keep in mind these systems were set up to search data volumes that were likely 1/20th or so of what exists today.

So basically, in the “old world” you either used brute manual effort or systems that could handle some data, but most likely much less data than exists today.

 

How to process lots of data?

When processing lots of data, the data is typically unorganized (“unstructured” is the commonly used term). Data can come from every brand encounter with consumers – emails, calls, surveys, website feedback – and it is also available in the public domain online on eCommerce sites, review sites, social media, etc.

This is a lot of data…sometimes in different languages, with a lot of different formats! To tackle it you need several core competencies:

  1. Deep understanding of languages
  2. Ability to relate data to a topic
  3. Learn to recognize latest ways to talk about a topic
  4. Understanding of sentiment

 

What is typically lacking in “old world” systems

Mainly autonomous decision making. When you need to go through lots and lots of data, you can’t rely on humans. Hoping humans will set up a computer system to analyze and handle any type of data is unrealistic. There are so many variations in the way people express themselves around a brand, product or feature that you can’t expect one person, or even a team, to figure all of these out. On top of that, add languages, different data formats and the ways consumers express sentiment, and the complexity just grows and grows.

Ideally, if we wanted a technology that helps us handle unlimited data, it would have to be one that can easily scale to multiple languages and data formats, automatically decipher the topics your consumers are talking about, automatically recognize sentiment and sum it all up for you.

 

Why is this difficult?

Let’s pick an example. Let’s say we’re a smartphone brand and want to analyze what consumers are saying about our latest phone’s battery life. We can try to scan online reviews and search for variations of the word “battery”, but what happens if consumers use phrases such as “doesn’t last long enough” or “phone died on me in the middle of the work day”?

 

Conclusion

Current market research technologies rely on humans and thus are slow to set up, miss a lot of things and are slow to adapt. In a world that generates more and more data each year, and where data grows so quickly, you can’t rely on humans or manual labor to figure things out.

The good news is that there is now enough data to get answers to your questions; all you need to do is analyze it. No more need for feedback groups, surveys, etc.

The sad news is that most tools out there to help you do this were not built for the task. Revuze is an innovative technology vendor that addresses just this with a self-learning, fast-setup, low-touch solution that typically delivers 5-8X the data coverage of anything else, and it does it without humans.

 



Market research without (human) limits

According to recent IBM research, US analyst and data jobs will grow 15% by 2020 to a whopping 2.35M positions! It seems that the more data there is, the more people we need to handle it, especially market data. Isn’t something wrong with this picture? With more technology, better computers, more software options and smarter machines, why do we still need more and more people? What do we need in terms of technology to handle market feedback more efficiently?

Why is it so complex?

Market research is about processing lots of data. The data is also very much unorganized (“unstructured” is a commonly used term). Data comes from basically every brand encounter with consumers – emails, calls, surveys, website feedback – and it is also available in the public domain online on eCommerce sites, review sites, social media, etc.

This is a lot of data…sometimes in different languages, with a lot of different formats! To tackle it you need several core competencies:

  1. Deep understanding of languages
  2. Ability to relate data to a topic
  3. Learn to recognize latest ways to talk about a topic
  4. Understanding of sentiment

 

Deep understanding of languages

As brands become global, so do their consumers. Reviews and feedback can come in any number of languages and markets, and deciphering this feedback requires command of the languages in the markets where the brand sells. The larger the brand, the more markets it typically opens up, which in turn means the brand needs capabilities in more languages.

So if we wanted a technology that helps us mitigate this specific point, it would have to be one that can easily scale to multiple languages and data formats.

 

Ability to relate data to a topic

As humans, we can’t process large amounts of data. If a brand has 50,000 feedback data points a month about a product (600,000 a year – which is not outrageous), we wouldn’t expect a person to review these data points, memorize them and summarize them for peers. It’s just too much. We need the help of technology. But what type?

Most intelligent text-processing technologies out there rely on people (hence the growing number of analysts) to define these groups of topics – typically a core of 8-12 topics that are common practice, such as price, service, quality, etc. But consumers are not limited to these topics, which means lots of data is left out of the feedback circle.

Ideally, what we need here is technology that can automatically decipher the topics your consumers are talking about and serve them back to you without human prep or bias.
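
As a rough sketch of what “deciphering topics without human prep” can look like, here is unsupervised topic modeling with scikit-learn’s LDA. This is one possible approach under assumptions of our own, not a description of any particular vendor’s method:

```python
# Illustrative sketch of unsupervised topic discovery with LDA (scikit-learn).
# One possible approach only; the sample reviews are made up for the example.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

reviews = [
    "battery drains fast and the phone gets hot",
    "screen is bright and the colors are amazing",
    "customer service never answered my emails",
    "charging is slow but the display looks great",
    "support was rude when I asked for a refund",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(reviews)

lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(doc_term)

# Print the top words per discovered topic - no human-defined topic list needed.
terms = vectorizer.get_feature_names_out()
for idx, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {idx}: {', '.join(top)}")
```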

 

Learn to recognize latest ways to talk about a topic

Another issue with topic recognition set up by humans is recognizing new ways to talk about something. Millennials and newer generations keep inventing new ways to express themselves. A product can be “cool”, “good”, “great”, “solid” or “dope” – how do we keep up? One way is to continue to rely on humans to learn the new phrases, implement them into systems and track the new topics. That is time consuming, we may miss market feedback or opportunities in the meantime, and it requires us to keep piling on analysts.

If we wanted a technology that helps us mitigate this specific point, it would have to be one that can learn to recognize new ways of saying “good” or “bad”, as well as new discussion topics worthy of brand attention.

 

Understanding of sentiment

Similar to the previous point, sentiment can be expressed in many ways, formats and languages, and sometimes feedback lacks sentiment altogether. To correctly identify and keep up with feedback, you need a flexible way to pick up on new forms of sentiment as they appear (not in retrospect), and also to recognize when no sentiment is included.
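
As a small, purely illustrative sketch of sentiment scoring that also flags feedback with no clear sentiment, here is NLTK’s VADER analyzer (the neutrality band is a common convention, not a requirement):

```python
# Illustrative sentiment sketch using NLTK's VADER analyzer.
# The +/-0.05 neutrality band follows VADER's commonly cited convention.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

feedback = [
    "This phone is dope, battery lasts forever",
    "Died on me in the middle of the work day",
    "I bought it at the mall last Tuesday",   # carries no sentiment at all
]

for text in feedback:
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        label = "positive"
    elif compound <= -0.05:
        label = "negative"
    else:
        label = "no clear sentiment"
    print(f"{label:>18}: {text}")
```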

 

Conclusion

Current market research technologies rely on humans and thus are slow to set up, miss a lot of things and are slow to adapt. Revuze is an innovative technology vendor that addresses just this with a self-learning, fast-setup, low-touch solution that typically delivers 5-8X the data coverage of anything else, and it does it without humans.
