This article looks at the quality of data in social listening analytics. Getting hold of data these days is easy, and data is plentiful. Getting bad data is just as easy, and so is misinterpreting it. The argument that big data must be accurate because there is so much of it is flawed: wrong assumptions and inferences about the data can lead to huge mistakes.
Big data is often out of date – hugely so, according to a Deloitte survey in which respondents were asked to review data held about them. Computer systems make bad judgements, or no judgement at all: when I ordered three books on sport on my wife’s Amazon account, she was inundated with offers for more books on a topic in which she has no interest. And let’s not forget the false logic that is drawn from big data and then applied to thousands or millions of people. Add to this that, with enough data, you can probably bend it to highlight or “say” what you want. Big data offers a lot, but it must be used responsibly.
Market research has generally traded on small data – small samples with lots of questions and data fields. The number of respondents is often less than 1000, sometimes less than 100. The data has (hopefully) been sampled and checked carefully, but it is still open to misinterpretation and to samples that are too small, particularly small sub-samples within the full sample. The advantage that small data (like market research data) has over big data is that most practitioners in the field are arguably more aware that they need to treat their data with care. There are also organisations like ESOMAR and the national associations, which cannot guarantee high standards but contribute hugely to ensuring that research professionals work to them.
Where does this leave social listening data? Social listening data is taken from public posts in social media. The number of posts analysed will usually run to many thousands or even millions. So, if we are cautious about the problems with big data cited above, will everything be OK?
Being out of date isn’t a risk with social listening data, assuming that it is properly date-stamped. The risk of inaccuracy within the data is low, as real people really did post the content that is being analysed. There’s a risk that people are lying, but that’s true of any survey data or, in fact, most data. Lying is less likely when giving opinions about brands than, perhaps, on a matchmaking website where someone might want to appear as nice as me (ahem!).
The single biggest risk to social listening data is that it is not filtered to remove irrelevant data. It is no coincidence that there are around 1000 companies offering social listening analysis and reporting: grabbing data and producing some results is an easy business. Most of those companies (let’s estimate 98% or more) know nothing about market research and its standards, which is why choosing your supplier is important.
Filtering out irrelevant data is an important part of the process and needs to be in place before a project goes live. There are many examples of huge errors when data is not filtered for relevance: Apple (the computer company) vs. apple (the fruit); Dove (the soap) vs. Dove (the chocolate) vs. dove (the bird); Delta (the airline) vs. Delta (loudspeakers) vs. Delta (credit card company) vs. delta (as in posts by tourists visiting river deltas). There are many more examples, of course.
Removing this irrelevant data is of prime importance. It is achieved by text analytics, whereby programs are written to exclude irrelevant data. This raises the question “Can it be perfect?” The answer is “no”; believing that any significant volume of data is 100% accurate is simply ignoring the truth. Getting close to 100% accuracy, however, is the key here.
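To make the idea of a relevance filter concrete, here is a minimal sketch in Python. It is not how any particular supplier does it; real filters are far more sophisticated. The word lists, the function names and the sample posts are all invented for illustration, and the rule is deliberately crude: keep a post that mentions "apple" only if it also contains a brand-related word and no fruit-related word.

```python
import re

# Hypothetical context vocabularies for the brand "Apple" (illustrative only).
BRAND_CONTEXT = {"iphone", "ipad", "mac", "macbook", "ios", "app", "store"}
NOISE_CONTEXT = {"pie", "juice", "orchard", "cider", "fruit", "tree"}


def tokens(post):
    """Lower-case the post and return its words as a set."""
    return set(re.findall(r"[a-z0-9']+", post.lower()))


def is_relevant(post, keyword="apple"):
    """Keep a post only if it mentions the keyword, contains at least one
    brand-context word and contains no noise-context word."""
    words = tokens(post)
    if keyword not in words:
        return False
    return bool(words & BRAND_CONTEXT) and not (words & NOISE_CONTEXT)


# Made-up posts, not real social media data.
posts = [
    "My new Apple MacBook battery lasts all day",
    "Baked an apple pie with fruit from the orchard",
    "Apple juice is on sale this week",
    "The Apple store replaced my iPhone screen",
]

relevant = [p for p in posts if is_relevant(p)]
for p in relevant:
    print(p)
```

A rule this simple will wrongly drop a post such as "Apple should sell juice at the store", which is exactly why a filter needs to be checked against real posts before a project goes live.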
The accuracy of social listening data may vary between 30% and 90% depending on your supplier. If irrelevant data is not excluded, results will be at best misleading and at worst total garbage. Complex algorithms are built to process social listening data so that only relevant data is used in analysis.
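One simple way to put a number on accuracy is to have a person review a sample of the posts that a filter let through and count how many are actually about the brand. The sketch below assumes that "accuracy" means exactly that share, which is one reasonable reading rather than a formal industry definition. The labels are made up for illustration and the function name is hypothetical.

```python
def share_relevant(reviewed_labels):
    """Share of reviewed posts that a human marked as relevant (True)."""
    if not reviewed_labels:
        raise ValueError("need at least one reviewed post")
    return sum(reviewed_labels) / len(reviewed_labels)


# Made-up review of ten posts that passed a filter: True means relevant.
sample = [True, True, False, True, True, True, False, True, True, True]

accuracy = share_relevant(sample)
print(f"{accuracy:.0%} of the reviewed posts were relevant")
```

In practice the reviewed sample would need to be much larger than ten posts, and drawn at random, for the figure to mean anything.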
Another concern when analysing social listening data is that posters tend to have extreme views that are not representative – most posters are either delighted or angry. There is some truth in this, but it is often exaggerated. One of the key measures that a good social listening project produces is a Net Sentiment Score. This is not dissimilar in concept to the widely used Net Promoter Score, which, as it happens, also works from the extremes by subtracting detractors from promoters.
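The two scores can be compared side by side in a short Python sketch. The Net Promoter Score calculation follows the standard definition (percentage of ratings of 9 or 10 minus percentage of ratings of 0 to 6). The Net Sentiment Score formula is an assumption made for illustration: positive posts minus negative posts as a percentage of all posts. Suppliers may define it differently, and the figures below are invented.

```python
def net_promoter_score(ratings):
    """NPS on a 0-10 scale: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


def net_sentiment_score(positive, negative, neutral=0):
    """Assumed definition: (positive - negative) posts as a % of all posts."""
    total = positive + negative + neutral
    if total == 0:
        raise ValueError("need at least one post")
    return 100 * (positive - negative) / total


# Invented example data.
nps = net_promoter_score([10, 9, 8, 7, 6, 3, 10, 9, 5, 8])
nss = net_sentiment_score(positive=600, negative=250, neutral=150)
print(nps, nss)
```

Both scores run from -100 to +100, and both ignore the middle ground (ratings of 7 and 8 in one case, neutral posts in the other) except as part of the total.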
The benefit of the Net Sentiment Score is that it is a much more sensitive reflection of what is happening. Whereas the Net Promoter Score, particularly in sectors such as finance, insurance and FMCG, hardly moves from year to year in many cases, the Net Sentiment Score will give real insight into what is happening at a given time. And there will be a reason behind any movement which, if acted upon, can benefit a company hugely.
Social listening data is different from other big data, and from a lot of other market research data. If handled professionally, it offers more insight than many other forms of research and analysis. Its accuracy will depend on the methodologies used to process the data. It’s a market where there are plenty of poor sources of data, but that doesn’t mean that it is to be avoided. Choosing a professional tool like our Listening247 service can give real and accurate insights.
To find out more about our Listening247 service, ask us for a free brand report that will give you an overview of a brand or service that you would like to research. Contact firstname.lastname@example.org for more information.