If you want to understand rim weighting or target weighting better, read my earlier articles on those topics first.
My previous two articles on rim weighting, target weighting and effective sample sizes have led to several people getting in touch with me. Our free rim weighting calculator has now been sent to over 200 individuals or companies. As a responsible provider of this type of information, we supply both the rim weighting and target weighting calculators with warnings about their use. Let’s look at those issues in a bit more detail, and you can test yourself to see whether weighting matters and whether you need to think about it more before you use it.
Weighting is a commonly used method in market research to correct samples that do not match the population they are meant to represent. The underlying problem is always the same, an unrepresentative sample, but it can arise for several different reasons. These include:
- The sample was self-selecting meaning that the sample may not be representative
- The sampling was carried out wrongly
- Some quotas were difficult or impossible to fill
- You ran out of time and the client wanted the data!
Weighting the data is a solution, but whether it is a good solution may be a different thing altogether. The key point is that as soon as you weight data, you reduce the effective sample size, unless you already have a perfect sample. The effective sample size, in a nutshell, is the size of a correctly drawn sample that would give you data as robust as your weighted data. In other words, if you had been able to collect data from the right sample, you could have interviewed that smaller number of people (and, let it not be forgotten, saved some money). You start with, say, 400 respondents, but after weighting, your effective sample size is, say, 340. In this example, you could have sampled 340 respondents correctly and had the same level of representativeness from your data.
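For anyone who wants to see the arithmetic, a widely used way to calculate the effective sample size is Kish's approximation: the square of the sum of the weights divided by the sum of the squared weights. The sketch below assumes that formula (your software may calculate it differently), and the function name and the uneven weights in the second call are made up purely for illustration.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# With every weight equal, nothing is lost: 400 respondents stay 400.
print(effective_sample_size([1.0] * 400))  # 400.0

# Hypothetical uneven weights: half the sample weighted up, half weighted down.
print(effective_sample_size([1.5] * 200 + [0.5] * 200))  # 320.0
```

The further the weights move away from 1, the bigger the sum of squared weights becomes and the smaller the effective sample size.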
I am now going to have a brief rant and, then, I will stop. I have seen researchers look at their sample after fieldwork, realise that it is way off what it should be and glibly tell the data processing team to weight the data without a thought. It fixes the problem on the surface, but if the effective sample size turns out to be far less than the original sample size, you may not be delivering data as robust as your client thinks. And, have I mentioned it, you may have wasted a lot of money in fieldwork costs? If a sample of 400 respondents becomes an effective sample size of 340, that’s 60 interviews at $x per interview, which means that if the sampling could have been controlled, you could be $x times 60 better off without reducing the quality of the data. Although, can you tell the client that? Anyhow, rant over.
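To put a number on that, here is a minimal sketch of the sum. The function name is hypothetical and the cost of 25 per interview is an invented figure standing in for $x.

```python
def fieldwork_overspend(sample_size, effective_sample_size, cost_per_interview):
    """Cost of the interviews that weighting effectively discards."""
    return (sample_size - effective_sample_size) * cost_per_interview

# 400 interviews, effective sample size 340, hypothetical cost of 25 per interview.
print(fieldwork_overspend(400, 340, 25))  # 1500
```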
What I am saying is that it is essential, yes essential, that the effective sample size is checked. Our software products, MRDCL and QPSMR, both allow you to see the effective sample size when you run analysis. If the software you use doesn’t show you this figure, I would recommend that you find a way to check it or, if you wish, use our free Excel target weighting and effective sample size calculator. Yes, it’s free for anyone to use.
The main thing that will affect the effective sample size is the range of the weights that have to be applied. Try these three tests to see whether you can guess the effective sample size.
OK, so no cheating, let’s see if you can guess the effective sample sizes in these three examples.
Example 1 – You want 50% males and 50% females, but your sample is 400 males and 600 females. What will the effective sample size be?
Example 2 – You set rim weighting targets of 50% males, 50% females and 40% 16-34, 30% 35-54 and 30% 55+. Your survey collects data from 300 Males 16-34, 200 Males 35-54, 100 Males 55+, 200 Females 16-34, 100 Females 35-54 and 100 Females 55+. What effective sample size will you have from your 1000 respondents?
Example 3 – You want to weight to a matrix of gender within three age groups. You want a target of 20% Males 16-34, 15% Males 35-54, 10% Males 55+, 30% Females 16-34, 15% Females 35-54 and 10% Females 55+. However, it’s a holiday travel survey and older respondents are, perhaps, more likely to respond. Your actual sample is 5% Males 16-34, 15% Males 35-54, 25% Males 55+, 15% Females 16-34, 10% Females 35-54 and 30% Females 55+. What is the effective sample size if you had 2000 respondents?
Well, example 1 is not too bad: you will have an effective sample size of 960 from your sample of 1000. Although weights of 1.25 and 0.83 need to be applied to males and females respectively, this does not reduce your effective sample size too much.
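If you want to check example 1 yourself, the sketch below works through it, assuming the effective sample size is Kish's approximation (the squared sum of the weights divided by the sum of the squared weights). The variable names are my own. Working the weights out exactly gives 1.25 for males and about 0.83 for females.

```python
# Example 1: target 50% males / 50% females; achieved 400 males, 600 females.
counts = {"male": 400, "female": 600}
targets = {"male": 0.5, "female": 0.5}

n = sum(counts.values())
# Weight for a cell = target count / achieved count.
weights = {cell: targets[cell] * n / counts[cell] for cell in counts}

# Every respondent in a cell carries that cell's weight.
sum_w = sum(counts[c] * weights[c] for c in counts)
sum_w2 = sum(counts[c] * weights[c] ** 2 for c in counts)
n_eff = sum_w ** 2 / sum_w2

print(weights)       # male 1.25, female about 0.83
print(round(n_eff))  # 960
```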
In example 2, the effective sample size is about 900 from your 1000 respondents. In other words, if you could have controlled the sample, you could have reduced fieldwork costs by 10%. Again, it’s not too bad, but let’s look at example 3.
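Example 2 needs rim weighting, which is usually done by adjusting the weights to one rim, then the other, and repeating until both sets of targets are met. The sketch below is one simple way to do that, not necessarily how any particular software package does it. It assumes the Females 16-34 cell holds 200 respondents (so that the six cells add up to the stated 1000), assumes Kish's approximation for the effective sample size, and uses variable and function names of my own.

```python
# Achieved counts by interlocking cell (gender, age band).
# Assumption: Females 16-34 is 200, so that the cells sum to the stated 1000.
counts = {
    ("M", "16-34"): 300, ("M", "35-54"): 200, ("M", "55+"): 100,
    ("F", "16-34"): 200, ("F", "35-54"): 100, ("F", "55+"): 100,
}
gender_targets = {"M": 0.5, "F": 0.5}
age_targets = {"16-34": 0.4, "35-54": 0.3, "55+": 0.3}

n = sum(counts.values())
weights = {cell: 1.0 for cell in counts}

def rake(targets, position):
    """Scale weights so the weighted totals on one rim match its targets."""
    for category, share in targets.items():
        cells = [c for c in counts if c[position] == category]
        weighted_total = sum(counts[c] * weights[c] for c in cells)
        factor = share * n / weighted_total
        for c in cells:
            weights[c] *= factor

# Alternate between the two rims; 50 passes is an arbitrary, generous cap.
for _ in range(50):
    rake(gender_targets, 0)
    rake(age_targets, 1)

sum_w = sum(counts[c] * weights[c] for c in counts)
sum_w2 = sum(counts[c] * weights[c] ** 2 for c in counts)
n_eff = sum_w ** 2 / sum_w2

for cell, w in sorted(weights.items()):
    print(cell, round(w, 2))
print(round(n_eff))  # close to 900
```

Printing the weight for each interlocking cell is worth the extra two lines, because it shows which groups are being stretched the most.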
In example 3, the sample size is 2000, but the effective sample size is approximately 1082. And you can see why. Some big and small weights need to be applied to different respondents to achieve the targets you need. For example, males 16-34 need to be scaled up by a factor of 4 to meet their true sample profile, whereas females 55+ are over-represented by a factor of 3, meaning that they need a weight of 0.33. It might have been impossible to control the sample if this was a self-completion survey, but you can see that if sampling could have been controlled, the sample could have been almost halved.
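The figures for example 3 can be reproduced with the short sketch below, which divides each cell's target percentage by its achieved percentage to get the weight and then applies Kish's approximation for the effective sample size. The formula is my assumption here and the variable names are illustrative.

```python
cells = ["M 16-34", "M 35-54", "M 55+", "F 16-34", "F 35-54", "F 55+"]
target_pct = [20, 15, 10, 30, 15, 10]
actual_pct = [5, 15, 25, 15, 10, 30]
n = 2000

# Weight per cell = target share / achieved share.
weights = [t / a for t, a in zip(target_pct, actual_pct)]
# Number of respondents actually achieved in each cell.
counts = [n * a / 100 for a in actual_pct]

sum_w = sum(c * w for c, w in zip(counts, weights))
sum_w2 = sum(c * w * w for c, w in zip(counts, weights))
n_eff = sum_w ** 2 / sum_w2

for cell, w in zip(cells, weights):
    print(f"{cell}: weight {w:.2f}")
print(round(n_eff))  # 1082
```

The weights run from 0.33 up to 4.00, and it is that spread that drags the effective sample size down.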
Whilst I always recommend checking the effective sample size when weighting data, the main indicator of a much-reduced effective sample size is one or more cells that are heavily under-represented or over-represented. Typically, the more you “stretch” your weighting, the more likely it is that one or more of your cells will be far from the desired target. With rim weighting, it is easy to plug in five or six variables, but it is just as easy to forget to check the interlocking cells formed by those rim targets, where it is possible (and even probable) that big or small weights will be needed.
Rim weighting can have further problems when there is a high correlation between two or more variables. For example, let’s say your rim targets are set for age, gender, region, type of car owned and income. There may be a high correlation between owners of luxury cars and income levels and, perhaps, age. This can cause data to be stretched and extreme weights to be needed to reach your desired targets.
Rim and target weighting can be a good solution when you have a big sample size. It may reduce your effective sample size substantially, but if you are starting with a good-sized sample, that may not be a problem. However, with smaller samples and rim weighting targets that have too many variables or categories within each target, the results are likely to be more extreme, thus reducing your effective sample size. Weighting can be a solution, but it should be used with care. There are fuller explanations in our target weighting and rim weighting blog articles which both have free Excel working models.
If you need help with your weighting, either advice on what to do or you need software to carry out the calculations, I will be pleased to offer our free software tools or (not free unless it is quick advice!) consultancy services.