We surely all know the feeling of disappointment when the best chocolates in the Celebrations box are all gone and there’s only the rubbish ones left in the tub. Being particularly fond of the Maltesers and Galaxy chocolates myself, I’ve often felt a little suspicious that there might not be as many of them in a box as there are of the other chocolates. But I’ve never had a chance to put my suspicions to the test…
This all changed a couple of days ago, when we came into possession of a 2.4 kg box of Celebrations (!). So, for fun, my partner and I decided to count up the number of each type of chocolate in the box. (You might take this as a sign of social distancing-induced boredom, but no, this kind of behaviour is pretty much the norm for us.)
And with that, we finally had the data we’d been longing for:
There were 267 chocolates in total in the box, and 8 types of chocolate. If there were equal numbers of each type of chocolate, that would mean 33.375 chocolates of each type - or, to be more realistic, somewhere between 32 and 34 of each type.
But in our box, there were a whopping 52 Mars Bars, and 44 Milky Ways… and only around 20 each of Galaxy and Galaxy Caramel. Unfortunate, since neither of us likes Mars Bars or Milky Ways all that much.
Now, the distribution of chocolates within our box made me even more suspicious. Could Mars Inc be overfilling Celebrations with certain types of chocolate at the expense of others?
Nothing that a good bit of statistics and probability can’t help us answer!
Let’s first suppose that the company aren’t hand-picking every single chocolate to ensure the same number of each chocolate type in each box.
Instead, let’s suppose the company produce equal amounts of each type of chocolate, and a random selection goes into each box.
In this scenario, even though the number of each type of chocolate should usually turn out roughly equal, we’d expect some boxes to contain surprisingly more of some types than others.
But how likely would we be to see a distribution of chocolate types as unequal as the one we observed in our own Celebrations box?
Since our data consists of counts in categories, we can answer this question with a chi-squared goodness-of-fit test comparing our observed frequencies with the expected frequencies (i.e. equal numbers of each type of chocolate).
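To get a feel for how much boxes would vary under this assumption, here is a minimal simulation sketch. It assumes each of the 267 chocolates is an independent draw with probability 1/8 per type; the names `n_boxes`, `box_size`, `sim_counts` and `largest_count`, and the seed, are my own choices for illustration.

```r
# Simulate boxes filled at random from 8 equally likely chocolate types.
# Assumption: each chocolate is an independent draw with probability 1/8 per type.
set.seed(42)        # arbitrary seed, only so the sketch is reproducible

n_boxes  <- 10000   # number of simulated boxes (illustrative choice)
box_size <- 267     # chocolates in our box
n_types  <- 8

# One column per simulated box, one row per chocolate type
sim_counts <- rmultinom(n_boxes, size = box_size, prob = rep(1 / n_types, n_types))

# How big does the most common type get in a typical random box?
largest_count <- apply(sim_counts, 2, max)
summary(largest_count)
```

The summary gives a rough idea of what "surprisingly more" looks like in a randomly filled box.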
chisq.test(table, p = rep(0.125, 8))
##
##  Chi-squared test for given probabilities
##
## data:  table
## X-squared = 27.142, df = 7, p-value = 0.0003142
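To see where that p-value comes from, here’s a rough sketch that rebuilds it from the reported statistic, and works out how much of the statistic is down to the two largest counts quoted earlier (52 Mars Bars and 44 Milky Ways). The function and variable names are mine, and only those two cells are computed since they are the exact counts given in the text.

```r
# Expected count per type if all 8 types were equally likely
expected <- 267 / 8                      # 33.375

# Contribution of a single cell to the chi-squared statistic: (O - E)^2 / E
cell_contribution <- function(observed, expected) {
  (observed - expected)^2 / expected
}

mars_part      <- cell_contribution(52, expected)
milky_way_part <- cell_contribution(44, expected)
mars_part + milky_way_part               # compare with the total statistic of 27.142

# p-value: upper tail of a chi-squared distribution with 8 - 1 = 7 degrees of freedom
p_value <- pchisq(27.142, df = 7, lower.tail = FALSE)
p_value
```

Those two types alone account for roughly half of the statistic.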
With a p-value of 0.0003, a distribution at least as lopsided as ours looks pretty unlikely if the chocolates really are selected at random from an equal number of each chocolate type.
Next, focussing just on the Mars Bars - since they’re the most frequent (and most suspicious) chocolate in our box by far… if the chocolate selection is made at random, from a pool of chocolate types of equal number, what’s the probability of ending up with 52 Mars Bars in a box of 267?
Here, we have a binary outcome: Mars, or not Mars. This therefore becomes a classic binomial problem. If each type of chocolate is equally likely to be selected, the probability p of picking a Mars Bar is 1/8, or p = 0.125. The probability of picking something other than a Mars Bar is 1 - p = 0.875.
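As a quick sanity check before reaching for the formula, here is a minimal sketch of the expected count and spread under this binomial model (the variable names are mine):

```r
n <- 267       # chocolates in the box
p <- 1 / 8     # chance that any one chocolate is a Mars Bar

expected_mars <- n * p                   # mean of a binomial: n * p
sd_mars       <- sqrt(n * p * (1 - p))   # standard deviation: sqrt(n * p * (1 - p))

# How many standard deviations above the expected count is 52?
(52 - expected_mars) / sd_mars
```

Our 52 Mars Bars sit roughly 3.4 standard deviations above the expected 33.375.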
Then we can calculate the probability of picking 52 Mars Bars out of 267 using this formula:
$$P(X = k) = \frac{N!}{k!\,(N-k)!}\, p^{k} (1-p)^{N-k}$$
where N is the total number of chocolates and k is the number of successful outcomes (in this case, the number of Mars Bars).
Although we can calculate this by hand, R provides a handy function, dbinom(), to calculate binomial probabilities.
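Plugging our numbers into the formula, a by-hand version might look like this (a sketch; `choose()` is base R’s binomial coefficient, the other names are mine):

```r
n <- 267     # total chocolates in the box
k <- 52      # Mars Bars observed
p <- 0.125   # probability of a Mars Bar if all 8 types are equally likely

# N! / (k! (N - k)!) * p^k * (1 - p)^(N - k)
by_hand <- choose(n, k) * p^k * (1 - p)^(n - k)
by_hand
```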
dbinom(52, size = 267, prob = 0.125)
## [1] 0.0003406686
So, the probability of obtaining 52 Mars Bars in a box of 267 Celebrations, if all types of chocolate are equally likely to be picked, is about 0.0003.
Now, this is only the point probability, aka the probability of obtaining exactly 52 Mars Bars in the box, but we can also calculate the probability of obtaining 52 or more Mars Bars in our Celebrations box to give us a p-value:
# with lower.tail = FALSE, pbinom() gives P(X > q), so q is set to 51 to get P(X >= 52)
pbinom(51, size = 267, prob = 0.125, lower.tail = FALSE)
## [1] 0.0007809731
Not looking much better.
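The off-by-one in that pbinom() call is easy to get wrong, so here’s a small check (my own sketch) that the upper tail starting above 51 really is the probability of 52 or more, by summing the point probabilities directly:

```r
n <- 267
p <- 0.125

# P(X >= 52) as an explicit sum of point probabilities for 52, 53, ..., 267
tail_by_sum <- sum(dbinom(52:n, size = n, prob = p))

# The same quantity via the upper tail: P(X > 51) = P(X >= 52)
tail_by_pbinom <- pbinom(51, size = n, prob = p, lower.tail = FALSE)

all.equal(tail_by_sum, tail_by_pbinom)
```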
To make even clearer how unlikely this is, here’s where our outcome sits on a probability distribution:
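A rough base-R sketch of such a plot, assuming the binomial model above, could look like this. The plotting range, colours and labels are my own choices, not necessarily those of the original figure.

```r
n <- 267
p <- 0.125
observed <- 52

# Point probabilities for a plausible range of Mars Bar counts
k     <- 0:70
probs <- dbinom(k, size = n, prob = p)

# Bar chart of the distribution, with our observed count highlighted in red
bar_cols <- ifelse(k == observed, "red", "grey70")
barplot(probs, names.arg = k, col = bar_cols, border = NA,
        xlab = "Number of Mars Bars in a box of 267",
        ylab = "Probability")
```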
At this point, it’s looking more and more like the selection of chocolates in the box is biased.
However, I am basing my inference on just a single sample. To be more certain, I’d need to do this with more Celebrations boxes… but with 2.4 kg of chocolate, it doesn’t look like we’ll be buying more anytime soon!
In the absence of more samples, I decided to collect anecdotal evidence from Google. My search revealed that other people have had similar experiences to ours:
First, I found some stellar reporting from the Liverpool Echo here, confirming an abundance of Mars Bars in their sample.
Then, I found this hero who also counted up the number of chocolates in his Celebrations box, and even got 13 more people involved! And what do you know, Mars Bars were again the clear winner.
But why is this going on? Why are our Celebrations boxes overfilled with Mars Bars?
Now, a simple answer could be that Mars Bars are in fact the most popular chocolate type, and the company are merely catering to the tastes of the nation.
It turns out, YouGov have conducted a poll on this subject.
The results possibly won’t shock you:
Maltesers are the nation’s favourite, across all age groups - and also my personal favourites, so I’m clearly basic af.
Mars, on the other hand, is the nation’s least favourite…
Clearly this isn’t a case of the company catering to the nation’s tastes… so what else could explain this? Is it an attempt to change public opinion? Are Mars Bars cheaper to produce? Or is it all designed to encourage us to buy more boxes, in the hope that the next one is the one, the mythical box with 52 Maltesers?
We may never know. But one thing I do know is that I won’t be sharing my Maltesers.
This blogpost is all made in the name of fun and probability - I have no beef with Mars Inc.