Sunday, June 26, 2016

Lies, damned lies and bad statistics: analysis of the EU referendum result has been mostly terrible

I'm still recovering from the shock of Thursday's EU referendum vote, and am becoming increasingly irritated at the level of the debate since the result was known on Friday - some of the statistical (let alone political) analysis of the result has been abysmal, betraying individual prejudices very nicely indeed. Before I launch into mine, I admit straight off that this post comes from my perspective as someone strongly in favour of remain, and needs to be read with that in mind.

This post is also not about the standard of debate during the campaign. Lots has been written about that, and Full Fact have done an excellent job in debunking the myths, including those in the picture above.

I'm going to take some of the things I've seen and heard since the referendum and give my perspective on each one, explaining why it's at best misleading and at worst downright nonsense, from a statistical perspective. I'll try to keep to that, but personal opinion will inevitably crop up.

1. "We must accept the (overwhelming) view of the British people"

I've heard this bandied about by politicians more than once since Friday and it's becoming increasingly irritating. In fact 51.9% of those who voted chose for the UK to leave the EU. In no way is that "overwhelming". Yes, it's the result of the referendum and politicians have to abide by it, but to talk of acceptance is to deny the existence of the 16 million people who voted for the UK to remain. Our voices need to be heard too.

2. "But only 37.4% of people voted leave so the result can't be valid"

Irritating though it is that almost 13 million registered voters didn't vote in the referendum, the result is the result. You could equally argue that only 34.7% of the electorate voted remain. The fact is that you can't say anything about the opinions of the people who didn't vote. We simply don't know what their preference is. No vote, no say.
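For anyone who wants to check where these percentages come from, here is a rough sketch of the arithmetic. The vote totals are the declared UK-wide figures as I understand them; they are not quoted elsewhere in this post, so treat them as my assumption and check them against the official declaration. The `pct` helper is just a name I've made up for illustration.

```python
# Declared UK-wide totals as I understand them (an assumption, not quoted in this post).
ELECTORATE = 46_500_001   # registered voters
LEAVE = 17_410_742        # valid votes for leave
REMAIN = 16_141_241       # valid votes for remain
REJECTED = 25_359         # rejected ballot papers


def pct(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


valid_votes = LEAVE + REMAIN
ballots_cast = valid_votes + REJECTED
did_not_vote = ELECTORATE - ballots_cast

# The same leave vote looks very different depending on the denominator.
print("Leave as % of valid votes:", pct(LEAVE, valid_votes))
print("Leave as % of electorate: ", pct(LEAVE, ELECTORATE))
print("Remain as % of electorate:", pct(REMAIN, ELECTORATE))
print("Registered but did not vote:", did_not_vote)
```

The point is only that the choice of denominator changes the headline number; it tells you nothing about what the non-voters would have chosen.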

3. "Scotland voted to remain in the EU"

Nicola Sturgeon has been saying this a lot. I don't blame her (much). It suits her narrative and the aim of her party, which is of course to gain independence for Scotland. However, the vote was not on whether or not Scotland should remain in the EU, it was on whether the UK should remain in the EU. Sixty-two percent of people in Scotland voted for the UK to remain within the EU, with Scotland as a part of that larger entity. This is a subtle but important distinction. If Scotland is to become part of the EU as an independent country, a fair few things need to happen, one of which is that a second Scottish referendum votes in favour of leaving the UK. You could argue that this time the SNP would get what they want out of this, but there is still a significant proportion of people in Scotland who want to stay in the UK, and also (judging by the results of the referendum) a significant proportion who want the UK to leave the EU. Throw all this into the mix and nothing is certain.

By the way, Sturgeon is not the only person to make this mistake and talk of areas voting to leave or remain in the EU. I think she does it knowingly; others are not so bright.

4. "London should declare independence from the rest of England and stay in the EU"

Firstly, the vote was a national vote, about what the UK as a whole should do, so it certainly doesn't follow that because 59.9% of London voters backed the UK remaining, 59.9% of them would vote for London to become an independent city state. Secondly, there were plenty of people in London who voted leave, over 1.5 million of them in fact. Thirdly, London itself was split: 5 boroughs had a leave majority. There's clearly more going on than meets the eye. I'll save my views on Little London triumphalism for another day.

5. "Old people have screwed over the young/Cosmopolitan lefties are out of touch with everyone else"

When you go into the polling booth the only information you give is what is on the voting slip. In this case that was a simple leave/remain choice. You don't provide any demographic data, or anything about your job, voting intentions or anything else. All of the stories about how different age groups etc voted are either based on polls or, as in these Guardian charts, on characteristics of areas rather than of individuals. There are dangers with both of these analysis methods, and it's just not conceivable that all 16 million remainers are lefty urbanites, any more than all 17 million leavers are crabby old thickos. At best some of this analysis is misguided, at worst it's downright offensive.

6. "Places that voted leave have no right to ask for their funding to be protected."

a. Places didn't vote to leave or to remain - as I've said a couple of times earlier in this post this was a national vote, with the areas being used as a convenient way to count up the votes. Another option might have been to send all the boxes up to Manchester town hall and count them there. Frankly we might have heard less place-based nonsense if they had been.

b. I see nothing wrong in elected officials in areas that are set to lose out because of the loss of EU money attempting to seek assurances that their regeneration plans be protected. I'd do the same in their position.

c. Many of these places had no greater lead for leave over remain than remain's lead over leave in London. For example, 43.5% of voters in Cornwall backed remain. It cannot be the case that a split like that condemns that whole area any more than the 60/40 remain/leave vote in London means all Londoners are brilliant right-thinking individuals.
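To put numbers on the comparison in point c, here is a small sketch using only the percentages quoted in this post (43.5% remain in Cornwall, 59.9% remain in London). It assumes a straight two-way split of valid votes, and the `lead` function is my own illustrative helper.

```python
def lead(winner_pct):
    """Winner's lead in percentage points, given the winner's share of a two-way vote."""
    loser_pct = 100 - winner_pct
    return round(winner_pct - loser_pct, 1)


cornwall_remain = 43.5   # quoted above, so leave was the winning side
london_remain = 59.9     # quoted in point 4

cornwall_leave_lead = lead(100 - cornwall_remain)
london_remain_lead = lead(london_remain)

print("Cornwall: leave ahead by", cornwall_leave_lead, "points")
print("London: remain ahead by", london_remain_lead, "points")
```

On these figures leave's lead in Cornwall is smaller than remain's lead in London, which is the comparison being made above.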

Good and bad data visualisation

I've run through some of the things that have irritated me in the analysis of the referendum result. There's no doubt that poor statistical literacy and bad data visualisation have contributed to that. Presenting the results as "first past the post"-type results by area has led to some really misleading conclusions.

This regional-level analysis (from the BBC but it could have been from anywhere as everyone has been presenting the results like this) masks big variations at sub-regional level and has added fuel to the fire for nonsense graphics like this:

I mean, seriously. Over 2,500 retweets and rising, *come on* people.

This graphic from the BBC shows the local authority breakdowns that make it clear why that little gif showing London, Scotland and Northern Ireland letting the rest of the UK leave the EU by themselves is nonsensical. Both London and Northern Ireland had areas with a majority of leave votes.

However, because this map presents the results for each local authority area as either majority leave or majority remain, it's also misleading. It's this kind of presentation that leads the mind to think that everyone in those areas is either for one side or the other. In fact the picture is much more subtle than that. This Guardian graphic goes some way to addressing the problem:

It divides leave/remain areas into those with a >15% majority and the rest, but it still uses a blue/yellow dichotomy that is unhelpful. That idea of showing areas as having equal population size is interesting, and people seem to like it, but that in itself is misleading.

Another way of presenting the same information comes from Views of the World.

This map shows areas resized according to population size. I quite like this, although it does have the tendency to make London look like a giant boil in the south east. I still don't like the leave/remain dichotomy presented like this. I think it would be better to present either the percentage leave or the percentage remain by area. This gets away from the idea that whole areas plumped one way or the other and allows for the fact that, for example, Greenwich's remain percentage (55.6) was closer to Cornwall's (43.5) than it was to Lewisham's (69.9) yet all these maps would lead you to lump Lewisham and Greenwich together in cosmopolitan London and leave Cornwall out in the cold (see point 6 above).
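Here is a minimal sketch of why the leave/remain dichotomy throws information away, using only the three remain percentages quoted above. The `label` and `gap` helpers are my own names for illustration, not anything the mapmakers use.

```python
# Remain share of the vote (%), as quoted in this post.
remain_pct = {"Lewisham": 69.9, "Greenwich": 55.6, "Cornwall": 43.5}


def label(pct):
    """The two-colour map view: an area is simply 'remain' or 'leave'."""
    return "remain" if pct > 50 else "leave"


def gap(a, b):
    """Difference in remain share between two areas, in percentage points."""
    return round(abs(remain_pct[a] - remain_pct[b]), 1)


# Which of the other two areas is Greenwich actually closest to?
others = [area for area in remain_pct if area != "Greenwich"]
closest = min(others, key=lambda area: gap("Greenwich", area))

for area, pct in remain_pct.items():
    print(area, pct, label(pct))
print("Greenwich to Cornwall:", gap("Greenwich", "Cornwall"), "points")
print("Greenwich to Lewisham:", gap("Greenwich", "Lewisham"), "points")
print("Closest to Greenwich:", closest)
```

The binary label puts Greenwich with Lewisham, while the percentages put it nearer Cornwall, which is why I'd rather see the percentage mapped directly.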

The BBC did do a map of percentage remain. Unfortunately the colour scheme means it looks rather insipid.

London doesn't stand out as being quite so different now (admittedly this is partly because this is a traditional map rather than a cartogram), and you can see there are loads of areas being condemned as racist strongholds that had 40-49% votes for remain.

Top 20s

This graphic from the BBC shows the 20 areas with the highest remain vote share.

And this the 20 with the highest leave:

Those crowing about cosmopolitan London might like to note that areas in London appear within both these lists. London as a city is as divided as the country.

A footnote: Divided London?

Newham and Tower Hamlets are also interesting cases. They have similar levels of poverty and deprivation, yet their votes in this referendum are quite different. Tower Hamlets voted 32.5% leave, 67.5% remain, whereas Newham voted 47.2% leave, 52.8% remain. Both are lumped together as "remain" areas by the traditional analysis methods, but there's clearly something going on here that makes them different. Lewisham and Greenwich are also usually seen as pretty similar in most area classifications, but there's a 14 percentage point difference in their remain vote. I'd prefer to see these and other differences explored a little further rather than wasting any more time crowing about how London is so different to the rest of the country.

Edited to add: This tweet from Hidden London shows the extent of the divide clearly, mapping the remain vote share across the capital: