Notes and Quotes - Bad Science (Ben Goldacre)
Finished on 14/3/18 in Brisbane, Australia.
"At a homeopathic dilution of 200C (you can buy much higher dilutions from any homeopathic supplier) the treating substance is diluted more than the total number of atoms in the universe, and by an enormously huge margin. To look at it another way, the universe contains about 3 x 10^80 cubic metres of storage space (ideal for starting a family): if it was filled with water, and one molecule of active ingredient, this would make for a rather paltry 55C dilution."
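A quick back-of-envelope check of the arithmetic in the quote above, as a sketch: the universe volume (3 x 10^80 m³) is from the quote; Avogadro's number and the molar mass of water are standard constants, and each "C" step is a 1-in-100 dilution.

```python
import math

# An nC dilution means 1 part in 10^(2n), so 200C is 1 part in 10^400 --
# far more dilute than one molecule in all the matter in the universe.

AVOGADRO = 6.022e23          # molecules per mole
MOLAR_MASS_WATER = 18.0      # grams per mole
universe_m3 = 3e80           # figure quoted in the text

grams_of_water = universe_m3 * 1e6            # 1 m^3 of water ~ 10^6 g
molecules = grams_of_water / MOLAR_MASS_WATER * AVOGADRO

# One active molecule among `molecules` of water is a 1-in-10^log10(molecules)
# dilution; divide the exponent by 2 to express it as a C potency.
c_potency = math.log10(molecules) / 2
print(round(c_potency))      # ~55, matching the "rather paltry 55C" in the quote
```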
"The logo of the Cochrane Collaboration features a simplified ‘blobbogram’, a graph of the results from a landmark meta-analysis which looked at an intervention given to pregnant mothers. When people give birth prematurely, as you might expect, the babies are more likely to suffer and die. Some doctors in New Zealand had the idea that giving a short, cheap course of a steroid might help improve outcomes, and seven trials testing this idea were done between 1972 and 1981. Two of them showed some benefit from the steroids, but the remaining five failed to detect any benefit, and because of this, the idea didn’t catch on.
Eight years later, in 1989, a meta-analysis was done by pooling all this trial data. If you look at the blobbogram (in the logo on the previous page) you can see what happened. Each horizontal line represents a single study: if the line is over to the left, it means the steroids were better than placebo, and if it is over to the right, it means the steroids were worse. If the horizontal line for a trial touches the big vertical ‘nil effect’ line going down the middle, then the trial showed no clear difference either way. One last thing: the longer a horizontal line is, the less certain the outcome of the study was.
Looking at the blobbogram, we can see that there are lots of not-very-certain studies, long horizontal lines, mostly touching the central vertical line of ’no effect’; but they’re all a bit over to the left, so they all seem to suggest that steroids might be beneficial, even if each study itself is not statistically significant.
The diamond at the bottom shows the pooled answer: that there is, in fact, very strong evidence indeed for steroids reducing the risk - by 30 to 50 per cent - of babies dying from the complications of immaturity. We should always remember the human cost of these abstract numbers: babies died unnecessarily because they were deprived of this life-saving treatment for a decade. They died, even when there was enough information available to know that it would save them, because that information had not been synthesised together, and analysed systematically, in a meta-analysis."
"Throughout history, the placebo effect has been particularly well documented in the field of pain, and some of the stories are striking. Henry Beecher, an American anaesthetist, wrote about operating on a soldier with horrific injuries in a World War II field hospital, using salt water because the morphine was all gone, and to his astonishment, the patient was fine. Peter Parker, an American missionary, described performing surgery without anaesthesia on a Chinese patient in the mid-nineteenth century: after the operation, she ‘jumped upon the floor’, bowed, and walked out of the room as if nothing had happened."
"But these are just stories, and the plural of anecdote is not data."
"The philosopher Professor Harry Frankfurt of Princeton University discusses this issue at length in his classic 1986 essay 'On Bullshit'. Under his model, bullshit is a form of falsehood distinct from lying: the liar knows and cares about the truth, but deliberately sets out to mislead; the truth-speaker knows the truth and is trying to give it to us; the bullshitter, meanwhile, does not care about the truth, and is simply trying to impress us:
“It is impossible for someone to lie unless he thinks he knows the truth. Producing bullshit requires no such conviction… When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he considers his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are, except insofar as they may be pertinent to his interest in getting away with what he says. He does not care whether the things he says describe reality correctly. He just picks them out, or makes them up, to suit his purpose.”"
"It's a chilling thought that when we think we are doing good, we may actually be doing harm, but it is one we must always be alive to, even in the most innocuous situations. The paediatrician Dr Benjamin Spock wrote a record-breaking best-seller called Baby and Child Care, first published in 1946, which was hugely influential and largely sensible. In it, he confidently recommended that babies should sleep on their tummies. Dr Spock had little to go on, but we now know that this advice is wrong, and the apparently trivial suggestion contained in his book, which was so widely read and followed, has led to thousands, and perhaps even tens of thousands, of avoidable cot deaths. The more people are listening to you, the greater the effects of a small error can be. I find this simple anecdote deeply disturbing."
"Sometimes whole areas can be orphaned because of a lack of money, and corporate interest. Homeopaths and vitamin pill quacks would tell you that their pills are good examples of this phenomenon. That is a moral affront to the better examples. There are conditions which affect a small number of people, like Creutzfeldt-Jakob disease and Wilson disease, but more chilling are the diseases which are neglected because they are only found in the developing world, like Chagas disease (which threatens a quarter of Latin America) and trypanosomiasis (300,000 cases a year, but in Africa). The Global Forum for Health Research estimates that only 10 per cent of the world's health burden receives 90 per cent of total biomedical research funding. Often it is simply information that is missing, rather than some amazing new molecule. Eclampsia, say, is estimated to cause 50,000 deaths in pregnancy around the world each year, and the best treatment, by a huge margin, is cheap, unpatented magnesium sulphate (high doses intravenously, that is, not some alternative medicine supplement, but also not the expensive anticonvulsants that were used for many decades). Although magnesium had been used to treat eclampsia since 1906, its position as the best treatment was only established a century later in 2002, with the help of the World Health Organisation, because there was no commercial interest in the research question: nobody has a patent on magnesium, and the majority of deaths from eclampsia are in the developing world. Millions of women have died of the condition since 1906, and many of those deaths were avoidable."
"Frequently, journalists will cite 'thalidomide’ as if this was investigative journalism's greatest triumph in medicine, where they bravely exposed the risks of the drug in the face of medical indifference: it comes up almost every time I lecture on the media's crimes in science, and that is why I will explain the story in some detail here, because in reality - sadly, really - this finest hour never occurred.
In 1957, a baby was born with no ears to the wife of an employee at Grunenthal, the German drug company. He had taken their new anti-nausea drug home for his wife to try while she was pregnant, a full year before it went on the market: this is an illustration both of how slapdash things were, and of how difficult it is to spot a pattern from a single event. The drug went to market, and between 1958 and 1962 around 10,000 children were born with severe malformations, all around the world, caused by this same drug, thalidomide. Because there was no central monitoring of malformations or adverse reactions, this pattern was missed. An Australian obstetrician called William McBride first raised the alarm in a medical journal, publishing a letter in the Lancet in December 1961. He ran a large obstetric unit, seeing a large number of cases, and he was rightly regarded as a hero - receiving a CBE - but it's sobering to think that he was only in such a good position to spot the pattern because he had prescribed so much of the drug, without knowing its risks, to his patients.* By the time his letter was published, a German paediatrician had noted a similar pattern, and the results of his study had been described in a German Sunday newspaper a few weeks earlier.
Almost immediately afterwards, the drug was taken off the market, and pharmacovigilance began in earnest, with notification schemes set up around the world, however imperfect you may find them to be. If you ever suspect that you've experienced an adverse drug reaction, as a member of the public, I would regard it as your duty to fill out a yellow card form online at yellowcard.mhra.gov.uk: anyone can do so. These reports can be collated and monitored as an early warning sign, and are a part of the imperfect, pragmatic monitoring system for picking up problems with medications."
"This study was big - very big - and included all the children born in Denmark between January 1991 and December 1998. In Denmark there is a system of unique personal identification numbers, linked to vaccination registers and information about the diagnosis of autism, which made it possible to chase up almost all the children in the study. This was a pretty impressive achievement, since there were 440,655 children who were vaccinated, and 96,648 who were unvaccinated. No difference was found between vaccinated and unvaccinated children, in the rates of autism or autistic spectrum disorders, and no association between development of autism and age at vaccination. Anti-MMR campaigners have responded to this work by saying that only a small number of children are harmed by the vaccine, which seems to be inconsistent with their claims that MMR is responsible for a massive upswing in diagnoses of autism. In any case, if a vaccine caused an adverse reaction in a very small number of people, that would be no surprise - it would be no different from any other medical intervention (or, arguably, any human activity), and there would be, surely, no story."
"In the aggregate, these ‘breakthrough’ stories sell the idea that science - and indeed the whole empirical world view - is only about tenuous, new, hotly contested data and spectacular breakthroughs. This reinforces one of the key humanities graduates' parodies of science: as well as being irrelevant boffinry, science is temporary, changeable, constantly revising itself, like a transient fad. Scientific findings, the argument goes, are therefore dismissible.
While this is true at the bleeding edges of various research fields, it's worth bearing in mind that Archimedes has been right about why things float for a couple of millennia. He also understood why levers work, and Newtonian physics will probably be right about the behaviour of snooker balls forever. But somehow this impression about the changeability of science has bled through to the core claims. Anything can be rubbished."
"In 2006, after a major government report, the media reported that one murder a week is committed by someone with psychiatric problems. Psychiatrists should do better, the newspapers told us, and prevent more of these murders. All of us agree, I'm sure, with any sensible measure to improve the management of risk and violence, and it's always timely to have a public debate about the ethics of detaining psychiatric patients (although in the name of fairness I'd like to see preventive detention discussed for all other potentially risky groups too - like alcoholics, the repeatedly violent, people who have abused staff in the job centre, and so on). But to engage in this discussion, you need to understand the maths of predicting very rare events. Let's take a very concrete example, and look at the HIV test. What features of any diagnostic procedure do we measure in order to judge how useful it might be? Statisticians would say the blood test for HIV has a very high ‘sensitivity', at 0.999. That means that if you do have the virus, there is a 99.9 per cent chance that the blood test will be positive. They would also say the test has a high ‘specificity’ of 0.9999 - so, if you are not infected, there is a 99.99 per cent chance that the test will be negative. What a smashing blood test.
But if you look at it from the perspective of the person tested, the maths gets slightly counterintuitive. Because weirdly, the meaning, the predictive value, of an individual's positive or negative test is changed in different situations, depending on the background rarity of the event that the test is trying to detect. The rarer the event in your population, the worse your test becomes, even though it is the same test. This is easier to understand with concrete figures. Let's say the HIV infection rate among high-risk men in a particular area is 1.5 per cent. We use our excellent blood test on 10,000 of these men, and we can expect 151 positive blood results overall: 150 will be our truly HIV-positive men, who will get true positive blood tests; and one will be the one false positive we could expect from having 10,000 HIV-negative men being given a test that is wrong one time in 10,000. So, if you get a positive HIV blood test result, in these circumstances your chances of being truly HIV positive are 150 out of 151. It's a highly predictive test.
Let's now use the same test where the background HIV infection rate in the population is about one in 10,000. If we test 10,000 people, we can expect two positive blood results overall. One from the person who really is HIV positive, and the one false positive that we could expect, again, from having 10,000 HIV-negative men being tested with a test that is wrong one time in 10,000.
Suddenly, when the background rate of an event is rare, even our previously brilliant blood test becomes a bit rubbish. For the two men with a positive HIV blood test result, in this population where only one in 10,000 has HIV, it's only 50:50 odds on whether they really are HIV positive.
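The arithmetic in the two scenarios above is just Bayes' theorem, and can be sketched in a few lines (the prevalences, sensitivity and specificity are the figures from the quote; the function name is my own):

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive result is a true positive, per Bayes."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# High-risk population, 1.5% infected: roughly 150 true vs 1 false positive
# per 10,000 tested, so a positive result is ~99% reliable.
high_risk = positive_predictive_value(0.015, 0.999, 0.9999)

# Background rate 1 in 10,000: roughly 1 true vs 1 false positive,
# so a positive result is only ~50:50.
low_risk = positive_predictive_value(0.0001, 0.999, 0.9999)

print(f"{high_risk:.3f}, {low_risk:.3f}")   # ~0.993, ~0.500
```

Same test, same sensitivity and specificity; only the background rate changed.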
Let's think about violence. The best predictive tool for psychiatric violence has a ‘sensitivity' of 0.75, and a ‘specificity’ of 0.75. It's tougher to be accurate when predicting an event in humans, with human minds and changing human lives. Let's say 5 per cent of patients seen by a community mental health team will be involved in a violent event in a year. Using the same maths as we did for the HIV tests, your '0.75' predictive tool would be wrong eighty-six times out of a hundred. For serious violence, occurring at 1 per cent a year, with our best '0.75' tool, you inaccurately finger your potential perpetrator ninety-seven times out of a hundred. Will you preventively detain ninety-seven people to prevent three violent events? And will you apply that rule to alcoholics and assorted nasty antisocial types as well?
For murder, the extremely rare crime in question in this report, for which more action was demanded, occurring at one in 10,000 a year among patients with psychosis, the false positive rate is so high that the best predictive test is entirely useless.
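The "wrong eighty-six times out of a hundred" figures above fall out of the same Bayes arithmetic as the HIV example, applied to the 0.75/0.75 tool; a sketch (base rates and tool accuracy are the quote's figures, the function name is mine):

```python
def false_positive_share(prevalence, sensitivity, specificity):
    """Of everyone the tool flags, what fraction are wrongly flagged?"""
    flagged_true = prevalence * sensitivity            # real future cases caught
    flagged_false = (1 - prevalence) * (1 - specificity)  # innocents flagged
    return flagged_false / (flagged_true + flagged_false)

# Any violent event, 5% base rate: ~86 of every 100 people flagged are wrong.
print(round(false_positive_share(0.05, 0.75, 0.75) * 100))    # -> 86
# Serious violence, 1% base rate: ~97 out of 100 flagged are wrong.
print(round(false_positive_share(0.01, 0.75, 0.75) * 100))    # -> 97
# Murder, 1 in 10,000: essentially everyone flagged is a false positive.
print(round(false_positive_share(0.0001, 0.75, 0.75) * 100))  # -> 100
```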
This is not a counsel of despair. There are things that can be done, and you can always try to reduce the number of actual stark cock-ups, although it's difficult to know what proportion of the 'one murder a week' represents a clear failure of a system, since when you look back in history, through the retrospectoscope, anything that happens will look as if it was inexorably leading up to your one bad event. I'm just giving you the maths on rare events. What you do with it is a matter for you."