Saturday, February 1, 2020

Book Review-Failure by Stuart Firestein


I had read Stuart Firestein’s previous book, Ignorance. It was well written, well argued, and tempered with anger about where the sciences are at this moment in history. It struck a chord with me because the book spoke out fiercely against the prevailing psyche in academia, a mindset driven by the need to publish or perish. The author made a very strong point about how this mindset is destroying the fundamentals of research and the pursuit of new knowledge, as well as compromising the integrity of everyone involved in science.

Indeed, Prof. Firestein reiterates his point in this follow-up. He argues that it is an absolute imperative for scientists and technologists to commit to rigorously accepting and examining our failures; he admonishes us to actively seek opportunities to create failures, and he proclaims that it is these failures that will fuel our innovation engines.

Prof. Firestein cogently argues in fifteen succinct chapters why we must seek out failure. He makes the case for taking more chances and experiencing failure, and he lays out a very convincing argument that failure is not only something from which we need to learn; it is something that we absolutely need to demand of our researchers and scientists in order to make advances in science.

He makes his case mostly in the pharmacological and biological world, since that is his milieu in the sciences, but the knowledge and the lessons he provides are general in nature. The advice can be applied to both applied and pure research, and to fields far broader than the biological world.

In Chapter One, Prof. Firestein lays out the case that we are terrible at defining what failure is because of the negative nature of the word failure. He cites Gertrude Stein’s quote: “A real failure does not need an excuse. It is an end in itself.” The quote concisely separates the bad failures, the stupid, silly kind we all commit through negligence, from the failures that lead us somewhere interesting. The latter are the ones we need to talk about: the ones that pique our interest and push us to investigate further and ask better questions. Those are the failures that reveal surprising questions and give us a chance to re-evaluate our assumptions, understanding, and biases.

In Chapter Two, he discusses the meaning of Samuel Beckett’s famous quote: “Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.” Prof. Firestein goes into detail on what he thinks Fail Better means and what he thinks we should do to fail better. This chapter was the one that really hooked me on the book, because I have always been fascinated by Beckett’s quote; I hadn’t really thought about what failing better meant until I read Prof. Firestein’s arguments. It clarified some of my thoughts on the subject, so kudos to him for getting me to think about it and leading me to a clearer definition of failing better.

In Chapters 3 and 8, Prof. Firestein goes after the scientific method. He takes out the scalpel and dissects the whole idea of how we do science, or the official, written way we are supposed to do science. His willingness to take on the mythology of the scientific method, which turns him into an apostate to the temple of knowledge that is big science, is encouraging and very courageous. Coming from somebody like Prof. Firestein, a respected researcher and a product of the system, the argument carries extra weight, and he doesn't disappoint. The two chapters are very forceful, presenting deeply thought-out arguments against the strawman that is the Scientific Method.

Chapters 4 and 5 make his argument for why failure is something beyond what we usually think it is. We tend to believe that failure is something to be ameliorated, something that should lead us to a positive result. His argument is that failure is much more than that, much like Nassim Nicholas Taleb’s take on antifragility. Being antifragile means something beyond grit and resilience; it means more than just being able to survive bad fortune, it means being able to benefit and thrive when circumstances are against you. In Prof. Firestein’s argument, failure leads to a higher level of understanding of what we are trying to study, and it leads us to discover heretofore unknown dynamics within our knowledge base. It is the negative result that leads us to a better and broader understanding of nature. In Chapter 5, Prof. Firestein makes a very impassioned argument for the integrity of failure: that we are honest with our results, that we are committed to intellectual honesty in our work, and that we are willing to broadcast our failures to our fellow researchers because we are dedicated to the advancement of science over shielding our own fragile egos and reputations.

Chapters 6 and 7 are interesting because they go into how we are teaching the future of research and scientific investigation, and how we are putting a wrong public face on what scientific research truly entails. The crux of it is that by teaching future scientists the scientific method as the means of doing research, we are handcuffing them to a mythology of what scientific investigation is, which in turn stifles broad questioning of concepts and ideas. In addition, by telling the non-scientific world that the scientific method is the dominant mode of doing research, we are building up a fictional impression in the general public of what scientists do on a daily basis, thereby mythologizing the doing of science.

Chapters 9, 12, 13, and 14 have Prof. Firestein going deep into his own milieu of biological and pharmacological research. These chapters were interesting, but I have no background in the area, so I waded in with great interest but scant background to really dig into what he was getting at. I enjoyed them, but I am not sure I got everything I could have out of them; that failure was entirely on my part.

In Chapters 10, 11, and 14, Prof. Firestein really gets going philosophically, and it makes for great, very interesting reading. He talks about overcoming our negative view of data that do not fit our hypothesis, and how we can get over that mental obstacle. In Chapter 11 he discusses Karl Popper, a philosopher who worked extensively on understanding what science is, or how to differentiate between real science and bad science. It was a very educational chapter for me, as I have always been interested in Popper's work yet have not read his writing. Chapter 14 is where Prof. Firestein goes full force into the philosophical idea of pluralism. Most of us are devoted to a monistic belief, that there is only one single truth in the scientific world, and that is just not true. In his dabbling in philosophy Prof. Firestein discovered this, and he shares it with us in a tour de force of a chapter, taking you along with his experience in high-level research and exploration to think about what scientific reality is, what our interpretation of reality is, and what our mindset does to our scientific understanding of nature. A monistic scientific culture just doesn't ring true given what we know now, demonstrating the principle that Prof. Firestein has argued all along: our understanding of the sciences is temporary, lasting only until the next discovery. The pluralistic view is so much more complete.

The book itself is a short one, although it is dense with ideas: ideas that we don't usually think about, ideas that we don't usually want to talk about, ideas that challenge our very existence as researchers and scientists. It is a fantastic read because it really does make you think about the meaning of scientific work, and it challenges the closely held beliefs you have about what you are doing. It is very healthy reading; indeed, I believe it should be required reading for anyone who wants to get into the sciences, because it will change your viewpoint completely. I am reading this as an engineer, not a scientist, so my work is somewhat different: it is shaped by what my company wants me to work on and what I need to do to get the desired results, which is not strictly the pursuit of pure and unadulterated truth. But the book gives me food for thought, and it admonishes me to be honest and truthful when I am confronted with failure, so that I can look at failure without fear or shame.

Saturday, January 4, 2020

Stats For Spikes-Correlation and Causation


This article (Paine 2016) caught my attention recently. It talks about the case of Charles Reep, a former Royal Air Force Wing Commander who was tracking play-by-play data for matches and serving as a quantitative consultant for Football League teams as early as the 1950s.


https://fivethirtyeight.com/features/how-one-mans-bad-math-helped-ruin-decades-of-english-soccer/


The article recalls how Reep’s analytics led him to conclude that the number of passes made before a shot is tied to scoring. His admonition was that shots taken after three passes or fewer have a higher probability of producing a goal.

But Reep was making a huge mistake. Put simply, he started with each goal scored and looked at how many passes were made prior to the goal; his starting point was goals scored. The problem is that most goals in soccer do come after three passes or fewer, because that is the nature of the game: it is sporadic, and passing sequences get disrupted frequently by the defense. What he did not count were the possessions of three passes or fewer that did not produce a goal; that block of data is missing because of his focus on the goals alone.
In a previous article, Neil Paine of the website FiveThirtyEight refuted that bit of wisdom gleaned from Reep’s agglomeration of soccer data.

https://fivethirtyeight.com/features/what-analytics-can-teach-us-about-the-beautiful-game/

But subsequent analysis has discredited this way of thinking. Reep’s mistake was to fixate on the percentage of goals generated by passing sequences of various lengths. Instead, he should have flipped things around, focusing on the probability that a given sequence would produce a goal. Yes, a large proportion of goals are generated on short possessions, but soccer is also fundamentally a game of short possessions and frequent turnovers. If you account for how often each sequence-length occurs during the flow of play, of course more goals are going to come off of smaller sequences — after all, they’re easily the most common type of sequence. But that doesn’t mean a small sequence has a higher probability of leading to a goal.

To the contrary, a team’s probability of scoring goes up as it strings together more successful passes. The implication of this statistical about-face is that maintaining possession is important in soccer. There’s a good relationship between a team’s time spent in control of the ball and its ability to generate shots on target, which in turn is hugely predictive of a team’s scoring rate and, consequently, its placement in the league table. While there’s less rhyme or reason to the rate at which teams convert those scoring chances into goals, modern analysis has ascertained that possession plays a big role in creating offensive opportunities, and that effective short passing — fueled largely by having pass targets move to soft spots in the defense before ever receiving the ball — is strongly associated with building and maintaining possession. (Paine 2014)

To reiterate, he should have focused on tracking the number of possessions and whether those possessions turned into goals. Given the complexity of the game, it was perhaps understandable that Reep made this mistake, and given that statistical analysis in sports was still rudimentary, it was perhaps predictable. The unfortunate thing is that Reep was able to convince an entire nation’s soccer establishment, and not just any nation, but the nation where the game was born and whose excellence in the game was globally recognized, to go off on a wild goose chase. People should have known better. Maybe.
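A minimal sketch of the two conditional probabilities at play, written in Python; the possession frequencies and scoring rates below are invented for illustration, not real match data:

import random

# Toy model of possessions: short possessions (three passes or fewer) are far
# more common, but a long possession is assumed more likely to end in a goal.
random.seed(0)

n = 100_000
possessions = []
for _ in range(n):
    short = random.random() < 0.90          # ~90% of possessions end within three passes
    p_goal = 0.01 if short else 0.04        # assumed scoring rate per possession
    goal = random.random() < p_goal
    possessions.append((short, goal))

goals = [(s, g) for s, g in possessions if g]
shorts = [(s, g) for s, g in possessions if s]
longs = [(s, g) for s, g in possessions if not s]

# Reep's statistic: of all goals, what share came after three passes or fewer?
print("P(short sequence | goal) =", sum(s for s, g in goals) / len(goals))

# The decision-relevant statistics: how often does each kind of possession score?
print("P(goal | short sequence) =", sum(g for s, g in shorts) / len(shorts))
print("P(goal | long sequence)  =", sum(g for s, g in longs) / len(longs))

In this toy model roughly 70 percent of the goals come after three passes or fewer, which is the pattern Reep saw, yet a long possession is four times as likely to produce a goal. The first number describes how the game flows; the second tells you what to do.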
This brings us to an oft-repeated but rarely observed tenet of applied statistics: correlation does not equal causation. The saying may sound glib, but it is remarkably dead on. If we find some kind of correlation between two events, our habit and inclination is to jump to the conclusion that the two events have a causal relationship; that is, that one event caused the other to occur, or that we can deterministically and reasonably predict that the latter event will result from the occurrence of the first. Unfortunately for us, that is rarely the case. Establishing causality takes formal mathematical checking; just because the statistics show some kind of correlation between two events, however minimal, does not necessarily mean they have a causal relationship.

In order to establish causality, a lot of number crunching needs to happen, and a lot of statistical metrics need to meet established thresholds before we can declare causality. That is a completely different arm of the statistical sciences called inferential statistics, far too involved for me to try to explain here and now, even assuming I can explain it. A rather large and dodgy assumption.
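A small sketch of how correlation can appear with no causation at all; the variables and numbers are invented, with a hidden common cause z driving both x and y:

import random

# Toy confounder example: z (say, hours of sunshine) drives both
# x (ice cream sales) and y (sunburn cases). x and y come out strongly
# correlated even though neither causes the other.
random.seed(1)

n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.5) for zi in z]   # driven by z plus noise
y = [zi + random.gauss(0, 0.5) for zi in z]   # also driven by z plus noise

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

# Expect a value around 0.8: strong correlation, zero causation either way.
print("corr(x, y) =", round(pearson(x, y), 3))

The correlation here is real and repeatable, but intervening on x would do nothing to y; only the confounder matters, which is exactly why correlation alone cannot settle causal questions.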
Another thing that Reep’s error illustrates is survivorship bias. The story of Abraham Wald and the US warplanes is a favorite of social media and business writers because it perfectly demonstrates the linear, direct thinking most people employ when they see data or results without taking the underlying situation into account.

Abraham Wald was born in 1902 in the then Austria-Hungarian empire. After graduating in Mathematics he lectured in Economics in Vienna. As a Jew following the Anschluss between Nazi Germany and Austria in 1938 Wald and his family faced persecution and so they emigrated to the USA after he was offered a university position at Yale. During World War Two Wald was a member of the Statistical Research Group (SRG) as the US tried to approach military problems with research methodology.
One problem the US military faced was how to reduce aircraft casualties. They researched the damage received to their planes returning from conflict. By mapping out damage they found their planes were receiving most bullet holes to the wings and tail. The engine was spared.


The US military’s conclusion was simple: the wings and tail are obviously vulnerable to receiving bullets. We need to increase armour to these areas. Wald stepped in. His conclusion was surprising: don’t armour the wings and tail. Armour the engine.

Wald’s insight and reasoning were based on understanding what we now call survivorship bias. Bias is any factor in the research process which skews the results. Survivorship bias describes the error of looking only at subjects who’ve reached a certain point without considering the (often invisible) subjects who haven’t. In the case of the US military they were only studying the planes which had returned to base following conflict i.e. the survivors. In other words what their diagram of bullet holes actually showed was the areas their planes could sustain damage and still be able to fly and bring their pilots home. (Thomas 2019)
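A brief simulation of the same effect; the aircraft sections and loss rates below are invented purely to show the mechanism:

import random
from collections import Counter

# Toy survivorship-bias model: every plane takes one hit in a random section,
# but an engine hit is far more likely to bring the plane down, so engine hits
# rarely show up on the planes that make it back to base.
random.seed(2)

SECTIONS = ["wings", "tail", "fuselage", "engine"]
LOSS_RATE = {"wings": 0.10, "tail": 0.10, "fuselage": 0.20, "engine": 0.80}

all_hits = Counter()
returning_hits = Counter()

for _ in range(10_000):
    section = random.choice(SECTIONS)          # hits are spread evenly in this model
    all_hits[section] += 1
    if random.random() > LOSS_RATE[section]:   # the plane survives and returns
        returning_hits[section] += 1

print("hits taken by all planes:     ", dict(all_hits))
print("hits seen on returning planes:", dict(returning_hits))

The damage map drawn only from returning planes badly under-represents engine hits, which is exactly the pattern that led to the wrong armour recommendation before Wald flipped the reasoning around.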

What Reep saw was goals. He was fixated on them rather than the big picture, and he fell into the trap of reaching the first and most obvious conclusion rather than trying to explore the structure of the game. Sometimes prior experience is very useful, and not everything new is golden.

Works Cited

Paine, Neil. 2016. "How One Man’s Bad Math Helped Ruin Decades Of English Soccer." FiveThirtyEight. October 27. Accessed December 24, 2019. https://fivethirtyeight.com/features/how-one-mans-bad-math-helped-ruin-decades-of-english-soccer/.
—. 2014. "What Analytics Can Teach Us About the Beautiful Game." FiveThirtyEight. June 12. Accessed December 24, 2019. https://fivethirtyeight.com/features/what-analytics-can-teach-us-about-the-beautiful-game/.

Thomas, James. 2019. "Survivorship Bias." McDreeamie Musings. April 1. Accessed December 28, 2019. https://mcdreeamiemusings.com/blog/2019/4/1/survivorship-bias-how-lessons-from-world-war-two-affect-clinical-research-today.