Do you wonder how statistics lies?

Statistics lies in the presence of ignorance.

By definition, ignorance means lack of knowledge, understanding, or information about something.

Statistics lies when a person presenting statistical data lacks knowledge, understanding, or information about the statistical and methodological techniques that enable us to analyse data in a scientifically objective way.

Although we live in an evidence-based world, our societies have not done enough to help citizens develop statistical literacy, the key skill of the digital era.

In the past, being able to read and write enabled citizens to have more prosperous lives. Today’s data-driven world requires statistical literacy.

Does this mean that if statistics is presented by those who have studied statistics, we do not need to worry about statistics lying?

No. One can study statistics with a focus on developing new theories (mathematical statistics) or on learning how to apply statistical methodologies to real-world data (applied statistics). Applied statisticians are much more familiar with modern statistical thinking, the key skill for analysing data in a scientifically objective way, while mathematical statisticians frequently lack this understanding.

Why is it so?

While things work beautifully in theory, this is not the case with real-world data. We do not live in a linear world, yet ‘linearity’ is one of the most common assumptions that must be satisfied when analysing data.

Not living in a linear world means that collected data are full of non-linearities. Correctly addressing the complications they cause requires a high level of modern statistical thinking.
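To make the point concrete, here is a minimal sketch in Python using made-up data, not taken from any real study. The variable names and the quadratic relationship are assumptions chosen for illustration. The outcome depends entirely on x, yet the Pearson correlation, a measure built on linearity, sees almost nothing.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient, which measures only linear association."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: y is fully determined by x, but the relationship is a curve.
xs = list(range(-10, 11))
ys = [x ** 2 for x in xs]

linear_r = pearson(xs, ys)                     # linear view of the data
curved_r = pearson([x ** 2 for x in xs], ys)   # same data once the curve is modelled

print(f"correlation of y with x:   {linear_r:.2f}")
print(f"correlation of y with x^2: {curved_r:.2f}")
```

The first correlation is essentially zero and the second is essentially one. An analyst who checks only the linear relationship would wrongly conclude that x does not matter.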

Modern statistical thinking connects an understanding of the key concepts of modern statistical science with causal thinking. This connection is of immense importance because we live in a cause-and-effect world, one where most questions of interest are causal in nature.

What is new?

In the early 20th century, a theory in physics called quantum mechanics was developed. Quantum mechanics enables the calculation of the properties and behaviour of physical systems.

Quantum mechanics showed us that our world is built on cause-and-effect relationships: an action (a cause) manipulates the behaviour of an object or a subject, and the effect is the impact of that action that we observe.

Connecting this with the fact that most questions of interest are causal in nature leads to the following conclusion: those who analyse data and present the results must understand how to analyse causal relationships in a scientifically objective way.

Why are modern statistical thinking and causal analysis still absent from, or underrepresented in, most statistics courses?

The science of the 20th century produced not only quantum mechanics but also statistical methods that enable us to analyse causal relationships in a scientifically objective way.

In brief, it started with William S. Gosset and Student’s t-test at the beginning of the 20th century, and continued with Sir Ronald Fisher’s development of randomisation and Jerzy Neyman’s development of modern sampling approaches and confidence intervals.

In the second half of the 20th century, Donald B. Rubin developed a causal model, broadly known as the Rubin Causal Model (Holland 1986). This causal model is a foundation for cause-and-effect studies in a variety of fields, from medicine and public health to economics, environment, biology, law and business. The model also enables analysis of causal relationships with data collected outside an experimental framework, so-called observational data.
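As a hedged illustration of why observational data needs this kind of care, the sketch below uses invented numbers, not results from any study. It follows the potential-outcomes idea at the core of the Rubin Causal Model: every unit has an outcome under treatment and an outcome under control, and only one of them is ever observed. The names `severe`, `treated` and `TRUE_EFFECT`, and the simple stratification used for adjustment, are assumptions made for this example. Real analyses rest on causal assumptions that must be argued for, not just computed.

```python
from statistics import mean

TRUE_EFFECT = 2  # assumed: treatment raises every unit's outcome by 2

# Hypothetical units: (severe, treated). Severe cases have a lower baseline
# outcome and are treated more often, so severity confounds the comparison.
units = [
    (0, 1), (0, 0), (0, 0), (0, 0),   # mild cases: mostly untreated
    (1, 1), (1, 1), (1, 1), (1, 0),   # severe cases: mostly treated
]

def observed_outcome(severe, treated):
    baseline = 4 if severe else 10             # potential outcome without treatment
    return baseline + TRUE_EFFECT * treated    # only one potential outcome is seen

data = [(s, t, observed_outcome(s, t)) for s, t in units]

def mean_difference(rows):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

# Naive comparison ignores who was selected into treatment.
naive = mean_difference(data)

# Adjusted comparison: compare like with like inside each severity stratum,
# then average the strata weighted by their size.
strata = {s: [row for row in data if row[0] == s] for s in (0, 1)}
adjusted = sum(
    len(rows) / len(data) * mean_difference(rows) for rows in strata.values()
)

print(f"naive difference:    {naive}")
print(f"adjusted difference: {adjusted}")
```

With these invented numbers the naive comparison suggests the treatment is harmful, while the stratified comparison recovers the assumed effect of 2. The data are identical in both cases; only the causal design differs.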

Prior to Rubin’s work, analysis of causal relationships outside of experimental data was ‘strictly forbidden’. This means that some of the major contributions to modern statistical science are only about half a century old.

It takes time for researchers to catch up with these developments and to change the curricula of applied statistics courses. Most statistical curricula are built on a long tradition rooted in classical methods like hypothesis testing, linear regression, and analysis of variance (the most abused statistical method). These topics became standard more than 70 years ago and are still seen as an essential foundation. Updating curricula is often slow, especially in large institutions.

Another reason is that analysis of causal relationships is conceptually harder: it requires students to think beyond formulas and to develop an understanding of causal assumptions, causal designs, and causal interpretations. This is a bigger cognitive leap than learning well-defined procedures like computing a p-value.

Modern statistical approaches are more abstract and less tidy from a theoretical standpoint, while classical methods often rely on clean, linear assumptions that are easier to teach and test.

How should the curriculum of applied statistics education change?

The most important thing is that curricula shift from technical and theoretical details to modern statistical thinking and the use of methods and techniques to derive insights from data in a scientifically objective way.

Students should learn the basics of causality in statistics, e.g., how statistical science defines a cause, what a causal design is, and how to build an objective causal design.

Is there a course built on such an applied statistics curriculum?

Yes. Our founder, Dr. Ana Kolar, has been developing such curricula for the past 10 years. Some of her in-person and online courses can be attended at the University of Helsinki. For those interested in online learning, available courses can be found here.

Dr. Kolar uses an experiential learning approach to teaching, which enables deep learning. She believes that students should have the opportunity to deepen their knowledge during the learning process, because this is the only path to knowledge. She is rated as an excellent teacher and she is fun too!

Can I learn how to analyse causal relationships by studying books and articles?

Yes. There are plenty of books, articles and online lectures to learn from, but without proper guidance it can take years or even a decade to grasp the foundations of causal inference in full.

Causal inference is one of the most complex topics in statistics. It is a highly demanding field of study that requires heavy use of ‘thinking’ and a holistic approach to understanding how to develop a causal design and satisfy causal assumptions. Modern statistical thinking plays a significant role in this process.

Want to learn more about how to analyse data in a scientifically objective way?

Join our online courses!
