The Importance of Statistics in Evaluating Data

In this post we begin diving into the third component of the data literacy definition: evaluation and analysis. We will revisit components one and two, collection and management, in upcoming posts. The next few posts will look at the importance of evaluation, some key definitions, and the application of some statistical methods.

Statistics is the field of mathematics that involves learning from data. It is fundamental in making discoveries, decisions, and predictions as well as helping us have a deeper understanding of a subject and our world. Data analysis, or evaluation, employs two types of statistics: descriptive statistics and inferential statistics.

Descriptive statistics summarize and describe the nature of the data and provide an idea of its properties. They distill a large amount of information into a simpler form that allows us to easily make sense of the data. Descriptive statistics include distribution (frequency), measures of central tendency (mean, median, and mode), and dispersion (standard deviation). Some other familiar descriptive statistics you may know are grade point average (GPA) and batting averages.
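To make these terms concrete, here is a minimal sketch using Python's standard library. The exam scores are invented for illustration and do not come from any real data set.

```python
from collections import Counter
import statistics

# Hypothetical exam scores, invented for illustration only.
scores = [70, 85, 85, 90, 95]

frequency = Counter(scores)          # distribution: how often each value occurs
mean = statistics.mean(scores)       # arithmetic average
median = statistics.median(scores)   # middle value of the sorted data
mode = statistics.mode(scores)       # most frequent value
spread = statistics.stdev(scores)    # sample standard deviation

print("frequency:", dict(frequency))
print("mean:", mean, "median:", median, "mode:", mode)
print("standard deviation:", round(spread, 2))
```

Five numbers are easy to read by eye, but the same few lines summarize five thousand values just as well, which is the point of descriptive statistics.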

Inferential statistics draw conclusions from data. Specifically, inferential statistics make predictions about a population from a smaller sample of that population. The most common inferential statistics allow us to compare means (t-tests and analysis of variance (ANOVA)) and make predictions (regression).
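As a rough sketch of what comparing means looks like, the snippet below computes a t statistic for two small samples. The numbers are invented, the function name `welch_t` is my own, and I have chosen the Welch (unequal variance) form of the t-test as one reasonable option. A complete test would also convert the statistic into a p-value, which a library such as SciPy can do.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return the Welch t statistic for the difference between two sample means."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Sample variances, each divided by its own sample size.
    var_a = statistics.variance(sample_a) / len(sample_a)
    var_b = statistics.variance(sample_b) / len(sample_b)
    # Difference in means divided by its estimated standard error.
    return (mean_a - mean_b) / math.sqrt(var_a + var_b)

# Hypothetical measurements from two groups, invented for illustration only.
group_a = [12, 14, 15, 17, 17]
group_b = [9, 10, 12, 12, 12]

t_statistic = welch_t(group_a, group_b)
print("t statistic:", round(t_statistic, 3))
```

The further the t statistic is from zero, the harder it is to explain the gap between the two sample means as chance alone.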

We will revisit each of these statistics in upcoming posts and use them to evaluate some real-world data. For now, the important thing is to think about the statistics you commonly hear in the news and in conversation, and the ones you rarely hear. I am sure you often hear and use the average when you talk about home prices, salaries, height, weight, and age. But how often do you get the standard deviation and distribution when a news source or a neighbor presents a statistic? These additional statistics are very important in developing a full picture of the data, and they provide insight into the quality of the information you are given or derive yourself.
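Here is a small sketch of why the average alone can mislead. Both lists of home prices are made up for illustration. They share the same mean, yet they describe very different neighborhoods.

```python
import statistics

# Hypothetical home prices in thousands of dollars, invented for illustration only.
neighborhood_a = [290, 295, 300, 305, 310]
neighborhood_b = [100, 150, 200, 250, 800]

mean_a = statistics.mean(neighborhood_a)
mean_b = statistics.mean(neighborhood_b)
stdev_a = statistics.stdev(neighborhood_a)   # sample standard deviation
stdev_b = statistics.stdev(neighborhood_b)
median_b = statistics.median(neighborhood_b)

print("A: mean", mean_a, "standard deviation", round(stdev_a, 1))
print("B: mean", mean_b, "standard deviation", round(stdev_b, 1))
print("B: median", median_b)
```

A headline quoting only the average would treat these two neighborhoods as identical. The standard deviation and the median show that in the second one a single expensive home pulls the average well above what most homes cost.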

Statistics allow us to reliably learn from data and to use quantitative information to differentiate between reasonable and suspect conclusions. Being comfortable with statistics and able to use them to make these kinds of observations is a very important skill, given the amount of data generated every day and the amount of information provided by people whose motivations are not clear or genuine. It is easy for a dishonest and unscrupulous person to misuse statistical methods to manipulate data and derive unwarranted conclusions. It is also easy for an uninformed person to draw incorrect conclusions by not fully understanding statistical pitfalls such as overgeneralization, mistaking correlation for causation, choosing the wrong analysis, and violating the assumptions of an analysis. For these reasons it is more important than ever to understand statistics and to be data literate. Please join me in exploring data literacy and statistics by commenting below. Please subscribe to my blog and follow me on Twitter to get updates as I further explore data literacy and provide examples and methods for evaluating data.


One thought on “The Importance of Statistics in Evaluating Data”

  1. Great topic, I am woefully ignorant of statistics and applying data in general. I will certainly be reading your future posts on these topics to help educate myself, thank you!
