Calculating Descriptive Statistics

Once the data has been coded and double-checked, the next step is to calculate descriptive statistics. The three main types of descriptive statistics are frequencies, measures of central tendency (also called averages), and measures of variability. Frequency statistics simply count the number of times that each value of a variable occurs, such as the number of males and females within the sample. Measures of central tendency give one number that represents the entire set of scores, such as the mean. Measures of variability indicate the degree to which scores differ around the average.

Descriptive research designs typically only require descriptive statistics. However, all other types of research designs will require both descriptive and inferential statistics. Since it is important for the reader to have a good understanding of the sample that the study was conducted on, the first statistics for all research designs should include descriptive statistics of the personal information for the sample. The types of personal information to be included will vary depending on the type of research study. All studies should report descriptive statistics on gender and age. Other variables might include grade level, marital status, years of work experience, educational qualifications, socio-economic status, etc. It is up to the researcher to thoughtfully consider what the reader needs to know about the sample to make an informed decision about whether the sample is representative of the overall population.

Frequency and percentage statistics should be used to represent most personal information variables. However, if participants reported their exact age, then the mean and standard deviation should be calculated for the age variable. Frequency statistics should be reported whenever the data is discrete, meaning that there are separate categories that the participant can tick. For example, marital status can have categories of single, married, divorced, widowed, and separated. Educational qualifications can have categories of secondary school, diploma, degree, post-graduate diploma, masters, and doctorate.

However, measures of central tendency and variability should be reported for variables that have continuous data, meaning that the scores can vary along a continuum of numbers. For example, age is on a continuum from 0 to 100 or so, academic achievement generally varies from 0 to 100, and number of pages a student reads in a week can vary from 0 to maybe 300. These are all continuous variables, so a measure of central tendency and variability should be reported to represent these variables.

Recall that a frequency is simply the number of participants who indicated that category (e.g., "Male"). However, a frequency by itself is often difficult to interpret because there is no reference point for the number. Percentages are easier to understand because a percentage can be interpreted as follows: imagine there were exactly 100 participants in the sample; how many of those 100 would fall in that category? In Table 3, if there were 100 participants in the study, 55 would be female. A percentage is calculated by dividing the frequency in the category by the total number of participants and multiplying by 100%. To calculate the percentage of males in Table 3, divide the frequency for males (80) by the total number in the sample (200), which gives 0.40. Then multiply this number by 100%, resulting in 40%. At this point, a simple table with the frequency and percentage of personal information variables will suffice. In the Tables and Figures page, I will describe how to convert these tables into APA format or represent them graphically in a figure.

| Gender | Frequency | Percentage |
|---|---|---|
| Male | 80 | 40% |
| Female | 110 | 55% |
| Missing | 10 | 5% |
| Total | 200 | 100% |
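
The percentage calculation described above can also be scripted. The following is a minimal Python sketch, not part of the original procedure: the function name `frequency_table` and the `responses` list are hypothetical, and the list is built only to reproduce the counts in the gender table.

```python
from collections import Counter


def frequency_table(responses):
    """Return (category, frequency, percentage) rows plus a Total row."""
    counts = Counter(responses)
    total = sum(counts.values())
    # Percentage = frequency in the category / total participants * 100
    rows = [(category, n, 100 * n / total) for category, n in counts.items()]
    rows.append(("Total", total, 100.0))
    return rows


# Hypothetical coded responses that reproduce the counts in the gender table.
responses = ["Male"] * 80 + ["Female"] * 110 + ["Missing"] * 10

for category, n, pct in frequency_table(responses):
    print(f"{category:<8} {n:>4} {pct:>7.2f}%")
```

Missing responses are counted as their own category here so that the percentages sum to 100%, as they do in the table.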

Once descriptive statistics for the personal information have been calculated, it is time to move on to the variables under study. In most cases, a total score for each variable will have been calculated in the previous step, Coding the Data. APA standards require that researchers report descriptive statistics on the major variables under study, even for studies that will use inferential statistics, so the nature of any effect can be understood by the reader. This means that all research studies must report the mean and standard deviation for all variables under study. The mean is necessary to summarize that variable across all participants; the standard deviation is necessary to understand how much participants' scores vary around that mean.

At the moment, it is enough to calculate the mean and standard deviation and combine them all in one table. If a causal-comparative design is used that compares two or more groups on these variables (e.g., compares males and females on academic achievement), then it is necessary to calculate the mean and standard deviation separately for each group. If a pre-test/post-test design is used, then the mean and standard deviation will need to be calculated separately for the pre-test and the post-test. Table 4 gives the means and standard deviations for a study that compares teachers in private and public schools on three variables associated with early literacy practices. Recall that the mean is calculated by summing the scores and then dividing this sum by the number of scores. Calculating the standard deviation is a bit more complicated. Microsoft Excel will quickly and automatically calculate each statistic using the *=average* and *=stdev* functions. For examples of how to calculate frequencies, averages, and variation by hand, click here.

| Variable | Private Mean | Private SD | Public Mean | Public SD |
|---|---|---|---|---|
| Read Books Aloud | 3.42 | 1.12 | 2.34 | 0.12 |
| Tell Stories | 3.52 | 1.05 | 2.12 | 1.11 |
| Sight Words | 5.46 | 0.60 | 5.89 | 0.32 |
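
For readers who prefer a script to a spreadsheet, the sketch below shows one way to calculate the mean and standard deviation separately for each group. It is an illustration under stated assumptions: the scores are invented and are not the data behind the table above, and the helper name `mean_and_sd` is hypothetical. It uses the sample standard deviation (dividing by n - 1), which is also what Excel's *=stdev* function calculates.

```python
import math
import statistics


def mean_and_sd(scores):
    """Return the mean and the sample standard deviation (n - 1 denominator)."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum the squared distance of each score from the mean.
    squared_deviations = sum((x - mean) ** 2 for x in scores)
    sd = math.sqrt(squared_deviations / (n - 1))
    return mean, sd


# Hypothetical scores, invented for illustration only.
scores_by_group = {
    "Private": [2, 4, 4, 4, 5, 5, 7, 9],
    "Public": [1, 2, 3, 4, 5],
}

for group, scores in scores_by_group.items():
    mean, sd = mean_and_sd(scores)
    # Cross-check the hand calculation against the standard library.
    assert math.isclose(sd, statistics.stdev(scores))
    print(f"{group}: M = {mean:.2f}, SD = {sd:.2f}")
```

The same function can be applied to pre-test and post-test scores by storing each set of scores under its own key.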

When reporting frequencies, do not include any decimal places; only report whole numbers. When reporting percentages, means, and standard deviations, typically include two decimal places.
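
If the statistics are produced by a script, these reporting conventions can be applied when the numbers are printed. This is a small Python sketch; the helper names `format_frequency` and `format_statistic` are hypothetical, and the example values are for illustration only.

```python
def format_frequency(count):
    """Frequencies are counts, so report them as whole numbers."""
    return f"{count:d}"


def format_statistic(value):
    """Report percentages, means, and standard deviations to two decimal places."""
    return f"{value:.2f}"


print(format_frequency(80))
print(format_statistic(40.0) + "%")
print(format_statistic(3.4166))
```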

At this point, we **cannot** say that there is a significant difference between public and private school teachers on any of these variables. There will *always* be differences between scores for different groups of people. Inferential statistics are necessary to determine whether these differences are big enough to be considered significant. In other words, to determine if the differences between groups are large enough to say that there is any meaningful difference between the two, one must calculate an inferential statistic, described in the next chapter.

Once the descriptive statistics for the personal information and key variables have been calculated, it is time to answer any research questions. If a research study has research questions, then either a percentage or a mean will likely be calculated to answer each one. Refer to the Research Questions that were developed, and to the Methods of Data Analysis, to determine which statistic should be calculated. Then calculate the appropriate statistic to answer each research question separately.

Again, it is very important to be careful when calculating statistics to avoid careless errors. Incorrect calculations can lead a researcher to draw incorrect conclusions, making the study invalid. Therefore, check every calculation multiple times in order to maintain the highest ethical standards in research.

For a step-by-step example of a descriptive research study and how to calculate descriptive statistics, click here.


Copyright 2013, Katrina A. Korb, All Rights Reserved