Summary Statistics Excel Calculator & Charts
What This Excel File Does
I have designed this Excel calculator to get a quick overview of a distribution (when I realized that I’ve been calculating the same things over and over again). The spreadsheet has 2 main purposes:
- Calculate summary statistics for a data set (that you enter, see Entering Your Data below)
- Draw charts like frequency histograms or comparison of a distribution to normal distribution
Summary Statistics Calculated
Basic explanation of individual measures and links to more detailed information about their calculation and interpretation are here: Summary Statistics.
The following are some of the charts you can create with this file (these are pre-set, but with a little editing you can of course create other charts as well). You can display all charts in base (input data) units or in multiples of standard deviation.
The classical frequency histogram. You can of course set different interval size and X axis scale (see instructions below on this page). The red and green bars represent the sums of left tail and right tail outliers, in case some observations fall outside the X axis scale you have set.
The two charts above compare your data set to normal distribution, either by plotting them both in one chart (the first chart), or by plotting the difference (the second chart).
This chart displays the ratio of your data set frequency and normal distribution frequency. It is useful especially for comparing frequency on the tails, which may not be well visible on other charts.
Entering Your Data
Simply paste your data into column B in the sheet “Data”, starting from cell B17 and going down.
The order of the data does not matter for the calculations. Detailed instructions are available in the sheet “Notes”.
The data entered by default (as you download this file) is S&P500 index day-to-day percentage changes from 4.1.1980 to 2.3.2012.
Population vs. Sample
Some of the summary statistics (variance, standard deviation, skewness, and kurtosis) are slightly different depending on whether your data set is a sample or the whole population. You can set this in the combo in cell G2 in the sheet “Data”.
The difference is the more significant, the smaller the data set size (N). If you are not familiar with the difference between sample and population, you can see it explained here: Population vs. Sample Variance and Standard Deviation.
In the sheets “Charts” and “Charts 2″ you can see more details about the distribution, its comparison to normal distribution, and the charts.
You can select units to be used in the intervals and charts in the combo in cell D2:
- Either the same units as in your input data, e.g. dollars or percentage points (“Same as Input Data Units”)
- … or multiples of standard deviation (“Standard Deviation Multiples”)
By default, chart X axis minimum and increment are calculated automatically based on your input data range. You can override the automatic scale in cells D6 and D7.
Built-in Excel Functions vs. Manual Calculations
Wherever possible, the summary statistics in the sheet “Data” are calculated using the built-in Excel functions such as STDEV.P, STDEV.S, SKEW, or KURT (adjusted to population or sample accordingly). This is to minimize the file’s size and improve speed of calculations.
Alternatively, you can calculate them manually. If you want to see the exact formulas behind these built-in Excel functions, they are available in the second file (Summary_Statistics_Manual.xlsx).