Excel spreadsheets for descriptive statistics, probability and sampling theory, and statistical inference
Below are links to a variety of Excel spreadsheets that you can download and use. The collection was originally created by Bill Lepowsky in 2010.
You can describe bivariate categorical data with this spreadsheet; you can also use it to carry out the chi-square test for independence.
Click on the following link to see a Word document explaining how to create a table of bivariate categorical data from an Excel spreadsheet of raw data. It will take maybe 10 minutes to learn to do this. Once you know how, it takes less than 1 minute to create the table.
Univariate Numerical Data
To create a dotplot from a list of numerical data: create_a_doplot.xls
To create back-to-back dotplots from two lists of numerical data: back-to-back_dotplots.xls
To create a frequency table and spikeplot, given a list of numerical data: create_freq_table_and_spikeplot.xls
To create a histogram (either adjusted for density or not adjusted) from a frequency table of grouped numerical data: create_a_histogram_from_grouped_data.xls
To create both a frequency table and a histogram from a list of numerical data (Note: you create your own data groups by choosing your own class boundaries):
To create a boxplot and the 5-number summary (quartile values, etc.) from a list of numerical data: create_a_boxplot.xls
To generate complete descriptive statistics for 1 numerical variable: mean, median, standard deviation, frequency table, dotplot, spikeplot, histogram, boxplot, etc.:
Click on this link to see an excellent two-page explanation in Word (written by Laney math instructor Kathy Williamson) of how to use Excel to calculate the mean and SD of a list of numerical data:
Bivariate Numerical Data: Scatterplots, Correlation, and Regression
Enter a list of bivariate numerical data in order to see the scatterplot, SD line, regression line, correlation coefficient, etc. (Note: You can also compute and see residuals):
Given any desired value of the correlation coefficient r from +1 to -1, you can see what a scatterplot looks like that has that particular value of r; when you change the value of r, you can see how the scatterplot and the regression line change:
PROBABILITY AND SAMPLING THEORY
With the following spreadsheet, you choose the value of n (number of trials) and p (success probability), and then you can learn about each value of f (number of successes), as well as see the binomial spike diagram for f:
binomial-procedures-probabilities.xls (updated 11/22/2012)
You can see the binomial spike diagram for the frequency f, given various values of n and p; in particular, you can see how the spike diagram changes as n increases:
You can see the binomial spike diagram for the relative frequency p-hat, using various values of n and p; in particular, you can see how the spike diagram changes as n increases:
You can use the following spreadsheet to explore the destinations when many IID trials are performed:
DISTRIBUTION TABLES — z (standard normal) table, t table, chi-square, and F table:
Confidence Intervals and Hypothesis Testing
You can use this to find the p-value if you know the actual value of the test statistic:
You can use the following spreadsheet to see if a confidence interval based on one particular random sample contains p. Also, you can create many random samples and generate a confidence interval for each one: what percentage of those confidence intervals contain p?
Each of the following is for a particular context and can be used to find a confidence interval and carry out a hypothesis test:
Use the following spreadsheet to carry out a chi-square test for independence, given a table of bivariate numerical data:
Use this spreadsheet to carry out a chi-square test for goodness-of-fit:
Analysis of Variance
Use this spreadsheet for 1-way ANOVA: