**Excel spreadsheets for descriptive statistics, probability and sampling theory, and statistical inference**

Below are links to a variety of Excel spreadsheets that you can download and use. The collection was originally created by Bill Lepowsky in 2010.

**DESCRIPTIVE STATISTICS**

**Categorical Data**

You can describe bivariate categorical data with this spreadsheet; you can also use it to carry out the chi-square test for independence.

**bivariate_categorical_data_and_chi-square_test_for_independence.xls**

Click on the following link to see a Word document explaining *how to create a table of bivariate categorical data *from an Excel spreadsheet of raw data. It will take maybe 10 minutes to learn to do this. Once you know how, it takes less than 1 minute to create the table.

**using-excel-to-create-a-table-of-bivariate-categorical-data1.doc**

**Univariate Numerical Data**

To create a dotplot from a list of numerical data: **create_a_doplot.xls**

To create back-to-back dotplots from two lists of numerical data: **back-to-back_dotplots.xls**

To create a frequency table and spikeplot, given a list of numerical data: **create_freq_table_and_spikeplot.xls**

To create a histogram (either adjusted for density or not adjusted) from a frequency table of grouped numerical data: **create_a_histogram_from_grouped_data.xls**

To create both a frequency table and a histogram from a list of numerical data (Note: you create your own data groups by choosing your own class boundaries):

**create_histogram_from_data_list.xls**

To create a boxplot and the 5-number summary (quartile values, etc.) from a list of numerical data: **create_a_boxplot.xls**

To generate complete descriptive statistics for 1 numerical variable: mean, median, standard deviation, frequency table, dotplot, spikeplot, histogram, boxplot, etc.:

**1-numerical-variable-all-plots-up-to-n-1000.xls**

Click on this link to see an excellent two-page explanation in Word (written by Laney math instructor Kathy Williamson) of how to use Excel to calculate the mean and SD of a list of numerical data:

**how-to-use-excel-to-calculate-mean-and-sd-kathy-williamson.doc**

**Bivariate Numerical Data: Scatterplots, Correlation, and Regression**

Enter a list of bivariate numerical data in order to see the scatterplot, SD line, regression line, correlation coefficient, etc. (Note: You can also compute and see residuals):

**correlation-and-regression-up-to-1000-pairs.xls**

Given any desired value of the correlation coefficient r from +1 to -1, you can see what a scatterplot looks like that has that particular value of r; when you change the value of r, you can see how the scatterplot and the regression line change:

**creating_scatterplot_with_desired_r-v2.01.xls
**

(with an accompanying technical note **regression-technical-note-v.1.04.pdf** or **regression-technical-note-v.1.04.doc**)

**PROBABILITY AND SAMPLING THEORY**

**Binomial Distributions**

With the following spreadsheet, you choose the value of n (number of trials) and p (success probability), and then you can learn about each value of f (number of successes), as well as see the binomial spike diagram for f:

**binomial-procedures-probabilities.xls** (updated 11/22/2012)

You can see the binomial spike diagram for the *frequency *f, given various values of n and p; in particular, you can see how the spike diagram changes as n increases:

You can see the binomial spike diagram for the *relative frequency *p-hat, using various values of n and p; in particular, you can see how the spike diagram changes as n increases:

**binomial_spike_plots_–_phat.xls**

You can use the following spreadsheet to explore the destinations when many IID trials are performed:

**destinations-at-bottom-of-triangle-grid-with-graphs-2-19-13.xls**

**DISTRIBUTION TABLES — z (standard normal) table, t table, chi-square, and F table:**

**z, t, chi-square and F tables.xls**

**STATISTICAL INFERENCE**

**Confidence Intervals and Hypothesis Testing**

You can use this to find the p-value if you know the actual value of the test statistic:

**find p-value, given z, t, chi-square or F.xls**

You can use the following spreadsheet to see if a confidence interval based on one particular random sample contains p. Also, you can create many random samples and generate a confidence interval for each one: what percentage of those confidence intervals contain p?

**confidence-intervals-do-they-contain-p-2-15-134.xls**

Each of the following is for a particular context and can be used to find a confidence interval and carry out a hypothesis test:

**1_population,_1_dichotomous_variable.xls**

**2_populations,_1_dichotomous_variable.xls**

**1_population,_1_numerical_variable,_large_sample_case.xls**

**2_populations,_1_numerical_variable,_large_sample_case.xls**

**1_population,_1_numerical_variable,_small_sample_case.xls**

**2_populations,_1_numerical_variable,_small_sample_case.xls**

**Chi-Square Tests**

Use the following spreadsheet to carry out a chi-square test for independence, given a table of bivariate numerical data:

**bivariate_categorical_data_and_chi-square_test_for_independence.xls**

Use this spreadsheet to carry out a chi-square test for goodness-of-fit:

**chi-square_test_for_goodness-of-fit.xls**

**Analysis of Variance**

Use this spreadsheet for 1-way ANOVA: