# Statistics and Probability

1.0 | UNIVARIATE AND BIVARIATE DATA | ||

S1 | Exploring Univariate and Bivariate Data | ||

S1.1 | Given the mean, variance, and standard deviation of a data set, compute the mean, variance, and standard deviation if the data undergo a linear transformation. | ||

S1.2 | Compute and interpret the within-group and between-group variation when comparing two or more data sets. | ||

S1.3 | Compute residuals and use plots of residuals to assess the adequacy of a simple linear regression model. | ||

S1.4 | Identify influential points in a bivariate data set and predict and verify the effect of their removal on the least-squares line. | ||

S1.5 | When applicable, use logarithmic and power transformations to achieve linearity and use the transformed data to make predictions. | ||

S1.6 | Explore categorical data via contingency tables, computing and interpreting marginal, joint, and conditional relative frequencies and examining measures of association. | ||
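As an illustration of S1.1, the sketch below applies the usual rules for a linear transformation y = a + b·x: the mean becomes a + b·mean, the variance is multiplied by b², and the standard deviation by |b|. The function name `transform_summary` and the temperature data are hypothetical, chosen only for the example; population (not sample) variance is assumed.

```python
import statistics


def transform_summary(mean, variance, a, b):
    """Summary statistics of y = a + b*x, given the mean and variance of x."""
    new_mean = a + b * mean                # a shift and a scale both move the mean
    new_variance = b ** 2 * variance       # the shift a has no effect on spread
    new_sd = abs(b) * variance ** 0.5      # standard deviation scales by |b|
    return new_mean, new_variance, new_sd


# Hypothetical data: temperatures in Celsius, converted to Fahrenheit.
celsius = [10.0, 20.0, 30.0]
m = statistics.mean(celsius)
v = statistics.pvariance(celsius)  # population variance
print(transform_summary(m, v, a=32.0, b=1.8))
```

Recomputing the statistics directly from the converted values should agree with the function's output, which is a quick way to check the rules.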

2.0 | SAMPLING AND STUDY DESIGN | ||

S2 | Sampling and Study Design | ||

S2.1 | Describe the strengths and weaknesses of sampling methods, including simple random sampling, stratified random sampling, convenience sampling, voluntary response, and cluster sampling. Recognize potential difficulties in implementing each method. | ||

S2.2 | Critically assess the validity of conclusions drawn from surveys such as political polls, recognizing possible biases such as size bias and non-response bias and understanding the role of question formulation. | ||

S2.3 | Know and recognize in context the concepts of treatment group, control group, and experimental unit and demonstrate the importance of double-blind protocol, random assignment, experimental unit, and replication. | ||

S2.4 | Understand and describe how to implement completely randomized and randomized block designs (including matched-pair designs), recognizing when and how blocking can lower variability. | ||

3.0 | PROBABILITY MODELS | ||

S3 | Probability Models | ||

S3.1 | Know the subjective and relative frequency interpretations of probabilities, including an informal understanding of the law of large numbers. | ||

S3.2 | Use basic probability rules such as the addition rule, law of total probability, and complement rule to compute probabilities in a variety of models. | ||

S3.3 | Use Bayes' Theorem to solve conditional probability problems, with emphasis on the interpretation of results. | ||

S3.4 | Know the definition of random variable and be able to derive a discrete probability distribution based on the probability model of the original sample space. | ||

S3.5 | Compute the expected value and standard deviation of discrete random variables and know the effect of a linear transformation of a random variable on its mean and standard deviation. | ||

S3.6 | Apply standard discrete distributions, including the binomial, geometric, and hypergeometric. | ||

S3.7 | Know the definition of independence of two discrete random variables and use the joint distribution to determine whether two discrete random variables are independent. | ||

S3.8 | Use tables and technology to determine probabilities and percentiles of normal distributions. | ||

S3.9 | Use simulation methods to answer questions about probability models that are too complex for analytical treatment at this level, e.g., interacting particle system models. | ||
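One way to picture S3.2 and S3.3 together is a screening-test calculation: the law of total probability gives the overall chance of a positive result, and Bayes' Theorem turns that into the probability of the condition given a positive result. The function name `posterior` and the numbers below are assumptions made up for this sketch.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' Theorem."""
    # Law of total probability: P(positive) summed over both cases.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

With these illustrative inputs the posterior is only about 16%, which shows why the interpretation of the result matters as much as the computation.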

4.0 | SAMPLING DISTRIBUTIONS | ||

S4 | Sampling Distributions | ||

S4.1 | Given the mean and standard deviation of each random variable in a set of random variables, compute the mean of the sum and, assuming independence, compute the variance and standard deviation of the sum. | ||

S4.2 | Know an informal statement of the Central Limit Theorem and understand its relevance to sampling distributions. | ||

S4.3 | Assuming a normal model or the applicability of the Central Limit Theorem, compute probabilities for the sample mean, including probabilities that are needed to compute p-values. | ||

S4.4 | Apply the (large sample) distribution of the sample proportion to compute probabilities for the sample proportion and know and use rules of thumb for the applicability of the large sample distribution. | ||

S4.5 | Assuming a normal model or the applicability of the Central Limit Theorem, derive a P% confidence interval for the mean under the assumption that the population standard deviation is known. | ||

S4.6 | Compute control limits for commonly used control charts and use these to assess whether a process is out of control. | ||
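A minimal sketch of the interval in S4.5, assuming a normal model with known population standard deviation: the interval is the sample mean plus or minus z·σ/√n, where z is the standard normal quantile that leaves (1 − P)/2 in each tail. The function name `z_interval` and the inputs are hypothetical; `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    # Quantile with (1 - confidence) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical inputs: sample mean 50, sigma 10, n = 25, 95% confidence.
low, high = z_interval(50.0, 10.0, 25, 0.95)
print(round(low, 2), round(high, 2))
```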

5.0 | POINT AND INTERVAL ESTIMATION | ||

S5 | Point and Interval Estimation | ||

S5.1 | Compute bias, variance, and mean squared error of estimators of the mean and proportion. | ||

S5.2 | Know the logic of confidence intervals, the meaning of confidence level, and the effect of changing sample size, confidence level, and variability on the width of the interval. | ||

S5.3 | Compute and interpret confidence intervals for one mean and for the difference between two means (in both the paired and unpaired setting) when the standard deviation is unknown, using the t distribution. | ||

S5.4 | Compute and interpret (large sample) confidence intervals for one proportion and the difference between two proportions using the normal distribution. | ||

S5.5 | Compute the sample size required for a fixed confidence level and interval width for confidence intervals for means and proportions. | ||
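For S5.5, the sample size follows from solving the margin-of-error formula for n and rounding up. The sketch below assumes a known (or planning-value) standard deviation for the mean, and the conservative choice p = 0.5 for the proportion when no estimate is available. The function names are hypothetical, and `margin` here means half the interval width.

```python
import math
from statistics import NormalDist


def n_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)


def n_for_proportion(margin, confidence, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical planning values.
print(n_for_mean(sigma=15.0, margin=2.0, confidence=0.95))
print(n_for_proportion(margin=0.03, confidence=0.95))
```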

6.0 | SIGNIFICANCE TESTING | ||

S6 | Significance Testing | ||

S6.1 | Know the terminology and logic of significance testing, including null and alternative hypotheses, p-value, Type I and Type II errors, and power. | ||

S6.2 | Assuming a normal model and known standard deviation, carry out a significance test for a single mean, with emphasis on understanding the computation and interpretation of the p-value, and compute the power curve of a test. | ||

S6.3 | Carry out (large sample) significance tests for one proportion and the difference of two proportions, with emphasis on proper interpretation of results. | ||

S6.4 | Carry out significance tests for one mean and the difference of two means (paired and unpaired) using the t distribution, with emphasis on proper interpretation of results. | ||

S6.5 | Carry out chi-squared significance tests of homogeneity, independence, and goodness-of-fit, with emphasis on proper interpretation of results. | ||

S6.6 | Assuming a normal model and known standard deviation, compute the sample size necessary to achieve a pre-specified power at a pre-specified value of the population mean. | ||

S6.7 | Demonstrate, in the context of specific studies, the understanding that a result can be statistically significant while of insignificant practical importance and that a failure to reject a null hypothesis may be due to low power and does not necessarily imply the null hypothesis is true. | ||
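The power and sample-size calculations in S6.2 and S6.6 can be sketched for one common case: a one-sided (upper-tail) z-test of a mean with known σ. That choice of test, the function names, and the numbers are assumptions for illustration; a two-sided test would use a different critical value.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(mu0, mu1, sigma, n, alpha):
    """Power of the upper-tail z-test of H0: mu = mu0 when the true mean is mu1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) * n ** 0.5 / sigma
    # Reject when the z statistic exceeds z_alpha; under mu1 it is N(shift, 1).
    return 1 - STD_NORMAL.cdf(z_alpha - shift)


def n_for_power(mu0, mu1, sigma, alpha, target_power):
    """Smallest n giving at least target_power at the alternative mu1 > mu0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(target_power)
    return math.ceil(((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2)


# Hypothetical study: detect a shift from 100 to 105 with sigma = 10.
n = n_for_power(100.0, 105.0, 10.0, alpha=0.05, target_power=0.80)
print(n, round(power(100.0, 105.0, 10.0, n, 0.05), 3))
```

Evaluating `power` over a range of `mu1` values traces out the power curve mentioned in S6.2.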

7.0 | INFERENCE FOR REGRESSION | ||

S7 | Inference for Regression | ||

S7.1 | Know the statistical model for regression, including linearity, normality of errors, and constancy of error variance. | ||

S7.2 | Compute and interpret a confidence interval for the slope of a regression line using the t distribution. | ||

S7.3 | Test hypotheses about the slope of a regression line, with emphasis on interpretation of results. | ||

8.0 | ASSESSING ASSUMPTIONS OF STATISTICAL MODELS | ||

S8 | Assessing Assumptions of Statistical Models | ||

S8.1 | Demonstrate knowledge of the assumptions required for all of the inferential procedures (confidence intervals and significance tests). | ||

S8.2 | In the context of specific studies, recognize aspects of study design that either support or offer evidence against required assumptions. | ||

S8.3 | Demonstrate knowledge of the possible effects of incorrect assumptions (i.e., improperly specified models) on inferential procedures and of the robustness of inferential procedures to departures from specified assumptions. | ||

S8.4 | Show in context an understanding that statistical models are approximations to reality and that care should be exercised in assigning too much precision to measures such as confidence levels or p-values. |