STATISTICS

Univariate statistics

Looks at the variability of one item. Analysis of a single variable for the purpose of description.

Bivariate statistics

Looks at how 2 variables vary together. The analysis of 2 variables simultaneously for the purpose of determining the empirical relationship between them. This type of analysis is primarily aimed at prediction.

Multivariate statistics

Multivariate statistics provide for analysis where there are many independent variables (IVs) and dependent variables (DVs) that are correlated with each other to varying degrees. They help you to understand complex relationships among variables. The term typically refers to analyses with 2 or more DVs.

Inferential statistics

Tells us how much confidence we have when we generalize from a sample to a population. Infer from a sample to a population.

Parametric statistics

A group of statistical techniques that make strong assumptions about the distribution of the outcome variable (e.g., that it is normally distributed). In short, if we have a basic knowledge of the underlying distribution of a variable, then we can make predictions about how, in repeated samples of equal size, a particular statistic will "behave," that is, how it is distributed. Assumptions: normal distribution, equal variances, interval level of measurement for the DV, independent observations.

Nonparametric statistics

A group of statistical techniques that don't make strong assumptions about the distribution of the outcome variable. Specifically, nonparametric methods were developed to be used in cases when the researcher knows nothing about the parameters of the variable of interest in the population (hence the name nonparametric). In more technical terms, nonparametric methods do not rely on the estimation of parameters (such as the mean or the standard deviation) describing the distribution of the variable of interest in the population. Therefore, these methods are also sometimes (and more appropriately) called parameter-free methods or distribution-free methods. When the parametric assumptions are met, these tests generally have less statistical power than the corresponding parametric tests and are therefore more likely to make Type II errors. In general, these tests fall into the following categories:

Tests of differences between groups (independent samples);

Tests of differences between variables (dependent samples);

Tests of relationships between variables.

Assumptions: independent observations, nominal or ordinal level of measurement. These tests are typically chosen when the sample size is small (a common rule of thumb is less than 30) or the distribution is skewed.

STATISTICAL HYPOTHESIS TESTING

Steps of statistical hypothesis testing

State the research problem and nature of the data
State null and alternative hypotheses
Choose the level of significance (alpha level)
Select the test statistic
Determine critical value needed for statistical significance
State a decision rule for rejecting the null
Compute the test statistic
Compare the test statistic to the decision rule and make a decision
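
The steps above can be sketched in code. This is a minimal illustration only, assuming a two-sided one-sample z-test with a known population standard deviation; the function name and all the numbers are hypothetical and do not come from the notes.

```python
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test (assumes the population SD is known)."""
    # Compute the test statistic
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Critical value needed for significance at the chosen alpha level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Decision rule: reject the null if |z| exceeds the critical value
    reject = abs(z) > z_crit
    return z, z_crit, reject

# Hypothetical data: null mean 100, SD 15, sample of 36 with mean 105
z, z_crit, reject = one_sample_z_test(sample_mean=105.0, mu0=100.0, sigma=15.0, n=36)
print(round(z, 2), round(z_crit, 2), reject)  # 2.0 1.96 True
```

Here the test statistic (2.0) exceeds the critical value (about 1.96), so the null hypothesis is rejected at the .05 level.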

Null hypothesis

Also called the "hypothesis of chance". The null hypothesis states that there is no difference or relationship, that is, that the observations are purely the result of chance. The null hypothesis is what statistical procedures test: the purpose of most statistical tests is to determine whether the obtained results provide a reason to reject it.

Alternative hypothesis

Also known as the "competing hypothesis". This states that the results are not due to chance. It says that there is a real effect, that the observations are the result of this real effect, plus chance variation.

Type I error

A type I error occurs if, based on the sample data, we decide to reject the null hypothesis when in fact the null hypothesis is true. This is like having a fire alarm without a fire (detecting an effect which is not there). Reducing the chances of making this type of error may increase the chance of a Type II error.

Type II error

A type II error occurs if, based on the sample data, we decide not to reject the null hypothesis when in fact the null hypothesis is false. This is like having a fire without an alarm (having an effect but not detecting it). Reducing the chances of making this kind of error can increase the chance of making a Type I error.

Statistical significance

A result is described as "statistically significant" when it can be demonstrated that the probability of obtaining such a difference by chance alone is relatively low (conventionally, lower than the chosen alpha level).

Test statistic

This is the statistic that will assess the evidence against the null hypothesis.

P-value

The p-value is the probability of obtaining a result at least as extreme as the one observed in the sample if the null hypothesis were true. The smaller the p-value, the stronger the evidence against the null hypothesis; the higher the p-value, the less we can believe that the observed relation between variables in the sample is a reliable indicator of the relation between the respective variables in the population. For example, a p-value of .05 (i.e., 1/20) indicates that, if there were truly no relation in the population, a relation at least as strong as the one found in our sample would arise by chance in about 5% of samples. It does not mean that there is a 5% probability that the finding is a "fluke." The p-value is compared with the alpha level to decide whether to reject the null hypothesis.

Alpha

The Type I error rate. The probability of rejecting the null hypothesis when it is true.

Beta

The Type II error rate. Beta represents the probability of failing to reject the hypothesis tested when that hypothesis is false and a specific alternative hypothesis is true. For a given test, the value of beta is determined by the previously chosen value of alpha, certain features of the statistic that is being calculated (particularly the sample size), and the specific alternative hypothesis that is being entertained. Common target values are 0.1 or 0.2.

Power

The power of a test refers to the probability of detecting an effect when the effect truly does exist. To calculate the power of a given test it is necessary to specify alpha (the probability that the test will lead to the rejection of the hypothesis tested when that hypothesis is true) and to specify a specific alternative hypothesis. Power is affected by sample size, effect size, and the alpha level set by the researcher. The power is equal to 1-beta. Statistical power increases as sample size increases because larger samples reduce sampling error: the larger the sample size, the better the odds that the sample is representative of the population.
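
The relationship between power, beta, and sample size can be illustrated with a small calculation. This is a sketch under stated assumptions: a one-sided (upper-tail) one-sample z-test with known standard deviation; the function name and example values are hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided, upper-tail one-sample z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # rejection cutoff under the null
    shift = effect / (sigma / n ** 0.5)         # true effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)       # probability of a Type II error
    return 1 - beta

# Same hypothetical effect (5 points, SD 15) with two different sample sizes
power_small = z_test_power(effect=5.0, sigma=15.0, n=20)
power_large = z_test_power(effect=5.0, sigma=15.0, n=80)
print(power_small < power_large)  # True: the larger sample gives more power
```

When the effect is zero, the same function returns alpha, which matches the definition of the Type I error rate.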

Effect size

The magnitude of a finding. One common family of effect sizes expresses it as the proportion (or %) of variance in the DV which is explained by the IV. These are expressed as a number between 0 and 1, with larger numbers representing larger effects. For example, in ANOVA, you can use the eta-squared statistic to gauge the effect size. Another common measure, Cohen's d, expresses the difference between two means in standard deviation units; by Cohen's conventions, values around .20 are considered "small", around .50 "medium", and .80 and above "large".

MEASURES OF CENTRAL TENDENCY

Mean

A measure of central tendency (the center of the data). The mean is the arithmetic average of the scores in the population. Numerically, it equals the sum of the scores divided by the number of scores.

Median

A measure of central tendency (the center of the data). The median of a population is the point that divides the distribution of scores in half. Numerically, half of the scores in a population will have values that are equal to or larger than the median and half will have values that are equal to or smaller than the median.

Mode

A measure of central tendency (the center of the data). It is the score in the population that occurs most frequently.

MEASURES OF DISPERSION

Measures of spread; how far the data tend to range from the center.

Range

A measure of dispersion. The range is the difference between the highest and lowest score. Numerically, the range equals the highest score minus the lowest score.

Interquartile range

Divides the data into 4 equal groups and sees how far apart the extreme groups are. To do this you put the data in numerical order; divide them into 2 equal high and low groups at the median; find the median of the low group, which is called the first quartile; find the median of the high group, which is the third quartile. The interquartile range is the difference between the first and third quartiles.
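
The procedure just described can be sketched as follows. The function name is hypothetical, and one detail is an assumption not stated in the notes: when the number of scores is odd, the middle score is left out of both halves (other quartile conventions exist and give slightly different values).

```python
import statistics

def interquartile_range(data):
    """Return (first quartile, third quartile, IQR) by the median-of-halves method."""
    if len(data) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(data)                      # put the data in numerical order
    half = len(ordered) // 2
    low_group = ordered[:half]                  # scores below the median
    high_group = ordered[len(ordered) - half:]  # scores above the median
    q1 = statistics.median(low_group)           # first quartile
    q3 = statistics.median(high_group)          # third quartile
    return q1, q3, q3 - q1

q1, q3, iqr = interquartile_range([1, 3, 5, 7, 9, 11, 13, 15])
print(q1, q3, iqr)  # 4.0 12.0 8.0
```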

Variance

A measure of dispersion. The variance is a statistical measure of the variation, dispersion, or scatter of a set of values around their mean. Numerically, it is the average of the squared deviations from the mean. Explains the variation in the data.

Standard deviation

A measure of dispersion. Measures the spread from the mean. It is the typical or "standard" amount by which scores deviate from their mean. Loosely, it can be thought of as the average distance from the data to the mean. To calculate the standard deviation of a population it is first necessary to calculate that population's variance. Numerically, the standard deviation is the square root of the variance. Unlike the variance, which is a somewhat abstract measure of variability (it is in squared units), the standard deviation is easier to conceptualize because it is in the same units as the data.
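
As a small worked illustration of the measures of central tendency and dispersion defined above, the following sketch uses Python's standard `statistics` module on a made-up set of scores, treated here as a whole population (so the population formulas are used).

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population of 8 scores

mean = statistics.mean(scores)             # sum of scores / number of scores
median = statistics.median(scores)         # point that splits the scores in half
mode = statistics.mode(scores)             # most frequent score
value_range = max(scores) - min(scores)    # highest minus lowest score
variance = statistics.pvariance(scores)    # average squared deviation from the mean
std_dev = statistics.pstdev(scores)        # square root of the variance

print(mean, median, mode, value_range, variance, std_dev)
```

For these scores the mean is 5, the median 4.5, the mode 4, the range 7, the variance 4, and the standard deviation 2.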

MISCELLANEOUS

The normal curve

The normal curve is bell-shaped and symmetrical and has certain mathematical properties. Nearly all (about 99.7%) of a normal distribution falls within 6 standard deviations (3 on each side of the mean). The mean, median, and mode all occur at the same point (at the center and highest point of the curve). It is the basis for inferential statistics and hypothesis testing.

Measurement bias

A systematic distortion which can affect the quality of data collected. It can result in inaccurate findings and inaccurate conclusions drawn from those findings.

Alternative explanations

These include measurement bias, rival hypotheses, and chance. Research designs help eliminate bias and rival hypos, while statistics help eliminate chance explanations.

Central limit Theorem

The Central Limit Theorem is a statement about the characteristics of the sampling distribution of means of random samples from a given population. The Central Limit Theorem consists of three statements:

[1] The mean of the sampling distribution of means is equal to the mean of the population from which the samples were drawn.

[2] The variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples.

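[3] If the original population is distributed normally (i.e. it is bell shaped), the sampling distribution of means will also be normal. If the population is not normal, the sampling distribution of means still approaches normality as the sample size increases.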
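
Statements [1] and [2] can be checked informally with a simulation. This is only a sketch: the uniform population, the sample size of 25, the number of repeated samples, and the fixed random seed are all arbitrary choices made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# A deliberately non-normal (uniform) population
population = [rng.uniform(0, 10) for _ in range(10_000)]
sample_size = 25

# Draw many random samples (with replacement) and record each sample's mean
sample_means = [
    statistics.fmean(rng.choices(population, k=sample_size))
    for _ in range(2_000)
]

pop_mean = statistics.fmean(population)
pop_var = statistics.pvariance(population)
means_mean = statistics.fmean(sample_means)     # compare with pop_mean
means_var = statistics.pvariance(sample_means)  # compare with pop_var / sample_size

print(pop_mean, means_mean)
print(pop_var / sample_size, means_var)
```

The two numbers on each printed line should be close to each other, though not identical, because the simulation itself is subject to sampling error.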

Sum of Squares

The sum of squared differences of data values from their mean.

Z-score

The z score for an item indicates how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation. We can use z scores when we want to compare measurements that come from two different distributions, converting each of them to a standard z score.
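
A minimal sketch of the z-score calculation, using two hypothetical tests with different means and standard deviations (the names and numbers are invented for illustration):

```python
def z_score(value, mean, std_dev):
    """Distance of value from the mean, in standard deviation units."""
    return (value - mean) / std_dev

# Hypothetical: 85 on a test with mean 75 and SD 10,
# versus 620 on a test with mean 500 and SD 100
math_z = z_score(85, mean=75, std_dev=10)
verbal_z = z_score(620, mean=500, std_dev=100)
print(math_z, verbal_z)  # 1.0 1.2
```

Although the raw scores are on different scales, the z scores show that the second score is further above its own mean.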

F-statistic

The ratio of two mean squares (i.e. estimates of a population variance, based on the information in two or more random samples). When employed in the procedure entitled ANOVA, the obtained value of F provides a test for the statistical significance of the observed differences among the means of two or more random samples.

Degrees of freedom

The number of values in the final calculation of a statistic that are free to vary. The number of independent pieces of information contained in the data set that are used for computing a given summary measure or statistic (like the mean). For example, if you have 1 df, you have one independent piece of information.

Independent observations

A subject’s scores on a DV are not influenced by the other subjects in the group.

Randomization

The process of randomly assigning study units between the study treatments.

Confounding

In estimating the effect of a factor 'A' on a response, confounding is the distortion of this effect by a second factor 'B' that is associated both with 'A' and with the response.

Ecological fallacy

Falsely drawing conclusions about individuals based on the observation of groups.

Reductionism

A strict limitation of the kinds of concepts to be considered relevant to a phenomenon.

NONPARAMETRIC STATISTICS

Nonparametric statistics

Nonparametric tests- differences between independent groups

Differences between independent groups. Usually, when we have two samples that we want to compare concerning their mean value for some variable of interest, we would use the t-test for independent samples; nonparametric alternatives for this test are the Mann-Whitney U test and the Kolmogorov-Smirnov two-sample test. If we have multiple groups, we would use analysis of variance (ANOVA); the nonparametric equivalents to this method are the Kruskal-Wallis analysis of ranks and the Median test.

Nonparametric tests- differences between dependent groups

Differences between dependent groups. If we want to compare two variables measured in the same sample, we would customarily use the t-test for dependent samples (for example, if we wanted to compare students' math skills at the beginning of the semester with their skills at the end of the semester). Nonparametric alternatives to this test are the Sign test and Wilcoxon's matched pairs test. If the variables of interest are dichotomous in nature (i.e., "pass" vs. "no pass") then McNemar's Chi-square test is appropriate.

Nonparametric tests- relationship between 2 variables

To express a relationship between two variables one usually computes the correlation coefficient. Nonparametric equivalents to the standard correlation coefficient are Spearman R, Kendall Tau, and coefficient Gamma. If the two variables of interest are categorical in nature (e.g., "passed" vs. "failed" by "male" vs. "female"), appropriate nonparametric statistics for testing the relationship between the two variables are the Chi-square test, the Phi coefficient, and the Fisher exact test. In addition, a simultaneous test for relationships between multiple cases is available: Kendall coefficient of concordance. This test is often used for expressing inter-rater agreement among independent judges who are rating (ranking) the same stimuli.

Chi-Square measure of association

Is a nonparametric statistical test. The Pearson Chi-square is the most common test for significance of the relationship between categorical/nominal variables. This measure is based on the fact that we can compute the expected frequencies in a two-way table (i.e., the frequencies that we would expect if there were no relationship between the variables). You first look at the X² to see if there is a statistically significant association beyond chance. If significant, you then look at the strength of the association: Phi for a 2X2 table and Cramer’s V for larger tables (2X3, 3X3, etc.). A "moderate" magnitude for Phi and Cramer’s V is .26 to .40. Over .50 is considered "strong". # of variables: 2. Level of measurement: nominal scale. Assumptions: independent observations, all expected frequencies are at least 5.
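
The expected-frequency idea can be made concrete with a small sketch for a 2X2 table. The function name and the counts are hypothetical, and no continuity correction is applied (that is a simplifying assumption).

```python
def chi_square_2x2(table):
    """Pearson chi-square and Phi for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    phi = (chi2 / n) ** 0.5   # strength of association for a 2x2 table
    return chi2, phi

# Hypothetical counts: rows = group, columns = passed / failed
chi2, phi = chi_square_2x2([[30, 10], [20, 40]])
print(round(chi2, 2), round(phi, 2))  # 16.67 0.41
```

With 1 degree of freedom the .05 critical value is about 3.84, so this made-up table would show a significant association, with a Phi of roughly .41.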

Chi-Square goodness of fit test

Is a nonparametric statistical test. Assesses whether observed frequency counts fit some pre-existing "model distribution" or deviate reliably from that model.

Wilcoxon signed rank test for two correlated samples

A non-parametric equivalent of the paired t-test. Is used for matched samples. The null hypo states that the population distributions corresponding to the two types of observations are identical, while the alternative hypo states that they are different. Each subject is measured on the variable at two different times. The difference between each pair of scores is computed, the differences are ranked by their absolute size, and each rank is given the sign of its difference. Assumptions: the pairs are randomly and independently selected, variables at least at ordinal level.

Mann-Whitney U-test

This is the nonparametric equivalent to the independent samples t-test and tests the difference between the two population distributions. Deals with ranks of observations, based on the sum of ranks for each group. If one group has a larger sum of ranks than the other, we suspect that the two samples did not come from the same distribution. The null hypo states that the populations from which the two samples were drawn were identical, while the alternative hypo states that they are not identical. If the null hypo is rejected, then it is concluded that the two population distributions are not identical, but differ somehow. Assumptions: two groups are randomly and independently selected, DV is at least at ordinal level, no tied ranks.
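
The rank-sum idea can be sketched as follows. This computes only the U statistics, not a p-value. The function name is hypothetical, and giving tied scores the average of their ranks is a common convention adopted here as an assumption, since the entry above assumes no ties.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) from the pooled ranks of two independent groups."""
    combined = sorted(list(group_a) + list(group_b))
    # Assign each distinct value the average of the rank positions it occupies
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    rank_sum_a = sum(ranks[x] for x in group_a)
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Hypothetical scores with no overlap between the groups
u_a, u_b = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_a, u_b)  # 0.0 9.0
```

A U of 0 for one group means every score in that group ranks below every score in the other, the most extreme separation possible.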

Kruskal-Wallis test for more than two independent samples

A non-parametric equivalent of one-way analysis of variance. This is used for 3 or more non-related groups. Tests the differences between more than two population distributions. Is similar to the Mann-Whitney, but involves more than 2 groups. The null hypo states that the several samples have identical population distributions, while the alternative hypo says they do not. All scores for the several groups are put together in ascending order and assigned ranks. The ranks are then totaled within each group. The null hypo is rejected when the rank totals differ between the groups by more than would be expected by chance, suggesting that there are actual differences in the populations. Assumptions: groups are randomly and independently selected, DV is at least at ordinal level, at least 5 cases or subjects per group.

Spearman's rank-order coefficient of correlation

A nonparametric alternative to correlation. Used to correlate ordinal level data. It is a coefficient which is applied to the ranks of pairs of scores. The data appear as matched pairs of scores; each score is replaced by its rank, and the ranks within each pair are subtracted. Assumptions: data should be at least at ordinal level, and the relationship between the variables should be monotonic.
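
One way to sketch the calculation is to rank each variable and then correlate the ranks. The function names are hypothetical, and tied scores are given average ranks, which is an assumption rather than something stated in these notes.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's coefficient: the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical paired scores
rho = spearman_rho([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(round(rho, 2))  # 0.8
```

Because only ranks are used, a perfectly monotonic but curved relationship (for example x against x squared, for positive x) still gives a coefficient of 1.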

MEASURES OF RELATIONSHIP

Chi-Square measure of association

Correlation

Correlation is a measure of the relation between two or more variables and describes the strength or degree of a linear relationship. Involves strength and direction. Produces correlation coefficients (r) which can range from -1.00 to +1.00. Correlation lets us specify to what extent the two variables behave alike or vary together. Variables should be at least on the interval level of measurement. Spearman's rank and Kendall's tau are the nonparametric alternatives to correlation. Correlation is not causation!

Correlation coefficient

The correlation coefficient (r) represents the linear relationship between two variables. It provides an index of the degree to which the paired measures co-vary in a linear fashion. An r of .70 and above is considered a high correlation. If the correlation coefficient is squared, the resulting value (r², the coefficient of determination) represents the proportion of common variation in the two variables. This is important in judging the magnitude of the correlation. Assumptions: data must be at interval level, the pattern of relationship must be linear, data must be homoscedastic (equal variance in Y across X; this can be checked by running a scattergram and seeing the data in a cigar shape).
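
A minimal sketch of how r and r² are computed from paired scores, using made-up data and a hypothetical function name:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [1, 2, 3, 4, 5]       # hypothetical paired data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2        # proportion of common variation
print(round(r, 3), round(r_squared, 2))  # 0.775 0.6
```

Here r is about .77, so roughly 60% of the variation is shared between the two variables.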

Linear Regression

A regression analysis which involves only one predictor is called Simple Linear Regression Analysis. Linear regression is used to make predictions about a single value and uses r to predict future outcomes. Simple linear regression involves discovering the equation for a line that most nearly fits the given data. That linear equation is then used to predict values for the data. The regression line is one which shows the best fit that relates y to x. Basically, you build the regression model, evaluate the regression model, and use the regression model to partition variance (breaks the Y variance into two parts: a proportion that is predictable from X and a proportion that is not explained or accounted for by X). Assumptions: data are in the form of pairs of scores, there is a correlation between X and Y variables, data is at interval level. (not completely sure about these)
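
The fitting, prediction, and variance-partitioning steps can be sketched with ordinary least squares. The data and the function name are hypothetical; this is an illustration, not a full regression procedure (no significance tests or assumption checks).

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [1, 2, 3, 4, 5]       # hypothetical paired data
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]

# Partition the Y variance into explained and unexplained parts
mean_y = sum(y) / len(y)
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_explained = sum((pi - mean_y) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

print(round(slope, 2), round(intercept, 2))   # 0.6 2.2
print(round(ss_explained / ss_total, 2))      # 0.6
```

The explained and residual sums of squares add up to the total, and the explained proportion (.60) is the r² for these data.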

MEASURES OF MEAN DIFFERENCES BETWEEN GROUPS

T-Test

A parametric statistical test. The t-test is used to evaluate the differences in means between two groups. Theoretically, the t-test can be used even if the sample sizes are very small, as long as the variables are normally distributed within each group and there is equality of variances. # of variables: 2. Level of measurement: 1 nominal (2 categories) & 1 interval.

Independent samples T-test

A parametric statistical test. In order to perform the t-test for independent samples, one independent (grouping) variable (e.g., Gender: male/female) and at least one dependent variable (e.g., a test score) are required. The means of the dependent variable will be compared between selected groups based on the specified values (e.g., male and female) of the independent variable. Assumptions: independent observations, DV is at interval or ratio level of measurement, DV is normally distributed, equal variances.
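
A sketch of the test statistic for the independent samples t-test, assuming equal variances (the pooled-variance form). The function name and the scores are hypothetical, and the sketch stops at the t value and degrees of freedom rather than computing a p-value.

```python
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)    # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, df

# Hypothetical test scores for two groups of 5
t, df = independent_t([4, 5, 6, 7, 8], [1, 2, 3, 4, 5])
print(t, df)  # 3.0 8
```

The obtained t would then be compared with the critical t for 8 degrees of freedom at the chosen alpha level.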

Mann-Whitney U-test

This is the nonparametric equivalent to the independent samples t-test and tests whether two independent groups have been drawn from the same population.

Paired/Dependent samples T-test

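A parametric statistical test. Two groups of observations (that are to be compared) are based on the same sample of subjects who were tested twice (e.g., before and after a treatment). The test evaluates whether the mean of the paired differences is zero. Assumptions: the pairs are independent of one another, the DV is at interval or ratio level, and the differences between paired scores are normally distributed. Equal means is not an assumption; it is the null hypothesis being tested.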

Wilcoxon signed rank test

A non-parametric equivalent of the paired sample t-test, for testing whether two populations have the same distribution.

ANOVA

A parametric statistical test. In general, the purpose of analysis of variance (ANOVA) is to test for significant differences between means of 3 or more groups. This procedure employs the statistic (F) to test the statistical significance of the differences among the obtained means of two or more random samples from a given population. The Kruskal-Wallis is the nonparametric equivalent to the 1-way ANOVA. # of variables: 2. Level of measurement: 1 nominal IV (grouping variable, 3 or more groups) and 1 interval or ratio DV. Assumptions: independent observations, the DV is interval or ratio scale, DV normally distributed, equal variances.
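
The F statistic for a one-way ANOVA is the ratio of the between-groups mean square to the within-groups mean square. A minimal sketch with hypothetical data and a hypothetical function name (no p-value is computed):

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Three hypothetical groups of 3 scores each
f, df_between, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f, 2), df_between, df_within)  # 13.0 2 6
```

A large F means the group means differ by more than the within-group scatter would lead us to expect.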

Kruskal-Wallis

A non-parametric equivalent of one-way analysis of variance. This is used for 3 or more non-related groups.

2-way ANOVA

A type of elaboration. 2 nominal IVs and 1 interval or ratio DV. Looks at main effects of each IV on the DV and interaction effects of the IVs combined on the DV.

MANOVA

A parametric and multivariate statistical test. MANOVA is used to assess the statistical significance of the effect of one or more IVs on a set of two or more DVs. It is different from ANOVA because ANOVA uses only 1 DV. Use a MANOVA instead of conducting multiple ANOVAs to control for Type I errors. With MANOVA, you can see if mean scores among groups are significantly different.

REGRESSION

Regression

Regression is a class of statistical methods in which 1 dependent variable is related to 1 or more independent variables. Regression is used to make predictions of values. These predictions are made possible by knowing something about the values predicted. In other words, based on existing data values, predictions are made about other, similar values.

Linear Regression

Multiple Regression

The general purpose of multiple regression is to learn more about the relationship between several independent/predictor variables and 1 dependent/criterion variable (and finding an equation that satisfies that relationship). In general, multiple regression allows the researcher to ask (and hopefully answer) the general question "what is the best predictor of ...". We want to predict 1 continuous, dependent variable by using 2 or more continuous or nominal independent variables and we want to determine the utility of predictor variables for predicting a criterion variable. Multiple regression assumes multivariate normality of the data. Assumptions: multivariate normality, 1 interval DV and 2 or more interval or nominal IVs, relationships among variables must be linear, all relevant predictors must be included and no irrelevant predictors must be included, error scores have a mean=0, are homoscedastic (have equal variances at all values of the predictors), and are uncorrelated.

DATA REDUCTION AND UNDERLYING CONSTRUCTS

Principal component analysis

PCA is a data reduction technique, trying to reduce large #s of variables to a few composite indices. It also involves the formation of new variables that are linear combinations of the original variables. Original items must be interrelated/intercorrelated (the less correlation between variables, the less data reduction that can be achieved). Assumptions: Multivariate normality; Variables/items should be interrelated/correlated among themselves

Factor analysis

Like PCA, FA involves data reduction, trying to reduce large #s of variables to a few composite indices. Both involve the formation of new variables that are linear combinations of the original variables. Factor analysis goes one step further and tries to determine an underlying structure or construct. It seeks to explain how certain variables are correlated. Can be used to develop scales and measure constructs. Assumptions: Multivariate normality; Variables/items should be interrelated/correlated among themselves.

PREDICTION

Discriminant analysis

Discriminant function analysis is used to determine which variables discriminate between two or more naturally occurring groups and to use those variables to predict group membership of future cases. In general, Discriminant Analysis is a very useful tool (1) for detecting the variables that allow the researcher to discriminate between different (naturally occurring) groups, and (2) for classifying cases into different groups with a better than chance accuracy. The main use of discriminant analysis is to predict group membership from a set of predictors. DV is nominal and IVs are interval. Assumptions: homogeneity of variances/covariances and multivariate normal distribution.

Logistic regression

An alternative procedure to DA (as it also can predict group membership) and an extension of multiple regression. Can use logistic regression when you violate the normality assumption of DA (either because IVs are a mix of categorical and continuous variables or because continuous variables are not normally distributed). Logistic regression is used to predict a dichotomous DV from 1 or more IVs. The DV usually represents the occurrence or non-occurrence of some outcome event. The procedure will produce a formula which will predict the probability of the occurrence as a function of the IVs. It also produces an odds ratio associated with each predictor value. Assumptions: Independent observations, mutually exclusive and exhaustive categories, specificity (the model must contain all relevant predictors and no irrelevant predictors).
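
The "formula which will predict the probability" and the odds ratio can be illustrated as follows. The coefficients here are invented for the example; in practice they would be estimated from data by a statistics package, which this sketch does not do.

```python
import math

def predicted_probability(intercept, coefficients, values):
    """Probability of the outcome from a fitted logistic model."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_ratio(coefficient):
    """Multiplicative change in the odds for a 1-unit increase in a predictor."""
    return math.exp(coefficient)

# Hypothetical model: log-odds of passing = -4.0 + 0.8 * hours_studied
p_3h = predicted_probability(-4.0, [0.8], [3.0])
p_4h = predicted_probability(-4.0, [0.8], [4.0])
or_hours = odds_ratio(0.8)

print(round(p_3h, 3), round(p_4h, 3), round(or_hours, 3))
```

Going from 3 to 4 hours multiplies the odds p / (1 - p) by the odds ratio, whatever the starting point.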

Linear Regression

Multiple Regression

MULTIVARIATE STATISTICS

Multivariate statistics

Multivariate normality

To have multivariate normality, the IVs must be distributed normally, any linear combination of the DVs must be normally distributed, and all subsets of the variables must have a multivariate normal distribution.

Centered data

Data is represented as deviations from the mean or average. When we center data, points will have a new position in relation to the axes. The variance does not change but the mean becomes 0. To center data, take each score in the original data and subtract the mean from it.

Standardized data

Standardizing either stretches out or crunches in data to make the SD = 1 and makes the dispersion in each direction about the same. The variance and the SD both become 1. To standardize data, divide the mean-corrected (centered) data by the respective standard deviation.
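
Centering and standardizing can be sketched in a few lines. The scores are made up, and the population standard deviation is used here as an assumption (a sample SD could be used instead, with slightly different results).

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical scores
mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)         # population SD (an assumption)

# Centering: subtract the mean from each score
centered = [x - mean for x in scores]
# Standardizing: divide the centered scores by the standard deviation
standardized = [c / sd for c in centered]

print(statistics.fmean(centered), statistics.pvariance(centered))
print(statistics.fmean(standardized), statistics.pstdev(standardized))
```

After centering, the mean is 0 and the variance is unchanged; after standardizing, the mean is 0 and both the variance and the SD are 1.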

Trace

Represents the total variability by a single score.

Determinant

Represents the generalized variance by a single score.

Eigenvalue

The variance accounted for by its corresponding eigenvector (for example, by a principal component). It is a scalar: it has a magnitude but no direction.

Eigenvector

The vector that corresponds to an eigenvalue. A vector is a quantity that has a magnitude and direction.
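
For a 2X2 covariance matrix, the trace, determinant, eigenvalues, and eigenvectors can all be computed by hand, which shows how they relate. The matrix below is hypothetical and the function name is invented; the eigenvector formula used assumes the off-diagonal covariance is not zero.

```python
import math

def summarize_2x2(cov):
    """Trace, determinant, eigenvalues, and eigenvectors of a symmetric 2x2 matrix."""
    (a, b), (c, d) = cov
    trace = a + d                       # total variability
    determinant = a * d - b * c         # generalized variance
    root = math.sqrt(trace ** 2 - 4 * determinant)
    eigenvalues = ((trace + root) / 2, (trace - root) / 2)
    # One (unnormalized) eigenvector per eigenvalue; valid when b != 0
    eigenvectors = [(b, lam - a) for lam in eigenvalues]
    return trace, determinant, eigenvalues, eigenvectors

cov = [[4.0, 2.0], [2.0, 3.0]]          # hypothetical covariance matrix
trace, determinant, eigenvalues, eigenvectors = summarize_2x2(cov)

print(trace, determinant)                      # 7.0 8.0
print([round(lam, 3) for lam in eigenvalues])  # [5.562, 1.438]
```

The eigenvalues sum to the trace and multiply to the determinant, so each eigenvalue can be read as a share of the total variability.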

Covariance

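A measure of how two variables vary together; the variance in one variable that is shared by another variable. Unlike the correlation, it is not standardized, so its size depends on the units of measurement.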

Loadings

The correlations between the original and new variables in principal component analysis.

Collinearity

A numerical problem that results when explanatory variables in a regression model are highly correlated.

Communality

The common, or shared variance.

Factor rotation

A technique used in factor analysis when you want to achieve a simpler factor structure which can be easily interpreted. Rotation separates the data out. Can use varimax or quartimax rotation methods.

Principal component analysis

Factor analysis

Discriminant analysis

Logistic regression

MANOVA

Multiple Regression

Cluster Analysis

Cluster analysis (CA) is a multivariate procedure for detecting natural groupings in data. Cluster analysis classification is based upon the placing of objects into more or less homogeneous groups, in a manner such that the relationship between groups is revealed. The two key steps within cluster analysis are the measurement of distances between objects and the grouping of the objects based upon the resultant distances (linkages).