Mr. Meinzen - AP Statistics Terminology

"Success is the ability to go from one failure to another with no loss of enthusiasm." Winston Churchill

Chapter 1 Terminology : Exploring Data

 

Fathom Terminology

  • Toolbar
  • Formula Editor
  • Data :
    • Collection
    • Case or Cases (i.e. individuals)
    • Attribute (i.e. variable)
    • Value
  • Viewing the Data:
    • Inspection Window
    • Table (or Case Table) : an organized collection of related data cases
    • Graph (or Plot) - bar, pie, histogram, dotplot, stemplot
    • Summary Table or Chart

Definitions

 

Some important terms that are specific to AP Statistics

  • Data : one or more values representing counts or measurements. Note: "data" can be singular or plural, like the word "deer" (i.e. "one deer or many deer"), so the word "data" can be a source of confusion.

  • Distribution : multiple data values that share a common relationship (i.e. can be graphed on the same axis)

  • Population : a group that is the focus of interest. Usually the population size is too large to gather complete data (called a "census"). Instead, we take a sample of the population. Population size is usually denoted by "N".

    • Parameter : a single number that represents a summary of the population data. Examples include mean of population, standard deviation of population, or population variance. Discovering parameters is one of the main goals of the field of study called Statistics.

  • Sample : a group of data that is taken from a population (i.e. subset of population data). The size (or number) of a sample is denoted by "n".

    • Statistic (aka Point Estimator): a single number that represents a summary of sample data. Examples include: sample mean, sample standard deviation, sample variance. Calculating a statistic to estimate a parameter is one of the main processes in the field of study called Statistics.

    • Note: the word "sample" can be a source of confusion! Sample could be a single member of the population (i.e. n = 1), a small group from population (n<N), or even the full population (n=N) in which case the sample is a "census". A "sample" could even represent a group of other samples from a distribution!

Types of Data

 

Data types are known as "attributes" in Fathom

 

  1. categorical (or qualitative) variable : data used as counts of a descriptor (i.e. labels or names).
    • Numeric Representation : frequency table : number of cases (i.e. counts) falling within each category
    • Numeric Representation : relative frequency table: synonyms : proportion, fraction, rates, ratios, or % of cases falling within each category. Note: Dividing the frequency (count) of a category by the total number of counts gives the relative frequency.
    • Graphical Representation : bar chart : height corresponds to count or proportion of data in category.
  2. quantitative variable : data that takes on values that are either a count of a quantity (i.e. 5 bananas) or measured real-valued quantity (i.e. 2.3 inches).
    1. Discrete Quantity:
      • Numeric Representation : data can take on any countable (or countably infinite) value. (i.e. Integers or Rationals but not Irrationals).
      • Graphical Representations : histogram (separate bars), stem and leaf plot, and dot plot
    2. Continuous Quantity:
      • Numeric Representation : data can take on any real-numbered value but cannot be counted (i.e. Reals). Between any 2 data values, another value can always be calculated.
      • Graphical Representations : histogram (bars share common sides) and probability density function or PDF (normal, uniform, etc.)
  • Notes :
    • A quantity is a number with descriptor and, therefore, has both a mathematical component and a language component (or "context").
      • Example1 : quantity : 5 bananas has a number (5) and a descriptor or unit (bananas)
      • Example2 : numbers : 5 (count) , 1/3 (proportion), 25% (proportion) , 1st (categorical), pi (continuous real number)
    • A source of confusion : numbers can be used with the "nominative" property in place of text-based labels.
      • Example1 : categorical data : In the Olympics, a country may have : five 1st place winners, 3 2nd place winners, and VIII 3rd place winners. The counts (five, 3, VIII) are applied to the categories (1st place, 2nd place, 3rd place). Note: these 3 categories, while written as numbers, cannot be "added" together without creating a different overall category such as "medal winners."
      • Example2 : quantitative (discrete) data : Similar to above but in a different context...in the Olympics, a country may have : five 1st place winners, 3 2nd place winners, and VIII 3rd place winners. If we define an overall category of "medal winners" then the numbers (five, 3, VIII) can be added together to have a total of 5+3+8 = 16 medal winners. However, it would not make sense to find the "average" number of medal winners (i.e. 16/3 = 5.333...) and therefore these are NOT continuous quantitative data.
      • Example3 : categorical data : The following list of items are labeled with numbers to put them in order of priority but only represent categories and not quantitative data : 1. milk; 2. cheese; 3. candy; 4. soda. Note : we can replace the numbered list with letters such as A. milk; B. cheese; C. candy; D: soda without losing any information.
      • Example4 : quantitative (continuous) data : In a school, the heights of the children were 60.3 inches, 39.7321 inches, and 40 inches. The total height would be 140.0321 inches and the mean height is 140.0321/3 = 46.67736667... inches
    • The distinguishing factor is whether any given number can be used to "compute" other values and whether means (averages) have contextual meaning.
    • For more details, see the following lecture notes:
    • Lecture 1 : Numerals : how we write/communicate numbers.

      Example A - different ways to write the same Number (note: there are many more ways than the 8 listed here):

      1. 27 (Hindu-Arabic numeral system used in most modern societies)

      2. twenty-seven (English-language based numeral system - verbal communications)

      3. XXVII (roman numerals - old but not designed for computation)

      4. 0001 1011 (binary numerals - computer representations)

      5. ///// ///// ///// ///// ///// // (tally system - simple)

      6. one score and seven (English ala Lincoln)

      7. 00:27:00 (sexagesimal system - based on degrees:minutes:seconds), one of the oldest numeral systems in existence. Ancient Babylonians used it to compute time, geography, and angles. Base-60 avoids many issues with fractions/division because 60 has many divisors (see the Fundamental Theorem of Arithmetic).

      8. 1B (hexadecimal system which uses the digits 0,1..9,A,B,C,D,E,F)

      Example B - the numerals 0, 1, 2, and 3 in mathematical set-theoretic notation...uses only three symbols: the left brace {, the comma, and the right brace }

      • 0 = {},

      • 1 = {0} = {{}},

         

      • 2 = {0,1} = {{},{{}}},

      • 3 = {0,1,2} = {{},{{}},{{},{{}}}}
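A short Python sketch may help show that the numerals in Example A are only different spellings of one number. The helper name to_roman is hypothetical (written for this sketch, not part of Python), and only four of the eight numeral systems are shown.

```python
# Sketch: one number (27), several numerals.
# to_roman is a hypothetical helper written for this example only.

def to_roman(n):
    """Convert a positive integer (1..3999) to Roman numerals."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        while n >= value:       # use each symbol as many times as it fits
            out.append(symbol)
            n -= value
    return "".join(out)

number = 27
binary = format(number, "08b")        # eight binary digits
hexadecimal = format(number, "X")     # upper-case hexadecimal digits
roman = to_roman(number)
# Tally marks: groups of five, then whatever is left over.
tally = " ".join(["/////"] * (number // 5) + ["/" * (number % 5)])

print(binary, hexadecimal, roman, tally)

# The binary and hexadecimal numerals convert back to the same number.
assert int(binary, 2) == int(hexadecimal, 16) == number
```

The number itself never changes; only the way it is written does.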

      Lecture 2 : Numbers : how we use numbers to count, measure, and label objects.

      1. When we use numbers to "count" objects, we are referring to the "cardinality" property.

      2. When we use numbers to compare ("bigger versus smaller") against objects, we are referring to the "comparison" property.

      3. When we use numbers to label objects (like the list below) we are referring to the "nominal" property.

      Usually we just talk about "Number Systems" which are sets of numbers.

      1. Natural Numbers (aka Counting Numbers)

        • If used to describe the "size" of an object then we call the number a Cardinal such as 5 bananas [five is the cardinality of the set of bananas].
        • If used to describe a property of an object for comparison purposes then the number is a Measurement such as "I scored an 85% on the test." The 85% by itself does not mean anything unless we compare it to other scores. (If the measurement includes a magnitude and a unit, a word such as "meter", it is called a Quantity.)
        • If used only as a label rather than for any calculations, then the Number is a Nominal such as this listing or numbering of the definitions for Cardinal, Measurement, and Nominal...we could have just used a bulleted list instead of a numbered list.
      2. Integers (discrete) : no fractions or decimals, includes positives, negatives, and 0.

      3. Rationals (discrete) : Integer divided by Integer; also called Fractions, % , proportions

      4. Real : Rational + Irrationals (i.e. non-terminating, non-repeating decimals)

      5. Complex Numbers : Real + Imaginary (numbers that can go round-and-round)

Types of Graphical Representations

  • Charts : Categorical Data
    • bar chart (bar graph) :
      • horizontal axis : categories (labels or names)
      • vertical axis : counts (i.e. frequency) or relative frequency (proportions or fractions)
  • Plots & Graphs : Quantitative (Discrete or Continuous) Data
    • boxplot (and modified boxplot) :
      • horizontal axis : data values
      • vertical axis : none, may have several boxplots side-by-side to compare similar data in different categories
      • uses 5-number summary from left to right : min,Q1,Q2,Q3,max
      • box is from Q1 to Q3 with a vertical line at Q2 (median) and horizontal lines ("whiskers") to min and max. Modified boxplots have whiskers that exclude outliers.
    • dotplot : horizontal axis corresponds to data values with a dot placed above the data value. Multiple identical (or nearly identical) data values have dots stacked vertically.
    • stem-and-leaf plot : "stem" is the first digit or left-most digits of a number and "leaf" is usually the last digit (unit) of the number.
    • other : time plot
    • histograms
      • vertical axis : counts of data values (frequency ) or proportions (i.e. relative frequency or pdf).
      • horizontal axis : data values or ranges of data values
        • Relative frequency histogram (discrete): data is grouped in bars or bins (ranges of values) separated by spaces or gaps.
        • Relative frequency histogram (continuous) : data is grouped in bars or bins (ranges of values) with no separation
        • Probability Density Function or PDF (continuous) : data is not grouped in bars (or grouped into infinitely thin bars) which creates a single continuous curve. The data can be modeled by a theoretical (i.e. mathematical) function. The total area under a PDF curve is exactly 1 (100%). The two most common PDFs are normal (bell-shaped or mound-shaped) and uniform curves.
    • cumulative frequency graph (aka "ogive"): a curve drawn to represent total values less than or equal to a given number (left side is 0, right side is 100%)
      • horizontal axis : data values
      • vertical axis : proportion totals
  • Notes:
    • Histograms and Bar Charts appear and are interpreted in a similar manner (i.e. counts=frequencies or relative frequencies=proportions).
    • Bar Charts are for categorical data (horizontal axis values are labels) and have bars that are separated by blank space.
    • Histograms are for quantitative values (horizontal axis are data values or ranges of data values) with spaces between bars representing discrete quantities. Continuous quantities have no spaces between bars. A histogram with bars that become infinitely thin (width --> 0) becomes a probability density function (pdf).

Distributions of Quantitative Data : SHAPE > CENTER > SPREAD

  1. SHAPE of distribution : (i.e. describing a graph of a collection or group of data)

    • symmetric : left and right halves are mirror images; mean and median are close to each other
      • uniform (or rectangular) distribution : all heights are approximately the same (i.e. a flat top)
      • normal distribution (i.e. "bell curve")
    • skewness (i.e. partly symmetric but one side "longer" than the other)
      • left skewed (or negative skew) : left tail is longer than right tail; mean is left of median
      • right skewed (or positive skew) : right tail is longer than left tail; mean is right of median
    • other
      • bimodal distribution : two peaks or clusters
      • gap : region of distribution between 2 data values where there is no observed data
  2. CENTER [summary statistic or point estimator : single value measurement or computation]

    • mean : the arithmetic mean or average : add up all the data values and divide by the total count of data
    • median : the middle value when data is ordered. Half of the data counts are below the median and half of the data counts are above the median. For AP Statistics, the median for an even-number of data points is usually the mean (average) of the two middle data values.
    • mode : datum that shows up the most; rarely used
  3. SPREAD or variation [summary statistic or point estimator : single value measurement or computation]

    • deviation (or residue)
    • for means : standard deviation (s) or variance (s^2)
    • for medians : quartiles (ranges)
      • 5-number summary : minimum, Q1, Q2 (median), Q3, maximum
      • range : a single value = maximum - minimum. Note: in some math courses, the range is given as two values [min, max]
      • interquartile range (IQR) = Q3 - Q1
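The center and spread measures above can be sketched in Python. The data set is hypothetical, and five_number_summary is a helper written for this sketch. It assumes the "median of each half" quartile rule (the overall median is left out of both halves when the count is odd), so a calculator or other software may give slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two data values.

    Q1 and Q3 are the medians of the lower and upper halves of the
    sorted data (assumed rule; other software may differ slightly).
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]       # values below the middle position
    upper = data[-half:]      # values above the middle position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 40]   # hypothetical data

# CENTER
mean = statistics.mean(data)
minimum, q1, median, q3, maximum = five_number_summary(data)

# SPREAD
s = statistics.stdev(data)            # sample standard deviation (s)
variance = statistics.variance(data)  # sample variance (s^2)
data_range = maximum - minimum        # single-value range
iqr = q3 - q1                         # interquartile range

print("mean:", mean, "median:", median)
print("s:", round(s, 2), "s^2:", round(variance, 2))
print("5-number summary:", minimum, q1, median, q3, maximum)
print("range:", data_range, "IQR:", iqr)
```

Notice that the one large value (40) pulls the mean well above the median, which matches the right-skewed description in SHAPE.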

Other Unusual Features

  • Outliers : data points that are unusually small or large relative to the rest of the data.

    • for means : any value 2 or more standard deviations above or below the mean.

    • for medians : any value less than Q1 - 1.5*IQR or greater than Q3 + 1.5*IQR

  • Gaps : region of a distribution between two data values where there is no observed data

  • Clusters : concentrations of data usually separated by gaps.

  • The mean, standard deviation and range are non-resistant (non-robust) because they are influenced (changed) by outliers. The median and IQR are resistant (robust) because outliers do not greatly (or at all) affect their values.
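As a rough illustration of the 1.5*IQR rule and of resistance, the Python sketch below flags outliers in a hypothetical data set and then compares how much the mean and the median move when the outlier is removed. The helper names quartiles and iqr_outliers are made up for this sketch, and the quartiles assume the "median of each half" rule.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed rule)."""
    data = sorted(values)
    half = len(data) // 2
    return statistics.median(data[:half]), statistics.median(data[-half:])

def iqr_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 40]   # hypothetical data
outliers = iqr_outliers(data)
trimmed = [v for v in data if v not in outliers]

# How far does each measure of center move when the outlier is removed?
mean_shift = statistics.mean(data) - statistics.mean(trimmed)
median_shift = statistics.median(data) - statistics.median(trimmed)

print("outliers:", outliers)
print("mean shift:", round(mean_shift, 2), "median shift:", median_shift)
```

The mean shifts by several units while the median barely moves, which is what "non-resistant" versus "resistant" means in practice.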

Linear Transformations : effects on mean, spread and change of units

Definition :

a Linear Transformation takes every data point and re-calculates each to a new value using multiplication and/or addition by constants. An example of a Linear Transformation is converting degrees Fahrenheit to degrees Celsius.

  • degrees Celsius = 5/9 * (degrees Fahrenheit - 32)

    • [i.e. subtract 32 and then multiply by 5/9]
  • When a linear transformation takes place, the following statistics are changed :

    1. The center (i.e. mean) is transformed by both the multiplication and addition constants. Example: if the mean temperature was 122 degrees Fahrenheit, then the mean temperature was 5/9 * (122-32) = 50 degrees Celsius.
    2. The spread (i.e. standard deviation) is transformed by the multiplication only. Example: if the standard deviation was 18 degrees Fahrenheit, then the standard deviation would be 5/9 * 18 = 10 degrees Celsius. Technically, the variance (square of the standard deviation) is multiplied by the square of the multiplicative constant : s_C^2 = (5/9)^2 * s_F^2
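The two rules can be checked numerically. The sketch below uses three hypothetical temperatures chosen so that the mean is 122 and the standard deviation is 18 degrees Fahrenheit, matching the examples above; f_to_c is a helper name made up for this sketch.

```python
import statistics

def f_to_c(deg_f):
    """Linear transformation: subtract 32, then multiply by 5/9."""
    return (deg_f - 32) * 5 / 9

temps_f = [104, 122, 140]                  # hypothetical data
temps_c = [f_to_c(t) for t in temps_f]     # every data point is transformed

mean_f, sd_f = statistics.mean(temps_f), statistics.stdev(temps_f)
mean_c, sd_c = statistics.mean(temps_c), statistics.stdev(temps_c)
var_f, var_c = statistics.variance(temps_f), statistics.variance(temps_c)

print("Fahrenheit: mean", mean_f, "sd", sd_f)
print("Celsius:    mean", mean_c, "sd", sd_c)

# Mean: both constants apply. Standard deviation: only the multiplier.
print("mean rule holds:", abs(mean_c - f_to_c(mean_f)) < 1e-9)
print("sd rule holds:  ", abs(sd_c - 5 / 9 * sd_f) < 1e-9)
print("variance rule:  ", abs(var_c - (5 / 9) ** 2 * var_f) < 1e-9)
```

Subtracting 32 slides every value by the same amount, so it moves the center but leaves the distances between values (the spread) alone.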

Context Clues

  • statistical use of a "count" :
    • Mathematics : Generally, a "count" is a whole number (0, 1, 2, 3, ...). However, it may be a rational number if the item being "counted" is grouped in equal-sized units. Ex: 1/3 of a case is equivalent to 4 bottles in a 12-bottle case of soda.
    • English : each numeric value must have units such as "oranges" or "individuals" or "cases of soda"
    • Examples : 20 oranges, 17 people, 22.5 cases of soda
  • statistical use of a "proportion" :
    • Concept : part / total
    • Mathematics : rational-value ratio between 0 and 1 (inclusive). May be represented by a fraction or decimal value or percentage.
    • English : same units divided by same units (i.e. unit-less quantity)
    • Example : In a class of 25 students with 10 girls and 15 boys, the proportion of girls in the classroom is 10/25 (i.e. 40%). NOTE: Technically, the units are "students per students" but this does not need to be specified.
  • statistical use of a "mean" :
    • Concept : the (arithmetic) average value of a set of numbers
    • Mathematics : ( Σ xi) / n = (sum of values) / (count of values)
    • English : units divided by a count (i.e. the mean has the same units as the numbers themselves)
    • Example : In a class of 5 students, the points on a given quiz are 43, 47, 50, 40, and 44. The mean is (43+47+50+40+44) / 5 = 44.8 points/student. NOTE: The units are "points per student" and must be specified.

Chapter 3 : Two Variable Relations : Quantitative (continuous) : Linear

Comparison : Chapter 2 (univariate) and Chapter 3 (bivariate)

 

Chapters 1 & 2 : One Variable (univariate data) : shape -> center -> spread

Chapter 3 : Two Quantitative Variables (bivariate data) : Form (shape) -> Direction (trend) -> Strength -> Variability

  • Key Idea
    • Chapters 1 & 2 : Distribution (several related data points)
    • Chapter 3 : Association (relation)
      • x = explanatory variable = independent variable
      • y = response (or predicted) variable = dependent variable
  • Plots/Graphs
    • Chapters 1 & 2 : dot plot, stemplot, boxplot, histogram
    • Chapter 3 : scatterplot, residual plot
  • Ideal Form (shape)
    • Chapters 1 & 2 : Normal (bell or mound shaped)
    • Chapter 3 : Linear (oval/ellipse) with either positive (uphill) or negative (downhill) direction
  • Terminology
    • Chapters 1 & 2 : normal, uniform, or skewed; symmetric; clusters, gaps, and outliers
    • Chapter 3 :
      • Form (shape) : linear or non-linear (curved)
      • --> Direction (trend) : positive, negative, or no association
      • --> --> Strength : strong, moderate, or weak
      • Unusual Features : clusters, gaps, outlier point (in the y variable), high-leverage point (in the x variable), influential point (either or both variables change the LSRL)
  • Measure of Center or Estimate of Population
    • Chapters 1 & 2 : Mean (continuous), Median (discrete)
    • Chapter 3 : Regression Line (LSRL)
      • Hint: (x̄,ȳ) is a point on the LSRL
  • Measure of Spread from the Center
    • Chapters 1 & 2 : Standard Deviation (continuous), Interquartile Range (discrete)
    • Chapter 3 : Correlation, r : number that gives direction and strength of linear association (but only if form is already known to be linear)
      • Coefficient of Determination, r^2 = square of correlation

Summary line (LSRL) : similar terms & synonyms

  • Least Squares Regression Line (LSRL)
  • Fitted Line (student's best guess for a line)
  • Line of Best Fit (best guess or LSRL)
  • Regression Line (LSRL)
  • Trend Line (LSRL)

ScatterPlots : form (shape) --> direction (trend) --> strength --> variability

Relation between x and y variables (plausible explanations) : causation, common response, or confounding

  • lurking variable
  • residual plots
  • outliers
  1. data distribution's form (shape) : linear, non-linear, or none
    • ŷ = a1 + b1*x [equivalent to algebra equation of line : y = mx + b]
  2. form's (shape's) direction (trend) : positive slope, negative slope, or none
    • b1 : measure of slope
  3. direction's (trend's) strength : strong (tight cluster), moderate (some clustering), or weak (no cluster)
    • correlation, r : a measure of a trend's strength
  4. strength's variability : uniform (homoscedastic) or fan-shaped (heteroscedastic)

Symbols and Definitions :

  • ŷ = a1 + b1*x : least squares regression line ("regression line" or just LSRL) : Note: usually provided by calculator or software.
  • x : explanatory variable or predictor variable from data points
  • y : response variable or observed variable from data points
  • ŷ : predicted value calculated from LSRL
  • r : correlation coefficient : Note : formula will be given...usually calculator or software will provide this number
  • r^2 : coefficient of determination : proportion (percentage) of variation (i.e. change) in y that is explained by variation (i.e. change) in x.
  • Coefficients of LSRL :
    • b1 = r * (sy/sx) : slope of the LSRL : the change in predicted amount of ŷ for every unit increase in x. Note: sy = standard deviation of all data points' y-values, sx = standard deviation of all data points' x-values...useful if student is given r, sy, and sx
    • a1 : y-intercept, : not always useful in context.
    • Example: Given LSRL model : height(inches) = 2.75 * age(years) + 20; the coefficients are :
      • ŷ = "predicted height in inches"
      • x = "age in years"
      • b1=2.75 interpreted as "for every 1 year increase in age, the predicted height increases by 2.75 inches"
      • a1=20 interpreted as "when born (i.e. age=0 years), the height is predicted to be 20 inches"
      • so, for a 10-year-old, the predicted height would be 47.5 inches (47.5 = 2.75*10 + 20)
  • interpolations versus extrapolation
  • Special Points (compare & contrast) :
    • influential point : any point that, when removed, changes the LSRL significantly (i.e. changes b1, a1, or r values)
    • outlier in y: large residual y-ŷ in LSRL compared to other observation points
    • high-leverage point in x : substantially higher or smaller x-value than other observation points
  • residual = y-ŷ = observed y - predicted y : Note : random residual plot is evidence for a linear form.
  • sum of square errors (SSE)
  • "correlation does not imply causation" due to lurking variable
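Although a calculator or software usually provides the LSRL, the symbols above can be traced by hand in Python. This is a minimal sketch using five hypothetical data points. It assumes the usual z-score formula for r and uses the hint that (x̄,ȳ) lies on the LSRL to get a1 = ȳ - b1*x̄.

```python
import statistics

x = [1, 2, 3, 4, 5]        # hypothetical explanatory values
y = [2, 4, 5, 4, 5]        # hypothetical response values
n = len(x)

x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

# Correlation r: sum of products of z-scores, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

b1 = r * s_y / s_x          # slope of the LSRL
a1 = y_bar - b1 * x_bar     # intercept, since (x_bar, y_bar) is on the line

def predict(x_value):
    """Predicted value y-hat = a1 + b1 * x."""
    return a1 + b1 * x_value

residuals = [yi - predict(xi) for xi, yi in zip(x, y)]   # observed - predicted
sse = sum(e ** 2 for e in residuals)                     # sum of square errors
r_squared = r ** 2                                       # coefficient of determination

print("LSRL: y-hat =", round(a1, 3), "+", round(b1, 3), "* x")
print("r =", round(r, 4), " r^2 =", round(r_squared, 4))
print("residuals:", [round(e, 2) for e in residuals], " SSE =", round(sse, 2))
```

Predictions are only trustworthy for x-values inside the range of the data (interpolation); calling predict(50) here would be extrapolation.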

Chapter 4 : Two Variable Relations : Quantitative (continuous) : Non-Linear and Categorical

Quantitative (continuous) Data : Non-Linear

  • Transformations :
    • Purpose : convert non-linear data into linear data so we can use LSRL model

    • Note: textbooks may use "x" or "t" (for time) for the explanatory variable

    1. linear --> linear models : y = a + bx or y = a + bt

      • increase is fixed amount from previous values
      • Example : useful for "simple" unit conversions such as Fahrenheit to/from Celsius : degF = 32 + 9/5*degC
    2. exponential --> linear models : y = a*b^x or y = a*b^t

      • plot : "log y against x" or "log y against t" to see linear pattern
      • Example : biological growth of cell division can be modeled by y = 2^t; We can take the log of both sides to get: log(y) = log(2) * t which graphs to a line
    3. power --> linear models : y = a*t^p or y = a*x^p

      • Note: "p" is the power (or exponent) used in the equation

      • plot : "log y against log x" or "log y against log t" graph to see linear pattern
      • Example : geometry (volumes and areas) can be modeled such as y = x^3. We can take the log of both sides to get: log(y) = 3 * log(x) which graphs to a line.
      • square, p=2 : y = x^2 or y = t^2
      • reciprocal, p=-1 : y = 1/x or y = 1/t
      • reciprocal square root, p=-1/2 : y = x^(-1/2) or y = t^(-1/2)
      • logarithm, p=0 : y = log(x) or y = log(t)
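The idea behind both transformations can be sketched in Python: after taking logs, an ordinary least-squares line recovers the constants of the original model. The data are generated from hypothetical models (y = 3 * 2^t and y = x^3) with no noise, and fit_line is a helper written for this sketch.

```python
import math

def fit_line(xs, ys):
    """Least-squares (intercept, slope) for ys = intercept + slope * xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Exponential model y = a * b^t: plot log y against t.
t = [0, 1, 2, 3, 4, 5]
y_exp = [3 * 2 ** ti for ti in t]                 # hypothetical: a = 3, b = 2
intercept, slope = fit_line(t, [math.log10(v) for v in y_exp])
a_est = 10 ** intercept       # intercept is log(a)
b_est = 10 ** slope           # slope is log(b)

# Power model y = a * x^p: plot log y against log x.
x = [1, 2, 3, 4, 5]
y_pow = [xi ** 3 for xi in x]                     # hypothetical: a = 1, p = 3
_, p_est = fit_line([math.log10(xi) for xi in x],
                    [math.log10(v) for v in y_pow])   # slope is the power p

print("exponential: a =", round(a_est, 6), " b =", round(b_est, 6))
print("power:       p =", round(p_est, 6))
```

With real (noisy) data the recovered constants would only be estimates, and a residual plot of the transformed data is still needed to check the linear form.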

Categorical Data : counts and percentages within categories

  • Recall Chapter 1 : One Variable (uni-variate) :
    • Visualizations : bar graphs, segmented bar graphs, mosaic plots
  • Two Variable (bi-variate)
    • Visualizations : two-way tables or contingency tables
    • Cell data may be :
      1. frequency counts :
        • Example frequency table : Categorical Variables : sex versus color preferences for 30 people
            Sex      Red   Blue
            Male     5     7
            Female   12    6
      2. joint relative frequency (cell frequency divided by total for the entire table) :
        • Example relative frequency table: Categorical Variables : sex versus color preferences for 30 people
            Sex      Red            Blue
            Male     5/30 = 17%     7/30 = 23%
            Female   12/30 = 40%    6/30 = 20%
      3. other relative frequencies (percentages) tables also exist
    • Summary Statistics :
      1. marginal frequencies : additional cell at bottom of each column (or right of each row) with column (row) totals divided by total of entire table.
        • Example Marginal frequencies: Categorical Variables : sex versus color preferences
            Sex                        Red   Blue   Totals (i.e. Column Margin)
            Male                       5     7      12
            Female                     12    6      18
            Totals (i.e. Row Margin)   17    13     30
      2. conditional frequencies : each cell = cell's relative frequency divided by marginal frequency (row or column)
        • Example Conditional frequencies: Categorical Variables : sex versus color preferences
            Each cell lists the frequency conditional on its row (sex) or on its column (color).
            Sex                        Red                      Blue                    Totals (i.e. Column Margin)
            Male                       5/12=42% or 5/17=29%     7/12=58% or 7/13=54%    12/30=40%
            Female                     12/18=67% or 12/17=71%   6/18=33% or 6/13=46%    18/30=60%
            Totals (i.e. Row Margin)   17/30=57%                13/30=43%               30/30=100%
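The joint, marginal, and conditional frequencies above can all be computed from the four counts. The sketch below uses the counts from the sex versus color preference example and Python's Fraction type so the results stay exact. The variable names are made up for this sketch.

```python
from fractions import Fraction

# Frequency counts from the example: sex versus color preference.
counts = {
    ("Male", "Red"): 5, ("Male", "Blue"): 7,
    ("Female", "Red"): 12, ("Female", "Blue"): 6,
}
total = sum(counts.values())

# Margins: totals for each row (sex) and each column (color).
row_totals, col_totals = {}, {}
for (sex, color), count in counts.items():
    row_totals[sex] = row_totals.get(sex, 0) + count
    col_totals[color] = col_totals.get(color, 0) + count

# Joint relative frequency: cell count / table total.
joint = {cell: Fraction(c, total) for cell, c in counts.items()}

# Marginal relative frequency: margin total / table total.
marginal_sex = {sex: Fraction(c, total) for sex, c in row_totals.items()}
marginal_color = {color: Fraction(c, total) for color, c in col_totals.items()}

# Conditional relative frequency: cell count / its row (or column) total.
given_sex = {(s, c): Fraction(n, row_totals[s]) for (s, c), n in counts.items()}
given_color = {(s, c): Fraction(n, col_totals[c]) for (s, c), n in counts.items()}

print("joint (Male, Red):", joint[("Male", "Red")])
print("Red given Male:   ", given_sex[("Male", "Red")])
print("Male given Red:   ", given_color[("Male", "Red")])
print("marginal Red:     ", marginal_color["Red"])
```

The same cell gives three different percentages depending on which total it is divided by, so always state the condition in words.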

Simpson's Paradox (lurking variable as a pre-condition) : occurs when categories are a combination of smaller categories.
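A small Python sketch with made-up counts (not real data) shows how the paradox can arise. Treatment "A" has the higher success rate inside each group, yet "B" has the higher rate once the groups are combined, because the group sizes are very unequal. The group acts as the lurking variable.

```python
from fractions import Fraction

# Made-up counts: (successes, trials) for each treatment within each group.
results = {
    "A": {"group 1": (8, 10), "group 2": (30, 100)},
    "B": {"group 1": (70, 100), "group 2": (2, 10)},
}

def rate(successes, trials):
    """Success rate as an exact fraction."""
    return Fraction(successes, trials)

def overall_rate(groups):
    """Success rate after combining all groups into one category."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(t for _, t in groups.values())
    return Fraction(successes, trials)

# Compare A and B within each smaller category...
a_wins_each_group = all(
    rate(*results["A"][g]) > rate(*results["B"][g]) for g in results["A"]
)
# ...and again after the categories are combined.
a_wins_overall = overall_rate(results["A"]) > overall_rate(results["B"])

print("A better in every group:", a_wins_each_group)
print("A better overall:       ", a_wins_overall)
print("overall A:", overall_rate(results["A"]), " overall B:", overall_rate(results["B"]))
```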

Chapter 5 : Collecting, Producing & Exploring Data

Definitions and Fundamental Concepts

  • population : all items or subjects (units) of interest. "N" is population size.
  • sample : selected subset of population. "n" is sample size
  • collection types : census versus sample
  • summary numbers : parameter versus sample statistic (or point estimator)

Collecting Samples

Overarching Concerns when collecting a Sample from a Population :

  • BIAS : certain responses (or samples) are systematically favored over other responses (or samples)
  • samples : unbiased representation of population [MUST BE random and independent]

 

Methods to Collecting Samples to avoid Bias

  • simple random sample (SRS) : each possible sample of the chosen size has an equal probability of being selected. Use a table of random digits, calculator, or software.
  • stratified random sample : division of population into smaller groups (strata) of similar individuals (homogeneous grouping), then an SRS within each stratum. Example: stratum of boys then SRS of boys and stratum of girls then SRS of girls.
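  • cluster random sample : division of population into smaller groups (clusters) that each resemble the population (heterogeneous grouping). Clusters are selected at random and every individual in a selected cluster is sampled. Example: Randomly select two cities from a list of similar cities, then survey everyone in each selected city (i.e. cluster). Taking an SRS within each selected cluster instead gives a two-stage cluster sample.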
  • systematic sample with random starting point and fixed, periodic interval. Example: Number each student in classroom, start with a random student (say 5th student) and then select every 3rd (i.e. 8th, 11th, 14th, etc.) student.
  • Other : two-stage (or multi-stage) cluster sample
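Three of the methods above can be sketched with Python's random module standing in for the table of random digits. The population of 30 numbered students is hypothetical, and the seed is fixed only so the sketch is repeatable. The systematic sample here picks its random start within the first interval, a common variation on the classroom example.

```python
import random

rng = random.Random(2024)   # fixed seed so the sketch is repeatable

# Hypothetical population: 30 numbered students, odd numbers labeled "boy".
population = [(i, "boy" if i % 2 else "girl") for i in range(1, 31)]

# Simple random sample (SRS) of size 6.
srs = rng.sample(population, 6)

# Stratified random sample: split into strata, then an SRS of 3 per stratum.
strata = {}
for student in population:
    strata.setdefault(student[1], []).append(student)
stratified = [s for group in strata.values() for s in rng.sample(group, 3)]

# Systematic sample: random starting point, then every k-th student.
k = 3
start = rng.randrange(k)            # random position within the first interval
systematic = population[start::k]

print("SRS:       ", sorted(s[0] for s in srs))
print("stratified:", sorted(stratified))
print("systematic:", [s[0] for s in systematic])
```

The stratified sample is guaranteed to contain 3 boys and 3 girls, while the SRS could, by chance, contain any mix.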

Types of Bias in Samples :

  • undercoverage bias : part of population has a reduced chance of being included in sample.
  • non-response bias : individuals chosen for sample refuse to respond.
  • response bias & question-wording bias : confusing words and/or leading questions. Example: "If blue is the most favored pigment, what is your favorite color of amphibians?"
  • convenience bias : non-random selection based on preferences
  • volunteer bias : use of only volunteers will not be representative of the whole population
  • Other Biases : judgement bias, size bias, incorrect response bias (lying)

Experiment vs Observational Study

  • Design of observational studies

    Purpose : help investigate a topic of interest about a population. No treatment is imposed and no causal relationship can be determined.

    • retrospective : examine data for a sample of individuals
    • prospective : follow a sample of individuals to gather data into the future.
    • sample survey : an observational study to collect data in order to learn about population from a sample of the population. Must be random and representative of population (i.e. minimize known biases).
  • Designs of experiments

    Purpose : help determine causal relationships:

    • experimental units ("subjects" or "participants" if human) --> treatment (factor vs placebo) --> observed response (measured or categorized)
        1. comparison of at least 2 treatment groups (one may be "control");
        2. randomize assignments of treatments to experimental units;
        3. replicate
        4. control for potential confounding variables, placebo & placebo effect.
    • Types of Experimental Designs : Randomization & Independence
      • single blind experiments & double blind experiments
      • completely randomized design : treatments assigned to experimental units completely at random
      • randomized block design : experimental units organized into similar-variable blocks. Treatments are assigned randomly within each block.
        • randomized matched paired design : 2 treatments to a single experimental unit at different times or 2 subjects sharing a common relevant factor (such as age) each given one treatment.
    • Variables & Treatments in Experiments
      • explanatory variable or factor (if categorical) has levels that are chosen intentionally. Levels (or combination of levels) are treatments.
      • response variable : measured outcome after treatment has been imposed.
      • lurking variable
      • confounding variable : a 3rd variable related to the explanatory variable and influences the response.
      • control group : collection of experimental units either not given a treatment or given a placebo.
      • placebo : an inactive substance given to a control group as a treatment that should not have an effect on the measured response. A placebo effect may occur if the experimental units have a response to the placebo.
    • Variability (not on AP Exam)
      • between-treatment
      • within-treatment
    • Significance :
      • statistically significant : observed changes are so large as to be unlikely to have occurred by chance.
      • practically significant (i.e. lack of realism) : numerically large changes may not have an impact on the topic of interest.
      • Example : There may be a statistically significant difference between test scores (say 83% and 87%) that cannot be explained by random chance. However, there may not be any practically significant difference as both scores result in a "B" grade.
  • Simulation of Experiments

    Purpose : model of chance behavior (random events) such that simulated outcome closely matches real-world outcomes. The outcomes are based on either empirical data or mathematical probability model of the real-world experiment. Simulation may be done because a real-world experiment may be impractical or too costly in terms of time or money.

    • Definitions :
      • random process : generates results that are determined by chance
      • outcome : result of a trial of a random process
      • event : collection of outcomes
      • simulation : a model of random events.
    • Procedure :
      1. Describe the real-world experiment.
      2. State assumptions about the model for one trial and its connection to the real-world experiment.
      3. Assign digits to represent every real-world outcome. (random number table)
      4. Simulate many repetitions (i.e. many trials) to generate counts for each outcome
      5. Calculate probabilities (counts/totals) & state conclusions
    • Notes:
      • Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
      • (false) Law of Small Numbers : patterns (or probabilities) may not match expectations during a small number of trials. Humans are susceptible to this effect...if we flip a fair coin 10 times, we may reject the "fairness" of the coin if we observe 2 heads and 8 tails because we expect about 5 heads and 5 tails (i.e. fair coin defined as 50% probability of heads) Not on AP Exam.
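The simulation procedure and the Law of Large Numbers can be sketched together in Python. The sketch assumes a fair coin (probability of heads = 0.5) and uses a seeded random number generator in place of a random number table. The function name simulate_heads_proportion is made up for this sketch.

```python
import random

def simulate_heads_proportion(num_flips, rng):
    """Simulate num_flips trials of a fair coin; return the proportion of heads."""
    # One trial: a random number below 0.5 represents "heads".
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips        # probability estimate = count / total

rng = random.Random(1)   # fixed seed so the sketch is repeatable

# Many repetitions: the empirical probability should settle near 0.5.
proportions = {n: simulate_heads_proportion(n, rng)
               for n in (10, 100, 10_000, 100_000)}

for n, p in proportions.items():
    print(f"{n:>7} flips: proportion of heads = {p:.4f}")
```

With only 10 flips the proportion can land far from 0.5, which is the "Law of Small Numbers" trap; the estimate typically tightens as the number of flips grows.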

Chapter 6 Terminology : Probability : a study of randomness

  • event : a single occurrence of some item of interest
  • Visualization : venn diagrams & Tree Diagrams
  • Probability :

    • probability of an event A : in repeatable situations can be interpreted as the relative frequency of the event in the long run; P(A) = (number of outcomes of event A) / (total number of outcomes in sample space), assuming all outcomes in the sample space are equally likely
    • complement of event : A' or A^c and probability that event A will NOT occur is P(A^c) = 1 - P(A)
    • mutually exclusive events A and B [disjoint categories] : P(A ∩ B) = 0
    • conditional events A given B : P(A|B) = P(A ∩ B) / P(B) or, re-arranging P(A ∩ B) = P(B)·P(A|B)
    • independent events A and B: P(A) is not changed by knowing that B occurred, i.e. P(A|B) = P(A); also P(A ∩ B) = P(A)·P(B)
    • "of at least one" : P("of at least one") = 1 - P("exactly none")
    • probability distribution : represented as a table or function showing the probability of each value of the random variable. Interpretation provides information about the shape, center, and spread of a population (for example, a normal shape, the mean, and the variance or standard deviation)
    • probability cumulative distribution : represented as a table or function showing the probability of being les than or equal to each value of the random variable. Example :
      Roll of Die (Random Variable) 1 2 3 4 5 6
      Probability Distibution 1/6=16.6% 16.6% 16.6% 16.6% 16.6% 16.6%
      Cumulative Probability Distibution 1/6 = 17% 2/6=33% 50% 67% 83% 100%
    • model : applying mathematical expression(s) or equation(s) to a real-world scenario. This is often seen when applying a probability distribution to a real population.
  • Sampling

    • space : set of all possible non-overlapping outcomes
    • with Replacement
    • without Replacement
  • Mathematical

    • Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
    • Fundamental Counting Principle
    • Terminology or Rules regarding Probabilities
      • Legal Values : 0 ≤ P(A) ≤ 1 for any Event A
      • Total Probability = 1 : P(entire sample space) = 1 : the probabilities of all outcomes in the sample space sum to 1.00 (100%)
      • Complement of Event : P(A^c) = 1 - P(A)
        • P("of at least one") = 1 - P("exactly none")
      • Mutually Exclusive Events [i.e. Disjoint] : P(A and B) = P(A ∩ B) = 0
      • Independent Events : P(A|B) = P(A) also P(A and B) = P(A ∩ B) = P(A)·P(B)
      • Conditional Events : P(A|B) = P(A and B) / P(B) often re-written as P(A and B) = P(A ∩ B) = P(A|B)·P(B)
    • Addition Rules (aka Union) ["or"]
      • Full Rule : P(A or B) = P(A) + P(B) - P(A and B)
      • Simplified Rule for Mutually Exclusive Events [Disjoint] : P(A or B) = P(A) + P(B)
    • Multiplication Rules (aka Intersection) ["and"]
      • Full Rule : P(A and B) = P(A ∩ B) = P(A)·P(B|A) can also be rewritten as P(A and B) = P(A ∩ B) = P(B)·P(A|B)
      • Simplified Rule for Independent Events : P(A and B) = P(A ∩ B) = P(A)·P(B)
    • Bayes' Rule
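
The probability rules above can be checked on a small example. The sketch below assumes one roll of a fair six-sided die; the events A ("roll is even") and B ("roll is greater than 3") and the helper `prob` are chosen only for illustration. The last calculation uses the standard form of Bayes' Rule, P(B|A) = P(A|B)·P(B) / P(A), which the notes name but do not write out.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed equally likely outcomes)

def prob(event):
    """P(event) = (outcomes in event) / (outcomes in sample space)."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # hypothetical event: roll is even
B = {4, 5, 6}  # hypothetical event: roll is greater than 3

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)                 # intersection ("and")
p_a_or_b = p_a + p_b - p_a_and_b        # full addition rule ("or")
p_not_a = 1 - p_a                       # complement rule
p_a_given_b = p_a_and_b / p_b           # conditional probability
p_b_given_a = p_a_given_b * p_b / p_a   # Bayes' Rule
independent = (p_a_and_b == p_a * p_b)  # multiplication test for independence

print("P(A or B) =", p_a_or_b)
print("P(A|B)    =", p_a_given_b)
print("P(B|A)    =", p_b_given_a)
print("independent?", independent)
```

Using `Fraction` keeps every probability exact, so the addition rule can be compared directly with counting the outcomes in the union of A and B.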

Chapter 7 Terminology : Random Variables : numeric outcome of a random phenomenon

  • Random Variable, X
    • Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
    • Discrete (integer values only for X)
      • P(X=a) : probability of the random variable being a fixed value; height of a probability histogram bar
    • Continuous (any real value for X)
      • P(X<a) : probability of the random variable being less than (or equal to) a fixed value; area under the curve from left side to a [reading left to right]
      • P(X>a) : probability of the random variable being greater than (or equal to) a fixed value; area under the curve from a to right side [reading left to right]
      • P(a<X<b) : probability of the random variable being between (or equal to) two fixed values; area under the curve between a and b.
  • Distributions
    • Probability Distributions for a Random Variable, X
      • center : mean, μx : sometimes misleadingly called the Expected Value, E(X)
      • spread : standard deviation, σx or variance, σx² : Note: correlation for independent random variables = 0
      • obtained from Collected Data (i.e. empirical data)
        • using Known Data Frequencies taken from real-world situations
        • simulation using Random selection from known data
      • obtained from Mathematics (i.e. theoretical data)
        • assumptions + Basic Mathematical Principles : Example : rolling a fair die.
    • Probability Distributions for multiple Random Variables, X and Y
      • linear transformation of one random variable X into another random variable Y : Y = aX+b :
        • center : μy = a·μx + b : the new center will shift
        • spread : σy² = a²·σx² (so σy = |a|·σx) : the new spread will dilate (stretch wider or narrower); adding b does not change the spread
      • linear combinations of two independent random variables X ± Y :
        • center : μx±y = μx ± μy : the new center will add or subtract.
        • spread : σx±y² = σx² + σy² : the variances ALWAYS add, even when subtracting (spread can only increase)
  • Types of Distributions
    • Uniform : every outcome has the exact same probability (rolling a die); Parameter : p
    • Normal : most-used for sampling (Central Limit Theorem); Parameters : center=μ; spread=σ
    • Binomial : fixed number of trials n, each with only 2 possible outcomes and a fixed probability of success (i.e. success or failure, flipping coin, yes/no questions, etc.); Parameters : n and p
    • Geometric : sequence of trials with 2 possible outcomes and fixed probability of success, counting the number of trials n until the first success; Parameter : p
    • Chi-Square : used for categorical data (Sem 2)
    • Other : HyperGeometric, Poisson, Gamma, etc. (not on AP Exam)...each is a model of various real-world scenarios
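
The center and spread rules for random variables can be verified numerically. The sketch below assumes a fair die as the random variable X; the helper names (`mean`, `variance`, `linear_transform`, `combine_independent`) are hypothetical and exist only for this illustration.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    """Mean (expected value) of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance: probability-weighted average of squared deviations from the mean."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

def linear_transform(dist, a, b):
    """Distribution of Y = aX + b."""
    out = {}
    for x, p in dist.items():
        y = a * x + b
        out[y] = out.get(y, 0) + p
    return out

def combine_independent(dist_x, dist_y, sign=1):
    """Distribution of X + Y (sign=1) or X - Y (sign=-1) for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = x + sign * y
        out[value] = out.get(value, 0) + px * py  # independence: multiply probabilities
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}  # assumed fair die

y = linear_transform(die, a=2, b=3)
total = combine_independent(die, die, sign=1)
diff = combine_independent(die, die, sign=-1)

print("X    :", mean(die), variance(die))
print("2X+3 :", mean(y), variance(y))          # variance scales by a squared
print("X+Y  :", mean(total), variance(total))  # variances add
print("X-Y  :", mean(diff), variance(diff))    # variances still add
```

Comparing the last two lines shows why the spread of a difference is not smaller than the spread of a sum: the means subtract, but the variances add in both cases.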

Chapter 8 : Binomial and Geometric Distributions

  • Binomial Distributions : Discrete Random Variable with only 2 categories (i.e. "success" or "failure")
    • n = fixed number of independent trials, must be known
    • p = probability of success on any one trial, must be the same for each trial, must be known
    • 1-p = q = probability of failure on any one trial
    • P(X = k) = nCk · p^k · (1 - p)^(n - k)
    • P(X > k) = 1 - P(X ≤ k)
    • probability distribution function at X = "number of successes":
      • calculator : binompdf(number of trials, probability of success, number of successes)
    • cumulative (i.e. sum of) probability distribution function for 0 ≤ X ≤ "number of successes" = area under curve
      • calculator : binomcdf(number of trials, probability of success, number of successes)
  • Statistics of a Binomial Distribution
    • μ = np
    • σ = √[np(1-p)]
    • Normal approximation to Binomial Distribution ~ BINS (binary outcomes, independent trials, number of trials is fixed, same known probability of success on every trial)
      • N(np, √[np(1-p)] ) : used only if np ≥ 10 and n(1-p) ≥ 10
  • Geometric Distributions : Discrete Random Variable with only 2 categories (i.e. "success" or "failure") that counts the number of trials until the first success.
    • n = number of independent trials (not fixed)... we are trying to determine this number of trials needed to get our first "success"!
    • p = probability of success on any one trial, must be the same for each trial, must be known
    • 1-p = q = probability of failure on any one trial
    • P(X = n) = (1 - p)^(n - 1) · p
    • P(X > n) = 1 - P(X ≤ n) = (1 - p)^n
    • probability distribution function at X = "number of trials until first success":
      • calculator : geometpdf(probability of success, number of trials until first success)
    • cumulative (i.e. sum of) probability distribution function for 1 ≤ X ≤ "number of trials" = area under curve
      • calculator : geometcdf(probability of success, number of trials until first success)
  • Statistics of a Geometric Distribution
    • μ = 1/p
    • σ = [√(1-p)] / p
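
The binomial and geometric formulas in this chapter can be written as short Python functions. This is a minimal sketch: the function names are hypothetical stand-ins that loosely follow the calculator argument order, and the values n = 10, p = 0.5 are chosen only as an example.

```python
from math import comb, sqrt

def binom_pdf(n, p, k):
    """P(X = k) for a binomial random variable: nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k): sum of the binomial probabilities for 0, 1, ..., k successes."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

def geom_pdf(p, n):
    """P(X = n) for a geometric random variable: first success occurs on trial n."""
    return (1 - p)**(n - 1) * p

def geom_cdf(p, n):
    """P(X <= n): first success occurs on or before trial n."""
    return 1 - (1 - p)**n

n, p = 10, 0.5  # example values (assumed)
print("P(X = 5)  :", binom_pdf(n, p, 5))
print("P(X <= 5) :", binom_cdf(n, p, 5))
print("P(X > 5)  :", 1 - binom_cdf(n, p, 5))
print("binomial mean, sd :", n * p, sqrt(n * p * (1 - p)))
print("first success on trial 3 :", geom_pdf(p, 3))
print("geometric mean, sd :", 1 / p, sqrt(1 - p) / p)
```

The cumulative functions are built from the same pieces as the single-value functions, which mirrors the "sum of" description in the bullets above.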

Chapter 2 : Normal Distributions

  • The Central Limit Theorem (CLT) : if the sample size is sufficiently large, the sampling distribution of the sample mean will be approximately normal, regardless of the shape of the population distribution.
  • Definitions and Formulas:
    1. Shape : normal distribution : bell-shaped and symmetric about the mean, with inflection points at 1 standard deviation on either side of the mean.
    2. Center : mean : arithmetic average of data points : (sum of all data values) / (number of data values)
    3. Spread : standard deviation : roughly the typical distance of data points from the mean (formally, the square root of the average squared deviation)
    4. Standardized score : z-score : z = (dataValue - mean) / (standard deviation) = (x - μ) / σ
      1. z-score as a standard point on normal curve [also the formula to calculate as well as re-centering and re-scaling]. Usually marked on the horizontal axis (i.e. x-axis).
      2. N(0,1) is the Standard Normal curve (i.e. Z-curve)
      3. Calculator : normalcdf ( leftbound, rightbound, mean, standard deviation) : area between leftbound and rightbound values on a normal probability distribution (also the percentage or probability that a data value will be between leftbound and rightbound).
      4. Calculator : z = invNorm ( area, mean, SD ) : the rightbound value that corresponds to the given area under the curve starting at a leftbound at -infinity (i.e. furthest left)
      5. Must also use a standard normal table or computer-generated output
      6. Note : z-scores (like percentages) are used to compare relative positions of points within or between data sets (distributions)
  • normal curve of a population
    • Parameters for Quantitative Variable (i.e. summary numbers of population):
      • μ : population mean [also the formula to calculate] is the center
      • σ : population standard deviation, [also the formula to calculate] is the spread
    • Parameters for Categorical Variable (i.e. summary numbers of population):
      • p : population proportion [also the formula to calculate] is the center
      • Note : there is no population standard deviation (spread) for categorical data
  • normal distribution (i.e. probability density curves) from theory

    • total area under probability density curve = 1
    • visually determine : median (equal area point), mean (balance point)
    • skewed : mean further toward tail than median
    • N(μ, σ) : Normal (bell) curve [i.e. Normal Probability Density Curve] with center at μ and standard deviation σ
    • 68-95-99.7 rule
    • Estimate normality from plot of histogram, stemplot and/or boxplot
  • normal point-estimators (or statistics) from sample data

    • Statistics for Quantitative Variable
      • x̄ ≈ μ : sample mean as an estimate for population mean [also the formula to calculate]
      • sx ≈ σx : sample standard deviation as an estimate for population standard deviation [also the formula to calculate]
    • Statistics for Categorical Variable
      • p̂ ≈ p : sample proportion as an estimate for population proportion [also the formula to calculate]
      • s: sample standard deviation for a normal sampling distribution [also the formula to calculate]
    • NOTES :
      1. Unbiased point estimator : the mean of its sampling distribution equals the population parameter
      2. To avoid biased estimators, samples taken from the population must be :
        1. random
        2. independent (n < 10% * N)
        3. normal : (n>30 for quantitative, np≥10 and n(1-p)≥10 for categorical )...later other distributions have different requirements
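
The z-score formula and the two calculator commands from this chapter have close counterparts in Python's standard library. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the wrapper names `z_score`, `normal_cdf`, and `inv_norm` are hypothetical and only imitate the calculator argument order, and the example values (mean 100, standard deviation 15) are assumptions.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardized score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

def normal_cdf(left, right, mean=0.0, sd=1.0):
    """Area under the normal curve between left and right (like normalcdf)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(right) - dist.cdf(left)

def inv_norm(area, mean=0.0, sd=1.0):
    """Value with the given area to its left under the normal curve (like invNorm)."""
    return NormalDist(mean, sd).inv_cdf(area)

# 68-95-99.7 rule on the standard normal curve N(0, 1)
for k in (1, 2, 3):
    print(f"within {k} SD:", round(normal_cdf(-k, k), 4))

# Example with an assumed population N(100, 15)
print("z for x = 130 :", z_score(130, 100, 15))
print("P(85 < X < 115) :", round(normal_cdf(85, 115, 100, 15), 4))
print("90th percentile :", round(inv_norm(0.90, 100, 15), 2))
```

Because `normal_cdf` subtracts two left-tail areas, it matches the "area between leftbound and rightbound" description, while `inv_norm` works in the opposite direction, from a left-tail area back to a value.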