Chapter 1 Terminology : Exploring Data
Fathom Terminology
 Toolbar
 Formula Editor
 Data :
 Collection
 Case or Cases (i.e. individuals)
 Attribute (i.e. variable)
 Value
 Viewing the Data:
 Inspection Window
 Table (or Case Table) : an organized collection of related data cases
 Graph (or Plot) : bar, pie, histogram, dotplot, stemplot
 Summary Table or Chart
Definitions
Some important terms that are specific to AP Statistics

Data : one or more values representing counts or measurements. Note: "data" can be singular or plural, like the word "deer" (i.e. "one deer or many deer"); this can be a source of confusion.

Distribution : multiple data values that share a common relationship (i.e. can be graphed on the same axis)

Population : a group that is the focus of interest. Usually the population size is too large to gather complete data (called a "census"). Instead, we take a sample of the population. Population size is usually denoted by "N".

Parameter : a single number that represents a summary of the population data. Examples include mean of population, standard deviation of population, or population variance. Discovering parameters is one of the main goals of the field of study called Statistics.

Sample : a group of data that is taken from a population (i.e. subset of population data). The size (or number) of a sample is denoted by "n".

Statistic (aka Point Estimator) : a single number that represents a summary of sample data. Examples include: sample mean, sample standard deviation, sample variance. Calculating a statistic to estimate a parameter is one of the main processes in the field of study called Statistics.

Note: the word "sample" can be a source of confusion! Sample could be a single member of the population (i.e. n = 1), a small group from population (n<N), or even the full population (n=N) in which case the sample is a "census". A "sample" could even represent a group of other samples from a distribution!
Types of Data
Data types are known as "attributes" in Fathom
 categorical (or qualitative) variable : data that places cases into groups by labels or names (descriptors).
 Numeric Representation : frequency table : number of cases (i.e. counts) falling within each category
 Numeric Representation : relative frequency table : synonyms : proportion, fraction, rate, ratio, or % of cases falling within each category. Note: dividing the frequency (count) by the total number of counts gives the relative frequency.
 Graphical Representation : bar chart : height corresponds to count or proportion of data in category.
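The frequency / relative-frequency idea above can be sketched in a few lines of Python (the color values here are made-up example data, not from Fathom):

```python
# Build a frequency table and a relative frequency table for
# categorical data (hypothetical color-preference values).
from collections import Counter

values = ["red", "blue", "red", "green", "blue", "red"]
freq = Counter(values)                               # counts per category
total = sum(freq.values())
rel_freq = {k: v / total for k, v in freq.items()}   # count / total

print(freq["red"])       # 3
print(rel_freq["red"])   # 0.5
```

Note the relative frequencies always sum to 1 (100%), which is a quick sanity check on any relative frequency table.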
 quantitative variable : data that takes on values that are either a count of a quantity (i.e. 5 bananas) or a measured real-valued quantity (i.e. 2.3 inches).
 Discrete Quantity:
 Numeric Representation : data can take on any value from a countable (possibly countably infinite) set (i.e. Integers or Rationals but not Irrationals).
 Graphical Representations : histogram (separate bars), stem and leaf plot, and dot plot
 Continuous Quantity:
 Numeric Representation : data can take on any real-numbered value but cannot be counted (i.e. Reals). Between any 2 data values, another value can always be calculated.
 Graphical Representations : histogram (bars share common sides) and probability density function or PDF (normal, uniform, etc.)
 Notes :
 A quantity is a number with a descriptor and, therefore, has both a mathematical component and a language component (or "context").
 Example1 : quantity : 5 bananas has a number (5) and a descriptor or unit (bananas)
 Example2 : numbers : 5 (count) , 1/3 (proportion), 25% (proportion) , 1st (categorical), pi (continuous real number)
 A source of confusion : numbers can be used with the "nominal" property in place of text-based labels.
 Example1 : categorical data : In the Olympics, a country may have : five 1st place winners, 3 2nd place winners, and VIII 3rd place winners. The counts (five, 3, VIII - i.e. 5, 3, 8) are applied to the categories (1st place, 2nd place, 3rd place). Note: these 3 categories, while written as numbers, cannot be "added" together without creating a different overall category such as "medal winners."
 Example2 : quantitative (discrete) data : Similar to above but in a different context...in the Olympics, a country may have : five 1st place winners, 3 2nd place winners, and VIII 3rd place winners. If we define an overall category of "medal winners" then the counts (five, 3, VIII) can be added together for a total of 5+3+8 = 16 medal winners. However, it would not make sense to find the "average" number of medal winners (i.e. 16/3 = 5.333...) and therefore these are NOT continuous quantitative data.
 Example3 : categorical data : The following list of items is labeled with numbers to put them in order of priority, but the labels only represent categories and not quantitative data : 1. milk; 2. cheese; 3. candy; 4. soda. Note : we can replace the numbered list with letters such as A. milk; B. cheese; C. candy; D. soda without losing any information.
 Example4 : quantitative (continuous) data : In a school, the heights of the children were 60.3 inches, 39.7321 inches, and 40 inches. The total height would be 140.0321 inches and the mean height is 140.0321/3 = 46.67736667... inches
 The distinguishing factor is whether any given number can be used to "compute" other values and whether means (averages) have contextual meaning.
 For more details, see the following lecture notes:

Lecture 1 : Numerals : how we write/communicate numbers.
Example A : different ways to write the same Number (note: there are many more ways than the 8 listed here):

27 (Hindu-Arabic numeral system used in most modern societies)

twenty-seven (English-language numeral system : verbal communications)

XXVII (Roman numerals : old, and not designed for computation)

0001 1011 (binary numerals : computer representations)

///// ///// ///// ///// ///// // (tally system : simple)

one score and seven (English a la Lincoln)

00:27:00 (sexagesimal system : based on degrees:minutes:seconds; one of the oldest numeral systems in existence. Ancient Babylonians used it to compute time, geography, and angles. Base-60 eases many issues with fractions/division because 60 has many small divisors.)

1B (hexadecimal system, which uses the digits 0,1,...,9,A,B,C,D,E,F)
Example B : the numerals 0, 1, 2, and 3 in mathematical set-theoretic notation...uses only three symbols: "{", "}", ","

0 = {},

1 = {0} = {{}},

2 = {0,1} = {{},{{}}},

3 = {0,1,2} = {{},{{}},{{},{{}}}}
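The set-theoretic numerals above can be mirrored in Python as a small sketch (frozenset is used because ordinary Python sets cannot contain other sets):

```python
# Von Neumann-style numerals: each numeral is the set of all
# smaller numerals, exactly as in the notation above.
zero = frozenset()                    # 0 = {}
one = frozenset({zero})               # 1 = {0}
two = frozenset({zero, one})          # 2 = {0, 1}
three = frozenset({zero, one, two})   # 3 = {0, 1, 2}

# The cardinality (size) of each numeral equals the number it names.
print(len(zero), len(one), len(two), len(three))  # 0 1 2 3
```

A nice side effect: "less than" becomes "is a subset of" (e.g. `two < three` is True for these frozensets).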
Lecture 2 : Numbers : how we use numbers to count, measure, and label objects.

When we use numbers to "count" objects, we are referring to the "cardinality" property.

When we use numbers to compare objects ("bigger versus smaller"), we are referring to the "comparison" property.

When we use numbers to label objects (like the list below) we are referring to the "nominal" property.
Usually we just talk about "Number Systems" which are sets of numbers.

Natural Numbers (aka Counting Numbers)
 If used to describe the "size" of an object then we call the number a Cardinal such as 5 bananas [five is the cardinality of the set of bananas].
 If used to describe a property of an object for comparison purposes then the number is a Measurement such as "I scored an 85% on the test." The 85% by itself does not mean anything unless we compare it to other scores. (If the measurement includes a magnitude and a unit (a word such as "meter") it is called a Quantity.)
 If used only as a label rather than for any calculations, then the number is a Nominal such as this listing or numbering of the definitions for Cardinal, Measurement, and Nominal...we could have just used a bulleted list instead of a numbered list.

Integers (discrete) : no fractions or decimals, includes positives, negatives, and 0.

Rationals (discrete) : Integer divided by Integer; also called Fractions, % , proportions

Real : Rationals + Irrationals (i.e. non-terminating, non-repeating decimals)

Complex Numbers : Real + Imaginary (numbers that can "go round and round", i.e. rotations)

Types of Graphical Representations
 Charts : Categorical Data
 bar chart (bar graph) :
 horizontal axis : categories (labels or names)
 vertical axis : counts (i.e. frequency) or relative frequency (proportions or fractions)
 Plots & Graphs : Quantitative (Discrete or Continuous) Data
 boxplot (and modified boxplot) :
 horizontal axis : data values
 vertical axis : none; may have several boxplots side-by-side to compare similar data in different categories
 uses 5-number summary from left to right : min, Q1, Q2, Q3, max
 box is from Q1 to Q3 with horizontal lines ("whiskers") to min and max. Modified boxplots have whiskers that exclude outliers. Vertical line at Q2 (median).
 dotplot : horizontal axis corresponds to data values with a dot placed above the data value. Multiple identical (or nearly identical) data values have dots stacked vertically.
 stemandleaf plot : "stem" is the first digit or leftmost digits of a number and "leaf" is usually the last digit (unit) of the number.
 other : time plot
 histograms
 vertical axis : counts of data values (frequency ) or proportions (i.e. relative frequency or pdf).
 horizontal axis : data values or ranges of data values
 Relative frequency histogram (discrete) : data is grouped in bars or bins (ranges of values) separated by spaces or gaps.
 Relative frequency histogram (continuous) : data is grouped in bars or bins (ranges of values) with no separation
 Probability Density Function or PDF (continuous) : data is not grouped in bars (or grouped into infinitely thin bars), which creates a single continuous curve. The data can be modeled by a theoretical (i.e. mathematical) function. The total area under a PDF curve is exactly 1 (100%). The two most common PDFs are normal (bell-shaped or mound-shaped) and uniform curves.
 cumulative frequency graph (aka "ogive"): a curve drawn to represent total values less than or equal to a given number (left side is 0, right side is 100%)
 horizontal axis : data values
 vertical axis : proportion totals
 Notes:
 Histograms and Bar Charts appear and are interpreted in a similar manner (i.e. counts=frequencies or relative frequencies=proportions).
 Bar Charts are for categorical data (horizontal axis are labels) whose bars that are separated by blank space.
 Histograms are for quantitative values (horizontal axis shows data values or ranges of data values), with spaces between bars representing discrete quantities. Continuous quantities have no spaces between bars. A histogram whose bars become infinitely thin (width → 0) becomes a probability density function (pdf).
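The binning step behind every histogram can be sketched in a few lines (the data values and bin width below are made-up assumptions):

```python
# Group quantitative data into equal-width bins and count per bin -
# the counts are the bar heights of a frequency histogram.
data = [1.2, 2.5, 2.7, 3.1, 4.8, 4.9, 5.0]
bin_width = 2.0
low = 0.0

counts = {}
for x in data:
    b = int((x - low) // bin_width)    # bin index for value x
    counts[b] = counts.get(b, 0) + 1

# bin 0 covers [0, 2), bin 1 covers [2, 4), bin 2 covers [4, 6)
print(counts)  # {0: 1, 1: 3, 2: 3}
```

Dividing each count by `len(data)` would turn this into a relative frequency histogram.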
Distributions of Quantitative Data : SHAPE > CENTER > SPREAD

SHAPE of distribution : (i.e. describing a graph of a collection or group of data)
 symmetric : left and right halves are mirror images; mean and median are close to each other
 uniform (or rectangular) distribution : all heights are approximately the same (i.e. a flat top)
 normal distribution (i.e. "bell curve")
 skewness (i.e. partly symmetric but one side "longer" than the other)
 left skewed (or negative skew) : left tail is longer than right tail; mean is left of median
 right skewed (or positive skew) : right tail is longer than left tail; mean is right of median
 other
 bimodal distribution : two peaks or clusters
 gap : region of distribution between 2 data values where there is no observed data

CENTER [summary statistic or point estimator : single value measurement or computation]
 mean : the arithmetic mean or average : add up all the data values and divide by the total count of data
 median : the middle value when data is ordered. Half of the data counts are below the median and half are above. For AP Statistics, the median for an even number of data points is usually the mean (average) of the two middle data values.
 mode : datum that shows up the most; rarely used

SPREAD or variation [summary statistic or point estimator : single value measurement or computation]
 deviation (or residual)
 for means : standard deviation (s) or variance (s^{2})
 for medians : quartiles (ranges)
 5number summary : minimum, Q1, Q2 (median), Q3, maximum
 range : a single value = maximum - minimum. Note: in some math courses, the range is given as two values [min, max]
 interquartile range (IQR) = Q3  Q1
Other Unusual Features

Outliers : data points that are unusually small or large relative to the rest of the data.

for means : any value 2 or more standard deviations above or below the mean.

for medians : any value less than Q1 - 1.5*IQR or greater than Q3 + 1.5*IQR

Gaps : region of a distribution between two data values where there is no observed data

Clusters : concentrations of data usually separated by gaps.

The mean, standard deviation, and range are non-resistant (non-robust) because they are influenced (changed) by outliers. The median and IQR are resistant (robust) because outliers do not greatly (or at all) affect their values.
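The 5-number summary, IQR, and 1.5*IQR fences above can be computed with Python's statistics module (made-up data; note `statistics.quantiles` uses its "exclusive" method by default, which can differ slightly from a calculator's quartiles):

```python
# Five-number summary, IQR, and the 1.5*IQR outlier fences.
from statistics import quantiles

data = [2, 4, 4, 5, 6, 7, 8, 9, 30]        # 30 looks like an outlier
q1, q2, q3 = quantiles(data, n=4)          # the three quartiles
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(min(data), q1, q2, q3, max(data))    # the 5-number summary
print(outliers)                            # [30]
```

Removing the 30 changes the mean a lot but the median very little, which is the resistant/non-resistant distinction described above.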
Linear Transformations : effects on mean, spread and change of units
Definition :
a Linear Transformation takes every data point and recalculates each to a new value using multiplication and/or addition by constants. An example of a Linear Transformation is converting degrees Fahrenheit to degrees Celsius.
degrees Celsius = 5/9 * (degrees Fahrenheit - 32)
 [i.e. subtract 32 and then multiply by 5/9]

When a linear transformation takes place, the following statistics are changed :
 The center (i.e. mean) is transformed by both the multiplication and addition constants. Example: if the mean temperature was 122 degrees Fahrenheit, then the mean temperature would be 5/9 * (122 - 32) = 50 degrees Celsius.
 The spread (i.e. standard deviation) is transformed by the multiplication only. Example: if the standard deviation was 18 degrees Fahrenheit, then the standard deviation would be 5/9 * 18 = 10 degrees Celsius. Technically the variance (square of the standard deviation) is multiplied by the square of the multiplicative constant: s^{2}_{C} = (5/9)^{2} * s^{2}_{F}
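A quick check of those two rules with Python's statistics module (the Fahrenheit values are made-up data):

```python
# Under C = 5/9*(F - 32), the mean shifts AND rescales, while the
# standard deviation only rescales (the additive -32 drops out).
from statistics import mean, pstdev

f_temps = [104.0, 122.0, 140.0]             # hypothetical Fahrenheit data
c_temps = [5/9 * (f - 32) for f in f_temps]

print(mean(f_temps), mean(c_temps))         # 122.0 -> 50.0
print(pstdev(c_temps) / pstdev(f_temps))    # ratio is 5/9, as claimed
```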
Context Clues
 statistical use of a "count" :
 Mathematics : Generally, a "count" is a whole number (0, 1, 2, 3, ...). However, it may be a rational number if the item being "counted" is grouped in equal-sized units. Ex: 1/3 of a case is equivalent to 4 bottles in a 12-bottle case of soda.
 English : each numeric value must have units such as "oranges" or "individuals" or "cases of soda"
 Examples : 20 oranges, 17 people, 22.5 cases of soda
 statistical use of a "proportion" :
 Concept : part / total
 Mathematics : rationalvalue ratio between 0 and 1 (inclusive). May be represented by a fraction or decimal value or percentage.
 English : same units divided by same units (i.e. a unitless quantity)
 Example : In a class of 25 students with 10 girls and 15 boys, the proportion of girls in the classroom is 10/25 (i.e. 40%). NOTE: Technically, the units are "students per students" but this does not need to be specified.
 statistical use of a "mean" :
 Concept : the (arithmetic) average value of a set of numbers
 Mathematics : ( Σ x_{i}) / n = (sum of values) / (count of values)
 English : units divided by a count (i.e. the mean has the same units as the numbers themselves)
 Example : In a class of 5 students, the points on a given quiz are 43, 47, 50, 40, and 44. The mean is (43+47+50+40+44) / 5 = 44.8 points/student. NOTE: The units are "points per student" and must be specified.
Chapter 3 : Two Variable Relations : Quantitative (continuous) : Linear
Comparison : Chapters 1 & 2 (univariate) versus Chapter 3 (bivariate)
 Key Idea :
  Chapters 1 & 2, One Variable (univariate data) : Distribution (several related data points)
  Chapter 3, Two Quantitative Variables (bivariate data) : Association (relation)
 Plots/Graphs :
  univariate : dot plot
  bivariate : scatterplot and residual plot
 Ideal Form (shape) :
  univariate : Normal (bell or mound shaped)
  bivariate : Linear (oval/ellipse) with either positive (uphill) or negative (downhill) direction
 Terminology :
  univariate : normal, uniform, or skewed; symmetric; clusters, gaps, and outliers
  bivariate : Form (shape) : linear or non-linear (curved) > Direction (trend) : positive, negative, or no association > Strength : strong, moderate, or weak > Unusual Features : clusters, gaps
 Measure of Center or Estimate of Population :
  univariate : Mean (continuous), Median (discrete)
  bivariate : Regression Line (LSRL)
 Measure of Spread :
  univariate : Standard Deviation (continuous), Interquartile Range (discrete)
  bivariate : Correlation, r : a number that gives the direction and strength of a linear association (but only if the form is already known to be linear)

Summary line (LSRL) : similies & synonyms
 Least Squares Regression Line (LSRL)
 Fitted Line (student's best guess for a line)
 Line of Best Fit (best guess or LSRL)
 Regression Line (LSRL)
 Trend Line (LSRL)
ScatterPlots : form (shape) > direction (trend) > strength > variability
Relation between x and y variables (plausible explanations) : causation, common response, or confounding
 lurking variable
 residual plots
 outliers
 data distribution's form (shape) : linear, nonlinear, or none
 ŷ = a_{1} + b_{1}*x [equivalent to algebra equation of line : y = mx + b]
 form's (shape's) direction (trend) : positive slope, negative slope, or none
 b_{1} : measure of slope
 direction's (trend's) strength : strong (tight cluster), moderate (some clustering), or weak (no cluster)
 correlation, r : a measure of a trend's strength
 strength's variability : uniform (homoscedastic) or fan-shaped (heteroscedastic)
Symbols and Definitions :
 ŷ = a_{1} + b_{1}*x : least squares regression line ("regression line" or just LSRL) : Note: usually provided by calculator or software.
 x : explanatory variable or predictor variable from data points
 y : response variable or observed variable from data points
 ŷ : predicted value calculated from LSRL
 r : correlation coefficient : Note : formula will be given...usually calculator or software will provide this number
 r^{2}: coefficient of determination : proportion (percentage) of variation (i.e. change) in y that is explained by variation (i.e. change) in x.
 Coefficients of LSRL :
 b_{1} = r * (s_{y}/s_{x}) : slope of the LSRL : the change in the predicted value ŷ for every unit increase in x. Note: s_{y} = standard deviation of all data points' y-values, s_{x} = standard deviation of all data points' x-values...useful if the student is given r, s_{y}, and s_{x}
 a_{1} : y-intercept : not always meaningful in context.
 Example: Given LSRL model : height(inches) = 2.75 * age(years) + 20; the coefficients are :
 ŷ = "predicted height in inches"
 x = "age in years"
 b_{1}=2.75 interpreted as "for every 1 year increase in age, the predicted height increases by 2.75 inches"
 a_{1}=20 interpreted as "when born (i.e. age=0 years), the height is predicted to be 20 inches"
 so, for a 10-year-old, the predicted height would be 47.5 inches (47.5 = 2.75*10 + 20)
 interpolations versus extrapolation
 Special Points (compare & contrast) :
 influential point : any point that, when removed, changes the LSRL significantly (i.e. changes b_{1}, a_{1}, or r values)
 outlier in y : large residual y - ŷ in the LSRL compared to other observation points
 high-leverage point in x : substantially larger or smaller x-value than other observation points
 residual = y - ŷ = observed y - predicted y : Note : a random (patternless) residual plot is evidence for a linear form.
 sum of squared errors (SSE)
 "correlation does not imply causation" due to lurking variable
Chapter 4 : Two Variable Relations : Quantitative (continuous) : NonLinear and Categorical
Quantitative (continuous) Data : NonLinear
 Transformations :

Purpose : convert nonlinear data into linear data so we can use LSRL model

Note: textbooks may use "x" or "t" (for time) for the explanatory variable

linear > linear models : y = a + bx or y = a + bt
 increase is fixed amount from previous values
 Example : useful for "simple" unit conversions such as Fahrenheit to/from Celsius : degF = 32 + 9/5*degC (equivalently, degC = 5/9*(degF - 32))

exponential > linear models : y = ab^{x} or y = ab^{t}
 plot : "log y against x" or "log y against t" to see linear pattern
 Example : biological growth of cell division can be modeled by y = 2^{t}; We can take the log of both sides to get: log(y) = log(2) * t which graphs to a line

power > linear models : y = at^{p} or y = ax^{p}

Note: "p" is the power (or exponent) used in the equation
 plot : "log y against log x" or "log y against log t" graph to see linear pattern
 Example : geometry (volumes and areas) can be modeled such as y = x^{3}. We can take the log of both sides to get: log(y) = 3 * log(x) which graphs to a line.
 square, p=2 : y = x^{2} or y = t^{2}
 reciprocal, p=-1 : y = 1/x or y = 1/t
 reciprocal square root, p=-1/2 : y = x^{-1/2} or y = t^{-1/2}
 logarithm, "p=0" : y = log(x) or y = log(t) (by convention, the logarithm fills the p=0 slot on the ladder of powers)
Categorical Data : counts and percentages within categories
 Recall Chapter 1 : One Variable (univariate) :
 Visualizations : bar graphs, segmented bar graphs, mosaic plots
 Two Variable (bivariate)
 Visualizations : twoway tables or contingency tables
 Cell data may be :
 frequency counts :
 Example frequency table : Categorical Variables : sex versus color preferences for 30 people

          Red   Blue
Male       5     7
Female    12     6
 joint relative frequency (cell frequency divided by the total for the entire table) :
 Example relative frequency table: Categorical Variables : sex versus color preferences for 30 people

          Red            Blue
Male      5/30 ≈ 17%     7/30 ≈ 23%
Female    12/30 = 40%    6/30 = 20%
 other relative frequency (percentage) tables also exist
 Summary Statistics :
 marginal frequencies : an additional cell at the bottom of each column (or right of each row) holding the column (row) total; dividing a marginal total by the total of the entire table gives a marginal relative frequency.
 Example Marginal frequencies: Categorical Variables : sex versus color preferences

                      Red   Blue   Totals (Column Margin)
Male                   5     7      12
Female                12     6      18
Totals (Row Margin)   17    13      30
 conditional frequencies : each cell = the cell's frequency divided by a marginal frequency (row or column)
 Example Conditional frequencies: Categorical Variables : sex versus color preferences

          Red                      Blue                     Totals
Male      5/12=42% or 5/17=29%     7/12=58% or 7/13=54%     12/30=40%
Female    12/18=67% or 12/17=71%   6/18=33% or 6/13=46%     18/30=60%
Totals    17/30=57%                13/30=43%                30/30=100%
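The marginal and conditional frequencies above can be computed directly from the raw two-way counts; a sketch using the same sex-versus-color data:

```python
# From a two-way table of counts, derive a marginal (row) total and
# a conditional frequency P(Red | Male) = cell count / row total.
counts = {("Male", "Red"): 5, ("Male", "Blue"): 7,
          ("Female", "Red"): 12, ("Female", "Blue"): 6}
total = sum(counts.values())                       # 30 people overall

row_totals = {}
for (sex, color), n in counts.items():
    row_totals[sex] = row_totals.get(sex, 0) + n   # marginal frequencies

p_red_given_male = counts[("Male", "Red")] / row_totals["Male"]  # 5/12
print(total, row_totals["Male"], round(p_red_given_male, 3))
```

Dividing each cell by `total` instead gives the joint relative frequencies, and dividing by a column total gives the other conditional direction.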
Simpson's Paradox (lurking variable as a precondition) : occurs when categories are a combination of smaller categories.
Chapter 5 : Collecting, Producing & Exploring Data
Definitions and Fundamental Concepts
 population : all items or subjects (units) of interest. "N" is population size.
 sample : selected subset of population. "n" is sample size
 collection types : census versus sample
 summary numbers : parameter versus sample statistic (or point estimator)
Collecting Samples
Overarching Concerns when collecting a Sample from a Population :
 BIAS : certain responses (or samples) are systematically favored over other responses (or samples)

samples : unbiased representation of population [MUST BE random and independent]
Methods to Collecting Samples to avoid Bias
 simple random sample (SRS) : each sample has an equal probability of being selected. Use a table of random digits, a calculator, or software.
 stratified random sample : division of population into smaller groups (strata) of similar individuals (homogeneous grouping), then an SRS within each stratum. Example: strata of boys and girls, then an SRS of the boys and an SRS of the girls.
 cluster random sample : division of population into smaller groups (clusters; heterogeneous grouping, each ideally a mini version of the population), then randomly select entire clusters and observe every individual in them. Example: randomly select two cities and survey everyone in each selected city. (Taking an SRS within each selected cluster instead is a two-stage cluster sample; see below.)
 systematic sample with random starting point and fixed, periodic interval. Example: Number each student in classroom, start with a random student (say 5th student) and then select every 3rd (i.e. 8th, 11th, 14th, etc.) student.
 Other : twostage (or multistage) cluster sample
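Two of the methods above, sketched with Python's random module (the population of 30 numbered students and the seed are assumptions for the demo):

```python
# A simple random sample (SRS) and a systematic sample from a
# numbered population.
import random

population = list(range(1, 31))        # students numbered 1..30
random.seed(42)                        # reproducible for the demo

# SRS: every subset of size 5 is equally likely to be chosen.
srs = random.sample(population, k=5)

# Systematic: random starting point, then every 3rd unit.
start = random.randrange(3)
systematic = population[start::3]

print(len(srs), len(systematic))       # 5 10
```

Note a systematic sample is NOT an SRS: once the start is fixed, most subsets of the population can never be selected.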
Types of Bias in Samples :
 undercoverage bias : part of population has a reduced chance of being included in the sample.
 nonresponse bias : individuals chosen for sample refuse to respond.
 response bias & questionwording bias : confusing words and/or leading questions. Example: "If blue is the most favored pigment, what is your favorite color of amphibians?"
 convenience bias : nonrandom selection based on preferences
 volunteer bias : use of only volunteers will not be representative of the whole population
 Other Biases : judgement bias, size bias, incorrect response bias (lying)
Experiment vs Observational Study

Design of observational studies
Purpose : help investigate a topic of interest about a population. No treatment is imposed and no causal relationship can be determined.
 retrospective : examine data for a sample of individuals
 prospective : follow a sample of individuals to gather data into the future.
 sample survey : an observational study to collect data in order to learn about population from a sample of the population. Must be random and representative of population (i.e. minimize known biases).

Designs of experiments
Purpose : help determine causal relationships:
 experimental units ("subjects" or "participants" if human) > treatment (factor vs placebo) > observed response (measured or categorized)
 comparison of at least 2 treatment groups (one may be "control");
 randomize assignments of treatments to experimental units;
 replicate
 control for potential confounding variables; placebo & placebo effect.
 Types of Experimental Designs : Randomization & Independence
 single blind experiments & double blind experiments
 completely randomized design : treatments assigned to experimental units completely at random
 randomized block design : experimental units organized into similarvariable blocks. Treatments are assigned randomly to each block.
 randomized matched paired design : 2 treatments to a single experimental unit at different times or 2 subjects sharing a common relevant factor (such as age) each given one treatment.
 Variables & Treatments in Experiments
 explanatory variable or factor (if categorical) has levels that are chosen intentionally. Levels (or combination of levels) are treatments.
 response variable : measured outcome after treatment has been imposed.
 lurking variable
 confounding variable : a 3rd variable related to the explanatory variable and influences the response.
 control group : collection of experimental units either not given a treatment or given a placebo.
 placebo : an inactive substance given to a control group as a treatment that should not have an effect on the measured response. A placebo effect may occur if the experimental units have a response to the placebo.
 Variability (not on AP Exam)
 betweentreatment
 withintreatment
 Significance :
 statistically significant : observed changes are so large as to be unlikely to have occurred by chance.
 practically significant (i.e. lack of realism) : numerically large changes may not have an impact on the topic of interest.
 Example : There may be a statistically significant difference between test scores (say 83% and 87%) that cannot be explained by random chance. However, there may not be any practically significant difference as both scores result in a "B" grade.

Simulation of Experiments
Purpose : model of chance behavior (random events) such that simulated outcome closely matches realworld outcomes. The outcomes are based on either empirical data or mathematical probability model of the realworld experiment. Simulation may be done because a realworld experiment may be impractical or too costly in terms of time or money.
 Definitions :
 random process : generates results that are determined by chance
 outcome : result of a trial of a random process
 event : collection of outcomes
 simulation : a model of random events.
 Procedure :
 Describe the realworld experiment.
 State assumptions about the model for one trial and its connection to the real-world experiment.
 Assign digits to represent every realworld outcome. (random number table)
 Simulate many repetitions (i.e. many trials) to generate counts for each outcome
 Calculate probabilities (counts/totals) & state conclusions
 Notes:
 Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
 (false) "Law of Small Numbers" : patterns (or probabilities) may not match expectations over a small number of trials. Humans are susceptible to this effect: if we flip a fair coin 10 times and observe 2 heads and 8 tails, we may wrongly reject the "fairness" of the coin because we expect about 5 heads and 5 tails (a fair coin is defined as a 50% probability of heads). Not on AP Exam.
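The Law of Large Numbers is easy to see in a coin-flip simulation (the seed and trial counts are arbitrary choices for the demo):

```python
# Empirical proportion of heads in simulated fair-coin flips:
# erratic for few trials, close to 0.5 for many trials.
import random

random.seed(1)

def empirical_p(n_flips):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

small = empirical_p(10)        # may be far from 0.5 ("law of small numbers")
large = empirical_p(100_000)   # should be very close to 0.5
print(small, large)
```

This is exactly the repetition step of the simulation procedure above: many trials, then counts divided by totals.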
Chapter 6 Terminology : Probability : a study of randomness
 event : a single occurrence of some item of interest
 Visualization : Venn diagrams & tree diagrams

Probability :
 probability of an event A : in repeatable situations can be interpreted as the relative frequency of the event in the long run; P(A) = (number of outcomes of event A) / (total number of outcomes in sample space)
 complement of event : A' or A^{c}; the probability that event A will NOT occur is P(A^{c}) = 1 - P(A)
 mutually exclusive events A and B [disjoint categories] : P(A ∩ B) = 0
 conditional events A given B : P(A|B) = P(A ∩ B) / P(B) or, rearranging, P(A ∩ B) = P(B)·P(A|B)
 independent events A and B : P(A) is not changed by knowing B, i.e. P(A|B) = P(A); also P(A ∩ B) = P(A)·P(B)
 "at least one" : P("at least one") = 1 - P("exactly none")
 probability distribution : represented as a table or function showing the probability of each value of the random variable. Interpretation provides information about the shape, center, and spread of a population (e.g. normal shape, mean, and variance or standard deviation)
 probability cumulative distribution : represented as a table or function showing the probability of being less than or equal to each value of the random variable. Example :
Roll of Die (Random Variable)           1        2        3        4        5        6
Probability Distribution                1/6      1/6      1/6      1/6      1/6      1/6   (each ≈ 16.7%)
Cumulative Probability Distribution     1/6≈17%  2/6≈33%  3/6=50%  4/6≈67%  5/6≈83%  6/6=100%
 model : applying mathematical expression(s) or equation(s) to a real-world scenario. This is often seen when applying a probability distribution to a real population.
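The cumulative row of the die table is just a running total of the probability row; a sketch:

```python
# Fair die: probability distribution and its cumulative distribution
# (running total), matching the table above.
outcomes = [1, 2, 3, 4, 5, 6]
pdf = {x: 1/6 for x in outcomes}       # each face has probability 1/6

cdf, running = {}, 0.0
for x in outcomes:
    running += pdf[x]
    cdf[x] = running                   # P(X <= x)

print(round(cdf[3], 3))  # 0.5
```

Note the cumulative distribution always ends at exactly 1 (100%), the "total probability" rule below.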

Sampling
 space : set of all possible non-overlapping outcomes
 with Replacement
 without Replacement

Mathematical
 Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
 Fundamental Counting Principle
 Terminology or Rules regarding Probabilities
 Legal Values : 0 ≤ P(A) ≤ 1 for any Event A
 Total Probability = 1 : P(all sample space) = 1 : sum total probability of all sample space is 1.00 (100%)
 Complement of Event : P(A^{c}) = 1 - P(A)
 P("at least one") = 1 - P("exactly none")
 Mutually Exclusive Events [i.e. Disjoint] : P(A and B) = P(A ∩ B) = 0
 Independent Events : P(A|B) = P(A); also P(A and B) = P(A ∩ B) = P(A)·P(B)
 Conditional Events : P(A|B) = P(A and B) / P(B), often rewritten as P(A and B) = P(A ∩ B) = P(A|B)·P(B)
 Addition Rules (aka Union) ["or"]
 Full Rule : P(A or B) = P(A) + P(B) - P(A and B)
 Simplified Rule for Mutually Exclusive Events [Disjoint] : P(A or B) = P(A) + P(B)
 Multiplication Rules (aka Intersection) ["and"]
 Full Rule : P(A and B) = P(A ∩ B) = P(A)·P(B|A), which can also be rewritten as P(A and B) = P(A ∩ B) = P(B)·P(A|B)
 Simplified Rule for Independent Events : P(A and B) = P(A ∩ B) = P(A)·P(B)
 Bayes' Rule
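These rules can be verified by brute-force enumeration of a finite sample space. A sketch using the 36 equally likely outcomes of two dice (events A and B below are made-up examples):

```python
# Check the addition rule P(A or B) = P(A) + P(B) - P(A and B)
# by counting outcomes in the two-dice sample space.
from itertools import product

space = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
A = {o for o in space if o[0] == 6}            # event: first die shows 6
B = {o for o in space if sum(o) >= 10}         # event: total is 10 or more

def p(event):
    return len(event) / len(space)

lhs = p(A | B)                                 # "A or B" = union
rhs = p(A) + p(B) - p(A & B)                   # "A and B" = intersection
print(lhs, rhs)
```

Because A and B overlap here (A ∩ B has 3 outcomes), the subtraction term matters; for disjoint events it vanishes, giving the simplified rule.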
Chapter 7 Terminology : Random Variables : numeric outcome of a random phenomenon
 Random Variable, X
 Law of Large Numbers : simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases.
 Discrete (integer values only for X)
 P(X=a) : probability of the random variable being a fixed value; height of a probability histogram bar
 Continuous (any real value for X)
 P(X<a) : probability of the random variable being less than (or equal to) fixed value; area under the curve from left side to a [reading left to right]
 P(X>a) : probability of the random variable being greater than (or equal to) fixed value; area under the curve from a to right side [reading left to right]
 P(a<X<b) : probability of the random variable being between (or equal to) two fixed values; area under the curve between a and b.
 Distributions
 Probability Distributions for a Random Variable, X
 center : mean, µ_{x} : sometimes misleadingly called the Expected Value, E(X)
 spread : standard deviation, σ_{x} or variance, σ_{x}^{2} : Note: correlation for independent random variables = 0
 obtained from Collected Data (i.e. empirical data)
 using known data frequencies taken from real-world situations
 simulation using random selection from known data
 obtained from Mathematics (i.e. theoretical data)
 assumptions + Basic Mathematical Principles : Example : rolling a fair die.
 Probability Distributions for multiple Random Variables, X and Y
 linear transformation of one random variable X into another random variable Y : Y = aX+b :
 center : µ_{y} = aµ_{x}+b : the new center will shift
 spread : σ_{y}^{2} = a^{2}σ_{x}^{2} : the new spread will dilate (stretch wider or narrower); adding the constant b does not change spread
 linear combinations of two independent random variables X ± Y :
 center : µ_{x±y} = µ_{x} ± µ_{y} : the new center will add or subtract.
 spread : σ_{x±y}^{2} = σ_{x}^{2} + σ_{y}^{2} : the variances ALWAYS add (spread can only increase)
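These mean/variance rules can be verified empirically. The sketch below is illustrative only: the values a = 3, b = 10, the two normal populations, and the sample size are all arbitrary choices, not from the notes.

```python
import random
import statistics

random.seed(2)  # arbitrary seed for repeatability
a, b = 3, 10

# X ~ N(50, 5) and an independent W ~ N(20, 2), simulated with 100,000 draws
x = [random.gauss(50, 5) for _ in range(100_000)]
w = [random.gauss(20, 2) for _ in range(100_000)]

y = [a * xi + b for xi in x]            # linear transformation Y = aX + b
s = [xi + wi for xi, wi in zip(x, w)]   # linear combination S = X + W

print(statistics.mean(y))      # ≈ a*µx + b = 3*50 + 10 = 160
print(statistics.variance(y))  # ≈ a²·σx² = 9*25 = 225 (note a², and b drops out)
print(statistics.mean(s))      # ≈ µx + µw = 70
print(statistics.variance(s))  # ≈ σx² + σw² = 25 + 4 = 29 (variances add)
```

Notice the variance of Y scales by a², not b², and that the variances of independent variables add even though the standard deviations do not.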
 Types of Distributions
 Uniform : every outcome has the exact same probability (rolling a die); Parameter : p (the common probability of each outcome)
 Normal : most used for sampling (Central Limit Theorem); Parameters : center = µ; spread = σ
 Binomial : only 2 possible outcomes with fixed probability of success (i.e. success or failure, flipping coin, yes/no questions, etc.); Parameters : n and p
 Geometric : sequence of trials with 2 possible outcomes and a fixed probability of success, counting the number of trials until the first success; Parameter : p
 Chi-Square : used for categorical data (Sem 2)
 Other : Hypergeometric, Poisson, Gamma, etc. (not on AP Exam)...each is a model of various real-world scenarios
Chapter 8 : Binomial and Geometric Distributions
 Binomial Distributions : Discrete Random Variable with only 2 categories (i.e. "success" or "failure")
 n = fixed number of independent trials, must be known
 p = probability of success on any one trial, must be the same for each trial, must be known
 1p = q = probability of failure on any one trial
 P(X = k) = _{n}C_{k} p^{k} (1 − p)^{n − k}
 P(X > k) = 1 − P(X ≤ k)
 probability distribution function at X = "number of successes" :
 calculator : binompdf(number of trials, probability of success, number of successes)
 cumulative (i.e. running sum of) probability distribution function for 0 ≤ X ≤ "number of successes" = area under curve
 calculator : binomcdf(number of trials, probability of success, number of successes)
 Statistics of a Binomial Distribution
 µ = np
 σ = √[np(1−p)]
 Normal approximation to Binomial Distribution ~ BINS (binomial, independent, number of trials is fixed, success probability is known)
 N(np, √[np(1−p)]) : used only if np ≥ 10 and n(1−p) ≥ 10
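The binompdf/binomcdf calculator commands can be mirrored in a few lines. This is a hedged sketch: the function names `binompdf`/`binomcdf` are just chosen to match the calculator's, and n = 100, p = 0.5 are arbitrary example values.

```python
import math

def binompdf(n, p, k):
    """P(X = k) = nCk * p^k * (1-p)^(n-k)"""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k): running sum of the pdf from 0 through k."""
    return sum(binompdf(n, p, j) for j in range(k + 1))

n, p = 100, 0.5
print(round(binompdf(n, p, 50), 4))   # exact P(X = 50) ≈ 0.0796
print(round(binomcdf(n, p, 50), 4))   # exact P(X <= 50) ≈ 0.5398

# Normal approximation N(np, sqrt(np(1-p))) is allowed here, since
# np = 50 >= 10 and n(1-p) = 50 >= 10.
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
z = (50 - mu) / sigma
approx = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal P(Z <= z)
print(round(approx, 4))  # rough match; a continuity correction improves it
```

Comparing the exact cdf to the normal approximation shows why the np ≥ 10 conditions matter: they keep the bell curve from spilling past 0 or n.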
 Geometric Distributions : Discrete Random Variable with only 2 categories (i.e. "success" or "failure") with known number of trials before first success.
 n = number of independent trials (not fixed)... we are trying to determine the number of trials needed to get our first "success"!
 p = probability of success on any one trial, must be the same for each trial, must be known
 1p = q = probability of failure on any one trial
 P(X = n) = (1−p)^{n−1} p
 P(X > n) = 1 − P(X ≤ n)
 probability distribution function at X = "number of trials until first success" :
 calculator : geometpdf(probability of success, number of trials)
 cumulative (i.e. running sum of) probability distribution function for 1 ≤ X ≤ "number of trials" = area under curve
 calculator : geometcdf(probability of success, number of trials)
 Statistics of a Geometric Distribution
 µ = 1/p
 σ = √(1−p) / p
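A short simulation makes the geometric formulas concrete. This sketch is not from the notes: p = 0.25 and the number of simulated runs are arbitrary, and `geometpdf` is named after the calculator command it imitates.

```python
import random
import statistics

random.seed(3)  # arbitrary seed for repeatability
p = 0.25        # arbitrary example success probability

def geometpdf(p, n):
    """P(X = n) = (1-p)^(n-1) * p : first success on trial n."""
    return (1 - p) ** (n - 1) * p

def trials_until_success(p):
    """Simulate one geometric outcome: count trials up to the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

sims = [trials_until_success(p) for _ in range(100_000)]
print(round(statistics.mean(sims), 2))            # ≈ µ = 1/p = 4
print(geometpdf(p, 1))                            # 0.25, success on trial 1
print(sum(geometpdf(p, n) for n in range(1, 4)))  # P(X <= 3), like geometcdf
```

The average number of trials hovering near 1/p is exactly the µ = 1/p statistic above.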
Chapter 2 : Normal Distributions
 The Central Limit Theorem (CLT) : if the sample size is sufficiently large, the sampling distribution of the sample mean will be approximately normal.

Definitions and Formulas:
 Shape : normal distribution : bell-shaped and symmetric about the mean, with inflection points at ±1 standard deviation from the mean.
 Center : mean : arithmetic average of data points :
 Spread : standard deviation : typical size of the deviations of data points away from the mean
 Standardized score : zscore : z = (dataValue − mean) / (standard deviation) = (x − µ) / σ
 zscore as a standard point on normal curve [also the formula to calculate as well as recentering and rescaling]. Usually marked on the horizontal (i.e. xaxis) axis.
 N(0,1) is the Standard Normal curve (i.e. Zcurve)
 Calculator : normalcdf ( leftbound, rightbound, mean, standard deviation ) : area between leftbound and rightbound values on a normal probability distribution (also the percentage or probability that a data value will be between leftbound and rightbound).
 Calculator : z = invNorm ( area, mean, SD ) : the rightbound value that corresponds to the given area under the curve, starting from a leftbound at negative infinity (i.e. furthest left)
 Must also use a standard normal table or computergenerated output
 Note : zscores (like percentages) are used to compare relative positions of points within or between data sets (distributions)
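The normalcdf command can be reproduced from the standard normal CDF. This is a sketch under stated assumptions: the function names mirror the calculator's, `math.erf` supplies the CDF, and the x = 85, µ = 70, σ = 10 values are invented for illustration.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def normalcdf(lo, hi, mu=0.0, sigma=1.0):
    """Area between lo and hi, like the calculator's normalcdf."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

# z-score: how many standard deviations a data value sits from the mean.
x, mu, sigma = 85, 70, 10
z = (x - mu) / sigma
print(z)                           # 1.5

# The 68-95-99.7 rule, recovered from the N(0,1) curve:
print(round(normalcdf(-1, 1), 4))  # ≈ 0.6827
print(round(normalcdf(-2, 2), 4))  # ≈ 0.9545
```

Because z-scores put every normal curve on the same N(0,1) scale, one table (or one CDF function) covers them all.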
 normal curve of a population
 Parameters for Quantitative Variable (i.e. summary numbers of population):
 µ : population mean [also the formula to calculate] is the center
 σ : population standard deviation, [also the formula to calculate] is the spread
 Parameters for Categorical Variable (i.e. summary numbers of population):
 p : population proportion [also the formula to calculate] is the center
 Note : there is no population standard deviation (spread) for categorical data

normal distribution (i.e. probability density curves) from theory
 total area under probability density curve = 1
 visually determine : median (equal area point), mean (balance point)
 skewed : mean further toward tail than median
 N(µ, σ) : Normal (bell) curve [i.e. Normal Probability Density Curve] with center at µ and standard deviation at σ
 68-95-99.7 rule
 Estimate normality from plot of histogram, stemplot and/or boxplot

normal pointestimators (or statistical) from sample data
 Statistics for Quantitative Variable
 x̄ ≈ µ : sample mean as an estimate for population mean [also the formula to calculate]
 s_{x} ≈ σ_{x} : sample standard deviation as an estimate for population standard deviation [also the formula to calculate]
 Statistics for Categorical Variable
 p̂ ≈ p : sample proportion as an estimate for population proportion [also the formula to calculate]
 s_{p̂}: sample standard deviation for a normal sampling distribution [also the formula to calculate]
 NOTES :
 Unbiased point estimator = population parameter
 To avoid biased estimators, a sample must be taken from the population only if it is :
 random
 independent (n < 10% * N)
 normal : (n ≥ 30 for quantitative; np ≥ 10 and n(1−p) ≥ 10 for categorical)...later, other distributions have different requirements
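The CLT conditions above can be seen in action. This closing sketch is illustrative only: the skewed exponential population, the sample size n = 40, and the number of repeated samples are arbitrary choices, not from the notes.

```python
import random
import statistics

random.seed(4)  # arbitrary seed for repeatability

# Population: exponential with mean 1 -- strongly right-skewed, not normal.
def sample_mean(n):
    """Mean of one random sample of size n from the skewed population."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Repeat the sampling many times; n = 40 satisfies the n >= 30 condition.
means = [sample_mean(40) for _ in range(5_000)]

print(round(statistics.mean(means), 3))   # ≈ 1.0: x̄ is unbiased for µ
print(round(statistics.stdev(means), 3))  # ≈ σ/√n = 1/√40 ≈ 0.158
```

Even though each individual sample comes from a skewed population, the pile of sample means is roughly bell-shaped and centered at µ, with spread shrinking like σ/√n, which is the whole point of the CLT.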