#StatisticalAnalysis #Year12 #Advanced
>[!info]- [Random variables | NSW Curriculum Website](https://curriculum.nsw.edu.au/learning-areas/mathematics/mathematics-advanced-11-12-2024/content/n12/fa4579cc54)
>- MAV-12-07 solves problems involving discrete probability distributions, continuous random variables and the normal distribution
## Prior Knowledge
| Content | Prior knowledge | Used for |
| --------------------------------------------- | -------------------------------------------------- | -------------------------------------------------------- |
| [[Data Classification and Visualisation]] | - frequency tables and histograms | - probability distributions |
| [[Stage 4/Data Analysis\|Data Analysis]] | - mean<br>- summary statistics, shape of a dataset | - expected value<br>- identifying normal distributions |
| [[Probability and Data\|Probability]] | - relative frequency | - estimating probabilities of a discrete random variable |
| [[Data Analysis A]] | - standard deviation | - variance and standard deviation<br>- empirical rule |
## Discrete random variables
- Use the relative frequencies of discrete random variable datasets to estimate the probabilities that the random variable $X$ takes each of its values, and explain why these estimated probabilities add to 1
- Denote the probability that the discrete random variable $X$ takes the value $x$ by $P(X=x)$, or by $P(x)$ if the discrete random variable $X$ is understood
- Define a discrete probability distribution to be the set of values $x$ taken by a discrete random variable $X$, together with the probabilities $P(x)$ that $x$ is the outcome of the experiment
- Define a discrete random variable to be uniformly distributed if it has finitely many values, all with the same probability, and use it to model random phenomena with equally likely outcomes
- Recognise the mean, or expected value, $\mu$ or $E(X)$, of a discrete random variable $X$ as a measure of centre for its distribution
- Generate and use the formulas $E(X)=\mu=\Sigma xP(x)$ and $Var(X)=\sigma^2=\Sigma(x-\mu)^2P(x)=\Sigma x^2P(x)-\mu^2$ for the random variable $X$, where $Var(X)$ is the variance and $\sigma$ is the standard deviation of the distribution
- Recognise that the variance is an expected value because $Var(X)=E((X-\mu)^2)$
- Generate a probability distribution for a given discrete random variable and represent the probability distribution in graphical and tabular form
- Solve problems involving probabilities, expectation and variance of discrete random variables
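
The formulas above for $E(X)$ and $Var(X)$ can be checked with a short Python sketch. The fair six-sided die is a hypothetical example of a discrete uniform distribution, and the helper names `expected_value` and `variance` are made up for illustration. Exact fractions are used so the results match hand working.

```python
from fractions import Fraction


def expected_value(dist):
    """E(X) = sum of x * P(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = sum of x^2 * P(x), minus mu^2."""
    mu = expected_value(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2


# Hypothetical example: a fair die, so every value has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

total = sum(die.values())   # the probabilities add to 1
mu = expected_value(die)    # 7/2
var = variance(die)         # 35/12

# Definition form: sum of (x - mu)^2 * P(x); it should agree with `var`.
var_definition = sum((x - mu) ** 2 * p for x, p in die.items())

print(total, mu, var, var_definition)
```

Both variance formulas give $\frac{35}{12}$ here, which illustrates why $\Sigma(x-\mu)^2P(x)$ and $\Sigma x^2P(x)-\mu^2$ can be used interchangeably.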
## Continuous random variables
- Estimate the probability that a continuous random variable falls in some interval using relative frequencies and histograms obtained from data
- Recognise that the probability of a particular value for a continuous random variable is 0 and hence that $P(a<X<b)=P(a\leq X<b)=P(a<X\leq b)=P(a\leq X\leq b)$ since $P(X=a)=P(X=b)=0$ when $X$ is a continuous random variable
- Define the cumulative distribution function (CDF), $F(x)$, as the probability of a random variable, $X$, having values less than or equal to $x$, so $F(x)=P(X\leq x)$ and $P(a\leq X\leq b)=F(b)-F(a)$
- Recognise that the cumulative distribution function, $F(x)$, is non-decreasing for all $x$ in its domain, and graph cumulative distribution functions, given a formula for $F(x)$, with and without graphing applications
- Define a probability density function (PDF), $f(x)$, for a random variable $X$ with cumulative distribution function $F(x)$ as $f(x)=F'(x)$ and recognise that $P(a<X<b)=\int_{a}^{b}{f(x)}dx$
- Recognise the properties of a probability density function: $f(x)\geq0$ for all $x$ in the domain of $f(x)$, and $\int_{a}^{b}f(x)dx=1$ if the domain of $f(x)$ is $[a,b]$, or $\int_{-\infty}^{\infty}f(x)dx=1$ if the domain of $f(x)$ is all real $x$
- Apply the properties of a probability density function to solve problems and justify conclusions
- Find the mode from a given probability density function
- Obtain the cumulative distribution function using the formula $F(x)=\int_{a}^{x}f(t)dt$ where $f(x)$ is a given probability density function defined on the interval $[a,b]$
- Determine and use the probability density function $f(x)=\frac{1}{b-a}$ for a continuous uniform distribution for a random variable $X$ taking values in the interval $[a,b]$
- Use a cumulative distribution function to calculate the median and quartiles for a continuous random variable
- Find the probability density function from a given cumulative distribution function
- Generate the expression for the expected value of a continuous random variable, $E(X)=\mu=\int_{a}^{b}xf(x)dx$, where the probability density function $f(x)$ is defined on the interval $[a,b]$
- Generate the expression for the variance of a continuous random variable, $Var(X)=E(X^2)-\mu^2=\int_{a}^{b}{x^2f(x)dx}-\mu^2$, where the probability density function $f(x)$ is defined on the interval $[a,b]$
- Evaluate the expected value and the variance of a continuous random variable, where the probability density function $f(x)$ is defined on the interval $[a,b]$, that involve integration of functions within the scope of the Mathematics Advanced course
- Evaluate the expected value and the variance of a continuous random variable, where the probability density function $f(x)$ is defined on the interval $[a,b]$, that involve integration of functions beyond the scope of the Mathematics Advanced course using an online computational application
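
As a rough illustration of the PDF and CDF properties listed above, the sketch below uses a hypothetical density $f(x)=2x$ on $[0,1]$, chosen only because its integrals are easy to check by hand. The midpoint-rule helper `integrate` and the bisection helper `quantile` are assumptions for illustration, not part of the syllabus. A graphing or computational application would serve the same purpose.

```python
import math


def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Hypothetical PDF for illustration: f(x) = 2x on the interval [0, 1].
a, b = 0.0, 1.0


def f(x):
    return 2 * x


area = integrate(f, a, b)                                  # total area, close to 1
mean = integrate(lambda x: x * f(x), a, b)                 # E(X), exact value 2/3
var = integrate(lambda x: x * x * f(x), a, b) - mean ** 2  # exact value 1/18


def F(x):
    """CDF as the integral of f(t) from a to x; for this PDF, F(x) = x^2."""
    return integrate(f, a, x, n=2_000)


def quantile(p, tol=1e-9):
    """Solve F(x) = p by bisection, relying on F being non-decreasing."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


median = quantile(0.5)        # exact value sqrt(0.5)
prob = F(0.5) - F(0.25)       # P(0.25 < X < 0.5) = F(0.5) - F(0.25)

print(area, mean, var, median, prob)
```

Setting $F(x)=0.25$ or $F(x)=0.75$ in `quantile` gives the lower and upper quartiles in the same way.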
## The normal distribution
- Identify the normal distribution as a continuous probability distribution that is used to model many naturally occurring phenomena
- Identify the graph of the probability density function of a normal distribution, the normal curve, as an 'ideal' bell-shaped curve, symmetrical about its mean which is equal to its mode and median, and as having most values concentrated about the mean
- Identify contexts that can be approximately modelled by a normal random variable
- Use the notation $X\sim N(\mu,\sigma^2)$ to represent a normally distributed random variable that has mean $\mu$ and standard deviation $\sigma$ (so the second parameter, $\sigma^2$, is the variance)
- Represent probabilities associated with the normal distribution by areas of shaded regions under the normal curve, which may extend to $\pm\infty$
- Apply the empirical rule to make judgements and solve problems involving probabilities of normally distributed data: that for normal distributions, approximately 68% of data lie within one standard deviation of the mean, approximately 95% within two standard deviations of the mean and approximately 99.7% within three standard deviations of the mean
- Use graphing applications to explore the normal distribution, graph the probability density function $f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, verify the empirical rule and graph the cumulative distribution function
- Recognise features of the normal curve, and identify the global maximum and points of inflection
- Distinguish between a standard normal distribution with mean 0 and standard deviation 1, and the non-standard normal distribution with mean $\mu$ and standard deviation $\sigma$
- Define the $z$-score, or standardised score, by the formula $z=\frac{x-\mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ is the standard deviation, and $x$ is an observed value of a random variable
- Interpret the $z$-score as the number of standard deviations a score lies above or below the mean
- Use $z$-scores to compare scores from different sets of data and justify conclusions
- Use $z$-scores to identify probabilities of events less or more extreme than a given outcome and solve problems using tables for the standard normal distribution
- Solve problems involving finding the mean or standard deviation of a normal random variable given the probability of an event less or more extreme than a given outcome
- Use $z$-scores to make judgements related to probabilities of certain events or given sets of data assuming an underlying normal distribution
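
The empirical rule and the $z$-score calculations above can be explored with Python's standard-library `statistics.NormalDist` in place of a table for the standard normal distribution. The scores below (mean 70, standard deviation 8, observed value 82) are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution: mean 0, standard deviation 1


def within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return Z.cdf(k) - Z.cdf(-k)


# Empirical rule check: approximately 68%, 95% and 99.7%.
empirical = [within(k) for k in (1, 2, 3)]

# Hypothetical data: X ~ N(70, 8^2) and an observed score x = 82.
mu, sigma, x = 70, 8, 82
z = (x - mu) / sigma     # z-score: number of standard deviations above the mean
p_below = Z.cdf(z)       # P(X < 82), an event less extreme than the outcome
p_above = 1 - p_below    # P(X > 82), an event more extreme than the outcome

# Reverse problem: given mu, x and P(X < x), recover the standard deviation.
z_from_p = Z.inv_cdf(p_below)
sigma_recovered = (x - mu) / z_from_p

print(empirical, z, p_below, p_above, sigma_recovered)
```

The last two lines sketch the "find the mean or standard deviation" style of problem: the known probability is converted to a $z$-score, and $z=\frac{x-\mu}{\sigma}$ is then rearranged for the unknown.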