What is a probability distribution?

For a given variable (e.g., house prices), a probability distribution gives the relative frequency with which the variable falls into each range of its values (i.e., classes, groups, intervals).

Discrete or continuous?

  • Discrete: outcomes are distinct, countable events (integers)

  • Continuous: it is always possible to find a value between any two outcomes (real numbers)

Examples of distributions

  1. The Bernoulli distribution is used in situations where an uncertain parameter can take on one of only two possible values.
  2. The binomial distribution is used for the number of successes in repeated trials when each trial is independently sampled (with replacement).
  3. The hypergeometric distribution is used for the number of successes in repeated trials when each trial depends on the trials before it (without replacement).
  4. The Poisson distribution is used for the number of events that occur in a unit of time.
  5. The uniform distribution describes an outcome that is equally likely to fall anywhere between a minimum and a maximum value.
  6. The triangular distribution is a more flexible family of continuous distributions, specified by three parameters: the minimum, maximum, and most likely values.
  7. The normal distribution is a symmetric distribution, usually specified by its mean and standard deviation.
  8. The exponential distribution describes the frequency of times elapsed between random Poisson occurrences.

Imagine this

You are considering founding an equity fund to track risky investments in privately owned organizations

  • You have a history of stock prices
  • You know that today’s stock price is the present value of stockholders’ free cashflow for the foreseeable and not-so-foreseeable future
  • You need a back of the envelope way to describe how stock prices and value evolve
  • You also want to use this evolution to understand how a claim contingent on stock prices also evolves
  • And the present value of such a claim

What do you want to know?

  • Is there a distribution you can use to describe the evolution and also price a claim on the stock prices?

Why?

  • So you can acquire or divest an asset, a project, or anything that generates cash flows over time and has the same risk profile as the stock price.

What’s a binomial?

Suppose we know that the current stock price of Ross Stores (NASDAQ: ROST) \(S\) is $109 per share. We might consider a simple forecast of the uncertain level of the asset’s value in one day, or even several days into the future: it might go up or it might go down. That’s a binomial.

We picture the binomial view of an anticipated stock price \(S\) as branches from a root of \(S_0=\$109\) per share today at time \(t=0\). How can we forecast asset value at the end of one day \(S_1\)? We can do so by supposing that the stock price might simply rise or fall. Both up \(u\) and down \(d\) stock price outcomes, \(S_{1,u}\) and \(S_{1,d}\), might occur in \(t=1\) day.

We might also want to represent our optimism or pessimism about how often up moves and down moves might occur. Let’s assume just for a minute that we are optimistic about the future so that the probability of any up move in this stock is \(p = 0.60\). If the probability of an up move is 0.60, then the probability of a down move must be \(1 - p = 1 - 0.60 = 0.40\).



We can hover over the diagram below and locate the initial stock price and the up and down possibilities as well as probabilities of an up or down movement.

Now we can suppose that the stock price grows or declines by an amount. That amount can be learned from the sample standard deviation \(\sigma\) of the rate of return of the stock price. Rates of return are just growth (or decline) rates. The one-day-ahead stock price will then go up to

\[ S_{1,u} = S_0 + S_0 \sigma = S_0(1+\sigma) \approx S_0 e^{\sigma} \]

where we use the approximation \((1+\sigma) \approx e^{\sigma}\). We must keep in mind that this is the one-day standard deviation.

Let’s get this a little more down to earth. We find that over the past 251 trading days the daily standard deviation of the stock return is 0.0165, or 1.65% per day. If we multiply this by 251 days and exponentiate, $1 invested today would grow into \(e^{0.0165 \times 251} = 62.8970964\) dollars. This means that after 251 up jumps from today we would get this amount. Intuitively we might feel this is not very likely! Anyway, after just one up jump (a stock price move for one day) and if today’s stock price is $109 per share, then

\[ S_{1,u} = S_0 + S_0 \sigma = S_0(1+\sigma) \approx S_0 e^{\sigma} = 109 e^{0.0165} = 110.8134196 \]

What goes up might just as well go down. Stock returns have a positive \(\sigma = 0.0165\), but since \(\sigma = var^{1/2}\) and a square root has a negative as well as a positive root, a negative or down turn in the return of the same size is also possible. Now the factor is \(e^{-0.0165}=0.9836354\), a discount to the current stock price of $109 per share.

\[ S_{1,d} = S_0 - S_0 \sigma = S_0(1-\sigma) \approx S_0 e^{-\sigma} = 109 e^{-0.0165} = 107.2162564 \]

If the stock could jump up 251 times it might decline that many times too. In that scenario, the stock price would move from $109 to what level?
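The up and down arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the inputs (a $109 price, a 0.0165 daily standard deviation, 251 trading days) are the ones quoted in the text.

```python
import math

s0 = 109.0       # today's price per share, from the text
sigma = 0.0165   # one-day standard deviation of returns, from the text
days = 251       # trading days in the sample

# One jump up or down scales the price by e^sigma or e^(-sigma)
s1_up = s0 * math.exp(sigma)
s1_down = s0 * math.exp(-sigma)

# 251 consecutive jumps in one direction compound the daily factor
growth_251_up = math.exp(sigma * days)        # value of $1 after 251 up jumps
s_251_down = s0 * math.exp(-sigma * days)     # price after 251 down jumps

print(round(s1_up, 4), round(s1_down, 4))
print(round(growth_251_up, 4), round(s_251_down, 2))
```

Under these assumptions, 251 down jumps in a row would take the $109 stock to about $1.73 per share, which is just as unlikely as the 251 up jumps.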



What would be the expected value of the stock price in one day? The answer is just a weighted average of the up and down stock price outcomes, with the probabilities of up and down as the weights. The stock price in one day is a random variable.
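As a worked sketch, using the optimistic \(p = 0.60\) and the one-day up and down prices computed earlier, the weighted average is

\[ E(S_1) = p \, S_{1,u} + (1-p) \, S_{1,d} = 0.60 \times 110.8134 + 0.40 \times 107.2163 \approx 109.37 \]

The expected price sits a little above today’s $109 because we assumed that up moves are more likely than down moves.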



Then there were two

Yes, two draws of stock prices, one after the other, one conditional on the other. Let’s look at this up and down tree.

Some description is in order. At the top is right now, day 0. The second row is day 1. The third row is day 2. If the stock goes up from day 0 to day 1, then the value is \(S(1,u)\). Similarly if the stock goes down. After the stock goes up at day 1, it can either go up to \(S(2,uu)\) (that is, jump up twice) or go down to \(S(2,ud)\) (that is, after jumping up it then jumps down). The same reasoning applies when the stock goes down to \(S(1,d)\).

How many paths are there to get to the third row where we have forecasted the 2nd day’s stock price?
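One way to count the paths is simply to list them. The sketch below labels each move "u" or "d" (labels chosen here for illustration) and enumerates every ordered pair of moves.

```python
from itertools import product

# Every two-day path is an ordered pair of moves: "u" (up) or "d" (down)
paths = ["".join(moves) for moves in product("ud", repeat=2)]

# Paths with the same number of up moves end at the same price node
nodes = {path.count("u") for path in paths}

print(paths)        # ['uu', 'ud', 'du', 'dd']
print(len(nodes))   # 3
```

There are four paths but only three distinct day 2 prices, because "ud" and "du" land on the same node.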



How often?

We can compute outcomes all day. But how often does a node occur? We remember that our task is to forecast stock prices in the future. To do that we realize that stock prices occur in a probable range. That means they are random variables, where the adjective random has the notion of an indiscriminate sampling of prices. But a random variable is not so colloquially indiscriminate as to not have a notion of a frequency of occurrence. The relative frequency, as we continue to see, is what we measure to be probability. Allowing probability into our lives also admits our beliefs into the analysis.

So what is the probability of a stock price after one up and one down jump?



How would we calculate the \(S(2,du=ud)\) outcome?



The \(S(2, du=ud)\) outcome occurs 48% of the time given our optimism in this market.
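The 48% figure can be reproduced by adding up the probabilities of the paths that reach each node. This is a minimal sketch that reuses the values from the text (\(S_0 = 109\), \(\sigma = 0.0165\), \(p = 0.60\)); the dictionary names are illustrative.

```python
import math
from itertools import product

s0, sigma, p = 109.0, 0.0165, 0.60   # values used in the text

node_prob = {}    # number of up moves -> probability of reaching that node
node_price = {}   # number of up moves -> stock price at that node
for moves in product("ud", repeat=2):
    ups = moves.count("u")
    downs = 2 - ups
    path_prob = p**ups * (1 - p)**downs          # probability of this one path
    node_prob[ups] = node_prob.get(ups, 0.0) + path_prob
    node_price[ups] = s0 * math.exp(sigma * (ups - downs))

for ups in sorted(node_prob, reverse=True):
    print(ups, round(node_price[ups], 4), round(node_prob[ups], 2))
```

Each of the two middle paths has probability \(0.60 \times 0.40 = 0.24\), so the middle node has probability 0.48, and its price is back at $109 because one up jump and one down jump cancel.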

Being binomial

Consumer goods ‘r’ us

From a Consumer Food database.

  1. What proportion of the database households are in the Metro area? Use this as the value of \(p\) in a binomial distribution.
  2. If you were to randomly select 25 of these households, what is the probability that exactly 8 would be in the Metro area?
  3. If you were to randomly select 12 of these households, what is the probability that 3 or fewer would be households in the Metro area?

Just like the probability of an up swing in stock prices, the proportion of Metro area households is the probability of that up or success (yes we found Metro households!) branch of the binomial tree. We can then go on to answer questions 2 and 3. We will need to know how many paths it can take to get to an outcome, and the probability of a single path as well. Armed with this knowledge we can answer questions just like these.

What’s a binomial?

Let’s get more precise.

  • Two possible event outcomes only in each run or scenario of a binomial process
  • E.g., default/not-default, reject/accept, comply/not-comply events
  • Let \(x\) = comply, then \(P(x)\) = probability of compliance and \(1 - P(x)\) = probability of non-compliance.

We use three assumptions:

  1. Each replication of the process is a combination of events that results in one of two possible outcomes (usually referred to as “success” or “failure” events).
  2. The probability of success is the same for each replication.
  3. The replications are independent, meaning here that a success in one replication does not influence the probability of success in another.

What’s a combination?

Start with a set of choices or categories. Combinations are the complete set of different ways you can select subsets of those choices or categories, without regard to order.

For a simple example

  • Start with the set of \(A= \{1,2\}\), where “1” is “comply” and “2” is “don’t comply.”
  • You can form four subsets of this set: {}, {1}, {2}, and {1,2} (don’t forget the null or “do nothing” subset). These subsets are the combinations of set A.
  • Order does not matter so that \(\{1,2\}\) is the same as \(\{2,1\}\)
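The four subsets can be listed mechanically. The sketch below uses Python's `itertools.combinations`, which ignores order in the same way.

```python
from itertools import combinations

A = [1, 2]   # 1 = "comply", 2 = "don't comply"

# Collect the subsets of every size, from the empty set up to A itself
subsets = [set(c) for size in range(len(A) + 1) for c in combinations(A, size)]

print(subsets)   # [set(), {1}, {2}, {1, 2}]
```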

We have to start with permutation first

A permutation does care about the order of the elements in a subset, much like the order of letters in a word. We will find that combinations are found from permutations.

Start with \(A = \{a,b,c,d,e\}\) letters in a text to your friend.

  • With 5 letters to choose from we can select the first letter in 5 ways.
  • We now have 4 letters left, so the second letter can be chosen in 4 ways,
  • Then the third letter in 3 ways
  • The fourth letter in 2 ways
  • The fifth letter in 1 way, for a total of \[ 5! = 5 \times 4 \times 3 \times 2 \times 1 = 120 \] ways to build a word of text (granted some of these might be code).

Find the number of 3 letter words you can form from a list of 5 letters.

Using the same logic as above this is \[ 5 \times 4 \times 3 = 60 \] There are 120 possible words you can form from 5 letters. There are 60 possible 3 letter sequences. This is called a permutation. In symbols we have 5 permute 3 or \(_{5}P_{3}\). We notice that \(5! = 5 \times 4 \times 3 \times 2 \times 1 = 5 \times 4 \times 3 \times 2!\). So all we need to do to get the 60 permutations is take 5! and divide by 2! to get

\[ _{5}P_{3} = \frac{5!}{2!} = 60 \]

Generally we have for \(n\) elements permuted \(x\) at a time:

\[ _{n}P_{x} = \frac{n!}{(n-x)!} \]
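As a quick check of these counts, the sketch below compares the formula with a brute-force listing of every ordered 3-letter sequence. It uses `math.factorial`, `math.perm`, and `itertools.permutations` from the Python standard library.

```python
import math
from itertools import permutations

letters = "abcde"

all_orderings = math.factorial(len(letters))   # 5! = 120
three_letter = math.perm(len(letters), 3)      # 5! / (5 - 3)! = 60

# Cross-check the formula by listing every ordered 3-letter sequence
listed = list(permutations(letters, 3))

print(all_orderings, three_letter, len(listed))   # 120 60 60
```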

What about combinations?

Now we don’t care about the order of the letters. We seem to also know that

  • Permutations track with the order of elements, here, letters: {a,d,c} \(\neq\) {d,c,a}
  • Combinations don’t worry about the order of elements: {a,d,c} = {d,c,a}
  • Thus, of the 60 permutations we just counted, we keep only one in every \(3 \times 2 \times 1 = 3! = 6\) (1/6th), which leaves 10 combinations. There are 6 times more permutations than combinations.

Combinations from permutations

For our 5 letters taken 3 at a time without regard to the order of letters, we have

\[ _{5}C_{3} = {5 \choose 3} = \frac{5!}{3! \, (5 - 3)!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1)(2 \times 1)} = 10 \]

More generally, we have

\[ _{n}C_{x} = {n \choose x} = \frac{n!}{x! \, (n - x)!} \]

and

\[ _{n}P_{x} = {}_{n}C_{x} \, x! = {n \choose x} x! = \frac{n!}{(n - x)!} \]
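The link between the two counts can be verified by brute force. This sketch lists the ordered and unordered 3-letter selections and checks that they differ by a factor of \(3!\).

```python
import math
from itertools import combinations, permutations

letters = "abcde"

n_perms = len(list(permutations(letters, 3)))   # order matters
n_combs = len(list(combinations(letters, 3)))   # order does not matter

# Each combination of 3 letters can be ordered in 3! = 6 ways
assert n_perms == n_combs * math.factorial(3)
assert n_combs == math.comb(5, 3)

print(n_perms, n_combs)   # 60 10
```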

Try these: combinations and permutations

A risk management practice has 8 members. Teams of three are formed to work with client project teams and provide management, subject matter expertise and budget and scheduling control. The practice leader wants to know

  1. How many different project teams (“teams”) can be formed from the practice?
  2. How many teams of 3 members can be formed from the practice if each team is to have a team manager, a project controller, and a subject matter expert?
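One way to approach both questions, sketched under the reading that question 1 ignores roles and question 2 assigns three distinct roles:

```python
import math

members, team_size = 8, 3

# Question 1: roles are not distinguished, so order does not matter
teams = math.comb(members, team_size)

# Question 2: manager, controller, and expert are distinct roles, so order matters
teams_with_roles = math.perm(members, team_size)

print(teams, teams_with_roles)   # 56 336
```

The second count is \(3! = 6\) times the first, since each team of three can fill the three roles in 6 ways.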


So how do we use the binomial process?

Follow these steps:

  1. How many times did we repeat the process (number of observations)? \(n\).

  2. How many successes (events we are tracking)? \(x\)

  3. How often does a success (the event we are tracking) occur in a single replication? \(p\)

  4. How many \(x\) successes (events) in \(n\) replications (trials, observations)? The number of possible combinations of \(x\) successes in \(n\) replications is \[ _{n}C_{x} = {n \choose x} = \frac{n!}{x! \, (n - x)!} \] The \(x\) successes can occur anywhere among the \(n\) trials (observations). There are \(_{n}C_{x}\) different ways of distributing \(x\) successes in a sequence of \(n\) trials (observations).

  5. What is the probability of a single scenario (combination = 1) of \(x\) successes? \(p^x\)

  6. What is the probability of a single scenario of \(n-x\) failures? \((1-p)^{n-x}\)

  7. What is the probability of all combinations of \(x\) successes in \(n\) trials? \[ P(X = x \mid n, p) = {n \choose x}p^x (1-p)^{n-x} \] In Excel we can use the \(BINOM.DIST(x,n,p,FALSE)\) formula, where FALSE indicates that we calculate the probability mass function value (the relative frequency) and TRUE indicates that we calculate the cumulative probability value (the cumulative relative frequency).
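The steps above can be sketched in Python as a pair of small functions. The names `binom_pmf` and `binom_cdf` are hypothetical helpers written for this note; they are meant to play the roles of the Excel formula with FALSE and TRUE respectively.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x): number of combinations times the probability of one path."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x): the sum of the pmf from 0 successes up to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# The two-day stock tree: one up move in two days with p = 0.60
print(round(binom_pmf(1, 2, 0.60), 2))   # 0.48
```

The example call reproduces the 48% probability of the middle node of the two-day stock price tree.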

Try this: binomial distribution

An environmental control specialist picks a sample of 10 sensors from a large shipment of sensors. Experience has shown that 1 in 5 sensors fail to work when installed. The specialist is scheduling her time for the week and wants to know

  1. What is the probability that she will pick exactly 2 of the defective sensors?
  2. What is the probability that she will pick no more than 2 of the defective sensors?
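A minimal sketch of the calculation, treating a defective sensor as the "success" event with \(p = 0.2\) and \(n = 10\). The helper `binom_pmf` is a hypothetical name for the binomial formula.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.2   # 10 sensors sampled; 1 in 5 fails

exactly_two = binom_pmf(2, n, p)                          # question 1
at_most_two = sum(binom_pmf(k, n, p) for k in range(3))   # question 2

print(round(exactly_two, 4), round(at_most_two, 4))
```

Under these assumptions the two answers come to about 30.2% and 67.8%.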


Try another

Using the same sensor problem, what is the probability that she will pick 8 or more defective sensors? What if she picks more than 8 defective sensors?

  • Build a simple spreadsheet table to answer these questions.
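For readers without a spreadsheet at hand, the same table can be sketched in Python. The three columns are \(x\), the probability of exactly \(x\) defective sensors, and the cumulative probability, which mirror the FALSE and TRUE versions of the Excel formula.

```python
import math

n, p = 10, 0.2   # the sensor problem

pmf = [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Table columns: x, P(X = x), P(X <= x)
cumulative = 0.0
for x, prob in enumerate(pmf):
    cumulative += prob
    print(f"{x:2d}  {prob:.7f}  {cumulative:.7f}")

eight_or_more = sum(pmf[8:])     # P(X >= 8)
more_than_eight = sum(pmf[9:])   # P(X > 8)
print(eight_or_more, more_than_eight)
```

Both tail probabilities are tiny: roughly 7.8 in 100,000 for 8 or more, and roughly 4.2 in a million for more than 8.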


What does a graph of the binomial look like?
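A rough text-only picture can be drawn with a few lines of Python. This sketch assumes the sensor example (\(n = 10\), \(p = 0.2\)) and prints one "#" per percentage point of probability.

```python
import math

n, p = 10, 0.2   # the sensor example: 10 sensors, 1 in 5 defective

pmf = [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    bar = "#" * round(prob * 100)   # one "#" per percentage point
    print(f"{x:2d}  {prob:.4f}  {bar}")
```

The bars peak at \(x = 2\) and trail off to the right, the skewed shape we get from a binomial with a small \(p\).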

Binomial statistics

Here we need just the mean and standard deviation.

  • Mean of a binomial distribution \[ \mu = n p \]
  • Standard deviation of a binomial distribution \[ \sigma = (n p (1-p))^{1/2} \]

Try this

Use the data from the last example above. What are the mean and standard deviation of the sensor problem?
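As a worked sketch, with \(n = 10\) sensors and \(p = 0.2\) from the sensor problem:

\[ \mu = n p = 10 \times 0.2 = 2 \]

\[ \sigma = (n p (1-p))^{1/2} = (10 \times 0.2 \times 0.8)^{1/2} = 1.6^{1/2} \approx 1.26 \]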



Short exercises

  1. Hospital records show that of patients suffering from a certain rare disease, 75% die of it. What is the probability that of 6 randomly selected patients, 4 will recover?

Answer: 3.3%

  2. For the patients suffering from this disease, what is the probability that all 6 randomly selected patients will recover?

Answer: very, very small

  3. You are trying to reach your surveying team in the iron-rich creases of the Mesabi mountain range. There is a probability of 80% of success in any attempt to connect by cell phone. What is the probability of having 7 successes in 10 attempts?

Answer: a little over 20%

  4. A manufacturer of metal pistons finds that, on the average, 12% of the pistons are rejected by customers because they are either oversize or undersize. What is the probability that a batch of 10 pistons will contain no more than 2 rejects?

Answer: 89%
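The four answers can be checked with the binomial formula. In this sketch `binom_pmf` is a hypothetical helper, and in the first two exercises recovery is the "success" event, so \(p = 1 - 0.75 = 0.25\).

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

four_recover = binom_pmf(4, 6, 0.25)    # 4 of 6 patients recover
all_recover = binom_pmf(6, 6, 0.25)     # all 6 patients recover
seven_calls = binom_pmf(7, 10, 0.80)    # 7 connections in 10 attempts
few_rejects = sum(binom_pmf(k, 10, 0.12) for k in range(3))   # at most 2 rejects

for value in (four_recover, all_recover, seven_calls, few_rejects):
    print(f"{value:.4%}")
```

The results agree with the stated answers: about 3.3%, about 0.02%, about 20.1%, and about 89%.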

Binomial barrels

You own a small refinery. Inputs are barrels of crude products. Outputs are barrels of refined products. Your corporate policy is to allow shipments where no more than 3 sampled barrels of unacceptable quality (non-compliant) are discovered in a sample of 10 barrels. However, because of the nature of the production process and the quality of inputs, bad barrels occur 20% of the time.

Use this Excel workbook (binomial.xlsx) to perform these tasks and answer the questions below:

  1. Sample 30 barrels from the population.
  2. Set up the spreadsheet model for probabilities (p) of a non-compliant barrel equal to 20%, 50%, and 80%.
  3. What is the probability that 5 or fewer non-compliant barrels will be found?
  4. What is the probability that 15 or more non-compliant barrels will be found?
  5. How probable is it that between 5 and 15 non-compliant barrels will be found, exclusive of 5 and 15?
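The workbook itself is not reproduced here, but the same model can be sketched in Python as a cross-check. The sketch assumes a sample of \(n = 30\) barrels and counts non-compliant barrels as the "success" event; `binom_pmf` and `results` are hypothetical names.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n = 30         # barrels sampled (task 1)
results = {}   # p -> (P(X <= 5), P(X >= 15), P(5 < X < 15))
for p in (0.20, 0.50, 0.80):          # task 2
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
    at_most_5 = sum(pmf[:6])          # task 3: 0 through 5
    at_least_15 = sum(pmf[15:])       # task 4: 15 through 30
    between = sum(pmf[6:15])          # task 5: 6 through 14
    results[p] = (at_most_5, at_least_15, between)
    print(f"p={p:.2f}  {at_most_5:.4f}  {at_least_15:.4f}  {between:.4f}")
```

The three ranges do not overlap and together cover every count from 0 to 30, so the three probabilities in each row should sum to 1, which is a handy check on the spreadsheet as well.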