Increasing probability of success

2020-01-08 R

Problem

You run a series of trials. Trials are independent of each other and any other results.
The probability of success on a trial starts at 2% and increases additively by 3 percentage points after each failure.

The probability of success resets to 2% after a success, but this is irrelevant here because you only run trials until you obtain a single success. If you are interested in multiple successes, simply repeat the entire experiment.

Probabilities

Let \(p_i\) denote the probability of success on trial \(i\). Then we can write

\[p_i=0.03i-0.01,\ i=1\dots33\]
and \(p_{34}=1\), since the formula would give 1.01 on trial 34 and a probability cannot exceed 1.

Let \(X\) denote the number of trials until the first success. Then \(P(X=1)=p_1=0.02\), \(P(X=2)=p_2(1-p_1)=0.05(0.98)\), \(P(X=3)=p_3(1-p_1)(1-p_2)=0.08(0.98)(0.95)\), and so on.

In general,

\[P(X=x)=p_x\prod_{i=1}^{x-1}(1-p_i)=(0.03x-0.01)\prod_{i=1}^{x-1}(1.01-0.03i), \ x=1\dots33\]
and, because \(p_{34}=1\), \(P(X=34)=\prod_{i=1}^{33}(1.01-0.03i)\).
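As a quick sanity check, the general formula can be evaluated directly, one value of \(x\) at a time. This is only a sketch: the helper names `p_success` and `pmf_first_success` are made up for illustration and are not part of the original analysis.

```r
# Success probability on trial i, capped at 1 (the cap is reached on trial 34)
p_success <- function(i) pmin(0.03 * i - 0.01, 1)

# P(X = x): fail on trials 1..(x - 1), then succeed on trial x.
# For x = 1 the product is empty and prod() returns 1.
pmf_first_success <- function(x) {
  p_success(x) * prod(1 - p_success(seq_len(x - 1)))
}

px_direct <- sapply(1:34, pmf_first_success)

round(px_direct[1:3], 5)  # 0.02, 0.049, 0.07448, matching the hand calculations
sum(px_direct)            # a valid distribution sums to 1 (up to rounding error)
```

The first three values reproduce \(0.02\), \(0.05(0.98)\) and \(0.08(0.98)(0.95)\), and the probabilities over all 34 trials sum to 1.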

Expected value or average

One quantity of interest is the expected (average) number of trials until the first success. This is calculated as

\[\sum_{x=1}^{34} xP(X=x) = \sum_{x=1}^{34} x p_x\prod_{i=1}^{x-1}(1-p_i)= \sum_{x=1}^{33} x (0.03x-0.01)\prod_{i=1}^{x-1}(1.01-0.03i)+34\prod_{i=1}^{33}(1.01-0.03i)\]

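# Trial numbers; a success is certain by trial 34
x <- 1:34
# Success probabilities for trials 1 to 33: 0.02, 0.05, ..., 0.98
pvec <- seq(0.02, 1, by = 0.03)
# Probability of no success in the first k trials, k = 1..33
prodvec <- cumprod(1 - pvec)
# P(X = x): x - 1 failures followed by a success (p_34 = 1)
px <- c(pvec, 1) * c(1, prodvec)
# Expected number of trials until the first success
Ex <- sum(x * px)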

Calculating the expected value using R gives 7.23 trials.
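One way to cross-check this figure is a small Monte Carlo simulation of the experiment itself. The sketch below is an illustration, not part of the original calculation: the function name `simulate_first_success`, the seed, and the number of simulated runs are all arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

# Success probabilities for trials 1..34, capped at 1 on trial 34
p <- pmin(0.03 * (1:34) - 0.01, 1)

# Simulate one run of the experiment: draw one uniform number per trial
# and return the first trial whose draw falls below its success probability.
# Trial 34 always succeeds because p[34] = 1.
simulate_first_success <- function(p) {
  which(runif(length(p)) < p)[1]
}

n_sims <- 100000  # assumed number of simulated experiments
sims <- replicate(n_sims, simulate_first_success(p))

sim_mean <- mean(sims)  # should land close to the exact value of about 7.23
sim_mean
```

Drawing all the uniform numbers up front and keeping the first success is equivalent to running the trials one at a time, because the draws are independent.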

Median

Another quantity of interest is the median: the number of trials by which half of participants can expect to have had their first success, or the 50/50 point in the distribution. If you exceed this number of trials you can consider yourself unlucky in a sense.

Because \(X\) is discrete, there is generally no \(m\) that gives exactly 0.5, so calculating the median means finding the smallest \(m\) for which \(P(X\leq m)=\sum_{x=1}^{m} P(X=x) \geq 0.5\).

For every trial \(x\) we calculate the probability of obtaining the first success on or before this trial. This can be done by summing the already calculated probabilities up to each point, or from scratch using the fact that this probability can be expressed as 1 minus the probability of no successes up to that point.
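The two routes described above should agree, which is easy to confirm numerically. The following is a minimal sketch; the names `surv`, `Fx_direct` and `Fx_sum` are chosen here for illustration.

```r
# Success probabilities for trials 1..34, capped at 1 on trial 34
p <- pmin(0.03 * (1:34) - 0.01, 1)

# Route 1: 1 minus the probability of no successes in the first x trials
surv <- cumprod(1 - p)   # P(X > x)
Fx_direct <- 1 - surv    # P(X <= x)

# Route 2: sum the individual probabilities P(X = 1), ..., P(X = x)
px <- p * c(1, head(surv, -1))
Fx_sum <- cumsum(px)

all.equal(Fx_direct, Fx_sum)  # TRUE: both routes give the same probabilities
```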

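The table below gives the probabilities. The cumulative probability first passes 0.5 at trial 7 (0.4559 after trial 6, 0.5647 after trial 7), so the median is 7 trials. Obtaining the first success within the first 6 trials could be seen as lucky, while needing more than 7 trials could be seen as unlucky.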

We can also see from the table that trial 6 has the highest probability (0.1114), making it the mode of the distribution: the single most likely trial on which to get the first success.

Fx <- cumsum(px)
data.frame(x=x,px=round(px,4),Fx=round(Fx,4))

##     x     px     Fx
## 1   1 0.0200 0.0200
## 2   2 0.0490 0.0690
## 3   3 0.0745 0.1435
## 4   4 0.0942 0.2377
## 5   5 0.1067 0.3444
## 6   6 0.1114 0.4559
## 7   7 0.1088 0.5647
## 8   8 0.1001 0.6648
## 9   9 0.0871 0.7520
## 10 10 0.0719 0.8239
## 11 11 0.0564 0.8802
## 12 12 0.0419 0.9222
## 13 13 0.0296 0.9517
## 14 14 0.0198 0.9715
## 15 15 0.0125 0.9841
## 16 16 0.0075 0.9915
## 17 17 0.0042 0.9958
## 18 18 0.0022 0.9980
## 19 19 0.0011 0.9991
## 20 20 0.0005 0.9996
## 21 21 0.0002 0.9999
## 22 22 0.0001 1.0000
## 23 23 0.0000 1.0000
## 24 24 0.0000 1.0000
## 25 25 0.0000 1.0000
## 26 26 0.0000 1.0000
## 27 27 0.0000 1.0000
## 28 28 0.0000 1.0000
## 29 29 0.0000 1.0000
## 30 30 0.0000 1.0000
## 31 31 0.0000 1.0000
## 32 32 0.0000 1.0000
## 33 33 0.0000 1.0000
## 34 34 0.0000 1.0000
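Rather than reading the median and mode off the table, they can also be extracted programmatically. This sketch recomputes the probabilities so that it runs on its own; `median_x` and `mode_x` are names introduced here for illustration.

```r
# Recompute the distribution of the number of trials until the first success
p <- pmin(0.03 * (1:34) - 0.01, 1)
px <- p * c(1, cumprod(1 - p)[-34])  # P(X = x)
Fx <- cumsum(px)                     # P(X <= x)

# Median: the first trial at which the cumulative probability reaches 0.5
median_x <- which(Fx >= 0.5)[1]

# Mode: the trial with the highest individual probability
mode_x <- which.max(px)

c(median = median_x, mode = mode_x)  # median 7, mode 6
```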


Sean van der Merwe

Coordinator of UFS Statistical Consultation Unit

Statistician