The hazard function, cumulative hazard function, PDF, CDF, quantile function of an exponential distribution r.v. t are: The hazard function, cumulative hazard function, PDF, survival function, quantile function of a piecewise exponential distribution r.v. t with breakpoints di are:
The PWEXP
package provides dpwexp()
,
ppwexp()
, qpwexp()
, rpwexp()
functions to the piecewise exponential distribution:
# Left Figure ------------------------------------------------------------
# use rpwexp function to generate piecewise exp samples with rate 2, 1, 3
r_sample <- rpwexp(50000, rate=c(2, 1, 3), breakpoint=c(0.3, 0.8))
hist(r_sample, freq=F, breaks=200, main="Density of Piecewsie Exp Dist", xlab='t', xlim=c(0, 1.2))
# piecewise exp density with rate 2, 1, 3
t <- seq(0, 1.5, 0.01)
f2 <- dpwexp(t, rate=c(2, 1, 3), breakpoint=c(0.3, 0.8))
points(t, f2, col='red', pch=16)
# exp distribution can be a special case of piecewise exp distribution
f1 <- dpwexp(t, rate=2)
lines(t, f1, lwd=2)
legend('topright', c('exp dist with rate 2','piecewise exp dist with rate 2, 1, 3',
'histogram of piecewise exp dist with rate 2, 1, 3'),
col=c('black','red'), fill=c(NA, NA, 'grey'), border=c('white', 'white', 'black'),
lty=c(1, NA, NA), pch=c(NA, 16, NA), lwd=2)
# Right Figure ------------------------------------------------------------
# CDF of piecewise exp with rate 2, 1, 3
F2 <- ppwexp(t, rate=c(2, 1, 3), breakpoint=c(0.3, 0.8), lower.tail=T)
plot(t, F2, type='l', col='red', lwd=2, main="CDF and Quantile Function of Piecewsie Exp Dist",
xlim=c(0, 1.5), ylim=c(0, 1.5))
# CDF of exp dist is compatible with our package
F1 <- ppwexp(t, rate=2, lower.tail=T)
lines(t, F1, lwd=2)
# plot quantile functions of both distributions
lines(F1, qpwexp(F1, rate=2, lower.tail=T), lty=2, lwd=2)
lines(F2, qpwexp(F2, rate=c(2, 1, 3), breakpoint=c(0.3,0.8), lower.tail=T), col='red',
lty=2, lwd=2)
abline(0, 1, col='grey')
legend('topleft', c('CDF of piecewise exp with rate 2, 1, 3', 'quantile function of
piecewise exp with rate 2, 1, 3',
'CDF of exp with rate 2', 'quantile function of exp with rate 2'),
col=c('red', 'red', 'black', 'black'), lty=c(1, 2, 1, 2), lwd=2)
The conditional survival function, CDF, PFD and quantile function of an exponential distribution t given t > T is
The conditional survival function and CDF of a piecewise exponential
distribution t given t > T is The conditional
quantile function of a piecewise exponential distribution t given t > T is The
PWEXP
package provides ppwexp_conditional()
,
qpwexp_conditional()
, rpwexp_conditional()
functions for conditional piecewise exponential distribution:
# Left Figure ------------------------------------------------------------
# CDF and qunatile function of conditional piecewise exp with rate 2, 1, 3 given t > 0.1
t <- seq(0.1, 1.2, 0.01)
F2_con <- ppwexp_conditional(t, qT=0.1, rate=c(2, 1, 3), breakpoint=c(0.3, 0.8))
plot(t, F2_con, type='l', col='red', lwd=2, main="CDF and Quantile Function of
Conditional \nPiecewsie Exp Dist", xlim=c(0, 1.2), ylim=c(0, 1.2))
lines(F2_con, qpwexp_conditional(F2_con, qT=0.1, rate=c(2, 1, 3), breakpoint=c(0.3,0.8)),
lty=2, lwd=2, col='red')
# compare with CDF and quantile function of unconditional piecewise exp with rate 2, 1, 3
t <- seq(0, 1.2, 0.01)
F2 <- ppwexp(t, rate=c(2, 1, 3), breakpoint=c(0.3,0.8))
lines(t, F2, lwd=2)
lines(F2, qpwexp(F2, rate=c(2, 1, 3), breakpoint=c(0.3,0.8)), lty=2, lwd=2)
abline(v=0.1, col='grey')
abline(h=0.1, col='grey')
legend('topleft', c('CDF of piecewise exp dist given t > 0.1', 'quantile function of
piecewise exp dist given t > 0.1', 'CDF of piecewise exp dist',
'quantile function of piecewise exp dist'), col=c('red', 'red', 'black', 'black'),
lty=c(1, 2, 1, 2), lwd=2)
# Right Figure ------------------------------------------------------------
# use rpwexp_conditional function to generate piecewise exp samples with rate 2, 1, 3 given t > 0.1
r_sample_con <- rpwexp_conditional(3000, qT=0.1, rate=c(2, 1, 3), breakpoint=c(0.3,0.8))
plot(ecdf(r_sample_con), col='red', lwd=2, main="Empirical CDF of Conditional
Piecewsie Exp Dist", xlim=c(0, 1.2), ylim=c(0, 1))
# compare with its CDF
lines(seq(0.1, 1.2, 0.01), F2_con, lwd=2)
legend('topleft', c('empirial CDF of piecewise exp dist given t > 0.1', 'true CDF of
piecewise exp dist given t > 0.1'), col=c('red', 'black'), lty=c(1,2), lwd=2)
The log likelihood function can be constructed as following: where Dr or Cr are the index set of event time or censoring time that fall into rth interval of the piecewise exponential distribution.
When breakpoints d1 to dr are known, we can take a derivative of log (L) wrt λ: where nDr is the number of events that fall into the rth interval; nr+ is the number of events and censoring that fall into the rth to the end intervals.
Let all derivatives equal to 0, we obtain the MLE estimator of hazard rates:
The log likelihood function is differentiable wrt dr as long as dr is not equal to any event time or censoring time. We take the derivative of log (L) wrt dr: Since λr + 1 ≠ λr, then the derivative cannot be zero when dr is between two consecutive sample time points. This fact implies that only when dr is equal to any of the sample values, the log likelihood function achieves the maximum value.
Therefore, we can use a brute-force search to calculate the log likelihood function values for all potential breakpoint combinations. For each candidate, hazard rates can be estimated by the formula in Section 2.1 and thus the log likelihood can be obtained. The breakpoints and hazard rates with the largest log likelihood value are the MLE estimator.
When the number of samples or the number of breakpoints are relatively large, the combination of breakpoints will be very large and the brute-force search may not be feasible. We will draw a random sub-sample first and then do exhaustive search based on the sub-sample. The argument in function controls the maximum combination candidates to try.
The second method to estimate breakpoints is based on the survival curve. Once we take a log transformation of the piecewise survival function, we will obtain a piecewise linear function:
Let Yi be the log value of KM estimate at ti. Following Muggeo (2003), we use (ti, Yi) to fit the piecewise log survival function log (S(t)) and obtain the OLS breakpoints d̂r. Once breakpoints are determined, hazard rates can be estimated by the formula in Section 2.1.
The breakpoints estimated in Section 2.2.2 are not exactly but very close to the MLE estimator. In order to obtain the MLE estimator, we combine brute-force search with OLS method.
Specifically, instead of drawing a random sub-sample in Section 2.2.1, we draw a sub-sample from the values within 95% percent confidence intervals of the estimated breakpoints from Section 2.2.2. Then we do exhaustive search based on the sub-sample. The argument in function controls the maximum combination candidates to try. The hybrid method improves the efficiency and accuracy of the estimation procedure.
The procedure of parameter estimation is summarized in the diagram below:
[1] Muggeo, V. M. (2003). Estimating regression models with unknown break-points. Statistics in medicine, 22(19), 3055-3071.