Author manuscript; available in PMC: 2020 Sep 9.
Published in final edited form as: Commun Stat Theory Methods. 2019 Mar 28;49(17):4197–4215. doi: 10.1080/03610926.2019.1595655

Log-epsilon-skew normal: A generalization of the log-normal distribution

Alan D Hutson a, Terry L Mashtare Jr b, Govind S Mudholkar c
PMCID: PMC7480789  NIHMSID: NIHMS1022836  PMID: 32913381

Abstract

The log-normal distribution is widely used to model non-negative data in many areas of applied research. In this paper, we introduce and study a family of distributions supported on the non-negative reals, termed the log-epsilon-skew normal (LESN), which includes the log-normal distributions as a special case. It is related to the epsilon-skew normal distribution developed in Mudholkar and Hutson (2000) the way the log-normal is related to the normal distribution. We study its main properties, hazard function, moments, and skewness and kurtosis coefficients, and discuss maximum likelihood estimation of the model parameters. We summarize the results of a simulation study examining the behavior of the maximum likelihood estimates, and we illustrate maximum likelihood estimation of the LESN distribution parameters with two real-world data sets.

Keywords: Hazard function, log-normal distribution, maximum likelihood estimation

1. Introduction

The epsilon-skew normal (ESN) distribution was introduced as a skew extension of the normal distribution in the same spirit as Azzalini's (1985) skew-normal (SN) distribution. In this paper we introduce and investigate properties of a family of non-negative random variables (r.v.'s) called the log-epsilon-skew normal (LESN) r.v.'s. A r.v. is said to have the LESN distribution if its logarithm has the ESN distribution. A key advantage of the LESN family is that it includes the log-normal distributions as a subfamily; these are well known to be in use across a wide spectrum of disciplines. Examples include astrophysics, Gandhi (2009); environmental sciences, Benning and Barnes (2009); computer science, Doerr et al. (2013); economics, Cobb et al. (2013); biomedical research, Feng et al. (2013); and radiology, Neti and Howell (2008). Eckhard et al. (2001) compared the use of the log-normal distribution across several science disciplines.

The log-normal distribution has a long and rich history. Galton (1879) suggested the use of the log-normal distribution to analyze data for which the geometric mean is better than the arithmetic mean for estimating central tendency. McAlister (1879) derived the log-normal distribution at Galton's suggestion. Finney (1941) examined its moments, moment estimation, and the efficiency of the estimation. Heyde (1963) showed that the log-normal distribution is not uniquely determined by its moments. It is known that the moment generating function of the log-normal distribution does not exist; see Romano and Siegel (1986). However, it is well known that the moments of the log-normal distribution can be derived from the moment generating function of the normal distribution. For a review of the history of the log-normal distribution and its applications as a generative law, see Mitzenmacher (2004).

In terms of background, a LN(θ, σ) r.v. T is defined by its probability density function (p.d.f.)

f_T(t) = \frac{1}{t\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(\log t-\theta)^2}{2\sigma^2}\right) \qquad (1.1)

Its c.d.f. is given by

F_T(t) = \Phi\!\left(\frac{\log t-\theta}{\sigma}\right) \qquad (1.2)

where Φ(·) denotes the standard normal cumulative distribution function (c.d.f.).

In survival analysis, the log-normal distribution is used in accelerated failure time (AFT) models, see Collett (2003). The form of the log-normal hazard function is given by

h_T(t) = \frac{\dfrac{1}{t\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(\log t-\theta)^2}{2\sigma^2}\right)}{1-\Phi\!\left(\dfrac{\log t-\theta}{\sigma}\right)} \qquad (1.3)

The log-normal hazard function is unimodal: it starts at zero, increases to a maximum, and then decreases toward zero as t → ∞. Sweet (1990) gives mathematical properties of the hazard rate for the log-normal distribution.
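This rise-and-fall shape of (1.3) is easy to verify numerically. Below is a minimal sketch using only the Python standard library (statistics.NormalDist); the function name lognormal_hazard is ours, not the paper's.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf is Phi

def lognormal_hazard(t, theta=0.0, sigma=1.0):
    """Hazard (1.3) of LN(theta, sigma): f_T(t) / (1 - F_T(t))."""
    z = (log(t) - theta) / sigma
    f = exp(-z * z / 2.0) / (t * sqrt(2.0 * pi) * sigma)
    return f / (1.0 - N.cdf(z))
```

Evaluating this on a grid for θ = 0 and σ = 1 shows a hazard near zero for small t that rises to a single maximum and then decays slowly.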

One competing approach toward generalizing the log-normal distribution would be to utilize the skew-normal distribution of Azzalini (1985, 1986), more recently extended by Domma et al. (2015). This approach has been used in practice by Azzalini and Kotz (2002) for modeling income data and by Chai and Bailey (2008) for modeling continuous data with a discrete component at zero. However, the mathematical properties of the resulting log-skew-normal family have not been investigated in comparable detail.

This paper is organized as follows. In Section 2, we introduce the LESN distribution and discuss its properties. In Section 3, we examine the maximum likelihood estimates of the LESN parameters. In Section 4 we summarize the results of a simulation study to examine the behavior of the maximum likelihood estimates with and without the presence of censoring. We apply these methods to two real-life data sets in Section 5. In Section 6, we finish with some conclusions.

2. The log-epsilon-skew-normal distribution

For the further development in this section the ESN model is used as a starting point. Details of the epsilon-skew-normal (ESN) distribution, developed by Mudholkar and Hutson (2000), are given in the Appendix. Toward this end, let T be a random variable such that log(T) ~ ESN(θ, σ, ϵ). It follows that T ~ LESN(θ, σ, ϵ). The pdf, cdf, and quantile function of the LESN distribution are given as follows:

f_T(t)=\begin{cases} \dfrac{1}{t\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(\log t-\theta)^2}{2\sigma^2(1-\epsilon)^2}\right) & \text{if } 0<t<e^{\theta} \\[8pt] \dfrac{1}{t\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(\log t-\theta)^2}{2\sigma^2(1+\epsilon)^2}\right) & \text{if } t\ge e^{\theta} \end{cases} \qquad (2.1)

F_T(t)=\begin{cases} (1-\epsilon)\,\Phi\!\left(\dfrac{\log t-\theta}{\sigma(1-\epsilon)}\right) & \text{if } 0<t<e^{\theta} \\[8pt] -\epsilon+(1+\epsilon)\,\Phi\!\left(\dfrac{\log t-\theta}{\sigma(1+\epsilon)}\right) & \text{if } t\ge e^{\theta} \end{cases} \qquad (2.2)

and,

Q_T(u)=\exp\!\left[\theta+\sigma Q_0(u)\right] \qquad (2.3)

respectively, where −1 < ϵ < 1 and Q_0(u) is given by Equation (A.3) in the Appendix. As can be seen, at ϵ = 0 Equations (2.1) and (2.2) reduce to Equations (1.1) and (1.2). Figure 1 shows some typical LESN probability density functions with θ = 0. A vertical line is drawn at the splice point t = e^θ = 1. We see that the LESN pdf is unimodal, with the mode before the splice point t = 1.
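The pdf, cdf, and quantile function above translate directly into code. The following is a minimal sketch using only the Python standard library (statistics.NormalDist for Φ and Φ⁻¹); the function names are ours, not the paper's.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf = Phi, N.inv_cdf = Phi^{-1}

def lesn_pdf(t, theta=0.0, sigma=1.0, eps=0.0):
    """Density (2.1): spliced at t = e^theta with widths sigma(1-eps) and sigma(1+eps)."""
    if t <= 0.0:
        return 0.0
    w = sigma * (1.0 - eps) if log(t) < theta else sigma * (1.0 + eps)
    return exp(-(log(t) - theta) ** 2 / (2.0 * w * w)) / (t * sqrt(2.0 * pi) * sigma)

def lesn_cdf(t, theta=0.0, sigma=1.0, eps=0.0):
    """Distribution function (2.2)."""
    if t <= 0.0:
        return 0.0
    z = (log(t) - theta) / sigma
    if z < 0.0:
        return (1.0 - eps) * N.cdf(z / (1.0 - eps))
    return -eps + (1.0 + eps) * N.cdf(z / (1.0 + eps))

def lesn_quantile(u, theta=0.0, sigma=1.0, eps=0.0):
    """Quantile (2.3): exp(theta + sigma * Q0(u)), with Q0 from (A.3) in the Appendix."""
    if u < (1.0 - eps) / 2.0:
        q0 = (1.0 - eps) * N.inv_cdf(u / (1.0 - eps))
    else:
        q0 = (1.0 + eps) * N.inv_cdf((u + eps) / (1.0 + eps))
    return exp(theta + sigma * q0)
```

At ϵ = 0 these reduce to the log-normal expressions, and lesn_quantile inverts lesn_cdf, which also provides an inverse-transform sampler.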

Figure 1.

Figure 1

Some typical LESN probability density functions. (a) σ = 5 (b) σ = 3/2 (c) σ = 1 (d) σ = 1/2 (e) σ = 1/4 (f) σ = 1/8.

2.1. Moments

It is known that the moment generating function for the log-normal distribution does not exist, see Romano and Siegel (1986). Likewise, the moment generating function for the LESN does not exist for any −1<ϵ<1. However, the moments of the LESN distribution can be derived from the moment generating function of the ESN distribution.

By definition, the r-th moment of T from an LESN(θ, σ; ϵ) distribution is given by

\mu_r' = E(T^r) = E(e^{r\log T}) = E(e^{rX}) = M_X(r) = e^{\theta r}M(\sigma r) \qquad (2.4)

where MX(·) and M(·) are the moment generating functions for X~ESN(θ, σ; ϵ ) and ESN(0, 1, ϵ) random variables, respectively. Using Equations (2.4) and (A.6), we have

\mu_r' = E(T^r) = e^{\theta r}\left\{(1-\epsilon)e^{(1-\epsilon)^2\sigma^2 r^2/2}\,\Phi\!\left[-(1-\epsilon)\sigma r\right] + (1+\epsilon)e^{(1+\epsilon)^2\sigma^2 r^2/2}\,\Phi\!\left[(1+\epsilon)\sigma r\right]\right\} \qquad (2.5)

Mean, Median and Mode

By a straightforward calculation it is seen that the mean, median, and mode of the r.v. T with the LESN(θ, σ; ϵ) distribution are given by:

E(T)=e^{\theta}\left\{(1-\epsilon)e^{(1-\epsilon)^2\sigma^2/2}\,\Phi\!\left[-(1-\epsilon)\sigma\right] + (1+\epsilon)e^{(1+\epsilon)^2\sigma^2/2}\,\Phi\!\left[(1+\epsilon)\sigma\right]\right\} \qquad (2.6)

Q_T(1/2)=\begin{cases} \exp\!\left[\theta+\sigma(1-\epsilon)\,\Phi^{-1}\!\left(\dfrac{1}{2(1-\epsilon)}\right)\right] & \text{if } \epsilon<0 \\[8pt] \exp\!\left[\theta+\sigma(1+\epsilon)\,\Phi^{-1}\!\left(\dfrac{1/2+\epsilon}{1+\epsilon}\right)\right] & \text{if } \epsilon\ge 0 \end{cases} \qquad (2.7)

and

e^{\theta-\sigma^2(1-\epsilon)^2} \qquad (2.8)

Figure 1 shows some typical LESN probability density functions with θ = 0. A vertical line is drawn at the splice point t = e^θ = 1. We see that the LESN pdf is unimodal, with the mode falling before the splice point t = 1, consistent with Equation (2.8). We see a variety of shapes, with high peaks for large σ. This will be further examined within the context of the hazard function given below.
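The mode formula (2.8) can be checked by brute force: maximize the density (2.1) on a fine grid and compare with the closed form. A sketch, with function names of our choosing:

```python
from math import exp, log, pi, sqrt

def lesn_pdf(t, theta, sigma, eps):
    """LESN density (2.1)."""
    w = sigma * (1.0 - eps) if log(t) < theta else sigma * (1.0 + eps)
    return exp(-(log(t) - theta) ** 2 / (2.0 * w * w)) / (t * sqrt(2.0 * pi) * sigma)

def mode_closed_form(theta, sigma, eps):
    """Mode from (2.8); note it always falls below the splice point e^theta."""
    return exp(theta - sigma ** 2 * (1.0 - eps) ** 2)

def mode_grid(theta, sigma, eps, steps=200000):
    """Grid maximizer of the density on (0, 5 e^theta]."""
    hi = 5.0 * exp(theta)
    return max((1e-6 + i * hi / steps for i in range(steps)),
               key=lambda t: lesn_pdf(t, theta, sigma, eps))
```

The grid maximizer agrees with (2.8) to within the grid resolution.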

Variance, Skewness and Kurtosis

The raw moments of the LESN(θ, σ; ϵ) distribution are given by

\mu_1'=e^{\theta}\left\{\tau_1^{1/2}(1-\epsilon)\Phi(-v_1)+\tau_2^{1/2}(1+\epsilon)\Phi(v_2)\right\} \qquad (2.9)

\mu_2'=e^{2\theta}\left\{\tau_1^{2}(1-\epsilon)\Phi(-2v_1)+\tau_2^{2}(1+\epsilon)\Phi(2v_2)\right\} \qquad (2.10)

\mu_3'=e^{3\theta}\left\{\tau_1^{9/2}(1-\epsilon)\Phi(-3v_1)+\tau_2^{9/2}(1+\epsilon)\Phi(3v_2)\right\} \qquad (2.11)

\mu_4'=e^{4\theta}\left\{\tau_1^{8}(1-\epsilon)\Phi(-4v_1)+\tau_2^{8}(1+\epsilon)\Phi(4v_2)\right\} \qquad (2.12)

where \tau_1=e^{(1-\epsilon)^2\sigma^2}, \tau_2=e^{(1+\epsilon)^2\sigma^2}, v_1=(1-\epsilon)\sigma, and v_2=(1+\epsilon)\sigma. From these expressions for the moments we get,

Var(T)=\mu_2'-\mu_1'^2 \qquad (2.13)

\beta_1=\frac{\mu_3'-3\mu_1'\mu_2'+2\mu_1'^3}{Var(T)^{3/2}} \qquad (2.14)

\beta_2=\frac{\mu_4'-4\mu_1'\mu_3'+6\mu_1'^2\mu_2'-3\mu_1'^4}{Var(T)^2} \qquad (2.15)

where β1 and β2 denote the coefficients of skewness and kurtosis, respectively. The skewness and kurtosis of the log-normal distribution involve only the parameter σ. In contrast, for the LESN, β1 and β2 involve both σ and ϵ, but not θ.
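The moment expressions above are straightforward to implement, which also gives a quick consistency check against the classical log-normal formulas at ϵ = 0. A minimal sketch (function names ours):

```python
from math import exp, sqrt
from statistics import NormalDist

Phi = NormalDist().cdf

def lesn_raw_moment(r, theta, sigma, eps):
    """r-th raw moment via (2.4)-(2.5): e^{theta r} M(sigma r), M the ESN(0,1,eps) mgf (A.6)."""
    s = sigma * r
    return exp(theta * r) * (
        (1 - eps) * exp((1 - eps) ** 2 * s * s / 2.0) * Phi(-(1 - eps) * s)
        + (1 + eps) * exp((1 + eps) ** 2 * s * s / 2.0) * Phi((1 + eps) * s))

def lesn_var_skew_kurt(theta, sigma, eps):
    """Variance, skewness, and kurtosis via (2.13)-(2.15)."""
    m1, m2, m3, m4 = (lesn_raw_moment(r, theta, sigma, eps) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / var ** 2
    return var, skew, kurt
```

At ϵ = 0 the results agree with the log-normal moments, and the computed β1 and β2 do not change with θ, as stated above.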

2.2. Hazard function

Let z_1 = (\log t-\theta)/(\sigma(1-\epsilon)) and z_2 = (\log t-\theta)/(\sigma(1+\epsilon)). The hazard function for the LESN is given by the following:

h_T(t)=\begin{cases} \dfrac{\phi(z_1)}{t\sigma\left[1-(1-\epsilon)\Phi(z_1)\right]} & \text{if } 0<t<e^{\theta} \\[8pt] \dfrac{\phi(z_2)}{t\sigma(1+\epsilon)\left[1-\Phi(z_2)\right]} & \text{if } t\ge e^{\theta} \end{cases} \qquad (2.16)

Using the notation given in Section 2, we can rewrite hT(t) more compactly:

h_T(t)=\begin{cases} \dfrac{(1-\epsilon)f_{Z_1}(z_1)}{S_{Z_1}(z_1)+\epsilon F_{Z_1}(z_1)} & \text{if } 0<t<e^{\theta} \\[8pt] \dfrac{f_{Z_2}(z_2)}{S_{Z_2}(z_2)} & \text{if } t\ge e^{\theta} \end{cases} \qquad (2.17)

Note that the hazard function for t ∈ [eθ, ∞) corresponds to the hazard function for a LN(θ, σ(1+ ϵ)) random variable. Some example hazard function shapes are given in Figure 2. However for t ∈ (0, eθ) the form of the hazard function is more complicated. This yields some interesting results as developed below. In order to investigate the possible hazard shapes we examine the derivative hT'(t) of hT(t). It can be shown that

h_T'(t)=h_T(t)\left[h_T(t)-g(t)\right] \qquad (2.18)

where

g(t)=\begin{cases} \dfrac{1}{t}\left[\dfrac{\log t-\theta}{\sigma^2(1-\epsilon)^2}+1\right] & \text{if } 0<t<e^{\theta} \\[8pt] \dfrac{1}{t}\left[\dfrac{\log t-\theta}{\sigma^2(1+\epsilon)^2}+1\right] & \text{if } t\ge e^{\theta} \end{cases} \qquad (2.19)

In general, it is well-known that we can write the hazard function and the derivative of the hazard function for T as

h_T(t)=\frac{f_T(t)}{1-F_T(t)} \qquad (2.20)

and

h_T'(t)=\frac{f_T'(t)}{1-F_T(t)}+\left[h_T(t)\right]^2 \qquad (2.21)

The expressions for the hazard function and its derivative yield the following properties:

Figure 2.

Figure 2

Some typical LESN hazard functions. (a) σ = 5 (b) σ = 3/2 (c) σ = 1 (d) σ = 1/2 (e) σ = 1/4 (f) σ = 1/8.

Lemma 2.1. We have that hT(0)= 0 and hT'(0)=0.

Proof. Utilizing Equations (2.20) and (2.21), we have hT(0) = fT(0)= 0 and hT'(0)=fT'(0)=0.

Lemma 2.2. For t ∈ (0, e^θ), if ϵ > 0 then

hT(t)<hZ(t) (2.22)

where h_Z(t) is the hazard function for a LN(θ, σ(1−ϵ)) random variable.

Similarly, if ϵ<0 then

hT(t)>hZ(t) (2.23)

Proof. If ϵ > 0, we have

\frac{(1-\epsilon)f_{Z_1}(z_1)}{S_{Z_1}(z_1)+\epsilon F_{Z_1}(z_1)} \le \frac{(1-\epsilon)f_{Z_1}(z_1)}{S_{Z_1}(z_1)} < \frac{f_{Z_1}(z_1)}{S_{Z_1}(z_1)} \qquad (2.24)

In addition, if ϵ < 0 we have

\frac{(1-\epsilon)f_{Z_1}(z_1)}{S_{Z_1}(z_1)+\epsilon F_{Z_1}(z_1)} \ge \frac{(1-\epsilon)f_{Z_1}(z_1)}{S_{Z_1}(z_1)} > \frac{f_{Z_1}(z_1)}{S_{Z_1}(z_1)} \qquad (2.25)

Lemma 2.3. For the limits we have

\lim_{t\to\infty}h_T(t)=0 \quad\text{and}\quad \lim_{t\to\infty}h_T'(t)=0

Proof. This property follows from the fact that hT(t) for t ∈ [eθ, ∞) corresponds to the hazard function for a LN(θ, σ(1+ϵ)) random variable. Sweet (1990) showed that the hazard function for a log-normal random variable tends to 0 as t → ∞.

Lemma 2.4. The hazard function at e^θ is given by

h_T(e^{\theta})=\frac{2}{\sqrt{2\pi}\,e^{\theta}\sigma(1+\epsilon)}

Proof. Substitute t = e^θ in Equation (2.16). Note that h_T(t) is continuous at t = e^θ.

Lemma 2.5. The derivative of the hazard function h_T'(e^θ) exists and is given by

h_T'(e^{\theta})=\frac{2-\sigma(1+\epsilon)\sqrt{2\pi}}{\pi\sigma^2(1+\epsilon)^2 e^{2\theta}} \qquad (2.26)

Proof. Using Equations (2.18) and (2.19) we have

\lim_{t\to e^{\theta-}}h_T'(t)=\lim_{t\to e^{\theta+}}h_T'(t)=\frac{2-\sigma(1+\epsilon)\sqrt{2\pi}}{\pi\sigma^2(1+\epsilon)^2 e^{2\theta}} \qquad (2.27)

Lemma 2.6. At t = e^θ, h_T(t) is increasing if

\sigma(1+\epsilon)<\sqrt{2/\pi}

At t = e^θ, h_T(t) is decreasing if

\sigma(1+\epsilon)>\sqrt{2/\pi}

The point t = e^θ is a relative maximum of h_T(t) if

\sigma(1+\epsilon)=\sqrt{2/\pi}

Proof. Follows directly from Lemma 2.5.

In summary, Lemma 2.2 implies that for t ∈ (0, e^θ), if ϵ > 0 the hazard function of a LESN random variable is bounded above by the hazard function of a LN random variable, while if ϵ < 0 it is bounded below by the LN hazard function. Lemma 2.5 implies that h_T'(t), and hence h_T(t), is continuous at the splice point t = e^θ. Lemma 2.6 combined with Equation (2.16) implies that if σ(1+ϵ) < \sqrt{2/\pi}, then the hazard function has a relative maximum at some t ∈ [e^θ, ∞). Similarly, if σ(1+ϵ) > \sqrt{2/\pi}, then the hazard function decreases on t ∈ [e^θ, ∞).
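The hazard (2.16) and the shape condition of Lemma 2.6 can be checked numerically. A sketch using the standard library only (function name ours):

```python
from math import log, pi, sqrt
from statistics import NormalDist

N = NormalDist()  # N.pdf = phi, N.cdf = Phi

def lesn_hazard(t, theta, sigma, eps):
    """Hazard (2.16), spliced at t = e^theta."""
    if log(t) < theta:
        z1 = (log(t) - theta) / (sigma * (1.0 - eps))
        return N.pdf(z1) / (t * sigma * (1.0 - (1.0 - eps) * N.cdf(z1)))
    z2 = (log(t) - theta) / (sigma * (1.0 + eps))
    return N.pdf(z2) / (t * sigma * (1.0 + eps) * (1.0 - N.cdf(z2)))
```

At t = e^θ this reproduces the value in Lemma 2.4, and the hazard is locally increasing or decreasing there according as σ(1+ϵ) lies below or above √(2/π).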

2.3. LESN model for censored survival data analysis

One adaptation to the LESN is in modeling typical bio-medical survival data. We have already explored the uncensored case. We now examine the use of the LESN distribution when right censoring is present.

Let T_i = min(X_i, C_i), where X_i denotes the failure time and C_i denotes the right censoring time for subject i = 1, 2, …, n. The value of X_i is known only if X_i ≤ C_i. In addition, let the indicator variable δ_i = 1 if X_i ≤ C_i and 0 otherwise. Assuming X_i ~ LESN(θ, σ, ϵ), the log-likelihood contribution of observation i is given by

l(t_i;\theta,\sigma^2,\epsilon)=\delta_i\log f_T(t_i)+(1-\delta_i)\log\left[1-F_T(t_i)\right] \qquad (2.28)

where f_T(t_i) and F_T(t_i) are given by Equations (2.1) and (2.2), respectively. Maximum likelihood estimation can be used to estimate the model parameters, which can subsequently be used to fit the hazard function given in Equation (2.16).
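Evaluating (2.28) for a right-censored sample is mechanical once (2.1) and (2.2) are coded; a numerical optimizer can then be pointed at the function below. A self-contained sketch (function names ours):

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

Phi = NormalDist().cdf

def _pdf(t, th, s, e):
    """LESN density (2.1)."""
    w = s * (1.0 - e) if log(t) < th else s * (1.0 + e)
    return exp(-(log(t) - th) ** 2 / (2.0 * w * w)) / (t * sqrt(2.0 * pi) * s)

def _cdf(t, th, s, e):
    """LESN distribution function (2.2)."""
    z = log(t) - th
    if z < 0.0:
        return (1.0 - e) * Phi(z / (s * (1.0 - e)))
    return -e + (1.0 + e) * Phi(z / (s * (1.0 + e)))

def loglik(times, delta, th, s, e):
    """(2.28): sum of delta_i log f_T(t_i) + (1 - delta_i) log[1 - F_T(t_i)]."""
    return sum(log(_pdf(t, th, s, e)) if d else log(1.0 - _cdf(t, th, s, e))
               for t, d in zip(times, delta))
```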

3. Maximum likelihood estimation

Let Y_1, Y_2, …, Y_n be i.i.d. with Y_i ~ LESN(θ, σ, ϵ). The loglikelihood for the i-th observation is given by

l_{LESN}(\theta,\sigma^2,\epsilon\mid y_i)=\begin{cases} -\log y_i-\log(\sqrt{2\pi}\,\sigma)-\dfrac{(\log y_i-\theta)^2}{2\sigma^2(1-\epsilon)^2} & \text{if } \log y_i<\theta \\[8pt] -\log y_i-\log(\sqrt{2\pi}\,\sigma)-\dfrac{(\log y_i-\theta)^2}{2\sigma^2(1+\epsilon)^2} & \text{if } \log y_i\ge\theta \end{cases} \qquad (3.1)

Denote the loglikelihood of the i-th observation for the ESN(θ, σ, ϵ) distribution by l_ESN(θ, σ², ϵ | x_i); then l_LESN(θ, σ², ϵ | y_i) = −log y_i + l_ESN(θ, σ², ϵ | log y_i). We denote the maximum likelihood estimates of (θ, σ², ϵ) by (θ^, σ^2, ϵ^).

Let y_(1) ≤ y_(2) ≤ … ≤ y_(n) denote the order statistics of a sample from the LESN(θ, σ, ϵ) population. Also let y_(0) = 0 and y_(n+1) = ∞. We apply some of the results from Mudholkar and Hutson (2000) to the LESN(θ, σ, ϵ) distribution.

Lemma 3.1. There exists an integer k = k(y_(1), …, y_(n)), 0 ≤ k ≤ n, such that the loglikelihood l(θ, σ², ϵ) can be expressed as

l(\theta,\sigma^2,\epsilon)=\begin{cases} -n\log(\sqrt{2\pi}\,\sigma)-\sum_{i=1}^{n}\log y_{(i)}-\dfrac{1}{8\sigma^2}\sum_{i=1}^{n}(\log y_{(i)}-\theta)^2 & \text{if } k=0,n \\[8pt] -n\log(\sqrt{2\pi}\,\sigma)-\sum_{i=1}^{n}\log y_{(i)}-\dfrac{1}{2\sigma^2}\left[\sum_{i=1}^{k}\dfrac{(\log y_{(i)}-\theta)^2}{(1-\epsilon)^2}+\sum_{i=k+1}^{n}\dfrac{(\log y_{(i)}-\theta)^2}{(1+\epsilon)^2}\right] & \text{if } 1\le k<n \end{cases} \qquad (3.2)

Proof. The existence of k is obvious. The form of the loglikelihood follows from the p.d.f. of LESN(θ, σ, ϵ) given by Equation (2.1). Note that the case k = 0 corresponds to the p.d.f. with ϵ = 1 and the case k = n corresponds to the p.d.f. with ϵ = −1. In other words, for k = 0 and k = n the loglikelihood comes directly from half-normal distributions.

The estimation of k, i.e. identification of the interval containing the maximum likelihood estimate of the mode, is helped by the following two lemmas.

Lemma 3.2. If k = 0 or k = n the maximum likelihood estimate (θ^,σ^2,ϵ^) is given by

(\hat\theta,\hat\sigma^2,\hat\epsilon)=\begin{cases} (\log y_{(1)},\, s_0^2,\, 1) & \text{if } k=0 \\[4pt] (\log y_{(n)},\, s_n^2,\, -1) & \text{if } k=n \end{cases} \qquad (3.3)

where s_0^2=\sum_{i=2}^{n}(\log y_{(i)}-\log y_{(1)})^2/(4n) and s_n^2=\sum_{i=1}^{n-1}(\log y_{(i)}-\log y_{(n)})^2/(4n).

Proof. If k = 0, then ϵ̂ = 1 and, in the loglikelihood, log y_(i) ≥ θ for i = 1, 2, …, n. Maximization of the loglikelihood then yields θ^ = log y_(1), the logarithm of the sample minimum, and σ^2 = s_0².

The proof of Lemma 3.2 at k = n follows similarly.

Lemma 3.3. If 1 ≤ k<n then the local minima of

g_j(y,\theta)=\frac{1}{4}\left\{\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta)^2\right]^{1/3}+\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)^2\right]^{1/3}\right\}^{3} \qquad (3.4)

for j = 1, 2, …, n−1, if any, determine the local maxima of the loglikelihood.

Proof. If 1 ≤ k < n then, for fixed σ², the loglikelihood l(θ, σ², ϵ) at Equation (3.2) is maximized by examining the local minima of

g_j(y,\theta,\epsilon)=\sum_{i=1}^{j}\frac{(\log y_{(i)}-\theta)^2}{(1-\epsilon)^2}+\sum_{i=j+1}^{n}\frac{(\log y_{(i)}-\theta)^2}{(1+\epsilon)^2} \qquad (3.5)

where −1<ϵ<1, and j = 1; 2, … n−1. Now gj(y, θ; ϵ) may be minimized sequentially, first with respect to ϵ for a fixed θ (and consequently for fixed k), and then with respect to θ. Thus, equating the derivative of gj(y, θ, ϵ) with respect to ϵ to zero we have:

\sum_{i=1}^{j}\frac{(\log y_{(i)}-\theta)^2}{(1-\epsilon)^3}-\sum_{i=j+1}^{n}\frac{(\log y_{(i)}-\theta)^2}{(1+\epsilon)^3}=0 \qquad (3.6)

Eliminating ϵ between Equations (3.5) and (3.6), and simplifying, we get the minimum of gj(y, θ, ϵ) with respect to ϵ as

g_j(y,\theta)=\frac{1}{4}\left\{\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta)^2\right]^{1/3}+\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)^2\right]^{1/3}\right\}^{3} \qquad (3.7)

Thus the problem is reduced to minimizing g_j(y, θ), or equivalently h_j(y, θ) = (4g_j(y, θ))^{1/3}, with respect to θ subject to log y_(j) ≤ θ ≤ log y_(j+1). Clearly, h_j(y, θ) is a differentiable function of θ with derivative

h_j'(y,\theta)=-\frac{2}{3}\left\{\frac{\sum_{i=1}^{j}(\log y_{(i)}-\theta)}{\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta)^2\right]^{2/3}}+\frac{\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)}{\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)^2\right]^{2/3}}\right\} \qquad (3.8)

It follows that the solutions of hj'(y,θ)=0, j = 1,2, …, n−1, if any, correspond to the local stationary points of the loglikelihood.

It is observed that in general the function gj(y, θ) is monotone in most intervals. In other words, most of the half-open intervals [log(y(j)), log (y(j+1))), do not contain the local minima of gj(y, θ). The problem of identifying the intervals which have the local minima can be efficiently handled by graphing the continuously differentiable (except possibly at θ = log (y(1)) and log (y(n))) function

g(\theta)=\sum_{j=1}^{n-1}g_j(y,\theta)\,I_j(\theta) \qquad (3.9)

where Ij(θ) is the indicator function for the half-open interval [log(y(j)), log (y(j+1))), and gj(y, θ) is given by Equation (3.7). However, the isolation of the global maximum of the likelihood from the set of the local minima identified as in Lemma 3.3 requires the values of the likelihood at these points. Towards this end we have the following:

Lemma 3.4. Let θ0 denote a solution of

h_j'(y,\theta)=-\frac{2}{3}\left\{\frac{\sum_{i=1}^{j}(\log y_{(i)}-\theta)}{\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta)^2\right]^{2/3}}+\frac{\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)}{\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta)^2\right]^{2/3}}\right\}=0 \qquad (3.10)

j = 1, 2, …, n−1. Then the corresponding values σ2=σ02 and ϵ = ϵ0 which maximize the loglikelihood are given by

\epsilon_0=\frac{\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta_0)^2\right]^{1/3}-\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta_0)^2\right]^{1/3}}{\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta_0)^2\right]^{1/3}+\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta_0)^2\right]^{1/3}} \qquad (3.11)

\sigma_0^2=\frac{1}{4n}\left\{\left[\sum_{i=1}^{j}(\log y_{(i)}-\theta_0)^2\right]^{1/3}+\left[\sum_{i=j+1}^{n}(\log y_{(i)}-\theta_0)^2\right]^{1/3}\right\}^{3} \qquad (3.12)

Proof. Follows directly from Equations (3.1) and (3.6).

The value of the integer j, i.e. the interval containing the maximum likelihood estimate of the mode, is determined by the value of the loglikelihood over the set of all (θ_0, σ_0², ϵ_0) given by Lemmas 3.3 and 3.4.

Lemma 3.5. The global maximum of the loglikelihood is obtained by examining the following local maxima for the set k = 0, k = n, and the set (θ0,σ02,ϵ0) given by Lemmas 3.2 and 3.4:

\max_k l(\theta_0,\sigma_0^2,\epsilon_0)=\begin{cases} -\frac{n}{2}\left[1+\log s_0^2\right] & \text{if } k=0 \\[4pt] -\frac{n}{2}\left[1+\log \sigma_0^2\right] & \text{if } 1\le k<n \\[4pt] -\frac{n}{2}\left[1+\log s_n^2\right] & \text{if } k=n \end{cases} \qquad (3.13)

where s_0² and s_n² are given by Lemma 3.2 and σ_0² is given by Lemma 3.4.

For ϵ = 1, θ becomes a threshold parameter with log y_(1) as a consistent and reasonable estimator. The scale parameter may then be estimated by substituting log y_(1) for θ and maximizing the likelihood. The asymptotic theory of estimation in the presence of a threshold parameter, e.g. as discussed by R. L. Smith (1985), is then operative. The analogous observation applies when ϵ = −1.

Theorem 3.6. As n → ∞, \sqrt{n}\,(\hat\theta_{ml}-\theta,\ \hat\sigma_{ml}^2-\sigma^2,\ \hat\epsilon_{ml}-\epsilon) converges in distribution to a centered trivariate normal with variance-covariance matrix

\Sigma_{ml}=\begin{pmatrix} \dfrac{3\pi(1-\epsilon^2)\sigma^2}{3\pi-8} & 0 & -\dfrac{2\sqrt{2\pi}\,(1-\epsilon^2)\sigma}{3\pi-8} \\[8pt] 0 & 2\sigma^4 & 0 \\[8pt] -\dfrac{2\sqrt{2\pi}\,(1-\epsilon^2)\sigma}{3\pi-8} & 0 & \dfrac{\pi(1-\epsilon^2)}{3\pi-8} \end{pmatrix} \qquad (3.14)

Proof. The regularity conditions for the validity of the large sample theory for maximum likelihood estimation (e.g. Rao 1973, Section 5g) are obviously satisfied. Also, Σ_ml is the inverse of the expectation of the total sample information matrix given by the density in Equation (2.1) and may be computed readily. The proof is completed by taking expectations using the score equations.

Theorem 3.7. If ϵ = ϵ_0 is known, then as n → ∞, \sqrt{n}\,(\hat\theta_{ml}(\epsilon_0)-\theta,\ \hat\sigma_{ml}^2(\epsilon_0)-\sigma^2) converges in distribution to a centered bivariate normal with variance-covariance matrix

\begin{pmatrix} (1-\epsilon_0^2)\sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix} \qquad (3.15)

In the context of LESN modeling the parameter ϵ may be interpreted as a measure of distance of the distribution from log-normality with respect to LESN alternatives. Hence, the test

H_0:\epsilon=0 \quad\text{versus}\quad H_1:\epsilon\ne 0 \qquad (3.16)

may be used as a diagnostic tool for assessing the appropriateness of an underlying log-normal model versus a broad class of LESN alternatives, with the large-sample p-value given via the results of Theorem 3.7. A parametric bootstrap approach, as given by Ristic et al. (2018), may be used to test the validity of the model for real data sets.
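As an illustration of how such a diagnostic might be computed, the Wald-type statistic below standardizes ϵ̂ under the assumption that its asymptotic variance under H0 is π/((3π − 8)n), the ϵ-entry of the covariance matrix (3.14) at ϵ = 0. This is our sketch, not the paper's prescribed procedure, and the function name is ours.

```python
from math import pi, sqrt
from statistics import NormalDist

def lesn_lognormality_test(eps_hat, n):
    """Two-sided Wald-type test of H0: eps = 0, assuming Var(eps_hat) ~ pi/((3*pi - 8)*n)."""
    z = eps_hat * sqrt(n * (3.0 * pi - 8.0) / pi)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p
```

Larger samples or larger |ϵ̂| push the p-value down, as expected.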

4. Monte Carlo experiments

In this section we outline two Monte Carlo experiments performed in order to study the behavior of the maximum likelihood estimates of the LESN parameters and summarize their results. The first study examines the uncensored case and the second considers randomly censored data. Starting values were obtained from the maximum likelihood estimates of θ and σ under a log-normal distribution, with ϵ set to 0; parameter estimation was carried out using classic Newton-Raphson methodology.

In the uncensored case, we let T~LESN(0, 1, ϵ) with sample sizes n = 25, 50, 100, 200, and 500. We utilized 10000 simulations under each scenario. We let ϵ range from 0 to 0.75 by 0.25, which includes the special case of ϵ = 0 corresponding to the log-normal distribution.

Table 1 summarizes the means ± standard deviations of the MLEs for each of the parameter configurations in the uncensored case, estimated on the basis of 10000 replications. As expected, the finite-sample bias approaches zero and the variance of the parameter estimates decreases as the sample size increases. Hence, via simulation we have that the maximum likelihood estimates of the LESN parameters are well behaved, supporting the results at Equation (3.14).

Table 1.

MLE of LESN model fit ϵ ≥ 0 no censoring.

ϵ n θ^ (θ = 0) σ^ (σ = 1) ϵ^
 25  0.003 ± 0.71  0.938 ± 0.14 −0.001 ± 0.47
 50  0.004 ± 0.45  0.972 ± 0.10  0.000 ± 0.28
0 100  0.004 ± 0.28  0.988 ± 0.07 −0.002 ± 0.17
200  0.005 ± 0.19  0.994 ± 0.05 −0.003 ± 0.11
500  0.002 ± 0.12  0.997 ± 0.03 −0.001 ± 0.07
 25 −0.032 ± 0.68  0.934 ± 0.14  0.291 ± 0.45
 50 −0.029 ± 0.44  0.973 ± 0.10  0.275 ± 0.27
0.25 100 −0.005 ± 0.28  0.986 ± 0.07  0.257 ± 0.16
200 −0.001 ± 0.18  0.993 ± 0.05  0.253 ± 0.11
500 −0.002 ± 0.11  0.997 ± 0.03  0.252 ± 0.07
 25 −0.028 ± 0.58  0.933 ± 0.14  0.564 ± 0.38
 50 −0.048 ± 0.40  0.969 ± 0.10  0.547 ± 0.25
0.50 100 −0.025 ± 0.26  0.987 ± 0.07  0.521 ± 0.16
200 −0.005 ± 0.17  0.994 ± 0.05  0.506 ± 0.10
500 −0.001 ± 0.10  0.997 ± 0.03  0.502 ± 0.06
 25  0.024 ± 0.44  0.930 ± 0.14  0.800 ± 0.28
 50 –0.015 ± 0.31  0.966 ± 0.10  0.790 ± 0.19
0.75 100 −0.027 ± 0.21  0.986 ± 0.07  0.779 ± 0.13
200 −0.014 ± 0.14  0.992 ± 0.05  0.763 ± 0.08
500 −0.004 ± 0.08  0.997 ± 0.03  0.754 ± 0.05

Table 4.

MLE of LESN model fit ϵ ≥ 0 50% censoring.

ϵ n θ^ (θ = 0) σ^ (σ = 1) ϵ^
 25 −0.369 ± 1.20 1.019 ± 0.43 0.002 ± 0.87
 50 −0.283 ± 1.17 1.058 ± 0.38 −0.012 ± 0.79
0 100 −0.227 ± 1.08 1.085 ± 0.32 0.003 ± 0.70
200 −0.172 ± 0.92 1.084 ± 0.26 0.019 ± 0.60
500 −0.216 ± 0.73 1.097 ± 0.20 0.085 ± 0.47
 25 −0.308 ± 1.02 1.061 ± 0.37 0.308 ± 0.80
 50 −0.339 ± 0.97 1.125 ± 0.31 0.346 ± 0.69
0.25 100 −0.358 ± 0.81 1.158 ± 0.25 0.395 ± 0.54
200 −0.387 ± 0.59 1.176 ± 0.18 0.444 ± 0.37
500 −0.431 ± 0.30 1.190 ± 0.10 0.494 ± 0.18
 25 −0.230 ± 0.77 1.099 ± 0.31 0.604 ± 0.65
 50 −0.335 ± 0.65 1.170 ± 0.23 0.671 ± 0.47
0.5 100 −0.368 ± 0.47 1.200 ± 0.17 0.696 ± 0.30
200 −0.357 ± 0.29 1.208 ± 0.11 0.696 ± 0.16
500 −0.328 ± 0.15 1.207 ± 0.06 0.685 ± 0.08
 25 −0.089 ± 0.48 1.128 ± 0.24 0.848 ± 0.41
 50 −0.194 ± 0.35 1.182 ± 0.17 0.887 ± 0.26
0.75 100 −0.235 ± 0.23 1.208 ± 0.12 0.894 ± 0.13
200 −0.227 ± 0.16 1.216 ± 0.08 0.878 ± 0.09
500 −0.232 ± 0.17 1.225 ± 0.06 0.870 ± 0.07

For the censored case, we needed to simulate T_i = min(X_i, C_i), where X_i denotes the failure time and C_i denotes the right censoring time for subject i = 1, 2, …, n. We assume that the censoring times C_i are independent and identically distributed exponential random variables with mean 1/λ; that is, C_i ~ Expo(λ). We fix θ = 0 and σ = 1, and let ϵ range from −0.75 to 0.75 by 0.25.

Values of λ were chosen to obtain samples with expected censoring proportions of 10%, 20%, 30%, 40%, and 50%. Finding the values of λ analytically would require solving

c=P(Xi>Ci) (4.1)

for λ, where c is the expected censoring proportion. This is equivalent to choosing a value of λ so that E(δ_i) = 1−c, where δ_i = 1 if X_i ≤ C_i and 0 otherwise. This was done via simulation.
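The calibration of λ by simulation can be sketched as follows: draw LESN failure times by inverse transform, draw exponential censoring times, and bisect on λ until the empirical censoring proportion matches the target. With a fixed seed the empirical proportion is a nondecreasing step function of λ, so bisection is well behaved. Function names are ours.

```python
import random
from math import exp
from statistics import NormalDist

N = NormalDist()

def lesn_sample(rng, theta, sigma, eps):
    """Inverse-transform draw via the quantile function (2.3) with Q0 from (A.3)."""
    u = rng.random()
    if u < (1.0 - eps) / 2.0:
        q0 = (1.0 - eps) * N.inv_cdf(u / (1.0 - eps))
    else:
        q0 = (1.0 + eps) * N.inv_cdf((u + eps) / (1.0 + eps))
    return exp(theta + sigma * q0)

def censoring_rate(lam, theta, sigma, eps, reps=8000, seed=1):
    """Monte Carlo estimate of c = P(X > C) in (4.1), with C ~ Expo(lam)."""
    rng = random.Random(seed)
    return sum(lesn_sample(rng, theta, sigma, eps) > rng.expovariate(lam)
               for _ in range(reps)) / reps

def calibrate_lambda(target, theta, sigma, eps, lo=1e-4, hi=100.0):
    """Geometric bisection on the rate lam until the censoring proportion hits target."""
    for _ in range(30):
        mid = (lo * hi) ** 0.5
        if censoring_rate(mid, theta, sigma, eps) < target:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5
```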

Tables 1–4 summarize the maximum likelihood estimates of the LESN model parameters θ, σ, and ϵ. We see that there is finite-sample bias in the parameter estimates. As expected, the bias approaches zero and the variance of the parameter estimates decreases as the sample size increases. The bias increases as the censoring proportion increases, as would be expected.

5. Two examples

For the purpose of illustration we first look at the Mayo Clinic primary biliary cirrhosis data, which can be downloaded from http://www.umass.edu/statdata/statdata/data/; see Fleming and Harrington (1991). In particular, we examine 418 serum bilirubin (mg/dl) measurements. Some basic statistics are reported in Table 5. In Table 6 maximum likelihood estimates are presented for the log-normal and LESN models. It is interesting to note that the p-value for testing the hypothesis in Equation (3.16) is p < 0.0001. This suggests that the log-normal model may not be appropriate. Starting values for both examples were obtained from the maximum likelihood estimates of θ and σ under a log-normal distribution, with ϵ set to 0; parameter estimation was carried out using classic Newton-Raphson methodology. The p-value calculations follow directly from Theorem 3.7.

Table 5.

Summary statistics for example one data.

n X¯ s β1
418 3.22 4.41 2.72

Table 6.

Maximum likelihood values for example one data.

Parameters LESN Log-normal
θ −0.4169  0.5715
σ 0.9273  1.0238
ϵ 0.6713   –
mean 3.52  2.99
standard deviation 8.72  4.07
skewness 0.23  0.31

Figure 3a depicts a histogram of the serum bilirubin measurements along with the fitted log-normal and LESN models. We see that the LESN model provides a better fit. This is supported by Figure 3b, which depicts a histogram of the natural logarithm of the serum bilirubin measurements along with the fitted normal and ESN models. Even after the log transformation, the data are clearly right skewed.

Figure 3.

Figure 3

Histogram of Serum Bilirubin (mg/dl) data. (a) Model fitting LESN (solid line) and lognormal (dotted line) (b) Model fitting ESN (solid line) and normal (dotted line).

For our second example, we look at data that was part of a large clinical trial carried out by the Radiation Therapy Oncology Group in the United States, Kalbfleisch and Prentice (1980). The data may also be downloaded from http://www.umass.edu/statdata/statdata/data/. In particular, we examine the survival time in days from date of diagnosis in 195 patients with squamous carcinoma in the oropharynx. There were 53 patients (27%) whose survival times were right censored. In Table 7 maximum likelihood estimators are presented for log-normal and LESN. It is interesting to note that the p-value for testing Equation (3.16) is p = 0.0174. As in the previous example, this suggests that the log-normal model may not be appropriate. Figure 4 shows the estimated hazard function for log-normal and LESN models.

Table 7.

Maximum likelihood values for example two data.

Parameters LESN Log-normal
θ 5.7489 6.2060
σ 1.3002 1.0641
ϵ 0.2266 –

Figure 4.

Figure 4

Lognormal and LESN hazard fits for survival data.

6. Conclusions and discussion

In this paper we have proposed a generalization of the log-normal distribution called the log-epsilon-skew-normal distribution, obtained some of its basic properties, and developed its adaptation for the analysis of randomly censored survival data, including maximum likelihood estimation, with the associated information matrices, of its parameters in the uncensored and randomly censored cases. The MLEs in the two cases are also studied empirically in moderate sample-size situations, and the applications are illustrated by two examples. It is noted that the entire development is based on the ESN distribution. We can adapt the methodology developed by Hutson (2004) to extend the use of the LESN to log-transformed response variables in common regression problems. It is to be noted that if the scale parameter in the ESN distribution is divided by an independently distributed \sqrt{\chi_v^2/v} variable or, alternatively, if Student's t-distribution is used in place of the N(μ, σ) distribution in the ESN definition, we get an epsilon-skew-t distribution (Gómez et al. 2007). This model, which includes the ES-Cauchy as a particular case, is more flexible in terms of both skewness and kurtosis. This model together with its log-modification is currently under investigation. Similar developments involving the ES-logistic and LES-logistic, which may be more apt for quantal response analysis, are also being pursued.

Table 2.

MLE of LESN model fit ϵ ≥ 0 10% censoring.

ϵ n θ^ (θ = 0) σ^ (σ = 1) ϵ^
 25 −0.042 ± 0.78 0.963 ± 0.17 0.006 ± 0.53
 50 −0.022 ± 0.52 1.001 ± 0.12 0.006 ± 0.33
0 100 −0.032 ± 0.32 1.021 ± 0.08 0.017 ± 0.20
200 −0.027 ± 0.21 1.028 ± 0.06 0.015 ± 0.12
500 −0.025 ± 0.13 1.031 ± 0.04 0.015 ± 0.08
 25 −0.089 ± 0.74 0.985 ± 0.17 0.309 ± 0.50
 50 −0.075 ± 0.50 1.028 ± 0.12 0.297 ± 0.31
0.25 100 −0.049 ± 0.31 1.041 ± 0.08 0.28 ± 0.18
200 −0.043 ± 0.20 1.048 ± 0.06 0.275 ± 0.12
500 −0.035 ± 0.12 1.052 ± 0.04 0.271 ± 0.07
 25 −0.067 ± 0.61 0.997 ± 0.17 0.577 ± 0.42
 50 −0.075 ± 0.42 1.037 ± 0.12 0.559 ± 0.26
0.5 100 −0.056 ± 0.27 1.055 ± 0.08 0.539 ± 0.16
200 −0.040 ± 0.17 1.063 ± 0.06 0.525 ± 0.10
500 −0.032 ± 0.11 1.066 ± 0.04 0.519 ± 0.06
 25 −0.002 ± 0.43 1.011 ± 0.16 0.814 ± 0.29
 50 −0.055 ± 0.30 1.05 ± 0.11 0.813 ± 0.19
0.75 100 −0.054 ± 0.22 1.07 ± 0.08 0.793 ± 0.13
200 −0.039 ± 0.14 1.078 ± 0.06 0.776 ± 0.08
500 −0.026 ± 0.08 1.082 ± 0.04 0.766 ± 0.05

Table 3.

MLE of LESN model fit ϵ ≥ 0 30% censoring.

ϵ n θ^ (θ = 0) σ^ (σ = 1) ϵ^
 25 −0.223 ± 1.01 0.997 ± 0.27 0.049 ± 0.71
 50 −0.187 ± 0.80 1.042 ± 0.20 0.065 ± 0.52
0 100 −0.155 ± 0.57 1.058 ± 0.14 0.071 ± 0.36
200 −0.154 ± 0.37 1.064 ± 0.10 0.085 ± 0.23
500 −0.164 ± 0.21 1.071 ± 0.06 0.099 ± 0.13
 25 −0.205 ± 0.88 1.033 ± 0.25 0.336 ± 0.64
 50 −0.217 ± 0.69 1.080 ± 0.18 0.358 ± 0.45
0.25 100 −0.208 ± 0.45 1.102 ± 0.12 0.365 ± 0.28
200 −0.203 ± 0.27 1.108 ± 0.08 0.368 ± 0.16
500 −0.198 ± 0.16 1.111 ± 0.05 0.367 ± 0.09
 25 −0.180 ± 0.68 1.064 ± 0.22 0.623 ± 0.50
 50 −0.228 ± 0.50 1.114 ± 0.15 0.638 ± 0.33
0.5 100 −0.211 ± 0.33 1.129 ± 0.11 0.623 ± 0.20
200 −0.188 ± 0.21 1.136 ± 0.07 0.610 ± 0.12
500 −0.169 ± 0.12 1.137 ± 0.04 0.598 ± 0.07
 25 −0.063 ± 0.45 1.086 ± 0.19 0.842 ± 0.33
 50 −0.140 ± 0.30 1.129 ± 0.13 0.860 ± 0.19
0.75 100 −0.148 ± 0.21 1.148 ± 0.09 0.847 ± 0.13
200 −0.130 ± 0.15 1.155 ± 0.06 0.827 ± 0.08
500 −0.106 ± 0.09 1.157 ± 0.04 0.811 ± 0.05

Acknowledgments

The authors would like to thank Drs. Chris Andrews, Lili Tian, and Greg Wilding for reading the manuscript and providing helpful comments. We also would like to thank the reviewers for their careful comments, which led to a much improved revised version of this work.

Funding

This work was supported by Roswell Park Cancer Institute and National Cancer Institute (NCI) grant P30CA016056, NRG Oncology Statistical and Data Management Center grant U10CA180822 and IOTN Moonshot grant U24CA232979–01.

Appendix

Here we briefly outline the principal results and properties of the ESN distribution developed in Mudholkar and Hutson (2000) which are basic to the developments in this paper. The pdf, cdf, and qf of its canonical form ESN(0, 1, ϵ) are respectively:

f_0(x)=\begin{cases} \dfrac{1}{\sqrt{2\pi}}\exp\!\left(-\dfrac{x^2}{2(1-\epsilon)^2}\right) & \text{if } x<0 \\[8pt] \dfrac{1}{\sqrt{2\pi}}\exp\!\left(-\dfrac{x^2}{2(1+\epsilon)^2}\right) & \text{if } x\ge 0 \end{cases} \qquad (A.1)

F_0(x)=\begin{cases} (1-\epsilon)\,\Phi\!\left(\dfrac{x}{1-\epsilon}\right) & \text{if } x<0 \\[8pt] -\epsilon+(1+\epsilon)\,\Phi\!\left(\dfrac{x}{1+\epsilon}\right) & \text{if } x\ge 0 \end{cases} \qquad (A.2)

and

Q_0(u)=F_0^{-1}(u)=\begin{cases} (1-\epsilon)\,\Phi^{-1}\!\left(\dfrac{u}{1-\epsilon}\right) & \text{if } 0<u<(1-\epsilon)/2 \\[8pt] (1+\epsilon)\,\Phi^{-1}\!\left(\dfrac{u+\epsilon}{1+\epsilon}\right) & \text{if } (1-\epsilon)/2\le u<1 \end{cases} \qquad (A.3)

where −1 < ϵ < 1 and Φ(x) denotes the standard normal c.d.f.

The general form for the p.d.f. of ESN(μ, σ, ϵ) is f_0((x−μ)/σ)/σ, where f_0(·) is given by Equation (A.1). Similarly, the general form for the c.d.f. of ESN(μ, σ, ϵ) is F_0((x−μ)/σ), where F_0(·) is given by Equation (A.2). Its quantile function, Q(u) = μ + σQ_0(u), can be used to generate samples from the ESN(μ, σ, ϵ) population. Note the relationship Q((1−ϵ)/2) = μ.

The mean of ESN(μ, σ, ϵ) is given by

E(X)=\mu+\frac{4\sigma\epsilon}{\sqrt{2\pi}} \qquad (A.4)

and the variance is given by

Var(X)=\frac{\sigma^2}{\pi}\left[(3\pi-8)\epsilon^2+\pi\right] \qquad (A.5)

The moment generating function of the standard epsilon-skew-normal distribution ESN(0, 1, ϵ) is

M(t)=(1-\epsilon)e^{(1-\epsilon)^2 t^2/2}\,\Phi\!\left[-(1-\epsilon)t\right]+(1+\epsilon)e^{(1+\epsilon)^2 t^2/2}\,\Phi\!\left[(1+\epsilon)t\right] \qquad (A.6)

If X ~ ESN(μ, σ, ϵ) then M_X(t) = e^{\mu t}M(\sigma t).
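The closed forms (A.4) and (A.5) can be checked against a direct numerical integration of the ESN density; a sketch (function names ours):

```python
from math import exp, pi, sqrt

def esn_pdf(x, mu=0.0, sigma=1.0, eps=0.0):
    """ESN(mu, sigma, eps) density f0((x - mu)/sigma)/sigma, with f0 from (A.1)."""
    z = (x - mu) / sigma
    w = (1.0 - eps) if z < 0.0 else (1.0 + eps)
    return exp(-z * z / (2.0 * w * w)) / (sigma * sqrt(2.0 * pi))

def esn_mean_var_numeric(mu, sigma, eps, lo=-15.0, hi=15.0, steps=200001):
    """Riemann-sum mean and variance, for comparison with (A.4) and (A.5)."""
    h = (hi - lo) / (steps - 1)
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        x = lo + i * h
        fh = esn_pdf(x, mu, sigma, eps) * h
        m0 += fh
        m1 += x * fh
        m2 += x * x * fh
    mean = m1 / m0
    return mean, m2 / m0 - mean * mean
```

The numerical mean and variance agree with (A.4) and (A.5) to within the discretization error of the grid.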

References

  1. Azzalini A. 1985. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12:171–8.
  2. Azzalini A. 1986. Further results on a class of distributions which includes the normal ones. Statistica 46:199–208.
  3. Azzalini A, and Kotz S. 2002. Log-skew-normal and log-skew-t distributions as models for family income data. Journal of Income Distribution 11:12–20.
  4. Benning JL, and Barnes DL. 2009. The effects of scale and spatial heterogeneities on diffusion in volcanic breccias and basalts: Amchitka Island, Alaska. Journal of Contaminant Hydrology 106 (3–4):150–65. doi: 10.1016/j.jconhyd.2009.02.005.
  5. Chai HS, and Bailey KR. 2008. Use of log-skew-normal distribution in analysis of continuous data with a discrete component at zero. Statistics in Medicine 27 (18):3643–55. doi: 10.1002/sim.3210.
  6. Cobb BR, Rumí R, and Salmerón A. 2013. Inventory management with log-normal demand per unit time. Computers & Operations Research 40:1842–51. doi: 10.1016/j.cor.2013.01.017.
  7. Collett D. 2003. Modelling survival data in medical research. 2nd ed. Boca Raton, FL: Chapman & Hall.
  8. Doerr C, Blenn N, and Van Mieghem P. 2013. Lognormal infection times of online information spread. PLoS One 8 (5):e64349. doi: 10.1371/journal.pone.0064349.
  9. Domma F, Popović BV, and Nadarajah S. 2015. An extension of Azzalini’s method. Journal of Computational and Applied Mathematics 278:37–47. doi: 10.1016/j.cam.2014.09.016.
  10. Eckhard L, Stahel WA, and Abbt M. 2001. Log-normal distributions across the sciences: keys and clues. BioScience 51:341–52.
  11. Feng C, Wang H, Lu N, and Tu XM. 2013. Log transformation: application and interpretation in biomedical research. Statistics in Medicine 32 (2):230–9. doi: 10.1002/sim.5486.
  12. Finney DJ. 1941. On the distribution of a variate whose logarithm is normally distributed. Supplement to the Journal of the Royal Statistical Society 7 (2):155–61. doi: 10.2307/2983663.
  13. Fleming T, and Harrington D. 1991. Counting processes and survival analysis. New York: John Wiley & Sons.
  14. Galton F. 1879. The geometric mean, in vital and social statistics. Proceedings of the Royal Society of London 29:365–7.
  15. Gandhi P. 2009. The flux-dependent RMS variability of x-ray binaries in the optical. The Astrophysical Journal 697 (2):L167–72. doi: 10.1088/0004-637X/697/2/L167.
  16. Heyde CC. 1963. On a property of the lognormal distribution. Journal of the Royal Statistical Society, Series B (Methodological) 25 (2):392–3. doi: 10.1111/j.2517-6161.1963.tb00521.x.
  17. Hutson AD. 2004. Utilizing the flexibility of the epsilon-skew-normal distribution for common regression problems. Journal of Applied Statistics 31 (6):673–83. doi: 10.1080/1478881042000214659.
  18. Kalbfleisch JD, and Prentice RL. 1980. The statistical analysis of failure time data. New York: John Wiley & Sons.
  19. McAlister D. 1879. The law of the geometric mean. Proceedings of the Royal Society of London 29:367–76.
  20. Mitzenmacher M. 2004. A brief history of generative models for power law and lognormal distributions. Internet Mathematics 1 (2):226–51. http://projecteuclid.org/euclid.im/1089229510.
  21. Mudholkar GS, and Hutson AD. 2000. The epsilon-skew-normal distribution for analyzing near-normal data. Journal of Statistical Planning and Inference 83 (2):291–309. doi: 10.1016/S0378-3758(99)00096-8.
  22. Neti P, and Howell RW. 2008. Lognormal distribution of cellular uptake of radioactivity: statistical analysis of α-particle track autoradiography. The Journal of Nuclear Medicine 49 (6):1009–16. doi: 10.2967/jnumed.107.048843.
  23. Rao CR. 1973. Linear statistical inference and its applications. 2nd ed. New York: Wiley.
  24. Ristić MM, Popović BV, Zografos K, and Balakrishnan N. 2018. Discrimination among bivariate beta-generated distributions. Statistics 52:303–20. doi: 10.1080/02331888.2017.1397156.
  25. Romano JP, and Siegel AF. 1986. Counterexamples in probability and statistics. Boca Raton: CRC Press.
  26. Smith RL. 1985. Maximum likelihood estimation in a class of nonregular cases. Biometrika 72 (1):67–90. doi: 10.1093/biomet/72.1.67.
  27. Sweet AL. 1990. On the hazard rate of the lognormal distribution. IEEE Transactions on Reliability 39 (3):325–8. doi: 10.1109/24.103012.
