Published by De Gruyter, February 10, 2018

Testing for a Functional Form of Mean Regression in a Fully Parametric Environment

  • Stanislav Anatolyev

Abstract

We develop a test for a restricted functional form of a mean regression when a complex distributional model for all variables is estimated. The test statistic is an average squared deviation from the estimated hypothesized function of the form implied by the estimated parametric model, and is asymptotically distributed as a mixture of χ2 distributions. The test is easy to implement using numerical derivatives, and it performs well in samples of typical size. We illustrate the test using data on labor market characteristics of US young men.

JEL Classification: C12; C21

Acknowledgement

I am grateful to the Co-Editor and two anonymous referees for useful suggestions that significantly improved the presentation. I also thank Nikolay Kudrin for excellent research assistance.

A Appendix

Proofs

Proof of Lemma 1

Consistency and asymptotic normality of $\hat{\vartheta}$ follow from Newey and McFadden (1994, theorems 2.5, 2.6, 3.3 and 3.4) using Assumption 1 and Assumption 2.      □

Proof of Lemma 2

Note that

$$
\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}\right\|^2\right]
=\mathrm{E}\left[\left\|\int_{-\infty}^{+\infty}u\,\frac{\partial f(u|v,\theta_0)}{\partial\theta}\,du\right\|^2\right]
=\mathrm{E}\left[\left\|\mathrm{E}\left[u\,\frac{\partial\ln f(u|v,\theta_0)}{\partial\theta}\,\Big|\,v\right]\right\|^2\right]<\infty,
$$

which follows from Assumption 3. Next,

$$
\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}\,\frac{\partial\psi(v,\beta_0)}{\partial\vartheta'}\right\|\right]
\le\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}\right\|\,\left\|\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right\|\right]
\le\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}\right\|^2\right]^{1/2}\mathrm{E}\left[\left\|\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right\|^2\right]^{1/2}<\infty,
$$

which follows from the previous display and Assumption 2(f). Finally, $M_{\psi\psi}$ is finite by Assumption 2(f). This shows finiteness of $\Delta$.

Now,

$$
\mathrm{E}\left[\sup_{\theta\in\mathcal{N}_\theta,\,\beta\in\mathcal{N}_\beta}\left\|\left(\frac{\partial\mathrm{E}(u|v_i,\theta)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta)}{\partial\vartheta}\right)\left(\frac{\partial\mathrm{E}(u|v_i,\theta)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta)}{\partial\vartheta}\right)'\right\|\right]
\le\mathrm{E}\left[\sup_{\theta\in\mathcal{N}_\theta,\,\beta\in\mathcal{N}_\beta}\left(\left\|\frac{\partial\mathrm{E}(u|v_i,\theta)}{\partial\vartheta}\right\|+\left\|\frac{\partial\psi(v_i,\beta)}{\partial\vartheta}\right\|\right)^2\right]
\le 2\,\mathrm{E}\left[\sup_{\theta\in\mathcal{N}_\theta}\left\|\mathrm{E}\left[u\,\frac{\partial\ln f(u|v_i,\theta)}{\partial\theta}\,\Big|\,v_i\right]\right\|^2\right]+2\,\mathrm{E}\left[\sup_{\beta\in\mathcal{N}_\beta}\left\|\frac{\partial\psi(v_i,\beta)}{\partial\beta}\right\|^2\right]<\infty
$$

by Assumption 2(d) and Assumption 3. Then, by Lemma 4.3 of Newey and McFadden (1994), $\hat{\Delta}\xrightarrow{p}\Delta$.      □

Proof of Theorem 1

Take a second-order stochastic expansion of $n\hat{D}$ around the true parameter value $\vartheta_0$:

$$
n\hat{D}=\sum_{i=1}^{n}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)^2
+\sqrt{n}\,\left.\frac{\partial\hat{D}}{\partial\vartheta'}\right|_{\vartheta_0}\hat{\zeta}_\vartheta
+\hat{\zeta}_\vartheta'\,\frac{1}{2}\left.\frac{\partial^2\hat{D}}{\partial\vartheta\,\partial\vartheta'}\right|_{\vartheta_0}\hat{\zeta}_\vartheta
+O_P\!\left(\frac{1}{\sqrt{n}}\right),
$$

where

$$
\hat{\zeta}_\vartheta=\sqrt{n}\,(\hat{\vartheta}-\vartheta_0)\xrightarrow{d}\zeta_\vartheta\overset{d}{=}N(0,V_\vartheta),
$$

and $V_\vartheta=H^{-1}\Omega H^{-1}$ is the asymptotic variance matrix of $\hat{\vartheta}$. Under $H_0$, the leading term is zero. Next, under $H_0$,

$$
\left.\frac{\partial\hat{D}}{\partial\vartheta}\right|_{\vartheta_0}
=\frac{1}{n}\sum_{i=1}^{n}\frac{\partial}{\partial\vartheta}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)^2
=2\,\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)\frac{\partial\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)}{\partial\vartheta}=0.
$$

Finally,

$$
\frac{1}{2}\left.\frac{\partial^2\hat{D}}{\partial\vartheta\,\partial\vartheta'}\right|_{\vartheta_0}
=\frac{1}{2}\,\frac{1}{n}\sum_{i=1}^{n}\frac{\partial^2}{\partial\vartheta\,\partial\vartheta'}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)^2
=\frac{1}{n}\sum_{i=1}^{n}\frac{\partial}{\partial\vartheta'}\left[\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)\right]
=\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)'
+\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)\frac{\partial}{\partial\vartheta'}\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)
\overset{H_0}{=}\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)\left(\frac{\partial\mathrm{E}(u|v_i,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v_i,\beta_0)}{\partial\vartheta}\right)'
=\Delta+o_P(1)
$$

by the law of large numbers (see the proof of Lemma 2) and because

$$
\mathrm{E}\left[\left\|\left(\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right)\left(\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right)'\right\|\right]
\le\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}-\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right\|^2\right]
\le 2\,\mathrm{E}\left[\left\|\frac{\partial\mathrm{E}(u|v,\theta_0)}{\partial\vartheta}\right\|^2+\left\|\frac{\partial\psi(v,\beta_0)}{\partial\vartheta}\right\|^2\right]<\infty.
$$

Summarizing, we have that under H0,

$$
n\hat{D}=\zeta_\vartheta'\,\Delta\,\zeta_\vartheta+o_P(1).
$$

Now, using Lemma 3.2 from Vuong (1989), we get that

$$
n\hat{D}\xrightarrow{d}\sum_{j=1}^{\dim(\vartheta)}\lambda_j\zeta_j^2,
$$

where $\{\lambda_j\}_{j=1}^{\dim(\vartheta)}$ are the eigenvalues of $V_\vartheta\Delta=\Lambda$, and $\{\zeta_j^2\}_{j=1}^{\dim(\vartheta)}$ are IID $\chi^2_{(1)}$.      □
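
For readers who want to use Theorem 1 in practice, the following minimal Python sketch (not part of the original paper) approximates a p-value for $n\hat{D}$ by simulating the weighted $\chi^2_{(1)}$ mixture with weights equal to the eigenvalues of $\hat{V}_\vartheta\hat{\Delta}$; the function name and the way the estimated matrices are supplied are hypothetical.

```python
import numpy as np

def mixture_chi2_pvalue(test_stat, V_hat, Delta_hat, draws=100_000, seed=0):
    """Approximate the p-value of n*D_hat against the weighted chi-squared
    mixture sum_j lambda_j * chi2_j(1), where {lambda_j} are the eigenvalues
    of V_hat @ Delta_hat (Theorem 1).  Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    lambdas = np.real(np.linalg.eigvals(V_hat @ Delta_hat))   # eigenvalues of V*Delta
    z = rng.standard_normal((draws, lambdas.size))            # iid N(0,1) draws
    mixture = (z**2) @ lambdas                                 # draws of sum_j lambda_j * chi2(1)
    return np.mean(mixture >= test_stat)                      # simulated tail probability
```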

Proof of Theorem 2

It follows from the proof of Theorem 1 that

$$
n\hat{D}=\sum_{i=1}^{n}\left(\mathrm{E}(u|v_i,\theta_0)-\psi(v_i,\beta_0)\right)^2+O_P\!\left(\sqrt{n}\right).
$$

Because $\mathrm{E}(u|v,\theta_0)\ne\psi(v,\beta_0)$ almost surely, the leading sum of squares is positive by construction and grows at rate $n$, so $n\hat{D}$ tends to $+\infty$ as $n\to\infty$.      □

B Appendix

Details on Simulation Experiments

Consider the setup of the first experiment. Because $\mathrm{E}(u|v)=\mu_u+\rho(v-\mu_v)$ and $\psi(v,\beta)=a+bv$, we compute that

$$
\frac{\partial\mathrm{E}(u|v)}{\partial\vartheta}-\frac{\partial\psi(v,\beta)}{\partial\vartheta}
=\begin{bmatrix}1\\-\rho\\v-\mu_v\\-1\\-v\end{bmatrix}.
$$

Note that there are only two non-collinear elements. Hence,

$$
\Delta=\begin{bmatrix}
1 & -\rho & 0 & -1 & -\mu_v\\
-\rho & \rho^2 & 0 & \rho & \rho\mu_v\\
0 & 0 & 1 & 0 & -1\\
-1 & \rho & 0 & 1 & \mu_v\\
-\mu_v & \rho\mu_v & -1 & \mu_v & 1+\mu_v^2
\end{bmatrix},
$$

which, as expected, has rank 2.

The log-density is

$$
\ln f(u,v|\theta)=-\ln 2\pi-\frac{1}{2}\ln(1-\rho^2)-\frac{(u-\mu_u)^2-2\rho(u-\mu_u)(v-\mu_v)+(v-\mu_v)^2}{2(1-\rho^2)},
$$

and its derivatives are

$$
\frac{\partial\ln f(u,v|\theta)}{\partial\theta}
=\begin{bmatrix}
\dfrac{1}{1-\rho^2}\bigl((u-\mu_u)-\rho(v-\mu_v)\bigr)\\[2mm]
\dfrac{1}{1-\rho^2}\bigl((v-\mu_v)-\rho(u-\mu_u)\bigr)\\[2mm]
\dfrac{\rho}{1-\rho^2}-\dfrac{\rho}{(1-\rho^2)^2}\bigl((u-\mu_u)^2+(v-\mu_v)^2\bigr)+\dfrac{1+\rho^2}{(1-\rho^2)^2}(u-\mu_u)(v-\mu_v)
\end{bmatrix}.
$$

Then

$$
\mathrm{E}\left[-\frac{\partial^2\ln f(u,v|\theta)}{\partial\theta\,\partial\theta'}\right]
=\begin{bmatrix}
\dfrac{1}{1-\rho^2} & -\dfrac{\rho}{1-\rho^2} & 0\\[2mm]
-\dfrac{\rho}{1-\rho^2} & \dfrac{1}{1-\rho^2} & 0\\[2mm]
0 & 0 & \dfrac{1+\rho^2}{(1-\rho^2)^2}
\end{bmatrix}.
$$
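
As a quick sanity check on the score and information expressions above (our own illustrative sketch, not from the paper), one can compare the analytic score with central finite differences of the log-density; the parameter and data values below are arbitrary.

```python
import numpy as np

def log_f(u, v, mu_u, mu_v, rho):
    """Bivariate normal log-density with unit variances, as displayed above."""
    q = (u - mu_u)**2 - 2*rho*(u - mu_u)*(v - mu_v) + (v - mu_v)**2
    return -np.log(2*np.pi) - 0.5*np.log(1 - rho**2) - q / (2*(1 - rho**2))

def analytic_score(u, v, mu_u, mu_v, rho):
    """Analytic derivatives of the log-density with respect to (mu_u, mu_v, rho)."""
    du, dv = u - mu_u, v - mu_v
    s_mu_u = (du - rho*dv) / (1 - rho**2)
    s_mu_v = (dv - rho*du) / (1 - rho**2)
    s_rho = (rho/(1 - rho**2) - rho*(du**2 + dv**2)/(1 - rho**2)**2
             + (1 + rho**2)*du*dv/(1 - rho**2)**2)
    return np.array([s_mu_u, s_mu_v, s_rho])

# central finite differences reproduce the analytic score at an arbitrary point
theta = np.array([0.1, -0.2, 0.5]); u, v = 0.7, 1.3; h = 1e-6
num = np.array([(log_f(u, v, *(theta + h*e)) - log_f(u, v, *(theta - h*e))) / (2*h)
                for e in np.eye(3)])
print(np.allclose(num, analytic_score(u, v, *theta), atol=1e-6))
```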

The derivatives of the hypothesized regression function are

$$
\frac{\partial\psi(v,\beta)}{\partial\beta}=\begin{pmatrix}1\\v\end{pmatrix},
$$

and hence

$$
\mathrm{E}\left[\frac{\partial\psi(v,\beta)}{\partial\beta}\frac{\partial\psi(v,\beta)}{\partial\beta'}\right]
=\begin{bmatrix}1 & \mu_v\\ \mu_v & 1+\mu_v^2\end{bmatrix}.
$$

So, the (minus) inverted Hessian is

$$
H^{-1}=\begin{bmatrix}
1 & \rho & 0 & 0 & 0\\
\rho & 1 & 0 & 0 & 0\\
0 & 0 & \dfrac{(1-\rho^2)^2}{1+\rho^2} & 0 & 0\\
0 & 0 & 0 & 1+\mu_v^2 & -\mu_v\\
0 & 0 & 0 & -\mu_v & 1
\end{bmatrix}.
$$

Next we compute

$$
\mathrm{E}\left[\frac{\partial\ln f(u,v|\theta)}{\partial\theta}\frac{\partial\ln f(u,v|\theta)}{\partial\theta'}\right]
=\begin{bmatrix}
\dfrac{1}{1-\rho^2} & -\dfrac{\rho}{1-\rho^2} & 0\\[2mm]
-\dfrac{\rho}{1-\rho^2} & \dfrac{1}{1-\rho^2} & 0\\[2mm]
0 & 0 & \dfrac{1+\rho^2}{(1-\rho^2)^2}
\end{bmatrix},\qquad
\mathrm{E}\left[(u-\psi(v,\beta))^2\,\frac{\partial\psi(v,\beta)}{\partial\beta}\frac{\partial\psi(v,\beta)}{\partial\beta'}\right]
=(1-\rho^2)\begin{bmatrix}1 & \mu_v\\ \mu_v & 1+\mu_v^2\end{bmatrix},\qquad
\mathrm{E}\left[(u-\psi(v,\beta))\,\frac{\partial\ln f(u,v|\theta)}{\partial\theta}\frac{\partial\psi(v,\beta)}{\partial\beta'}\right]
=\begin{bmatrix}1 & \mu_v\\ -\rho & -\rho\mu_v\\ 0 & 1\end{bmatrix}.
$$

Hence, the matrix of expected cross-products of the elements of the score vector is

$$
\Omega=\begin{bmatrix}
\dfrac{1}{1-\rho^2} & -\dfrac{\rho}{1-\rho^2} & 0 & 1 & \mu_v\\[2mm]
-\dfrac{\rho}{1-\rho^2} & \dfrac{1}{1-\rho^2} & 0 & -\rho & -\rho\mu_v\\[2mm]
0 & 0 & \dfrac{1+\rho^2}{(1-\rho^2)^2} & 0 & 1\\[2mm]
1 & -\rho & 0 & 1-\rho^2 & (1-\rho^2)\mu_v\\[2mm]
\mu_v & -\rho\mu_v & 1 & (1-\rho^2)\mu_v & (1-\rho^2)(1+\mu_v^2)
\end{bmatrix}.
$$

Then the asymptotic variance matrix is

$$
V_\vartheta=\begin{bmatrix}
1 & \rho & 0 & 1-\rho^2 & 0\\[1mm]
\rho & 1 & 0 & 0 & 0\\[1mm]
0 & 0 & \dfrac{(1-\rho^2)^2}{1+\rho^2} & -\dfrac{\mu_v(1-\rho^2)^2}{1+\rho^2} & \dfrac{(1-\rho^2)^2}{1+\rho^2}\\[2mm]
1-\rho^2 & 0 & -\dfrac{\mu_v(1-\rho^2)^2}{1+\rho^2} & (1-\rho^2)(1+\mu_v^2) & -\mu_v(1-\rho^2)\\[2mm]
0 & 0 & \dfrac{(1-\rho^2)^2}{1+\rho^2} & -\mu_v(1-\rho^2) & 1-\rho^2
\end{bmatrix},
$$

and, consequently,

$$
V_\vartheta\Delta=\frac{2\rho^2(1-\rho^2)}{1+\rho^2}
\begin{bmatrix}
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & \mu_v & 0 & -\mu_v\\
0 & 0 & -1 & 0 & 1
\end{bmatrix}.
$$
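
The derivation above can be verified numerically; the following sketch (illustrative only, with an arbitrary choice of $\rho$ and $\mu_v$) codes the matrices $\Delta$, $H^{-1}$ and $\Omega$ exactly as displayed and checks that $H^{-1}\Omega H^{-1}\Delta$ reproduces the closed form and that $\Delta$ has rank 2.

```python
import numpy as np

rho, mu_v = 0.5, 1.0   # example parameter values (hypothetical choice)

# Delta = E[(dE/dvartheta - dpsi/dvartheta)(dE/dvartheta - dpsi/dvartheta)'] as displayed above
Delta = np.array([
    [1,     -rho,       0, -1,    -mu_v       ],
    [-rho,   rho**2,    0,  rho,   rho*mu_v   ],
    [0,      0,         1,  0,    -1          ],
    [-1,     rho,       0,  1,     mu_v       ],
    [-mu_v,  rho*mu_v, -1,  mu_v,  1 + mu_v**2],
])

c = (1 - rho**2)**2 / (1 + rho**2)
H_inv = np.block([
    [np.array([[1, rho, 0], [rho, 1, 0], [0, 0, c]]), np.zeros((3, 2))],
    [np.zeros((2, 3)), np.array([[1 + mu_v**2, -mu_v], [-mu_v, 1]])],
])

a, b, d = 1/(1 - rho**2), rho/(1 - rho**2), (1 + rho**2)/(1 - rho**2)**2
Omega = np.block([
    [np.array([[a, -b, 0], [-b, a, 0], [0, 0, d]]),
     np.array([[1, mu_v], [-rho, -rho*mu_v], [0, 1]])],
    [np.array([[1, -rho, 0], [mu_v, -rho*mu_v, 1]]),
     (1 - rho**2) * np.array([[1, mu_v], [mu_v, 1 + mu_v**2]])],
])

V = H_inv @ Omega @ H_inv                      # asymptotic variance of vartheta-hat
closed_form = (2*rho**2*(1 - rho**2)/(1 + rho**2)) * np.array([
    [0, 0, 0,    0, 0    ],
    [0, 0, 0,    0, 0    ],
    [0, 0, 0,    0, 0    ],
    [0, 0, mu_v, 0, -mu_v],
    [0, 0, -1,   0, 1    ],
])
print(np.allclose(V @ Delta, closed_form))     # True: matches the derivation
print(np.linalg.matrix_rank(Delta))            # 2, as noted above
```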

For the second experiment, we extend the method of Anatolyev and Gospodinov (2010) for constructing a joint distribution from mixed discrete and continuous marginals to the case where the discrete marginal's support has cardinality greater than two. The joint CDF/CMF is

$$
F(u,v)=C\bigl(F_u(u),G(v)\bigr),
$$

so the PDF/PMF is a derivative with respect to the continuous argument and a difference with respect to the discrete one:

$$
f(u,v)=C_u\bigl(F_u(u),G(v)\bigr)-C_u\bigl(F_u(u),G(v-1)\bigr)=f_u(u)\,f^{C}(u,v),
$$

where the second term is

$$
f^{C}(u,v)=\Bigl[C_w\bigl(w,G(v)\bigr)-C_w\bigl(w,G(v-1)\bigr)\Bigr]_{w=F_u(u)},
$$

or

$$
f^{C}(u,-1)=\bigl[C_w(w,q_{-1})\bigr]_{w=F_u(u)},\qquad
f^{C}(u,0)=\bigl[C_w(w,1-q_{+1})-C_w(w,q_{-1})\bigr]_{w=F_u(u)},\qquad
f^{C}(u,+1)=1-\bigl[C_w(w,1-q_{+1})\bigr]_{w=F_u(u)}.
$$

For the FGM copula,

$$
C_{w_1}(w_1,w_2)=w_2+\rho(1-2w_1)\,w_2(1-w_2),
$$

which, evaluated at $w_1=z$, implies the distorted probabilities

$$
q_{-1}^{C}(z)=q_{-1}+\rho(1-2z)\,q_{-1}(1-q_{-1}),\qquad
q_{0}^{C}(z)=1-q_{-1}-q_{+1}+\rho(1-2z)\bigl[q_{+1}(1-q_{+1})-q_{-1}(1-q_{-1})\bigr],\qquad
q_{+1}^{C}(z)=q_{+1}-\rho(1-2z)\,q_{+1}(1-q_{+1}).
$$

The joint density/mass is

$$
f(u,v)=f_u(u)\,q_{-1}^{C}\bigl(F_u(u)\bigr)^{\mathbb{1}\{v=-1\}}\,q_{0}^{C}\bigl(F_u(u)\bigr)^{\mathbb{1}\{v=0\}}\,q_{+1}^{C}\bigl(F_u(u)\bigr)^{\mathbb{1}\{v=+1\}},
$$

and the result follows.
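
A compact way to see the construction is to code it directly. The sketch below is not from the paper; the standard normal marginal for $u$ and the particular probability values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def fgm_joint_density(u, v, q_m1, q_p1, rho, F=norm.cdf, f=norm.pdf):
    """Joint density/mass f(u, v) for continuous u and v in {-1, 0, +1},
    built from the FGM copula as above.  The standard normal marginal for u
    is an illustrative assumption."""
    z = F(u)                                   # z = F_u(u)
    tilt = rho * (1 - 2*z)                     # FGM distortion factor
    qc = {
        -1: q_m1 + tilt * q_m1 * (1 - q_m1),
         0: 1 - q_m1 - q_p1 + tilt * (q_p1*(1 - q_p1) - q_m1*(1 - q_m1)),
        +1: q_p1 - tilt * q_p1 * (1 - q_p1),
    }
    return f(u) * qc[v]

# example: evaluate the joint density/mass at (u, v) = (0.3, -1)
print(fgm_joint_density(0.3, -1, q_m1=0.25, q_p1=0.35, rho=0.4))
```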

C Appendix

Details on Empirical Illustration

We omit the parameters during the derivations. In the case of only one discrete component, the joint PDF/PMF is

$$
f(u,v)=C_u\bigl(F_u(u),G_v(v)\bigr)-C_u\bigl(F_u(u),G_v(v-1)\bigr)=f_u(u)\,f^{C}(u,v),
$$

where the last term is

$$
f^{C}(u,v)=\Bigl[C_w\bigl(w,G_v(v)\bigr)-C_w\bigl(w,G_v(v-1)\bigr)\Bigr]_{w=F_u(u)}.
$$

The Gaussian copula is $C(w,y)=\Phi_2\bigl(\Phi^{-1}(w),\Phi^{-1}(y)\bigr)$, where $\Phi_2$ is the CDF of the standard bivariate normal and $\Phi^{-1}$ is the inverse of the standard normal CDF. Note the important property:

$$
\frac{\partial\Phi_2(x_1,x_2)}{\partial x_1}
=\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\int_{-\infty}^{x_2}\phi_2(t_1,t_2)\,dt_2\,dt_1
=\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\int_{-\infty}^{x_2}\phi(t_2|t_1)\,\phi(t_1)\,dt_2\,dt_1
=\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\phi(t_1)\left(\int_{-\infty}^{x_2}\phi(t_2|t_1)\,dt_2\right)dt_1
=\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\phi(t_1)\,\Phi(x_2|t_1)\,dt_1
=\phi(x_1)\,\Phi(x_2|x_1).
$$

This leads to

$$
\frac{\partial C(w,y)}{\partial w}
=\frac{\partial\Phi_2\bigl(\Phi^{-1}(w),\Phi^{-1}(y)\bigr)}{\partial w}
=\left.\frac{\partial\Phi_2(x_1,x_2)}{\partial x_1}\right|_{x_1=\Phi^{-1}(w),\,x_2=\Phi^{-1}(y)}\frac{\partial\Phi^{-1}(w)}{\partial w}
=\left.\phi(x_1)\,\Phi(x_2|x_1)\right|_{x_1=\Phi^{-1}(w),\,x_2=\Phi^{-1}(y)}\left.\frac{1}{\phi(x_1)}\right|_{x_1=\Phi^{-1}(w)}
=\Phi\bigl(\Phi^{-1}(y)\,\big|\,\Phi^{-1}(w)\bigr).
$$

Then,

$$
f^{C}(u,v)=\Phi\bigl(\Phi^{-1}(G_v(v))\,\big|\,\Phi^{-1}(F_u(u))\bigr)-\Phi\bigl(\Phi^{-1}(G_v(v-1))\,\big|\,\Phi^{-1}(F_u(u))\bigr).
$$

Note that because $\Phi_2$ is the CDF of the standard bivariate normal with correlation coefficient $\varrho$, we have, by normality of conditional distributions under joint normality, that

$$
\Phi\bigl(\Phi^{-1}(y)\,\big|\,\Phi^{-1}(w)\bigr)=\Phi\!\left(\frac{\Phi^{-1}(y)-\varrho\,\Phi^{-1}(w)}{\sqrt{1-\varrho^2}}\right),
$$

and hence

$$
f^{C}(u,v)=\Phi\!\left(\frac{\Phi^{-1}(G_v(v))-\varrho\,\Phi^{-1}(F_u(u))}{\sqrt{1-\varrho^2}}\right)-\Phi\!\left(\frac{\Phi^{-1}(G_v(v-1))-\varrho\,\Phi^{-1}(F_u(u))}{\sqrt{1-\varrho^2}}\right).
$$
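
For concreteness, here is a minimal sketch of how $f^{C}(u,v)$ might be evaluated with a Gaussian copula and one discrete component; the function name, the binary support of $v$, and the standard normal marginal for $u$ are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm

def f_C_one_discrete(u, v, G, varrho, F=norm.cdf):
    """Copula factor f^C(u, v) for a Gaussian copula with one discrete component,
    using the conditional-normal form derived above.  G maps the discrete support
    to the marginal CDF G_v; the standard normal F_u is an assumption."""
    x = norm.ppf(F(u))                          # Phi^{-1}(F_u(u))
    def cond_cdf(p):                            # Phi(Phi^{-1}(p) | x)
        if p <= 0.0:
            return 0.0
        if p >= 1.0:
            return 1.0
        return norm.cdf((norm.ppf(p) - varrho * x) / np.sqrt(1 - varrho**2))
    return cond_cdf(G(v)) - cond_cdf(G(v - 1))

# example: binary v with P(v=1)=0.6, so G(-1)=0, G(0)=0.4, G(1)=1
G = lambda v: {-1: 0.0, 0: 0.4, 1: 1.0}[v]
print(f_C_one_discrete(0.2, 1, G, varrho=0.3))
```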

In the case of two discrete components, the joint PDF/PMF is

$$
f(u,v_1,v_2)=C_u\bigl(F_u(u),G_1(v_1),G_2(v_2)\bigr)-C_u\bigl(F_u(u),G_1(v_1-1),G_2(v_2)\bigr)-C_u\bigl(F_u(u),G_1(v_1),G_2(v_2-1)\bigr)+C_u\bigl(F_u(u),G_1(v_1-1),G_2(v_2-1)\bigr)=f_u(u)\,f^{C}(u,v_1,v_2),
$$

where the last term is

$$
f^{C}(u,v_1,v_2)=\Bigl[C_w\bigl(w,G_1(v_1),G_2(v_2)\bigr)-C_w\bigl(w,G_1(v_1-1),G_2(v_2)\bigr)-C_w\bigl(w,G_1(v_1),G_2(v_2-1)\bigr)+C_w\bigl(w,G_1(v_1-1),G_2(v_2-1)\bigr)\Bigr]_{w=F_u(u)}.
$$

Consider the three-dimensional Gaussian copula

$$
C(w,y_1,y_2)=\Phi_3\bigl(\Phi^{-1}(w),\Phi^{-1}(y_1),\Phi^{-1}(y_2)\bigr).
$$

Note the property

$$
\frac{\partial\Phi_3(x_1,x_2,x_3)}{\partial x_1}
=\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\int_{-\infty}^{x_2}\int_{-\infty}^{x_3}\phi_3(t_1,t_2,t_3)\,dt_3\,dt_2\,dt_1
=\int_{-\infty}^{x_2}\int_{-\infty}^{x_3}\left(\frac{\partial}{\partial x_1}\int_{-\infty}^{x_1}\phi_3(t_1,t_2,t_3)\,dt_1\right)dt_3\,dt_2
=\int_{-\infty}^{x_2}\int_{-\infty}^{x_3}\phi_3(x_1,t_2,t_3)\,dt_3\,dt_2
=\int_{-\infty}^{x_2}\int_{-\infty}^{x_3}\phi_2(t_2,t_3|x_1)\,\phi(x_1)\,dt_3\,dt_2
=\phi(x_1)\int_{-\infty}^{x_2}\int_{-\infty}^{x_3}\phi_2(t_2,t_3|x_1)\,dt_3\,dt_2
=\phi(x_1)\,\Phi_2(x_2,x_3|x_1),
$$

which leads to

$$
\frac{\partial C(w,y_1,y_2)}{\partial w}
=\frac{\partial\Phi_3\bigl(\Phi^{-1}(w),\Phi^{-1}(y_1),\Phi^{-1}(y_2)\bigr)}{\partial w}
=\left.\frac{\partial\Phi_3(x_1,x_2,x_3)}{\partial x_1}\right|_{x_1=\Phi^{-1}(w),\,x_2=\Phi^{-1}(y_1),\,x_3=\Phi^{-1}(y_2)}\frac{\partial\Phi^{-1}(w)}{\partial w}
=\left.\phi(x_1)\,\Phi_2(x_2,x_3|x_1)\right|_{x_1=\Phi^{-1}(w),\,x_2=\Phi^{-1}(y_1),\,x_3=\Phi^{-1}(y_2)}\left.\frac{1}{\phi(x_1)}\right|_{x_1=\Phi^{-1}(w)}
=\Phi_2\bigl(\Phi^{-1}(y_1),\Phi^{-1}(y_2)\,\big|\,\Phi^{-1}(w)\bigr).
$$

Then,

$$
f^{C}(u,v_1,v_2)=\Phi_2\bigl(\Phi^{-1}(G_1(v_1)),\Phi^{-1}(G_2(v_2))\,\big|\,\Phi^{-1}(F_u(u))\bigr)-\Phi_2\bigl(\Phi^{-1}(G_1(v_1-1)),\Phi^{-1}(G_2(v_2))\,\big|\,\Phi^{-1}(F_u(u))\bigr)-\Phi_2\bigl(\Phi^{-1}(G_1(v_1)),\Phi^{-1}(G_2(v_2-1))\,\big|\,\Phi^{-1}(F_u(u))\bigr)+\Phi_2\bigl(\Phi^{-1}(G_1(v_1-1)),\Phi^{-1}(G_2(v_2-1))\,\big|\,\Phi^{-1}(F_u(u))\bigr).
$$

As a computational matter, we use the fact that

$$
\begin{pmatrix}y_1\\y_2\end{pmatrix}\bigg|\,x\;\sim\;N\bigl(\mu_R\,x,\;\Omega_R\bigr),
$$

where

$$
\mu_R=\begin{pmatrix}\varrho_1\\\varrho_2\end{pmatrix},\qquad
\Omega_R=\begin{bmatrix}1-\varrho_1^2 & \varrho_0-\varrho_1\varrho_2\\ \varrho_0-\varrho_1\varrho_2 & 1-\varrho_2^2\end{bmatrix},
$$

and that

$$
\Phi_2(y_1,y_2|x)=\frac{1}{2\pi\sqrt{\det\Omega_R}}\int_{-\infty}^{y_1}\int_{-\infty}^{y_2}\exp\left(-\frac{1}{2}\left(\begin{pmatrix}z_1\\z_2\end{pmatrix}-\mu_R\,x\right)'\Omega_R^{-1}\left(\begin{pmatrix}z_1\\z_2\end{pmatrix}-\mu_R\,x\right)\right)dz_2\,dz_1.
$$
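
Computationally, $\Phi_2(y_1,y_2|x)$ can also be evaluated with a bivariate normal CDF routine rather than by coding the integral directly; a minimal sketch under assumed correlation parameters $\varrho_1$, $\varrho_2$, $\varrho_0$ (named r1, r2, r0 below) is:

```python
import numpy as np
from scipy.stats import multivariate_normal

def Phi2_conditional(y1, y2, x, r1, r2, r0):
    """Phi_2(y1, y2 | x): conditional bivariate normal CDF used above, with
    corr(x, y1)=r1, corr(x, y2)=r2, corr(y1, y2)=r0 (illustrative names)."""
    mu_R = np.array([r1, r2]) * x
    Omega_R = np.array([[1 - r1**2, r0 - r1*r2],
                        [r0 - r1*r2, 1 - r2**2]])
    return multivariate_normal(mean=mu_R, cov=Omega_R).cdf([y1, y2])

# example evaluation at arbitrary arguments
print(Phi2_conditional(0.5, -0.2, x=1.0, r1=0.3, r2=0.4, r0=0.5))
```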

References

Anatolyev, S., and N. Gospodinov. 2010. “Modeling Financial Return Dynamics via Decomposition.” Journal of Business & Economic Statistics 28: 232–245. doi:10.1198/jbes.2010.07017.

Azzalini, A. 1985. “A Class of Distributions which Includes the Normal Ones.” Scandinavian Journal of Statistics 12: 171–178.

Azzalini, A., T. Dal Cappello, and S. Kotz. 2003. “Log-Skew-Normal and Log-Skew-t Distributions as Models for Family Income Data.” Journal of Income Distribution 11 (3–4): 12–20. doi:10.25071/1874-6322.1249.

Card, D. 1995. “Using Geographic Variation in College Proximity to Estimate the Return to Schooling.” In Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp, edited by L. N. Christofides, E. K. Grant, and R. Swidinsky. Toronto: University of Toronto Press.

Härdle, W., and E. Mammen. 1993. “Comparing Nonparametric versus Parametric Regression Fits.” Annals of Statistics 21: 1926–1947. doi:10.1214/aos/1176349403.

Horowitz, J. L., and V. G. Spokoiny. 2001. “An Adaptive, Rate-Optimal Test of a Parametric Mean-Regression Model against a Nonparametric Alternative.” Econometrica 69 (3): 599–631. doi:10.1111/1468-0262.00207.

Judd, K. 1998. Numerical Methods in Economics. Cambridge, Massachusetts: MIT Press.

Massey, F. J. 1951. “The Kolmogorov-Smirnov Test for Goodness of Fit.” Journal of the American Statistical Association 46 (253): 68–78. doi:10.1080/01621459.1951.10500769.

Murphy, K. M., and F. Welch. 1990. “Empirical Age-Earnings Profiles.” Journal of Labor Economics 8 (2): 202–229. doi:10.1086/298220.

Newey, W. K., and D. McFadden. 1994. “Large Sample Estimation and Hypothesis Testing.” In Handbook of Econometrics, edited by R. F. Engle and D. McFadden, Vol. 4, 2111–2245. Amsterdam: North-Holland. doi:10.1016/S1573-4412(05)80005-4.

Vuong, Q. H. 1989. “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica 57 (2): 307–333. doi:10.2307/1912557.

Published Online: 2018-02-10

©2019 Walter de Gruyter GmbH, Berlin/Boston
