Abstract
The asymptotic behavior of solutions as a small parameter tends to zero is determined for a variety of singular-limit PDEs. In some cases even existence for a time independent of the small parameter was not known previously. New examples for which uniform existence does not hold are also presented. Our methods include both an adaptation of geometric optics phase analysis to singular limits and an extension of that analysis in which the characteristic variety determinant condition is supplemented with a periodicity condition.
1 Introduction
The modern theory of singular limits of nonlinear evolutionary PDEs was first developed for the case when uniform Sobolev-space bounds can be obtained for both solutions and their first time derivative [14, 15]. Although the requirement that the first time derivative be uniformly bounded was later eliminated [7, 22], the many results that followed ([1,2,3, 6, 8, 11, 17, 20, 23] and references therein) still require that the solution and some number of its spatial derivatives be uniformly bounded. Almost all of those singular limit results involve systems in which the large terms have constant coefficients, but even the few results for systems with variable-coefficient large terms [4, 5, 9, 13, 24] concern cases for which uniform spatial estimates can be proven.
However, the derivatives of solutions to many systems of nonlinear evolutionary PDEs containing a parameter do not remain uniformly bounded as that parameter tends to its limit. As will be discussed in Sect. 6.1, in some cases this lack of uniform bounds causes the time of existence of the solutions to tend to zero, but in other cases the solution nevertheless exists for a uniform time. One case of particular interest is when the solution has a particular structure
that explains the persistence of the solution despite the nonuniformity of the norms of its spatial derivatives. Solutions having such structures are common in the theory of geometric optics [10, 12, 13, 19], in which the initial data contains rapidly-varying terms of the form \(u_0(\textbf{x},\frac{\mathbf {\phi }_0(\textbf{x})}{\varepsilon })\).
In this paper we analyze equations having solutions of the form (1.1) from the point of view of singular limits, in which some terms in the PDE are of size \(O(\frac{1}{\varepsilon })\), rather than of geometric optics, in which the initial data contains bounded terms whose first derivatives are of size \(O(\frac{1}{\varepsilon })\). Although geometric optics problems can be transformed into singular limit equations by adding the variables \(\mathbf {\theta }\mathrel {:=}\tfrac{\mathbf {\phi }(t,\textbf{x})}{\varepsilon }\) [22, Sect. 5], [13, Sect. 5.1], and singular limit equations with large term \(\frac{1}{\varepsilon }C\partial _y\) can be formally transformed into geometric optics problems by a change of variables
these operations are not inverses of each other, so each viewpoint yields a different perspective. In particular, the fact that singular limit equations do not come equipped with phase functions or even their initial values adds to the challenge and interest. A more detailed analysis of the relation between singular limit equations and geometric optics will be presented in Sect. 6.3.
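The first of these transformations can be sketched as follows (a standard chain-rule computation, written here for a single phase \(\phi \); the profile notation \(U\) is ours): writing \(u(t,\textbf{x})=U(t,\textbf{x},\tfrac{\phi (t,\textbf{x})}{\varepsilon })\) gives

```latex
\partial_t u = U_t + \frac{\phi_t}{\varepsilon}\,U_\theta,
\qquad
\partial_{x_j} u = U_{x_j} + \frac{\phi_{x_j}}{\varepsilon}\,U_\theta,
```

so the equation for the profile \(U\) acquires terms of size \(O(\frac{1}{\varepsilon })\) even though the geometric optics problem for \(u\) contained none.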
1.1 Equations
In this paper we construct appropriate phase functions for singular limit equations and systems, and use them to establish the existence and regularity of solutions for a time independent of the small parameter \(\varepsilon \) in the PDE and to determine the asymptotic form of solutions as \(\varepsilon \rightarrow 0\). Since the case of constant-coefficient large operators is mostly covered by the classical theory of singular limits mentioned above, the singular limit equations studied here will contain variable-coefficient large terms.
The phenomenon we study can be seen most simply in a PDE like \(u_t+u_x+\tfrac{\cos x}{\varepsilon }u_y=0\), whose solution having initial data \(u_0\) is \(u(t,x,y)=u_0(x-t,y-\tfrac{\sin (x)-\sin (x-t)}{\varepsilon })\). The form of that solution suggests trying the ansatz
for the more general equation
Substituting (1.3) into (1.4) and defining
yields
Hence if we let the “fast phase function” \(\mu \) satisfy the equation
obtained from the terms of order \(\tfrac{1}{\varepsilon }\) in (1.6) then the “profile” U should satisfy
Since the PDEs (1.7) and (1.8) are independent of the small parameter \(\varepsilon \), both \(\mu \) and U will exist and be bounded for a time independent of \(\varepsilon \), under suitable assumptions on the coefficients and initial data. In particular, we assume both here and later that
Hence (1.3) yields the exact solution of the initial-value problem for the model Eq. (1.4), and in particular specifies precisely the dependence of the solution on \(\varepsilon \).
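For the model equation this construction can be checked by direct computation: for \(u_t+u_x+\tfrac{\cos x}{\varepsilon }u_y=0\) the fast phase is \(\mu (t,x)=\sin x-\sin (x-t)\), which satisfies \(\mu _t+\mu _x=\cos x\) with \(\mu (0,x)=0\), and the exact solution displayed above is \(u_0(x-t,\,y-\tfrac{\mu }{\varepsilon })\). The following symbolic computation (a verification sketch, not part of the paper, using a sample smooth datum \(u_0\)) confirms both facts.

```python
import sympy as sp

t, x, y, eps = sp.symbols('t x y epsilon', positive=True)

# Fast phase for the model equation u_t + u_x + (cos x / eps) u_y = 0:
# it solves mu_t + mu_x = cos x with mu(0, x) = 0
mu = sp.sin(x) - sp.sin(x - t)
assert sp.simplify(sp.diff(mu, t) + sp.diff(mu, x) - sp.cos(x)) == 0
assert mu.subs(t, 0) == 0

# Exact solution u = u0(x - t, y - mu/eps) for a sample smooth datum u0
u0 = lambda a, b: sp.exp(-a**2) * sp.sin(b)
u = u0(x - t, y - mu / eps)

# Residual of the model equation vanishes identically
residual = sp.diff(u, t) + sp.diff(u, x) + (sp.cos(x) / eps) * sp.diff(u, y)
assert sp.simplify(residual) == 0
```

The same cancellation occurs for arbitrary smooth \(u_0\), since the coefficient of \(\partial _2 u_0\) in the residual is \(\tfrac{1}{\varepsilon }(\cos x-\mu _t-\mu _x)=0\).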
In this paper we consider several generalizations of (1.4) for which the ansatz (1.3) or generalizations of it describes the leading-order behavior of solutions, although it no longer yields exact solutions. Specifically, under appropriate conditions that vary in their generality we will prove uniform existence, i.e., existence for at least a time independent of \(\varepsilon \), and describe the asymptotics of the solutions for certain equations of the following forms:
- Scalar hyperbolic equation with coefficients depending on y:
$$\begin{aligned}{} & {} a(t,x,y,\varepsilon u,\varepsilon )u_t+b(t,x,y,\varepsilon u,\varepsilon )u_x+\tfrac{c(t,x,y)}{\varepsilon }u_y \nonumber \\{} & {} \quad +d(t,x,y,u,\varepsilon ) u_y+f(t,x,y,u,\varepsilon )=0 \end{aligned}$$(1.10)
- Scalar nonuniformly parabolic equation:
$$\begin{aligned}{} & {} a(t,x,\varepsilon u,\varepsilon ) u _t +b(t,x,\varepsilon u,\varepsilon ) u _x+ \tfrac{c(t,x) }{\varepsilon }u _y+d(t,x,u,\varepsilon ) u _y +f(t,x,u,\varepsilon ) \nonumber \\{} & {} \qquad = \varepsilon ^2 g(t,x,u,\varepsilon ) u _{xx}+\varepsilon h(t,x,u,\varepsilon )u_{xy} + k(t,x,u,\varepsilon )u_{yy} \end{aligned}$$(1.11)
- Symmetric hyperbolic system:
$$\begin{aligned}{} & {} A(t,x,\varepsilon u,\varepsilon )u_t+B(t,x,\varepsilon u,\varepsilon )u_x+\tfrac{1}{\varepsilon }C(t,x) u_y {} \nonumber \\{} & {} \quad +D(t,x,u,\varepsilon )u_y+f(t,x,u,\varepsilon )=0\end{aligned}$$(1.12)
- \(2\times 2\) symmetric hyperbolic system with coefficients depending on y:
$$\begin{aligned}{} & {} A(t,x,y,\varepsilon u,\varepsilon ) u_t+B(t,x,y,\varepsilon u,\varepsilon )u_x+\tfrac{1}{\varepsilon }C(t,x,y)u_y \nonumber \\{} & {} \quad +D(t,x,y,u,\varepsilon )u_y+f(t,x,y,u,\varepsilon )=0 \end{aligned}$$(1.13)
The conditions under which uniform existence holds and asymptotics can be determined will be presented in detail before and in the theorems about each of those equations. In particular, we do not claim that all solutions of those equations having sufficiently smooth and bounded initial data exist for a time independent of \(\varepsilon \); to the contrary, we will present in Example 6.3 an equation of the form (1.10) and smooth bounded initial data for which uniform existence does not hold. Furthermore, our results for the system (1.12) require severe restrictions when that system contains three or more components, and our results for (1.13) require severe restrictions even though that system is assumed to be \(2\times 2\).
In all these equations, having a term \( D u_y\) is slightly more general than including dependence on \(\varepsilon u\) and \(\varepsilon \) in the term \(\frac{1}{\varepsilon }Cu_y\), because the O(1) part of \(\frac{1}{\varepsilon }C(\ldots ,\varepsilon u,\varepsilon )u_y\) would necessarily be affine-linear in u. The similarly slightly more general terms \(\varepsilon A_1(\ldots ,u,\varepsilon ) u_t+\varepsilon B_1(\ldots ,u,\varepsilon )u_x\) could be included without difficulty; such terms have been omitted for notational simplicity. As will be seen in the limit equations below, the \(O(\varepsilon )\) parts of A and B appear in the limit just like D, which is effectively the \(O(\varepsilon )\) part of C. This is a general phenomenon for fast singular limits, e.g. [22, (2.18)].
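The affine-linearity claim can be made explicit by a Taylor expansion in the last two arguments (a routine computation, included for clarity):

```latex
\frac{1}{\varepsilon}\,C(\ldots,\varepsilon u,\varepsilon)
= \frac{1}{\varepsilon}\,C(\ldots,0,0)
+ \partial_v C(\ldots,0,0)\,u
+ \partial_\varepsilon C(\ldots,0,0)
+ O(\varepsilon),
```

where \(\partial _v\) denotes the derivative with respect to the slot occupied by \(\varepsilon u\). The \(O(1)\) part is therefore affine in \(u\), whereas the coefficient \(D(\ldots ,u,\varepsilon )\) may depend on \(u\) arbitrarily.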
Although (1.12) may appear to be the most difficult equation to treat since it involves systems of arbitrary size, it is in fact the only one of the four for which our results can almost be obtained from geometric optics results, specifically [13]. Even for that system our results are slightly more general in that the matrix A multiplying the time derivatives is not restricted to be the identity matrix, and we point out how the results there apply to singular limit systems beyond those obtained directly from geometric optics as considered in [13]. However, our largest contribution concerning (1.12) is a proof that is quite different, in the spirit of singular limits, and simpler than the proof for general geometric optics problems in [13]. Its key points will be pointed out during the course of the proof, in Sect. 4. A comparison of our result for (1.12) with corresponding results for geometric optics problems obtained from (1.12) via the transformation (1.2), and of the phase functions occurring in each version, will be presented in Sect. 6.3.
Geometric optics results do not apply to the parabolic Eq. (1.11) on account of the presence of the second-order terms. In fact, even the existence of solutions to that equation for a time that might depend on \(\varepsilon \) is not obvious because the second-order terms are nonlinear and non-uniformly parabolic. Nevertheless, thanks to the scaling of the parabolic terms by powers of \(\varepsilon \), the large term can be eliminated from (1.11) using the transformation (1.3) with the same fast phase function (1.7) as for (1.4). Uniform existence will then be obtained for the transformed system by applying the recent results of [25].
A characteristic feature of the singular limit systems in [13, §4] arising from geometric optics is that the coefficients appearing in the equations do not depend on the variables with respect to which derivatives are taken in the large terms. Both (1.11) and (1.12) also share this feature since the large terms involve derivatives only with respect to y and none of the coefficients depend on that variable. However, in (1.10) and (1.13) the coefficients do depend on y, which introduces new and interesting complications. First, it is no longer possible to eliminate the large term in the PDE by an appropriate choice of the fast phase function \(\mu \), because \(\mu \) must still be independent of y to avoid introducing a term of size \(\frac{1}{\varepsilon ^2}\) while the coefficients now do depend on y. The way to resolve this difficulty is to generalize the ansatz (1.3) by assuming that the solution depends on y only through the combination \(Y(t,x,y)-\frac{\mu (t,x)}{\varepsilon }\), for some appropriate function Y. The function Y is now chosen so as to eliminate the large term, while \(\mu \) is determined by the requirement that Y satisfy a functional equation ensuring that periodicity with respect to y translates into periodicity with respect to Y. That requirement introduces an averaging operator into the equation determining \(\mu \), so that when the coefficients do not actually depend on y, \(\mu \) reduces to the function satisfying (1.7) and Y(t, x, y) reduces to y. A second complication that arises from the dependence of the coefficients of (1.10) on y is that if the only phase included in the ansatz is \(Y(t,x,y)-\frac{\mu (t,x)}{\varepsilon }\) then the coefficients of the transformed equations would depend on \(\frac{\mu (t,x)}{\varepsilon }\) and hence would have derivatives of order \(\frac{1}{\varepsilon }\).
We overcome this problem by adding \(\frac{\mu }{\varepsilon }\) as an additional phase, which paradoxically introduces a new large term. However, that new large term can be rendered harmless by changing the time variable in a manner reminiscent of the change of time variable for the system (1.12), and the resulting limit equation does not retain any dependence on the extra phase.
For the Eq. (1.10) it is possible to obtain uniform existence but not the asymptotic behavior by combining \(\varepsilon \)-weighted estimates for derivatives obtained by the method of characteristics (cf. [12, Proposition 6.1.1]) with the technique of solving the PDE for the derivative appearing in the large term (cf. [21]). Details are presented in Sect. 6.4 for completeness. That approach only works for scalar hyperbolic equations because for parabolic equations solving the PDE for a first-order large term would lose a derivative, while for systems \(\varepsilon \)-weighted derivative estimates do not suffice because uniform estimates for derivatives are needed in order to obtain a uniform \(L^\infty \) bound. Hence, as for the parabolic Eq. (1.11), even the uniform existence part of our results for certain systems of the form (1.13) is new.
In all these equations, the variable x may be replaced by several variables \(x_i\), with only straightforward modifications to the results and proofs, possibly including an increase in the required smoothness of the coefficients and initial data. However, increasing the number of y variables would be more difficult except possibly in certain cases for which the approach here could be combined with methods of geometric optics.
When the system (1.12) has three or more components but the sufficient conditions given here do not hold then uniform existence is likely to fail. However, when the conditions for the system (1.13) do not hold it is plausible that uniform existence may still hold in some cases but with the asymptotics of the solutions being more complicated. We hope to consider such cases and other equations requiring a more general ansatz for their asymptotic structure in future work.
Although this paper is primarily concerned with theoretical aspects of the equations studied, it may be noted that (1.10) is a transport equation and example (6.19) of a system of the form (1.12) is the well-known system form of a wave equation, i.e., the component u of that system satisfies \(u_{tt}-b(x)^2u_{xx}-\frac{c(x)^2}{\varepsilon ^2} u_{yy}=0\). Since both transport equations and wave equations occur in a variety of physical contexts our results may also have physical relevance.
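For readers who want the reduction in front of them, one standard symmetric-system form of such a wave equation (sketched here for constant \(b\) and \(c\); variable coefficients contribute additional lower-order terms, and the system (6.19) appearing later may differ in details) sets \(v\mathrel {:=}u_t\), \(w\mathrel {:=}b\,u_x\), \(p\mathrel {:=}\tfrac{c}{\varepsilon }\,u_y\), giving

```latex
\begin{pmatrix} v \\ w \\ p \end{pmatrix}_t
- \begin{pmatrix} 0 & b & 0 \\ b & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
  \begin{pmatrix} v \\ w \\ p \end{pmatrix}_x
- \frac{1}{\varepsilon}
  \begin{pmatrix} 0 & 0 & c \\ 0 & 0 & 0 \\ c & 0 & 0 \end{pmatrix}
  \begin{pmatrix} v \\ w \\ p \end{pmatrix}_y
= 0,
```

whose first row \(v_t=b\,w_x+\tfrac{c}{\varepsilon }\,p_y\) recovers \(u_{tt}=b^2u_{xx}+\tfrac{c^2}{\varepsilon ^2}u_{yy}\).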
1.2 Theorems
In the following theorems, C denotes a constant independent of \(\varepsilon \) that may be different in each appearance, and \(C^s_B\) denotes the space of functions having s continuous derivatives that are bounded uniformly in the independent variables, on any compact set of the dependent variables.
Before the statement of each theorem we present definitions and notations that are used in that theorem, and possibly also in subsequent theorems. In some cases the definitions involve key ideas of the proofs, which are presented here despite their length because the definitions are needed in order to properly state the theorems. Some remarks explaining the assumptions and conclusions are also presented.
As hinted in Sect. 1.1, the asymptotic form of solutions to the scalar hyperbolic Eq. (1.10) will be \(u\sim U^0(\mu (t,x),x,Y(t,x,y)-\frac{\mu (t,x)}{\varepsilon })\). In order to state the theorem for the solutions to that PDE we must first define the equations that determine the fast part \(\mu \) and the slow part Y of the phase function, and define the coefficients that will appear in the PDE for the asymptotic profile U. We begin by defining averaging and limit operations.
For any function g that is periodic with period P in a variable w and may also depend on other variables, define
In addition, for any function g depending continuously on \(\varepsilon \), define
Assume that
that (1.9) holds, and that
Since the change of variables \(y\mapsto -y\) replaces \(\partial _y\) with \(-\partial _y\), which in essence replaces c with \(-c\), by making that change of variables if necessary we can normalize c to satisfy
Let \(\mu (t,x)\) be the solution of the initial-value problem
and define Y(t, x, y) by
Here and later the fast phase vanishes at time zero because fast oscillations are not present in the assumed form of the initial data. Equation (1.20) links the slow part Y(t, x, y) and the fast part \(\mu (t,x)\) of the phase function \(\varepsilon Y-\mu \). Hence it is not possible to first make a change of independent variables \(y\mapsto Y(t,x,y)\) to reduce to the case when the slow part of the phase is simply the independent variable as in (1.5), and afterwards look for the appropriate fast phase.
Let y(t, x, Y) denote the inverse function of Y considered to be a function of y, i.e., the function such that \(y(t,x,Y(t,x,y))\equiv y\), and let \(t(\tau ,x)\) denote the inverse with respect to t of the function
both of which will be shown to exist at least for small times. Then for any function g of y and other variables define
It will be shown in Lemma 2.1 that if g is periodic in y with the same period as the coefficients of (1.10) then \({\widehat{g}}\) is periodic in Y with the same period, so that \(\left\langle {{\widehat{g}}}\right\rangle _{Y}\) is well defined. It is not necessary to determine y(t, x, Y) in order to calculate \(\left\langle {{\widehat{g}}}\right\rangle _{Y}\), since the change of variables \(y=y(t,x,Y)\) transforms the integral \(\tfrac{1}{P}\int _0^P g(t,x,y(t,x,Y))\,dY\) into \(\tfrac{1}{P}\int _0^P g(t,x,y) \left[ \frac{{a}_{(0)}}{{{{c}_{(0)}}}} \,\mu _t+ \frac{{b}_{(0)}}{{{{c}_{(0)}}}}\,\mu _x\right] \,dy\).
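The computation behind the last sentence is the one-dimensional change of variables \(Y=Y(t,x,y)\) in the integral, with the Jacobian read off from the displayed bracket:

```latex
\frac{1}{P}\int_0^P g\bigl(t,x,y(t,x,Y)\bigr)\,dY
= \frac{1}{P}\int_0^P g(t,x,y)\,\frac{\partial Y}{\partial y}(t,x,y)\,dy,
\qquad
\frac{\partial Y}{\partial y}
= \frac{a_{(0)}}{c_{(0)}}\,\mu_t + \frac{b_{(0)}}{c_{(0)}}\,\mu_x,
```

the limits of integration being preserved because \(Y(t,x,y)-y\) is periodic in y by Lemma 2.1; the stated expression for \(\partial Y/\partial y\) is consistent with the relation (1.20) linking Y and \(\mu \).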
In addition, define
and
Finally, the norms that will appear in the theorems are
with \(X=C\) or \(X=H\), where T is a positive number, \(\alpha =(\alpha _1,\alpha _2,\alpha _3)\) is a multi-index with nonnegative components, s is a positive integer, and \(\varepsilon \) is the small parameter appearing in the PDE. The reason that y-derivatives are not multiplied by powers of \(\varepsilon \) in the norm (1.25) is that there is no dependence on y in the fast phase(s) \(\mu (t,x)\) or \(\mu ^{(j)}(t,x)\) that appear multiplied by \(\frac{1}{\varepsilon }\) in the asymptotic forms of the solutions.
A collection \(u^{(\varepsilon )}\) of functions depending on \(\varepsilon \) is asymptotic in \(X^s_{\varepsilon ,T,\text {loc}}\) to a profile \(u^{(0)}\) possibly depending on \(\varepsilon \) in a specified manner if for every \(C^\infty \) function \(\phi \) with compact support in the spatial variables, not depending on \(\varepsilon \), the \(X^s_{\varepsilon ,T}\) norm of \(\phi \, (u^{(\varepsilon )}-u^{(0)})\) converges to zero as \(\varepsilon \rightarrow 0\).
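In symbols, this definition requires that for every such cutoff \(\phi \)

```latex
\lim_{\varepsilon\to 0}\,
\bigl\Vert \phi\,\bigl(u^{(\varepsilon)}-u^{(0)}\bigr)\bigr\Vert_{X^s_{\varepsilon,T}}
= 0.
```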
Theorem 1.1
Assume that (1.9), (1.16), and (1.17) hold, and normalize c as described after (1.17) so that (1.18) holds.
1.
Assume that for some \(s\ge 1\) the coefficients a, b, and c belong to \(C^{s+1}_B\) and the coefficients d and f and the initial data \(u_0\) belong to \(C^s_B\). Then for any positive \(\varepsilon _0\) there exists a time \(T>0\) such that for \(0<\varepsilon \le \varepsilon _0\) the solution of (1.10) having initial data \(u_0\) exists and belongs to \(C^{s}_B\) for \(0\le t\le T\), and satisfies
$$\begin{aligned} \sup _{0<\varepsilon \le \varepsilon _0} \Vert u\Vert _{C^s_{\varepsilon ,T}} \le K. \end{aligned}$$(1.26)

Moreover, there is a \({\widetilde{T}}>0\) such that as \(\varepsilon \rightarrow 0\) the solution u is asymptotic in \(C^{s-1}_{\varepsilon ,{\widetilde{T}},\text {loc}}\) (and hence in particular in \(C^0_\text {loc}\)) to
$$\begin{aligned} U^{(0)}(\mu (t,x),x,Y(t,x,y)-\tfrac{\mu (t,x)}{\varepsilon }), \end{aligned}$$

where \(U^{(0)}(\tau ,x,z)\) is the unique solution of the limit profile equation
$$\begin{aligned}{} & {} U^{(0)}_\tau +\left\langle {\frac{{\widehat{b}}_{(0)}}{{{\widehat{A}}}_{(0)}}}\right\rangle _{\!\!Y}\!\!(\tau ,x)\,U^{(0)}_x +\left\langle {\frac{{{\widehat{D}}}_{(0)}}{{\widehat{A}}_{(0)}}}\right\rangle _{\!\!Y}\!\!(\tau ,x,U^{(0)})\,U^{(0)}_z\nonumber \\{} & {} \quad +\left\langle {\frac{{{\widehat{f}}}_{(0)}}{{\widehat{A}}_{(0)}}}\right\rangle _{\!\!Y}\!\!(\tau ,x,U^{(0)})=0 \end{aligned}$$(1.27)

satisfying
$$\begin{aligned} U^{(0)}(0,x,z)=u^0(x,z), \end{aligned}$$(1.28)

where the notations \(\widehat{}\), \({}_{(0)}\), and \(\left\langle {}\right\rangle _{Y}\) are defined in (1.22), (1.15), and (1.14), respectively, A is defined in (1.23), and D is defined in (1.24).
2.
Now assume that \(s\ge 2\), and let \(\tau ^{(0)}\) be any time such that the limit profile \(U^{(0)}\) exists and has a finite \(C^1\) norm for \(0\le \tau \le \tau ^{(0)}\). Let
$$\begin{aligned} T^{(0)}\mathrel {:=}\mathop {\textrm{argmin}}\limits _t\left\{ \inf _x \partial _t \mu (t,x)=0 \quad \text {or}\quad \sup _x \mu (t,x)=\tau ^{(0)}\right\} \end{aligned}$$(1.29)

be the smallest time t at which \(\mu \) either stops being increasing in t or takes on the value \(\tau ^{(0)}\), possibly at infinity. Then there exists a positive \(\varepsilon _1\) such that for \(0<\varepsilon \le \varepsilon _1\) the solution u exists and has finite \(C^1\) norm for \(0\le t\le T^{(0)}\), and there is a finite K such that for \(0<\varepsilon \le \varepsilon _1\)
$$\begin{aligned} \Vert u(t,x,y)-U^{(0)}(\mu (t,x),x,Y(t,x,y)-\tfrac{\mu (t,x)}{\varepsilon })\Vert _{C^{s-2}_{\varepsilon ,T^{(0)}}} \le K\varepsilon . \end{aligned}$$(1.30)

In particular, \(\Vert u-U^{(0)}\Vert _{C^0}\le K\varepsilon \). Moreover, there exists a function \(U(\tau ,x,z,\eta ;\varepsilon )\) such that
$$\begin{aligned} u(t,x,y)\equiv U(\mu (t,x),x,Y(t,x,y)-\tfrac{\mu (t,x)}{\varepsilon },\tfrac{\mu (t,x)}{\varepsilon };\varepsilon ), \end{aligned}$$(1.31)

and
$$\begin{aligned} \max _{0\le \tau \le \tau ^{(0)}}\Vert U(\tau ,x,z,\eta ;\varepsilon )-U^{(0)}(\tau ,x,z)\Vert _{C^{s-2}}\le K\varepsilon . \end{aligned}$$(1.32)
Remark 1.2
1.
The assumption that c be bounded away from zero is used in Lemma 2.1, which is only needed when c depends on y; there is also an additional reason for making that assumption in Theorem 1.1, which will be explained in Sect. 6.2.
2.
In [13, Sect. 5.1] the time variable \(\tau \) of the transformed equation is assumed to equal the original time variable t, which avoids the somewhat involved condition in (1.29) that defines \(T^{(0)}\). For simplicity, in the subsequent theorems that involve a change of the time variable we will just prove existence and asymptotics for some time independent of \(\varepsilon \), although a more precise version similar to (1.29) could also be proven for those theorems.
3.
Although it is more natural to estimate \(u-U^{(0)}\) in the original variables (t, x, y), as in (1.30), the estimate (1.32) in the transformed variables \((\tau ,x,z,\eta )\) is stronger since none of the spatial derivatives are weighted by \(\varepsilon \). The latter estimate therefore makes clearer that u has the specific asymptotic form \(U^{(0)}(\mu (t,x),x,Y(t,x,y)-\tfrac{\mu (t,x)}{\varepsilon })\).
We next define some notations that will be used in the statement of the theorem concerning (1.11). Recalling the notation \({}_{(0)}\) defined in (1.15), let \(\mu \) be the unique solution of
which is the special case of (1.19) in which a, b, and c are independent of y. Define
Theorem 1.3
Assume that (1.9) holds and that the coefficients g, h, and k satisfy the nonuniform parabolicity condition
Suppose that, for some integer \(s\ge 4\), the coefficients a, b, c from (1.11) belong to \(C^{s+2}_B\), the coefficients d, f, g, h, and k there and also \(\int _0^1 \tfrac{\partial f}{\partial u}(t,x,ru)\,dr\) belong to \(C^s_B\), and the initial data \(u_0(x,y)\) belongs to \(H^{2s}\). Then
1.
The solution of that PDE having initial data \(u_0\) exists and belongs to \(H^{s}\) for at least a positive time T independent of \(\varepsilon \).
2.
Now assume that \(s\ge 6\). Let \(U^{(0)}(t,x,z)\) be the unique solution, which exists and belongs to \(H^s\) on some time interval \([0,T^{(0)}]\), of
$$\begin{aligned} \begin{aligned}&{a}_{(0)}(t,x)U^{(0)}_t+{b}_{(0)}(t,x)U^{(0)}_x+{D}_{(0)}(t,x,U^{(0)})U^{(0)}_z +{f}_{(0)}(t,x,U^{(0)})\\ {}&\quad ={K}_{(0)}(t,x,U^{(0)})U^{(0)}_{zz}, \\ {}&U^{(0)}(0,x,z)=u_0(x,z). \end{aligned}\end{aligned}$$(1.37)

Then on the time interval \([0,\min (T,T^{(0)})]\)
$$\begin{aligned} \Vert u(t,x,y)-U^{(0)}(t,x,y-\tfrac{\mu (t,x)}{\varepsilon })\Vert _{H^{s-2}_{\varepsilon ,\min (T,T^{(0)})}}\le C\varepsilon . \end{aligned}$$(1.38)

Moreover, there exists a function \(U(t,x,z,\varepsilon )\) such that
$$\begin{aligned} u(t,x,y)\equiv U(t,x,y-\tfrac{\mu (t,x)}{\varepsilon },\varepsilon ), \end{aligned}$$(1.39)

and
$$\begin{aligned} \max _{0\le t\le \min (T,T^{(0)})}\Vert U(t,x,z,\varepsilon )-U^{(0)}(t,x,z)\Vert _{H^{s-2}}\le C\varepsilon . \end{aligned}$$(1.40)
In particular, \(\Vert u-U^{(0)}\Vert _{C^0}\le C\varepsilon \).
Remark 1.4
1.
The powers of \(\varepsilon \) multiplying the diffusion terms \(u_{xx}\) and \(u_{xy}\) in (1.11) are needed to ensure that the coefficient K of the diffusion term \(U_{zz}\) in the Eq. (3.2) for U contains no powers of \(\frac{1}{\varepsilon }\). Due to the presence of those powers, the nonlinear diffusion terms in (1.11) cannot be uniformly parabolic uniformly in \(\varepsilon \). Consequently, classical results for nonlinear uniformly parabolic equations cannot be used. We shall apply instead recent results [25, Theorems 2.7, 4.1] for non-uniformly parabolic equations. The extra smoothness required in Theorem 1.3 and the other requirements there on the coefficients are conditions of those results.
2.
In contrast to (1.10), it does not seem possible to allow the coefficients of (1.11) to depend on y, because changing the time variable of (1.11) to \(\tau \mathrel {:=}\mu (t,x)\) as for (1.10) would yield a term involving the second derivative with respect to \(\tau \) not having a fixed sign, arising from the parabolic terms in (1.11).
We next consider system (1.12). By defining fast phases in an appropriate manner and making transformations of both independent and dependent variables it is possible under certain conditions to obtain a system for which the large terms have constant coefficients, which will ensure that solutions of the original system (1.12) exist for a time independent of \(\varepsilon \). The most important condition is (1.49), which as noted below corresponds to a special case of the coherence assumption of [13]. While this correspondence motivates assumption (1.49), we do not use any results from [13] in the proof of the theorem for system (1.12), but instead show by direct calculation that (1.49) ensures that the large terms of the transformed system have constant coefficients.
Phases. We will let the fast phases \(\mu ^{(j)}\) be the solutions of the generalization
of condition (1.7) that held for the prototypical equation (1.4). Now that the equation is a system it is no longer possible to make the large terms vanish entirely, but (1.41) ensures that a generalization of that condition will hold, namely that the matrices \({\widehat{C}}^{(j)}\) defined below in (1.45) will have determinant zero.
For any matrix M of size n, let \(\{\lambda _j(M)\}_{j=1}^n\) denote the set of its eigenvalues, whose order will be chosen according to some rule when needed. Since the vanishing of \(\mu \) at time zero implies that \(\mu _x(0,x)\) also vanishes, and we will assume that A is positive definite, (1.41) implies that
for some ordering of those \(\mu ^{(j)}\) and \(\lambda _j\). We will assume that for some choice of the eigenvalues \(\lambda _1\) and \(\lambda _2\),
and let \(\mu ^{(1)}\) and \(\mu ^{(2)}\) be the corresponding solutions of (1.41) satisfying (1.42).
Transformations. Define the new time variable by
As will be shown, (1.43) implies that \(\tau (t,x)\) can be inverted with respect to its first variable, at least on some time interval. Let \(t(\tau ,x)\) denote the inverse function, and define
We will show that \({{\widehat{A}}}_{(0)}\) is positive definite. Using that matrix, define, for any matrix-valued function \({\widehat{M}}\) of \((\tau ,x,\varepsilon U,\varepsilon )\) or \((\tau ,x, U,\varepsilon )\),
To begin the final set of transformations, let \(r^{(j)}(\tau ,x)\), \(j=1,\cdots , n\), where n is the length of the vector u in (1.12), be the normalized eigenvectors of the matrix \({{{\widetilde{C}}}}^{(1)}(\tau ,x)\), chosen to be orthogonal if they correspond to a repeated eigenvalue, and let \(R(\tau ,x)\) be the matrix whose columns are the \(r^{(j)}\). For any matrix-valued function \({\widetilde{M}}\) define
where any arguments not present in \({\widetilde{M}}\) are omitted from \({\mathcal {M}}\) as well, and define
Large operator. In order to obtain a constant-coefficient large operator, we will assume that all solutions \(\mu ^{(j)}(t,x)\) of (1.41) satisfy
where
Then define
where \(\mathop {\textrm{diag}}\limits (s^{(j)})\) denotes the diagonal matrix or operator with diagonal entries \(s^{(j)}\). Finally, as usual let \(e^{(j)}\) denote the vector whose \(j^{\hbox {th}}\) component is one and whose other components are zero.
Theorem 1.5
Assume that the matrices A, B, C, and D are symmetric, that D and \(f(t,x,u,\varepsilon )\) belong to \(C^s_B\) for some \(s\ge 4\), that A, B, and C belong to \(C^{s+2}_B\), and that the initial data \(u_0\) belongs to \(H^s\). Assume also that A satisfies the generalization
of (1.9). In addition, assume that there exist eigenvalues \(\lambda _1\) and \(\lambda _2\) of
satisfying (1.43), and that (1.49)–(1.50) hold. Then there exists a positive time \(T_{\min {}}\) such that the solution u(t, x, y) of (1.12) with the initial data \(u_0\) exists for \(0\le t\le T_{\min {}}\) and satisfies
where \(\tau (t,x)\) is defined in (1.44) and \(\mathcal {U}^{(0)}\) is the unique solution of
Moreover, there exists a function \({\mathcal {U}}(\tau ,x,z_1,z_2,\varepsilon )\) and a positive \(\tau _{\min }\) such that
and
In particular,
Remark 1.6
1.
The second equation of (1.54) says that the \(j^{\hbox {th}}\) component of \({\mathcal {U}}^{(0)}\) depends on the variables \(z_1\) and \(z_2\) only through the linear combination \(\alpha ^{(j)} z_1+(1-\alpha ^{(j)})z_2\). In other words, each component of \({\mathcal {U}}\) depends in the limit on its own phase \(y-\frac{\alpha ^{(j)}\mu ^{(1)}+(1-\alpha ^{(j)})\mu ^{(2)}}{\varepsilon }\) but not on the phases of the other components.
2.
Condition (1.49) always holds for \(j=1,2\), with \(\alpha ^{(1)}=1\) and \(\alpha ^{(2)}=0\), so that assumption only restricts systems that are \(3\times 3\) or larger. For those systems that restriction is quite severe, as will be illustrated in Example 6.6. In terms of the geometric-optics phases \(\phi ^{(j)}\mathrel {:=}{\widehat{y}}-\mu ^{(j)}(t,x)\), where \(\widehat{y}\mathrel {:=}\varepsilon y\), condition (1.49) ensures that the \(\phi ^{(j)}\) satisfy the coherence condition [13, Definition 2.1.1].
3.
The equation \(\mathbb {P}V=0\) does not necessarily imply the existence of a function W such that \(V={\mathcal {L}}{\mathcal {W}}\) because of the problem of small divisors. Cf. [22, pp. 486–487]. The results of [20, 21] on the existence of such a W do not apply here, because \(\widehat{\mathcal {L}}({\mathbf {\xi }})=\xi _1 \mathop {\textrm{diag}}\limits (\alpha ^{(j)}-1)+\xi _2\mathop {\textrm{diag}}\limits (\alpha ^{(j)})\) does not satisfy the assumption there that the dimension of the null space of \(\widehat{{\mathcal {L}}}\) be independent of \(\mathbf \xi \) for all \({\mathbf \xi }\in \mathbb {R}^2{\setminus }\{0\}\).
4.
Theorems 1.5 and 1.7 below require that the matrices A, B, and C have one more derivative than was needed for Theorem 1.1, because the derivatives of \(({{\widehat{A}}}_{(0)})^{-\frac{1}{2}}\) appearing in (1.48) contain second derivatives of the \(\mu ^{(j)}\), which are as smooth as those matrices.
Some of the definitions for our final theorem are similar to those for Theorem 1.5, others are generalizations of definitions for Theorem 1.1, and a few are unique to Eq. (1.13). The key new condition, which unfortunately is quite restrictive, is (1.82) below, which ensures that the dependence of the coefficients on y does not produce any large term in the transformed system.
Phases. In the theorem for system (1.13) we will assume not only that A is positive definite, but also that C is invertible. Define the slow parts \(Y^{(k)}\) and the fast parts \(\mu ^{(k)}\) of the phases \(\varepsilon Y^{(k)}-\mu ^{(k)}\) by
and
Alternatively, the initial condition on the \(Y^{(k)}\) in (1.57) can be replaced by the normalization
where P is the period of the coefficients with respect to the variable y, which ensures that the Fourier series in y of \(Y^{(k)}(t,x,y)-y\) will have no constant term.
Since a condition \(0=\det (w M-N)\) is equivalent to requiring that w be an eigenvalue of \(M^{-1}N\), (1.57)–(1.58) generalize both (1.19)–(1.20) and (1.41), and will ensure that the matrices \({\widehat{C}}^{(k)}\) defined below in (1.72) will have determinant zero. The reason for using \(\mu ^{(k)}_t\lambda _k(C^{-1}A_{(0)}+\tfrac{\mu ^{(k)}_x}{\mu ^{(k)}_t}C^{-1}B_{(0)})\) rather than the simpler expression \(\lambda _k(\mu ^{(k)}_t C^{-1}A_{(0)}+\mu ^{(k)}_xC^{-1}B_{(0)})\) is to avoid complications arising from the fact that \(\lambda _k(c M)\) might equal \(c \lambda _{3-k}(M)\) rather than \(c\lambda _k(M)\) when \(c<0\), depending on the way the ordering of the eigenvalues is determined.
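The ordering subtlety mentioned above is easy to check concretely. The following minimal sketch (the matrix and the scaling constant are illustrative choices made here, not taken from the text) computes the ascending-ordered eigenvalues of a symmetric \(2\times 2\) matrix and verifies that \(\lambda _k(cM)=c\,\lambda _{3-k}(M)\) rather than \(c\,\lambda _k(M)\) when \(c<0\):

```python
import math

def eig2_sorted(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]],
    returned in ascending order (a fixed ordering convention)."""
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(tr * tr - 4 * det)  # nonnegative for symmetric matrices
    return ((tr - disc) / 2, (tr + disc) / 2)

# A sample symmetric matrix with distinct eigenvalues.
M = (3.0, 1.0, 0.0)           # entries a, b, c of [[3, 1], [1, 0]]
lam1, lam2 = eig2_sorted(*M)

# Scaling by a negative constant reverses the ascending order:
# lambda_k(cM) = c * lambda_{3-k}(M), not c * lambda_k(M).
c = -2.0
mu1, mu2 = eig2_sorted(*(c * e for e in M))
assert abs(mu1 - c * lam2) < 1e-12 and abs(mu2 - c * lam1) < 1e-12
```

The same label reversal occurs for any symmetric matrix with distinct eigenvalues, which is why the factor \(\mu ^{(k)}_t\) is pulled out before \(\lambda _k\) is taken.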
The assumptions on A and C ensure that the eigenvalues \(\lambda _k(C^{-1}A{\big |_{t=0}})\) are nonzero, although not necessarily positive, which will imply that both \(\lambda _k\big (C^{-1}A_{(0)}+\tfrac{\mu ^{(k)}_x}{\mu ^{(k)}_t}C^{-1}B_{(0)}\big )\) and \(\mu ^{(k)}_t\) are nonzero for sufficiently small times. In addition, we will assume that
By exchanging the labels 1 and 2 if necessary we can then arrange that
Transformations. Define the new time and spatial variables by
which generalize (1.44) and the formula \(z=y-\frac{\mu (t,x)}{\varepsilon }\) used implicitly in (1.31)–(1.32). In system (1.12) the coefficients did not depend on y, so there was no need to express y in terms of the new spatial variables \(z_1\) and \(z_2\). For Eq. (1.10) the formula for y took the simple form \(y=z+\eta \). However, for the system (1.13) the relationship between y and \((z_1,z_2)\) is more complicated, and in particular is intertwined with the relationship between the old and new time variables t and \(\tau \). In order that terms of size \(O(\frac{1}{\varepsilon })\) like those appearing in the formula for the \(z_k\) in (1.62) will not appear in the formula for y in terms of \((z_1,z_2)\), that formula will be derived using the functions
and
The point is that the formula for G together with (1.62) implies that
on account of the cancellation of the \(O(\tfrac{1}{\varepsilon })\) term \(\frac{\mu ^{(1)}\mu ^{(2)}}{\varepsilon t}\).
It will be shown that Q is invertible with respect to y. Let \(Q^{-1}\) denote the corresponding inverse function, i.e., the function such that
Applying \(Q^{-1}(t,x,\cdot )\) to both sides of (1.65) yields
We therefore define
Using \({\mathcal {Y}}\), define next
We will also show that \({\widehat{\tau }}\) is invertible with respect to its first variable. Let \({\widehat{t}}\) denote the corresponding inverse function, i.e., the function such that
We therefore define
We are now ready to define the first form of the transformed coefficients, generalizing (1.45):
where the term \(\frac{{\widehat{C}}^{(k)}-{\widehat{C}}_{(0)}^{(k)}}{\varepsilon }\) was included in \({\widehat{D}}\) because we will use \({{\widehat{C}}}_{(0)}^{(k)}\) rather than \({\widehat{C}}^{(k)}\) in what follows. We now proceed in similar fashion to (1.46)–(1.47): Define, for any matrix function \({\widehat{M}}\),
where any arguments absent in \({\widehat{M}}\) are also omitted from \({\widetilde{M}}\). Next, we let \(\{r^{(j)}(\tau ,x,z_1,z_2)\}_{j=1}^2\) be the normalized eigenvectors of \({{\widetilde{C}}_{(0)}}^{(1)}(\tau ,x,z_1,z_2)\) and let \(R(\tau ,x,z_1,z_2)\) be the matrix whose columns are the \(r^{(j)}\), and for any matrix-valued function \({\widetilde{M}}\) define
where any arguments not present in \({\widetilde{M}}\) are omitted from \({\mathcal {M}}\) as well. In slightly different fashion than (1.48), define
where \(\rho (\tau ,x,z_1,z_2,\varepsilon )\) is a scalar function to be chosen later. Its purpose is to make \(\rho ({\widehat{A}}_{(0)})^{-\frac{1}{2}}R\) independent of \((z_1,z_2)\), because otherwise we would need to add to \({\mathcal {F}}\) the terms
which would make the estimates non-uniform in \(\varepsilon \).
Large operator. Define
Theorem 1.7
Assume that A, B, C, and D are symmetric \(2\times 2\) matrices, that D and f belong to \(C^s\) for some \(s\ge 4\), that A, B, and C belong to \(C^{s+2}\), and that the initial data \(u_0\) belong to \(H^s\). Assume also that
and
that (1.60)–(1.61) hold, and that
Finally, assume that there exists a scalar function \(\rho (\tau ,x,z_1,z_2)\) satisfying
such that
Then there exists a positive time \(T_{\min {}}\) such that the solution u(t, x, y) of (1.13) with the initial data \(u_0\) exists for \(0\le t\le T_{\min {}}\) and satisfies
where \(\tau \), \(\mu ^{(k)}\), and \(Y^{(k)}\) are defined in (1.62), (1.58), and (1.57), respectively, and the asymptotic profile \(\mathcal {U}^{(0)}(\tau ,x,z_1,z_2)\) is the unique solution of
Moreover, there exists a function \({\mathcal {U}}(\tau ,x,z_1,z_2,\varepsilon )\) and a positive \(\tau _{\min }\) such that
and
In particular,
Remark 1.8
-
1.
As will be shown in the proof of Theorem 1.7, condition (1.80) ensures that the new time variable vanishes identically when the original time variable does. Condition (1.80) is therefore analogous to [13, Assumption 2.3.1] that requires that some linear combination of the phases vanish identically at time zero.
-
2.
As in Theorem 1.5, it would be possible to allow systems of size greater than two in Theorem 1.7, provided that condition (1.49) plus a similar condition on the slow parts \(Y^{(j)}\) of the phases hold, with the same coefficients \(\alpha ^{(j)}\) in both. However, in addition to the restrictions those conditions impose, condition (1.82) would become much more restrictive as the size of the system increases.
-
3.
In all the theorems, the time of existence depends on the initial data only through the norm in which it is assumed to be bounded.
The results for the PDEs (1.10), (1.11), (1.12), and (1.13) will be proven in Sects. 2, 3, 4, and 5, respectively.
2 Scalar PDE with y-periodic coefficients
We begin by showing some properties of the transformation from y to Y. In order that the result will also be applicable to system (1.13), (1.17) rather than its normalized version (1.18) will be assumed.
Lemma 2.1
Assume that the coefficients a, b, and c in (1.10) are periodic with period P in y and belong to \(C^r_B\) for some \(r\ge 1\), and that a and |c| satisfy the positivity conditions (1.9) and (1.17), respectively. Then there exists a positive time T such that the following hold for \(0\le t\le T\):
-
1.
The transformation of the independent variables \(y\mapsto Y\) defined by (1.20) is \(C^{r}\), has a \(C^{r}\) inverse \(Y\mapsto y(t,x,Y)\), and satisfies
$$\begin{aligned} Y(t,x,y+P)\equiv Y(t,x,y)+P,\qquad y(t,x,Y+P)\equiv y(t,x,Y)+P. \end{aligned}$$(2.1) -
2.
For any function g(t, x, y, v) that is periodic with period P in y, the function \({\widehat{g}}\) defined in (1.22) satisfies
$$\begin{aligned} \begin{aligned} {\widehat{g}}(\tau ,x,Y+P,v) \equiv {\widehat{g}}(\tau ,x,Y,v), \end{aligned} \end{aligned}$$(2.2) i.e., \({\widehat{g}}\) is periodic in Y with period P.
Proof
The claimed smoothness of Y(t, x, y) follows from its definition and the assumed smoothness of the coefficients a, b, and c together with the assumed positivity of |c|. The definitions (1.20) of Y and (1.19) of \(\mu \) together with the assumed positivity of a and |c| ensure that \(Y_y(0,x,y)\ge k>0\), because (1.19) implies that \(\mu _t\) has the same sign as c at time zero. The PDE satisfied by \(\mu \) together with the assumed smoothness and boundedness of the coefficients a, b, and c then ensure that there exists a positive T such that \(Y_y(t,x,y)\ge \frac{k}{2}\) for \(0\le t\le T\). Hence Y(t, x, y) is invertible with respect to y in that time interval, and the smoothness of its inverse y(t, x, Y) follows from the smoothness of Y(t, x, y) together with the positivity of \(Y_y\).
Integrating the equation for \(Y_y\) in (1.20) from y to \(y+P\) and using the PDE (1.19) for \(\mu \) yields
which is the first identity in (2.1). Applying \(y(t,x,\cdot )\) to both sides of the first identity in (2.1) yields \( y(t,x,Y(t,x,y)+P)=y(t,x,Y(t,x,y+P)) =y+P\), and substituting \(y=y(t,x,Y)\) into the far right and far left of that result shows that the second identity in (2.1) also holds. In particular, for any function g(t, x, y, v) that is periodic with period P in y, the function \(\widehat{g}(\tau ,x,Y,v)\) defined in (1.22) satisfies
which shows that (2.2) holds, since the transformation from t to \(\tau \) only involves the variables t and x and so does not affect periodicity with respect to Y. \(\square \)
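The mechanism behind the shift identity in (2.1) can be illustrated numerically. In the sketch below, the function g is a stand-in for \(Y_y(t,x,\cdot )\) (an assumption made here for concreteness): it is P-periodic with average 1, so that its integral over one period is exactly P, and consequently \(Y(y+P)=Y(y)+P\) up to quadrature error:

```python
import math

P = 2.0  # illustrative period; the paper's P is general

def g(s):
    """Model for Y_y: periodic with period P and mean 1, so that
    integrating over one full period gives exactly P."""
    return 1.0 + 0.5 * math.sin(2 * math.pi * s / P)

def Y(y, n=20000):
    """Y(y) = integral of g from 0 to y (composite trapezoid rule)."""
    h = y / n
    total = 0.5 * (g(0.0) + g(y))
    total += sum(g(k * h) for k in range(1, n))
    return total * h

# The shift identity Y(y + P) = Y(y) + P from (2.1).
for y in (0.3, 1.1, 4.7):
    assert abs(Y(y + P) - (Y(y) + P)) < 1e-6
```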
Proof of Theorem 1.1
Differentiating the first identity in (2.1) with respect to x shows that \(\frac{\partial Y}{\partial x}_{|y\mapsto y+P}=\frac{\partial Y}{\partial x}\), i.e., \(\frac{\partial Y}{\partial x}\) is periodic in y. Hence \(\widehat{\tfrac{\partial Y}{\partial x}}\) defined as in (1.22) is periodic in Y by (2.2), and the same holds for \(\tfrac{\partial Y}{\partial t}\) and \(\tfrac{\partial Y}{\partial y}\).
Next, the positivity of a and the \(C^0_B\) bound for c imply a positive lower bound for the coefficient \(\left\langle {\frac{{a}_{(0)}}{{{{c}_{(0)}}}}}\right\rangle _{\!\!y}\) of \(\mu _t\) in (1.19). Hence the PDE and initial condition there imply that for sufficiently small times the transformation from t and x to \(\tau \mathrel {:=}\mu (t,x)\) and x is invertible, and in view of the assumed smoothness of the coefficients both \(\mu (t,x)\) and its inverse function \(t(\tau ,x)\) belong to \(C^{s+1}\).
We now calculate that the function u defined by (1.31) will satisfy (1.10) provided that \(U(\tau ,x,z,\eta )\) satisfies the PDE
where A and D are defined in (1.23)–(1.24). The initial data for U is
The positivity of a, together with the PDE and initial condition satisfied by \(\mu \) and the fact that a depends on U only through \(\varepsilon U\) imply that A is positive up to some positive time T independent of \(\varepsilon \). Hence we can divide (2.4) by A to obtain a PDE in which both the time derivative and the large term have constant coefficients. Standard \(C^0\) estimates along characteristics, similar to those in [12, Sect. 6.2], then show that U and its spatial derivatives through order s are uniformly bounded up to some time independent of \(\varepsilon \). Moreover, since the initial data is independent of \(\eta \), \(U_\tau \) is uniformly bounded initially, and then similar estimates show that \(U_\tau \) is uniformly bounded in \(C^{s-1}\). Standard results for singular limits [22, Theorem 2.1, Theorem 2.3, Corollary 2.4], adapted to \(C^s\) spaces rather than the standard \(H^s\) spaces, therefore yield the convergence of U to a limit that is independent of \(\eta \) and satisfies the limit PDE obtained by averaging over \(\eta \) the PDE obtained by dividing (2.4) by A, thereby eliminating the large term, and taking the limit of the result as \(\varepsilon \rightarrow 0\). Since shifting the independent variable preserves averages over that variable, for any function Q
In addition, the standard results just cited show that under the extra smoothness assumption the solution exists for as long as the limit exists and converges at the rate \(O(\varepsilon )\) to that limit. Translating those results back from U to u yields the conclusions of the theorem. \(\square \)
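The shift-invariance of averages invoked in the proof above is elementary but worth a concrete check. The sketch below (the periodic function Q and the shift are illustrative choices made here) verifies that averaging \(Q(\eta +c)\) over one period gives the same value as averaging \(Q(\eta )\):

```python
import math

P = 1.0  # period in eta

def Q(eta):
    """An arbitrary P-periodic test function."""
    return math.exp(math.sin(2 * math.pi * eta / P))

def average(f, shift=0.0, n=10000):
    """Average of f(eta + shift) over one period [0, P] (midpoint rule,
    which converges rapidly for smooth periodic integrands)."""
    h = P / n
    return sum(f((k + 0.5) * h + shift) for k in range(n)) * h / P

# Shifting the independent variable preserves the average over a period.
assert abs(average(Q, shift=0.37) - average(Q)) < 1e-10
```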
3 Scalar parabolic PDE
Define
We look for a solution u having the form (1.39), where \(\mu \) is the unique solution of (1.33) and \(U(t,x,z,\varepsilon )\) will be the solution of
satisfying \(U(0,x,z,\varepsilon )=u_0(x,z)\), where D and K were defined in (1.34), (1.35).
Proof of Theorem 1.3
Plugging (1.39) into (1.11) yields
which by the definitions of \(\mu \) in (1.33), of D in (1.34), and of G and H in (3.1), and K in (1.35) reduces to (3.2). This shows that if \(\mu \) satisfies (1.33) and U satisfies (3.2) then the function u defined by (1.39) satisfies (1.11).
Since equation (1.33) is linear, the solution \(\mu \) belongs to \(C^{s+2}_B\) for all time. Note also that \(\mu \) does not depend on \(\varepsilon \). Hence the coefficients of (3.2) belong to \(C^{s}_B\). Moreover, equation (3.2) for U is nonuniformly parabolic in the sense used in [25], i.e., for all real \(\alpha \) and \(\beta \) and all values of the arguments \((t,x,U,\varepsilon )\),
where the inequality on the right-hand side follows from the parabolicity assumption (1.36) with \({\tilde{\alpha }}=\varepsilon \alpha -\beta \mu _x\) and \({\tilde{\beta }}=\beta \). In particular, taking \(\alpha \) equal to zero shows that
The assumptions of Theorem 1.3 imply that the conditions of [25, Theorem 4.1] hold, so by that theorem there exists a unique solution U of (3.2) having an \(H^s\) bound on some time interval [0, T]. Since the coefficients in (3.2) depend smoothly on \(\varepsilon \), that bound and time of existence are independent of \(\varepsilon \) for \(0<\varepsilon \le \varepsilon _0\).
Hence the right-hand side of (1.39) exists for at least a time independent of \(\varepsilon \) and satisfies (1.11) with initial data \(u_0\). Since [25, Theorem 4.1] can also be applied to (1.11) for fixed \(\varepsilon \) and that theorem includes a uniqueness result, the identity (1.39) must hold.
Similarly, in view of (3.3), the PDE (1.37) also satisfies the assumptions of [25, Theorem 4.1], so under the additional assumptions of the theorem the solution \(U^{(0)}\) of that equation exists and belongs to \(H^s\) on some time interval \([0,T^{(0)}]\).
To prove the error estimate, note that the corrector \(U^{(1)}=\frac{U-U^{(0)}}{\varepsilon }\) satisfies
where for any function M the coefficient \(\delta M\) is defined to be \(\frac{M-{M}_{(0)}}{\varepsilon }\). The coefficients D, f, and K involve U without a factor of \(\varepsilon \), i.e., \(\varepsilon \) multiplies neither the argument U nor the function as a whole. Hence \(\delta M\) with \(M\in \{D,f,K\}\) include terms of the form \(\frac{M(t,x,U,\varepsilon )-M(t,x,U^{(0)},0)}{\varepsilon }\), which in view of the fact that \(U=U^{(0)}+\varepsilon U^{(1)}\) can be written as
After subtracting \(\frac{M(t,x,U,0)-M(t,x,U^{(0)},0)}{\varepsilon }\) from R on the right side of (3.4) and subtracting the equivalent expression \(\left[ \int _0^1 M_U(t,x,(1-s) U^{(0)}+ s U)\,ds\right] U^{(1)}\) from the left side of that equation, the left side of the modified equation is linear in \(U^{(1)}\) and the modified R contains no explicit dependence on \(U^{(1)}\) but only on U and \(U^{(0)}\). Moreover, the assumed smoothness of the coefficients together with the fact that U and \(U^{(0)}\) belong to \(H^s\), plus estimates like that in (3.5) but without needing to express U in terms of \(U^{(0)}\) and \(U^{(1)}\), ensure that the modified R is bounded in \(H^{s-2}\) by a constant independent of \(\varepsilon \). Since \(U^{(1)}\) equals \(\frac{U-U^{(0)}}{\varepsilon }\) it certainly exists and belongs to \(H^s\) on \([0,\min (T,T^{(0)})]\), and the coefficients of the equation it satisfies belong to \(C^{s-2}\), while the inhomogeneous term belongs to \(H^{s-2}\). Hence we can apply [25, Theorem 2.7] with s replaced by \(s-2\) to obtain a uniform bound for \(U^{(1)}\). Moreover, since (3.4) is linear in \(U^{(1)}\), that bound holds on \([0,\min (T,T^{(0)})]\), which yields (1.40) and hence also (1.38). \(\square \)
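The splitting of the difference quotient used in the error estimate above can be written out explicitly; the following is a sketch consistent with the description in the proof, using \(U=U^{(0)}+\varepsilon U^{(1)}\) and the fundamental theorem of calculus in the last term:

```latex
\frac{M(t,x,U,\varepsilon)-M(t,x,U^{(0)},0)}{\varepsilon}
 = \frac{M(t,x,U,\varepsilon)-M(t,x,U,0)}{\varepsilon}
 + \left[\int_0^1 M_U\bigl(t,x,(1-s)U^{(0)}+sU\bigr)\,ds\right] U^{(1)}.
```

Subtracting the integral term from the left side of the equation for \(U^{(1)}\), and the equal quantity \(\frac{M(t,x,U,0)-M(t,x,U^{(0)},0)}{\varepsilon }\) from R, is exactly the modification performed in the proof.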
4 Symmetric hyperbolic system
Proof of Theorem 1.5
The conditions on the fast phases \(\mu ^{(j)}\) in (1.41) and (1.50) together with assumption (1.43) and the boundedness of the coefficients of (1.12) ensure that the variable \(\tau \) defined by (1.44) satisfies
Hence there is a positive \(T_1\) such that the change of variables \((t,x)\mapsto (\tau ,x)\) is one to one for \(0\le t\le T_1\) and a positive \({\widetilde{T}}_1\) such that its inverse \(t(\tau ,x)\) is defined and one to one for \(0\le \tau \le {\widetilde{T}}_1\). Moreover, (1.42)–(1.43) plus (1.52) ensure that \({{\widehat{A}}}_{(0)}{\big |_{t=0}}\ge cI\) for some positive c, which by the smoothness of A ensures that that condition continues to hold up to some positive time with a possibly smaller yet still positive constant.
We look for solutions having the form
Substituting (4.2) into (1.12) shows that u will satisfy (1.12) provided that \(U(\tau ,x,z_1,z_2,\varepsilon )\) satisfies
where the \(\;\widehat{}\;\) coefficients are defined in (1.45).
The first simple yet key observation is that by construction
Applying \(({{\widehat{A}}}_{(0)})^{-1/2}\) on both the left and right of both sides of (4.4) yields
where the \(\;\widetilde{}\;\) coefficients are defined in (1.46). Equation (4.5) implies that the eigenvectors \(r^{(j)}(\tau ,x)\) of \({\widetilde{C}}^{(1)}\) defined after (1.46) are also eigenvectors of \({\widetilde{C}}^{(2)}\). Also, since the matrix \({\widetilde{C}}^{(1)}\) is symmetric, the set of its orthonormal eigenvectors \(r^{(j)}(\tau ,x)\) forms a basis, and hence the matrix R whose columns equal those eigenvectors is an orthogonal matrix.
The second simple key observation is that not only does \(\det {{\widetilde{C}}}^{(j)}=0\) for \(j=1,2\) hold on account of the definition of the \(\mu ^{(j)}\), but more generally the assumption (1.49) implies that
After relabeling if necessary, and allowing repeated solutions \(\mu ^{(j)}\) of (1.41) and \(\alpha ^{(j)}\) of (1.49), the eigenvectors \(r^{(j)}\) are the null eigenvectors of the matrices appearing inside the determinant in (4.6). Combining the resulting equation
with (4.5) shows that
The identities (4.8) together with the constancy of the \(\alpha ^{(j)}\) ensure that the multiplicity of each eigenvalue of the \({{\widetilde{C}}}^{(j)}\) is constant for sufficiently small times, and hence that the \(r^{(j)}(\tau ,x)\) can be chosen to be as smooth as the matrices \({{\widetilde{C}}}^{(j)}\). Hence the matrices \(\mathcal {A}\), \({\mathcal {B}}\), \({\mathcal {C}}^{(j)}\), and \({\mathcal {D}}\), and the vector \({\mathcal {F}}\) defined in (1.47)–(1.48) also belong to \(C^s\). Moreover, the identities (4.8) then imply that
Hence, upon substituting
into (4.3) and multiplying the result by \(R^T({{\widehat{A}}}_{(0)})^{-1/2}\) we obtain that \({\mathcal {U}}\) satisfies
where
Formulas (1.46) and (1.47) together with the orthogonality of the matrix R ensure that
while substituting (4.9) into (4.12) shows that definition (4.12) agrees with the previous definition (1.51) of \({\mathcal {L}}\); in particular, \({\mathcal {L}}\) has constant coefficients.
We now discuss the initial data \({\mathcal {U}}_0\) for the system (4.11). The condition that \({\mathcal {U}}_0\) correspond to the initial data \(u_0(x,y)\) of the original system (1.12) is
The third and final key observation is that the non-uniqueness of \({\mathcal {U}}_0\) arising from the two occurrences of y on the left side of (4.14) can be utilized to make \({\mathcal {U}}_0\) satisfy
which will ensure that
Formula (1.51) implies that in order to obtain (4.15) the first component of \({\mathcal {U}}\) should be independent of \(z_2\), its second component should not depend on \(z_1\), and in general component j should depend on \(z_1\) and \(z_2\) only via the combination \(\alpha ^{(j)} z_1+(1-\alpha ^{(j)} )z_2\). Hence, by construction, the formula for \({\mathcal {U}}_0\) in (1.54) satisfies both (4.14) and (4.15).
Although the matrix \({\mathcal {A}}\) multiplying the time derivatives in (4.11) depends in general on \(\tau \) and x as well as \(\varepsilon {\mathcal {U}}\), the fact that \({{\mathcal {A}}}_{(0)}=I\) together with the smoothness of the dependence of \({\mathcal {A}}\) on its arguments ensures that \({\mathcal {A}}_\tau \) and \({\mathcal {A}}_x\) are \(O(\varepsilon )\), just as \(\nabla _{\!{\mathcal {U}}}{\mathcal {A}}=O(\varepsilon )\). Together with the fact that the large operator has constant coefficients, this implies that for some positive time \(T_2\) standard energy estimates for the system (4.11) yield a bound for \(\Vert {\mathcal {U}}\Vert _{H^s}\) that is uniform in \(\varepsilon \), as in the classical theorem [18, Theorem 2.3] for singular limits that assumes that \({\mathcal {A}}\) depends only on \(\varepsilon \mathcal {U}\). Furthermore, the uniform bound (4.16) on the initial value of \({\mathcal {U}}_\tau \) implies, again as in the classical case, a uniform bound for the \(H^{s-1}\) norm of \(\mathcal {U}_\tau \) on the same time interval. The proof of the convergence result for the classical case [18, Theorem 2.3 and comments in proof], [21, Theorem 2] remains valid when the limit is given in the form (1.54) even without the assumption [21, (2.7)] on the rank of the large operator. This shows that as \(\varepsilon \rightarrow 0\), \({\mathcal {U}}\) converges in \(H^r\) for \(r<s\) to the unique solution of the initial-value problem (1.54). As in the proof of Theorem 1.1, [22, Corollary 2.4] shows that the rate of convergence is \(O(\varepsilon )\), i.e. (1.56) holds. Transforming back to the original variables yields (1.53). \(\square \)
5 Special \(2\times 2\) systems with y-dependence
Proof of Theorem 1.7
The uniform positivity of \(A_{(0)}\) together with the uniform invertibility of C ensures that the eigenvalues of \(A_{(0)}^{\frac{1}{2}}C^{-1}A_{(0)}^{\frac{1}{2}}\) are bounded away from zero. The continuity of those eigenvalues together with (1.60) therefore ensures that
each have a fixed sign. In turn, this ensures that the averages over y of those eigenvalues and of their difference have the same fixed signs as before averaging, and are bounded away from zero. We can therefore replace some or all of the expressions in (1.61) by their averages and that result will still hold. Using this fact we will prove the claims made during the presentation of the definitions used in the theorem, namely that \(\tau \) is invertible with respect to the variable t, that Q is invertible with respect to y, and that \({{\widehat{A}}}_{(0)}\) is positive definite at time zero. In addition, we will show that \(\tau \) is identically zero when \(t=0\). In particular, we will show that up to some positive time
which by the definition (1.62) of \(\tau \) implies the desired invertibility of \(\tau \).
The null initial condition for \(\mu ^{(k)}\) in (1.58) together with the smoothness of the coefficients in the PDE for \(\mu \) there ensure that \(\frac{\mu ^{(k)}}{t}\) remains bounded as \(t\rightarrow 0\) and satisfies
where we have used the fact that matrices that are similar have the same eigenvalues. In particular, the eigenvalues of \(C^{-1}A_{(0)}{\big |_{t=0}}\) are real, bounded, and bounded away from zero, and by (1.60) and the discussion above their difference and the difference of their averages is at least a fixed constant. In particular, \(\mu ^{(k)}_t(0,x)\) is bounded and bounded away from zero. Since real eigenvalues of a \(2\times 2\) matrix with real coefficients can become complex only after they coalesce, condition (1.60) then ensures that up to some time independent of \(\varepsilon \) the eigenvalues of the matrix appearing on the right sides of (1.57)–(1.58) are distinct and real, even though that matrix is not necessarily conjugate to a symmetric matrix for \(t\ne 0\). The smoothness of the coefficient matrices together with (5.2) then ensures that the bounds on \(\mu \) and Y in the second inequality in (5.1) hold. Moreover, once \(Q_y\) is shown to be bounded and bounded away from zero the bounds in (5.1) for \({\mathcal {Y}}\) will also hold. Moreover, by (1.60) plus the discussion above and the smoothness of \(\mu \), (5.2) implies that the first inequality in (5.1) holds.
Substituting (5.2) into (1.57) and using once more the initial condition satisfied by \(\mu ^{(k)}_x\) shows that
Substituting (5.2)–(5.3) into the definition (1.64) of Q and using the assumptions (1.52), (1.79), (1.60), and the \(C^0_B\) bounds on the coefficient matrices A and C shows that Q remains bounded as \(t\rightarrow 0\) and
which by (1.61) plus the discussion above is bounded from below by a positive constant. The smoothness of \(\mu \) and the coefficient matrices A and C together with (5.2) implies that \(Q_y\) is \(C^s\); hence \(Q_y\) remains positive, so Q is invertible with respect to y, at least up to some positive time, and \(Q^{-1}\) is \(C^s\) up to that time. As noted above, this implies the estimates for \({\mathcal {Y}}\) in (5.1).
To show that \(\tau (0,x,y,\varepsilon )\equiv 0\), subtract (5.3) with \(k=1\) from the same equation with \(k=2\) to obtain
which by assumption (1.80) is identically zero. Integrating (5.5) with respect to y and using the second condition in (1.57) or its alternative (1.59) then yields \(Y^{(1)}(0,x,y)-Y^{(2)}(0,x,y)\equiv 0\). Hence, in view of the initial condition \(\mu ^{(k)}(0,x)\equiv 0\) from (1.58), and the definition (1.62) of \(\tau \), \(\tau (0,x,y,\varepsilon )\) is indeed identically zero. In addition, the first estimate in (5.1) together with the definition of \(\tau \) implies that \(\tau >0\) when \(t>0\).
We next show that \({\widehat{A}}_{(0)}\) is positive definite at time zero, and hence also up to some positive time. Using once more the fact that the initial condition for \(\mu \) from (1.58) implies that \(\mu ^{(k)}_x(0,x)\equiv 0\) together with the formulas (5.2)–(5.3) for the initial values of \(\mu ^{(k)}_t\) and \(Y^{(k)}_y\) yields the formula
in which \(\lambda _k\) denotes \(\lambda _k\big (A_{(0)}^{\frac{1}{2}}C^{-1}A^{\frac{1}{2}}_{(0)}\big )\). Since the matrix \(A_{(0)}^{-\frac{1}{2}}C A_{(0)}^{-\frac{1}{2}}\) appearing in the last line of (5.6) is the inverse of \(A_{(0)}^{\frac{1}{2}}C^{-1}A^{\frac{1}{2}}_{(0)}\), the eigenvalues of the matrix M appearing in (5.6) are
which reduce to
The normalization (1.61), together with the fact shown above that any of the expressions in that inequality may be replaced by their averages, ensures that both eigenvalues of the symmetric matrix M are positive, and hence that matrix is positive definite. Since \(A_{(0)}\) is also positive definite by assumption (1.52), (5.6) shows that \({\widehat{A}}_{(0)}\) is positive definite at time zero, and hence also up to some positive time.
We now turn to analyzing the matrices \({{\widetilde{C}}}_{(0)}^{(k)}\) and \({{\mathcal {C}}}_{(0)}^{(k)}\). Substituting (1.57) into the definition of \({\widehat{C}}^{(k)}\) in (1.72), using definition (1.73), and pulling out a factor of \(\mu ^{(k)}_tC\) from the result yields
which shows that the matrices \({{\widehat{C}}}_{(0)}^{(k)}\), and hence also the matrices \({{\widetilde{C}}}_{(0)}^{(k)}\), are singular, i.e., there exist vectors \(r^{(k)}(\tau ,x,z_1,z_2)\) such that
In similar fashion to (4.4)–(4.5), the identities
hold by construction. Multiplying the second identity in (5.10) by \(r^{(k)}\) on the right and using (5.9) yields
Hence both \(r^{(k)}\) are eigenvectors of \({\widetilde{C}}_{(0)}^{(1)}\), so their definition here is consistent with their definition before the statement of the theorem. Moreover, (5.9) and (5.11) show that the eigenvalues of \({{\widetilde{C}}}_{(0)}^{(1)}\) are distinct, so the \(r^{(j)}\) are as smooth as \({{\widetilde{C}}}_{(0)}^{(1)}\), namely \(C^s\). Since \({{\widetilde{C}}}_{(0)}^{(1)}\) is symmetric the vectors are orthogonal, so after normalizing them to have length one the matrix R whose columns are the \(r^{(j)}\) is orthogonal, i.e.,
Moreover, (5.9)–(5.12) imply that
which are the analogues of (4.9) and (4.13).
We look for solutions of (1.13) having the form
Substituting (5.14) into (1.13) and using the definitions (1.62) and (1.72) shows that u defined by (5.14) will satisfy (1.13) provided that U satisfies
Making the change of variables
multiplying the resulting equation by \(\frac{1}{\rho }R^T\widetilde{A}_{(0)}^{-\frac{1}{2}}\), and using assumption (1.82) yields the system (4.11), where (4.12) and (4.13) again hold, but now the coefficients \({\mathcal {A}}\), \({\mathcal {B}}\), \({\mathcal {D}}\) and \({\mathcal {F}}\) depend on \(z_1\) and \(z_2\) in addition to \(\tau \) and x, and, by (5.13),
Since assumption (1.82) has eliminated the bad terms (1.76) that would otherwise be present in (4.11) for system (1.13), the remainder of the proof is then the same as for the case of coefficients independent of y given in Sect. 4, with one minor exception: Since the function \(\tau (t,x,y,\varepsilon )=\mu ^{(1)}(t,x)-\mu ^{(2)}(t,x)+\varepsilon [Y^{(2)}(t,x,y)-Y^{(1)}(t,x,y)]\) defined in (1.62) is \(C^1\), we have replaced \(\tau (t,x,y,\varepsilon )\) by \(\tau (t,x,y,0)\) in the estimates (1.83) and (1.56) so as to make the only dependence on \(\varepsilon \) in the profile appear in the phases, because the error induced by that replacement has the same size \(O(\varepsilon )\) as the error of those estimates. \(\square \)
6 Counterexamples and examples
6.1 Nonuniform existence of solutions to (1.10)
Since the coefficient a of \(u_t\) is always assumed to be positive, it is always possible to divide the scalar PDE (1.10) by that coefficient, which replaces the coefficients b and c by \(\tfrac{b}{a}\) and \(\tfrac{c}{a}\). When looking for counterexamples it therefore suffices to consider the coefficients b and c. We begin with two preliminary examples for equations with stronger dependence on the dependent variable than is allowed in (1.10); the first example is classical.
Example 6.1
(c depends on u) In the standard example \(u_t+\frac{c(u)}{\varepsilon }u_y=0\) with initial data \(u_0\) for which \(c'(u_0)u_0'\) takes O(1) negative values, the y-derivative of the solution blows up in finite time, and more specifically at a time \(O(\varepsilon )\) since the small parameter can be scaled into the time.
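The \(O(\varepsilon )\) blowup time in Example 6.1 can be reproduced by the method of characteristics: u is constant along characteristics, and differentiating the implicit solution gives \(u_y(t)=u_0'/(1+(t/\varepsilon )c'(u_0)u_0')\), so the gradient first blows up at \(t^*=\varepsilon /\max _{y_0}[-c'(u_0)u_0']\). The sketch below uses the illustrative choices \(c(u)=u\) and \(u_0(y)=\sin y\) (assumptions made here for concreteness, not taken from the text):

```python
import math

def blowup_time(eps, n=10000):
    """First gradient-blowup time t* = eps / max_y0[-c'(u0) u0']
    for u_t + (c(u)/eps) u_y = 0 with c(u) = u and u0(y) = sin y,
    so that -c'(u0(y)) u0'(y) = -cos y, whose maximum is 1."""
    worst = max(-math.cos(2 * math.pi * k / n) for k in range(n))
    return eps / worst

# The blowup time is proportional to eps, i.e. it is O(eps).
assert abs(blowup_time(1e-2) - 1e-2) < 1e-9
assert abs(blowup_time(1e-3) - 1e-3) < 1e-9
```

This linear scaling in \(\varepsilon \) is exactly the rescaling-of-time argument mentioned in the example.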
Example 6.2
(b depends on u) Since b is not multiplied by \(\frac{1}{\varepsilon }\) in (1.4), the scaling argument of Example 6.1 is not applicable. Moreover, classical singular limit theory yields uniform existence when b depends on u, provided that c is independent of x. We therefore let b be a bounded positive function of u whose first derivative is nonzero at some point and let c be a bounded function of x whose first derivative is nonzero at some point. In similar fashion to [16, (3.5)], solving the characteristic equations for the PDE
shows that solutions of the initial-value problem for that PDE can be written in the implicit form
Proceeding as in that reference by taking the y derivative of (6.2) and solving the result for \(u_y\) yields the formula
The Taylor expansion formula for C(x) around the point \(x-b(u)t\) shows that the expression \(C(x)-C(x-b(u)t)-c(x-b(u)t)b(u)t\) equals \(\frac{1}{2} c'(x-\theta b(u)t)b(u)^2t^2\) for some \(\theta \in (0,1)\). Hence if \(b'(u_0(x,y))(u_0(x,y))_yc'(x)>0\) at some point then (6.3) implies that \(|u_y|\) will become infinite at some point at a time of order \(O(\sqrt{\varepsilon })\). Note that the more subtle mechanism of blowup as compared to Example 6.1 manifests itself in the blowup being less rapid than in that example. A similar calculation shows that \(u_x\) will also become infinite.
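The Taylor step above can be spelled out. Writing C for an antiderivative of c (as the expression in the text indicates, \(C'=c\)), second-order Taylor expansion with Lagrange remainder about the point \(x-b(u)t\) gives, for some \(\theta \in (0,1)\),

```latex
C(x) = C\bigl(x-b(u)t\bigr) + c\bigl(x-b(u)t\bigr)\,b(u)t
     + \tfrac{1}{2}\,c'\bigl(x-\theta\,b(u)t\bigr)\,b(u)^2 t^2,
```

so that \(C(x)-C(x-b(u)t)-c(x-b(u)t)\,b(u)t=\tfrac{1}{2}\,c'(x-\theta b(u)t)\,b(u)^2t^2\). Since this remainder is quadratic in t, it is consistent with blowup occurring at times of order \(\sqrt{\varepsilon }\) rather than \(\varepsilon \).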
Moreover, estimates for u, \(u_y\), and \(\sqrt{\varepsilon }u_x\) along characteristics, with time rescaled by \(t=\sqrt{\varepsilon }\tau \), show that solutions of the equation obtained from the prototypical PDE (1.4) by replacing b there with b(t, x, u) exist for at least a time \(O(\sqrt{\varepsilon })\). Similar estimates are presented in detail in Example 6.3. Hence the blowup time obtained for equations in which b depends on u is sharp. Note that the scaling used to obtain that lower bound on the existence time is different from the scaling from [12, Proposition 6.1.1] mentioned in the introduction.
Example 6.3
(b depends on y, and c vanishes somewhere) Consider the PDE
which is a special case of (1.10). Of course, in order for Theorem 1.1 not to apply, the coefficients must fail to satisfy at least one of the hypotheses of that theorem. The construction here depends on c vanishing at some point, which violates assumption (1.17). The fast blowup of solutions will now require that three coefficients depend on particular variables, and the blowup will be even less rapid than in Example 6.2, albeit only slightly. For Eq. (6.4) it is not possible to solve the characteristic ODEs, even to obtain an implicit formula for the solution like (6.2), except in the special case when \(b(y)=k_1y\) and \(c(x)=k_2x\). Nevertheless, it is still possible to derive ODEs for \(u_x\) and \(u_y\) and use them to prove that under certain conditions the time of breakdown tends to zero with \(\varepsilon \), in similar fashion to the calculations in [16, pp. 35–36] for the case of an equation in one spatial dimension not containing a parameter \(\varepsilon \). Taking the x or y derivative of (6.4) and defining the directional derivative along characteristics \(D_t v\mathrel {:=}v_t+bv_x+\tfrac{c}{\varepsilon }v_y+dv_y\) yields the system
We consider here the case when there exist \(x_*\), \(y_*\), and \(u_*\) such that \(c(x_*)=0\), \(b(y_*)=0\), and \(d(u_*)=0\) while \(c'(x_*)<0\), \(b'(y_*)<0\) and \(d'(u_*)<0\), and take smooth bounded initial data \(u_0\) such that \((u_0)_y(x_*,y_*)>0\). Then the characteristic through the point \((x_*,y_*,u_*)\) remains at that point for all time. Define
then on the characteristic through \((x_*,y_*,u_*)\) (6.5) becomes
In terms of the eigenmodes
of the linear part of the system (6.7), that system becomes
Since \(Q(0)>0\) by assumption and \(P(0)=O(\sqrt{\varepsilon })\), for sufficiently small \(\varepsilon \), we have \(R_\pm (0)>0\). Hence the ODEs (6.9) imply that
at least until \(R_+\ge \frac{\delta }{\sqrt{\varepsilon }}\) for a certain positive \(\delta \), which happens after a time \(T=O(\sqrt{\varepsilon }\ln \tfrac{1}{\varepsilon })\). The differential inequality for \(R_-\) in (6.10) implies that \(R_-=O(\sqrt{\varepsilon })\) at that time. This implies via (6.8) that both P and Q are positive and at least \(\frac{\delta }{2\sqrt{\varepsilon }}\) at time T. The ODEs (6.7) then imply that both P and Q remain positive at later times, so the ODE for Q there implies that
Since \(Q(T)\ge \frac{\delta }{2\sqrt{\varepsilon }}\), (6.11) implies that Q becomes infinite at a time \(T_*=T+O(\sqrt{\varepsilon })=O(\sqrt{\varepsilon }\ln \tfrac{1}{\varepsilon })\).
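The final step, that Q blows up within an additional time \(O(\sqrt{\varepsilon })\) once \(Q(T)\ge \frac{\delta }{2\sqrt{\varepsilon }}\), rests on comparison with the Riccati equation \(\dot{Q}=Q^2\), whose solution blows up at time \(1/Q(0)\). The following sketch (a toy check, not the system (6.7) itself) confirms this scaling numerically.

```python
# Sketch: once Q(T) >= delta / (2 sqrt(eps)) and dQ/dt >= Q^2, comparison
# with dQ/dt = Q^2 (blowup at time 1/Q(0)) gives blowup within an additional
# time 2 sqrt(eps) / delta = O(sqrt(eps)).  Confirm the scaling by explicit
# Euler integration of the comparison ODE.
import math

def riccati_escape_time(q0, factor=100.0, n_steps=200_000):
    """Time for dQ/dt = Q^2, Q(0) = q0 to reach factor*q0 (exact: (1 - 1/factor)/q0)."""
    t_exact = (1.0 - 1.0 / factor) / q0
    dt = t_exact / n_steps * 1.02   # step small enough to resolve the growth
    q, t = q0, 0.0
    while q < factor * q0:
        q += dt * q * q
        t += dt
    return t

delta = 1.0
# Escape times for Q(0) = delta / (2 sqrt(eps)); these scale like sqrt(eps).
times = [riccati_escape_time(delta / (2 * math.sqrt(eps))) for eps in (1e-2, 1e-4)]
```

Decreasing \(\varepsilon \) by a factor of 100 decreases the escape time by a factor of 10, as the \(O(\sqrt{\varepsilon })\) scaling predicts; combined with \(T=O(\sqrt{\varepsilon }\ln \tfrac{1}{\varepsilon })\) this gives the stated \(T_*\).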
6.2 Boundary layers in solution to (1.10) when \(c=0\)
Example 6.4
(Boundary layer in solution to (1.10)) The assumption (1.17) for Theorem 1.1 requires that the coefficient c of the large term in (1.10) be bounded away from zero. Example 6.3 showed that when that assumption does not hold and in addition the coefficient b depends on y then the time of existence of the solution may tend to zero with \(\varepsilon \). We now show that even when the coefficients a, b, and c are independent of y the vanishing of the coefficient c may change the asymptotics of the solution from those given in Theorem 1.1. Specifically, boundary layers can appear.
We first note that uniform existence can be proven for solutions of the special case
of (1.10), assuming only that the coefficients belong to \(C^1\) and a satisfies (1.9). This can be shown via a slight extension of the method of [12, Propositions 6.1.1–6.1.3], using the method of characteristics and the \(\varepsilon \)-weighted \(C^1\) norm \(\Vert u\Vert _{C^0}+\Vert u_y\Vert _{C^0}+\varepsilon \Vert u_x\Vert _{C^0}\), because dynamic estimates for those norms are obtained with coefficients that are bounded in \(\varepsilon \). Similar estimates are presented in detail in Sect. 6.4 for the case when the coefficients do depend on y but c does not vanish, except that the estimate for \(u_y\) there is obtained by solving the PDE for that expression rather than dynamically.
Now specialize further to the equation
where f is assumed to be periodic in y, as in Theorem 1.1, and can be normalized to have mean zero by adjusting k. When c(x) is bounded away from zero as required by (1.17) then Theorem 1.1 shows that the asymptotics of the solution having initial data \(u_0(x,y)\) are
where U satisfies
Dividing (6.15) by c(x) and solving the result yields
and substituting this back into (6.14) yields
We now compare the asymptotic solution (6.17) of (6.13) predicted by Theorem 1.1 with the exact solution of that equation obtained by the method of characteristics, namely
When c(x) is bounded away from zero then the final term on the right side of the equation in (6.18) is of size \(O(\varepsilon )\) everywhere, in accordance with (6.17). However, the identity
shows that if c(x) vanishes at one or more points then for \(t>0\) the final term on the right side of the equation in (6.18) will be of order one in regions where \(c(x)=O(\varepsilon )\), and hence the leading asymptotics will be different than (6.17). In other words, a boundary layer appears.
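The size dichotomy just described can be illustrated with a model expression having the structure exhibited by the identity: a factor \(\frac{\varepsilon }{c}\) multiplied by a quantity that is O(1) in general but \(O(c t/\varepsilon )\) when \(ct/\varepsilon \) is small. The sketch below uses the model term \(g(c)=\frac{\varepsilon }{c}\bigl (1-e^{-ct/\varepsilon }\bigr )\); this is an illustration of the mechanism, not formula (6.18) itself.

```python
# Sketch: a model term g(c) = (eps/c) * (1 - exp(-c t / eps)) is O(eps)
# where c is bounded away from zero, but order one where c = O(eps),
# illustrating the boundary-layer mechanism described in the text.
import math

def g(c, eps, t=1.0):
    return (eps / c) * (1.0 - math.exp(-c * t / eps))

eps = 1e-4
away = g(1.0, eps)    # c bounded away from zero: size O(eps)
layer = g(eps, eps)   # c = O(eps): size O(1), i.e. a boundary layer
```

For \(c=1\) the term has size about \(\varepsilon \), while for \(c=\varepsilon \) it equals \(1-e^{-t}\), which is of order one: exactly the change in the leading asymptotics that signals the boundary layer.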
6.3 Examples and comparison to geometric optics
Example 6.5
(\(2\times 2\) system and a \(3\times 3\) variant) Consider the system
where \(c(x)>0\) and b(x) is not identically zero. For system (6.19), Eq. (1.41) for the fast phases \(\mu \) becomes
which yields the equations
for the fast phases \(\mu \). Although the solutions of (6.21) generally cannot be determined explicitly, if \(\mu \) is a solution then so is \(-\mu \). Hence formula (1.44) becomes \(\tau =2\mu (t,x)\), and the ansatz (4.2) takes the form
Substituting (6.22) into (6.19) and defining \(z_1=y-\frac{\mu (t,x)}{\varepsilon }\) and \(z_2=y+\frac{\mu (t,x)}{\varepsilon }\) yields the system
Using the formulas
yields
with
where the second forms of \(\alpha \) and \(\beta \) are obtained by using (6.21). Calculations show that
and
Hence the matrix R of eigenvectors is simply the identity matrix for this system. The change of variables (4.10) is therefore \(\left( {\begin{matrix}U\\ V\end{matrix}}\right) = \left( {\begin{matrix} \alpha &{}\beta \\ \beta &{}\alpha \end{matrix}}\right) \left( {\begin{matrix}{\mathcal {U}}\\ \mathcal V\end{matrix}}\right) \), and making that substitution yields the system
where
We take the initial conditions to be
which satisfy
in accordance with (4.15). Specializing (1.54) to system (6.19) shows that as \(\varepsilon \rightarrow 0\) the solutions of (6.27) tend to the unique solution of
and the solution of the original system (6.19) is asymptotic to \(\left( {\begin{matrix}\alpha &{}\beta \\ \beta &{}\alpha \end{matrix}}\right) \) times that limit, with
The connection between the results here for systems of the form (1.12) and the geometric optics theorem of [13] can be seen by making the change of variables \(y=\frac{{\widehat{y}}}{\varepsilon }\) in (6.19). This transforms (6.19) into
whose initial data now has the form \(\left( {\begin{matrix}u\\ v \end{matrix}}\right) {\big |_{t=0}} = \Big ({\begin{matrix}u_0(x,\frac{{\widehat{y}}}{\varepsilon })\\ v_0(x,\frac{{\widehat{y}}}{\varepsilon })\end{matrix}} \Big )\). In accordance with [13, Remark 2.3.3], the geometric optics phases are \({{\widehat{y}}}{\mp } \mu (t,x)\). Since system (6.32) together with its initial data is equivalent to the original system and initial data, the asymptotics of the solution are the same except that y is replaced by \(\frac{{\widehat{y}}}{\varepsilon }\) in (6.31). The same transformation and resulting translation of the asymptotics to the geometric optics framework holds for all systems of the form (1.12). As noted in the introduction, this yields an alternative proof of a special case of the results of [13], with the slight generalization that the matrix multiplying the time derivatives is not required to be the identity. Note that even though the order one y derivative term \(Du_y\) is transformed to \(\varepsilon Du_{{\widehat{y}}}\), that term still appears in the asymptotics of the solution, as do the order \(\varepsilon \) parts of A and B.
An example of a system larger than \(2\times 2\) for which condition (1.49) holds can be obtained by appending to system (6.19) a scalar equation having no large term, which may be coupled to (6.19) by symmetric terms involving x derivatives, symmetric terms involving order one y derivatives, and undifferentiated terms. Since the third fast phase \(\mu ^{(3)}\equiv 0\) is a constant-coefficient combination \(\frac{1}{2} \mu +\frac{1}{2} (-\mu )\) of the two fast phases \(\pm \mu \), Theorem 1.5 applies with \(\alpha ^{(3)}=\frac{1}{2}\).
Example 6.6
(Phases and necessity of condition (1.49)) One might think that the phase functions for singular limit equations are simply the \(\mu ^{(j)}\), because those functions appear multiplied by \(\frac{1}{\varepsilon }\) in the ansatz formulas (1.3), (1.31), (1.39), (4.2), and (5.14). Actually, the slow part y for equations (1.11) and (1.12) or Y(t, x, y) for (1.10) and (1.13) should be included, so that the full phase function is \(\varepsilon y-\mu ^{(j)}\) or \(\varepsilon Y(t,x,y)-\mu ^{(j)}\).
The correct definition of the phase functions can be seen from the necessity of condition (1.49). That condition is an analogue of the coherence condition [13, Definition 2.2.1 and Remark 2.2.3] for geometric optics, which says that all phases must be constant-coefficient linear combinations of a basis set. However, (1.49) requires that the fast phase functions \(\mu ^{(j)}\) for system (1.12) be convex, not arbitrary, linear combinations of the first two fast phases, i.e., must have coefficients that sum to one. The reason for that convexity requirement only becomes clear when we consider the full phase functions \(\varepsilon y-\mu ^{(j)}(t,x)\) for that system. If the phase functions were just \(\mu ^{(j)}\) then arbitrary linear combinations \(\mu ^{(j)}(t,x)=\alpha \mu ^{(1)}(t,x)+\beta \mu ^{(2)}(t,x)\) would be allowed, but for the full phases the condition \(\varepsilon y-\mu ^{(j)}(t,x)=\alpha (\varepsilon y-\mu ^{(1)}(t,x)) + \beta (\varepsilon y-\mu ^{(2)}(t,x) )\) automatically implies that \(\beta \) must equal \(1-\alpha \), as required in (1.49).
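The algebra forcing the convex combination can be verified symbolically: requiring the full-phase relation to hold identically in y forces the coefficient of y to match, which pins down \(\beta =1-\alpha \) and hence the convex form of \(\mu ^{(3)}\) appearing in (1.49). A sketch:

```python
# Sketch: requiring  eps*y - mu3 = alpha*(eps*y - mu1) + beta*(eps*y - mu2)
# to hold identically in y forces beta = 1 - alpha, and then
# mu3 = alpha*mu1 + (1 - alpha)*mu2, the convex combination in (1.49).
import sympy as sp

eps, y, mu1, mu2, mu3, alpha, beta = sp.symbols('epsilon y mu1 mu2 mu3 alpha beta')
expr = (eps * y - mu3) - (alpha * (eps * y - mu1) + beta * (eps * y - mu2))
# An identity in y means every coefficient of the polynomial in y vanishes.
eqs = sp.Poly(expr, y).all_coeffs()
sol = sp.solve(eqs, [beta, mu3], dict=True)[0]
```

Here `sol` gives \(\beta =1-\alpha \) and \(\mu ^{(3)}=\alpha \mu ^{(1)}+(1-\alpha )\mu ^{(2)}\) (assuming \(\varepsilon \ne 0\)), matching the necessity argument in the text.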
The fact that the full phases for system (1.12) are \(\varepsilon y-\mu ^{(j)}(t,x)\) also explains why precisely two independent phases are possible for that system. On account of the initial condition \(\mu ^{(j)}(0,x)\equiv 0\) for the fast phase functions, all full phases \(\varepsilon y-\mu ^{(j)}(t,x)\) reduce at time zero to the same function \(\varepsilon y\), and so span a space of dimension one. By [13, Lemma 2.3.2] the dimension of the space of phases is at most one more than the dimension of the space they span at time zero, so only two independent phases are allowed.
To see the necessity of condition (1.49), consider the system
where M(t, x) is a matrix that couples the components of the system. Assume that the fast phases \(\mu ^{(1)}(t,x)\) and \(\mu ^{(2)}(t,x)\), which by (1.41) are determined by
satisfy \(\mu ^{(1)}_t(0,x)-\mu ^{(2)}_t(0,x)\ge c>0\) in accordance with (1.42)–(1.43). The remaining solution \(\mu ^{(3)}\) of (1.41) satisfies \(\mu ^{(3)}_t=c_3(t,x), \mu ^{(3)}(0,x)\equiv 0\). Calculating the matrices \({\mathcal {C}}^{(j)}\) of the transformed system (4.11)–(4.12) for (6.33) yields
Since \({\mathcal {C}}^{(2)}-{\mathcal {C}}^{(1)}=I\) in accordance with (4.5), it suffices to determine when \(\mathcal C^{(2)}\) is a constant matrix. That holds iff \(\frac{\mu ^{(3)}_t-\mu ^{(2)}_t}{\mu ^{(1)}_t-\mu ^{(2)}_t}=\alpha \) for some constant \(\alpha \), which may be solved for \(\mu ^{(3)}_t\) to obtain \(\mu ^{(3)}_t=\alpha \mu ^{(1)}_t+(1-\alpha )\mu ^{(2)}_t\). Since all the \(\mu ^{(j)}\) are identically zero at time zero, integrating with respect to t shows that (1.49) is indeed a necessary condition for the large terms of the transformed equation to have constant coefficients.
Example 6.7
(\(2\times 2\) system with y-dependent coefficients) Define
and consider the system
Calculations show that the eigenvalues of the matrix \( A^{\frac{1}{2}}\left( {\begin{matrix} 2&{}0\\ 0&{}1 \end{matrix}}\right) ^{-1} A^{\frac{1}{2}}\) are \(\frac{2+\cos (x+y)}{2}\) and \(4+2\cos (x+y)+\sin t\sin (x-y)\), whose y-averages are one and four, respectively. At time zero,
i.e., condition (1.80) is satisfied. Since the averages of the eigenvalues are independent of x as well as of y, the solution of (1.58) is \(\mu ^{(1)}=t\), \(\mu ^{(2)}=\frac{t}{4}\). Substituting those functions into the differential equation in (1.57) and solving using the alternative normalization (1.59) then yields \(Y^{(1)}=y+\frac{\sin (x+y)}{2}\), \(Y^{(2)}=y+\frac{\sin (x+y)}{2}+\frac{\sin t\,\cos (x-y)}{4}\). Using these, we calculate from (1.62) that
which indeed vanishes identically at time zero as desired. Also, using (1.64) we obtain
whose derivative with respect to y is indeed strictly positive, in accordance with the discussion after (5.4).
Since the inverses of \(\tau \) with respect to t and of Q with respect to y cannot be calculated explicitly, we will write the transformed coefficient matrices in terms of the original variables (t, x, y). Using the formulas (1.72) and (1.73) we calculate that
and
Since the \({\widetilde{C}}^{(j)}\) already have the form desired for the \({\mathcal {C}}^{(j)}\), the matrix R is the identity matrix. Upon setting \(\rho \mathrel {:=}\sqrt{6+3\cos (x+y)+2\sin t\,\sin (x-y)}\) we obtain \(\rho \widetilde{A}^{-\frac{1}{2}}R=\left( {\begin{matrix}2&{}0\\ 0&{}\sqrt{2}\end{matrix}} \right) \), which is independent of y, so that after evaluation at \(y=\widehat{{\mathcal {Y}}}\) it will be independent of \(z_1\) and \(z_2\) as desired. Since that matrix is also independent of t and x, all the terms in the formula (1.75) for the undifferentiated term of the transformed equation vanish. The fact that R is the identity matrix also implies that \({\mathcal {B}}\) equals \({\widetilde{B}}\). The transformed system and its limit are then obtained as in Example 6.5, with the matrix \({\mathcal {B}}\) here substituted for the corresponding matrix in (6.27) and with no undifferentiated term.
6.4 Estimates along characteristics for Eq. (1.10)
Taking the x, y, or t derivative of (1.10) and defining the directional derivative along characteristics \( D_t v\mathrel {:=}av_t+bv_x+\tfrac{c}{\varepsilon }v_y+dv_y\) yields the equations
Taking first \(w=t\) and then \(w=x\) in (6.41) and multiplying the result in each case by \(\varepsilon \) yields
It would not be useful to do the same for \(w=y\), either with or without multiplying the result by \(\varepsilon \), since the resulting equation would have a large term \(\frac{1}{\varepsilon }c_y\) times either \(\varepsilon u_y\) or \(u_y\), respectively, and so would not yield a uniform estimate. Instead, since c is assumed to be bounded away from zero, we solve (1.10) for \(u_y\) to obtain
Substituting (6.43) into each equation in (6.42) yields ODEs for \(\varepsilon u_t\) and \(\varepsilon u_x\) along characteristics that have right sides of order one when those variables are of order one. The original PDE (1.10) can be written as \(D_tu=-f\), whose right side is also of order one. Those ODEs therefore yield uniform \(C^0\) bounds for u, \(\varepsilon u_t\), and \(\varepsilon u_x\) up to some time independent of \(\varepsilon \), and (6.43) then yields a uniform \(C^0\) bound for \(u_y\) up to the same time. In particular, those bounds imply that the solution of (1.10) exists for a time independent of \(\varepsilon \).
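The structure of these weighted bounds can be seen concretely in a toy special case of (1.10) with \(a=1\) and \(b=d=f=0\) (an illustrative choice, not the general equation), namely \(u_t+\frac{c(x)}{\varepsilon }u_y=0\) with \(c(x)=2+\cos x\) bounded away from zero. Its exact solution is \(u(t,x,y)=u_0\bigl (x,\,y-c(x)t/\varepsilon \bigr )\), and differentiating this formula shows that \(u_y\) and \(\varepsilon u_x\) remain bounded uniformly in \(\varepsilon \) while \(u_x\) itself grows like \(1/\varepsilon \):

```python
# Sketch: for u(t,x,y) = u0(x, y - c(x) t/eps) with u0(x,y) = sin(x) sin(y)
# and c(x) = 2 + cos(x), compute sup norms of u_y, eps*u_x, and u_x on a grid.
# The first two stay O(1) uniformly in eps; the last grows like 1/eps.
import math

def sup_derivs(eps, t=1.0, n=200):
    sup_uy = sup_eps_ux = sup_ux = 0.0
    for i in range(n):
        for j in range(n):
            x = 2 * math.pi * i / n
            y = 2 * math.pi * j / n
            th = y - (2 + math.cos(x)) * t / eps        # shifted argument of u0
            uy = math.sin(x) * math.cos(th)
            # chain rule: d/dx [sin(x) sin(th)] with d(th)/dx = sin(x) t/eps
            ux = math.cos(x) * math.sin(th) \
                 + math.sin(x) * math.cos(th) * math.sin(x) * t / eps
            sup_uy = max(sup_uy, abs(uy))
            sup_eps_ux = max(sup_eps_ux, eps * abs(ux))
            sup_ux = max(sup_ux, abs(ux))
    return sup_uy, sup_eps_ux, sup_ux

results = {eps: sup_derivs(eps) for eps in (1e-2, 1e-3)}
```

This mirrors the estimates above: the quantities controlled along characteristics (u, \(\varepsilon u_t\), \(\varepsilon u_x\), and, via (6.43), \(u_y\)) admit \(C^0\) bounds independent of \(\varepsilon \), even though unweighted derivatives do not.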
References
Alazard, T.: A minicourse on the low Mach number limit. Discret. Contin. Dyn. Syst. Ser. S 1(3), 365–404 (2008)
Beirão da Veiga, H.: A review on some contributions to perturbation theory, singular limits and well-posedness. J. Math. Anal. Appl. 352(1), 271–292 (2009)
Cheng, B.: Singular limits and convergence rates of compressible Euler and rotating shallow water equations. SIAM J. Math. Anal. 44(2), 1050–1076 (2012)
Dutrifoy, A., Majda, A.J., Schochet, S.: A simple justification of the singular limit for equatorial shallow-water dynamics. Commun. Pure Appl. Math. 62(3), 327–333 (2009)
Dutrifoy, A.: Fast averaging for long- and short-wave scaled equatorial shallow water equations with Coriolis parameter deviating from linearity. Arch. Ration. Mech. Anal. 216(1), 261–312 (2015)
Gallagher, I.: Applications of Schochet's methods to parabolic equations. J. Math. Pures Appl. 77(10), 989–1054 (1998)
Grenier, E.: Oscillatory perturbations of the Navier–Stokes equations. J. Math. Pures Appl. 76(6), 477–498 (1997)
Grenier, E.: Pseudo-differential energy estimates of singular perturbations. Commun. Pure Appl. Math. 50(9), 821–865 (1997)
Gallagher, I., Saint-Raymond, L.: Mathematical study of the betaplane model: equatorial waves and convergence results. Mém. Soc. Math. Fr. (N.S.) 107, v+116 pp. (2007)
Hunter, J.K., Majda, A., Rosales, R.: Resonantly interacting, weakly nonlinear hyperbolic waves. II. Several space variables. Stud. Appl. Math. 75(3), 187–226 (1986)
Jiang, S., Ju, Q., Li, F.: Incompressible limit of the compressible magnetohydrodynamic equations with periodic boundary conditions. Commun. Math. Phys. 297(2), 371–400 (2010)
Joly, J.-L., Métivier, G., Rauch, J.: Resonant one-dimensional nonlinear geometric optics. J. Funct. Anal. 114(1), 106–231 (1993)
Joly, J.-L., Métivier, G., Rauch, J.: Coherent and focusing multidimensional nonlinear geometric optics. Ann. Sci. École Norm. Sup. (4) 28(1), 51–113 (1995)
Klainerman, S., Majda, A.: Singular limits of quasilinear hyperbolic systems with large parameters and the incompressible limit of compressible fluids. Commun. Pure Appl. Math. 34, 481–524 (1981)
Klainerman, S., Majda, A.: Compressible and incompressible fluids. Commun. Pure Appl. Math. 35, 629–653 (1982)
Lax, P.D.: Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves. CBMS Regional Conference Series in Applied Mathematics, No. 11. SIAM, Philadelphia (1973)
Lions, P.-L.: Mathematical Topics in Fluid Mechanics. Vol. 2: Compressible Models. Oxford Lecture Series in Mathematics and its Applications, vol. 10. The Clarendon Press, Oxford University Press, New York (1998)
Majda, A.: Compressible Fluid Flow and Systems of Conservation Laws in Several Space Variables. Applied Mathematical Sciences, vol. 53. Springer, New York (1984)
Majda, A., Rosales, R.: Resonantly interacting weakly nonlinear hyperbolic waves. I. A single space variable. Stud. Appl. Math. 71(2), 149–179 (1984)
Métivier, G., Schochet, S.: The incompressible limit of the non-isentropic Euler equations. Arch. Ration. Mech. Anal. 158, 61–90 (2001)
Schochet, S.: Symmetric hyperbolic systems with a large parameter. Commun. Partial Differ. Equ. 11(15), 1627–1651 (1986)
Schochet, S.: Fast singular limits of hyperbolic PDEs. J. Differ. Equ. 114(2), 476–512 (1994)
Schochet, S.: The mathematical theory of low Mach number flows. M2AN Math. Model. Numer. Anal. 39(3), 441–458 (2005)
Schochet, S.: Singular limits of symmetric hyperbolic systems with large variable-coefficient terms. Commun. Partial Differ. Equ. 39(5), 842–875 (2014)
Schochet, S.: Sobolev estimates for non-uniformly parabolic PDEs. Partial Differ. Equ. Appl. 3(1), 25 (2022)
Funding
Open access funding provided by Tel Aviv University.
Nordmann, S., Schochet, S. Asymptotics for singular limits via phase functions. Nonlinear Differ. Equ. Appl. 31, 26 (2024). https://doi.org/10.1007/s00030-023-00918-z