1 Introduction

For a Hamiltonian action of a compact Lie group on a compact symplectic manifold, a fundamental work of Frances Kirwan [6] makes it possible to apply Morse theoretic techniques to the norm square of the momentum map. The norm square is not a Morse–Bott function; components of its critical set might not even be smooth submanifolds. Kirwan identifies a condition on a real valued function, being minimally degenerate, which is more general than the Morse–Bott condition and which nevertheless allows one to apply the machinery of Morse theory. Nowadays, this condition is sometimes called “Morse–Bott in the sense of Kirwan.”

Applying Morse theoretic arguments to the norm square of a momentum map is the main ingredient in the proof of Kirwan surjectivity, a pivotal result in equivariant symplectic geometry and geometric invariant theory, as well as in the study of the global topology of Hamiltonian compact group actions. These techniques played a central role in the mathematical confirmation of physicists’ predictions for the structure of the cohomology ring of moduli spaces of holomorphic vector bundles over Riemann surfaces [5].

Kirwan’s definition of a minimally degenerate function is not local: it requires the set of critical points to be a disjoint union of closed subsets, each of which has a neighbourhood that satisfies a certain condition. If the set of critical points is not discrete, then, in contrast to the Morse–Bott condition, a priori it is not clear if one can tell that a function is “minimally degenerate” by examining small neighbourhoods of individual critical points. This aspect of the definition makes it difficult to check whether a function satisfies this condition.

The main result of this paper is that Kirwan’s “minimally degenerate” condition actually is a local condition: if a function is minimally degenerate near each critical point, then it is minimally degenerate (Theorem 2.8). As a corollary, we obtain a Morse–Lemma-type local characterization of minimally degenerate functions (Theorem 2.9).

Section 2 contains basic definitions and the statements of our main results, Theorems 2.8 and 2.9. The main steps of the proof are formulated as Propositions 3.2, 3.4, and 3.5. Section 3 contains the statements of these propositions, followed by proofs of the main theorems. Sections 4, 5, and 6 are devoted to the proofs of these propositions. In Sect. 7, we then use Theorem 2.8 and the local normal form theorem to reprove Kirwan’s result that, for a Hamiltonian action of a compact Lie group, the norm square of a momentum map is minimally degenerate. Finally, in Sect. 8, we mention some consequences of this fact.

2 Statement of the main result

We begin by recalling Kirwan’s definitions of a minimizing manifold and of a minimally degenerate function.

Throughout this paper, we do not assume that the dimension of a manifold is constant. That is, by “manifold”, we allow a disjoint union of manifolds of different dimensions. In particular, in Definition 2.1, if C is not connected, we allow N to have connected components of different dimensions. Similarly, we allow a vector bundle over a manifold W to have different ranks over different components of W.

Definition 2.1

Let W be a smooth manifold and \(f :W \rightarrow {\mathbb {R}}\) a smooth function. Let C be a closed subset of W on which f is constant and such that every point of C is a critical point of f. A minimizing manifold for f along C is a submanifold N of W that contains C and that has the following two properties.

  (1) For each point x of C, the tangent space \(T_xN\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite.

  (2) The restriction \(f|_N\) of f to N attains its minimum exactly at the points of C.

Remark 2.2

In Definition 2.1, Condition (1) can be replaced by the following condition:

(1\('\)):

For each point x of C, there exists a splitting \(T_xW = T_xN \oplus E_x\) such that \(E_x\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite.

Definition 2.3

Let W be a smooth manifold and \(f:W \rightarrow {\mathbb {R}}\) a smooth function. We say f is minimally degenerate if its critical set is a locally finite disjoint union of closed subsets along which there exist minimizing manifolds for f.

Remark 2.4

Every Morse–Bott function is minimally degenerate. Kirwan showed that one can apply the usual techniques of Morse–Bott theory to a minimally degenerate function even if the function is not a Morse–Bott function.
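To illustrate Definition 2.3 and Remark 2.4, here is a small example that is not taken from the source; the function below is chosen for illustration only. On \(W = {\mathbb {R}}^2\) with coordinates (x, y), consider

$$\begin{aligned} f(x,y) = y^4 - x^2, \qquad df = -2x\, dx + 4y^3\, dy, \qquad {{\mathrm{Hess\,}}}f|_{(0,0)} = \left( \begin{array}{cc} -2 & 0 \\ 0 & 0 \end{array} \right) . \end{aligned}$$

The only critical point is the origin, so \(C = \{(0,0)\}\). The Hessian at the origin is degenerate, so f is not a Morse–Bott function. Nevertheless, the submanifold \(N = \{x = 0\}\) is a minimizing manifold for f along C: the Hessian evaluated on a vector \((v_x, v_y)\) equals \(-2v_x^2\), so the y-axis \(T_{(0,0)}N\) is the unique maximal subspace on which the Hessian is positive semidefinite, and \(f|_N = y^4\) attains its minimum exactly at the origin. Hence f is minimally degenerate.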

Remark 2.5

Definition 2.3 was proposed by Kirwan [6, page 6], with two differences. First, Kirwan assumes that the set of critical points is a finite disjoint union of closed subsets along which there exist minimizing manifolds for f. Second, she requires that the minimizing manifolds be co-orientable. As for the first difference, the weaker assumption “locally finite” is sufficient for our purposes; it is equivalent to Kirwan’s assumption “finite” when the manifold is compact, as is the case for complex projective manifolds, which is the case in which Kirwan was the most interested. As for the second difference, co-orientability is required for many consequences (see Remark 2.6) and is guaranteed in our main application in symplectic geometry; it is not essential, though, when we study the differential topological properties of minimally degenerate functions.

Remark 2.6

The usual definition of a Morse–Bott function does not require the negative normal bundle to a critical set to be orientable. However, to obtain Morse–Bott inequalities for ranks of cohomology groups with coefficients in a field \({\mathbb {F}}\), the negative normal bundles must be \({\mathbb {F}}\)-orientable. A minimally degenerate function for which the normal bundles to the minimizing manifolds are \({\mathbb {F}}\)-orientable similarly gives rise to Morse–Bott inequalities for cohomology with \({\mathbb {F}}\) coefficients.

Note that every vector bundle is \({\mathbb {Z}}/2{\mathbb {Z}}\)-orientable and that orientability is equivalent to \({\mathbb {Q}}\)-orientability. For further details, see [10, § 2.6].

Remark 2.7

Let M be a smooth manifold and \(f :M \rightarrow {\mathbb {R}}\) a smooth function. Suppose that f is minimally degenerate. Then, for every open subset U of M, the restriction \(f|_U\) is minimally degenerate.

Our main result is that the existence of minimizing manifolds can be checked locally. Co-orientability of the minimizing manifolds is inherently a global property; in the presence of an almost complex structure, we give a condition that guarantees co-orientability and that can be checked locally.

Theorem 2.8

Let M be a smooth manifold and \(f :M \rightarrow {\mathbb {R}}\) a smooth function. Suppose that every point in M has an open neighbourhood U such that \(f|_U\) is minimally degenerate. Then f is minimally degenerate.

Suppose in addition that there exists an almost complex structure J on M such that at every critical point of f the Hessian of f is J-invariant. Then f is minimally degenerate with co-orientable minimizing manifolds.

Tolman and Weitsman note in [12, p. 759] that minimally degenerate functions “morally \(\ldots \) look like the product of a minimum and a non-degenerate Morse–Bott function”. Our locality result, Theorem 2.8, allows us to make this description rigorous:

Theorem 2.9

Let M be a smooth n-dimensional manifold and \(f :M \rightarrow {\mathbb {R}}\) a smooth function. Then the following conditions are equivalent.

  (a) For every critical point c, there exist coordinates \(x_1,\ldots ,x_k,y_{k+1},\ldots ,y_n\) centred at c such that in a neighbourhood of c

    $$\begin{aligned} f = f({\mathbf {x}}, {\mathbf {y}}) = g({\mathbf {y}})-\sum _{j=1}^k x_j^2, \end{aligned}$$

    where g is a smooth function with minimal value g(0) and no other critical values.

  (b) For every critical point c, there exist coordinates \(x_1,\ldots ,x_k,y_{k+1},\ldots ,y_n\) centred at c such that in a neighbourhood of c

    $$\begin{aligned} f = f(\mathbf {x},\mathbf {y}) = g(\mathbf {y}) + h(\mathbf {x}), \end{aligned}$$

    where g is a smooth function with minimal value g(0) and no other critical values and h is a Morse–Bott function.

  (c) f is minimally degenerate.
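As a hedged illustration of condition (a), not taken from the source, consider the following function on \({\mathbb {R}}^3\) with coordinates \((x_1, y_2, y_3)\), so that \(k=1\) and \(n=3\):

$$\begin{aligned} f(x_1,y_2,y_3) = g(y_2,y_3) - x_1^2, \qquad g(y_2,y_3) = y_2^2 y_3^2, \qquad dg = 2 y_2 y_3^2\, dy_2 + 2 y_2^2 y_3\, dy_3 . \end{aligned}$$

The differential dg vanishes exactly where \(y_2 y_3 = 0\), and there g takes its minimal value 0, so g has no other critical values. The critical set of f is therefore \(\{ x_1 = 0,\ y_2 y_3 = 0 \}\), a union of two lines that cross at the origin, which is not a submanifold. Translating the coordinates so that they are centred at any given critical point preserves the form in (a), so Theorem 2.9 implies that f is minimally degenerate. In this example one can also check directly that \(N = \{x_1 = 0\}\) is a minimizing manifold: \(f|_N = y_2^2 y_3^2\) attains its minimum exactly on the critical set, and at each critical point the Hessian is positive semidefinite on \(T N\) and negative definite on the \(x_1\)-axis.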

3 Outline of the proofs

In this section, we outline the proofs of Theorems 2.8 and 2.9. We begin with a technical lemma:

Lemma 3.1

Let M be a smooth manifold and \(f:M \rightarrow {\mathbb {R}}\) a smooth function. Suppose that every point in M has an open neighbourhood U such that \(f|_U\) is minimally degenerate. Then the critical set of f is a locally finite disjoint union of closed subsets on which f is constant.

Proof

The definition of “minimally degenerate” implies that each point has a neighbourhood in which the function takes only finitely many critical values. The conclusion of the lemma then holds when we take the closed subsets to be the intersections \( f^{-1}(c_i) \cap {{\mathrm{Crit\,}}}f \) where \(c_i\) are the critical values of f. \(\square \)

To prove Theorem 2.8, we fix a manifold M and a smooth function \(f:M \rightarrow {\mathbb {R}}\) that is locally minimally degenerate. Lemma 3.1 gives a decomposition of the critical set of f into a locally finite disjoint union of closed subsets on which f is constant. We may now focus on one such subset; call it C. Thus, C is closed in M and has a neighbourhood W whose intersection with the critical set of f is exactly C, and f is constant on C. We need to prove that there exists a minimizing manifold for f along C. We will do this in three steps, formulated below as Propositions 3.2, 3.4, and 3.5.

For these propositions, we now fix a manifold W and a smooth function

$$\begin{aligned} f :W \rightarrow {\mathbb {R}}, \end{aligned}$$

and we assume that f is constant on the set \(C := {{\mathrm{Crit\,}}}(f)\) of its critical points.

An infinitesimally minimizing manifold for f along C is a submanifold N of W that contains C and that satisfies Property (1) of Definition 2.1:

For each point x of C, the tangent space \(T_xN\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite.

Proposition 3.2

(Infinitesimal minimal degeneracy) Suppose that every point x of C has an open neighbourhood \(U_x\) such that there exists an infinitesimally minimizing manifold for f along \(C \cap U_x\). Then there exists an infinitesimally minimizing manifold N for f along C.

Suppose in addition that there exists an almost complex structure J on W such that at every point of C the Hessian of f is J-invariant. Then there exists such an N that is co-orientable.

Definition 3.3

A compatibly fibred neighbourhood of C is a neighbourhood U of C together with a submersion \(\pi :U \rightarrow {\mathcal {O}}\) such that at every point x in C the vertical tangent space \(\ker \hbox {d}\pi |_x\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite. The compatibly fibred neighbourhood is fibrewise orientable if its vertical tangent bundle \(\ker \hbox {d}\pi \rightarrow U\) is an orientable vector bundle.

Proposition 3.4

(Compatibly fibrated neighbourhood) Suppose that there exists an infinitesimally minimizing manifold N for f along C. Then C has a compatibly fibred neighbourhood.

Suppose in addition that N is co-orientable. Then C has a compatibly fibred neighbourhood that is fibrewise orientable.

Proposition 3.5

(Minimal degeneracy) Let \({\widetilde{N}}\) denote the set of fibrewise critical points of a compatibly fibred neighbourhood \(\pi :U \rightarrow {\mathcal {O}}\) of C. Then, after possibly intersecting with a smaller neighbourhood of C, the following is true.

  (a) \({\widetilde{N}}\) is an infinitesimally minimizing manifold for f along C.

    If in addition the compatibly fibred neighbourhood of C is fibrewise orientable, then \({\widetilde{N}}\) is co-orientable.

  (b) Suppose that every point x of C has an open neighbourhood \(U_x\) such that there exists a minimizing manifold for f along \(C \cap U_x\). Then \({\widetilde{N}}\) is a minimizing manifold for f along C.

We prove Proposition 3.2 in Sect. 4; the main challenge in the proof is to “patch together” minimizing manifolds for neighbourhoods of points in C. Proposition 3.4 is a consequence of the tubular neighbourhood theorem; we prove it in Sect. 5. We prove Proposition 3.5 in Sect. 6; Part (a) is a consequence of the implicit function theorem; Part (b) requires additional local arguments.

We conclude this section with proofs of Theorems 2.8 and 2.9.

Proof of Theorem 2.8

Let M be a manifold and \(f :M \rightarrow {\mathbb {R}}\) a smooth function. Suppose that every point of M has an open neighbourhood U such that \(f|_U\) is minimally degenerate.

By Lemma 3.1, the critical set of f is a locally finite disjoint union of closed subsets on which f is constant. Let C be one of these subsets. We may restrict our attention to an open neighbourhood W of C whose intersection with the set \({{\mathrm{Crit\,}}}(f)\) of critical points of f is equal to C.

By Proposition 3.2, there exists an infinitesimally minimizing manifold N for f along C. By Proposition 3.4, C has a compatibly fibred neighbourhood \(\pi :U \rightarrow {\mathcal {O}}\). By Proposition 3.5, after possibly restricting to a smaller neighbourhood of C, the set \({\widetilde{N}}\) of fibrewise critical points of \(\pi \) is a minimizing manifold for f along C, as required.

Now suppose, in addition, that there exists an almost complex structure J on M such that at every critical point of f the Hessian of f is J-invariant. By Proposition 3.2, there exists an infinitesimally minimizing manifold N for f along C that is co-orientable. By Proposition 3.4, C has a compatibly fibred neighbourhood \(\pi :U \rightarrow {\mathcal {O}}\) that is fibrewise orientable. By Proposition 3.5, after possibly restricting to a smaller neighbourhood of C, the set \({\widetilde{N}}\) of fibrewise critical points of \(\pi \) is a minimizing manifold for f along C and is co-orientable, as required. \(\square \)

Proof of Theorem 2.9

Clearly, (a) implies (b): take \(h(\mathbf {x})=-\sum _{j=1}^k x_j^2\).

Suppose that (b) holds. Let c be a critical point and \((\mathbf {x}, \mathbf {y})\) coordinates as in (b). By the Morse–Bott Lemma, after a further change of coordinates in \({\mathbb {R}}^k\), we can bring h to the form \(h(x)=h(0) + x_1^2 + \cdots + x_{\ell }^2 - x_{\ell +1}^2 - \cdots - x_m^2\) near \(x=0\), with \(0 \le \ell \le m \le k\). Then, near c, the set of critical points is

$$\begin{aligned} \{ (x,y) \ | \ x_1 = \cdots = x_m = 0\quad \text {and}\quad g(y)=g(0) \}, \end{aligned}$$

and \(\{ x_{\ell +1}= \cdots = x_m = 0\}\) is a minimizing submanifold for f along this set. Because c was arbitrary, f is locally minimally degenerate. Theorem 2.8 guarantees that f is (globally) minimally degenerate. That is, (c) holds.
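The claim that \(N = \{ x_{\ell +1}= \cdots = x_m = 0\}\) is a minimizing submanifold can be checked directly. The following sketch spells out this step under the normal form for h given above. On N we have

$$\begin{aligned} f|_N = g(y) + h(0) + x_1^2 + \cdots + x_{\ell }^2 , \end{aligned}$$

which is independent of \(x_{m+1},\ldots ,x_k\) and attains its minimal value \(g(0)+h(0)\) exactly where \(x_1 = \cdots = x_\ell = 0\) and \(g(y) = g(0)\), that is, exactly on the critical set. At a critical point, the Hessian of f is the direct sum of the Hessian of g, which is positive semidefinite because the point is a minimum of g, and the diagonal form

$$\begin{aligned} \mathrm {diag}\big ( \underbrace{2,\ldots ,2}_{\ell },\ \underbrace{-2,\ldots ,-2}_{m-\ell },\ \underbrace{0,\ldots ,0}_{k-m} \big ) \end{aligned}$$

in the x variables. So the Hessian is positive semidefinite on the tangent space of N and negative definite on the complementary subspace spanned by \(\partial /\partial x_{\ell +1},\ldots ,\partial /\partial x_m\), which gives Property (1) of Definition 2.1 by way of Remark 2.2.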

To show that (c) implies (a), we need to express a minimally degenerate function in local coordinates. For this we use a parametrized version of the Morse–Bott Lemma. Complete details of the proofs of the Morse Lemma and the Morse–Bott Lemma are spelled out by Banyaga and Hurtubise in [2]. The parametrized version that we use is described by Hörmander in [4, Lemma C.6.1].

Namely, suppose that f is a minimally degenerate function, and let c be a critical point. Choose coordinates \(x_1',\ldots ,x_k',y_{k+1},\ldots ,y_n\) on a neighbourhood U of c, centred at c, such that \(\{\mathbf {x'}=0\}\) defines a minimizing submanifold for f along \(U \cap \text {Crit} f\), and such that the Hessian of f is negative definite on the subspace of \(T_cM\) that is represented by \({\mathbb {R}}^k \times \{0\}\). Applying the parametrized version of the Morse lemma, we find coordinates \((\mathbf {x},\mathbf {y})\) centred at c in which f has the desired form

$$\begin{aligned} f = f(\mathbf {x},\mathbf {y}) = g(\mathbf {y}) -\sum _{j=1}^k x_j^2 , \end{aligned}$$

where g is a smooth function. Minimal degeneracy guarantees that g must attain its minimum at \(\mathbf {y}=0\) and, after possibly shrinking the neighbourhood of c, that g has no critical values except its minimal value. \(\square \)

4 Existence of infinitesimally minimizing submanifolds

The purpose of this section is to prove Proposition 3.2.

Let W be a smooth manifold and \(f :W \rightarrow {\mathbb {R}}\) a smooth function. Assume that f is constant on the set \(C := {{\mathrm{Crit\,}}}(f)\) of its critical points. For every point x of C, let \(U_x\) be an open neighbourhood of x, and let \(N_x \subset U_x\) be an infinitesimally minimizing manifold for f along \(C \cap U_x\). To prove the proposition, we need to find an infinitesimally minimizing submanifold N for f along C, and, in the presence of an appropriate almost complex structure, to show that N is co-orientable.

We would like to obtain such an N by “patching together” the submanifolds \(N_x\). A priori it is not clear how to “patch together” submanifolds. We can “patch together” functions, by means of a partition of unity, so our first attempt is to express each \(N_x\) as the regular zero set of a function \(h_x :U_x \rightarrow {\mathbb {R}}^k\) and to take \(h := \sum \rho _i h_{x_i}\), where \(\{ \rho _i \}\) is a partition of unity on the union of the sets \(U_x\) with \({{\mathrm{supp\,}}}\rho _i \subset U_{x_i}\). To guarantee that zero remains a regular value of h near C, we require that the differentials of \(h_{x_i}\) and \(h_{x_j}\) coincide at the points of \(C \cap U_{x_i} \cap U_{x_j}\). If this can be arranged then

$$\begin{aligned} \big \{ h=0 \big \} \cap \big \{ \text {an appropriate neighbourhood of } C \big \} \end{aligned}$$

is an infinitesimally minimizing manifold for f along C.

One problem with this approach is that a regular level set of a function to \({\mathbb {R}}^k\) must have a trivial normal bundle. So this method already cannot work in the case that f is a Morse–Bott function and the negative normal bundle of C is non-trivial. To fix this, instead of working with functions to \({\mathbb {R}}^k\), we work with sections of a rank k vector bundle E. We carry out this plan in the following three lemmas.

First, we find a subbundle of the tangent bundle on which the Hessian is negative definite.

Lemma 4.1

There exist a neighbourhood U of C and a subbundle E of \(TW|_U\), such that at each point x of C the subspace \(E_x\) of \(T_xW\) is maximal among subspaces on which the Hessian \({{\mathrm{Hess\,}}}f|_x\) is negative definite.

Moreover, let J be an almost complex structure on W such that at every point of C the Hessian \({{\mathrm{Hess\,}}}f|_x\) is J-invariant. Then the bundle E may be chosen to be complex, hence orientable.

Proof

We will construct such a bundle E as a sum of eigenbundles of a fibrewise automorphism \(A :TW \rightarrow TW\), near C.

First, we will extend the Hessian of f, which is only defined at critical points, to a symmetric 2-tensor B that is defined on all of W. For this, let \(\{ U_\alpha \}\) be domains of coordinate charts that cover W; let \(B_\alpha \) be the symmetric 2-tensor on \(U_\alpha \) that in the local coordinates on \(U_\alpha \) is represented by the matrix of second partial derivatives of f; and take \(B = \sum \rho _\alpha B_\alpha \), where \(\{\rho _\alpha :W \rightarrow {\mathbb {R}}\}\) is a partition of unity with \({{\mathrm{supp\,}}}\rho _\alpha \subset U_\alpha \).

Next, choose a Riemannian metric \(\langle \cdot , \cdot \rangle \) on W, and define \(A :TW \rightarrow TW\) by \(B(u,v) = \langle u ,Av \rangle \). Because \(B(\cdot ,\cdot )\) is symmetric, A is self-adjoint with respect to \(\langle \cdot ,\cdot \rangle \), and so A is diagonalizable. For each \(x' \in W\), let \(\lambda _{1,x'}, \ldots , \lambda _{n,x'}\) denote the eigenvalues of \(A|_{x'}\), in (weakly) increasing order. For each i, the eigenvalue \(\lambda _{i,x'}\) is continuous, but perhaps not smooth, as a function of \(x'\).

Our assumptions imply that for every \(x \in C\) there exist a neighbourhood \(U_x\) and an integer \(k_x\) (namely the co-dimension of \(N_x\)) such that, for every \(x' \in U_x \cap C\), the automorphism A of \(T_{x'}W\) has exactly \(k_x\) negative eigenvalues.

Let \(C_k\) be the subset of C where \(k_x = k\). It is closed in W. There exists a neighbourhood \(U_k\) of \(C_k\) in W such that for all \(x' \in U_k\) the eigenvalue \(\lambda _{k+1,x'}\) is strictly greater than the eigenvalues \(\lambda _{1,x'}, \ldots ,\lambda _{k,x'}\). Shrink the sets \(U_k\) so that their closures become disjoint. For each \(x' \in U_k\), let \(E_{x'}\) be the sum of the eigenspaces of \(A|_{x'}:T_{x'}W \rightarrow T_{x'}W\) that correspond to the eigenvalues \(\lambda _{1,x'},\ldots , \lambda _{k,x'}\). Let \(U = \bigcup _k U_k\). Then E is a smooth subbundle of \(TW|_U\), and at each point x of C, \(E_x\) is a maximal subspace of \(T_xW\) on which \({{\mathrm{Hess\,}}}f|_x\) is negative definite.

Finally, suppose that the Hessian is J-invariant. By averaging, we can arrange the tensor B and the Riemannian metric \(\langle \cdot , \cdot \rangle \) to be J-invariant as well. The automorphism \(A :TW \rightarrow TW\) is then complex linear, so its eigenbundles are J-invariant, and E is a complex, and hence orientable, vector bundle. \(\square \)
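The complex linearity of A can be verified in one line. The following is a sketch, assuming that B and the metric are both J-invariant in the sense that \(B(Ju,Jv) = B(u,v)\) and \(\langle Ju , Jv \rangle = \langle u , v \rangle \). For tangent vectors u, v at the same point,

$$\begin{aligned} \langle u , AJv \rangle = B(u,Jv) = -B(Ju,v) = -\langle Ju , Av \rangle = \langle u , JAv \rangle , \end{aligned}$$

where the second and fourth equalities use J-invariance together with \(J^2 = -1\). Since u is arbitrary, \(AJ = JA\), so J preserves each eigenspace of A. One possible averaging, stated here as an assumption about the construction rather than as the authors' choice, is to replace B by \(\tfrac{1}{2}\big ( B(\cdot ,\cdot ) + B(J\cdot ,J\cdot ) \big )\), and similarly for the metric; this does not change B at the points of C, where the Hessian is already J-invariant.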

Let \(h :U \rightarrow E|_U\) be a smooth section of a vector bundle E, and let \(x \in U\) be a point where this section vanishes. The vertical differential of h at x,

$$\begin{aligned} d_{V}h:T_x W \rightarrow E_x, \end{aligned}$$

is the composition of the differential \(dh|_x :T_xW \rightarrow T_{(x,0)} E\) with the projection to the second factor in the decomposition

$$\begin{aligned} T_{(x,0)} E\cong & {} T_{(x,0)} \big (\text {the zero section of } E\big ) \oplus T_{(x,0)} \big (\text {the fibre } E_x \text { of } E\big )\\\cong & {} T_xW \oplus E_x . \end{aligned}$$

In the second lemma, we find an appropriate section of the bundle E of Lemma 4.1 whose zero set will give us the desired submanifold N.

Lemma 4.2

Let U be a neighbourhood of C and E a subbundle of \(TW|_U\), such that at each point x of C the subspace \(E_x\) of \(T_xW\) is maximal among subspaces on which \({{\mathrm{Hess\,}}}f|_x\) is negative definite.

Then, after possibly shrinking U to a smaller neighbourhood of C, there exists a smooth section \(h :U \rightarrow E|_U\) that vanishes on C and such that at each critical point \(x \in C\) the restriction to \(E_x\) of the vertical differential of h is the identity map on \(E_x\) and Hess \(f|_x\) is positive semidefinite on \(\mathrm{ker}(d_Vh|_x)\).

Proof

Recall that, for each \(x \in C\), \(U_x\) is a neighbourhood of x and \(N_x \subset U_x\) is an infinitesimally minimizing manifold for f along \(C\cap U_x\).

Let \(x \in C\). Let

$$\begin{aligned} \varphi :U'_x \rightarrow {\mathbb {R}}^n \end{aligned}$$

be a coordinate chart on a neighbourhood \(U'_x\) of x in \(U_x \cap U\) in which the submanifold \(N_x\) is given by the equations \(\varphi _1 = \cdots = \varphi _k = 0\). Because \(T_x N_x\) is complementary to \(E_x\), after possibly shrinking \(U'_x\), the differentials of \(\varphi _1,\ldots ,\varphi _k\) give a trivialization of \(E|_{U'_x}\):

$$\begin{aligned} E|_{U'_x} \rightarrow {\mathbb {R}}^k . \end{aligned}$$

Let

$$\begin{aligned} h_x :U'_x \rightarrow E|_{U'_x} \end{aligned}$$

be the section whose composition with the trivialization \( E|_{U'_x} \rightarrow {\mathbb {R}}^k \) is the map \((\varphi _1,\ldots ,\varphi _k)\). Then \(h_x\) vanishes on \(C \cap U'_x\), and, at each \(x' \in C \cap U'_x\), the restriction to \(E_{x'}\) of the vertical differential \(d_V h_x|_{x'}\) is the identity map on \(E_{x'}\), and Hess \(f|_{x'}\) is positive semidefinite on \(\mathrm{ker}(d_Vh_x|_{x'})\).

Let \(U' = \bigcup _{x \in C} U'_x\). Define a section \(h :U' \rightarrow E\) by

$$\begin{aligned} h = \sum _\alpha \rho _\alpha h_{x_\alpha } \end{aligned}$$

where \(\{ \rho _\alpha :U' \rightarrow {\mathbb {R}}\}\) is a partition of unity with \({{\mathrm{supp\,}}}\rho _\alpha \subset U'_{x_\alpha }\). Then h satisfies the required properties.

Indeed, let \(x' \in C \cap U'\). Then \(h(x') = \sum _\alpha \rho _\alpha h_{x_\alpha } (x')\). Since \(h_{x_\alpha } (x') = 0\) for each \(\alpha \), we get that \(h(x') = 0\). Choosing a local trivialization of E near \(x'\) and identifying sections of E with \({\mathbb {R}}^k\) valued functions, for every \(v \in T_{x'}W\) we have

$$\begin{aligned} dh|_{x'} (v)=\sum _\alpha \rho _\alpha dh_{x_\alpha }|_{x'} (v) + h_{x_\alpha } (x') d \rho _\alpha |_{x'}(v). \end{aligned}$$

Because \(h_{x_\alpha } (x') = 0\) for every \(\alpha \), we get \(d_V h|_{x'} = \sum _\alpha \rho _\alpha d_V h_{x_\alpha }|_{x'}\). The required properties of \(d_V h|_{x'}\) follow from the analogous properties of \(d_V h_{x_\alpha }|_{x'}\). (This uses the following linear algebra fact. Let V be a vector space, B a symmetric bilinear form on V, and E a subspace of V on which B is negative definite. For every \(\alpha \), let \(H_\alpha : V \rightarrow E\) be a projection map such that B is positive semidefinite on \(\mathrm{ker} \ H_\alpha \). Let \(H =\sum _\alpha \rho _\alpha H_\alpha \), where \(\rho _\alpha \) are non-negative real numbers with sum 1. Then \(H: V \rightarrow E\) is a projection map such that B is positive semidefinite on \(\mathrm{ker}\ H\).)

In more detail, because \(h_{x_\alpha } (x') = 0\) for every \(\alpha \), the second term in the formula for \(dh|_{x'}(v)\) vanishes, and the formula becomes \(dh|_{x'}(v) = \sum _\alpha \rho _\alpha dh_{x_\alpha }|_{x'} (v)\). Let \(\overline{v} = (v_1,\ldots ,v_k)\) be the expression of a vector \(v \in E_{x'}\) with respect to the chosen trivialization. Then \(d h_{x_\alpha }|_{x'} (v) = \overline{v}\) for each \(\alpha \); this implies that \(dh|_{x'}(v) = \overline{v}\), as required. \(\square \)
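The linear algebra fact used in the proof of Lemma 4.2 can be verified directly. The following sketch is one possible argument, with the auxiliary vectors \(e_\alpha \) introduced here for illustration. Since each \(H_\alpha \) restricts to the identity on E and the \(\rho _\alpha \) sum to 1, the map H also restricts to the identity on E, so H is a projection onto E. Now let \(v \in \mathrm{ker}\ H\) and set \(e_\alpha := H_\alpha v \in E\), so that \(\sum _\alpha \rho _\alpha e_\alpha = 0\) and \(v - e_\alpha \in \mathrm{ker}\ H_\alpha \). Then

$$\begin{aligned} 0\le & {} \sum _\alpha \rho _\alpha B(v - e_\alpha , v - e_\alpha ) \\= & {} B(v,v) - 2 B\Big (v, \sum _\alpha \rho _\alpha e_\alpha \Big ) + \sum _\alpha \rho _\alpha B(e_\alpha ,e_\alpha ) \\= & {} B(v,v) + \sum _\alpha \rho _\alpha B(e_\alpha ,e_\alpha ) \ \le \ B(v,v) , \end{aligned}$$

where the last inequality holds because B is negative definite on E. Hence B is positive semidefinite on \(\mathrm{ker}\ H\).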

Finally, we use the zero set of the section found in Lemma 4.2 to obtain an infinitesimally minimizing submanifold N for f along C.

Lemma 4.3

Let U be a neighbourhood of C and E a subbundle of \(TW|_U\), such that at each point x of C the subspace \(E_x\) of \(T_xW\) is maximal among subspaces on which \({{\mathrm{Hess\,}}}f|_x\) is negative definite.

Let \(h :U \rightarrow E|_U\) be a smooth section that vanishes on C and such that at each critical point \(x \in C\) the restriction to \(E_x\) of the vertical differential of h is the identity map on \(E_x\) and Hess \(f|_x\) is positive semidefinite on \(\mathrm{ker}(d_V h|_x)\).

Then, after possibly shrinking the neighbourhood U of C, the set \(N = h^{-1}(0)\) is an infinitesimally minimizing submanifold for f along C.

If in addition E is an orientable vector bundle, then N is co-orientable.

Proof

Fix a point \(x \in C\), and trivialize E in a neighbourhood \(U_x\) of x. In terms of this trivialization, h becomes a map from \(U_x\) to \({\mathbb {R}}^k\). The hypotheses guarantee that the differential of h at x is onto. By the implicit function theorem, after possibly shrinking \(U_x\), the intersection \(U_x \cap h^{-1}(0)\) is a submanifold of \(U_x\). So we have shown that every point in C has a neighbourhood \(U_x\) whose intersection with \(N = h^{-1}(0)\) is a submanifold of \(U_x\). After possibly shrinking U, we obtain that N itself is a submanifold of U.

Let \(x \in C\). By the implicit function theorem, \(T_xN = \mathrm{ker}(d_V h|_x)\). Because Hess \(f|_x\) is positive semidefinite on this space and negative definite on \(E_x\), and because these spaces are complementary, \(T_xN\) is maximal among subspaces of \(T_xW\) on which Hess \(f|_x\) is positive semidefinite.

Finally, we note that if E is an orientable vector bundle, then N is co-orientable. Indeed, we have constructed N so that its normal bundle is the pullback of E, and co-orientability is precisely orientability of the normal bundle. \(\square \)
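The maximality claim in the proof above rests on a dimension count, which can be sketched as follows; the subspace S is introduced here for illustration. Suppose that S is a subspace of \(T_xW\) that strictly contains \(T_xN\) and on which Hess \(f|_x\) is positive semidefinite. Because \(T_xW = T_xN \oplus E_x\),

$$\begin{aligned} \dim S + \dim E_x > \dim T_xN + \dim E_x = \dim T_xW , \end{aligned}$$

so \(S \cap E_x\) contains a non-zero vector v. Then \({{\mathrm{Hess\,}}}f|_x (v,v) \ge 0\) because \(v \in S\), and \({{\mathrm{Hess\,}}}f|_x (v,v) < 0\) because \(v \in E_x\), which is a contradiction. The same count shows that \(E_x\) is maximal among subspaces on which the Hessian is negative definite.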

This completes the proof of Proposition 3.2.

5 Compatibly fibrating neighbourhood

The purpose of this section is to prove Proposition 3.4.

Let W be a smooth manifold and \(f :W \rightarrow {\mathbb {R}}\) a smooth function. Assume that f is constant on the set \(C := {{\mathrm{Crit\,}}}(f)\) of its critical points. Let N be an infinitesimally minimizing manifold for f along C. To prove the proposition, we need to find a compatibly fibred neighbourhood for C, which is fibrewise orientable if N is co-orientable.

Clearly, for every point x of C, there exists an open neighbourhood \(U_x\) of x and an infinitesimally minimizing manifold \(N_x \subset U_x\) for f along \(C \cap U_x\): take \(U_x = W\) and \(N_x = N\). By Lemma 4.1, we conclude that there exist a neighbourhood U of C and a subbundle E of \(TW|_U\) such that at each point x of C the subspace \(E_x\) of \(T_xW\) is maximal among subspaces on which the Hessian of f is negative definite.

The exponential map with respect to any Riemannian metric identifies a disc subbundle of \(E|_{N \cap U}\) with a (tubular) neighbourhood \(U'\) of \(N \cap U\) in U. Let \(\pi :U' \rightarrow N \cap U\) be the corresponding projection map. For each \(x \in C\), the tangent to the fibre of \(\pi \) at x is \(E_x\), which is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite. So \(\pi :U' \rightarrow N \cap U\) is a compatibly fibred neighbourhood of C.

Here, \(N \cap U\) serves two roles: it is a submanifold of W that contains C and it is the base of the fibration \(\pi \). To match the notation of Definition 3.3, we use a different symbol for the base of the fibration. Namely, we write \({\mathcal {O}}\) instead of \(N \cap U\) for the base of the fibration, so that the tubular neighbourhood map becomes a map \(\pi :U' \rightarrow {\mathcal {O}}\), and this map is a compatibly fibred neighbourhood of C.

Finally, we consider the case when N is co-orientable. An orientation on the fibres of the normal bundle of N in W gives an orientation on the fibres of the compatibly fibred neighbourhood that we have constructed. Thus, the compatibly fibred neighbourhood is fibrewise orientable, as desired.

This completes the proof of Proposition 3.4.

6 Minimal degeneracy

The purpose of this section is to prove Proposition 3.5.

Let W be a smooth manifold and \(f :W \rightarrow {\mathbb {R}}\) a smooth function. Assume that f is constant on the set \(C:={{\mathrm{Crit\,}}}(f)\) of its critical points. Let \(\pi :U \rightarrow {\mathcal {O}}\) be a compatibly fibred neighbourhood of C. Namely, U is a neighbourhood of C in W, \(\pi \) is a submersion, and at every point x in C the vertical tangent space at x is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite. Let \({\widetilde{N}}\) denote the set of fibrewise critical points of \(\pi \), that is, the points whose vertical tangent space is in the kernel of df.
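To illustrate the set of fibrewise critical points, here is a small example that is not taken from the source; the function and the fibration are chosen for illustration only. Take \(W = U = {\mathbb {R}}^2\) with coordinates \((\xi ,\eta )\), let \({\mathcal {O}} = {\mathbb {R}}\) and \(\pi (\xi ,\eta ) = \xi \), and consider

$$\begin{aligned} f(\xi ,\eta ) = 2\xi ^4 - (\eta - \xi ^2)^2, \qquad \frac{\partial f}{\partial \eta } = -2(\eta - \xi ^2), \qquad \frac{\partial f}{\partial \xi } = 8\xi ^3 + 4\xi (\eta - \xi ^2) . \end{aligned}$$

The only critical point is the origin, where the Hessian is \(\mathrm {diag}(0,-2)\), so the vertical tangent space (the \(\eta \)-axis) is maximal among subspaces on which the Hessian is negative definite, and \(\pi \) is a compatibly fibred neighbourhood of \(C = \{(0,0)\}\). The set of fibrewise critical points is the zero set of \(\partial f/\partial \eta \), namely the parabola

$$\begin{aligned} {\widetilde{N}} = \{ \eta = \xi ^2 \} , \qquad f|_{{\widetilde{N}}} = 2\xi ^4 . \end{aligned}$$

This is a submanifold whose tangent space at the origin is the \(\xi \)-axis, on which the Hessian vanishes, and \(f|_{{\widetilde{N}}}\) attains its minimum exactly at the origin. So in this example \({\widetilde{N}}\) is a minimizing manifold for f along C, even though it is not a coordinate subspace.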

Lemma 6.1

For every point x of C, there exists a neighbourhood \(W_x\) such that the intersection \(N_{W_x} := W_x \cap {\widetilde{N}}\) has the following properties.

  • \(N_{W_x}\) is a manifold, containing x.

  • \(T_x N_{W_x}\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite.

Proof

Because every critical point is fibrewise critical, x is in \({\widetilde{N}}\). Because \(\pi \) is a submersion, without loss of generality we may identify a neighbourhood of x in W with a neighbourhood of the origin in \({\mathbb {R}}^a \times {\mathbb {R}}^b\), such that x becomes the origin, and such that \(\pi \) becomes the projection map

$$\begin{aligned} \pi (\xi _1 , \ldots , \xi _a , \eta _1,\ldots ,\eta _b ) = (\xi _1 , \ldots , \xi _a) . \end{aligned}$$

The vertical differential of f then becomes the function

$$\begin{aligned} d_V f :{\mathbb {R}}^a \times {\mathbb {R}}^b \rightarrow {\mathbb {R}}^b \end{aligned}$$

that is given by

$$\begin{aligned} \left( \frac{\partial f}{\partial \eta _1} , \ldots , \frac{\partial f}{\partial \eta _b} \right) . \end{aligned}$$

The set \({\widetilde{N}}\) of fibrewise critical points is precisely the zero set of \(d_V f\). By the implicit function theorem, to show that \({\widetilde{N}}\) is a manifold near x, it is enough to show that the differential of \(d_Vf\) at the origin,

$$\begin{aligned} d (d_V f)|_0 :{\mathbb {R}}^a \times {\mathbb {R}}^b \rightarrow {\mathbb {R}}^b, \end{aligned}$$
(6.1)

is onto. In coordinates, the linear map (6.1) is represented by the \(b \times (a+b)\) matrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \frac{\partial ^2 f}{ \partial \xi _1 \partial \eta _1} \Big |_0 &{} \ldots &{} \frac{\partial ^2 f}{ \partial \xi _a \partial \eta _1} \Big |_0 &{} \frac{\partial ^2 f}{ \partial \eta _1^2} \Big |_0 &{} \ldots &{} \frac{\partial ^2 f}{ \partial \eta _b \partial \eta _1} \Big |_0 \\ \vdots &{} &{} \vdots &{} \vdots &{} &{} \vdots \\ \frac{\partial ^2 f}{ \partial \xi _1 \partial \eta _b} \Big |_0 &{} \ldots &{} \frac{\partial ^2 f}{ \partial \xi _a \partial \eta _b} \Big |_0 &{} \frac{\partial ^2 f}{ \partial \eta _1 \partial \eta _b} \Big |_0 &{} \ldots &{} \frac{\partial ^2 f}{ \partial \eta _b^2 } \Big |_0 \end{array} \right) . \end{aligned}$$

The fact that the right \(b \times b\) block of this matrix is negative definite and hence non-degenerate, implies that (6.1) is onto, as required.

This fact also implies that the kernel of (6.1) is a complementary subspace to \(\{ 0 \} \times {\mathbb {R}}^b\) in \({\mathbb {R}}^a \times {\mathbb {R}}^b\) and thus that \(T_xN_{W_x}\) is a complementary subspace to \(\ker \hbox {d}\pi |_x\) in \(T_xW\). After a further change of coordinates, we can arrange that \(T_x N_{W_x}\) is represented by the subspace \({\mathbb {R}}^{a} \times \{0\}\) of \({\mathbb {R}}^a\times {\mathbb {R}}^b\).

In these new coordinates, because \(\frac{\partial f}{\partial \eta _j}\) vanishes along \({\widetilde{N}}\) and \(\frac{\partial }{\partial \xi _i}\) is tangent to \({\widetilde{N}}\) at x, we get that \(\left. \frac{\partial ^2 f}{\partial \xi _i \partial \eta _j}\right| _0 = 0\) for all \(1 \le i \le a\) and \(1 \le j \le b\). So, in these coordinates, the Hessian of f at x becomes the bilinear form that is represented by the block diagonal matrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c} \left. \frac{\partial ^2 f}{\partial \xi _i\partial \xi _{i'}}\right| _0 &{} 0 \\ 0 &{} \left. \frac{\partial ^2 f}{\partial \eta _j \partial \eta _{j'}}\right| _0 \end{array} \right) . \end{aligned}$$

By assumption, \(\{ 0 \} \times {\mathbb {R}}^b\) is maximal among subspaces of \({\mathbb {R}}^{a+b}\) on which this matrix is negative definite. Because the matrix is block diagonal, it follows that \({\mathbb {R}}^a \times \{ 0 \} \) is maximal among subspaces of \({\mathbb {R}}^a \times {\mathbb {R}}^b\) on which this matrix is positive semidefinite. That is, \(T_x N_{W_x}\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite, as required. \(\square \)
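As a hedged illustration of Lemma 6.1, consider the following example, which is our own choice and is not taken from the argument above. Let \(W = {\mathbb {R}}^2\) with coordinates \((\xi ,\eta )\), let \(\pi (\xi ,\eta ) = \xi \), and let \(f(\xi ,\eta ) = 2\xi ^2\eta - \eta ^2\). The only critical point of f is the origin, and a direct computation gives

$$\begin{aligned} d_V f(\xi ,\eta )&= \frac{\partial f}{\partial \eta } = 2\xi ^2 - 2\eta , \qquad {\widetilde{N}} = \{ \eta = \xi ^2 \} , \\ d(d_V f)|_0&= \begin{pmatrix} 0 & -2 \end{pmatrix} , \qquad \ker d(d_V f)|_0 = {\mathbb {R}} \times \{ 0 \} = T_0 {\widetilde{N}} , \\ {{\mathrm{Hess\,}}}f|_0&= \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix} . \end{aligned}$$

In this example the Hessian is already block diagonal, the vertical line \(\{ 0 \} \times {\mathbb {R}}\) is maximal among subspaces on which it is negative definite, and \(T_0{\widetilde{N}}\) is maximal among subspaces on which it is positive semidefinite. Writing \(f = \xi ^4 - (\eta - \xi ^2)^2\) also shows that the fibrewise maxima of f lie exactly on \({\widetilde{N}}\) and that \(f|_{{\widetilde{N}}} = \xi ^4\) attains its minimum exactly at the origin.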

Following the notation of Lemma 6.1, we set \(U' := \bigcup _{x \in C} W_x\). Then \(U'\) is an open neighbourhood of C, the map \(\pi ' := \pi |_{U'} :U' \rightarrow {\mathcal {O}}\) is a submersion, and \({\widetilde{N}}' := {\widetilde{N}}\cap U'\) is the set of fibrewise critical points of \(\pi '\).

The sets \(N_{W_x}\) form an open covering of \({\widetilde{N}}'\). By Lemma 6.1, each of them is a manifold. We deduce that \({\widetilde{N}}'\) is a manifold. Moreover, because every critical point is fibrewise critical, \(N_{W_x}\) contains \(W_x \cap C\); it follows that \({\widetilde{N}}'\) contains C. For each \(x \in C\), by Lemma 6.1, and since \(T_x{\widetilde{N}}' = T_xN_{W_x}\), we have that \(T_x{\widetilde{N}}'\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite. Moreover, when the compatibly fibred neighbourhood is fibrewise orientable, the vertical tangent bundle \(\ker \hbox {d}\pi |_{{\widetilde{N}}'}\) is by definition orientable. But this bundle is isomorphic to the normal bundle to \({\widetilde{N}}'\) in W, and so \({\widetilde{N}}'\) is co-orientable. This completes the proof of Part (a) of Proposition 3.5.

Since the vertical tangent bundle \(\ker \hbox {d}\pi |_{{\widetilde{N}}'}\) is complementary to \(T{\widetilde{N}}'\) in \(TW|_{{\widetilde{N}}'}\), there exists an open neighbourhood \(U_{{\widetilde{N}}'}\) of \({\widetilde{N}}'\) in \(U'\) and a tubular neighbourhood map

$$\begin{aligned} \pi _{{\widetilde{N}}'} :U_{{\widetilde{N}}'} \rightarrow {\widetilde{N}}' \end{aligned}$$

whose fibres are open subsets of the fibres of \(\pi \).

More precisely, let E be the pullback to \({\widetilde{N}}'\) of the vertical tangent bundle, that is, for each \(x' \in {\widetilde{N}}'\), the fibre of E at \(x'\) is \(\ker \hbox {d}\pi |_{x'}\). Choose a fibrewise Riemannian metric on \(U'\). Then the fibrewise exponential map gives a diffeomorphism from a neighbourhood of the zero section in E to a neighbourhood \(U_{{\widetilde{N}}'}\) of \({\widetilde{N}}'\) in W that carries the projection map to \(\pi _{{\widetilde{N}}'}\) when we identify \({\widetilde{N}}'\) with the zero section.

With the identification of \(U_{{\widetilde{N}}'}\) with an open subset of a vector bundle, the fibrewise second derivative of f becomes well defined on \(U_{{\widetilde{N}}'}\). The set of points where this fibrewise second derivative of f is negative definite is an open neighbourhood of \({\widetilde{N}}'\) in \(U_{{\widetilde{N}}'}\). This, and the fact that the set of fibrewise critical points is exactly \({\widetilde{N}}'\), together imply that, after possibly shrinking \(U_{{\widetilde{N}}'}\) to a smaller neighbourhood of \({\widetilde{N}}'\), the fibrewise maxima of f are achieved exactly at the points of \({\widetilde{N}}'\).

Abusing notation by returning to our previous symbols, we now assume that we have the following set-up: \({\widetilde{N}}\) is a submanifold of W that contains C; U is an open neighbourhood of \({\widetilde{N}}\); \(\pi :U \rightarrow {\widetilde{N}}\) is a tubular neighbourhood map; the set of fibrewise critical points of \(\pi \) is exactly \({\widetilde{N}}\); the fibrewise maxima of f are achieved exactly at the points of \({\widetilde{N}}\); and, at each point x of C, we have that \(\ker \hbox {d}\pi |_x\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite.

Now we want to show that \(f|_{{\widetilde{N}}}\) attains its minimum value exactly at the points of C, after possibly intersecting with a smaller neighbourhood of C. For every point x of C, let \(U_x\) be an open neighbourhood of x and \(Z_x \subset U_x\) a minimizing submanifold for the function \(f|_{U_x} :U_x \rightarrow {\mathbb {R}}\) along the closed subset \(C \cap U_x\). Since \(\ker \hbox {d}\pi |_x\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is negative definite and \(T_x Z_x\) is maximal among subspaces of \(T_xW\) on which the Hessian of f is positive semidefinite, \(Z_x\) is transverse to the fibres of \(\pi |_{U_x}\) at the point x. Therefore, \(\pi |_{Z_x} :Z_x \rightarrow {\widetilde{N}}\) is a submersion at the point \(x \in Z_x\), so \(\pi (Z_x)\) contains a neighbourhood of x in \({\widetilde{N}}\). So \(\bigcup _x \pi (Z_x)\) contains a neighbourhood of C in \({\widetilde{N}}\). Let \({\mathcal {O}}'\) be such a neighbourhood. By the choice of \(Z_x\), the restriction \(f|_{Z_x}\) attains its minimum exactly on \(Z_x\cap C\). Thus, for every \(y \in Z_x {\smallsetminus } C\), we have \(f(y) > f(x)\). But we have also arranged that the fibrewise maxima of f are attained exactly on \({\widetilde{N}}\). So, for every \(y \in Z_x\), we have \(f(y) \le f(\pi (y))\). We conclude that, for every \(y \in Z_x\), we have \(f(\pi (y)) \ge f(x)\), and equality implies that \(y \in C\), which implies that \(\pi (y) = y\) and hence that \(\pi (y) \in C\). It follows that \(f|_{\pi (Z_x)}\) attains its minimum exactly on \(Z_x \cap C\). Hence, \(f|_{{\mathcal {O}}'}\) attains its minimum exactly on C. So \({\mathcal {O}}'\) is a minimizing manifold for f along C, as required.
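The last step of this argument combines two inequalities, and it may be easier to follow when they are displayed together. For \(x \in C\) and \(y \in Z_x\), we have

$$\begin{aligned} f(x) \ \le \ f(y) \ \le \ f(\pi (y)) . \end{aligned}$$

The first inequality is an equality exactly if \(y \in Z_x \cap C\), by the choice of \(Z_x\). The second inequality holds because \(\pi (y)\) is the point at which f attains its maximum on the fibre through y. Hence \(f(\pi (y)) = f(x)\) forces \(y \in C\), and then \(\pi (y) = y\).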

This completes the proof of Proposition 3.5.

7 Norm square of the momentum map

Let \(\Phi :M \rightarrow \mathfrak {g}^*\) be the momentum map for the action of a compact Lie group G on a symplectic manifold \((M,\omega )\). Fix an \({{\mathrm{Ad}}}\)-invariant inner product on \(\mathfrak {g}\), and fix the induced inner product on \(\mathfrak {g}^*\).

Theorem 7.1

The function \(\Vert \Phi \Vert ^2 :M \rightarrow {\mathbb {R}}\) is minimally degenerate.

Theorem 7.1 was proved by Kirwan in [6]. We will present here a slightly different proof: Theorem 2.8 reduces the problem to a local result, and we deduce this local result from the local normal form theorem for Hamiltonian G actions.

Remark 7.2

  1. (1)

    We recall what it means for \(\Phi :M \rightarrow \mathfrak {g}^*\) to be a momentum map. First, for every \(\xi \) in the Lie algebra \(\mathfrak {g}\) of G, denoting the corresponding vector field by \(\xi _M\), we have Hamilton’s equation:

    $$\begin{aligned} d \langle \Phi ,\xi \rangle = \iota _{\xi _M}\omega . \end{aligned}$$
    (7.1)

    Second, \(\Phi \) intertwines the G action on M with the co-adjoint G action on the dual \(\mathfrak {g}^*\) of the Lie algebra \(\mathfrak {g}\).

    The local normal form theorem gives an explicit formula for the G-action, the symplectic form \(\omega \), and the momentum map \(\Phi \), on a neighbourhood of a G orbit.

  2. (2)

    Kirwan’s motivation for generalizing Morse(–Bott) theory was in fact to apply such a theory to the norm square of the momentum map, as suggested in the work of Atiyah and Bott [1]. In Sect. 8 we recall some of the consequences of this application.

  3. (3)

    Suppose that G is a torus, M is compact, and M has a G invariant Kähler metric. The compactness of M implies that the gradient flows of the components of the momentum map \(\Phi \) are defined for all times, and the integrability of the complex structure implies that these flows commute. In this case, Kirwan proves in [7] that \(f \circ \Phi \) is minimally degenerate for any convex function f; in particular, \(\Vert \Phi \Vert ^2\) is minimally degenerate. So in this case one does not need the full power of Kirwan’s analysis in [6] nor our local analysis here.

  4. (4)

    Notice that a connected component of the critical set of \(\Vert \Phi \Vert ^2\) need not be a manifold. For example, the circle action on \({\mathbb {C}}^2\) with weights 1 and \(-1\) has momentum map

    $$\begin{aligned} \Phi : {\mathbb {C}}^2\rightarrow & {} {\mathbb {R}}\\ (z,w)\mapsto & {} \frac{|z|^2}{2}-\frac{|w|^2}{2}. \end{aligned}$$

    Note that this is a Morse–Bott function, and it has a critical value at \(0 \in {\mathbb {R}}\). The critical set for \(\Phi \) is \(\{0\}\subset {\mathbb {C}}^2\), which is a submanifold. The norm square \(\Vert \Phi \Vert ^2\) also has a critical value at 0, and for the norm square, the critical set is

    $$\begin{aligned} \left\{ (z,w)\in {\mathbb {C}}^2\ \Bigg | \ \frac{|z|^2}{2}-\frac{|w|^2}{2}=0\right\} , \end{aligned}$$

    which is not a manifold.
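
    To see where this critical set comes from, the following short computation may help. It is a sketch that assumes \(\mathfrak {g}^*\) is identified with \({\mathbb {R}}\) with its standard inner product, as the formula for \(\Phi \) suggests.

    $$\begin{aligned} d \Vert \Phi \Vert ^2 = 2 \, \Phi \, d\Phi , \qquad \text {so} \qquad {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 = \Phi ^{-1}(0) \cup {{\mathrm{Crit\,}}}\Phi = \Phi ^{-1}(0) , \end{aligned}$$

    because \({{\mathrm{Crit\,}}}\Phi = \{ 0 \}\) is contained in \(\Phi ^{-1}(0)\). The level set \(\Phi ^{-1}(0) = \{ |z| = |w| \}\) is the cone over the torus \(\{ |z| = |w| = 1 \}\). It is smooth away from the origin, but its link at the origin is a two-torus rather than a two-sphere, so it is not a manifold there.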

We now recall some general criteria for identifying the critical set.

Lemma 7.3

(Kirwan [6, §3]) Let \(\beta = \Phi (p)\). Let \(T_\beta \) be the closure in G of the one parameter subgroup that is generated by the element of \(\mathfrak {g}\) that corresponds to \(\beta \) by the inner product. Let \(\mathfrak {h}\) denote the Lie algebra of the stabilizer of p; let \(\mathfrak {h}^*\) be its dual, embedded in \(\mathfrak {g}^*\) by the inner product. The following conditions are equivalent.

  1. (i)

    \(p \in {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2\).

  2. (ii)

    \(\beta \perp {{\mathrm{image\,}}}\hbox {d}\Phi |_p\).

  3. (iii)

    \(\beta \in \mathfrak {h}^*\).

  4. (iv)

    p is fixed by \(T_\beta \).

Proof

Since \(\Vert \Phi \Vert ^2 = \langle \Phi , \Phi \rangle \), we have \(d \Vert \Phi \Vert ^2 = 2 \langle \hbox {d}\Phi , \Phi \rangle \). So

$$\begin{aligned} d \Vert \Phi \Vert ^2 |_p (v) = 2 \langle \hbox {d}\Phi |_p(v) , \Phi (p) \rangle . \end{aligned}$$

This vanishes for all \(v \in T_pM\) exactly if every element in the image of \(\hbox {d}\Phi |_p :T_pM \rightarrow \mathfrak {g}^*\) is perpendicular to \(\Phi (p)\). This shows that (i) is equivalent to (ii).

The subset of \(\mathfrak {g}^*\) that is identified with \(\mathfrak {h}^*\) by the inner product is exactly the orthocomplement of the annihilator \(\mathfrak {h}^0\) of \(\mathfrak {h}\) in \(\mathfrak {g}^*\). But, since \(\mathfrak {h}\) is the Lie algebra of the stabilizer of p, the image of \(\hbox {d}\Phi |_p :T_pM \rightarrow \mathfrak {g}^*\) is exactly equal to \(\mathfrak {h}^0\); this is a consequence of Hamilton’s equation for the momentum map and the non-degeneracy of the symplectic form \(\omega \). Thus, (ii) is equivalent to (iii).

Consider the isomorphism \(\mathfrak {g}^* \xrightarrow {\simeq } \mathfrak {g}\) that is induced by the inner product. Let \(\widehat{\beta }\) denote the image of \(\beta \). Then \(T_\beta \) is the closure of the one parameter subgroup generated by \(\widehat{\beta }\), and so (iv) is equivalent to the condition that \(\widehat{\beta }\) belong to the infinitesimal stabilizer at p. Applying the isomorphism \(\mathfrak {g}\rightarrow \mathfrak {g}^*\), the relation \(\widehat{\beta } \in \mathfrak {h}\) becomes (iii). \(\square \)

Example 7.4

We consider the linear action of the torus \(T^2\) on \({\mathbb {C}}^3\) with weights (1, 0), (0, 1) and \((1,-1)\):

$$\begin{aligned} (a,b) \cdot (z_1,z_2,z_3) = (az_1, bz_2, ab^{-1}z_3). \end{aligned}$$

The quadratic momentum map for this action is

$$\begin{aligned} Q :{\mathbb {C}}^3\rightarrow & {} {\mathbb {R}}^2\\ (z_1,z_2,z_3)\mapsto & {} \left( \frac{|z_1|^2}{2}+\frac{|z_3|^2}{2} ,\, \frac{|z_2|^2}{2}-\frac{|z_3|^2}{2} \right) . \end{aligned}$$

We shift it by \((-3, 1)\), to obtain the momentum map

$$\begin{aligned} \Phi \left( (z_1,z_2,z_3)\right) = \left( -3+ \frac{|z_1|^2}{2}+\frac{|z_3|^2}{2} , \, 1+\frac{|z_2|^2}{2}-\frac{|z_3|^2}{2} \right) . \end{aligned}$$

The momentum image is shown in Fig. 1.

Fig. 1 The shaded region is the momentum map image, \(\Phi \left( {\mathbb {C}}^3\right) \). The lines are critical values for \(\Phi \), and the large dots are the critical values for \(\Vert \Phi \Vert ^2\)

A point \((z_1,z_2,z_3)\in {\mathbb {C}}^3\) is a critical point for \(||\Phi ||^2\) if and only if it satisfies one of the following conditions:

  1. (i)

    \(z_1=z_2=z_3=0\);

  2. (ii)

    \(z_1=z_2=0\) and \(\displaystyle {\frac{|z_3|^2}{2}} =2\);

  3. (iii)

    \(z_2=z_3=0\) and \(\displaystyle {\frac{|z_1|^2}{2}} = 3\); or

  4. (iv)

    \(-3+\displaystyle {\frac{|z_1|^2}{2}} +\displaystyle {\frac{|z_3|^2}{2}}=0\) and \(1+\displaystyle {\frac{|z_2|^2}{2}} -\displaystyle {\frac{|z_3|^2}{2}}=0\).

We note that Condition (i) describes a single point, and each of Conditions (ii) and (iii) describes a single one-dimensional \(T^2\)-orbit. Condition (iii) does not define an entire \(\Phi \)-level set, but each of the other conditions does. Condition (iv) defines a principal \(T^2\) bundle over (the reduced space, which is) a two-sphere.
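These four conditions can be recovered from criterion (ii) of Lemma 7.3. The following is a sketch under the assumption that \({\mathbb {R}}^2\) carries the standard inner product. The abbreviations \(a_j := |z_j|^2/2\), \(\mu _1 = (1,0)\), \(\mu _2 = (0,1)\), \(\mu _3 = (1,-1)\), and \(I := \{ j \ | \ z_j \ne 0 \}\) are ours. The image of \(\hbox {d}\Phi |_z\) is the span of the weights \(\mu _j\) with \(j \in I\), so z is critical exactly if \(\Phi (z)\) is perpendicular to each of these weights:

$$\begin{aligned} I = \emptyset&: \quad z = 0, \ \text {which is (i)}; \\ I = \{ 1 \}&: \quad \langle (-3+a_1 , 1) , (1,0) \rangle = a_1 - 3 = 0, \ \text {which is (iii)}; \\ I = \{ 2 \}&: \quad \langle (-3 , 1+a_2) , (0,1) \rangle = 1 + a_2 = 0, \ \text {which is impossible}; \\ I = \{ 3 \}&: \quad \langle (-3+a_3 , 1-a_3) , (1,-1) \rangle = 2a_3 - 4 = 0, \ \text {which is (ii)}; \\ |I| \ge 2&: \quad \text {the weights span } {\mathbb {R}}^2, \text { so } \Phi (z) = 0, \ \text {which is (iv)}. \end{aligned}$$

Conversely, every point of \(\Phi ^{-1}(0)\) is critical, because \(d \Vert \Phi \Vert ^2 = 2 \langle \hbox {d}\Phi , \Phi \rangle \) vanishes there. The corresponding critical values are \((-3,1)\), \((-1,-1)\), \((0,1)\), and (0, 0).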

Let \(\Phi _T :M \rightarrow \mathfrak {t}^*\) denote the momentum map for a maximal torus T of G; thus, \(\Phi _T\) is the composition of \(\Phi :M \rightarrow \mathfrak {g}^*\) with the natural projection \(\mathfrak {g}^* \rightarrow \mathfrak {t}^*\). Using the inner product, we also view \(\mathfrak {t}^*\) as a subspace of \(\mathfrak {g}^*\).

Lemma 7.5

(Kirwan [6,  Lemma 3.1]) Suppose that \(\Phi (p) \in \mathfrak {t}^*\). Then \(p \in {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2\) if and only if \(p \in {{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2\).

Proof

Let \(\beta = \Phi (p)\), and let \(T_\beta \) denote the closure in G of the one parameter subgroup that is generated by the element of \(\mathfrak {g}\) that corresponds to \(\beta \) by the inner product. The assumption that \(\beta \in \mathfrak {t}^*\) implies that \(T_\beta \) is contained in T. The lemma then follows from the equivalence of (i) and (iv) in Lemma 7.3, applied to \(\Phi \) and to \(\Phi _T\). \(\square \)

In preparation for proving Theorem 7.1, we will examine linear symplectic actions of compact Lie groups. We start with torus actions.

Recall that, for a torus T with Lie algebra \(\mathfrak {t}\) and dual space \(\mathfrak {t}^*\), the characters (homomorphisms \(T \rightarrow S^1\)) are determined by their differentials at the identity. Having identified the Lie algebra of \(S^1\) with \({\mathbb {R}}\), these differentials form the weight lattice \(\mathfrak {t}^*_{\mathbb {Z}}\) in \(\mathfrak {t}^*\).

Lemma 7.6

Fix a linear symplectic action of a torus T on a symplectic vector space V. Fix a T invariant compatible complex structure on V. Consider the decomposition of V into weight spaces,

$$\begin{aligned} V = \bigoplus _{\mu } V_\mu . \end{aligned}$$

That is, for every weight \(\mu \in \mathfrak {t}^*_{\mathbb {Z}}\), denoting the corresponding character \(T \rightarrow S^1\) by \(a \mapsto a^\mu \), we have

$$\begin{aligned} V_\mu = \left\{ z \in V \ \left| \ a \cdot z = a^\mu z \text { for all } a \in T \right. \right\} . \end{aligned}$$
  1. (1)

    Let \(\Phi _T :V \rightarrow \mathfrak {t}^*\) be a momentum map for the T action on V. Fix an inner product on \(\mathfrak {t}^*\). Then the set \(\Phi _T \left( {{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2 \right) \) is finite.

  2. (2)

    Let \(Q :V \rightarrow \mathfrak {h}^*\) be the quadratic momentum map for a linear symplectic action of a compact Lie group H that commutes with the T action and preserves the complex structure. Then, for any \(z \in V\), writing

    $$\begin{aligned} z = \sum _{\mu } z_\mu , \quad z_\mu \in V_\mu , \end{aligned}$$

    we have

    $$\begin{aligned} Q(z)=\sum _\mu Q(z_\mu ). \end{aligned}$$

Proof of Part (1) of Lemma 7.6

We can take the indexing set for the decomposition into weight spaces to be \({\mathcal {W}}:= \left\{ \mu \in \mathfrak {t}^*_{{\mathbb {Z}}} \ | \ V_\mu \ne \{ 0 \} \right\} \). Because V is finite dimensional, this set is finite.

Let \(\beta = \Phi _T(0)\). For \(z \in V\), writing

$$\begin{aligned} z = \sum _{\mu \in {\mathcal {W}}} z_\mu , \quad z_\mu \in V_\mu , \end{aligned}$$

we have

$$\begin{aligned} \Phi _T(z) = \beta + \sum _{\mu \in {\mathcal {W}}} \frac{|z_\mu |^2}{2}\mu . \end{aligned}$$

By the equivalence of (i) and (ii) in Lemma 7.3, we have that \(z \in {{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2\) if and only if \(\Phi _T(z) \perp {{\mathrm{image\,}}}\hbox {d}\Phi _T|_z\). By the above formula for the momentum map, \(\Phi _T(z)\) belongs to the affine space

$$\begin{aligned} \beta + \text {span} \{ \mu \, | \, z_\mu \ne 0 \}, \end{aligned}$$

and \({{\mathrm{image\,}}}\hbox {d}\Phi _T|_z\) is the corresponding linear space \(\text {span} \{ \mu \, | \, z_\mu \ne 0 \}\). So the condition \(\Phi _T(z) \perp {{\mathrm{image\,}}}\hbox {d}\Phi _T|_z\) holds if and only if \(\Phi _T(z)\) is the foot of the perpendicular from the origin to the affine space.

For every subset I of \({\mathcal {W}}\), let \(\beta _I\) denote the foot of the perpendicular from the origin in \(\mathfrak {t}^*\) to the affine space \(\beta + \text {span} \{ \mu \, | \, \mu \in I \}\). Then \(\Phi _T({{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2)\) is contained in the finite set \(\{ \beta _I \, | \, I \subset {\mathcal {W}}\}\). This proves Part (1) of Lemma 7.6. \(\square \)
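When I consists of a single weight, the foot of the perpendicular has a closed form, which makes the finite set in this proof easy to list in small examples. As a sketch, we use the data of Example 7.4, namely \(\beta = (-3,1)\) and the weights (1, 0), (0, 1), \((1,-1)\), together with the standard inner product on \({\mathbb {R}}^2\), which is our assumption:

$$\begin{aligned} \beta _{\{ \mu \}}&= \beta - \frac{\langle \beta , \mu \rangle }{\langle \mu , \mu \rangle } \, \mu , \\ \beta _{\{ (1,0) \}}&= (0,1) , \qquad \beta _{\{ (0,1) \}} = (-3,0) , \qquad \beta _{\{ (1,-1) \}} = (-3,1) + 2 \, (1,-1) = (-1,-1) , \\ \beta _{\emptyset }&= (-3,1) , \qquad \beta _I = (0,0) \ \text { whenever } |I| \ge 2 . \end{aligned}$$

The value \((-3,0)\) is not attained at any critical point, since that would require \(|z_2|^2/2 = -1\). So the containment of \(\Phi _T({{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2)\) in \(\{ \beta _I \}\) can be strict.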

Proof of Part (2) of Lemma 7.6

Denote the \(\mathfrak {h}\)-action on V by \(\xi :z \mapsto \xi \cdot z\) for \(\xi \in \mathfrak {h}\). Then the quadratic momentum map is

$$\begin{aligned} Q(z) = \frac{1}{2} B(z,z) , \end{aligned}$$
(7.2)

where B is the (symmetric) \(\mathfrak {h}^*\) valued bilinear form whose components are given by

$$\begin{aligned} B^\xi (u,v) = \omega ( \xi \cdot u , v) \quad \text {for all}\;\xi \in \mathfrak {h}. \end{aligned}$$
(7.3)

The weight spaces \(V_\mu \) are H-invariant; this follows from the fact that the H action commutes with the T action, and it implies that

$$\begin{aligned} \text { for any }\mu \in {\mathcal {W}}\text { and } \xi \in \mathfrak {h}, \text { if } z_\mu \in V_\mu \text { then also } \xi \cdot z_\mu \in V_\mu . \end{aligned}$$
(7.4)

The spaces \(V_\mu \) are symplectically orthogonal:

$$\begin{aligned} \text { for any } \mu _1 \ne \mu _2, \text { if } \zeta _1 \in V_{\mu _1} \text { and } \zeta _2 \in V_{\mu _2}, \text { then } \omega (\zeta _1,\zeta _2) = 0. \end{aligned}$$
(7.5)

By (7.3), (7.4), and (7.5),

$$\begin{aligned} \text {whenever } \mu _1 \ne \mu _2, \text { we have } B(z_{\mu _1},z_{\mu _2}) = 0. \end{aligned}$$
(7.6)

Now,

$$\begin{aligned} Q(z)&= \frac{1}{2} B(z,z) \quad \text {by (7.2)} \\&= \sum _{\mu _1,\mu _2} \frac{1}{2} B(z_{\mu _1},z_{\mu _2}) \quad \text {because } B \text { is bilinear and } z=\sum z_\mu \\&= \sum _{\mu } \frac{1}{2} B(z_\mu ,z_\mu ) + \sum _{\mu _1 \ne \mu _2} \frac{1}{2} B(z_{\mu _1},z_{\mu _2}) \\&= \sum _\mu Q(z_\mu ) \quad \text { by (7.2) and (7.6)}. \end{aligned}$$

This proves Part (2) of Lemma 7.6.

Next, we examine linear symplectic actions of possibly non-abelian compact Lie groups, with attention to a neighbourhood of the origin.

Lemma 7.7

Let \(\Phi :V \rightarrow \mathfrak {h}^*\) be a momentum map for a linear symplectic action of a compact Lie group H on a symplectic vector space V. Fix an \({{\mathrm{Ad}}}\)-invariant inner product on \(\mathfrak {h}\). Then there exist

  • an H invariant linear subspace N of V,

  • an H invariant closed connected subset C of N that contains the origin, and

  • an open neighbourhood U of C in V,

such that

  1. (1)

    N is a minimizing manifold for \(\Vert \Phi \Vert ^2\) along C, and

  2. (2)

    the intersection of U with the critical set \({{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2\) is exactly C.

Construction of C and N for Lemma 7.7

Let \(\beta = \Phi (0)\). Then

$$\begin{aligned} \Phi (\cdot ) = \beta + Q(\cdot ) , \end{aligned}$$
(7.7)

where \(Q :V \rightarrow \mathfrak {h}^*\) is the quadratic momentum map. Because \(\Phi \) is equivariant, \(\beta \in \mathfrak {h}^*\) is fixed under the co-adjoint action of H.

Let \(T_\beta \) denote the closure in H of the one parameter subgroup that is generated by the element of \(\mathfrak {h}\) that is identified with \(\beta \) by the inner product. For each weight \(\mu \in (\mathfrak {t}_\beta )^*_{\mathbb {Z}}\), let \(V_\mu = \{ z \in V \ | \ a \cdot z = a^\mu z \text { for all } a \in T_\beta \}\) with respect to an H-invariant compatible complex structure. The definition of \(T_\beta \) implies that if \(\langle \mu , \beta \rangle = 0\) then all the vectors in \(V_\mu \) are fixed by \(T_\beta \) and so \(\mu = 0\). So we can write the weight space decomposition as

$$\begin{aligned} V \ = \ V^{T_\beta } \oplus \bigoplus _{\begin{array}{c} \mu \ \text {s.t.}\\ \langle \mu ,\beta \rangle > 0 \end{array}} V_\mu \oplus \bigoplus _{\begin{array}{c} \mu \ \text {s.t.} \\ \langle \mu ,\beta \rangle < 0 \end{array}} V_\mu . \end{aligned}$$

We set

$$\begin{aligned} C = Q^{-1}(0) \cap V^{T_\beta } \end{aligned}$$

and

$$\begin{aligned} N = V^{T_\beta } \oplus \bigoplus _{\begin{array}{c} \mu \ \text {s.t.} \\ \langle \mu ,\beta \rangle > 0 \end{array}} V_\mu . \end{aligned}$$

Because \(\beta \) is fixed under the co-adjoint action of H, the torus \(T_\beta \) is contained in the centre of H, so the weight spaces \(V_\mu \) are H invariant. So N is an H invariant linear subspace of V, and C is an H invariant closed (conical, hence) connected subset of N that contains the origin.\(\square \)

Proof of Part (1) of Lemma 7.7

We will now show that N is a minimizing manifold for \(\Vert \Phi \Vert ^2\) along C.

For \(z \in V\), writing

$$\begin{aligned} z = \sum _{\mu } z_\mu , \quad z_\mu \in V_\mu , \end{aligned}$$

we have

$$\begin{aligned} Q(z) = \sum _\mu \frac{|z_\mu |^2}{2} \mu . \end{aligned}$$
(7.8)

We also have

$$\begin{aligned} \Vert \Phi (z) \Vert ^2&= \Big \Vert \beta + Q(z) \Big \Vert ^2 \nonumber \\&= \Vert \beta \Vert ^2 + 2 \langle \beta , Q(z) \rangle + \Vert Q(z) \Vert ^2 \nonumber \\&= \Vert \beta \Vert ^2 + \sum _\mu |z_\mu |^2 \langle \mu , \beta \rangle + \Vert Q(z) \Vert ^2 \quad \text { by (7.8).} \end{aligned}$$
(7.9)

Now suppose that \(z \in N\). Then

$$\begin{aligned} z = z_0 + \sum _{\begin{array}{c} \mu \ \text {s.t.} \\ \langle \mu ,\beta \rangle > 0 \end{array}} z_\mu . \end{aligned}$$

From (7.9), we get that

$$\begin{aligned} \Vert \Phi (z) \Vert ^2 \ge \Vert \beta \Vert ^2 , \end{aligned}$$

with equality if and only if

$$\begin{aligned} |z_\mu |^2 \langle \mu ,\beta \rangle = 0\quad \text {for all}\; \mu \quad \text {and} \quad Q(z) = 0. \end{aligned}$$
(7.10)

The first of these two conditions holds if and only if \( z \in V^{T_\beta }\). Thus, the conditions (7.10) hold exactly if \(z \in C\).

We have shown that \(\Vert \Phi (\cdot ) \Vert ^2|_N \ge \Vert \beta \Vert ^2\), with equality exactly at the points of C. That is, N satisfies the second of the two conditions for being a minimizing manifold for \(\Vert \Phi \Vert ^2\) along C.

Because \(\Vert \Phi (\cdot ) \Vert ^2 |_N\) attains its minimum exactly at the points of C, the Hessian \({{\mathrm{Hess\,}}}\Vert \Phi \Vert ^2 |_{T_xN} \) is positive semidefinite at every \(x \in C\). It remains to show that, for all \(x \in C\), the Hessian \({{\mathrm{Hess\,}}}\Vert \Phi \Vert ^2\) is negative definite on a subspace of \(T_xV\) that is complementary to \(T_xN\). We can take this subspace to be the image of \({\bigoplus _{\begin{array}{c} \mu \ s.t. \\ \langle \mu , \beta \rangle < 0 \end{array}} } V_\mu \) under the natural identification of V with \(T_xV\). Thus, for \(x \in C\), we need to show that the Hessian of the map

$$\begin{aligned} \left( \zeta \mapsto \Vert \Phi (x+\zeta ) \Vert ^2 \right) :\bigoplus _{\begin{array}{c} \mu \ s.t. \\ \langle \mu , \beta \rangle <0 \end{array}} V_\mu \ \rightarrow \ {\mathbb {R}}\end{aligned}$$

is negative definite at \(\zeta = 0\).

Consider

$$\begin{aligned} z = x + \zeta \quad \text { with } x \in C \text { and } \zeta \in \bigoplus \limits _{\begin{array}{c} \mu \ s.t.\\ \langle \mu ,\beta \rangle <0 \end{array}} V_\mu . \end{aligned}$$

Because \(x \in C\), we have \(x \in V^{T_\beta }\). So \(z_0 = x\), and

$$\begin{aligned} |z_\mu |^2 \langle \mu , \beta \rangle = {\left\{ \begin{array}{ll} \displaystyle { |\zeta _\mu |^2 \langle \mu ,\beta \rangle } &{} \text { if } \langle \mu ,\beta \rangle < 0 \\ \ 0 &{} \text { otherwise. } \end{array}\right. } \end{aligned}$$
(7.11)

Part (2) of Lemma 7.6, applied to the actions of \(T_\beta \) and of H on V, gives \(Q(x+\zeta ) = Q(x)+Q(\zeta )\). Because \(x \in C\), we have \(Q(x)=0\). So

$$\begin{aligned} Q(z) = Q(\zeta ). \end{aligned}$$
(7.12)

Substituting (7.11) and (7.12) into (7.9), we get

$$\begin{aligned} \Vert \Phi (x+\zeta ) \Vert ^2 = \Vert \beta \Vert ^2 + \sum _{\begin{array}{c} \mu \ s.t.\\ \langle \mu ,\beta \rangle <0 \end{array}} |\zeta _\mu |^2 \langle \mu , \beta \rangle + \Vert Q(\zeta ) \Vert ^2. \end{aligned}$$

Because \(\zeta \mapsto \Vert Q(\zeta ) \Vert ^2\) is homogeneous of degree four, the Hessian of the map \(\zeta \mapsto \Vert \Phi (x+\zeta ) \Vert ^2\) at \(\zeta =0\) is the bilinear form that corresponds to the quadratic form

$$\begin{aligned} \zeta \mapsto \sum _{\begin{array}{c} \mu \ s.t. \\ \langle \mu ,\beta \rangle <0 \end{array}} |\zeta _\mu |^2 \langle \mu ,\beta \rangle . \end{aligned}$$

It is negative definite, as required. This proves that N also satisfies the first of the two conditions for being a minimizing manifold for \(\Vert \Phi \Vert ^2\) along C. This completes the proof of Part (1) of Lemma 7.7. \(\square \)
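To illustrate the construction, consider, as a sketch, the torus \(H = T^2\) acting on \(V = {\mathbb {C}}^3\) as in Example 7.4, with \(\beta = \Phi (0) = (-3,1)\) and, by our assumption, the standard inner product on \({\mathbb {R}}^2\). The weights pair with \(\beta \) as \(\langle (1,0) , \beta \rangle = -3\), \(\langle (0,1) , \beta \rangle = 1\), and \(\langle (1,-1) , \beta \rangle = -4\). So \(V^{T_\beta } = \{ 0 \}\), \(C = \{ 0 \}\), and \(N = \{ z_1 = z_3 = 0 \}\). Expanding the norm square gives

$$\begin{aligned} \Vert \Phi (z) \Vert ^2&= \left( -3 + \frac{|z_1|^2}{2} + \frac{|z_3|^2}{2} \right) ^2 + \left( 1 + \frac{|z_2|^2}{2} - \frac{|z_3|^2}{2} \right) ^2 \\&= 10 - 3 |z_1|^2 + |z_2|^2 - 4 |z_3|^2 + \Vert Q(z) \Vert ^2 , \end{aligned}$$

in agreement with (7.9). On N this equals \(10 + |z_2|^2 + |z_2|^4/4\), which attains its minimum exactly at the origin. On the complementary subspace \(\{ z_2 = 0 \}\), the quadratic part \(-3|z_1|^2 - 4|z_3|^2\) is negative definite.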

Proof of Part (2) of Lemma 7.7

Let T be a maximal torus in H that contains \(T_\beta \). Let

$$\begin{aligned} \Phi _T :V \rightarrow \mathfrak {t}^* \end{aligned}$$

denote the momentum map for T.

Consider \(\alpha \in \mathfrak {t}^* \subset \mathfrak {h}^*\), and suppose that \(\alpha \) is in \(\Phi ({{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2)\). Take any \(z \in {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 \) with \(\Phi (z) = \alpha \). Because \(\Phi (z) \in \mathfrak {t}^*\),

  • \(z \in {{\mathrm{Crit\,}}}(\Vert \Phi \Vert ^2)\) if and only if \(z \in {{\mathrm{Crit\,}}}(\Vert \Phi _T \Vert ^2)\), by Lemma 7.5;

  • \(\Phi (z) = \Phi _T(z)\).

Thus, \(\alpha \) is also in \(\Phi _T( {{\mathrm{Crit\,}}}\Vert \Phi _T \Vert ^2 )\).

By Part (1) of Lemma 7.6, applied to the T action on V, the set of such \(\alpha \) is finite. Note that this set contains \(\beta \). So we can write

$$\begin{aligned} \Phi ( {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 ) \cap \mathfrak {t}^* = \{ \beta , \alpha _1, \ldots , \alpha _m \}, \end{aligned}$$

where \(\alpha _1,\ldots ,\alpha _m\) are different from \(\beta \). Because \({{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 \) is H invariant and the momentum map is H equivariant, we conclude that

$$\begin{aligned} \Phi ( {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 ) = \{ \beta \} \cup \bigcup _{j=1}^m {{\mathrm{Ad}}}^*(H)( \alpha _j ) . \end{aligned}$$

The co-adjoint orbits \({{\mathrm{Ad}}}^*(H)(\alpha _j)\) are closed in \(\mathfrak {h}^*\) and do not contain \(\beta \), so a sufficiently small neighbourhood of \(\beta \) in \(\mathfrak {h}^*\) does not meet any of these orbits. Let U be the preimage in V of such an open neighbourhood of \(\beta \). Then

$$\begin{aligned} U \cap {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2&= \Phi ^{-1}(\beta ) \cap {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 \\&= Q^{-1}(0) \cap {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 \quad \text {by (7.7)} \\&= Q^{-1}(0) \cap V^{T_\beta } \quad \text {by the equivalence of (i) and (iv) in Lemma 7.3}\\&= C , \end{aligned}$$

as required.

This completes the proof of Part (2) of Lemma 7.7. \(\square \)
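As a sketch of how U can be chosen in practice, take Example 7.4 at the origin, with \(H = T = T^2\), \(\beta = (-3,1)\), and the standard inner product on \({\mathbb {R}}^2\), which is our assumption. The critical points described by Conditions (i) to (iv) there have \(\Phi \)-values \((-3,1)\), \((-1,-1)\), \((0,1)\), and (0, 0), respectively, so \(m = 3\). The distances from the other three values to \(\beta \) are

$$\begin{aligned} \Vert (-1,-1) - \beta \Vert = 2\sqrt{2} , \qquad \Vert (0,1) - \beta \Vert = 3 , \qquad \Vert (0,0) - \beta \Vert = \sqrt{10} . \end{aligned}$$

Hence one may take \(U = \{ z \in {\mathbb {C}}^3 \ | \ \Vert \Phi (z) - \beta \Vert < 2\sqrt{2} \}\). Since \(\Phi ^{-1}(\beta ) = \{ 0 \}\) in this example, we get \(U \cap {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2 = \{ 0 \} = C\).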

Next, we consider Hamiltonian G models such as those that occur in the local normal form theorem. We will need to keep track only of the G action and momentum map on these models.

Definition 7.8

Fix a compact Lie group G, an \({{\mathrm{Ad}}}(G)\) invariant inner product on its Lie algebra \(\mathfrak {g}\), and the corresponding \({{\mathrm{Ad}}}^*(G)\) invariant inner product on the dual space \(\mathfrak {g}^*\). A Hamiltonian G model is a manifold Y, equipped with a map

$$\begin{aligned} \Phi _Y :Y\rightarrow \mathfrak {g}^*, \end{aligned}$$

that are obtained from the following construction.

Fix an element \(\beta \in \mathfrak {g}^*\). Let \(G_\beta \) be the stabilizer of \(\beta \) under the co-adjoint action of G on \(\mathfrak {g}^*\) and \(\mathfrak {g}_\beta \) the Lie algebra of \(G_\beta \). Let H be a closed subgroup of \(G_\beta \). Let V be a symplectic vector space on which H acts linearly and symplectically with a quadratic momentum map

$$\begin{aligned} Q :V \rightarrow \mathfrak {h}^*. \end{aligned}$$

Set

$$\begin{aligned} Y \ = \ G \times _H (V \times (\mathfrak {g}_\beta /\mathfrak {h})^*). \end{aligned}$$
(7.13)

Here, H acts on V through the given action and on \((\mathfrak {g}_\beta /\mathfrak {h})^*\) through the co-adjoint representation, and Y is the quotient of \(G \times V \times (\mathfrak {g}_\beta /\mathfrak {h})^* \) by the H action

$$\begin{aligned} H \ni a :(g,z,\nu ) \mapsto (g a^{-1} , a \cdot z , a \cdot \nu ). \end{aligned}$$

For \([ g,z,\nu ]\) in the model \(G \times _H ( V \times (\mathfrak {g}_\beta /\mathfrak {h})^* )\), set

$$\begin{aligned} \Phi _Y ([g,z,\nu ]) \ = \ {{\mathrm{Ad}}}^*(g) \left( \beta + Q(z) + \nu \right) , \end{aligned}$$

where \(\mathfrak {h}^*\) and \((\mathfrak {g}_\beta /\mathfrak {h})^*\) are identified with subspaces of \(\mathfrak {g}^*\) through the given \({{\mathrm{Ad}}}^*(G)\)-invariant inner product.
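It may help to record why \(\Phi _Y\) is well defined on the quotient. The following check is a sketch that relies on two standard facts not spelled out above, which we treat as assumptions: the quadratic momentum map is H equivariant, \(Q(a \cdot z) = {{\mathrm{Ad}}}^*(a) Q(z)\), and, because the inner product is \({{\mathrm{Ad}}}\)-invariant, the inclusions of \(\mathfrak {h}^*\) and \((\mathfrak {g}_\beta /\mathfrak {h})^*\) into \(\mathfrak {g}^*\) are H equivariant. For \(a \in H \subset G_\beta \),

$$\begin{aligned} \Phi _Y ([g a^{-1}, a \cdot z, a \cdot \nu ])&= {{\mathrm{Ad}}}^*(g) \, {{\mathrm{Ad}}}^*(a^{-1}) \left( \beta + {{\mathrm{Ad}}}^*(a) Q(z) + {{\mathrm{Ad}}}^*(a) \nu \right) \\&= {{\mathrm{Ad}}}^*(g) \left( {{\mathrm{Ad}}}^*(a^{-1}) \beta + Q(z) + \nu \right) \\&= {{\mathrm{Ad}}}^*(g) \left( \beta + Q(z) + \nu \right) = \Phi _Y([g,z,\nu ]) , \end{aligned}$$

where the last line uses that a fixes \(\beta \).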

A Hamiltonian G model \(Y = G \times _H (V \times (\mathfrak {g}_\beta /\mathfrak {h})^* )\) is centred if it satisfies one, hence both, of the following equivalent conditions.

  • The value \(\beta \) is fixed by \({{\mathrm{Ad}}}^*(G)\).

  • The torus \(T_\beta \) is contained in the centre of G.

Remark 7.9

On any Hamiltonian G model \(Y=G \times _H (V \times (\mathfrak {g}_\beta /\mathfrak {h})^*)\) there exists a G-invariant closed two-form \(\omega _Y\) that is non-degenerate at the basepoint [1, 0, 0] and for which the map \(\Phi _Y :Y \rightarrow \mathfrak {g}^*\) is a momentum map. If the model is centred, there exists such an \(\omega _Y\) that is everywhere non-degenerate, and the central orbit \(G \cdot [1,0,0]\) in Y is isotropic with respect to \(\omega _Y\).

We now examine centred Hamiltonian G models.

Lemma 7.10

Fix a centred Hamiltonian G model Y with momentum map

$$\begin{aligned} \Phi _Y :Y \rightarrow \mathfrak {g}^* . \end{aligned}$$

Assume that the basepoint [1, 0, 0] of Y is a critical point for \(\Vert \Phi _Y\Vert ^2\). Then there exist

  • a G invariant closed connected subset C of Y that contains the basepoint [1, 0, 0],

  • a G invariant closed connected submanifold N of Y that contains C, and

  • a G invariant open neighbourhood U of C in Y,

such that

  (1) N is a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along C, and

  (2) the intersection of U with the critical set \({{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2\) is exactly C.

Proof

By the equivalence of (i) and (iii) in Lemma 7.3, applied to the basepoint [1, 0, 0] in the model Y, the condition that [1, 0, 0] is critical for \(\Vert \Phi _Y \Vert ^2\) is equivalent to

$$\begin{aligned} \beta \in \mathfrak {h}^*. \end{aligned}$$

(Recall that we have identified \(\mathfrak {h}^*\) with a subspace of \(\mathfrak {g}^*\).)

Because \(\beta \) is fixed by \({{\mathrm{Ad}}}^*(H)\) (as an element of \(\mathfrak {g}^*\) and hence as an element of \(\mathfrak {h}^*\)),

$$\begin{aligned} \beta + Q(\cdot ) \end{aligned}$$

is an (equivariant) momentum map for the H action on V. Applying Lemma 7.7 to it, let \(N_H\) be an H invariant subspace of V, let \(C_H\) be a closed connected subset of \(N_H\) that contains the origin such that \(N_H\) is a minimizing manifold for \(\Vert \beta + Q(\cdot ) \Vert ^2\) along \(C_H\), and let \(U_H\) be an H invariant open neighbourhood of \(C_H\) in V whose intersection with \({{\mathrm{Crit\,}}}\Vert \beta + Q(\cdot ) \Vert ^2\) is exactly \(C_H\).

Set

$$\begin{aligned} N= & {} G \times _H \left( N_H \times (\mathfrak {g}_\beta /\mathfrak {h})^* \right) ,\\ C= & {} G \times _H \left( C_H \times \{ 0 \} \right) , \end{aligned}$$

and

$$\begin{aligned} U= & {} G \times _H \left( U_H \times (\mathfrak {g}_\beta /\mathfrak {h})^* \right) . \end{aligned}$$

We have

$$\begin{aligned} \Vert \Phi _Y([g,z,\nu ]) \Vert ^2 = \Vert \beta + Q(z) + \nu \Vert ^2 = \Vert \beta + Q(z) \Vert ^2 + \Vert \nu \Vert ^2 . \end{aligned}$$
(7.14)

The first equality is by the formula for \(\Phi _Y\) and since the norm on \(\mathfrak {g}^*\) is \({{\mathrm{Ad}}}^*(G)\) invariant. The second equality is because \(\beta \) and Q(z) are in \(\mathfrak {h}^*\) and \(\nu \) is in \((\mathfrak {g}_\beta /\mathfrak {h})^*\), which, as a subspace of \(\mathfrak {g}^*\), is orthogonal to \(\mathfrak {h}^*\). From (7.14), we deduce that

$$\begin{aligned}{}[g,z,\nu ] \in {{\mathrm{Crit\,}}}\Vert \Phi _Y\Vert ^2 \quad \text { if and only if } \quad \nu = 0 \quad \text {and}\quad z \in {{\mathrm{Crit\,}}}\Vert \beta + Q(\cdot ) \Vert ^2. \end{aligned}$$
(7.15)
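
For the reader's convenience, here is a sketch of the two computations behind (7.14) and (7.15). The shorthand \(f_H := \Vert \beta + Q(\cdot ) \Vert ^2\) and the function F below are our notation and are introduced only for this illustration. Expanding the norm and using that \(\mathfrak {h}^*\) is orthogonal to \((\mathfrak {g}_\beta /\mathfrak {h})^*\) gives

$$\begin{aligned} \Vert \beta + Q(z) + \nu \Vert ^2 \ = \ \Vert \beta + Q(z) \Vert ^2 + 2 \langle \beta + Q(z) , \nu \rangle + \Vert \nu \Vert ^2 \ = \ f_H(z) + \Vert \nu \Vert ^2 . \end{aligned}$$

The pullback of \(\Vert \Phi _Y \Vert ^2\) to \(G \times V \times (\mathfrak {g}_\beta /\mathfrak {h})^*\) is therefore \(F(g,z,\nu ) = f_H(z) + \Vert \nu \Vert ^2\), whose differential is

$$\begin{aligned} dF_{(g,z,\nu )} (\dot{g}, \dot{z}, \dot{\nu }) \ = \ (df_H)_z (\dot{z}) + 2 \langle \nu , \dot{\nu } \rangle . \end{aligned}$$

This vanishes exactly if \((df_H)_z = 0\) and \(\nu = 0\). Because the quotient map \(G \times V \times (\mathfrak {g}_\beta /\mathfrak {h})^* \rightarrow Y\) is a submersion, the point \([g,z,\nu ]\) is critical for \(\Vert \Phi _Y \Vert ^2\) exactly if \((g,z,\nu )\) is critical for F.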

The properties of \(C_H\) and \(U_H\), and (7.15), imply that C is closed in the model Y and U is a neighbourhood of C in the model Y whose intersection with \({{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2\) is equal to C.

Because \(\Vert \beta + Q(\cdot ) \Vert ^2 |_{N_H}\) attains its minimum exactly on \(C_H\), and by (7.14), we conclude that \(\Vert \Phi _Y \Vert ^2|_{N}\) attains its minimum exactly on C. So N satisfies the second of the two conditions for being a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along C.

As in the proof of Lemma 7.7, let \(E_H\) be an H invariant complementary subspace to \(N_H\) in V, such that, for any \(x_H \in C_H\), the Hessian at \(\zeta = 0\) of the function

$$\begin{aligned} ( \zeta \mapsto \Vert \beta + Q(x_H+\zeta )\Vert ^2 ) :\ E_H \rightarrow {\mathbb {R}}\end{aligned}$$
(7.16)

is negative definite. Now take an arbitrary point of C; write it as \(x = [g,x_H,0]\) with \(g \in G\) and \(x_H \in C_H\). The map \(\zeta \mapsto [g,x_H+\zeta ,0]\), from a sufficiently small neighbourhood of the origin in \(E_H\) to Y, provides a transverse slice to N at x. The pullback of \(\Vert \Phi _Y \Vert ^2\) by this map is the map (7.16), whose Hessian at \(\zeta =0\) is negative definite. So N also satisfies the first of the two conditions for being a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along C. \(\square \)
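
To illustrate Lemma 7.10 in the simplest case, here is a sketch under assumptions that are ours and not taken from the text above: let \(G = H = S^1\) act on \(V = {\mathbb {C}}\) by rotations, identify \(\mathfrak {g}^*\) with \({\mathbb {R}}\), and assume sign and normalization conventions in which \(Q(z) = \tfrac{1}{2} |z|^2\). Then \((\mathfrak {g}_\beta /\mathfrak {h})^* = 0\), the model is \(Y = {\mathbb {C}}\), it is centred because G is abelian, and

$$\begin{aligned} \Vert \Phi _Y (z) \Vert ^2 \ = \ \left( \beta + \tfrac{1}{2} |z|^2 \right) ^2 \ = \ \beta ^2 + \beta |z|^2 + \tfrac{1}{4} |z|^4 . \end{aligned}$$

The origin is critical for every \(\beta \), and the Hessian at the origin is \(2\beta \) times the identity.

  • If \(\beta > 0\), the critical set is \(\{ 0 \}\), and one may take \(C = \{ 0 \}\) and \(N = U = {\mathbb {C}}\).

  • If \(\beta < 0\), the critical set is the union of \(\{ 0 \}\) with the circle \(|z|^2 = -2\beta \). The Hessian at the origin is negative definite, and one may take \(C = N = \{ 0 \}\) and \(U = \{ |z|^2 < -2\beta \}\).

  • If \(\beta = 0\), the function is \(\tfrac{1}{4} |z|^4\), whose Hessian at the origin vanishes, so the function is not Morse–Bott there. Nevertheless \(C = \{ 0 \}\) and \(N = U = {\mathbb {C}}\) satisfy the conclusions of the lemma.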

We now examine Hamiltonian G models that are not necessarily centred.

Lemma 7.11

Fix a Hamiltonian G model Y with momentum map

$$\begin{aligned} \Phi _Y :Y\rightarrow \mathfrak {g}^*. \end{aligned}$$

Assume that the basepoint [1, 0, 0] of Y is a critical point for \(\Vert \Phi _Y\Vert ^2\). Then there exist

  • a G invariant closed connected subset C of Y that contains the basepoint [1, 0, 0],

  • a G invariant closed connected submanifold N of Y that contains C, and

  • a G invariant open neighbourhood U of the basepoint [1, 0, 0] in Y,

such that

  (1) N is a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along C, and

  (2) the intersection of U with the critical set \({{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2\) is exactly \(U\cap C\).

Proof

We use the notation of Definition 7.8. We have the centred Hamiltonian \(G_\beta \) model

$$\begin{aligned} Y' = G_\beta \times _H (V \times (\mathfrak {g}_\beta /\mathfrak {h})^*) \end{aligned}$$

with the momentum map

$$\begin{aligned} \Phi ' :Y' \rightarrow \mathfrak {g}_\beta ^* , \quad \Phi '([g,z,\nu ]) = {{\mathrm{Ad}}}^*(g)(\beta +Q(z)+\nu ) . \end{aligned}$$

We can identify

$$\begin{aligned} Y = G \times _{G_\beta } Y' , \end{aligned}$$
(7.17)

exhibiting Y as a bundle with fibre \(Y'\) and base \(G/G_\beta \). (The base is also naturally identified with the co-adjoint orbit through \(\beta \).)

Let \(U_{\mathfrak {g}_\beta ^*}\) be a \(G_\beta \) invariant neighbourhood of \(\beta \) in \(\mathfrak {g}_\beta ^*\) (where we have identified \(\mathfrak {g}_\beta ^*\) with a subspace of \(\mathfrak {g}^*\)) that is sufficiently small so that the “sweeping map”

$$\begin{aligned} \left( [g,\eta ] \mapsto {{\mathrm{Ad}}}^*(g)(\eta ) \right) :G \times _{G_\beta } U_{\mathfrak {g}_\beta ^*} \rightarrow \mathfrak {g}^* \end{aligned}$$
(7.18)

is a G equivariant open embedding.
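
For instance, here is an illustration of ours, under the standard identification of \(\mathfrak {su}(2)^*\) with \({\mathbb {R}}^3\) in which the co-adjoint action is by rotations and the invariant inner product is a multiple of the Euclidean one. Take \(G = SU(2)\) and \(\beta \ne 0\). Then \(G_\beta \) is the circle subgroup that fixes the axis through \(\beta \), the subspace \(\mathfrak {g}_\beta ^*\) is the line \({\mathbb {R}}\beta \), and one may take \(U_{\mathfrak {g}_\beta ^*} = \{ t \beta \ : \ |t-1| < \varepsilon \}\) with \(0< \varepsilon < 1\). The image of the sweeping map is the open spherical shell

$$\begin{aligned} \left\{ \xi \in {\mathbb {R}}^3 \ : \ (1-\varepsilon ) \Vert \beta \Vert< \Vert \xi \Vert < (1+\varepsilon ) \Vert \beta \Vert \right\} . \end{aligned}$$

The map is injective on this neighbourhood because \(t > 0\) is determined by \(\Vert \xi \Vert \) and g is determined by the direction of \(\xi \) up to right multiplication by \(G_\beta \). If the neighbourhood were allowed to contain the origin, injectivity would fail, since every \([g,0]\) is sent to 0.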

The restriction of

$$\begin{aligned} \Phi _Y :Y \rightarrow \mathfrak {g}^* \end{aligned}$$

to the open subset

$$\begin{aligned} G \times _{G_\beta } \left( {\Phi '}^{-1} (U_{\mathfrak {g}_\beta ^*}) \right) \end{aligned}$$

(with the identification (7.17)) is given by the composition of the map

$$\begin{aligned} \left( [g,y'] \mapsto [g,\Phi '(y')] \right) :G \times _{G_\beta } Y' \rightarrow G \times _{G_\beta } \mathfrak {g}_\beta ^* \end{aligned}$$

with the open embedding (7.18). So, for any \(y' \in {\Phi '}^{-1} (U_{\mathfrak {g}_\beta ^*}) \), we have

$$\begin{aligned}{}[g,y'] \in {{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2 \quad \text { if and only if } \quad y' \in {{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2. \end{aligned}$$

In particular, because the basepoint [1, 0, 0] of Y is critical for \(\Vert \Phi _Y \Vert ^2\), the basepoint [1, 0, 0] of \(Y'\) is also critical for \(\Vert \Phi ' \Vert ^2\). (Alternatively, this fact follows from the equivalence of (i) and (iii) in Lemma 7.3, applied to the basepoint [1, 0, 0] in the model \(Y'\) and to the basepoint [1, 0, 0] in the model Y. In both cases, being critical for the norm squared of the momentum map is equivalent to \(\beta \in \mathfrak {h}^*\), where we embed \(\mathfrak {h}^* \subset \mathfrak {g}_\beta ^* \subset \mathfrak {g}^*\).)

Applying Lemma 7.10 to the centred Hamiltonian \(G_\beta \) model \(Y'\), let \(C'\) be a \(G_\beta \) invariant closed connected subset of \(Y'\) that contains the basepoint [1, 0, 0], let \(N'\) be a \(G_\beta \) invariant closed connected submanifold of \(Y'\) that contains \(C'\), and let \(U'\) be a \(G_\beta \) invariant open neighbourhood of \(C'\) in \(Y'\), such that

  (1) \(N'\) is a minimizing manifold for \(\Vert \Phi '\Vert ^2\) along \(C'\), and

  (2) the intersection of \(U'\) with the critical set \({{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2\) is exactly \(C'\).

With the identification (7.17), set

$$\begin{aligned} C = G \times _{G_\beta } C' \end{aligned}$$

and

$$\begin{aligned} N = G \times _{G_\beta } N' . \end{aligned}$$

Then C is a G invariant closed connected subset of Y that contains the basepoint [1, 0, 0], and N is a G invariant closed connected submanifold of Y that contains C. Because \(\Vert \Phi _Y ([g,y']) \Vert ^2 = \Vert \Phi '(y') \Vert ^2\), the fact that \(N'\) is a minimizing manifold for \(\Vert \Phi ' \Vert ^2\) along \(C'\) implies that N is a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along C.

Set

$$\begin{aligned} U = G \times _{G_\beta } \left( U' \cap {\Phi '}^{-1} (U_{\mathfrak {g}_\beta ^*}) \right) . \end{aligned}$$

As noted earlier, for \(y=[g,y'] \in U\), we have

$$\begin{aligned} y \in {{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2 \quad \text { if and only if } \quad y' \in {{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2. \end{aligned}$$

The fact that \(U' \cap {{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2 = C'\) then implies that \(U \cap {{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2 = U \cap C\). \(\square \)
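
The last implication can be unwound as follows, where \(W := {\Phi '}^{-1} (U_{\mathfrak {g}_\beta ^*})\) is shorthand of ours. Since \(U' \cap {{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2 = C'\), and in particular \(C' \subseteq U'\), we get

$$\begin{aligned} U \cap {{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2 &= G \times _{G_\beta } \left( U' \cap W \cap {{\mathrm{Crit\,}}}\Vert \Phi ' \Vert ^2 \right) \\ &= G \times _{G_\beta } \left( C' \cap W \right) \\ &= G \times _{G_\beta } \left( U' \cap W \cap C' \right) \ = \ U \cap C . \end{aligned}$$

The first equality uses the criterion for \(y = [g,y'] \in U\) to be critical, and the remaining equalities are set-theoretic.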

We can now prove Theorem 7.1.

Proof of Theorem 7.1

Recall that \(\Phi :M \rightarrow \mathfrak {g}^*\) is an (equivariant) momentum map for the action of a compact Lie group G on a symplectic manifold M and that we have fixed an \({{\mathrm{Ad}}}\)-invariant inner product on \(\mathfrak {g}\) and the induced inner product on \(\mathfrak {g}^*\).

Fix any critical point \(p \in {{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2\).

By the local normal form theorem (see Guillemin–Sternberg [3] or Marle [9]; also see Sjamaar [11, pp. 77–78]), for each G orbit \(G \cdot p\) in M there exists a Hamiltonian G model

$$\begin{aligned} Y = G \times _H (V \times (\mathfrak {g}_\beta /\mathfrak {h})^*) , \quad \Phi _Y :Y \rightarrow \mathfrak {g}^* , \end{aligned}$$
(7.19)

and a G equivariant diffeomorphism

$$\begin{aligned} {\mathcal {O}}_M \rightarrow {\mathcal {O}}_Y \end{aligned}$$
(7.20)

from a neighbourhood \({\mathcal {O}}_M\) of the orbit \(G \cdot p\) in M to a neighbourhood \({\mathcal {O}}_Y\) of the central orbit \(G \cdot [1,0,0]\) in Y that takes p to [1, 0, 0] and whose composition with \(\Phi _Y\) is \(\Phi \).

Because p is a critical point for \(\Vert \Phi \Vert ^2\), the basepoint [1, 0, 0] of Y is a critical point for \(\Vert \Phi _Y \Vert ^2\). By Lemma 7.11, there exist a G-invariant closed connected subset \(C_Y\) of Y, a G-invariant closed connected submanifold \(N_Y\) of Y that contains \(C_Y\), and a G-invariant open neighbourhood \(U_Y\) of the basepoint [1, 0, 0] in Y, such that \(N_Y\) is a minimizing manifold for \(\Vert \Phi _Y \Vert ^2\) along \(C_Y\), and such that the intersection of \(U_Y\) with the critical set \({{\mathrm{Crit\,}}}\Vert \Phi _Y \Vert ^2\) is exactly \(U_Y \cap C_Y\).

Let C, N, and U be the preimages of \({\mathcal {O}}_Y \cap C_Y\), of \({\mathcal {O}}_Y \cap N_Y\), and of \({\mathcal {O}}_Y \cap U_Y\), under the local normal form diffeomorphism (7.20). Then N is a minimizing manifold for \(\Vert \Phi \Vert ^2\) along C, and U is an open neighbourhood of \(G \cdot p\) whose intersection with \({{\mathrm{Crit\,}}}\Vert \Phi \Vert ^2\) is \(U \cap C\).

Because the critical point p was arbitrary, this shows that every point in M has a neighbourhood on which \(\Vert \Phi \Vert ^2 \) is minimally degenerate. By Theorem 2.8, we conclude that \(\Vert \Phi \Vert ^2\) is minimally degenerate. This completes the proof of Theorem 7.1. \(\square \)

8 Morse theoretic consequences

For the convenience of the reader, and to put our work in context, we now recall the main topological consequences of the fact that the norm square of the momentum map is minimally degenerate.

Let M be a compact manifold and \(f :M \rightarrow {\mathbb {R}}\) a smooth function that is minimally degenerate. So the critical set \({{\mathrm{Crit\,}}}f\) is a locally finite union of closed subsets C, on each of which f is constant, and, for each such critical set C, there exists a minimizing submanifold \(N_C\) for f along C.

Kirwan developed the analytic tools necessary to extend results about Morse functions to minimally degenerate functions [6, § 10]. There exists a Riemannian metric on M for which the gradient vector field of f is tangent to the minimizing manifold \(N_C\) on a neighbourhood of C, for each critical set C. For such a Riemannian metric, we let

$$\begin{aligned} S_C := \left\{ x\in M \ \left| \ \begin{array}{c} \hbox {the gradient trajectory for } -f \hbox { starting at } x \\ \hbox { has a limit point in } C\end{array} \right. \right\} . \end{aligned}$$

Kirwan then established the following facts. First, \(S_C\) is a submanifold of M that coincides with \(N_C\) near C. Moreover, the inclusion map \(C \subset S_C\) induces an isomorphism in Čech cohomology [6, Lemma 10.17]. Finally, the submanifolds \(S_C\) give a decomposition of M into a disjoint union

$$\begin{aligned} M = \bigsqcup _C S_C \end{aligned}$$
(8.1)

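that satisfies the frontier condition

$$\begin{aligned} \overline{S_C} \ \subseteq \ S_C \cup \bigcup _{f(C') > f(C)} S_{C'} . \end{aligned}$$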

In the presence of a compact connected group action that preserves f, we can choose the Riemannian metric to be invariant, and then the submanifolds \(S_C\) are invariant and the inclusions \(C \rightarrow S_C\) induce isomorphisms in equivariant cohomology. The decomposition (8.1) gives rise to the Morse inequalities and, in the presence of a group action, to the equivariant Morse inequalities. When \(f=\Vert \Phi \Vert ^2\) is the norm square of the momentum map for a Hamiltonian action of a compact Lie group, (8.1) leads to Kirwan surjectivity [6, pp. 31–34].

Theorem 8.1

(Kirwan surjectivity) Let a compact Lie group G act on a compact symplectic manifold M with momentum map \(\Phi :M \rightarrow \mathfrak {g}^*\). Then the inclusion \(\Phi ^{-1}(0)\rightarrow M\) induces a surjection in equivariant cohomology

$$\begin{aligned} H_G^*(M;{\mathbb {Q}})\rightarrow H_G^*(\Phi ^{-1}(0);{\mathbb {Q}}). \end{aligned}$$

For a sufficiently small ball U about the origin in \(\mathfrak {g}^*\), there exists an equivariant deformation retraction from the preimage \(\Phi ^{-1}(U)\) to the level set \(\Phi ^{-1}(0)\). If 0 is a regular value of \(\Phi \), this follows from the tubular neighbourhood theorem; in general, it follows from the results of [8]. Theorem 8.1 then follows from the proof of [6, Lemma 2.18]. We note that a key technical tool in the proof of [6, Lemma 2.18] is the Atiyah–Bott Lemma [1, Proposition 13.4], which provides a condition that guarantees that an equivariant Euler class is not a zero divisor. The Atiyah–Bott Lemma may be applied to the (normal bundles of the) strata \(S_C\), but not necessarily to the critical sets themselves.
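
As an illustration of the decomposition (8.1) and of Theorem 8.1 in the simplest case, consider the following example, which is ours and uses a sign convention for the momentum map chosen for convenience. Let \(S^1\) rotate the unit sphere \(S^2 \subset {\mathbb {R}}^3\) about the vertical axis, with momentum map the height function \(h :S^2 \rightarrow [-1,1] \subset {\mathbb {R}} \cong \mathfrak {g}^*\), and set \(f = h^2\). Then

$$\begin{aligned} df \ = \ 2 h \, dh , \qquad {{\mathrm{Crit\,}}}f \ = \ \{ h = 0 \} \ \sqcup \ \{ \text {north pole} \} \ \sqcup \ \{ \text {south pole} \} , \end{aligned}$$

with \(f = 0\) on the equator and \(f = 1\) at the poles. For the round metric, every point other than a pole flows under the gradient of \(-f\) along its meridian to the equator, so the decomposition (8.1) reads

$$\begin{aligned} S^2 \ = \ \left( S^2 \smallsetminus \{ \text {poles} \} \right) \ \sqcup \ \{ \text {north pole} \} \ \sqcup \ \{ \text {south pole} \} . \end{aligned}$$

The closure of the open stratum adds only the two poles, where f takes a strictly larger value. The level set \(h^{-1}(0)\) is the equator, on which \(S^1\) acts freely, so its rational equivariant cohomology is that of a point, concentrated in degree 0. The restriction map from \(H^0_{S^1}(S^2;{\mathbb {Q}}) = {\mathbb {Q}}\) is onto, in agreement with Theorem 8.1.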

Remark 8.2

Lerman’s paper [8] gives a retraction from \(\Phi ^{-1}(U)\) to \(\Phi ^{-1}(0)\) which is an equivariant homotopy inverse to the inclusion map of \(\Phi ^{-1}(0)\) in \(\Phi ^{-1}(U)\). Usually this retraction is only continuous and not smooth. We believe that the inclusion map of \(\Phi ^{-1}(0)\) in \(\Phi ^{-1}(U)\) does have a smooth equivariant homotopy inverse (whose restriction to \(\Phi ^{-1}(0)\) is homotopic to the identity but not equal to the identity). Details will appear elsewhere.

Acknowledgements

The first author has been partially supported by Simons Foundation Grant \(\#208975\) and National Science Foundation Grant DMS-1206466. The second author is partially supported by the Natural Sciences and Engineering Research Council of Canada. We would like to thank Gwyneth Whieldon for helping us resolve a stubborn LaTeX challenge.