1 Introduction

Many invariants of a hypersurface singularity can be computed from its Newton diagram, if the singularity is non-degenerate. Almost all singularities with a given diagram are non-degenerate, but a given function is degenerate for most choices of coordinates and for most functions it is even impossible to find suitable coordinates in which the function is non-degenerate. Sometimes this becomes possible after adding a quadratic form in new variables to the function. Invariants computed from the Newton diagram of the new function allow conclusions about the original singularity. A successful case is the study of Luengo’s example [18] of a non-smooth \(\mu \)-const stratum in [27]. Attention to the fact that a singularity can be made non-degenerate by a coordinate transformation after adding variables was drawn by Arnold, who raised the question whether this is always possible.

Problem 3 of Arnold’s list [1] in the Arcata volume (the Russian version of the problem is older, see problems 1975-3 and 1976-8 in [2]) reads:

Is every function stably equivalent to a \(\varGamma \)-non-degenerate function (in a neighbourhood of a critical point of finite multiplicity)?

Function germs are R-equivalent (or shortly equivalent) if they can be turned into each other under the action of invertible coordinate changes, and stably equivalent if they become equivalent after the addition of non-degenerate quadratic forms in additional variables [3, 11.1]. The function \(f(x_0,\dots ,x_n)+Q(y_0,\dots ,y_m)\) with Q a non-degenerate quadratic form is called a stabilisation of the function \(f(x_0,\dots ,x_n)\).

In this paper, we argue that the answer to Arnold’s question is negative. For short, we call a function stably degenerate if it is not stably equivalent to a non-degenerate function.

The Newton number \(\nu (\varGamma )\) of a Newton diagram \(\varGamma \) gives a lower bound for the Milnor number \(\mu (f)\), and for a non-degenerate function f the equality \(\mu (f)=\nu (\varGamma (f))\) holds [16]. This equality is a necessary and sufficient condition for a weaker non-degeneracy condition, defined by Mondal [21]. His partial non-degeneracy condition does not involve the partial derivatives of initial forms, but initial forms of the partial derivatives. Another condition (NPND\(^*\)) was introduced by Wall [31], who wanted a condition sufficient for the principal results of the theory, and wide enough to include all weighted homogeneous functions with isolated singularity. Following the terminology of Boubakri, Greuel and Markwig [6], we call it inner non-degeneracy. Conjecturally Mondal’s and Wall’s conditions are equivalent in characteristic zero.

In finite characteristic, one can make the same definitions, but the results are weaker. It is no longer true that the generic function with a given diagram is non-degenerate. This is related to the occurrence of wild vanishing cycles (see SGA 7 [11]). We conjecture that these do not appear for non-degenerate singularities. This implies that the simplest example of singularities with finite Milnor number, \(x^p+x^q\) in characteristic p, is not stably equivalent to a non-degenerate function.

This negative answer does not extend to the case of real or complex functions. I found a number of successful cases, using basically only one trick, which, however, goes a long way. For lack of counterexamples, I expected that every function could be made non-degenerate. The first indication that this is not true came from considering deformations on the \(\mu \)-const stratum in Luengo’s example. A closer analysis led to simpler examples. The easiest example (see Example 3.8) is the singularity

$$\begin{aligned} f_{23}=x^5+xy^3+z^3-3x^2yz+ x^4y\;. \end{aligned}$$

We conjecture that in fact \(\mu (f)=\mu (\tilde{f})>\nu (\varGamma (\tilde{f}))\) for every \(\tilde{f}\), stably equivalent to \(f_{23}\). This implies degeneracy for all three concepts. We present Luengo’s example and other examples, both stably non-degenerate and conjecturally stably degenerate. In particular, we conjecture that there are stably degenerate and stably non-degenerate functions on the same \(\mu \)-constant stratum. This means that simple and less simple topological invariants do not discriminate between stably degenerate and non-degenerate singularities.

The main reason to conjecture that our examples are stably degenerate is that our methods do not work in these cases. We describe why they have to fail. This does not exclude the possibility that some unknown, complicated transformation makes the function non-degenerate after stabilisation.

In the last section, evidence is presented that every irreducible plane curve singularity (with an arbitrary number of Puiseux pairs) is stably equivalent to a non-degenerate singularity. The number of variables is rapidly increasing, making it difficult to determine the Newton diagram and check non-degeneracy, but the form of the equations indicates that the defining functions are non-degenerate.

2 Non-degenerate Functions

We recall the standard definition of non-degeneracy, given by Kouchnirenko [16], and the related concepts of Wall [31] and Mondal [21].

2.1 The Newton Diagram

Let \(f\in k[[x_1,\dots ,x_n]]\) be a formal power series over a field k, with algebraic closure K. Write (in multi-index notation) \(f=\sum a_mx^m\). The support of f is \({{\,\mathrm{Supp}\,}}(f)=\{m\in {\mathbb N}^n\mid a_m\ne 0\}\) (note that \(0\in {\mathbb N}\)). We will assume that \(f(0)=0\), so \(0\notin {{\,\mathrm{Supp}\,}}(f)\) (otherwise one defines the reduced support by removing the origin [4, 6.2.1]). A Newton diagram \(\varGamma ({\mathcal A})\) can be defined for an arbitrary subset \({\mathcal A}\) of \({\mathbb N}^n\) not containing the origin. The Newton diagram \(\varGamma (f)\) of f is then the Newton diagram of its support. The Newton polyhedron \(\varGamma _+({\mathcal A})\) is the convex hull of the set \(\bigcup _{m\in {\mathcal A}}(m+{\mathbb R}_+^n) \subset {\mathbb R}^n\). The Newton diagram \(\varGamma ({\mathcal A})\) of \({\mathcal A}\) is the union of all compact faces of \(\varGamma _+({\mathcal A})\). The union \(\varGamma _-({\mathcal A})\) of all segments connecting the origin and the Newton diagram is the Newton polytope. See Fig. 1 for an example.

A set \({\mathcal A}\) is convenient if it contains a point on each coordinate axis. A series f is convenient if its support is convenient, that is, if for every \(1\le i \le n\) there is an \(m_i\) such that the monomial \(x_i^{m_i}\) occurs with non-zero coefficient. In that case, the Newton diagram is also called convenient.

Fig. 1 The Newton polyhedron \(\varGamma _+(f)\) of \(f=(y^2-x^3)^2-4x^5y+x^7\)

Given \(f=\sum a_mx^m\) and a subset \(S\subset {\mathbb R}^n\) (e.g. a face \(\varDelta \) of \(\varGamma (f)\)), we denote by \(f_S\) the series \(\sum _{m\in S}a_mx^m\). The principal part of f is the polynomial \(f_{\varGamma }=\sum _{m\in \varGamma (f)}a_mx^m\). The classical concept of non-degeneracy is treated by Kouchnirenko [16].

Definition 1.1

The series f is non-degenerate if for every closed face \(\varDelta \subset \varGamma (f)\) the polynomials

$$\begin{aligned} x_1\frac{\partial f_\varDelta }{\partial x_{1}},\dots , x_n\frac{\partial f_\varDelta }{\partial x_{n}} \end{aligned}$$

have no common zero on the torus \((K^*)^n\).

Fig. 2 The Newton diagram \(\varGamma (\tilde{f})\) of \(\tilde{f}=-z^2+2z(y^2-x^3)-4x^5y+x^7\)

Example 1.2

The function \(f=(y^2-x^3)^2-4x^5y+x^7\) is degenerate. Its Newton diagram \(\varGamma (f)\) can be seen in Fig. 1 as the line between the Newton polytope \(\varGamma _-(f)\) and the Newton polyhedron \(\varGamma _+(f)\). If \({{\,\mathrm{char}\,}}k\ne 2\), the function f is stably equivalent to the function \(\tilde{f}=-\tilde{z}^2+(y^2-x^3)^2-4x^5y+x^7= -z^2+2z(y^2-x^3)-4x^5y+x^7\) (where \(f-\tilde{z}^2\) is a stabilisation and \(\tilde{z}= z-y^2+x^3\) a coordinate transformation), which is non-degenerate provided \({{\,\mathrm{char}\,}}k \ne 3,13\); its Newton diagram is shown in Fig. 2. The function f is convenient, but \(\tilde{f}\) is not.
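The degeneracy of f in this example can be checked by hand on the single one-dimensional face of \(\varGamma (f)\), where \(f_\varDelta =(y^2-x^3)^2\). The following Python sketch (working over the rationals, with polynomials stored as dictionaries; the helper names are ours and purely illustrative) verifies that \(x\frac{\partial f_\varDelta }{\partial x}\) and \(y\frac{\partial f_\varDelta }{\partial y}\) vanish simultaneously at points of the curve \(y^2=x^3\) in the torus.

```python
from fractions import Fraction

# Face polynomial f_Delta = (y^2 - x^3)^2 = y^4 - 2 x^3 y^2 + x^6,
# stored as {(exponent of x, exponent of y): coefficient}.
f_delta = {(0, 4): 1, (3, 2): -2, (6, 0): 1}


def log_derivative(poly, i):
    """Return x_i * d(poly)/dx_i: each monomial is scaled by its i-th exponent."""
    return {m: m[i] * c for m, c in poly.items() if m[i] * c != 0}


def evaluate(poly, point):
    """Evaluate a polynomial in dictionary form at a rational point."""
    total = Fraction(0)
    for m, c in poly.items():
        term = Fraction(c)
        for exponent, value in zip(m, point):
            term *= Fraction(value) ** exponent
        total += term
    return total


def is_common_torus_zero(poly, point):
    """True if the point has non-zero coordinates and all x_i * d(poly)/dx_i vanish there."""
    if any(v == 0 for v in point):
        return False
    return all(evaluate(log_derivative(poly, i), point) == 0
               for i in range(len(point)))


# (t^2, t^3) parametrises y^2 = x^3; t = 1 and t = 2 give points of the torus.
print(is_common_torus_zero(f_delta, (1, 1)))
print(is_common_torus_zero(f_delta, (4, 8)))
```

A single common zero on the torus already violates Definition 1.1, so no search over all faces is needed to conclude degeneracy here.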

2.2 Milnor and Newton Number

If f is non-degenerate, many invariants can be computed from the Newton diagram. We concentrate here on the Milnor number

$$\begin{aligned} \mu (f)=\dim _k k[[x_1,\dots ,x_n]]\big /\big (\textstyle \frac{\partial f}{\partial x_{1}},\dots , \frac{\partial f}{\partial x_{n}}\big )\;. \end{aligned}$$

Note that \(\mu (f)\) can be infinite.

For any compact polytope S in \({\mathbb R}_+^n\) with the origin as vertex, we denote by \(V_k(S)\) the sum of the k-dimensional volumes of the intersections of S with the k-dimensional coordinate subspaces of \({\mathbb R}^n\), and we define following Kouchnirenko [16] its Newton number to be

$$\begin{aligned} \nu (S) =\sum _{k=0}^n(-1)^{n-k}k!V_k(S)\;. \end{aligned}$$

Definition 1.3

The Newton number \(\nu (f)\) of a convenient series f is the Newton number of its Newton polytope \(\varGamma _-(f)\). For a non-convenient series, \(\nu (f):=\sup _{m\in {\mathbb N}} \nu (f+\sum x_i^{m})\).

Likewise one can define the Newton number \(\nu ({\mathcal A})\) of a set \({\mathcal A}\); it is in fact the common value of \(\nu (f)\) for all f with \({{\,\mathrm{Supp}\,}}(f)={\mathcal A}\).
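For \(n=2\) the definition reduces to \(\nu =2V_2-V_1+1\), which is easy to evaluate mechanically. The sketch below is only an illustration for convenient finite supports in the plane; the function names are hypothetical and not taken from any library. It determines the vertices of the Newton diagram by a lower convex hull computation and then evaluates the formula.

```python
def newton_diagram_2d(support):
    """Vertices of the Newton diagram of a finite planar support,
    ordered by increasing first coordinate."""
    hull = []
    for p in sorted(set(support)):
        # A vertex of a compact face must lie strictly lower than all points to its left.
        if hull and p[1] >= hull[-1][1]:
            continue
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross <= 0:  # hull[-1] lies on or above the segment from hull[-2] to p
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def newton_number_2d(support):
    """Newton number 2*V_2 - V_1 + 1 of a convenient support in the plane."""
    verts = newton_diagram_2d(support)
    if verts[0][0] != 0 or verts[-1][1] != 0:
        raise ValueError("support is not convenient")
    # Shoelace formula for the polygon with vertices 0 and the diagram vertices.
    polygon = [(0, 0)] + verts[::-1]
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        twice_area += x1 * y2 - x2 * y1
    twice_area = abs(twice_area)
    v1 = verts[-1][0] + verts[0][1]  # lengths cut out on the two coordinate axes
    return twice_area - v1 + 1


# Support of f = (y^2 - x^3)^2 - 4 x^5 y + x^7 from Example 1.2.
support_f = [(0, 4), (3, 2), (6, 0), (5, 1), (7, 0)]
print(newton_diagram_2d(support_f), newton_number_2d(support_f))
```

For the function of Example 1.2, the diagram is the segment from (0, 4) to (6, 0) and the sketch gives \(\nu (f)=24-10+1=15\).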

The main result of Kouchnirenko [16] is:

Theorem 1.4

For every series \(f\in k[[x_1,\dots ,x_n]]\), one has \(\mu (f)\ge \nu (f)\). Equality holds if f is convenient and non-degenerate. If \({{\,\mathrm{char}\,}}k=0\), then equality holds also for non-degenerate series which are not convenient. Moreover, then almost all series with given Newton diagram are non-degenerate.

Kouchnirenko proves that in characteristic zero the set of degenerate principal parts is a proper algebraic subset in the variety of all principal parts corresponding to a given Newton diagram [16, Théorème I (iii)]. Furthermore, given any subset \({\mathcal A}\subset {\mathbb N}^n{\setminus }\{0\}\) with \(\nu ({\mathcal A})<\infty \), there exists a non-degenerate series f with \({{\,\mathrm{Supp}\,}}(f)={\mathcal A}\) [16, 1.13 Remarque (i)]; see also [17], where a combinatorial criterion on \({\mathcal A}\) for \(\nu ({\mathcal A})<\infty \) is given.

For non-isolated singularities, the meaning of \(\nu (\varGamma _-(f))\) is in the complex case given by a theorem of Varchenko [30], conjectured by Kouchnirenko [16].

Theorem 1.5

For a non-degenerate series \(f\in {\mathbb C}\{x_1,\dots ,x_n\}\), the Newton number \(\nu (\varGamma _-(f))\) is equal to \((-1)^{n-1}(\chi (F)-1)\), where \(\chi (F)\) is the Euler characteristic of the Milnor fibre.

The converse of Theorem 1.4 does not hold in general: for degenerate series it can be that \(\mu (f)= \nu (f)\).

Example 1.6

The simplest example is the function \((z+x)^2+xy+y^2\) [16, Remarque 1.21]. More generally, one can start from any homogeneous isolated curve singularity of the form \(yf(x,y)\) of degree d and make a suspension \(z^d+yf(x,y)\). A simple linear coordinate transformation gives the degenerate function \(g=(z+x)^d+yf(x,y)\), but \(\mu (g)=\nu (g)=(d-1)^3\).
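The value \((d-1)^3\) of the Newton number in this example follows directly from the definition, since the Newton polytope is the simplex with vertices 0 and \(d\,e_1,d\,e_2,d\,e_3\). As a small sanity check, the following Python sketch evaluates the alternating sum for such a simplex in any dimension n; the function name is our own.

```python
from math import comb


def newton_number_simplex(d, n):
    """Newton number of the simplex with vertices 0, d*e_1, ..., d*e_n.

    The simplex meets each of the comb(n, k) coordinate subspaces of
    dimension k in a k-simplex of volume d**k / k!, so that
    k! * V_k = comb(n, k) * d**k.
    """
    return sum((-1) ** (n - k) * comb(n, k) * d ** k for k in range(n + 1))


for d in range(2, 6):
    print(d, newton_number_simplex(d, 3), (d - 1) ** 3)
```

By the binomial theorem the sum equals \((d-1)^n\) for every n, which for \(n=3\) is the value stated in the example.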

2.3 Inner and Partially Non-degenerate Functions

The function in the above example is in fact non-degenerate in the sense of Wall [31] and of Mondal [21]. As their definitions are given for algebraically closed fields, we assume from now on for simplicity that the coefficient field is algebraically closed; the definitions to follow can easily be extended by taking coefficients in a smaller field k, but zeroes over an algebraic closure K. But first we need some more notation and terminology.

The exponents m of monomials lie in \({\mathbb N}^n\subset {\mathbb R}^n_+\). On \({\mathbb R}^n\) we take coordinates \(r=(r_1,\dots ,r_n)\). Let \(w=(w_1,\dots ,w_n)\) be a system of positive (rational) weights on the variables \(x_i\). We consider w as element in the dual space \(({\mathbb R}^n)^*\). So it defines a linear function \(\lambda :r\mapsto \langle w,r\rangle \) on \({\mathbb R}^n\), and a valuation on \(K[[x_1,\dots ,x_n]]\) by \(w(f)=\min \{\langle w,m\rangle \mid m\in {{\,\mathrm{Supp}\,}}(f)\}\). For a subset \({\mathcal A}\subset {\mathbb R}^n_+\), we set \(w({\mathcal A})=\min \{\langle w,r\rangle \mid r\in {\mathcal A}\}\). The initial set \({{\,\mathrm{In}\,}}_w({\mathcal A})\) of \({\mathcal A}\) is the set \(\{r\in {\mathcal A}\mid \langle w,r\rangle =w({\mathcal A})\}\); for a convex polytope it is also called minimising face. For a series \(f=\sum a_mx^m \in K[[x_1,\dots ,x_n]]\), the initial form \({{\,\mathrm{In}\,}}_w(f)\) is \(f_{{{\,\mathrm{In}\,}}_w({{\,\mathrm{Supp}\,}}(f))}\), that is \({{\,\mathrm{In}\,}}_w(f)=\sum _{m:\langle w,m\rangle =w(f)}a_mx^m \).
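The initial form is obtained by keeping exactly the monomials of minimal weight, as in the formula at the end of the preceding paragraph. A minimal Python sketch for polynomials, with integral weights to avoid rounding issues (rational weights can be cleared of denominators without changing the initial form), could look as follows; the names `weight` and `initial_form` are illustrative only.

```python
def weight(w, m):
    """The value <w, m> of the weight vector w on the exponent m."""
    return sum(wi * mi for wi, mi in zip(w, m))


def initial_form(poly, w):
    """Return (w(f), In_w(f)) for a polynomial given as {exponent tuple: coefficient}."""
    support = [m for m, c in poly.items() if c != 0]
    order = min(weight(w, m) for m in support)
    return order, {m: poly[m] for m in support if weight(w, m) == order}


# f = (y^2 - x^3)^2 - 4 x^5 y + x^7 from Example 1.2.
f = {(0, 4): 1, (3, 2): -2, (6, 0): 1, (5, 1): -4, (7, 0): 1}

print(initial_form(f, (2, 3)))  # weights proportional to (1/6, 1/4)
print(initial_form(f, (1, 1)))
```

With weights proportional to \((\frac{1}{6},\frac{1}{4})\) the initial form is \((y^2-x^3)^2\), the face polynomial of the only segment of \(\varGamma (f)\), whereas \(w=(1,1)\) picks out the vertex monomial \(y^4\).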

A (finite) set of linear functions \(\lambda _j\) given by a set of weights has a minimum \(\lambda _J:r\mapsto \min _{j\in J} \lambda _j(r) = \min _{j\in J}\langle w^{(j)},r\rangle \). We suppose the set to be irredundant, in that no proper subset has the same minimum. It defines a diagram \(\varGamma =\{r\in {\mathbb R}^n_+\mid \lambda _J(r)=1\}\). The faces are non-empty and \((n-1)\)-dimensional. Conversely, given a diagram \(\varGamma \) such that the closed region \(\varGamma _+\) on and above it is convex and central projection onto the unit simplex is a bijection, each facet \(\varDelta \) defines a unique linear function \(\lambda _\varDelta \) such that \(\lambda _\varDelta (r)=1\) for all \(r\in \varDelta \), that is, there is a uniquely defined system of weights \(w_\varDelta \) such that all points \(r\in \varDelta \) satisfy \(\langle w_\varDelta ,r\rangle =1\). The collection of these linear functions defines a convenient diagram \(\varGamma \) as above.

Definition 1.7

A convenient and convex diagram \(\varGamma \) defined (as above) by a finite set of positive weights is called a C-diagram.

For an arbitrary subset \(I\subset \{1,\dots ,n\}\), we denote the corresponding coordinate subspace by \({\mathbb R}^I\). For \(K^n\), we use a similar notation, so \(K^I=\{x\in K^n\mid x_i=0 \text { for } i\notin I\}\). Furthermore we put \((K^*)^I= (K^*)^n\cap K^I\). Let \(Q=(q_1,\dots ,q_n)\in K^n\) be a point. We set \(I_Q=\{i\mid q_i\ne 0\}\). Then, \(Q\in (K^*)^{I_Q}\).

Note that Kouchnirenko’s non-degeneracy condition depends only on the principal part of the series f. As the condition in Definition 1.1 only involves zeroes on \((K^*)^n\) of the ideal \((x_1\frac{\partial f_\varDelta }{\partial x_{1}},\dots , x_n\frac{\partial f_\varDelta }{\partial x_{n}} )\) for \(\varDelta \) a closed face of \(\varGamma (f)\), one can as well require that the ideal \((\frac{\partial f_\varDelta }{\partial x_{1}},\dots , \frac{\partial f_\varDelta }{\partial x_{n}})\) has no zero on \((K^*)^n\). We first reformulate the definition following Mondal [21].

Definition 1.8

The series f is non-degenerate if for every system of positive weights w the ideal

$$\begin{aligned} \left( \frac{\partial {{\,\mathrm{In}\,}}_w(f)}{\partial x_{1}},\dots , \frac{\partial {{\,\mathrm{In}\,}}_w(f)}{\partial x_{n}}\right) \end{aligned}$$

has no zero on the torus \((K^*)^n\).

Mondal’s non-degeneracy condition does not involve the partial derivatives of initial forms, but initial forms of the partial derivatives.

Definition 1.9

([21]) A series f with \(f(0)=0\) is partially non-degenerate if for every non-empty subset I of \(\{1,\dots ,n \}\) and any system of positive weights w on the \(x_i\) with \(i\in I\) the ideal

$$\begin{aligned} \left( {{\,\mathrm{In}\,}}_w\Big (\frac{\partial f}{\partial x_{1}}\Big |_{K^I}\Big ),\dots , {{\,\mathrm{In}\,}}_w\Big (\frac{\partial f}{\partial x_{n}}\Big |_{K^I}\Big ) \right) \end{aligned}$$

has no zero on the torus \((K^*)^n\).

This condition also involves terms of the series f different from the principal part. As an example, consider functions \(f(x,y)\) with principal part \(f_\varGamma =x^a+y^b\), where \(\gcd (a,b)=1\). Then, \({{\,\mathrm{In}\,}}_w(\frac{\partial f_\varGamma }{\partial x}) =a x^{a-1}\) for all w, but in general \({{\,\mathrm{In}\,}}_w(\frac{\partial f}{\partial x})\) contains terms involving the variable y.

Wall’s non-degeneracy condition is stronger than Kouchnirenko’s, but is required for fewer faces. It starts from a C-diagram \(\varGamma \), for which the intersection points with the coordinate axes need not be lattice points.

Definition 1.10

A face \(\varDelta \) is an inner face of a C-diagram \(\varGamma \) if it is not contained in any coordinate hyperplane.

Definition 1.11

Let f be a series whose support has no points below the C-diagram \(\varGamma \). The series f is inner non-degenerate with respect to \(\varGamma \) if for every inner face \(\varDelta \) the following holds: \(\varDelta \cap {\mathbb R}^{I_Q}=\emptyset \) for each common zero Q of the ideal \((\frac{\partial f_\varDelta }{\partial x_{1}},\dots , \frac{\partial f_\varDelta }{\partial x_{n}})\).

We say that f is inner non-degenerate if there exists a C-diagram \(\varGamma \) with respect to which f is inner non-degenerate. Wall [31] calls his condition NPND\(^*\); we follow the terminology of Boubakri et al. [6], where the concept is extended to finite characteristic. The condition depends on the diagram \(\varGamma \), and it is not quite clear how it is related to the Newton diagram \(\varGamma (f)\) of f. The case \(n=2\) is easy to analyse; this is done by Wall [31]. A detailed study of the possible shape of Newton diagrams in \({\mathbb R}^3\) is made by Oleksik [23] in connection with the computation of Łojasiewicz exponents. He defines an exceptional face \(\varDelta \) of \(\varGamma (f)\subset {\mathbb R}^n\) as a facet with one of its vertices at a distance 1 to a coordinate axis, while the remaining vertices define an \((n - 2)\)-dimensional face in one of the coordinate hyperplanes through that axis. A combinatorial characterisation of Newton polyhedra \(\varGamma _+\subset \varGamma '_+\) in \( {\mathbb R}_+^3\) with \(\nu (\varGamma _-)=\nu (\varGamma '_-)\) is given by Brzostowski, Krasiński and Walewska [8].

If \(\dim \varGamma (f)=n-1\) and \(\varGamma (f)\) is not convenient, one obtains a convenient diagram by taking the diagram determined by the linear functions \(\lambda _\varDelta \) for all facets of \(\varGamma (f)\). Here one can leave out the exceptional faces. It is not clear from the definition, but we conjecture that in characteristic zero, if f is inner non-degenerate with respect to a C-diagram \(\varGamma \), and \(\varGamma '\) is a C-diagram with \({{\,\mathrm{Supp}\,}}(f)\subset \varGamma '_+\) with the same Newton number, then f is also inner non-degenerate with respect to \(\varGamma '\).

Example 1.12

(Example 1.2 continued). The two systems of weights \((\frac{1}{6},\frac{1}{4},\frac{1}{2})\) and \((\frac{2}{13},\frac{3}{13},\frac{7}{13})\) define a C-diagram \(\varGamma \), shown in Fig. 3. The function \(\tilde{f}=-z^2+2z(y^2-x^3)-4x^5y+x^7\) is inner non-degenerate with respect to \(\varGamma \) (if \({{\,\mathrm{char}\,}}K\) is not 2, 3 or 13). There are only three inner faces. The (reduced) singular set of \(f_\varDelta =2z(y^2-x^3)-4x^5y\) is the z-axis and the face \(\varDelta \) does not touch this axis.

Fig. 3 A C-diagram for \(\tilde{f}=-z^2+2z(y^2-x^3)-4x^5y+x^7\)

Example 1.13

(Example 1.6 continued). Let \(yf(x,y)\) be a homogeneous isolated curve singularity of degree d and consider the degenerate function \(g=(z+x)^d+yf(x,y)\). This function is inner non-degenerate with respect to \(\varGamma \) consisting of the triangle given by the weights \((\frac{1}{d}, \frac{1}{d},\frac{1}{d})\). The triangle itself is the only inner face, and as g has an isolated singularity, the non-degeneracy condition is satisfied.

The function is also partially non-degenerate. Restricted to \(y=0\) and with weights (1, 1), or what amounts to the same, weights \((\frac{1}{d},\frac{1}{d})\) the ideal of initial forms of the partial derivatives is generated by \((z+x)^{d-1}\) and f(x, 0). As f(x, 0) is a non-zero multiple of \(x^{d-1}\), there are no zeroes on \((K^*)^3\).

2.4 Relations Between the Different Conditions

Kouchnirenko’s non-degeneracy of a series f does not imply that f is inner or partially non-degenerate, as non-isolated singularities can also be non-degenerate. On the other hand, inner non-degenerate functions have finite Milnor number [6, 31], and the same is true for partially non-degenerate functions.

Proposition 1.14

If f is partially non-degenerate, then the origin is an isolated critical point of f, that is \(\mu (f)<\infty \).

Proof

Suppose that \(\mu (f)=\infty \). Let B be a branch of a curve contained in the zero set \(V(\frac{\partial f_\varDelta }{\partial x_{1}},\dots , \frac{\partial f_\varDelta }{\partial x_{n}})\). Let \(I_B\) be the set of indices i for which \(x_i\) does not vanish identically on B. The weights w of an appropriate weighted tangent cone to \(B\subset K^{I_B}\) lead to initial forms violating the non-degeneracy condition (cf. [21], Lemma X.17). \(\square \)

We have the following relations between the different non-degeneracy conditions.

Proposition 1.15

([21, Proposition XII.6]). If f is non-degenerate and \(\mu (f)<\infty \), then f is partially non-degenerate.

Proposition 1.16

([21, Proposition XII.9]). An inner non-degenerate series is partially non-degenerate.

We give here the proof of the easiest case, that f is non-degenerate and convenient. Let \(I\subset \{1,\dots ,n \}\) and w a system of positive (integral) weights on the \(x_i\) with \(i\in I\) be given. As f is convenient, \(\varGamma (f)\cap {\mathbb R}^I\ne \emptyset \). We can extend w to a system \(w'\) of positive (rational) weights on all \(x_i\) such that \({{\,\mathrm{In}\,}}_{w'}(\varGamma (f))\subset {\mathbb R}^I\). Then, \({{\,\mathrm{In}\,}}_{w'}(f)\) depends only on the \(x_i\) with \(i\in I\). By non-degeneracy the polynomials \(\frac{\partial {{\,\mathrm{In}\,}}_{w'}(f)}{\partial x_{i}}\), \(i\in I\), have no common zero in \((K^*)^n\). If for \(i\in I\) the polynomial \(\frac{\partial {{\,\mathrm{In}\,}}_{w'}(f)}{\partial x_{i}}\) is not identically zero, then \(\frac{\partial {{\,\mathrm{In}\,}}_{w'}(f)}{\partial x_{i}}=\frac{\partial {{\,\mathrm{In}\,}}_{w}(f|_{K^I})}{\partial x_{i}}= {{\,\mathrm{In}\,}}_w ( \frac{\partial f}{\partial x_{i}}|_{K^I})\). Those functions do not have a common zero, so also not all \({{\,\mathrm{In}\,}}_w ( \frac{\partial f}{\partial x_{i}}|_{K^I})\) with \(i\in \{1,\dots ,n \}\).

The other direction of the implication in Proposition 1.16 is not true in finite characteristic (the simplest example is \(x^p+x^q\)), but in characteristic zero no counterexamples are known. It is easy to see that for \(n=2\) partial non-degeneracy implies inner non-degeneracy, and Mondal gives a proof for \(n=3\) [21, XII.30].

Conjecture 1.17

A partially non-degenerate series \(f\in K[[x_1,\dots ,x_n]]\), with \({{\,\mathrm{char}\,}}K =0\), is also inner non-degenerate.

2.5 Minimal Milnor Number

For series \(g_1,\dots ,g_n\in K[[x_1,\dots ,x_n]]\), the intersection multiplicity (at the origin) is

$$\begin{aligned}{}[g_1,\dots ,g_n]_0=\dim _K K[[x_1,\dots ,x_n]]/(g_1,\dots ,g_n)\;. \end{aligned}$$

For a collection \((\varGamma _1,\dots ,\varGamma _n)\) of n diagrams in \({\mathbb R}^n\) define \([\varGamma _1,\dots ,\varGamma _n]_0\) as the minimal intersection multiplicity at the origin of series \(g_1,\dots ,g_n\) with the Newton diagram of \(g_i\) on or above \(\varGamma _i\).

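Given a subset \({\mathcal A}\subset {\mathbb N}^n\) define \(\partial _i{\mathcal A}\) as the support of \(\frac{\partial f}{\partial x_{i}}\) for any \(f\in K[[x_1,\dots ,x_n]]\) with \({{\,\mathrm{Supp}\,}}(f)={\mathcal A}\), that is

$$\begin{aligned} \partial _i{\mathcal A}=\{m-e_i\mid m\in {\mathcal A},\ m_i\ne 0 \text { in } K\}\;, \end{aligned}$$

where \(e_i\) denotes the i-th unit vector.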

We now state Mondal’s main result on the generic Milnor number.

Theorem 1.18

([21, Theorem XII.3].) Suppose that the minimal intersection multiplicity \([\varGamma (\partial _1{\mathcal A}),\dots ,\varGamma (\partial _n{\mathcal A})]_0\) is finite. For a series \(f\in K[[x_1,\dots ,x_n]]\) with support in \({\mathcal A}\) and with \(\varGamma (\frac{\partial f}{\partial x_{j}})=\varGamma (\partial _j{\mathcal A})\) for all j one has \(\mu (f)\ge [\varGamma (\partial _1{\mathcal A}),\dots ,\varGamma (\partial _n{\mathcal A})]_0\) with equality if and only if f is partially non-degenerate. If \({{\,\mathrm{char}\,}}K=0\), then series realising equality exist.

In characteristic zero, the minimal value \([\varGamma (\partial _1{\mathcal A}),\dots ,\varGamma (\partial _n{\mathcal A})]_0\) for the Milnor number is in fact equal to \(\nu ({\mathcal A})\). This follows from Kouchnirenko’s result mentioned after Theorem 1.4, that there exists a non-degenerate function f with support equal to \({\mathcal A}\); its Milnor number is \(\nu (f)\). It follows from Proposition 1.16 and Theorem 1.18 that in characteristic zero an inner non-degenerate function satisfies \(\mu (f)=\nu (f)\); the proof of [31, Theorem 1.6] is incomplete, as it only shows that a non-degenerate, not convenient function f is right equivalent to a convenient function \(f+\sum x_i^m\) with \(m\gg 0\) (this is [16, Théorème 3.7 (i)] but Kouchnirenko does not show it in detail), but does not prove \(\mu (f)=\nu (f)\) for convenient degenerate inner non-degenerate functions (as in Example 1.6). Wall’s argument does show that \(\nu (f)=\nu (\varGamma _-)\), if f is inner non-degenerate with respect to the C-diagram \(\varGamma \).

Corollary 1.19

A series \(f\in K[[x_1,\dots ,x_n]]\) with \(\mu (f)<\infty \) is (inner, partially) degenerate if there exists a series g with \({{\,\mathrm{Supp}\,}}(g)={{\,\mathrm{Supp}\,}}(f)\) and lower Milnor number: \(\mu (g)<\mu (f)\).

This is easy to check, without determining the faces of the Newton diagram. One has to compute, say with Singular [10], \(\mu (f)\) and \(\mu (g)\) for a general enough function with the same support. Taking all coefficients equal to 1 might not be general enough; in my experience a good choice is to use the coefficients \(1,2,3,\dots , k\), if there are k monomials.

3 Finite Characteristic

3.1 Weakly Non-degenerate Functions

In finite characteristic, it is no longer true that the Milnor number is invariant under contact equivalence. The simplest example is the function \(f(x)=x^p\) in characteristic p with \(\mu (f)=\infty \), while \(\mu (g)=p\) for the contact equivalent function \(g(x)=(1+x)f(x)=x^p+x^{p+1}\). Recall that \(f, g \in K[[x_1,\dots ,x_n]]\) are contact equivalent if there is an automorphism \(\varphi \in {\text {Aut}}(K[[x_1,\dots ,x_n]])\) and a unit \(u \in K[[x_1,\dots ,x_n]]^*\) such that \(f= u \cdot \varphi (g)\) (see e.g. [6], p.62).
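In one variable the Milnor number is simply the order of the derivative, so the example above can be verified by reducing the coefficients of \(f'\) modulo p. The following Python sketch does this for polynomials with coefficients in the prime field; this restriction and the function names are assumptions made for the illustration.

```python
def derivative_mod_p(coeffs, p):
    """Derivative of a one-variable polynomial {exponent: coefficient}, reduced mod p."""
    result = {}
    for e, c in coeffs.items():
        if e > 0 and (e * c) % p != 0:
            result[e - 1] = (e * c) % p
    return result


def milnor_number_one_variable(coeffs, p):
    """mu(f) = dim K[[x]]/(f') is the order of f' at 0; None stands for infinity."""
    df = derivative_mod_p(coeffs, p)
    return min(df) if df else None


p = 5
f = {p: 1}            # x^p
g = {p: 1, p + 1: 1}  # (1 + x) x^p = x^p + x^(p+1)
print(milnor_number_one_variable(f, p))
print(milnor_number_one_variable(g, p))
```

For \(p=5\) the derivative of \(x^p\) vanishes identically, while that of \(x^p+x^{p+1}\) is \(x^p\), in agreement with \(\mu (f)=\infty \) and \(\mu (g)=p\).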

Some invariants depend only on the contact class, that is on the zero set of f. An example is the \(\delta \)-invariant (the number of virtual double points) for plane curve singularities. The question then arises under which conditions such invariants can be computed from the Newton diagram. A face function \(f_\varDelta \) is quasi-homogeneous, but in finite characteristic it can be that \(f_\varDelta \) does not lie in the ideal \((\frac{\partial f_\varDelta }{\partial x_{1}},\dots ,\frac{\partial f_\varDelta }{\partial x_{n}})\): in characteristic zero one can use Euler’s identity, but not if the characteristic divides the weighted degree. The simplest example is again the polynomial \(x^p\). This leads to the following definitions.

Definition 2.1

The series f is weakly non-degenerate if for every closed face \(\varDelta \subset \varGamma (f)\) the polynomials

$$\begin{aligned} f_\varDelta , \frac{\partial f_\varDelta }{\partial x_{1}},\dots ,\frac{\partial f_\varDelta }{\partial x_{n}} \end{aligned}$$

have no common zero on the torus \((K^*)^n\).

Definition 2.2

Let f be a series whose support has no points below the convenient diagram \(\varGamma \). The series f is weakly inner non-degenerate with respect to \(\varGamma \) if for every inner face \(\varDelta \) the following holds: \(\varDelta \cap {\mathbb R}^{I_Q}=\emptyset \) for each common zero Q of the ideal \((f_\varDelta ,\frac{\partial f_\varDelta }{\partial x_{1}},\dots , \frac{\partial f_\varDelta }{\partial x_{n}})\).

In fact, it is rather common in the literature on p-adic zeta-functions to add the function \(f_\varDelta \) in the definition of non-degeneracy, see, e.g., Denef and Hoornaert [12]. This is also the definition of Beelen and Pellikaan [5] in the case \(n=2\); they included convenience. They refer to Kouchnirenko’s definition as non-degeneracy in the strong sense. The term weakly non-degenerate is from Boubakri et al. [6], where the condition is only asked for top-dimensional faces, as a direct but nowhere used generalisation of the definition in Beelen and Pellikaan [5]; as the weak non-degeneracy condition is automatically satisfied for zero dimensional faces, it suffices in the case \(n=2\) to ask the condition for top-dimensional faces. Note that the function \(f_\varDelta \) is also added in Khovanskii’s definition of non-degenerate Laurent polynomials [15], called 0-non-degenerate by Varchenko [30].

Example 2.3

([5, Remark 3.15]). Consider \(f=x^{p+1}+x^{p-1}y+xy^{p-1}+y^{p+1}\), or more generally a function f of the form \(xyf_{p-2}(x,y)+f_{>p}(x,y)\), with \(xyf_{p-2}(x,y)\) a homogeneous polynomial of degree p with p distinct factors, and \(f_{>p}(x,y)\) a series with multiplicity at least \(p+1\), making the function convenient and the Milnor number finite. Then, f is (partially) degenerate: for \(w=(1,1)\) we have \({{\,\mathrm{In}\,}}_w(\frac{\partial f}{\partial x})= \frac{\partial {{\,\mathrm{In}\,}}_w(f)}{\partial x}=\frac{\partial (xyf_{p-2})}{\partial x}\) and similarly for the derivative w.r.t. y. As \(x\frac{\partial g_p}{\partial x}+y\frac{\partial g_p}{\partial y}=0\) for any homogeneous polynomial \(g_p\) of degree p in characteristic \(p>0\), we get non-trivial solutions. But f is weakly non-degenerate, and in fact weakly inner non-degenerate with respect to the segment joining (p, 0) and (0, p).
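For the first function of this example, the degree p part is \(x^{p-1}y+xy^{p-1}\), and the mechanism can be made explicit at the point (1, 1). The Python sketch below is an illustration only: it assumes an odd prime p, works over the prime field, and uses helper names of our own. It checks that the Euler operator kills the face polynomial modulo p and that both partial derivatives vanish at (1, 1), while the face polynomial itself does not.

```python
def partial_mod_p(poly, i, p):
    """Partial derivative w.r.t. variable i of {exponents: coefficient}, reduced mod p."""
    result = {}
    for m, c in poly.items():
        coeff = (m[i] * c) % p
        if m[i] > 0 and coeff != 0:
            lowered = tuple(e - 1 if j == i else e for j, e in enumerate(m))
            result[lowered] = (result.get(lowered, 0) + coeff) % p
    return result


def euler_operator_mod_p(poly, p):
    """x df/dx + y df/dy, reduced mod p; monomials are scaled by their degree."""
    return {m: (sum(m) * c) % p for m, c in poly.items() if (sum(m) * c) % p != 0}


def evaluate_mod_p(poly, point, p):
    """Value of a two-variable polynomial at a point of F_p x F_p."""
    return sum(c * pow(point[0], m[0], p) * pow(point[1], m[1], p)
               for m, c in poly.items()) % p


def face_polynomial(p):
    """x^(p-1) y + x y^(p-1); assumes p is an odd prime."""
    return {(p - 1, 1): 1, (1, p - 1): 1}


p = 5
face = face_polynomial(p)
print(euler_operator_mod_p(face, p))
print([evaluate_mod_p(partial_mod_p(face, i, p), (1, 1), p) for i in (0, 1)])
print(evaluate_mod_p(face, (1, 1), p))
```

The common zero of the partial derivatives at (1, 1) exhibits the degeneracy; that the face polynomial is non-zero there is consistent with weak non-degeneracy, although the sketch checks only this one point and does not prove it.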

Example 2.4

The function \(\tilde{f}=-z^2+2z(y^2-x^3)-4x^5y+x^7\) of Example 1.2 (see Fig. 2) degenerates in characteristic 13 on the facet \(\varDelta \) with vertices (3, 0, 1), (0, 2, 1) and (5, 1, 0). Indeed, as

$$\begin{aligned}\begin{vmatrix} 3&\quad 0&\quad 5\\0&\quad 2&\quad 1\\1&\quad 1&\quad 0 \end{vmatrix} =-13\;,\end{aligned}$$

the polynomials \(x\frac{\partial f_\varDelta }{\partial x}\), \(y\frac{\partial f_\varDelta }{\partial y}\) and \(z\frac{\partial f_\varDelta }{\partial z}\) are linearly dependent, whatever the coefficients of the monomials are. The determinant in question occurs in the computation of the Newton number.
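The linear dependence can be made explicit. Since \(x_i\frac{\partial }{\partial x_i}\) multiplies a monomial by its i-th exponent, a relation \(\sum \lambda _i x_i\frac{\partial f_\varDelta }{\partial x_i}=0\) holds for all coefficients as soon as \(\lambda \) annihilates the exponent matrix from the left. The following Python sketch, with hypothetical helper names, recomputes the determinant and finds such \(\lambda \) modulo 13 by brute force.

```python
from itertools import product

# Columns are the vertices (3,0,1), (0,2,1), (5,1,0) of the facet;
# row i lists the exponents of the i-th variable in the three monomials.
M = [[3, 0, 5],
     [0, 2, 1],
     [1, 1, 0]]


def det3(a):
    """Determinant of a 3x3 integer matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def left_kernel_mod_p(a, p):
    """All non-zero row vectors lam in F_p^3 with lam * a = 0 mod p (brute force)."""
    return [lam for lam in product(range(p), repeat=3)
            if any(lam)
            and all(sum(lam[i] * a[i][j] for i in range(3)) % p == 0
                    for j in range(3))]


print(det3(M))
print((1, 8, 10) in left_kernel_mod_p(M, 13))
print(left_kernel_mod_p(M, 7))
```

In characteristic 13 one finds for instance \(x\frac{\partial f_\varDelta }{\partial x}+8y\frac{\partial f_\varDelta }{\partial y}+10z\frac{\partial f_\varDelta }{\partial z}=0\), whereas modulo a prime not dividing the determinant, such as 7, there is no non-trivial relation of this kind.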

Example 2.5

The polynomial

$$\begin{aligned} f(x,y,z)= x^py+y^pz + z^px \end{aligned}$$

is non-degenerate and inner non-degenerate. For every value of \(m>p\), the function

$$\begin{aligned} f_m= x^py+y^pz + z^px+x^m+y^m+z^m \end{aligned}$$

is still inner non-degenerate, but degenerate, also when \(p \not \mid m\). The face \(\delta \) with \(f_{m,\delta }=x^py+y^m\) is not an inner face. The ideal of partial derivatives is generated by \(\frac{\partial f_{m,\delta }}{\partial y}=x^p+my^{m-1}\). It follows that \(f_m\) is weakly non-degenerate, except when \(m=kp+1\).

The previous example shows that the characteristic zero proof of Kouchnirenko and Wall for the equality \(\mu (f)=\nu (f)\) does not extend to finite characteristic, contrary to what Boubakri et al. [6, Proof of Theorem 7] claim. The proof uses finite determinacy to conclude that f is equivalent to \(f+\sum x_i^m\) for suitable large m and the equality of Milnor and Newton number for convenient non-degenerate series (Theorem 1.4). For \(n=2\), the argument does work in finite characteristic, as \(f(x,y)+x^m+y^m\) is non-degenerate for suitable m, if \(f(x,y)\) is non-degenerate.

3.2 Conjectures

For non-degenerate plane curve singularities, one can compute the \(\delta \)-invariant from the Newton diagram. Over the complex numbers, this is described in [4, 13.3.1]: it is the number \(\sigma \) of subdiagrammatic monomials \(x^m\), meaning that \(m+(1,1)\) does not belong to the interior of the Newton polyhedron. More generally, for \(n>2\) the number of subdiagrammatic monomials gives the geometric genus of the singularity. An elementary proof in all characteristics in the plane curve case is given by Beelen and Pellikaan [5].

Proposition 2.6

If \(f\in K[[x,y]]\) is weakly non-degenerate then \(\delta (f)\) is equal to \(\sigma (f)\), the number of subdiagrammatic monomials.

As also the number r of branches of f is easily computed from the Newton diagram, this gives, as observed in [6]:

Theorem 2.7

For a weakly non-degenerate \(f\in K[[x,y]]\), one has \(\nu (f)=2\delta - r +1\).

Observe that in general Milnor’s formula \(\mu =2\delta -r + 1\) does not hold in finite characteristic. The difference \(\mu -(2\delta -r + 1)\) is the number of wild vanishing cycles [20, p. 265]. In particular, if f is non-degenerate then there are no wild vanishing cycles. We conjecture that this holds in any dimension. Greuel and Nguyen [13, p. 579] write: ‘Although we can compute the number of wild vanishing cycles, it seems hard to understand them’. The number of wild vanishing cycles was introduced by Deligne in SGA 7 [11]. He defines sheaves of vanishing cycles \(R^i\varPhi _{\bar{\eta }}({\mathbb Z}/\ell )\) (with \(\ell \ne p\)). He gets the total number of vanishing cycles as the sum of the number of (ordinary) vanishing cycles and the number of wild vanishing cycles. In the equicharacteristic case, Deligne proves that the Milnor number is equal to the total number of vanishing cycles [11, Exposé XVI].

Conjecture 2.8

If \(f\in K[[x_1,\dots ,x_n]]\) is non-degenerate, then there are no wild vanishing cycles. If f is weakly non-degenerate, then \(\mu (f)-\nu (f)\) is the number of wild vanishing cycles.

This conjecture implies

Conjecture 2.9

In characteristic 13, the function \(f=(y^2-x^3)^2-4x^5y+x^7\) of Example 1.2 is stably degenerate.

Even simpler examples are obtained from the function \(f(x)=x^p\) in \({{\,\mathrm{char}\,}}p\). This function is weakly non-degenerate. As it has \(\mu (f)=\infty \), it cannot be stably equivalent to an inner or partially non-degenerate function. Arnold’s problem asks for functions with finite multiplicity. We take a function with the same zero locus: we consider \(f_q(x)=x^p+x^q\) with \(q>p\) and p and q coprime. In this case, \(\mu (f_q)=q-1\), and \(f_q\) is not (inner) non-degenerate. Conjecture 2.8 implies

Conjecture 2.10

The function \(f_q(x)=x^p+x^q\), \({{\,\mathrm{char}\,}}K=p\), \(q>p\) and \(\gcd (p,q)=1\) is stably degenerate.

We note that \(f_q(x)=x^p+x^q\) is partially non-degenerate. This is caused by the monomial \(x^q\) above the Newton diagram.

4 Characteristic Zero

In this section, we give examples of stably non-degenerate and (conjecturally) degenerate singularities in the case \({{\,\mathrm{char}\,}}K =0\).

4.1 The Basic Trick

Let f be a (degenerate) function of the form \(f=g+m\varphi ^k\), where m is any function, but preferably a monomial. Then, we can remove the term \(m\varphi ^k\) after stabilisation with two new variables:

Lemma 3.1

The function \(f=g+m\varphi ^k\) is stably equivalent to \(-uv+u\varphi +mv^k+g\).

Proof

We stabilise f with the quadratic form \(-\tilde{u} \tilde{v}\) in two new variables and compute the effect of the coordinate transformation \(\tilde{u} =u-m\frac{v^k-\varphi ^k}{v-\varphi }\), \(\tilde{v} = v-\varphi \):

$$\begin{aligned} -\left( u-m\frac{v^k-\varphi ^k}{v-\varphi }\right) (v-\varphi )+m\varphi ^k= -uv+u\varphi +mv^k\;. \end{aligned}$$

\(\square \)
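The identity in the proof can be spot-checked numerically. The following Python sketch is not part of the original argument: it treats \(u, v, m, \varphi \) as independent integers, writes the quotient \((v^k-\varphi ^k)/(v-\varphi )\) as the sum \(\sum _j v^j\varphi ^{k-1-j}\), and compares both sides in exact integer arithmetic. The function names are ours.

```python
import random

def geometric_quotient(v, phi, k):
    """(v**k - phi**k) / (v - phi), written as a polynomial in v and phi."""
    return sum(v**j * phi**(k - 1 - j) for j in range(k))

def lemma_identity_holds(k, trials=100, seed=0):
    """Compare both sides of the displayed identity at random integer points."""
    rng = random.Random(seed)
    for _ in range(trials):
        u, v, m, phi = (rng.randint(-9, 9) for _ in range(4))
        lhs = -(u - m * geometric_quotient(v, phi, k)) * (v - phi) + m * phi**k
        rhs = -u * v + u * phi + m * v**k
        if lhs != rhs:
            return False
    return True

if __name__ == "__main__":
    print(all(lemma_identity_holds(k) for k in range(1, 8)))
```

Agreement at random points is only evidence, not a proof; the proof is the one-line expansion above.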

This formula includes the special case \(k=1\): one has that \(g+m\varphi \) is stably equivalent to \(-uv+u\varphi +vm+g\). We note also the case \(m=1\) and \(k=2\), where we have \(f=g+\varphi ^2\). The basic trick gives \(-uv+u\varphi +v^2+g\), to which we apply the coordinate transformation \(v=\bar{v}+\bar{u}\), \(u= 2\bar{u}\), yielding \({\bar{v} }^2-{\bar{u}}^2+2 \bar{u}\varphi +g\), so f is also stably equivalent to \(-{\bar{u}}^2+2 \bar{u}\varphi +g\); this is the obvious way to treat this case.
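For the case \(m=1\), \(k=2\), the substitution \(v=\bar{v}+\bar{u}\), \(u=2\bar{u}\) can be expanded term by term; this is only the routine computation behind the statement above:

$$\begin{aligned} -uv+u\varphi +v^2&=-2\bar{u}(\bar{v}+\bar{u})+2\bar{u}\varphi +(\bar{v}+\bar{u})^2\\&=-2\bar{u}\bar{v}-2{\bar{u}}^2+2\bar{u}\varphi +{\bar{v}}^2+2\bar{u}\bar{v}+{\bar{u}}^2\\&={\bar{v}}^2-{\bar{u}}^2+2\bar{u}\varphi \;. \end{aligned}$$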

Corollary 3.2

Every polynomial is stably equivalent to a polynomial of degree three.

Proof

A product \(m\varphi ^k\) with \(\deg m = d\), \(\deg \varphi =e\) can be replaced by \(-uv+u\varphi +v^km\) with summands of degrees 2, \(e+1\) and \(d+k\). The condition that each of these degrees is less than \(d+ke\) is that \(d>1\) or \(k>1\) and that \(e>1\). A monomial of degree at least 4 can always be written as a product \(m\varphi \) with \(d,e\ge 2\), and therefore, be replaced by monomials of lower degree (this might not be the most efficient way to reduce the degree). \(\square \)
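The degree condition in this proof is elementary, and a brute-force check over small values is a quick way to see it. The Python sketch below is ours (function names included) and assumes \(d\ge 0\), \(e\ge 1\), \(k\ge 1\): it compares the inequality "all of 2, \(e+1\), \(d+k\) are less than \(d+ke\)" with the stated condition.

```python
def degrees_drop(d, e, k):
    """True if the summands of -uv + u*phi + v**k*m all have degree below d + k*e."""
    return max(2, e + 1, d + k) < d + k * e

def stated_condition(d, e, k):
    """The condition from the proof: (d > 1 or k > 1) and e > 1."""
    return (d > 1 or k > 1) and e > 1

def conditions_agree(bound=12):
    """Compare both conditions for 0 <= d < bound and 1 <= e, k < bound."""
    return all(
        degrees_drop(d, e, k) == stated_condition(d, e, k)
        for d in range(0, bound)
        for e in range(1, bound)
        for k in range(1, bound)
    )

if __name__ == "__main__":
    print(conditions_agree())
```

For instance, a monomial of degree 4 split as \(m\varphi \) with \(d=e=2\), \(k=1\) is replaced by terms of degrees 2, 3 and 3.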

Remark 3.3

If \(f=g+m_1\varphi ^{k_1}+m_2\varphi ^{k_2}\), we can apply our basic trick twice to get

$$\begin{aligned} -u_1v_1-u_2v_2+(u_1+u_2)\varphi +m_1v_1^{k_1}+m_2v_2^{k_2}+g \end{aligned}$$

after which we make \(u_1+u_2\) into a new variable, say by replacing \(u_2\) by \(u_2-u_1\), giving

$$\begin{aligned} -u_1v_1+u_1v_2-u_2v_2+u_2\varphi +m_1v_1^{k_1}+m_2v_2^{k_2}+g \;. \end{aligned}$$

This procedure generalises to more terms.
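The substitution in this remark is easy to verify by evaluation. The Python sketch below (names are ours, not from the paper) evaluates the first expression at \((u_1,u_2-u_1)\) and the second at \((u_1,u_2)\) for random integer values, treating \(\varphi \), \(m_1\), \(m_2\) and g as independent integers.

```python
import random

def before(u1, u2, v1, v2, phi, m1, m2, k1, k2, g):
    """Result of applying the basic trick twice."""
    return -u1*v1 - u2*v2 + (u1 + u2)*phi + m1*v1**k1 + m2*v2**k2 + g

def after(u1, u2, v1, v2, phi, m1, m2, k1, k2, g):
    """Expression after replacing u2 by u2 - u1."""
    return -u1*v1 + u1*v2 - u2*v2 + u2*phi + m1*v1**k1 + m2*v2**k2 + g

def substitution_matches(trials=200, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        u1, u2, v1, v2, phi, m1, m2, g = (rng.randint(-9, 9) for _ in range(8))
        k1, k2 = rng.randint(1, 5), rng.randint(1, 5)
        rest = (v1, v2, phi, m1, m2, k1, k2, g)
        if before(u1, u2 - u1, *rest) != after(u1, u2, *rest):
            return False
    return True

if __name__ == "__main__":
    print(substitution_matches())
```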

4.2 Luengo’s Example

Example 3.4

The function \(f=x^9+y(xy^3+z^4)^2+y^{10}\) [18] with \(\mu (f)=547\) has non-smooth \(\mu \)-constant stratum. It is stably equivalent to the non-degenerate function

$$\begin{aligned}-uv+u(xy^3+z^4)+yv^2+x^9+y^{10}\;.\end{aligned}$$

The stratum Luengo [18] computed is in fact the \(\mu ^*\)-constant stratum \(S_{\mu ^*}\). Recall that \(\mu ^*\) is the sequence of the Milnor numbers of repeated hyperplane sections [28]. If we also know that the topological type is constant in a \(\mu \)-constant deformation, or the multiplicity, then it follows that the stratum \(S_{\mu ^*}\) is the whole \(\mu \)-constant stratum \(S_{\mu }\). It is known that for the function f it is at least an irreducible component of \(S_{\mu }\) [27]. The stratum \(S_{\mu ^*}\) has a quadratic singularity. When Luengo [19] did his computation on an IBM 370 with a memory of 8256 K, he could not determine the decisive polynomial explicitly. Maybe nowadays it is possible, but it is clear that the result is too big to be of any use.

Starting from the first order deformation

$$\begin{aligned} f+2(a_{60}x^5+a_{51}x^4z+a_{42}x^3z^2+a_{33}x^2z^3)(xy^3+z^4) \end{aligned}$$

the obstruction to lift it to second order (see [27]) is given by

$$\begin{aligned} a_{60}a_{33}+a_{51}a_{42}=0\;. \end{aligned}$$

Some 1-parameter families are easy to describe. We can even make the \(\mu \)-constant deformation \(f+a_{60}x^5(xy^3+z^4)\) stably non-degenerate by the transformation \(u\mapsto u-a_{60}x^5\), resulting in \( -uv+u(xy^3+z^4)+yv^2+a_{60}vx^5+x^9+y^{10}\).

Furthermore, we have

$$\begin{aligned} f+a_{42}x^3z^2(xy^3+z^4)-a_{42}^2x^7y^2 \end{aligned}$$

and

$$\begin{aligned} f+a_{33}x^2z^3(xy^3+z^4)-a_{33}^2x^5y^2z^2+a_{33}^3x^7yz \;. \end{aligned}$$

For the deformation in the \(a_{51}\)-direction, we computed up to order 30 in the deformation variable, but we have not been able to find a \(\mu \)-constant deformation.

The \(a_{33}\)-deformation is stably equivalent to

$$\begin{aligned} -uv +x^9+y^{10}+ u(xy^3+z^4)+yv^2+2a_{33}vx^2z^3-a_{33}^2x^5y^2z^2 +a_{33}^3x^7yz\;. \end{aligned}$$

By changing the coefficient of \(x^5y^2z^2\), the Milnor number drops to 533. The polynomial degenerates on the face \(\varDelta \) which is the intersection of the facets with normalised weight vectors (5, 4, 1, 1, 1)/9 and (128, 99, 26, 22, 23)/220 respectively and \(f_\varDelta =u(xy^3+z^4)+yv^2+2a_{33}vx^2z^3-a_{33}^2x^5y^2z^2\). We can write this expression as a symmetric determinant:

$$\begin{aligned} f_\varDelta = \begin{vmatrix} -u&\quad -v&\quad a_{33}x^2z \\ -v&\quad xy^2&\quad -z^2 \\ a_{33}x^2z&\quad -z^2&\quad -y \end{vmatrix}\;. \end{aligned}$$

This hypersurface is singular on the codimension 3 space defined by the \(2\times 2\)-minors of the above matrix. It is reducible, with one component in \(z=y=v=0\) and the other having as normalisation the cone over the rational normal curve of degree 4; it can be parametrised as \(z=-st^3\), \(y=-t^4\), \(x=s^4\), \(v=-a_{33}s^{11}t^5\) and \(u=-a_{33}^2s^{18}t^2\). Note also that the weight of the monomial \(x^7yz\) is larger than 1 for both weight vectors. We conjecture that already this polynomial provides an example of a function which is stably degenerate.
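Both the determinantal form of \(f_\varDelta \) and the parametrisation can be checked by direct evaluation. The Python sketch below is ours: it compares the \(3\times 3\) determinant with \(f_\varDelta \) at random integer points, and verifies that all \(2\times 2\) minors vanish at the points \(x=s^4\), \(y=-t^4\), \(z=-st^3\), \(v=-a_{33}s^{11}t^5\), \(u=-a_{33}^2s^{18}t^2\). The variable `a` stands for \(a_{33}\).

```python
import random

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def matrix(u, v, x, y, z, a):
    """Symmetric matrix whose determinant should equal f_Delta."""
    return [[-u, -v, a * x**2 * z],
            [-v, x * y**2, -z**2],
            [a * x**2 * z, -z**2, -y]]

def f_face(u, v, x, y, z, a):
    return (u * (x * y**3 + z**4) + y * v**2
            + 2 * a * v * x**2 * z**3 - a**2 * x**5 * y**2 * z**2)

def minors_2x2(m):
    """All nine 2x2 minors of a 3x3 matrix."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    return [m[r0][c0] * m[r1][c1] - m[r0][c1] * m[r1][c0]
            for r0, r1 in pairs for c0, c1 in pairs]

def determinant_matches(trials=200, seed=0):
    rng = random.Random(seed)
    points = (tuple(rng.randint(-5, 5) for _ in range(6)) for _ in range(trials))
    return all(det3(matrix(*p)) == f_face(*p) for p in points)

def parametrised_point(s, t, a):
    """Point (u, v, x, y, z) on the second component of the singular locus."""
    return (-a**2 * s**18 * t**2, -a * s**11 * t**5, s**4, -t**4, -s * t**3)

def parametrisation_has_rank_one(trials=200, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        s, t, a = (rng.randint(-4, 4) for _ in range(3))
        if any(minors_2x2(matrix(*parametrised_point(s, t, a), a))):
            return False
    return True

if __name__ == "__main__":
    print(determinant_matches(), parametrisation_has_rank_one())
```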

More generally, we make the following conjecture.

Conjecture 3.5

A general function on a non-smooth \(\mu \)-constant stratum is stably degenerate.

In particular, this would prove

Conjecture 3.6

There exists a stably degenerate function which is a \(\mu \)-constant deformation of a Newton non-degenerate function.

The existence of such a function implies that the question of stable non-degeneracy cannot be decided with invariants depending only on the embedded topological type.

Besides the fact that I do not know what to do in the example above, the heuristic for Conjecture 3.5 is the following. The \(\mu \)-constant stratum has very complicated equations and in fact it is not known in a single case how to write down a general function on the stratum, whereas the Newton diagram seems to be a relatively simple, combinatorial object. Furthermore, the non-degenerate functions with the same Newton diagram as a generic function on the stratum should dominate the \(\mu \)-constant stratum and this fits badly with the non-smoothness. However, this idea does not lead to a proof, as the coordinate transformations involved need not extend to the original function (as for the function g in Example 3.7 below).

4.3 du Plessis’ Examples

In [25], Andrew du Plessis gave in a systematic way examples of hypersurfaces of degree d in \({\mathbb P}^n\), whose singularities are not versally deformed by the family \(H_d(n)\) of all hypersurfaces of degree d in \({\mathbb P}^n\). Then, at the corresponding point the stratum of hypersurfaces with exactly these singularities can be smooth of dimension larger than the expected dimension or it can be singular. A classical example of the first case is Segre’s family of curves of degree 6k of the form \((f_{3k})^2+(f_{2k})^3\) with \(6k^2\) cusps (see [32], VIII.5).

If the stratum is singular, we obtain by adding a suitable form of degree \(d+1\) to the equation of the hypersurface a superisolated singularity with non-smooth \(\mu ^*\)-constant stratum (and probably also non-smooth \(\mu \)-constant stratum, but this has to be proved, as in [27] for the case of Luengo’s example). As communicated by du Plessis, the smallest example constructed this way is the following, with \(d=3\) and \(n=7\).

Example 3.7

Consider the function (cf. [25], Examples 2.7)

$$\begin{aligned} f=f_3+x_0^4= x_0x_1^2+x_2(x_2^2+x_3^2-x_0^2)+x_4(x_4^2-x_5^2+x_0^2) +x_6^3+x_7^3+x_0^4 \end{aligned}$$

with \(\mu = 272\); the projective hypersurface \(f_3=0\) has four \(D_4\)-singularities. The hyperplane section \(\{x_1=0\}\) belongs to a stratum in \(H_3(6)\) with larger than expected dimension.

The \(\mu ^*\)-constant stratum of f is singular with quadratic singularity: the obstruction to lift the first order deformation

$$\begin{aligned} f_3+2x_1(a_{67}x_6x_7+a_{57}x_5x_7+a_{56}x_5x_6+a_{36}x_3x_6+a_{37}x_3x_7+a_{35}x_3x_5) \end{aligned}$$

is \(a_{67}a_{35}+a_{57}a_{36}+a_{56}a_{37}=0\). In the chart \(x_0=1\) one completes the square \(\big (x_1+(a_{67}x_6x_7+a_{57}x_5x_7+a_{56}x_5x_6+a_{36}x_3x_6+a_{37}x_3x_7+a_{35}x_3x_5)\big )^2\) and the obstruction to find an equivalent polynomial of degree three is the coefficient of \(x_3x_5x_6x_7\). Also here some 1-parameter \(\mu \)-constant deformations are easy to write down:

$$\begin{aligned} f&+2a_{67}x_1x_6x_7\\ f&+2a_{36}x_1x_3x_6+a_{36}^2x_0x_6^2\\ f&+2a_{35}x_1x_3x_5+a_{35}^2x_0(x_3^2+x_5^2-x_0^2) \end{aligned}$$

and similar ones obtained by symmetry, but a general deformation is not known explicitly.
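The statement that the obstruction is the coefficient of \(x_3x_5x_6x_7\) in the completed square can be made concrete. In the Python sketch below (our own illustration), the quadratic form is stored as a dictionary from index pairs to coefficients, with \(a_{36}\) the coefficient of \(x_3x_6\) as in the obstruction equation; the coefficient of \(x_3x_5x_6x_7\) in its square is then twice \(a_{67}a_{35}+a_{57}a_{36}+a_{56}a_{37}\).

```python
import random

# Index pairs (i, j) of the monomials x_i*x_j in the quadratic form.
PAIRS = [(6, 7), (5, 7), (5, 6), (3, 6), (3, 7), (3, 5)]

def coefficient_x3x5x6x7(a):
    """Coefficient of x3*x5*x6*x7 in the square of sum(a[p] * x_p)."""
    total = 0
    for p in PAIRS:
        for q in PAIRS:
            if sorted(p + q) == [3, 5, 6, 7]:
                total += a[p] * a[q]
    return total

def obstruction(a):
    return a[(6, 7)] * a[(3, 5)] + a[(5, 7)] * a[(3, 6)] + a[(5, 6)] * a[(3, 7)]

def coefficient_is_twice_obstruction(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a = {p: rng.randint(-9, 9) for p in PAIRS}
        if coefficient_x3x5x6x7(a) != 2 * obstruction(a):
            return False
    return True

if __name__ == "__main__":
    print(coefficient_is_twice_obstruction())
```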

The \(a_{67 }\)-deformation has non-degenerate Newton diagram. We now show that for a fixed value of \(a_{35}\) the function is equivalent to an inner non-degenerate function. As \(x_6\) and \(x_7\) do not occur in the \(a_{35 }\)-deformation we might as well leave them out. Consider, therefore, the polynomial g given by

$$\begin{aligned}&x_0x_1^2+x_2(x_2^2+x_3^2-x_0^2)+x_4(x_4^2-x_5^2+x_0^2)\\&\quad +x_0^4 +2a_{35}x_1x_3x_5+a_{35}^2x_0(x_3^2+x_5^2-x_0^2) \end{aligned}$$

with \(\mu (g)=68\). It degenerates on \(\varDelta \) with \(g_\varDelta =x_0x_1^2+2a_{35}x_1x_3x_5+a_{35}^2x_0(x_3^2+x_5^2-x_0^2)\). We write \(g_\varDelta \) as a determinant:

$$\begin{aligned} g_\varDelta = -a_{35}^2 \begin{vmatrix} x_0&\quad x_3&\quad x_5 \\ x_3&\quad x_0&\quad -x_1/a_{35} \\ x_5&\quad -x_1/a_{35}&\quad x_0 \end{vmatrix}\;. \end{aligned}$$

In this case, the singular locus is reducible and consists of four linear spaces, which we can move into coordinate subspaces by a coordinate transformation:

$$\begin{aligned} x_0&\mapsto x_0+x_3+x_5-x_1/a_{35}\\ x_1&\mapsto -a_{35}x_0+a_{35}x_3+a_{35}x_5+x_1\\ x_3&\mapsto x_0+x_3-x_5+x_1/a_{35}\\ x_5&\mapsto x_0-x_3+x_5+x_1/a_{35} \end{aligned}$$

It transforms the original function into

$$\begin{aligned}&16a_{35}(x_0x_1x_3+x_0x_1x_5+x_1x_3x_5-a_{35}x_0x_3x_5)\\&\quad +x_2^3-4x_2(x_0+x_3)(x_5-x_1/a_{35})+x_4^3+4x_4(x_0+x_5)(x_3-x_1/a_{35})\\&\quad +(x_0+x_3+x_5-x_1/a_{35})^4\;. \end{aligned}$$
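The determinantal expression for \(g_\varDelta \) and the effect of the coordinate transformation can be checked by exact evaluation. The Python sketch below is ours; it uses `fractions.Fraction` because of the division by \(a_{35}\) (called `a` in the code, assumed non-zero), substitutes the transformation into g, and compares with the transformed polynomial as displayed.

```python
from fractions import Fraction
import random

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def g_face(x0, x1, x3, x5, a):
    return x0*x1**2 + 2*a*x1*x3*x5 + a**2*x0*(x3**2 + x5**2 - x0**2)

def g(x0, x1, x2, x3, x4, x5, a):
    return (g_face(x0, x1, x3, x5, a) + x2*(x2**2 + x3**2 - x0**2)
            + x4*(x4**2 - x5**2 + x0**2) + x0**4)

def transformed(x0, x1, x2, x3, x4, x5, a):
    w = x1 / a
    return (16*a*(x0*x1*x3 + x0*x1*x5 + x1*x3*x5 - a*x0*x3*x5)
            + x2**3 - 4*x2*(x0 + x3)*(x5 - w)
            + x4**3 + 4*x4*(x0 + x5)*(x3 - w)
            + (x0 + x3 + x5 - w)**4)

def checks_pass(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        x0, x1, x2, x3, x4, x5 = (rng.randint(-5, 5) for _ in range(6))
        a = Fraction(rng.randint(1, 7), rng.randint(1, 7))
        w = x1 / a
        # g_Delta as -a^2 times the symmetric determinant.
        m = [[x0, x3, x5], [x3, x0, -w], [x5, -w, x0]]
        if g_face(x0, x1, x3, x5, a) != -a**2 * det3(m):
            return False
        # Images of x0, x1, x3, x5 under the coordinate transformation.
        y0 = x0 + x3 + x5 - w
        y1 = -a*x0 + a*x3 + a*x5 + x1
        y3 = x0 + x3 - x5 + w
        y5 = x0 - x3 + x5 + w
        if g(y0, y1, x2, y3, x4, y5, a) != transformed(x0, x1, x2, x3, x4, x5, a):
            return False
    return True

if __name__ == "__main__":
    print(checks_pass())
```

The check says nothing about non-degeneracy itself; it only confirms that the displayed polynomial is the image of g under the stated substitution.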

Now, \(\mu =\nu =68\). The polynomial still degenerates on some faces in coordinate hyperplanes. The Newton diagram has 803 compact faces (computed with Gérmenes [22]). It can be checked that the polynomial is inner non-degenerate. Presumably, it can be made non-degenerate with the basic trick. Note that the coordinate transformation only works for \(a_ {35}\ne 0\).

4.4 Small Examples

Motivated by the above examples we search for a simpler example, with low Milnor number. We start from a quasi-homogeneous singularity \(f_\delta \) with one-dimensional singular locus, which is generically reduced (to get a small example); denote by \(\varSigma \) the reduced singular locus. There should be no coordinate transformation which moves \(\varSigma \) into a coordinate hyperplane. Therefore, \(\varSigma \) should be irreducible. Moreover, it should not be a complete intersection, where our methods apply, see Example 3.10.

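The condition that \(f_\delta \) is singular along \(\varSigma =V(I)\) is that \(f_\delta \in \int I\), where \(\int I\) is the primitive ideal [24, 26]:

$$\begin{aligned} \int I=\Big \{ f\in I \;\Big |\; \frac{\partial f}{\partial x_i}\in I \text{ for all } i\Big \}\;. \end{aligned}$$

The terminology is from Pellikaan [24].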

Interesting examples can be found in the work of De Jong and Van Straten on rational quadruple points [14, Proposition 1.8]. The easiest example is the following.

Example 3.8

Let \(\varSigma =V(I)\) be the monomial curve \((t^3,t^4,t^5)\). Its ideal is given by the minors of a \(2\times 3\) matrix. We define a function \(f_\delta \) with \(\varSigma \) as singular locus by adding one row to the matrix to get the following symmetric \(3\times 3\) determinant:

$$\begin{aligned} f_\delta =- \begin{vmatrix} x&\quad y&\quad z \\ y&\quad z&\quad x^2 \\ z&\quad x^2&\quad xy \end{vmatrix}. \end{aligned}$$

All \(2\times 2\) minors lie in the ideal I, because they vanish on the curve \((t^3,t^4,t^5)\), so by the product rule the partial derivatives of \(f_\delta \) lie also in I, showing that \(f_\delta \in \int I\).
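As a concrete check of this example, one can expand the determinant to \(f_\delta =x^5+xy^3+z^3-3x^2yz\) and evaluate \(f_\delta \) and its partial derivatives along the curve \((t^3,t^4,t^5)\). The Python sketch below is ours; the gradient is written out by hand.

```python
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def f_delta(x, y, z):
    return x**5 + x*y**3 + z**3 - 3*x**2*y*z

def gradient(x, y, z):
    """Partial derivatives of f_delta with respect to x, y, z."""
    return (5*x**4 + y**3 - 6*x*y*z,
            3*x*y**2 - 3*x**2*z,
            3*z**2 - 3*x**2*y)

def expansion_matches(bound=4):
    """Compare -det with the expanded polynomial on a grid of integer points."""
    values = range(-bound, bound + 1)
    return all(
        -det3([[x, y, z], [y, z, x**2], [z, x**2, x*y]]) == f_delta(x, y, z)
        for x in values for y in values for z in values
    )

def singular_along_curve(ts=range(-10, 11)):
    """f_delta and its gradient vanish at the points (t^3, t^4, t^5)."""
    return all(
        f_delta(t**3, t**4, t**5) == 0 and gradient(t**3, t**4, t**5) == (0, 0, 0)
        for t in ts
    )

if __name__ == "__main__":
    print(expansion_matches(), singular_along_curve())
```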

We find isolated singularities in three related series by adding suitable monomials:

$$\begin{aligned} f_{7+3k}&=f_\delta +x^{k}&=x^5+xy^3+z^3-3x^2yz&+ x^{k}\;,\\ f_{8+3k}&=f_\delta +x^{k-1}y&=x^5+xy^3+z^3-3x^2yz&+ x^{k-1}y\;,\\ f_{9+3k}&=f_\delta +x^{k-1}z&=x^5+xy^3+z^3-3x^2yz&+ x^{k-1}z\;. \end{aligned}$$

The lower index denotes the Milnor number. We can write it as \(\mu =7+v\), where v denotes the weight of the added monomial (using the weights 3, 4, 5). The smallest example is \(f_{23}\). The functions \(f_\mu \) degenerate by construction on the face \(\delta \) of the Newton diagram with vertices (5, 0, 0), (1, 3, 0) and (0, 0, 3).

To determine the resolution graph, we look at \(f_\delta \) in the chart \(x=1\). It is given by \(1+y^3+z^3-3yz=0\), on which we have the \(\mathbb {Z}_3\)-action \((1,y,z)\mapsto (1,\varepsilon y,\varepsilon ^2 z)\). We find that the singularity \(f_{7+v}\) has the same resolution graph as the maximal elliptic singularity \(z^2+y^3+y^2x^8+x^{9+v}\). It has \(Z^2=-1\), there is a cycle of \(v-15\) rational curves, all but one having self intersection \(-2\), and at the only \((-3)\)-curve a chain of three \((-2)\)-curves is attached. One has \(p_g(f_{7+v})=2\), whereas the maximal elliptic singularity has \(p_g=4\).

Conjecture 3.9

For every function \(\tilde{f}\), stably equivalent to the function \(f_{7+v}\) of Example 3.8 with \(v>15\), one has \(\mu (f)=\mu (\tilde{f})>\nu (\tilde{f})\). In particular, the function \(f_{7+v}\) is stably degenerate.

If we change the coefficients in the matrix defining \(f_\delta \), the resulting function will be non-degenerate. Every transformation I tried can also be done for the non-degenerate function, leading to the same Newton diagram. These transformations involve somehow the generators of the ideal I, but by changing their coefficients the ideal will become a complete intersection. This does, however, not exclude the existence of a very strange coordinate transformation, which does the trick.

Example 3.10

If we take only two generators of the ideal I of the monomial curve \((t^3,t^4,t^5)\) we get as reduced singular locus the union of this curve and the z-axis; it is the complete intersection \(x^3-yz=xz-y^2=0\). We take the function

$$\begin{aligned} g_\delta =(x^3-yz)(xz-y^2)\;. \end{aligned}$$

We have \(\mu (g_\delta +z^k)=6k+16\) for \(k\ge 4\). The function \(g_\delta +z^k\) is stably equivalent to the non-degenerate function

$$\begin{aligned} -uv +u(xz-y^2) + v(x^3-yz)+z^k\;. \end{aligned}$$

Example 3.11

If we take \(h_\delta \in I^2\), with still the same ideal I, we can apply the basic trick as first step. Take the function

$$\begin{aligned} h_\delta =(x^3-yz)^2+(xz-y^2)(yx^2-z^2)\;. \end{aligned}$$

The reduced singular locus consists of the monomial curve \((t^3,t^4,t^5)\) and the y-axis, and is not a complete intersection. We have \(\mu (h_\delta +y^k)=23+5k\) for \(k\ge 5\), but \(\nu (h_\delta +y^k)=41+k\).

We apply the basic trick to \(h=h_\delta +y^k\) and get

$$\begin{aligned} \tilde{h}= -uv-w^2+u(xz-y^2)+v(yx^2-z^2)+2w(x^3-yz)+y^k\;. \end{aligned}$$

This polynomial degenerates on the four-dimensional face \(\delta '\) where \(\tilde{h}_{\delta '}=u(xz-y^2)+v(yx^2-z^2)+2w(x^3-yz)\). Indeed, the function \(h_\delta \) involves all five monomials of the given degree and a general non-degenerate function can be written as \(H_\delta =(ax^3-byz)^2+(cxz-dy^2)(eyx^2-fz^2)\). Therefore, the same type of transformation can be applied to \(H_\delta +y^k\), leading to the same Newton diagram.

As we have a relation \(\sum r_if_i=0\) between the generators \(f_i\) of I, we get by differentiation a relation between the partial derivatives of the \(f_i\), holding modulo I. This can be written in terms of the parameter t in the parametrisation \((t^3,t^4,t^5)\) of \(\varSigma \). If \((u,v,2w)\) is a multiple of the relation vector we get a non-trivial solution. Note that we can write \(\tilde{h}_{\delta '}\) as a determinant:

$$\begin{aligned} \tilde{h}_{\delta '}= \begin{vmatrix} x&\quad y&\quad z \\ y&\quad z&\quad x^2 \\ v&\quad -2w&\quad u \end{vmatrix}. \end{aligned}$$

This shows that the singular locus of \(\tilde{h}_{\delta '}\) is not contained in the coordinate hyperplanes.
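The determinantal form of \(\tilde{h}_{\delta '}\) is a cofactor expansion along the last row, which can be confirmed by evaluation. The Python sketch below (function names are ours) compares the two expressions at random integer points.

```python
import random

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def h_face(u, v, w, x, y, z):
    return u*(x*z - y**2) + v*(y*x**2 - z**2) + 2*w*(x**3 - y*z)

def determinant_matches(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        u, v, w, x, y, z = (rng.randint(-6, 6) for _ in range(6))
        m = [[x, y, z], [y, z, x**2], [v, -2*w, u]]
        if det3(m) != h_face(u, v, w, x, y, z):
            return False
    return True

if __name__ == "__main__":
    print(determinant_matches())
```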

As there seems to be no way to use the fact that there are relations between the generators \(f_i\) of I, we conjecture that also \(h_\delta +y^k\) is stably degenerate.

It does not suffice that \(\varSigma \) is not a complete intersection, as the following example shows.

Example 3.12

Consider

$$\begin{aligned} f_\delta =- \begin{vmatrix} x&\quad y&\quad z \\ y&\quad z&\quad x \\ z&\quad x&\quad y \end{vmatrix}. \end{aligned}$$

Then, \(f_\delta =x^3+y^3+z^3-3xyz\), which is a product of three linear factors, and \(f_\delta +l^k\) for a general linear function l (a coordinate function will do) is equivalent to a function of type \(T_{k,k,k}= x_1^k+x_2^k+x_3^k+a x_1x_2x_3\) [3, 15.1], so a simple coordinate transformation makes the function non-degenerate. But it is even possible to make it non-degenerate after stabilisation, keeping the original \((x,y,z)\)-coordinates. The equation has the form \(l_1l_2l_3+l^k\). We apply the basic trick, first once:

$$\begin{aligned} -u_1v_1+u_1l_1+v_1l_2l_3 +l^k \end{aligned}$$

and then once again:

$$\begin{aligned} -u_1v_1-u_2v_2+u_1l_1+u_2l_2+v_1v_2l_3+l^k\;. \end{aligned}$$
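The two applications of the basic trick can be combined into a single identity, obtained by expanding the products on the right-hand side:

$$\begin{aligned}&-u_1v_1-u_2v_2+u_1l_1+u_2l_2+v_1v_2l_3+l^k\\&\quad =l_1l_2l_3+l^k-(u_1-l_2l_3)(v_1-l_1)-(u_2-v_1l_3)(v_2-l_2)\;. \end{aligned}$$

So with \(\tilde{u}_1=u_1-l_2l_3\), \(\tilde{v}_1=v_1-l_1\), \(\tilde{u}_2=u_2-v_1l_3\) and \(\tilde{v}_2=v_2-l_2\), which is an invertible (triangular) change of coordinates, the function is \(l_1l_2l_3+l^k\) stabilised by \(-\tilde{u}_1\tilde{v}_1-\tilde{u}_2\tilde{v}_2\).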

Example 3.13

We give a non-trivial example with \(\varSigma \) a complete intersection. Consider the curve with parametrisation \((t^4,t^5,t^6)\) and equations \(y^2-xz=z^2-x^3=0\). This is the simple complete intersection curve \(W_8\) in Giusti’s notation (see [3], 9.8). We take as \(f_\delta \) a rather general element in the square of the ideal. Let

$$\begin{aligned} f=x(z^2-x^3)^2-z(z^2-x^3)(y^2-zx)+x^2(y^2-zx)^2+xy^5 \end{aligned}$$

with \(\mu (f)=103\). The (reduced) singular locus of \(f_\delta =x^7+x^2y^4-2x^4z^2+2xz^4-y^2z^3-x^3y^2z\) consists of \(W_8\) and the y-axis. We apply the basic trick several times and simplify. The result can be seen directly: a function of the form \(g=\alpha \varphi ^2 + \beta \varphi \psi + \gamma \psi ^2\) is stably equivalent to \(-uv-tw+u\varphi +t\psi +\alpha v^2 + \beta vw + \gamma w^2\), as the last expression is equal to

$$\begin{aligned} g -(u-\beta w-\alpha (v+\varphi ))(v-\varphi )-(w-\psi )(t-\gamma (w+\psi )-\beta \varphi )\;.\end{aligned}$$

Therefore, f is stably equivalent to

$$\begin{aligned} \tilde{f}=-uv-tw+u(z^2-x^3)+t(y^2-zx)+x v^2 - z vw + x^2 w^2+xy^5\;. \end{aligned}$$
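The general identity for \(g=\alpha \varphi ^2+\beta \varphi \psi +\gamma \psi ^2\) and the expansion of \(f_\delta \) in this example can both be spot-checked. The Python sketch below is ours: the first function treats all nine symbols as independent integers, the second compares \(f-xy^5\) with the expanded form of \(f_\delta \) on a grid of integer points.

```python
import random

def quadratic_identity_holds(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        u, v, t, w, phi, psi, al, be, ga = (rng.randint(-9, 9) for _ in range(9))
        g = al*phi**2 + be*phi*psi + ga*psi**2
        stabilised = -u*v - t*w + u*phi + t*psi + al*v**2 + be*v*w + ga*w**2
        rewritten = (g - (u - be*w - al*(v + phi))*(v - phi)
                     - (w - psi)*(t - ga*(w + psi) - be*phi))
        if stabilised != rewritten:
            return False
    return True

def f(x, y, z):
    phi, psi = z**2 - x**3, y**2 - z*x
    return x*phi**2 - z*phi*psi + x**2*psi**2 + x*y**5

def f_delta_expanded(x, y, z):
    return (x**7 + x**2*y**4 - 2*x**4*z**2 + 2*x*z**4
            - y**2*z**3 - x**3*y**2*z)

def expansion_matches(bound=4):
    values = range(-bound, bound + 1)
    return all(
        f(x, y, z) - x*y**5 == f_delta_expanded(x, y, z)
        for x in values for y in values for z in values
    )

if __name__ == "__main__":
    print(quadratic_identity_holds(), expansion_matches())
```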

The examples above show that there is no easy criterion for a function to be stably equivalent to a non-degenerate function. Our strategy is to remove \(f_\varDelta \), if f degenerates on \(\varDelta \). To increase the Newton number in this way, the function \(f_\varDelta \) should have a specific form, which is different from the generic function with the same support, as in Examples 3.4, 3.12 and 3.13. In Examples 3.8 and 3.11, the only specific structure is the existence of relations between the generators of the ideal of the singular locus, but that does not seem to help.

5 Irreducible Plane Curve Singularities

It is well known that the only non-degenerate irreducible plane curve singularities are those with one characteristic pair (\(g=1\) in the notation below). This follows from Newton’s method to find a Newton–Puiseux series (see, e.g. [7], 8.3); indeed the Newton polygon was introduced by Newton for this purpose. In this section, we give evidence that all irreducible plane curve singularities are stably non-degenerate (in characteristic zero).

We describe equations for irreducible plane curve singularities following Teissier [29], see also [9]. We look at algebroid curves over an algebraically closed field K of characteristic zero. The basic invariant is the semigroup.

Let \(S=\langle \bar{\beta }_0, \dots ,\bar{\beta }_g\rangle \) be the semigroup of the curve. Define numbers \(n_i\) by \(e_i=\gcd (\bar{\beta }_0, \dots ,\bar{\beta }_i)\) and \(e_{i-1}=n_ie_i\). The condition that S comes from a plane curve singularity, is that \(n_i\bar{\beta }_i\in \langle \bar{\beta }_0, \dots ,\bar{\beta }_{i-1}\rangle \) and \(n_i\bar{\beta }_i < \bar{\beta }_{i+1}\).
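To make the definitions concrete, the following Python sketch (ours, with hypothetical function names) computes the numbers \(e_i\) and \(n_i\) from a list of generators and tests the two stated conditions. It checks only those two conditions and nothing else about the generators.

```python
from functools import reduce
from math import gcd

def in_semigroup(target, gens):
    """True if target is a non-negative integer combination of gens."""
    reachable = [True] + [False] * target
    for value in range(1, target + 1):
        reachable[value] = any(value >= g and reachable[value - g] for g in gens)
    return reachable[target]

def gcd_data(betas):
    """Return ([e_0, ..., e_g], [n_1, ..., n_g]) for generators beta_0, ..., beta_g."""
    e = [reduce(gcd, betas[: i + 1]) for i in range(len(betas))]
    n = [e[i - 1] // e[i] for i in range(1, len(betas))]
    return e, n

def satisfies_plane_conditions(betas):
    """Check n_i*beta_i in <beta_0..beta_{i-1}> and n_i*beta_i < beta_{i+1}."""
    _, n = gcd_data(betas)
    g = len(betas) - 1
    for i in range(1, g + 1):
        multiple = n[i - 1] * betas[i]
        if not in_semigroup(multiple, betas[:i]):
            return False
        if i < g and not multiple < betas[i + 1]:
            return False
    return True

if __name__ == "__main__":
    print(gcd_data([4, 6, 13]), satisfies_plane_conditions([4, 6, 13]))
```

For \(\langle 4,6,13\rangle \) one finds \(e=(4,2,1)\) and \(n_1=n_2=2\), with \(2\cdot 6=3\cdot 4\) and \(2\cdot 13=2\cdot 4+3\cdot 6\).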

Teissier showed that every plane curve singularity with semigroup S occurs in the positive weight part of the versal deformation of the monomial curve \(C_S\) with the same semigroup S. Embed \(C_S\) in \(K^{g+1}\) by \(u_i=t^{\bar{\beta }_i}\). Write

$$\begin{aligned} n_i\bar{\beta }_i= l_0^{(i)}\bar{\beta }_0+ l_1^{(i)}\bar{\beta }_1 + \dots + l_{i-1}^{(i)}\bar{\beta }_{i-1} \;. \end{aligned}$$

The curve \(C_S\) is a complete intersection with equations

$$\begin{aligned} f_1&=u_1^{n_1}-u_0^{ l_0^{(1)}}=0\\ f_2&=u_2^{n_2}-u_0^{ l_0^{(2)}}u_1^{ l_1^{(2)}}=0\\&\;\;\vdots \\ f_g&=u_g^{n_g}-u_0^{ l_0^{(g)}}\dots u_{g-1}^{ l_{g-1}^{(g)}}=0 \end{aligned}$$

A particularly simple deformation of positive weight is given by \(f_i+\varepsilon u_{i+1}\), \(i<g\), and we may even take \(\varepsilon =1\). It is then easy to eliminate the \(u_i\) with \(i\ge 2\) to obtain an equation of a plane curve. Cassou-Nogués [9] has shown that one can write the whole equisingular deformation of this particular curve as \(\widetilde{f}_i+ u_{i+1}\), where \(\widetilde{f}_i\) only depends on the coordinates \(u_0,\dots ,u_i\), so it is possible to do the same elimination for the whole stratum. However, as the curve is no longer quasi-homogeneous it is not clear whether every plane curve occurs in this family.

The easiest elimination occurs when \(l_j^{(i)}=0\) for all \(j\ge 2\) and all i. Such semigroups exist for all g. They can be constructed inductively. Given \(\langle \bar{\beta }_0, \dots ,\bar{\beta }_{g-1}\rangle \) with \(\gcd (\bar{\beta }_0, \dots ,\bar{\beta }_{g-1})=1\) and such that \(l_j^{(i)}=0\) for \(j\ge 2\), take a semigroup \(\langle n_g\bar{\beta }_0, \dots ,n_g\bar{\beta }_{g-1},\bar{\beta }_g\rangle \) with \(\gcd (n_g,\bar{\beta }_g)=1\), \(\bar{\beta }_g> n_{g-1}n_g\bar{\beta }_{g-1}\) and \(\bar{\beta }_g\in \langle \bar{\beta }_0, \bar{\beta }_1\rangle \).
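As an illustration of the inductive construction (the specific numbers are our choice), start from \(\langle 2,3\rangle \) with \(n_1=2\). Taking \(n_2=2\) and \(\bar{\beta }_2=13\), which is odd, larger than \(n_1n_2\bar{\beta }_1=12\) and lies in \(\langle 2,3\rangle \), gives \(\langle 4,6,13\rangle \). Taking next \(n_3=3\) and \(\bar{\beta }_3=80\), which is prime to 3, larger than \(n_2n_3\bar{\beta }_2=78\) and lies in \(\langle 4,6\rangle \), gives \(\langle 12,18,39,80\rangle \) with

$$\begin{aligned} 2\cdot 18&=3\cdot 12\;,\\ 2\cdot 39&=2\cdot 12+3\cdot 18\;,\\ 3\cdot 80&=2\cdot 12+12\cdot 18\;, \end{aligned}$$

so that only \(\bar{\beta }_0\) and \(\bar{\beta }_1\) occur on the right-hand sides.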

Conjecture 4.1

The deformed curve \(f_i+u_{i+1}\), with \(l_j^{(i)}=0\) for all \(j\ge 2\), is stably equivalent to a non-degenerate singularity.

“Proof”

In this case, the equation of the plane curve is

$$\begin{aligned} \left( \dots \Big ((u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}-u_0^{ l_0^{(2)}}u_1^{ l_1^{(2)}}\Big )^{n_3} \dots -u_0^{ l_0^{(g-1)}}u_1^{ l_1^{(g-1)}} \right) ^{n_g}-u_0^{ l_0^{(g)}}u_1^{ l_1^{(g)}}=0 \;. \end{aligned}$$

This is of the form \(\varphi _g^{n_g}-u_0^{ l_0^{(g)}}u_1^{ l_1^{(g)}}=0\), and \(\varphi _g=\varphi _{g-1}^{n_{g-1}}-u_0^{ l_0^{(g-1)}}u_1^{ l_1^{(g-1)}}\) is itself of the same form. The principal part is a complete \(n_g\)-th power. We apply the basic trick (Lemma 3.1) and write

$$\begin{aligned} -v_gw_g + v_g\varphi _g+w_g^{n_g} -u_0^{ l_0^{(g)}}u_1^{ l_1^{(g)}}\;. \end{aligned}$$

Here \( v_g\varphi _g =v_g\left( \varphi _{g-1}^{n_{g-1}}-u_0^{ l_0^{(g-1)}}u_1^{ l_1^{(g-1)}}\right) \), so we apply the basic trick once more, now to \(v_g\varphi _{g-1}^{n_{g-1}}\), and obtain

$$\begin{aligned} -v_gw_g -v_{g-1}w_{g-1} + v_{g-1}\varphi _{g-1}+v_gw_{g-1}^{n_{g-1}}+w_g^{n_g} -v_gu_0^{ l_0^{(g-1)}}u_1^{ l_1^{(g-1)}} -u_0^{ l_0^{(g)}}u_1^{ l_1^{(g)}}\;. \end{aligned}$$

The next step takes care of \(v_{g-1}\varphi _{g-1}\) and we continue inductively. The final result is

$$\begin{aligned}&-v_gw_g -\dots - v_{2}w_{2} +v_2(u_1^{n_1}-u_0^{ l_0^{(1)}}) +v_3w_{2}^{n_{2}}+\dots +w_g^{n_g}\\&\quad {}-v_3u_0^{ l_0^{(2)}}u_1^{ l_1^{(2)}}-\dots - v_gu_0^{ l_0^{(g-1)}}u_1^{ l_1^{(g-1)}} -u_0^{ l_0^{(g)}}u_1^{ l_1^{(g)}}\;. \end{aligned}$$

It remains to show that the final function is non-degenerate. We will not do this, leaving this as a conjecture. In fact, we conjecture that all facets of the Newton diagram are simplices, implying non-degeneracy. We checked this in the case \(g=3\). There are eight monomials, \(v_3w_3\), \(v_2w_2\), \(v_2u_1^{n_1}\), \(v_2u_0^{l_0^{(1)}}\), \(v_3w_2^{n_2}\), \(w_3^{n_3}\), \(v_3u_0^{ l_0^{(2)}}u_1^{ l_1^{(2)}}\) and \(u_0^{ l_0^{(3)}}u_1^{ l_1^{(3)}}\). The facets containing both \(v_2u_1^{n_1}\) and \(v_2u_0^{l_0^{(1)}}\) are rather easy to describe, but the remaining facets, on which only one of \(v_2u_1^{n_1}\) and \(v_2u_0^{l_0^{(1)}}\) lies, are more difficult, as they depend on the values of \(l_k^{(i)}\). Each such facet contains exactly six points and is, therefore, a simplex. \(\square \)

Remark 4.2

Without the assumption \(l_j^{(i)}=0\) for all \(j\ge 2\), the situation is more complicated and we only give the case \(g=4\). The equation is now

$$\begin{aligned}&\left( \Big ( (u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}- u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} \Big )^{n_3} - (u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(3)}}u_1^{ l_1^{(3)}} u_0^{ l_0^{(3)}} \right) ^{n_4}\\&\quad -\Big ( (u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}- u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} \Big )^{l_3^{(4)}}(u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(4)}}u_1^{ l_1^{(4)}}u_0^{ l_0^{(4)}}\;. \end{aligned}$$

We start with one application of the basic trick (Lemma 3.1) to get

$$\begin{aligned}&-v_4w_4+w_4^{n_4}+v_4\Big ( (u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}- u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} \Big )^{n_3} - v_4(u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(3)}}u_1^{ l_1^{(3)}} u_0^{ l_0^{(3)}} \\&\quad - \Big ( (u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}- u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} \Big )^{l_3^{(4)}}(u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(4)}}u_1^{ l_1^{(4)}}u_0^{ l_0^{(4)}}\;. \end{aligned}$$

Let \(\varphi _3=(u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2}- u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}}\). Then, we have two terms involving a power of \(\varphi _3\), so we apply the basic trick twice, followed by a coordinate transformation as in Remark 3.3 to get

$$\begin{aligned}&-v_4w_4-v_{3,1}w_{3,1}+v_{3,1}w_{3,2}-v_{3,2}w_{3,2}+w_4^{n_4}+v_4w_{3,1}^{n_3} +v_{3,2}(u_1^{n_1}-u_0^{ l_0^{(1)}})^{n_2} \quad \\&\quad - v_{3,2}u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} -v_4(u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(3)}}u_1^{ l_1^{(3)}} u_0^{ l_0^{(3)}} -w_{3,2}^{l_3^{(4)}}(u_1^{n_1}-u_0^{ l_0^{(1)}})^{l_2^{(4)}}u_1^{ l_1^{(4)}}u_0^{ l_0^{(4)}}\;. \end{aligned}$$

Finally, we introduce six new variables to handle the powers of \(\varphi _2=u_1^{n_1}-u_0^{ l_0^{(1)}}\):

$$\begin{aligned}&-v_4w_4-v_{3,1}w_{3,1}+(v_{3,1}-v_{3,2})w_{3,2} -v_{2,1}w_{2,1}\\&\qquad -v_{2,2}w_{2,2}+(v_{2,1}+v_{2,2}-v_{2,3})w_{2,3}\\&\quad +w_4^{n_4}+v_4w_{3,1}^{n_3}+v_{3,2}w_{2,1}^{n_2} +v_{2,3}(u_1^{n_1}-u_0^{ l_0^{(1)}}) - v_{3,2}u_1^{ l_1^{(2)}}u_0^{ l_0^{(2)}} \\&\quad -v_4w_{2,2}^{l_2^{(3)}}u_1^{ l_1^{(3)}} u_0^{ l_0^{(3)}} -w_{3,2}^{l_3^{(4)}}w_{2,3}^{l_2^{(4)}}u_1^{ l_1^{(4)}}u_0^{ l_0^{(4)}}\;. \end{aligned}$$
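Also here the result is linear in the v-variables, and the vanishing of their coefficients forces \(w_{2,1}=w_{2,2}=w_{2,3}=\varphi _2\), \(w_{3,1}=w_{3,2}=\varphi _3\) and \(w_4=\varphi _3^{n_3}-\varphi _2^{l_2^{(3)}}u_1^{l_1^{(3)}}u_0^{l_0^{(3)}}\). The Python sketch below is a consistency check of ours: at these values the last display should reduce to the plane curve equation for arbitrary values of the v-variables. The exponents in `N` and `L` are hypothetical small numbers, not taken from an actual semigroup.

```python
import random

def check_g4(n, l, trials=100, seed=0):
    """n[i] = n_i for i = 1..4; l[i][j] = l_j^{(i)} (illustrative exponents)."""
    rng = random.Random(seed)
    for _ in range(trials):
        u0, u1 = rng.randint(-3, 3), rng.randint(-3, 3)

        def mono(i):
            return u1**l[i][1] * u0**l[i][0]

        phi2 = u1**n[1] - u0**l[1][0]
        phi3 = phi2**n[2] - mono(2)
        phi4 = phi3**n[3] - phi2**l[3][2] * mono(3)
        curve = phi4**n[4] - phi3**l[4][3] * phi2**l[4][2] * mono(4)
        # Critical values of the w-variables; the v-variables stay arbitrary.
        w4, w31, w32 = phi4, phi3, phi3
        w21 = w22 = w23 = phi2
        v4, v31, v32, v21, v22, v23 = (rng.randint(-9, 9) for _ in range(6))
        stabilised = (
            -v4*w4 - v31*w31 + (v31 - v32)*w32 - v21*w21 - v22*w22
            + (v21 + v22 - v23)*w23
            + w4**n[4] + v4*w31**n[3] + v32*w21**n[2]
            + v23*(u1**n[1] - u0**l[1][0]) - v32*mono(2)
            - v4*w22**l[3][2]*mono(3)
            - w32**l[4][3]*w23**l[4][2]*mono(4)
        )
        if stabilised != curve:
            return False
    return True

N = {1: 2, 2: 2, 3: 2, 4: 2}
L = {1: (3,), 2: (2, 3), 3: (1, 2, 1), 4: (2, 1, 1, 1)}

if __name__ == "__main__":
    print(check_g4(N, L))
```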