1 Introduction and main results

A hypergraph \(H=(V,E)\) consists of a vertex set \(V\) and an edge set \(E\subset 2^{V}\), where \(2^{V}\) is the family of all subsets of \(V\). For a hypergraph \(H\), let \(V(H)\) denote the vertex set of \(H\) and \(E(H)\) denote the edge set of \(H\). If all edges of \(H\) have the same cardinality \(r\), then \(H\) is an \(r\)-uniform hypergraph, or \(r\)-graph. The set \(T(H)=\{|e|: e\in E(H)\}\) is called the set of edge types of the hypergraph \(H\). The hypergraph \(H\) is non-uniform if it has at least two edge types. If \(T(H)=\{r_{1},r_{2},\ldots ,r_{l}\}\) with \(r_{1}<r_{2}<\cdots <r_{l}\), then we say that \(H\) is an \(\{r_{1}, r_{2},\ldots ,r_{l}\}\)-hypergraph. For any \(r\in T(H)\), the \(r\)-th level hypergraph \(H^{r}\) is the \(r\)-uniform hypergraph consisting of all edges of \(H\) with \(r\) vertices. For a positive integer \(r\), let \(V^{(r)}\) be the family of all \(r\)-subsets of \(V\). We write \(\overline{H}^{r}=V^{(r)}\backslash H^{r}\); the complement of \(H\) is then \(\overline{H}=(V,\overline{H}^{r_{1}}\cup \overline{H}^{r_{2}}\cup \ldots \cup \overline{H}^{r_{l}})\). For any integer \(n\in {\mathbb {N}}\), denote the set \(\{1,2,\ldots ,n\}\) by \([n]\). Let \([t]^{\{r_{1},r_{2},\ldots ,r_{l}\}}\) denote the complete \(\{r_{1},r_{2},\ldots ,r_{l}\}\)-hypergraph of order \(t\), that is, the \(\{r_{1},r_{2},\ldots ,r_{l}\}\)-hypergraph of order \(t\) containing all \(r_{i}\)-subsets of the vertex set \(V\) for \(1\le i\le l\). A hypergraph \(H\) is a subhypergraph of a hypergraph \(G\), denoted by \(H\subseteq G\), if \(V(H)\subseteq V(G)\) and \(E(H)\subseteq E(G)\). A complete subhypergraph of a hypergraph \(H\) with the same set of edge types as \(H\) is called a clique of \(H\). If \(U\subseteq V(H)\), then the subhypergraph of \(H\) induced by \(U\) is denoted by \(H[U]\). Throughout the paper, unless otherwise specified, all graphs and hypergraphs have vertex set \([n]\).
The characteristic vector of a set \(U\subseteq [n]\), denoted by \({\vec {x}}^{U}=(x_1^{U}, x_2^{U}, \ldots , x_n^{U})\), is the vector in the simplex \(S\) (defined in Definition 1.1 below) given by:

$$\begin{aligned} x_i^{U}=\frac{1_{i\in U}}{|U|} \end{aligned}$$

where \(|U|\) denotes the cardinality of \(U\) and \(1_{P}\) is the indicator function returning 1 if property \(P\) is satisfied and 0 otherwise.

Definition 1.1

Let \(H\) be an \(r\)-uniform hypergraph. Let \(S=\{\vec {x}=(x_1,x_2,\ldots ,x_n)\in \mathbb {R}^{n}: \sum _{i=1}^{n} x_i =1, x_i \ge 0 \mathrm{\ for \ } i=1,2,\ldots , n \}\) and let \(\vec {x}=(x_1,x_2,\ldots ,x_n)\in S\). The Lagrange function of \(H\), denoted by \(\lambda (H, \vec {x})\), is

$$\begin{aligned} \lambda (H,\vec {x})=\sum _{e \in E(H)}\prod \limits _{i\in e}x_{i}. \end{aligned}$$

The Lagrangian of \(H\), denoted by \(\lambda (H)\), is

$$\begin{aligned} \lambda (H) = \max \{\lambda (H, \vec {x}): \vec {x} \in S \}. \end{aligned}$$

We call \(\vec {x}=(x_{1},x_{2},\ldots ,x_{n})\in \mathbb {R}^{n}\) a feasible weighting for \(H\) if and only if \(\vec {x}\in S\). A vector \(\vec {y}\in S\) is called an optimal weighting for \(H\) if and only if \(\lambda (H,\vec {y})=\lambda (H)\).
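As a concrete illustration of Definition 1.1 (a hypothetical example of ours, not part of the original text), the Lagrange function can be evaluated directly from an edge list; the sketch below computes \(\lambda (H,\vec {x})\) for the complete 3-graph on four vertices at the uniform feasible weighting.

```python
from itertools import combinations
from math import prod

def lagrange(edges, x):
    """Lagrange function lambda(H, x): sum over edges of the product of weights."""
    return sum(prod(x[i] for i in e) for e in edges)

# Hypothetical example: the complete 3-graph on vertices {0, 1, 2, 3}.
edges = [set(e) for e in combinations(range(4), 3)]

# The uniform weighting is feasible: entries are nonnegative and sum to 1.
x = [0.25, 0.25, 0.25, 0.25]
val = lagrange(edges, x)
print(val)   # 4 edges, each contributing (1/4)^3, so 0.0625
```

Other feasible weightings can be compared in the same way to search for an optimal weighting.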

Lagrangians were introduced for 2-graphs by Motzkin and Straus (1965). They determined the following expression for the Lagrangian of a 2-graph.

Theorem 1.1

(Motzkin and Straus (1965)) If \(G\) is a 2-graph and the order of its maximum cliques is \(t\), then

$$\begin{aligned} \lambda (G)=\lambda \left( [t]^{(2)}\right) ={1 \over 2}\left( 1 - {1 \over t}\right) . \end{aligned}$$

Moreover, the characteristic vector of a maximum clique of \(G\) is an optimal weighting for \(G\).
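For a quick numerical check of Theorem 1.1 (a hypothetical example of ours), take \(K_4\) minus an edge, whose maximum cliques are triangles, so \(t=3\) and \(\lambda (G)=\frac{1}{2}(1-\frac{1}{3})=\frac{1}{3}\):

```python
from math import prod

def lagrange(edges, x):
    return sum(prod(x[i] for i in e) for e in edges)

# Hypothetical 2-graph: K4 minus the edge {2, 3}; a maximum clique is {0, 1, 2}.
edges = [{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}]
t = 3  # order of the maximum cliques

# Characteristic vector of the maximum clique {0, 1, 2}.
x = [1/3, 1/3, 1/3, 0.0]
ms_value = 0.5 * (1 - 1/t)          # Motzkin--Straus value: 1/3
print(lagrange(edges, x), ms_value)
```

The uniform weighting \((\frac{1}{4},\ldots ,\frac{1}{4})\), for comparison, gives the smaller value \(5/16\).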

This result and its extensions have applications in both combinatorics and optimization. The Lagrangian of a hypergraph has been a useful tool in hypergraph extremal problems. For example, Sidorenko (1987) and Frankl and Füredi (1989) applied Lagrangians of hypergraphs in finding Turán densities of hypergraphs. Frankl and Rödl (1984) applied them in disproving Erdős's long-standing jumping constant conjecture. The Motzkin–Straus result and its extensions were successfully employed in optimization to provide heuristics for the maximum clique problem (Bomze 1997; Budinich 2003; Busygin 2006; Gibbons et al. 1997; Pardalos and Phillips 1990). However, the obvious generalization of Motzkin and Straus' result to hypergraphs is false: the Lagrangian of a hypergraph is not always the same as the Lagrangian of its maximum cliques. In fact, there are many examples of hypergraphs that do not achieve their Lagrangian on any proper subhypergraph. A generalization of Theorem 1.1 to \(r\)-uniform hypergraphs was given by Rota Bulò and Pelillo (2009) by associating the edge set of an \(r\)-uniform hypergraph \(H\) with a polynomial function of degree \(r\) different from the Lagrange function. Rota Bulò and Pelillo (2009) considered the following non-linear program:

$$\begin{aligned} \text{minimize}\quad&h_{H}(\vec {x})=\lambda (\overline{H},\vec {x}) +\tau \sum ^{n}_{i=1}x_{i}^{r}=\sum \limits _{e\in \overline{H}}\prod \limits _{i\in e}x_{i}+\tau \sum ^{n}_{i=1}x_{i}^{r} \nonumber \\ \text{subject to}\quad&\vec {x}\in S, \end{aligned}$$
(1)

where \(\tau \in \mathbb {R}\) and \(\lambda (\overline{H},\vec {x})=\sum \limits _{e\in \overline{H}}\prod \limits _{i\in e}x_{i}\) is the Lagrange function of \(\overline{H}\). They obtained the following generalization of the Motzkin–Straus theorem.

Theorem 1.2

(Rota Bulò and Pelillo (2009)) Let \(H\) be an \(r\)-uniform hypergraph and \(0<\tau \le \frac{1}{r(r-1)}\) (with strict inequality for \(r=2\)). A vector \(\vec {x}\in S\) is a local (global) solution of (1) if and only if it is the characteristic vector of a maximal (maximum) clique of \(H\). If \(H\) has a maximum clique of order \(t\), then \(h_{H}\) attains the minimum value \(\tau t^{1-r}\) over \(S\), and the characteristic vector of a maximum clique is a global solution of (1) (this is true for \(r=2\) and \(\tau ={1 \over 2}\) as well).
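The following sketch (a hypothetical example of ours) evaluates the objective \(h_{H}\) of (1) at the characteristic vector of a maximum clique for a small 3-graph, with \(\tau =\frac{1}{r(r-1)}\):

```python
from itertools import combinations
from math import prod

def lagrange(edges, x):
    return sum(prod(x[i] for i in e) for e in edges)

# Hypothetical 3-graph on 5 vertices: all triples inside {0,1,2,3}; vertex 4
# lies in no edge, so the maximum clique is {0,1,2,3} with t = 4.
n, r = 5, 3
H = [set(e) for e in combinations(range(4), 3)]
H_bar = [set(e) for e in combinations(range(n), r) if set(e) not in H]

tau = 1 / (r * (r - 1))            # upper end of the allowed range for r = 3

def h(x):
    # Objective of program (1): Lagrange function of the complement plus tau * sum x_i^r.
    return lagrange(H_bar, x) + tau * sum(xi**r for xi in x)

t = 4
char_vec = [1/t] * t + [0.0]       # characteristic vector of the maximum clique
print(h(char_vec), tau * t**(1 - r))   # both equal 1/96
```

The value at the characteristic vector is \(\tau t^{1-r}\); the uniform weighting, for comparison, gives a strictly larger value, in line with Theorem 1.2.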

If \(H\) is a 2-graph on \(n\) vertices, then

$$\begin{aligned} 1&= (x_{1}+x_{2}+\ldots +x_{n})^{2}\\&= \sum _{i=1}^{n}x_{i}^{2}+2\left( \sum _{\{i,j\}\in E(H)}x_{i}x_{j}+\sum _{\{i,j\}\in E(\overline{H})}x_{i}x_{j}\right) \\&= \sum _{i=1}^{n}x_{i}^{2}+2\left( \lambda (H,\vec {x})+\lambda (\overline{H},\vec {x})\right) . \end{aligned}$$

So

$$\begin{aligned} \lambda (H,\vec {x})=\frac{1}{2}-\left( \frac{1}{2}\sum _{i=1}^{n}x_{i}^{2}+\lambda (\overline{H},\vec {x})\right) . \end{aligned}$$
(2)

By (2), maximizing \(\lambda (H,\vec {x})\) locally (globally) is equivalent to minimizing \(\frac{1}{2}\sum _{i=1}^{n}x_{i}^{2}+\lambda (\overline{H},\vec {x})\) locally (globally). So Theorem 1.2 generalizes the Motzkin–Straus result.
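Identity (2) can be checked numerically for any 2-graph; the sketch below (a hypothetical example of ours) uses \(K_4\) minus an edge and an arbitrary feasible weighting:

```python
from itertools import combinations
from math import prod

def lagrange(edges, x):
    return sum(prod(x[i] for i in e) for e in edges)

# Hypothetical 2-graph: K4 minus the edge {2, 3}.
n = 4
H = [{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}]
H_bar = [set(e) for e in combinations(range(n), 2) if set(e) not in H]

x = [0.4, 0.3, 0.2, 0.1]           # an arbitrary feasible weighting (sums to 1)
lhs = lagrange(H, x)
rhs = 0.5 - (0.5 * sum(xi**2 for xi in x) + lagrange(H_bar, x))
print(lhs, rhs)                    # the two sides of identity (2) agree
```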

Very recently, the study of extremal problems for non-uniform hypergraphs has been motivated by extremal poset problems. We study a similar problem for non-uniform hypergraphs and explore its applications in determining Turán densities of non-uniform hypergraphs. Given an \(\{r_{1},r_{2},\ldots ,r_{l}\}\)-hypergraph \(H\), consider the following non-linear program.

$$\begin{aligned} \text{minimize}\quad&g_{H}(\vec {x})=\sum _{r\in T(H)}\alpha _{r}\sum \limits _{e\in \overline{H}^{r}}\prod _{i\in e}x_{i}+\sum _{r\in T(H)}\beta _{r}\sum ^{n}_{i=1}x_{i}^{r} \nonumber \\ \text{subject to}\quad&\vec {x}\in S, \end{aligned}$$
(3)

where \(\alpha _{r},\beta _{r}\in \mathbb {R}\), for all \(r\in T(H)\). In this paper, we study problem (3) for \(\{r_{1},r_{2}\}\)-hypergraphs and \(\{1,r\}\)-hypergraphs.

When \(l=2\), that is, when \(H\) is an \(\{r_{1},r_{2}\}\)-hypergraph, (3) can be written as

$$\begin{aligned} \text{minimize}\quad&p_{H}(\vec {x})=\alpha \sum \limits _{e\in \overline{H}^{r_{1}}}\prod \limits _{i\in e}x_{i}+\beta \sum \limits _{e\in \overline{ H}^{r_{2}}}\prod \limits _{i\in e}x_{i}+\gamma \sum ^{n}_{i=1}x_{i}^{r_{1}}+\tau \sum ^{n}_{i=1}x_{i}^{r_{2}}\nonumber \\ \text{subject to}\quad&\vec {x}\in S, \end{aligned}$$
(4)

where \(\alpha ,\beta ,\gamma ,\tau \in \mathbb {R}\). Let \(p_{H}=\min \limits _{\vec {x}\in S}p_{H}(\vec {x}).\)

Furthermore, when \(r_{1}=1\) and \(r_{2}=r\), the term \(\gamma \sum ^{n}_{i=1}x_{i}=\gamma \) is a constant, so we may write \(f_{H}(\vec {x})=p_{H}(\vec {x})-\gamma \) for simplification:

$$\begin{aligned} \text{minimize}\quad&f_{H}(\vec {x})=\alpha \sum \limits _{i\in \overline{H}^{1}}x_{i}+\beta \sum \limits _{e\in \overline{H}^{r}}\prod \limits _{i\in e}x_{i}+\tau \sum ^{n}_{i=1}x_{i}^{r}\nonumber \\ \text{subject to}\quad&\vec {x}\in S. \end{aligned}$$
(5)

In order to simplify the notation, we write \(f_{H}(\vec {x})\) as \(f(\vec {x})\) when the context is unambiguous. Let \(f_{H}=\min \limits _{\vec {x}\in S}f(\vec {x}).\)

We call \(\vec {x}=(x_{1},x_{2},\ldots ,x_{n})\) a feasible weighting for a non-uniform hypergraph \(H\) with \(n\) vertices if \(\vec {x}\in S\). A local solution of (5) is a vector \(\vec {x}\in S\) for which there exists a neighborhood \(\Gamma (\vec {x})\) of \(\vec {x}\) such that \(f(\vec {y})\ge f(\vec {x})\) for all \(\vec {y}\in \Gamma (\vec {x})\). A global solution is a vector \(\vec {x}\in S\) such that \(f(\vec {y})\ge f(\vec {x})\) for all \(\vec {y}\in S\). We say that \(\vec {x}\) is a strict local (global) solution if the inequalities are strict for \(\vec {y}\ne \vec {x}\). Applying an approach similar to that in Rota Bulò and Pelillo (2009), we obtain the following results for \(\{1,r\}\)-hypergraphs and \(\{r_{1},r_{2}\}\)-hypergraphs.

Theorem 1.3

Let \(\tau >0\), \(\alpha \ge \tau r\), and \(\beta \ge \tau r(r-1)\) (with strict inequality for \(r=2\)) be constants, and let \(H\) be a \(\{1,r\}\)-hypergraph. A feasible weighting \(\vec {x}\) is a local (global) solution of (5) if and only if it is the characteristic vector of a maximal (maximum) clique of \(H\). In particular, if \(H\) has a maximum clique of order \(t\), then \(f_{H}=\tau t^{1-r}\) and the characteristic vector of a maximum clique is a global solution of (5).
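The conclusion \(f_{H}=\tau t^{1-r}\) can be illustrated numerically (with a hypothetical \(\{1,3\}\)-hypergraph of our choosing and the smallest constants permitted by the theorem):

```python
from itertools import combinations
from math import prod

def lagrange(edges, x):
    return sum(prod(x[i] for i in e) for e in edges)

# Hypothetical {1,3}-hypergraph on 5 vertices: singleton edges {0},...,{3} plus
# all triples inside {0,1,2,3}, so {0,1,2,3} induces the maximum clique (t = 4).
n, r = 5, 3
H1 = [0, 1, 2, 3]
H3 = [set(e) for e in combinations(range(4), 3)]
H1_bar = [i for i in range(n) if i not in H1]
H3_bar = [set(e) for e in combinations(range(n), r) if set(e) not in H3]

tau = 1.0
alpha, beta = tau * r, tau * r * (r - 1)   # smallest constants allowed here

def f(x):
    # Objective of program (5).
    return (alpha * sum(x[i] for i in H1_bar)
            + beta * lagrange(H3_bar, x)
            + tau * sum(xi**r for xi in x))

t = 4
char_vec = [1/t] * t + [0.0]
print(f(char_vec), tau * t**(1 - r))       # both 1/16
```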

Theorem 1.4

Let \(\beta ,\gamma ,\tau >0\), \(r_2>r_1\), and \(\alpha \ge \gamma r_{1}(r_{1}-1)+\tau r_{2}(r_{2}-1)\) be constants. Let \(H\) be an \(\{r_{1},r_{2}\}\)-hypergraph and let \(U\) be the vertex set of a maximum clique of \(H^{r_1}\). Then \(p_{H}=p_{H[U]}\).

The concept of Turán density of a non-uniform hypergraph \(F\) was given in [7]. For a non-uniform hypergraph \(H\) on \(n\) vertices, the Lubell function of \(H\) is defined to be

$$\begin{aligned} h_n(H)=\sum _{r\in T(H)}\frac{|E(H^r)|}{\binom{n}{r}}. \end{aligned}$$
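Computationally, the Lubell function is a normalized edge count per level; a minimal sketch (with a hypothetical \(\{1,3\}\)-hypergraph of our choosing):

```python
from math import comb

def lubell(n, edge_counts):
    """Lubell function h_n(H); edge_counts maps edge size r to |E(H^r)|."""
    return sum(m / comb(n, r) for r, m in edge_counts.items())

# Hypothetical {1,3}-hypergraph on n = 5 vertices with 4 singleton edges
# and 4 triple edges.
h5 = lubell(5, {1: 4, 3: 4})
print(h5)   # 4/5 + 4/10 = 1.2
```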

Given a hypergraph \(F\) with edge-types \(T\), the Turán density of \(F\) is defined to be

$$\begin{aligned} \pi (F)=\lim _{n\rightarrow \infty }\max \{h_n(H): |V(H)|=n, H \subseteq [n]^{T}, \mathrm{\ and\ } \\ H \mathrm{\ does \ not \ contain\ } F \ \mathrm{as \ a \ subhypergraph}\}. \end{aligned}$$

The proof of the existence of this limit can be found in [7]. Determining the Turán density of a hypergraph is in general a very challenging problem. Very few results are known; a survey on this topic can be found in [9]. Applying Theorem 1.3, we give an upper bound on the Turán densities of complete \(\{1, r\}\)-hypergraphs.

Corollary 1.5

  (a)

    Let \(\tau >0\), \(\alpha \ge \tau r\), and \(\beta \ge \tau r(r-1)\) be constants, and let \(H\) be a \(\{1,r\}\)-hypergraph on \(n\) vertices. If \(H\) does not contain a complete \(\{1,r\}\)-subhypergraph of order \(t\), then

    $$\begin{aligned} \frac{\alpha |E(H^{1})|}{n}+\frac{\beta |E(H^{r})|}{n^{r}}\le \alpha +\binom{n}{r}\frac{\beta }{n^r}-\tau \left[ (t-1)^{1-r}-n^{1-r}\right] . \end{aligned}$$
  (b)

    The Turán density of \([t]^{\{1, r\}}\) satisfies \(\pi ([t]^{\{1, r\}})\le 2-{1 \over r}(t-1)^{1-r}\).

2 Proofs of the main results

The support of a vector \(\vec {x}\in S\), denoted by \(\sigma (\vec {x})\), is the set of indices corresponding to positive components of \(\vec {x}\), i.e.,

$$\begin{aligned} \sigma (\vec {x})=\{i: x_i>0, 1\le i\le n\}. \end{aligned}$$

For \(\vec {x}\in S\) and a function \(g(\vec {x})\), let \(\partial _{j}g(\vec {x})\) denote the partial derivative of \(g(\vec {x})\) with respect to \(x_{j}\), and let \(\partial _{jl}g(\vec {x})\) denote the second-order partial derivative with respect to \(x_{j}\) and \(x_{l}\). In particular,

$$\begin{aligned} \partial _{j}p_{H}(\vec {x})&\!=\alpha \!\sum \limits _{e\in \overline{H}^{r_{1}}}\!1_{j\in e}\!\prod \limits _{i\in {e\backslash \{j\}}}\!x_{i}\!+\!\beta \!\sum \limits _{e\in \overline{H}^{r_{2}}}1_{j\in e}\!\prod \limits _{i\in {e\backslash \{j\}}}\!x_{i}\!+\!\gamma r_{1}x^{r_{1}-1}_{j}\!+\!\tau r_{2}x^{r_{2}-1}_{j}\!, \end{aligned}$$
(6)
$$\begin{aligned} \partial _{jl}p_{H}(\vec {x})&\!=1_{j\ne l}\left( \alpha \sum \limits _{e\in \overline{H}^{r_{1}}}1_{j,l\in e}\prod \limits _{i\in {e\backslash \{j,l\}}}x_{i}+\beta \sum \limits _{e\in \overline{H}^{r_{2}}}1_{j,l\in e}\prod \limits _{i\in {e\backslash \{j,l\}}}x_{i}\right) \nonumber \\&\quad \quad +\, 1_{j=l}\left[ \gamma r_{1}(r_{1}-1)x^{r_{1}-2}_{j}+\tau r_{2}(r_{2}-1)x^{r_{2}-2}_{j}\right] . \end{aligned}$$
(7)
$$\begin{aligned} \partial _{j}f(\vec {x})&\!=\alpha 1_{j\in \overline{H}^{1}}+\beta \sum \limits _{e\in \overline{H}^{r}}1_{j\in e}\prod \limits _{i\in {e\backslash \{j\}}}x_{i}+\tau rx^{r-1}_{j}, \end{aligned}$$
(8)
$$\begin{aligned} \partial _{jl}f(\vec {x})&\!=\beta 1_{j\ne l}\sum \limits _{e\in \overline{H}^{r}}1_{j,l\in e}\prod \limits _{i\in {e\backslash \{j,l\}}}x_{i}+1_{j=l}\tau r(r-1)x^{r-2}_{j}. \end{aligned}$$
(9)
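Formula (8) can be validated against a central finite difference; the sketch below (a hypothetical \(\{1,3\}\)-hypergraph with arbitrary positive test constants of our choosing) compares the two:

```python
from itertools import combinations
from math import prod

# Finite-difference check of formula (8) for the partial derivative of f.
n, r = 5, 3
H1 = [0, 1, 2, 3]
H3 = [set(e) for e in combinations(range(4), 3)]
H1_bar = [i for i in range(n) if i not in H1]
H3_bar = [set(e) for e in combinations(range(n), r) if set(e) not in H3]
alpha, beta, tau = 3.0, 6.0, 1.0           # arbitrary positive test constants

def f(x):
    return (alpha * sum(x[i] for i in H1_bar)
            + beta * sum(prod(x[i] for i in e) for e in H3_bar)
            + tau * sum(xi**r for xi in x))

def d_j_formula(x, j):
    # Formula (8): alpha*1_{j in complement of H^1} + beta * (sum over complement
    # r-edges containing j of the product of the other weights) + tau*r*x_j^(r-1).
    return (alpha * (j in H1_bar)
            + beta * sum(prod(x[i] for i in e if i != j)
                         for e in H3_bar if j in e)
            + tau * r * x[j]**(r - 1))

x, j, eps = [0.3, 0.25, 0.2, 0.15, 0.1], 2, 1e-6
x_plus  = [xi + eps * (i == j) for i, xi in enumerate(x)]
x_minus = [xi - eps * (i == j) for i, xi in enumerate(x)]
fd = (f(x_plus) - f(x_minus)) / (2 * eps)  # central difference approximation
print(fd, d_j_formula(x, j))
```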

Lemma 2.1

(KKT necessary condition, Luenberger (1984)) If a feasible weighting \(\vec {x}=(x_{1},x_{2},\ldots ,x_{n})\) is a local solution of (3), then there exists \(\theta \in \mathbb {R}\) such that for all \(j\in [n]\),

$$\begin{aligned} \partial _{j} g_{H}(\vec {x})\left\{ \begin{array}{l} = \theta ,j \in \sigma (\vec {x}), \\ \ge \theta ,j \notin \sigma (\vec {x}). \\ \end{array}\right. \end{aligned}$$
(10)

Lemma 2.2

A sufficient condition for a feasible weighting \(\vec {x}\in S\) to be a local solution of (3) is to be a KKT point and to have the Hessian matrix of \(g_{H}(\vec {x})\) in \(\vec {x}\) positive definite on the subspace \(M(\vec {x})\) defined as

$$\begin{aligned} M(\vec {x})=\left\{ \vec {\varepsilon }\in {\mathbb {R}}^{n}:\sum _{i=1}^{n}\varepsilon _{i}=0 \text{ and } \varepsilon _{j}=0 \text{ for all } j \text{ such that } \partial _{j}g_{H}(\vec {x})>\theta \right\} , \end{aligned}$$

where the Hessian matrix of \(g_{H}(\vec {x})\) in \(\vec {x}\) is defined as

$$\begin{aligned} H(\vec {x})=\left[ \partial _{jl}g_{H}(\vec {x})\right] _{j,l\in [n]}. \end{aligned}$$

In other words, if \(\vec {x}\) is a KKT point and for all \(\vec {\varepsilon }\in M(\vec {x})\backslash \{\vec {0}\},\, \vec {\varepsilon }'H(\vec {x})\vec {\varepsilon }>0\), then \(\vec {x}\) is a (strict) local solution of (3).

Certainly, Lemmas 2.1 and 2.2 also apply to (4) and (5).

2.1 Proofs of Theorem 1.3 and Corollary 1.5

Lemma 2.3

Let \(\alpha ,\beta ,\tau >0\) be constants. Let \(H\) be a \(\{1,r\}\)-hypergraph and let a feasible weighting \(\vec {x}\) be a local (global) solution of (5). If \(H[\sigma (\vec {x})]\) is a clique of \(H\), then it is a maximal (maximum) clique of \(H\) and \(\vec {x}\) is the characteristic vector of \(\sigma (\vec {x})\).

Proof

If \(\vec {x}\) is a local solution of (5), then it satisfies the KKT necessary condition in Lemma 2.1. Therefore for every \(j\in \sigma (\vec {x})\), we have that \(\theta =\partial _{j}f(\vec {x})=\tau r x_{j}^{r-1}>0\), and it follows that \(\vec {x}\) is the characteristic vector of \(\sigma (\vec {x})\). Moreover, if there existed a set \(C\) strictly containing \(\sigma (\vec {x})\) such that \(H[C]\) is a clique of \(H\), then for any \(j\in {C\backslash {\sigma (\vec {x})}}\) we would have \(\{j\}\in H^{1}\) and \(x_{j}=0\), so \(\partial _{j}f(\vec {x})=0<\theta \). This contradicts the KKT necessary condition in Lemma 2.1. Hence, \(H[\sigma (\vec {x})]\) is a maximal clique of \(H\).

If \(\vec {x}\) is the characteristic vector of a maximal clique of order \(s\), then by a direct calculation, \(f_{H}(\vec {x})=\tau s^{1-r}\), which decreases as \(s\) increases. So \(f_{H}=\tau t^{1-r}\), where \(t\) is the order of a maximum clique of \(H\). Combining this with the conclusion obtained for a local solution in the previous paragraph, we obtain the conclusion for a global solution.

Hence Lemma 2.3 holds. \(\square \)

Lemma 2.4

Let \(\vec {x}\) be a local (global) solution of (5). Suppose that both of the following conditions hold:

  1.

    \(\alpha >0\), \(\tau >0\), and \(\beta \ge \tau r(r-1)\);

  2.

    if \(r=2\) and \(\beta =2\tau \), then the support size of \(\vec {x}\) is minimum among all feasible weightings \(\vec {y}\) such that \(f(\vec {y})=f(\vec {x})\).

Then \(H^{r}[\sigma (\vec {x})]\) is a clique in \(H^{r}\).

Proof

Suppose that there exists \(\tilde{e}\in \sigma (\vec {x})^{(r)}\) such that \(\tilde{e}\notin H^{r}\). We define a new feasible weighting \(\vec {y}\) for \(H\) as follows. Let \(j,l\in \tilde{e}\) such that \(x_{j}\le x_{l}\le x_{i}\) for all \(i\in \tilde{e}\backslash {\{j,l\}}\) and take \(y_{i}=x_{i}\) for \(i\ne j,l\), \(y_{j}=x_{j}+\varepsilon \) and \(y_{l}=x_{l}-\varepsilon \), where \(0<\varepsilon \le x_{l}\). Then \(\vec {y}\) is clearly a feasible weighting for \(H\).

We study the sign of \(f(\vec {y})-f(\vec {x})\) in a neighbourhood of \(\vec {x}\) as \(\varepsilon \rightarrow 0\) by means of the Taylor expansion of \(f\). By Lemma 2.1, \(\partial _{j}f(\vec {x})=\partial _{l}f(\vec {x})\) for \(j,l\in \sigma (x)\).

$$\begin{aligned} f(\vec {y})\!-\!f(\vec {x})&=\varepsilon \partial _{j}f(\vec {x})\!-\!\varepsilon \partial _{l}f(\vec {x})\!+\!\frac{\varepsilon ^{2}}{2!}\partial _{jj}f(\vec {x})- \frac{2\varepsilon ^{2}}{2!}\partial _{jl}f(\vec {x})\!+\!\frac{\varepsilon ^{2}}{2!}\partial _{ll}f(\vec {x})\!+\!\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\frac{\varepsilon ^{2}}{2}\left[ \partial _{jj}f(\vec {x})+\partial _{ll}f(\vec {x})-2\partial _{jl}f(\vec {x})\right] +\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\frac{\varepsilon ^{2}}{2}\left[ \tau r(r-1)(x_{j}^{r-2}+x_{l}^{r-2})-2\beta \sum \limits _{e\in \overline{H}^{r}}1_{j,l\in e}\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}\right] +\mathcal {O}(\varepsilon ^{3}).\nonumber \\ \end{aligned}$$
(11)

We will distinguish 2 cases, each of which yields a contradiction, hence proving that \(H^{r}[\sigma (\vec {x})]\) is a clique in \(H^{r}\).

Case a: \(\beta >\tau r(r-1)\).

In this case, since at least \(\tilde{e}\in \overline{H}^{r}\) and \(x_{j}\le x_{l}\le x_{i}\) for all \(i\in \tilde{e}\backslash {\{j,l\}}\), (11) can be written as

$$\begin{aligned} f(\vec {y})-f(\vec {x})&\le \frac{\varepsilon ^{2}}{2}[\tau r(r-1)(x_{l}^{r-2}+x_{l}^{r-2})-2\beta x_{l}^{r-2}]+\mathcal {O}(\varepsilon ^{3})\nonumber \\&\le \varepsilon ^{2}[\tau r(r-1)-\beta ]x_{l}^{r-2}+{\mathcal {O}}(\varepsilon ^{3}). \end{aligned}$$
(12)

So \(f(\vec {y})-f(\vec {x})<0\) for sufficiently small values of \(\varepsilon \). This contradicts the assumption that \(\vec {x}\) is a local solution of \(f\).

Case b: \(\beta =\tau r(r-1)\).

Then

$$\begin{aligned} f(\vec {y})\!-\!f(\vec {x})&= \frac{\varepsilon ^{2}}{2}\left[ \tau r(r\!-\!1)\left( x_{j}^{r-2}\!+\!x_{l}^{r-2}\right) \!-\!2\beta \sum \limits _{e\in \overline{H}^{r}}1_{j,l\in e}\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}\right] \!+\!\mathcal {O}(\varepsilon ^{3})\nonumber \\&= \frac{\varepsilon ^{2}}{2}\tau r(r-1)\left[ \left( x_{j}^{r-2}+x_{l}^{r-2}\right) -2\sum \limits _{e\in \overline{H}^{r}}1_{j,l\in e}\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}\right] +\mathcal {O}(\varepsilon ^{3}).\nonumber \\ \end{aligned}$$
(13)

Let \(\mu =2\sum \limits _{e\in \overline{H}^{r}}1_{j,l\in e}\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}-(x_{j}^{r-2}+x_{l}^{r-2})\). It is obvious that \(\mu \ge 0\) since at least \(\tilde{e}\in {\overline{H}^{r}}\) and \(x_{j}\le x_{l}\le x_{i}\) for all \(i\in {\tilde{e}\backslash {\{j,l\}}}\).

Case b1: \(\mu >0\).

In this case, \(f(\vec {y})-f(\vec {x})<0\) for sufficiently small values of \(\varepsilon \). But \(\vec {x}\) is a local solution of \(f\), a contradiction.

Case b2: \(\mu =0\) and \(r=2\).

Since \(r=2\), the \({\mathcal {O}}(\varepsilon ^{3})\) term is absent from (13), so \(f(\vec {y})-f(\vec {x})=0\). Taking \(\varepsilon =x_{l}\), we get another local solution \(\vec {y}\) with smaller support size. This contradicts the minimality of the support size of \(\vec {x}\).

Case b3: \(\mu =0\) and \(r\ge 3\).

If \(\mu =0\), then \(\tilde{e}\) is the only edge in \(\overline{H}^{r}\cap \sigma (\vec {x})^{(r)}\) that contains both \(j\) and \(l\), and \(x_{i}=x_{j}\) for all \(i\in \tilde{e}\). Set \(x_{i}=\xi \) for all \(i\in \tilde{e}\).

We define a new feasible weighting \(\vec {z}\) as follows. Let \(m\in {\tilde{e}\backslash {\{j,l\}}}\) and take \(z_{i}=x_{i}\) for \(i\ne j,l,m,\,z_{j}=x_{j}+\frac{\varepsilon }{2}\), \(z_{l}=x_{l}+\frac{\varepsilon }{2}\) and \(z_{m}=x_{m}-\varepsilon \) where \(0<\varepsilon \le x_{m}\). We study the sign of \(f(\vec {z})-f(\vec {x})\) in a neighbourhood of \(\vec {x}\) as \(\varepsilon \rightarrow 0\) by means of the Taylor expansion of \(f\). By Lemma 2.1 \(\partial _{j}f(\vec {x})=\partial _{l}f(\vec {x})=\partial _{m}f(\vec {x})\) for \(j,l,m\in \sigma (x)\).

$$\begin{aligned} f(\vec {z})\!-\!f(\vec {x})\!\!&=\!\!\frac{\varepsilon ^{2}}{2}\left[ \frac{\partial _{jj}f(\vec {x})\!+\!\partial _{ll}f(\vec {x})}{4}\!+\!\partial _{mm}f(\vec {x})\!-\!\partial _{jm}f(\vec {x})\!-\! \partial _{lm}f(\vec {x})\!+\!\frac{\partial _{jl}f(\vec {x})}{2}\right] \\&\quad +\, \frac{\varepsilon ^{3}}{6}\left[ \frac{\partial _{jjj}f(\vec {x})+\partial _{lll}f(\vec {x})}{8}-\partial _{mmm}f(\vec {x})- \frac{3\partial _{jlm}f(\vec {x})}{2}\right] +\mathcal {O}(\varepsilon ^{4}), \end{aligned}$$

where \(\partial _{jlm}\) denotes the partial derivative of \(f\) with respect to \(x_{j}\), \(x_{l}\), and \(x_{m}\), i.e.

$$\begin{aligned} \partial _{jlm}f(\vec {x})=1_{j\ne l}1_{l\ne m}1_{j\ne m}\beta \sum \limits _{e\in \overline{H}^{r}}1_{j,l,m\in e}\prod \limits _{i\in {e\backslash {\{j,l,m\}}}}x_{i}+1_{j=l=m}\beta (r-2)x_{j}^{r-3}. \end{aligned}$$

Recall that \(x_{j}=\xi \) for all \(j\in \tilde{e}\). Then \(\forall j,l\in \tilde{e}\), \(\partial _{jj}f(\vec {x})=\tau r(r-1)\xi ^{r-2}=\beta \xi ^{r-2}\) and \(\partial _{jl}f(\vec {x})=\beta \xi ^{r-2}\), and \(\forall j,l,m\in \tilde{e}\), \(\partial _{jjj}f(\vec {x})=\tau r(r-1)(r-2)\xi ^{r-3}=\beta (r-2)\xi ^{r-3}\) and \(\partial _{jlm}f(\vec {x})=\beta \xi ^{r-3}\). Substituting these values, the second-order term of the expansion vanishes, and the sign of \(f(\vec {z})-f(\vec {x})\) for sufficiently small values of \(\varepsilon \) is given by the sign of \(-\frac{r}{8}\beta \varepsilon ^{3}\xi ^{r-3}\), which is clearly negative; this contradicts the local minimality of \(\vec {x}\).

Hence Lemma 2.4 holds. \(\square \)

Lemma 2.5

Let \(\beta >0\), \(\tau >0\), and \(\alpha \ge \tau r\) be constants. If \(\vec {x}\) is a local (global) solution of (5), then \(\sigma (\vec {x})^{(1)}\subseteq H^{1}\).

Proof

If \(\sigma (\vec {x})^{(1)}\nsubseteq H^{1}\), then there are two possible cases to consider.

Case 1: \(\exists j,l\in \sigma (\vec {x})\) such that \(j\in H^{1}\) but \(l\notin H^{1}\).

Since \(\vec {x}\) is a local solution of (5), it satisfies the KKT necessary condition in Lemma 2.1. Therefore for all \(j,l\in \sigma (\vec {x})\), we have \(\theta =\partial _{j}f(\vec {x})=\partial _{l}f(\vec {x})\). But \(\partial _{j}f(\vec {x})=\tau rx_{j}^{r-1}\) and \(\partial _{l}f(\vec {x})=\alpha +\tau rx_{l}^{r-1}\), so \(\tau rx_{j}^{r-1}=\alpha +\tau rx_{l}^{r-1}\), i.e. \(\tau r(x_{j}^{r-1}-x_{l}^{r-1})=\alpha \). Then \(x_{j}^{r-1}=\frac{\alpha }{\tau r}+x_{l}^{r-1}>1\), since \(\alpha \ge \tau r\) and \(0<x_{i}<1\) for all \(i\in \sigma (\vec {x})\). This is a contradiction.

Case 2: \(\forall j\in \sigma (x)\), \(j\notin H^{1}\).

For \(j\in \sigma (\vec {x})\) with \(j\in \overline{H}^{1}\), \(\theta =\partial _{j}f(\vec {x})=\alpha +\tau rx_{j}^{r-1}>\alpha \). Let \(l\notin \sigma (\vec {x})\); then \(\partial _{l}f(\vec {x})=\alpha <\theta \), which contradicts the KKT condition.

Hence Lemma 2.5 holds. \(\square \)

Applying Lemmas 2.4 and 2.5, we obtain the following claim:

Claim 2.6

  (a)

    Let \(\tau >0\), \(\alpha \ge \tau r\), and \(\beta \ge \tau r(r-1)\) (with strict inequality if \(r=2\)) be constants. If \(\vec {x}\) is a local (global) solution of (5), then \(H[\sigma (\vec {x})]\) is a clique of \(H\).

  (b)

    For \(r=2\), \(\alpha >0\), \(\tau >0\), and \(\beta =2\tau \), if \(\vec {x}\) is a local (global) solution of (5) such that the support size of \(\vec {x}\) is minimum among all feasible weightings \(\vec {y}\) such that \(f(\vec {y})=f(\vec {x})\), then \(H[\sigma (\vec {x})]\) is a clique of \(H\).

By Lemma 2.3 and Claim 2.6, we can get the following claim:

Claim 2.7

  (a)

    Let \(\tau >0\), \(\alpha \ge \tau r\), and \(\beta \ge \tau r(r-1)\) (with strict inequality if \(r=2\)) be constants. If \(\vec {x}\) is a local (global) solution of (5), then \(H[\sigma (\vec {x})]\) is a maximal (maximum) clique of \(H\) and \(\vec {x}\) is the characteristic vector of \(\sigma (\vec {x})\).

  (b)

    For \(r=2\), \(\alpha >0\), \(\tau >0\), and \(\beta =2\tau \), if \(\vec {x}\) is a local (global) solution of (5) such that the support size of \(\vec {x}\) is minimum among all feasible weightings \(\vec {y}\) such that \(f(\vec {y})=f(\vec {x})\), then \(H[\sigma (\vec {x})]\) is a maximal (maximum) clique of \(H\) and \(\vec {x}\) is the characteristic vector of \(\sigma (\vec {x})\).

Lemma 2.8

Let \(\tau >0\), \(\alpha \ge \tau r\), and \(\beta \ge \tau r(r-1)\) be constants. Let \(H\) be a \(\{1,r\}\)-hypergraph. If \(\vec {x}\) is the characteristic vector of a maximal (maximum) clique \(C\) of \(H\), then \(\vec {x}\) is a strict local (global) solution of (5).

Proof

We will show that \(\vec {x}\) is a strict local solution of (5) by showing that it satisfies the sufficient conditions in Lemma 2.2. First we prove that \(\vec {x}\) satisfies the KKT necessary condition in Lemma 2.1. For all \(j\in \sigma (\vec {x})\), we have \(\theta =\partial _{j}f(\vec {x})=\tau rx_{j}^{r-1}=\tau r|V(C)|^{1-r}\), since \(\vec {x}\) is the characteristic vector of \(C\). For all \(j\notin \sigma (\vec {x})\), since \(C\) is a maximal clique of \(H\), either \(\{j\}\notin H^{1}\), in which case \(\partial _{j}f(\vec {x})\ge \alpha \ge \tau r\ge \theta \), or there exists at least one edge in \(\overline{H}^{r}\) joining \(j\) and \(r-1\) vertices in \(C\), so \(\partial _{j}f(\vec {x})\ge \beta |V(C)|^{1-r}\ge \tau r(r-1)|V(C)|^{1-r}=(r-1)\theta \ge \theta \). Hence \(\vec {x}\) is a KKT point.

Next we show that the Hessian \(H(\vec {x})\) is positive definite on the subspace \(M(\vec {x})\). The restriction \(H(\vec {x})|_{\sigma (\vec {x})}\) is a diagonal matrix with positive diagonal entries,

$$\begin{aligned} H(\vec {x})|_{\sigma (\vec {x})}=\tau r(r-1)|V(C)|^{2-r}I , \end{aligned}$$

where \(I\) is the identity matrix. So all eigenvalues of \(H(\vec {x})|_{\sigma (\vec {x})}\) are positive. This implies that \(H(\vec {x})\) is positive definite on the subspace \(M(\vec {x})\).

By a direct calculation, \(f(\vec {x})=\tau |V(C)|^{1-r}\) attains its global minimum when \(\vert C \vert \) is as large as possible, i.e., a maximum clique. \(\square \)
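The diagonal form of the restricted Hessian can be checked numerically: by formula (9), at the characteristic vector of a clique of order \(t\), the diagonal entries on the support are \(\tau r(r-1)(1/t)^{r-2}\). A sketch with a hypothetical \(\{1,3\}\)-hypergraph of our choosing:

```python
from itertools import combinations
from math import prod

# Second-difference check of a diagonal Hessian entry of f at the
# characteristic vector of a clique of order t = 4 (hypothetical example).
n, r = 5, 3
H3 = [set(e) for e in combinations(range(4), 3)]
H3_bar = [set(e) for e in combinations(range(n), r) if set(e) not in H3]
H1_bar = [4]                    # only vertex 4 misses a singleton edge
alpha, beta, tau = 3.0, 6.0, 1.0

def f(x):
    return (alpha * sum(x[i] for i in H1_bar)
            + beta * sum(prod(x[i] for i in e) for e in H3_bar)
            + tau * sum(xi**r for xi in x))

t = 4
x = [1/t] * t + [0.0]           # characteristic vector of the clique {0,1,2,3}
j, eps = 0, 1e-4
x_plus  = [xi + eps * (i == j) for i, xi in enumerate(x)]
x_minus = [xi - eps * (i == j) for i, xi in enumerate(x)]
d2 = (f(x_plus) - 2 * f(x) + f(x_minus)) / eps**2   # approximates d^2 f / dx_j^2
print(d2, tau * r * (r - 1) * t**(2 - r))           # both approximately 1.5
```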

Proof of Theorem 1.3

The first part follows from Claim 2.7 and Lemma 2.8. For the second part, let \(\vec {x}\) be a global solution of (5) with minimum support size. Then by Claim 2.7, \(\vec {x}\) is the characteristic vector of a maximum clique of \(H\). By a direct calculation, \(f_H=f(\vec {x})=\tau t^{1-r}\). \(\square \)

Proof of Corollary 1.5

(a) Since \(H\) does not contain a clique of order \(t\), the order of its maximum cliques is at most \(t-1\). By Theorem 1.3, \(f(\vec {x})\ge f_{H}\ge \tau (t-1)^{1-r}\) for every feasible weighting \(\vec {x}\). In particular, taking \(\vec {x}=(\frac{1}{n},\frac{1}{n},\ldots ,\frac{1}{n})\) gives \(f((\frac{1}{n},\frac{1}{n},\ldots ,\frac{1}{n}))\ge \tau (t-1)^{1-r}\). By a direct calculation, we obtain that

$$\begin{aligned} \frac{\alpha |E(H^{1})|}{n}+\frac{\beta |E(H^{r})|}{n^{r}}\le \alpha +\binom{n}{r}\frac{\beta }{n^r}-\tau \left[ (t-1)^{1-r}-n^{1-r}\right] . \end{aligned}$$

(b) Taking \(\alpha =1,\,\beta =r!\), and \(\tau ={1 \over r}\), we obtain that

$$\begin{aligned} \pi ([t]^{\{1,r\}})&=\lim _{n\rightarrow \infty }\left( \frac{|E(H^{1})|}{\binom{n}{1}}+\frac{|E(H^{r})|}{\binom{n}{r}}\right) \\&=\lim _{n\rightarrow \infty }\left( \frac{ |E(H^{1})|}{n}+\frac{ r!|E(H^{r})|}{n^{r}}\right) \\&\le \lim _{n\rightarrow \infty }\left\{ 1+\binom{n}{r}\frac{r!}{n^r}-{1 \over r}\left[ (t-1)^{1-r}-n^{1-r}\right] \right\} \\&=2-{1 \over r}(t-1)^{1-r}. \end{aligned}$$

\(\square \)

2.2 Proof of Theorem 1.4

Lemma 2.9

Let \(H\) be an \(\{r_{1},r_{2}\}\)-hypergraph. If \(\vec {x}\) is a local (global) solution of (4), then \(H^{r_{1}}[\sigma (\vec {x})]\) is a clique of \(H^{r_{1}}\), provided that \(\beta ,\gamma ,\tau >0\) and \(\alpha \ge \gamma r_{1}(r_{1}-1)+\tau r_{2}(r_{2}-1)\).

Proof

Suppose that there exists \(\tilde{e}\in \sigma (x)^{(r_{1})}\) such that \(\tilde{e}\notin H^{r_{1}}\). We define a new feasible weighting \(\vec {y}\) for \(H\) as follows. Let \(j,l\in \tilde{e}\) such that \(x_{j}\le x_{l}\le x_{i}\) for all \(i\in \tilde{e}\backslash {\{j,l\}}\) and take \(y_{i}=x_{i}\) for \(i\ne j,l\), \(y_{j}=x_{j}+\varepsilon \) and \(y_{l}=x_{l}-\varepsilon \), where \(0<\varepsilon \le x_{l}\). Then \(\vec {y}\) is clearly a feasible weighting for \(H\).

We study the sign of \(p_{H}(\vec {y})-p_{H}(\vec {x})\) in a neighbourhood of \(\vec {x}\) as \(\varepsilon \rightarrow 0\) by means of the Taylor expansion of \(p_{H}\). By Lemma 2.1, \(\partial _{j}p_{H}(\vec {x})=\partial _{l}p_{H}(\vec {x})\) for \(j,l\in \sigma (x)\).

$$\begin{aligned} p_{H}(\vec {y})-p_{H}(\vec {x})&=\varepsilon \partial _{j}p_{H}(\vec {x})-\varepsilon \partial _{l}p_{H}(\vec {x})+\frac{\varepsilon ^{2}}{2!}\partial _{jj}p_{H}(\vec {x})\nonumber \\&\quad -\, \frac{2\varepsilon ^{2}}{2!}\partial _{jl}p_{H}(\vec {x})+\frac{\varepsilon ^{2}}{2!}\partial _{ll}p_{H}(\vec {x})+\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\frac{\varepsilon ^{2}}{2}\left[ \partial _{jj}p_{H}(\vec {x})+\partial _{ll}p_{H}(\vec {x})-2\partial _{jl}p_{H}(\vec {x})\right] +\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\!\frac{\varepsilon ^{2}}{2}\Bigg [\!\gamma r_{1}(r_{1}\!-\!1)\left( x_{j}^{r_{1}-2}\!+\!x_{l}^{r_{1}-2}\right) \!+\!\tau r_{2}(r_{2}\!-\!1)\left( x_{j}^{r_{2}-2}\!+\!x_{l}^{r_{2}-2}\right) \nonumber \\&\quad -\, \!2\left( \alpha \!\sum \limits _{e\in \overline{H}^{r_{1}}}1_{j,l\in e}\!\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}\!+\!\beta \!\sum \limits _{e\in \overline{H}^{r_{2}}}1_{j,l\in e}\!\prod \limits _{i\in {e\backslash {\{j,l\}}}}x_{i}\right) \!\Bigg ]\!+\!\mathcal {O}(\varepsilon ^{3}).\nonumber \\ \end{aligned}$$
(14)

Since \(x_{j}\le x_{l}\le x_{i}\) for all \(i\in \tilde{e}\backslash {\{j,l\}}\) and \(r_{1}<r_{2}\), we can estimate (14) as

$$\begin{aligned} p_{H}(\vec {y})\!-\!p_{H}(\vec {x})\!&< \!\frac{\varepsilon ^{2}}{2}\left[ 2\gamma r_{1}(r_{1}\!-\!1)x_{l}^{r_{1}-2}\!+\!2\tau r_{2}(r_{2}\!-\!1)x_{l}^{r_{2}-2}\!-\!2\alpha x_{l}^{r_{1}-2}\right] \!+\!\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\varepsilon ^{2}\left[ \gamma r_{1}(r_{1}-1)x_{l}^{r_{1}-2}+\tau r_{2}(r_{2}-1)x_{l}^{r_{1}-2}-\alpha x_{l}^{r_{1}-2}\right] \nonumber \\&\quad +\, \varepsilon ^{2}\tau r_{2}(r_{2}-1)\left( x_{l}^{r_{2}-2}-x_{l}^{r_{1}-2}\right) +\mathcal {O}(\varepsilon ^{3})\nonumber \\&=\varepsilon ^{2}\left[ \gamma r_{1}(r_{1}-1)+\tau r_{2}(r_{2}-1)-\alpha \right] x_{l}^{r_{1}-2}\nonumber \\&\quad +\, \varepsilon ^{2}\tau r_{2}(r_{2}-1)\left( x_{l}^{r_{2}-2}-x_{l}^{r_{1}-2}\right) +\mathcal {O}(\varepsilon ^{3}). \end{aligned}$$
(15)

The value of \(p_{H}(\vec {y})-p_{H}(\vec {x})\) is negative for small enough \(\varepsilon \), since \(r_{2}>r_{1}\) and \(\alpha \ge \gamma r_{1}(r_{1}-1)+\tau r_{2}(r_{2}-1)\). This contradicts the assumption that \(\vec {x}\) is a local solution of (4). Hence Lemma 2.9 holds. \(\square \)

Proof of Theorem 1.4

Clearly, \(p_{H}\le p_{H[U]}\). Let \(\vec {x}\) be a global solution of (4). By Lemma 2.9, \(\sigma (\vec {x})^{(r_{1})}\subseteq H^{r_{1}}\), so \(\sigma (\vec {x})\subseteq U\) since \(U\) is the vertex set of a maximum clique of \(H^{r_{1}}\). So we obtain that \(p_{H}=p_{H[\sigma (\vec {x})]}\ge p_{H[U]}\). Hence \(p_{H}=p_{H[U]}\). \(\square \)

3 Remarks

For \(\{r_1, r_2\}\)-hypergraphs, we are not able to obtain a result similar to Theorem 1.3. The obstruction is that we cannot verify that the vertices corresponding to the support of a solution (with the minimum number of positive weights) induce a complete \(r_2\)-subhypergraph in the \(r_2\)-level hypergraph. We have no evidence that this fails, but we are unable to confirm it by the current method. A result similar to Theorem 1.4 can be obtained for \(\{r_1, r_2, \ldots , r_l\}\)-hypergraphs.