1 Introduction

Our primary aim in this note is to demonstrate that most of the fundamental algebraic properties of Ramanujan sums can be deduced using the theory of supercharacters, recently developed by Diaconis-Isaacs and André. In fact, the machinery of supercharacter theory frequently yields one-line proofs of many difficult identities and provides an array of new tools that can be used to derive various novel formulas. Our approach is entirely systematic, relying on a flexible and general framework. Indeed, we hope to convince the reader that supercharacter theory provides a natural framework for the study of Ramanujan sums. In addition to exhibiting a novel application of supercharacter theory, this article also serves as a blueprint for future work since some of the abstract results we develop are applicable in much greater generality (see [10] for recent developments).

1.1 Ramanujan sums

In what follows, we let e(x)=exp(2πix), so that the function e(x) is periodic with period 1. For integers n,x with n≥1, the expression

$$ c_n(x) = \sum _{ \substack{ j = 1 \\ (j,n) = 1} }^n e \biggl( \frac {jx}{n} \biggr) $$
(1.1)

is called a Ramanujan sum (or sometimes Ramanujan’s sum). Ramanujan himself (1918) [53, Paper 21] noted that Dirichlet and Dedekind had already considered such expressions in their famed text Vorlesungen über Zahlentheorie (1863). Moreover, certain related identities were already known to von Sterneck (1902) [63], Kluyver (1906) [36], Landau (1909) [39], and Jensen (1915) [28]. Nevertheless, “Ramanujan was the first to appreciate the importance of the sum and to use it systematically,” according to G.H. Hardy [19, p. 159].

Ramanujan’s interest in the sums (1.1) originated in his desire to “obtain expressions for a variety of well-known arithmetical functions of n in the form of a series ∑ s a s c s (n).” This particular analytic aspect of the subject has flourished in the intervening years and is discussed at length in [41, 43, 56, 57]. On the other hand, in classical character theory Ramanujan sums can be used to establish the integrality of the character values for the symmetric group [27, Cor. 22.17]. However, perhaps the most famous appearance of Ramanujan sums is their crucial role in Vinogradov’s proof that every sufficiently large odd number is the sum of three primes [46, Chap. 8].

In more recent years, Ramanujan sums have appeared in the study of Waring-type formulas [37], the distribution of rational numbers in short intervals [33], equirepartition modulo odd integers [8], the large sieve inequality [54], graph theory [16], symmetry classes of tensors [59], combinatorics [55], cyclotomic polynomials [17, 44, 47, 62], and Mahler matrices [40]. In physics, Ramanujan sums have applications in the processing of low-frequency noise [49] and of long-period sequences [48] and in the study of quantum phase locking [50]. We should also remark that various generalizations of the classical Ramanujan sum (1.1) have arisen over the years [2, 11, 12, 58] and that Ramanujan sums involving matrix variables have also been considered [45, 52].

1.2 Supercharacters

The theory of supercharacters, of which classical character theory is a special case, was recently introduced by Diaconis and Isaacs (2008) [14] to generalize the basic characters of André [35]. Here we summarize a few important facts. Further details can be found in [14, 23].

Definition

(Diaconis–Isaacs [14])

Let G be a finite group, let \(\mathcal{K}\) be a partition of G, and let \(\mathcal{X}\) be a partition of the set \(\operatorname{Irr}(G)\) of irreducible characters of G. We call the ordered pair \((\mathcal {X}, \mathcal{K})\) a supercharacter theory if

  1. (1)

    \(\{1\} \in\mathcal{K}\);

  2. (2)

    \(|\mathcal{X}| = |\mathcal{K}|\);

  3. (3)

    for each \(X \in\mathcal{X}\), the character

    $$ \sigma_X = \sum_{\chi\in X} \chi(1)\chi $$

    is constant on each \(K \in\mathcal{K}\).

The characters σ X are called supercharacters, and the elements K of \(\mathcal{K}\) are called superclasses.

As [14, Lemma 2.1] shows, the preceding definition is equivalent to the following.

Definition

(André [6])

Let G be a finite group, let \(\mathcal{K}\) be a partition of G, and let \(\mathcal{X}\) be a collection of complex characters of G. We call the ordered pair \((\mathcal{X}, \mathcal{K})\) a supercharacter theory if

  1. (1)

    every irreducible character of G is a constituent of a unique \(\chi\in\mathcal{X}\);

  2. (2)

    \(|\mathcal{X}| = |\mathcal{K}|\);

  3. (3)

    each character \(\chi\in\mathcal{X}\) is constant on K for each \(K \in\mathcal{K}\).

The elements χ of \(\mathcal{X}\) are called supercharacters, and the sets K are called superclasses.

Regardless of which definition one chooses to work with, it is straightforward to verify that each K in \(\mathcal{K}\) is a union of conjugacy classes of G and that each of the partitions \(\mathcal{K}\) and \(\mathcal{X}\) determines the other. The only significant difference between these two definitions is that the second approach can yield supercharacters that are multiples of the σ X defined above.

In the literature to date, the main use of supercharacter theory has been to perform computations when a complete character theory is difficult or impossible to determine. For instance, André developed a successful supercharacter theory for the unipotent matrix groups U n (q) whose representation theories are known to be wild (see also [64, 65]). Supercharacter theories have also proven to be relevant outside the realm of finite group theory. For instance, these notions can be used to obtain a more general theory of spherical functions and Gelfand pairs [14]. In a different direction, recent work has revealed deep connections between supercharacter theory and the Hopf algebra of symmetric functions of noncommuting variables [1]. Another application may be found in [7], where the authors use supercharacter theory to study random walks on upper triangular matrices. Other recent work on supercharacters concerns connections with Schur rings [23, 25] and with their combinatorial properties [15, 60, 61]. We should also remark that similar constructions surfaced independently in the study of quasigroups and association schemes in the form of fusions of character tables [25, 29, 32].

1.3 General approach

We proceed along a different course, turning our attention to the group \(\mathbb{Z}/n\mathbb{Z}\), whose classical representation theory is already well understood. It turns out that a natural supercharacter theory for \(\mathbb{Z}/n\mathbb{Z}\) can be developed for which Ramanujan sums appear as values of the corresponding supercharacters. In this manner, the modern machinery of supercharacter theory can be used to generate a wide variety of formulas and identities for Ramanujan sums. Along the way, we also develop a notion of superclass arithmetic, which generalizes the standard arithmetic of conjugacy classes from classical character theory.

2 A supercharacter theory for \(\mathbb{Z}/n\mathbb{Z}\)

In this section we introduce a supercharacter theory for \(\mathbb {Z}/n\mathbb{Z}\), which arises naturally from the action of \(\operatorname{Aut}(\mathbb{Z}/n\mathbb{Z})\) on \(\mathbb{Z}/n\mathbb{Z}\) (see [9] for a description of all possible supercharacter theories on \(\mathbb{Z}/n\mathbb{Z}\)). Before proceeding, we require a few preliminaries. Although some of this material is well known in certain circles, we include the details since the theory of supercharacters is not yet common knowledge among the general mathematics community. Moreover, for many of the upcoming applications, we require a particular unitary rescaling of our supercharacter tables which is not widely used.

2.1 Supercharacter tables

Suppose that G is a finite group of order |G| and that \((\mathcal{X},\mathcal{K})\) is a supercharacter theory for G. In other words, suppose that we have a partition \(\mathcal{X}= \{X_{1},X_{2},\ldots,X_{r}\}\) of \(\operatorname{Irr}(G)\) with corresponding supercharacters

$$ \sigma_i = \sum_{\chi\in X_i} \chi(1) \chi $$
(2.1)

and a compatible partition \(\mathcal{K}= \{K_{1},K_{2},\ldots,K_{r}\}\) of G into superclasses. The supercharacter table for G corresponding to \((\mathcal{X},\mathcal{K})\) is the r×r array

(2.2)

whose (i,j) entry is σ i (K j ). We let

$$ S = \bigl( \sigma_i(K_j) \bigr)_{i,j=1}^r $$

denote the r×r matrix that encodes the data in (2.2). In what follows we frequently identify supercharacter tables with their matrix representations, and we often refer to the matrix S itself as a supercharacter table.

Recall that a function \(f:G\to\mathbb{C}\) is called a class function if f is constant on each conjugacy class of G. The space of complex-valued class functions on G is endowed with the natural inner product

$$ \bigl\langle\chi,\chi' \bigr \rangle = \frac{1}{|G|} \sum_{g \in G} \chi(g) \overline{ \chi'(g)}, $$
(2.3)

with respect to which the irreducible characters of G form an orthonormal basis. In light of the fact that supercharacters are constant on superclasses (i.e., they are superclass functions), (2.3) implies that

$$ \langle\sigma_i,\sigma_j \rangle = \frac{1}{|G|} \sum_{\ell=1}^r |K_{\ell}| \sigma_i(K_{\ell})\overline{\sigma_j(K_{\ell})}. $$

It follows from (2.1) and the orthogonality of irreducible characters that

$$ \langle\sigma_i, \sigma_j \rangle = \biggl\langle\sum _{\chi\in X_i} \chi(1) \chi, \sum_{\chi' \in X_j} \chi'(1) \chi' \biggr\rangle = \delta_{i,j} \|X_i\|_2^2, $$

where

$$ \|X_i\|_2 = \sqrt{\sum_{\chi\in X_i} \bigl| \chi(1)\bigr|^2} $$

is a convenient shorthand. Putting this all together, we see that

$$ \frac{1}{|G|}\sum_{\ell=1}^r |K_{\ell}| \sigma_i(K_{\ell})\overline{ \sigma_j(K_{\ell})} = \delta_{i,j} \|X_i\|_2^2. $$
(2.4)

It is helpful to interpret the preceding result matricially. Letting

$$ R = \operatorname{diag} \bigl( \sqrt{ |K_1|}, \sqrt{ |K_2|},\ldots, \sqrt{ |K_r| } \bigr), $$
(2.5)

we see that (2.4) is equivalent to asserting that

$$ (SR) (SR)^* = |G| \operatorname{diag} \bigl( \|X_1 \|_2^2, \|X_2\|_2^2, \ldots, \|X_r\|_2^2 \bigr). $$

Letting

$$ L = \frac{1}{\sqrt{|G|}} \operatorname {diag} \bigl( \|X_1\|_2^{-1}, \|X_2\|_2^{-1}, \ldots, \|X_r\|_2^{-1} \bigr), $$
(2.6)

we conclude that the matrix

$$ U = L S R $$
(2.7)

satisfies UU =I. In other words, the r×r matrix

$$ U = \frac{1}{\sqrt{|G|}} \biggl[ \frac {\sigma_i(K_j)\sqrt{ |K_j| }}{ \|X_i\|_2} \biggr]_{i,j=1}^r $$
(2.8)

is unitary. Since U U=I, we now obtain the column orthogonality relation

$$ \frac{\sqrt{|K_i||K_j|}}{|G|} \sum _{\ell=1}^r \frac{\sigma_{\ell}(K_i)\overline{\sigma_{\ell }(K_j)}}{\|X_{\ell}\|_2^2} = \delta_{i,j}. $$
(2.9)

Similarly, we see that the equation UU =I encodes (2.4).

2.2 A supercharacter table for \(\mathbb{Z}/n\mathbb{Z}\)

As remarked in [14], each subgroup H of \(\operatorname{Aut}(G)\) determines a corresponding supercharacter theory \((\mathcal{X}_{H},\mathcal{K}_{H})\) for G. To be more specific, H induces a permutation of \(\operatorname{Irr}(G)\) while also permuting the conjugacy classes of G. By Brauer’s lemma [26, Thm. 6.32, Cor. 6.33], the number of H-orbits of \(\operatorname{Irr}(G)\) equals the number of H-orbits induced on the set of conjugacy classes of G. This decomposition yields a supercharacter theory for G where the elements X i of \(\mathcal{X}_{H} = \{X_{1},X_{2},\ldots,X_{r}\}\) are H-orbits in \(\operatorname{Irr}(G)\) and the superclasses K i of \(\mathcal{K}_{H} = \{K_{1},K_{2},\ldots,K_{r}\}\) are unions of H-orbits of conjugacy classes of G. By construction, each supercharacter (2.1) is constant on each member of \(\mathcal{K}_{H}\).

Fix a positive integer n and let τ(n) denote the number of divisors of n. Let d 1,d 2,…,d τ(n) denote the divisors of n, the exact order being unimportant for our purposes at the moment. Recall that the irreducible characters of \(\mathbb{Z}/n\mathbb{Z}\) are precisely the functions

$$ \chi_a(x) = e \biggl(\frac{ax}{n} \biggr) $$

for a=1,2,…,n and that each automorphism of \(\mathbb{Z}/n\mathbb{Z}\) is of the form

$$ \psi_u(a) = ua $$

for some u in \((\mathbb{Z}/n\mathbb{Z})^{\times}\). In particular, we note that \(\operatorname{Aut}(\mathbb{Z}/n\mathbb{Z}) \cong (\mathbb{Z}/n\mathbb{Z})^{\times}\).

Next we observe that there exists a u in \((\mathbb{Z}/n\mathbb {Z})^{\times}\) such that ψ u (a)=b if and only if (a,n)=(b,n). In light of the fact that χ a ψ u =χ au , it is clear that the action of \(\operatorname{Aut}(\mathbb{Z}/n\mathbb{Z})\) on \(\operatorname{Irr}(\mathbb{Z}/n\mathbb{Z})\) partitions the irreducible characters into a disjoint collection \(\mathcal{X}= \{X_{1},X_{2},\ldots,X_{\tau (n)}\}\) of orbits

$$ X_i = \biggl\{ \chi_a : (a,n) = \frac{n}{d_i}\biggr\}, $$

each of which satisfies

$$ |X_i| = \phi(d_i), $$
(2.10)

where ϕ denotes the Euler totient function. Thus,

$$ \sigma_i(x) = \sum_{\chi\in X_i} \chi(x) = \sum_{\substack{j=1 \\ (j,n)=\frac{n}{d_i}}}^n e \biggl( \frac{jx}{n} \biggr) = \sum_{\substack{k=1\\(k,d_i)=1}}^{d_i} e \biggl( \frac{kx}{d_i} \biggr) = c_{d_i}(x), $$
(2.11)

each of which is a Ramanujan sum. On the other hand, the action of \(\operatorname{Aut}(\mathbb {Z}/n\mathbb{Z})\) on \(\mathbb{Z}/n\mathbb{Z}\) results in a partition \(\mathcal{K}= \{K_{1},K_{2},\ldots,K_{\tau(n)}\}\) of \(\mathbb{Z}/n\mathbb{Z}\) into disjoint orbits

$$ K_j = \biggl\{ a \in\mathbb{Z}/n\mathbb{Z}: (a,n)= \frac{n}{d_j} \biggr\}, $$
(2.12)

each of which satisfies

$$ |K_j| = \phi(d_j). $$
(2.13)

Since each conjugacy class of \(\mathbb{Z}/n\mathbb{Z}\) is a singleton, it is clear that the proposed superclass (2.12) is the union of conjugacy classes.

Let us pause briefly to note that since c n (x) is a superclass function with respect to the variable x, we obtain the following useful fact:

$$ c_n \bigl( (x,n) \bigr) = c_n(x) . $$
(2.14)

In other words, c n (x) is an even function modulo n [43, p. 79], [57, p. 15].

Putting this all together, we obtain the τ(nτ(n) supercharacter table S(n)

whose (i,j) entry is given by

$$ \bigl[S(n) \bigr]_{i,j} = c_{d_i} \biggl(\frac{n}{d_j} \biggr). $$
(2.15)

When there is no chance of confusion, we simply write S=S(n). We leave the particular order in which the divisors d 1,d 2,…,d τ(n) of n are listed unspecified, noting only that any pair of orderings lead to two matrices that are similar via a suitable permutation matrix. Such relationships between matrices will arise frequently in what follows, and we therefore introduce the following notation. We shall write AB whenever A and B are square matrices such that B=P −1 AP for some permutation matrix P. Although we use the same symbol to denote the isomorphism of groups, the meaning should be clear from context.

Before moving on, let us recall that pre- and post-multiplying S by the diagonal matrices L, given by (2.6), and R, given by (2.5), yield the unitary matrix U=LSR. In light of (2.10) and (2.13), it turns out that \(L = (\sqrt{n} R)^{-1}\), whence

$$ \sqrt{n} U = R^{-1}SR. $$
(2.16)

In other words, the supercharacter table S(n) is similar to a multiple of the unitary matrix U=U(n) given by

$$ \frac{1}{\sqrt{n}} \begin{bmatrix} c_{d_1}( \frac{n}{d_1}) & c_{d_1}( \frac{n}{d_2}) \sqrt{ \frac{\phi (d_2)}{\phi(d_1)} }& \cdots& c_{d_1}( \frac{n}{d_{\tau(n)}}) \sqrt { \frac{\phi(d_{\tau(n)})}{\phi(d_1)} } \\ c_{d_2}( \frac{n}{d_1}) \sqrt{ \frac{\phi(d_1)}{\phi(d_2)} } & c_{d_2}( \frac{n}{d_2}) & \cdots& c_{d_2}( \frac{n}{d_{\tau(n)}}) \sqrt{ \frac{\phi(d_{\tau(n)})}{\phi(d_2)} }\\ \vdots& \vdots& \ddots& \vdots\\ c_{d_{\tau(n)}}( \frac{n}{d_1}) \sqrt{ \frac{\phi(d_1)}{\phi (d_{\tau(n)})} }& c_{d_{\tau(n)}}( \frac{n}{d_2}) \sqrt{ \frac {\phi(d_2)}{\phi(d_{\tau(n)})} }& \cdots& c_{d_{\tau(n)}}( \frac {n}{d_{\tau(n)}}) \\ \end{bmatrix}. $$
(2.17)

Example 1

If p is a prime number, then the two divisors d 1=1 and d 2=p of p lead to the corresponding superclasses K 1={p} and K 2={1,2,…,p−1} of \(\mathbb{Z}/ p\mathbb{Z}\). A short computation now reveals that

$$ S(p) = \begin{bmatrix} 1&1\\p-1&-1 \end{bmatrix} ,\qquad U(p) = \frac{1}{\sqrt{p}} \begin{bmatrix}1&\sqrt{p-1} \\\sqrt{p-1} &-1 \end{bmatrix} . $$
(2.18)

In particular, observe that U(p) is a selfadjoint unitary involution which has only real entries. It turns out, as we shall see, that this is true for general U(n).

2.3 Orthogonality relations

Using the row and column orthogonality relations (2.4) and (2.9), we immediately obtain

$$ \sum_{k|n} \phi(k) c_{d_i} \biggl(\frac{n}{k} \biggr) c_{d_j} \biggl( \frac{n}{k} \biggr) = \begin{cases} 0 & \text{if $i \neq j$},\\ n\phi(d_i) & \text{if $i =j$}, \end{cases} $$
(2.19)

and

$$ \sum_{k|n} \frac{1}{\phi(k)} c_k \biggl( \frac{n}{d_i} \biggr)c_k \biggl( \frac{n}{d_j} \biggr)= \begin{cases} 0 & \text{if $i \neq j$},\\ \frac{n}{\phi(d_i)} & \text{if $i =j$}, \end{cases} $$
(2.20)

respectively. The first is a well-known identity [57, Thm. 3.1.e, p. 16], and the second is somewhat lesser known [43, Ex. 2.22]. If d|n, then letting d i =d and d j =1 in (2.19), we obtain [43, Ex. 2.24]

$$ \sum_{k|n} \phi(k) c_{d} \biggl(\frac{n}{k} \biggr) = \begin{cases} 0 & \text{if $d \neq1$},\\ n & \text{if $d = 1$}. \end{cases} $$

3 Multiplicativity and Kronecker products

In this section, we consider product supercharacter theories and their ramifications for the study of Ramanujan sums. In particular, it turns out that many of the peculiar multiplicative properties of Ramanujan sums can be easily derived by examining Kronecker products of supercharacter tables. In addition to providing simple proofs of many standard identities, our techniques will ultimately permit the derivation of many novel identities as well (e.g., the bizarre determinantal formula (3.23) and the power sum identities of Sect. 4.5).

3.1 Prime powers

In the following, we let p denote a fixed prime number. For each α≥1, let us identify the supercharacter table S=S(p α) that arises from the action of \(\operatorname{Aut}(\mathbb{Z}/ p^{\alpha}\mathbb{Z}) \cong(\mathbb{Z}/ p^{\alpha} \mathbb{Z})^{\times}\) on the group \(\mathbb{Z}/p^{\alpha}\mathbb{Z}\). Before proceeding, it is helpful to note that

$$ c_{p^m}(x) = \begin{cases} p^{m-1}(p-1) & \text{if $p^{m} | x$},\\ -p^{m-1} & \text{if $p^{m} \nmid x$ but $p^{m-1} | x$},\\ 0 & \text{otherwise}, \end{cases} $$
(3.1)

which can be computed easily from the definition (1.1) and the formula for the sum of a finite geometric series. Among other things, we note that

$$ c_{p^m}(1) = \mu \bigl(p^m \bigr),\qquad c_{p^m} \bigl(p^m \bigr) = \phi \bigl(p^m \bigr), $$
(3.2)

where μ(n) denotes the Möbius μ-function

$$ \mu(n) = \begin{cases} 1 & \text{if $n=1$},\\ 0 & \text{if $n$ is not square-free},\\ (-1)^{\omega} & \text{if $n$ is the product of $\omega$ distinct primes}. \end{cases} $$
(3.3)

Since the divisors of p α are precisely the numbers d i =p i−1 for i=1,2,…,α+1, (2.15) and (3.1) tell us that the (i,j) entry of the (α+1)×(α+1) matrix S=S(p α) is given by

$$ \bigl[S \bigl(p^{\alpha} \bigr) \bigr]_{i,j} = c_{p^{i-1}} \bigl( p^{\alpha-j+1} \bigr) = \begin{cases} 1 & \text{if $i =1$},\\ p^{i-2}(p-1) & \text{if $i+j \leq\alpha+2$},\\ -p^{i-2} & \text{if $i+j = \alpha+3$},\\ 0 & \text{if $i+j > \alpha+3$}. \end{cases} $$
(3.4)

For instance, the supercharacter tables S(p 2) and S(p 3) are given by

(3.5)

respectively. In general, S(p α−1) appears as the upper-right hand corner of S(p α), as illustrated in (3.5). Let us also note, for future reference, that

$$ \operatorname{tr}S \bigl(p^{\alpha} \bigr) = \begin{cases} p^{\frac{\alpha}{2}} & \text{if $\alpha$ is even},\\ 0 & \text{if $\alpha$ is odd}, \end{cases} $$
(3.6)

follows from (3.4) and a telescoping series argument.

For some purposes, it is more fruitful to consider the associated unitary matrix U=U(p α), defined by (2.17), in place of S(p α) itself. Setting n=p α, d i =p i−1, and d j =p j−1 in (2.17), we find that

$$ \bigl[U \bigl(p^{\alpha} \bigr) \bigr]_{i,j} = \frac{ c_{p^{i-1}} ( p^{\alpha-j+1}) \sqrt{ \phi(p^{j-1})} }{p^{\alpha/2}\sqrt{ \phi(p^{i-1})}} . $$
(3.7)

A few simple computations reveal that

$$ \bigl[U \bigl(p^{\alpha} \bigr) \bigr]_{i,j} = \begin{cases} p^{-\frac{\alpha}{2}} & \text{if $i = j = 1$},\\ p^{\frac{j-\alpha-2}{2}}\sqrt{p-1} & \text{if $i =1$ and $j > 1$},\\ p^{\frac{i-\alpha-2}{2}}\sqrt{p-1} & \text{if $j =1$ and $i > 1$},\\ (p-1)p^{\frac{i+j-\alpha-4}{2}} & \text{if $3\leq i+j \leq\alpha +2$},\\ -\frac{1}{\sqrt{p}} & \text{if $3 \leq i+j = \alpha+3$},\\ 0 & \text{if $i+j > \alpha+3$}. \end{cases} $$
(3.8)

Despite its somewhat imposing appearance, the preceding expression tells us that the (α+1)×(α+1) unitary matrix U is selfadjoint and that its lower right α×α submatrix is a Hankel matrix. This is illustrated in the following example.

Example 2

Since U=U(n) is unitary and selfadjoint, it follows that

$$ U^2 = I. $$
(3.9)

Therefore the only possible eigenvalues of U are ±1. The exact multiplicities of these eigenvalues can be determined using (2.16), which asserts that p α/2 U is similar to S. It follows from (3.6) that

$$ \operatorname{tr}U = \begin{cases} 1 & \text{if $\alpha$ is even},\\ 0 & \text{if $\alpha$ is odd}. \end{cases} $$
(3.10)

Since U is (α+1)×(α+1), it follows that the eigenvalues of U are −1 (multiplicity \(\frac{\alpha}{2}\)) and 1 (multiplicity \(\frac{\alpha}{2}+1\)) if α is even; and ±1 (both with multiplicity \(\frac{\alpha+1}{2}\)) if α is odd. Since

$$ \biggl\lfloor\frac{\alpha+1}{2} \biggr\rfloor= \begin{cases} \frac{\alpha}{2} & \text{if $\alpha$ is even},\\ \frac{\alpha+1}{2} & \text{if $\alpha$ is odd}, \end{cases} $$

we conclude that

$$ \det U \bigl(p^{\alpha} \bigr) = (-1)^{ \lfloor\frac {\alpha+1}{2} \rfloor}. $$
(3.11)

We will make use of this formula later on.

3.2 Kronecker products

Recall that the Kronecker product AB of an m×n matrix A and a p×q matrix B is the mp×nq matrix given by

$$ A \otimes B = \begin{bmatrix} a_{11} B & \cdots& a_{1n}B \\ \vdots& \ddots& \vdots \\ a_{m1} B & \cdots& a_{mn} B \end{bmatrix} . $$

Whenever the dimensions of the matrices involved are compatible, we have ABBA and

$$ (A\otimes B) (C\otimes D) \cong \mathit{AC} \otimes \mathit{BD}, $$

where, as briefly mentioned in Sect. 2.2, ≅ denotes similarity via a permutation matrix. Finally, we also recall that if A is m×m and B is n×n, then

$$ \det(A \otimes B) = (\det A)^n(\det B)^m $$
(3.12)

and

$$ \operatorname{tr}(A \otimes B) = (\operatorname{tr}A) (\operatorname{tr}B). $$
(3.13)

3.3 Products of supercharacter theories

Given two supercharacter theories \((\mathcal{X}_{1},\mathcal{K}_{1})\) and \((\mathcal{X}_{2},\mathcal{K}_{2})\) on two finite groups G 1 and G 2, one can construct a natural product supercharacter theory on G 1×G 2. Writing

$$ \mathcal{K}_1 = \bigl\{ K_1^{(1)}, K_2^{(1)}, \ldots,K_r^{(1)} \bigr\}, \qquad\mathcal{K}_2 = \bigl\{ K_1^{(2)}, K_2^{(2)}, \ldots,K_s^{(2)} \bigr\}, $$

and

$$ \mathcal{X}_1 = \bigl\{ X_1^{(1)}, X_2^{(1)}, \ldots,X_r^{(1)} \bigr\}, \qquad\mathcal{X}_2 = \bigl\{ X_1^{(2)}, X_2^{(2)}, \ldots,X_s^{(2)} \bigr\}, $$

we first define

$$ \mathcal{K}= \mathcal{K}_1 \times \mathcal{K}_2. $$
(3.14)

On the other hand, since [26, Thm. 4.21] tells us that

$$ \operatorname{Irr}(G_1\times G_2) = \operatorname{Irr}(G_1) \times \operatorname{Irr} (G_2), $$

it is natural for us to define

$$ \mathcal{X}= \mathcal{X}_1 \times \mathcal{X}_2. $$
(3.15)

A straightforward computation now shows that

$$ \sigma_{X_i^{(1)} \times X_j^{(2)}} \bigl( (g_1,g_2) \bigr) = \sigma_{X_i^{(1)}}(g_1) \sigma_{X_j^{(2)}}(g_2) $$
(3.16)

whenever g 1 and g 2 belong to G 1 and G 2, respectively. In particular, this implies that \(\sigma_{X_{1} \times X_{2}}\) is constant on each element of \(\mathcal{K}\). Since \(|\mathcal{X}| = |\mathcal{X}_{1}||\mathcal{X}_{2}| = |\mathcal {K}_{1}||\mathcal{K}_{2}| = |\mathcal{K}|\), we conclude that \((\mathcal{X},\mathcal{K})\) is a supercharacter theory on G 1×G 2.

Putting this all together, (3.16) tells us that if S 1 and S 2 are the matrices that encode the supercharacter tables corresponding to the supercharacter theories \((\mathcal{X}_{1},\mathcal{K}_{1})\) and \((\mathcal{X}_{2},\mathcal{K}_{2})\) on G 1 and G 2, respectively, then the Kronecker product S 1S 2 encodes the product supercharacter theory \((\mathcal{X},\mathcal{K})\) on G 1×G 2. To be more specific, we list the elements of \(\mathcal{K}\) and \(\mathcal{X}\) in their respective lexicographic orders induced by the product structures (3.14) and (3.15). In light of (3.16), we see that the resulting supercharacter table S for the product theory \((\mathcal{X},\mathcal{K})\) on G 1×G 2 satisfies SS 1S 2.

The details of the preceding construction were worked out by A.O.F. Hendrickson, a student of Isaacs, in his doctoral thesis [22, Sect. 2.6]. We refer the reader there and to his recent paper [23] for further information.

3.4 Multiplicativity

Recall that if G 1 and G 2 are finite groups, then

$$ \operatorname{Aut}(G_1) \times\operatorname{Aut}(G_2) \subseteq \operatorname{Aut}(G_1 \times G_2), $$

although equality does not hold in general. We are interested here in the special case where \(G_{1} = \mathbb{Z}/ m \mathbb{Z}\), \(G_{2} = \mathbb{Z}/ n \mathbb{Z}\), and (m,n)=1. In this setting,

$$ (\mathbb{Z}/ m \mathbb{Z}) \times(\mathbb{Z}/ n \mathbb{Z}) \cong \mathbb{Z}/ mn \mathbb{Z}, $$
(3.17)

so that if we indulge in a slight abuse of language, we obtain

$$ \operatorname{Aut}( \mathbb{Z}/ m \mathbb{Z}) \times\operatorname {Aut}( \mathbb{Z}/ n \mathbb{Z}) \subseteq\operatorname{Aut}(\mathbb {Z}/ mn \mathbb{Z}) $$

or, equivalently,

$$ (\mathbb{Z}/ m\mathbb{Z})^{\times} \times(\mathbb{Z}/n\mathbb {Z})^{\times} \subseteq(\mathbb{Z}/ mn \mathbb{Z})^{\times}. $$

Since the orders of the preceding groups are ϕ(m), ϕ(n), and ϕ(mn), respectively, it follows from the multiplicativity of the Euler totient function that

$$ \operatorname{Aut}( \mathbb{Z}/ m \mathbb{Z}) \times\operatorname {Aut}( \mathbb{Z}/ n \mathbb{Z}) \cong\operatorname{Aut}(\mathbb {Z}/ mn \mathbb{Z}). $$

Thus the product supercharacter theory for \(\mathbb{Z}/mn \mathbb {Z}\), obtained from the supercharacter theories for \(\mathbb{Z}/m\mathbb{Z}\) and \(\mathbb{Z}/n\mathbb{Z}\) induced by \(\operatorname{Aut}(\mathbb{Z}/m\mathbb{Z})\) and \(\operatorname {Aut}(\mathbb{Z}/n\mathbb{Z})\), respectively, is the same supercharacter theory for \(\mathbb{Z}/mn\mathbb{Z}\) that arises from the action of \(\operatorname{Aut}(\mathbb{Z}/mn\mathbb{Z})\). In other words,

$$ S(mn) \cong S(m) \otimes S(n) $$
(3.18)

whenever (m,n)=1. In particular, it follows from (3.16) and the Chinese Remainder Theorem that

$$ c_{mn} \bigl(dd' \bigr) = c_m(d) c_n \bigl(d' \bigr) $$
(3.19)

whenever d and d′ are positive divisors of m and n, respectively. Indeed, first let \(G_{1} = \mathbb{Z}/m\mathbb{Z}\) and \(G_{2} = \mathbb {Z}/n\mathbb{Z}\), with

$$ K_{\tau(m)}^1 = \bigl\{ a \in\mathbb{Z}/m\mathbb{Z}: (a,m) = 1 \bigr\},\qquad K_{\tau(n)}^2 = \bigl\{ b \in\mathbb{Z}/n \mathbb{Z}: (a,n) = 1 \bigr\}, $$

and observe that the map \(\varPhi: \mathbb{Z}/m\mathbb{Z}\times \mathbb{Z}/n\mathbb{Z}\to\mathbb{Z}/mn\mathbb{Z}\) defined by \(\varPhi( (a,b) ) = ab\, (\operatorname{mod} mn)\) is an isomorphism. In particular, this implies that

$$ \varPhi \bigl( K_{\tau(m)}^1 \times K_{\tau(n)}^2 \bigr) = \bigl\{ c \in\mathbb{Z}/mn\mathbb{Z}: (c,mn) = 1 \bigr\} $$

and that

$$ \varPhi \bigl( \bigl(d,d' \bigr) \bigr) = dd'\,( \operatorname{mod} mn) $$

whenever d|m and d′|n. Putting this all together and using (3.16) leads to the desired formula (3.19).

Now recall that when we originally defined the supercharacter table S(n) for \(\mathbb{Z}/n\mathbb{Z}\) (see Sect. 2.2), we were not particular about the manner in which the divisors of n were listed. The reason for this lack of specificity is due to the fact that even though the Kronecker product S(m)⊗S(n) represents a supercharacter table for \(\mathbb{Z}/ mn \mathbb{Z}\) arising from the action of \(\operatorname{Aut}(\mathbb{Z}/ mn\mathbb{Z})\), the ordering of the superclasses and supercharacters in the product table might differ from what one might consider a “natural” ordering (e.g., the ordering induced by listing the divisors of mn in increasing or decreasing order). However, this poses no difficulty in practice since much of our work will involve similarity invariants of matrices.

Example 3

For m=4 and n=5 and using the ordered divisor lists {1,2,4} and {1,5}, respectively, from (3.4) we obtain

$$ S(4) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 2 & -2 & 0 \end{bmatrix} ,\qquad S(5) = \begin{bmatrix}1&1\\4&-1 \end{bmatrix} . $$

Using the ordered divisor list {1,2,4,5,10,20} for mn=20 and computing S(20) directly from (2.15) and the definition of Ramanujan sums yields

In the preceding we have partitioned the matrix S(20) for clarity.

An important consequence of (3.19) is the fact that Ramanujan sums c n (x) are multiplicative with respect to the subscript n.

Theorem 3.1

If (m,n)=1, then

$$ c_{mn}(x) = c_m(x) c_n(x) $$

for all x in \(\mathbb{Z}\).

Proof

Since (m,n)=1, it follows from (3.19) that

$$ c_{mn} \bigl( (x,mn) \bigr) = c_{mn} \bigl( (x,m) (x,n) \bigr) = c_m \bigl( (x,m) \bigr) c_n \bigl( (x,n) \bigr), $$

whence c mn (x)=c m (x)c n (x) by (2.14). □

Corollary 3.2

For n≥1, we have

$$ c_n(1) = \mu(n),\qquad c_n(n) = \phi(n). $$

Furthermore, c n (x) is always an integer.

Proof

The boxed statements follow from (3.2) and Theorem 3.1. The integrality of c n (x) follows from Theorem 3.1 and the explicit values given in (3.4). □

The proof of Theorem 3.1 suggests an interesting variant [43, Ex. 2.2, p. 89]:

Theorem 3.3

If (mx,ny)=1, then

$$ c_{mn}(xy) = c_m(x) c_n(y). $$
(3.20)

Proof

Since (m,n)=(x,n)=(y,m)=1, it follows from (3.19) that

$$ c_{mn} \bigl( (xy,mn) \bigr) = c_{mn} \bigl( (x,m) (y,n) \bigr) = c_m \bigl( (x,m) \bigr)c_n \bigl( (y,n) \bigr), $$

whence c mn (xy)=c m (x)c n (y) by (2.14). □

Corollary 3.4

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r and \(d = p_{1}^{\beta_{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) is a divisor of n, then

$$ c_n(d) = \prod_{\ell=1}^r c_{p_{\ell}^{\alpha_{\ell}}} \bigl(p_{\ell}^{\beta_{\ell}} \bigr). $$

3.5 Piecing things together

The following useful result permits us to deduce a variety of results about Ramanujan sums by piecing together our observations from Sect. 3.1. In the present setting, recall that product supercharacter theories correspond to Kronecker products of supercharacter tables.

Theorem 3.5

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r , then

$$ S(n) \cong\bigotimes_{i=1}^{r} S \bigl(p_i^{\alpha_i} \bigr), \qquad U(n) \cong\bigotimes _{i=1}^{r} U \bigl(p_i^{\alpha_i} \bigr). $$
(3.21)

In particular, U(n) is a selfadjoint, unitary involution whose (i,j) entry is given by

$$ \bigl[U(n) \bigr]_{i,j} = \frac{1}{\sqrt{n}}c_{d_i} \biggl( \frac{n}{d_j} \biggr) \sqrt{ \frac{\phi(d_j)}{\phi(d_i)} }, $$
(3.22)

where d 1,d 2,…,d τ(n) are the positive divisors of n. We also have S(n)2=nI.

Proof

The first matrix identity in (3.21) follows immediately from (3.18). The second identity in (3.21) follows from the first identity, the multiplicativity of (3.7), and the basic properties of the Kronecker product, along with (2.16), (2.5), and (2.13). The fact that U(n) is a selfadjoint involution follows from the fact that each \(U(p_{i}^{\alpha_{i}})\) is a selfadjoint involution. Formula (3.22) is a simple consequence of (3.7) and the multiplicativity of the Euler totient function. Finally, we note that (2.16) now implies that S(n)2=nI. □

As a trivial consequence of Theorem 3.5, we obtain [43, Ex. 2.10]:

Corollary 3.6

For n≥1, we have

$$ \sum_{d|n} c_d \biggl( \frac{n}{d} \biggr) = \begin{cases} 0 & \text{\textit{if} $n$ \textit{is not a perfect square}}, \\ \sqrt{n} & \text{\textit{if} $n$ \textit{is a perfect square}}. \end{cases} $$

Proof

Writing \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) and applying (3.21), we find that

$$ \sum_{d|n} c_d \biggl( \frac{n}{d} \biggr) = \operatorname{tr}S(n) = \operatorname {tr} \Biggl( \bigotimes_{i=1}^{r} S \bigl(p_i^{\alpha_i} \bigr) \Biggr) = \prod _{i=1}^{r} \operatorname{tr}S \bigl(p_i^{\alpha_i}\bigr). $$

The result now follows immediately from (3.6). □

The following corollary of Theorem 3.5 appears to be novel, as we were unable to find it in our extensive search of the literature. In particular, although the magnitude of the following determinant is possible to conjecture based on numerical evidence, the sign of the determinant is determined by a rather complicated formula that seems difficult to arrive at using other means.

Corollary 3.7

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r , then

$$ \det \biggl[ c_{d_i} \biggl( \frac{n}{d_j} \biggr) \biggr]_{i,j=1}^{\tau(n)} = n^{\frac{\tau(n)}{2}} (-1)^{\textstyle\sum_{i=1}^r\lfloor\frac {\alpha_i+1}{2} \rfloor \frac{\tau(n)}{\alpha_i+1}}, $$
(3.23)

where τ(n) denotes the number of positive divisors d 1,d 2,…,d τ(n) of n.

Proof

Simply observe that

 □

3.6 Reciprocity and von Sterneck’s formula

Our next result requires no proof, for it follows immediately from (2.17) and the fact that U=U T.

Theorem 3.8

(Reciprocity formula)

If d and dare positive divisors of n, then

$$ c_{d} \biggl( \frac{n}{d'} \biggr) \phi \bigl(d' \bigr) = c_{d'} \biggl( \frac{n}{d} \biggr) \phi(d). $$
(3.24)

Although we have been unable to find (3.24) in the literature, given the long and storied history of the Ramanujan sum, it is certainly possible that we are not the first to have discovered it. In fact, various other reciprocity formulas have also been discussed [30, 31]. For our purposes, the importance of (3.24) lies in the fact that it provides a one-line proof of the following important formula [20, Thm. 272], [43, Cor. 2.4], [57, p. 40].

Corollary 3.9

(von Sterneck’s formula)

For \(n,x \in\mathbb{Z}\) with n≥1, we have

$$ c_n(x) = \frac{ \mu( \frac {n}{(n,x)} ) \phi(n) }{ \phi( \frac{n}{(n,x)} )}. $$
(3.25)

Proof

Let d′=n and d=n/(n,x) in (3.24) and use (2.14). □

Let us make a few historical remarks concerning the preceding formula. The peculiar arithmetic function on the right-hand side of (3.25) is sometimes called von Sterneck’s function. It was first studied by von Sterneck (1902) [63], independently of Ramanujan sums, which first rose to prominence with Ramanujan’s seminal paper (1918) [53, Paper 21]. It has frequently been claimed that the fact that von Sterneck’s function equals c n (x) was first observed by Hölder (1936) [24]. However, Peter van der Kamp was kind enough to inform us that Kluyver (1906) [36, p. 410] had already discovered the equality (3.25) some thirty years before Hölder’s paper appeared.

Before proceeding, let us also remark that Corollary 3.2 follows immediately from von Sterneck’s formula by setting x=1 and x=n, respectively, in (3.25).

Corollary 3.10

If n≥1, then

$$ \sum_{d|n,d'|n} c_d \biggl( \frac{n}{d'} \biggr)c_{d'} \biggl( \frac{n}{d} \biggr) =n \tau(n). $$

Proof

Letting I denote the τ(nτ(n) identity matrix, it follows from (3.22) that

$$ \tau(n) = \operatorname{tr}I = \operatorname{tr}U^*U = \sum_{i,j=1}^{\tau(n)} \bigl( \bigl[ U(n) \bigr]_{i,j} \bigr)^2 = \sum_{d|n,d'|n} \frac{1}{n} \biggl(c_d \biggl( \frac{n}{d'} \biggr) \biggr)^2 \frac{ \phi(d')}{\phi(d)}. $$

The desired formula now follows from (3.24). □

3.7 A mixed orthogonality relation

An immediate consequence of Theorem 3.5 is the following orthogonality relation in which the parameters d and d′ play quite different roles. The text [43, Thm. 2.8] devotes almost two pages to its proof.

Theorem 3.11

(Mixed orthogonality relation)

If n≥1, d|n, and d′|n, then

$$ \frac{1}{n} \sum_{k|n} c_{d} \biggl( \frac{n}{k} \biggr) c_{k} \biggl( \frac{n}{d'} \biggr) = \delta_{d,d'}. $$
(3.26)

Proof

Compute the (i,j) entry in the equation S(n)2=nI. □

Corollary 3.12

If x is an integer, then

$$ \sum_{k|n} c_k(x) = \begin{cases} n &\text{\textit{if}\ $n|x$},\\ 0 & \text{\textit{if}\ $n\nmid x$}. \end{cases} $$
(3.27)

Proof

Set d=1 and d′=n/(x,n) in (3.26) to obtain \(\sum_{k|n} c_{k} ( (x,n) ) = n \delta_{ \frac{n}{(x,n)},1}\). Now apply (2.14) to obtain (3.27). □

Corollary 3.13

If n≥1 and d|n, then

$$ \sum_{k|n} c_d \biggl( \frac{n}{k} \biggr) \mu(k) = \begin{cases} n & \text{\textit{if} $d = n$},\\ 0 & \text{\textit{if} $d \neq n$}. \end{cases} $$

Proof

Let d′=n in (3.26) and use Corollary 3.2. □

One says that a function \(f:\mathbb{Z}\to\mathbb{C}\) is even modulo n if

$$ f \bigl( (n,x) \bigr) = f(x) $$

for all integers x [43, p. 79], [57, p. 15]. As we noted in (2.14), since c n (x) is a superclass function with respect to the variable x, it follows that c n (x) is even modulo n. We remark that the following theorem [43, Thm. 2.9] has a trivial proof based upon supercharacter theory. In contrast, the standard proof requires several pages of straightforward but tedious manipulations.

Theorem 3.14

If \(f:\mathbb{Z}\to\mathbb{C}\) is an even function modulo n, then f can be written uniquely in the form

$$ f(x) = \sum_{d|n} \alpha(d) c_d(x), $$
(3.28)

where the coefficients α(d) are given by

$$ \alpha(d) = \frac{1}{n} \sum_{k|n} f \biggl( \frac{n}{k} \biggr) c_{k} \biggl( \frac{n}{d} \biggr). $$

Proof

Since the Ramanujan sums \(c_{d_{i}}(x) = \sigma_{i}(x)\) form a basis for the space of all superclass functions by [14, Thm. 2.2], a unique expansion of the form (3.28) exists. The formula for the coefficients follows immediately from (3.26). □

As Hardy observes in the notes to [19, Paper 21], the following important result was first obtained by Kluyver (1906) [36, p. 410]. It also appears in the more recent texts [43, Prop. 2.1], [46, Thm. A.24], and [57, Thm. 3.1b].

Theorem 3.15

(Kluyver)

If \(n,x \in\mathbb{Z}\) and n≥1, then

$$ c_n(x) = \sum _{ d | (n,x) } \mu \biggl( \frac{n}{d} \biggr)d. $$
(3.29)

Proof

Applying the Möbius inversion formula to (3.27), it follows that

$$ c_n(x) = \sum_{d|n} \mu \biggl( \frac{n}{d} \biggr) \sum_{k|d} c_k(x) = \sum_{d|n, d|x} \mu \biggl( \frac{n}{d} \biggr) d = \sum_{ d | (n,x) } \mu \biggl( \frac{n}{d} \biggr)d. $$

 □

Corollary 3.16

For all m,n≥1, the inequality

$$ \bigl|c_n(m)\bigr| \leq\sigma(m) $$

holds. Here σ(m) denotes the sum of the divisors of m.

4 Superclass arithmetic

In this section, we explore several further properties of Ramanujan sums that can be deduced by studying superclass arithmetic on \(\mathbb{Z}/n\mathbb{Z}\). Although this approach has been attempted sporadically throughout the years [21, 34, 35, 42, 51], these earlier authors did not have the benefit of the general theory of supercharacters.

We state the following preparatory lemmas in full generality, noting that they apply to any finite group G (we require only the case \(G = \mathbb{Z}/n\mathbb{Z}\)). In fact, a more primitive approach was recently undertaken in [18], where Kloosterman sums were considered in the context of classical character theory.

4.1 Superclass constants and simultaneous diagonalization

Suppose that G is a finite group with supercharacter theory \((\mathcal{X}_{H},\mathcal{K}_{H})\) generated by the action of some subgroup H of \(\operatorname{Aut}(G)\) (see Sect. 2.2). In particular, we obtain from the action of H on G a partition \(\mathcal{X}= \{X_{1},X_{2},\ldots,X_{r}\}\) of \(\operatorname{Irr}(G)\) with corresponding supercharacters (2.1) and a compatible partition \(\mathcal{K}= \{ K_{1},K_{2},\ldots,K_{r}\}\) of G into superclasses. We also note that the superclass sums

$$ \hat{K}_i = \sum_{g \in K_i} g $$

belong to the center \({\bf Z}(\mathbb{C}[G])\) of the group algebra \(\mathbb{C}[G]\). Indeed, each K i is a union of conjugacy classes of G and it is well known that the corresponding class sums each belong to \({\bf Z}(\mathbb{C}[G])\) [26, Thm. 2.4].

Although the following lemma is a special case of [14, Cor. 2.3], we provide a brief explanation for the sake of completeness. Since we are, for the moment, working under the assumption that G is an arbitrary finite group, we write the group operation multiplicatively. When we later apply the following two results to \(\mathbb{Z}/n\mathbb{Z}\), the group operation will be addition modulo n.

Lemma 4.1

Fix some element z in K k and let a i,j,k denote the number of solutions (x i ,y j )∈K i ×K j to the equation xy=z. The superclass constants a i,j,k satisfy

$$ \hat{K}_i \hat{K}_j = \sum_{k=1}^r a_{i,j,k} \hat{K}_k $$
(4.1)

for 1≤i,j,kr.

Proof

It suffices to prove that a i,j,k does not depend upon the particular chosen representative z of K k . Suppose that z 1 and z 2 belong to K k . By the definition of the supercharacter theory \((\mathcal{X}_{H}, \mathcal{K}_{H})\), there exists an automorphism φ:GG belonging to H and an element g of G such that

$$ \varphi(z_1) = \varphi(g)^{-1}z_2\varphi(g). $$

If (x 1,y 1)∈K i ×K j is a solution to the equation xy=z 1, then

$$ \varphi(x_1)\varphi(y_1) =\varphi(z_1)= \varphi(g)^{-1} z_2 \varphi(g), $$

from which it follows that the elements x 2=φ(gx 1 g −1) of K i and y 2=φ(gy 1 g −1) of K j satisfy x 2 y 2=z 2. Thus there is a bijection between solutions (x 1,y 1)∈K i ×K j of xy=z 1 and solutions (x 2,y 2)∈K i ×K j of xy=z 2. □

The following theorem is partly inspired by the corresponding result from classical character theory [13, Sect. 33], [18, Lem. 3.1], [38, Lem. 4]. Since we require a supercharacter version of this result, we provide a detailed proof. As with the preceding lemma, we work with an arbitrary finite group G, maintaining the notation and conventions established at the beginning of this subsection.

Theorem 4.2

Let \(M_{i} = (a_{i,j,k})_{j,k=1}^{r}\). If \(W = (w_{j,k})_{j,k=1}^{r}\) denotes the r×r matrix with entries

$$ w_{j,k} = \frac{|K_j|\sigma_k(K_j)}{ \sum_{\chi\in X_k} \chi(1)}, $$
(4.2)

and \(D_{i} = \operatorname{diag}(w_{i,1}, w_{i,2}, \ldots, w_{i,r})\), then W is invertible and

$$ M_i W = W D_i $$
(4.3)

for i=1,2,…,r. In particular, the matrices M 1,M 2,…,M r are simultaneously diagonalizable and commute with each other.

Proof

Applying σ k to the superclass sum \(\hat{K}_{j}\), we first note that

$$ \sigma_k(\hat{K}_j) = |K_j| \sigma_k(K_j) $$
(4.4)

since σ k is a superclass function that assumes the constant value σ k (K j ) on the superclass K j . Next let π k be the matrix representation of G given by

$$ \pi_k = \bigoplus_{\chi\in X_k} \chi(1) \pi_{\chi}, $$
(4.5)

where π χ is an irreducible matrix representation whose character χ belongs to X k . Since each \(\hat{K}_{j}\) belongs to \({\bf Z}(\mathbb{C}[G])\) and each π χ is irreducible, there exist constants \(w_{j,k}^{\chi}\) such that

$$ \pi_{\chi}(\hat{K}_j) = \frac{w_{j,k}^{\chi }}{\chi(1)} I_{\chi(1)}, $$
(4.6)

where I χ(1) denotes the χ(1)×χ(1) identity matrix. Taking the trace of both sides of the preceding equation, we find that

$$ \chi(\hat{K}_j) = w_{j,k}^{\chi}. $$
(4.7)

We now claim that \(w_{j,k}^{\chi}\) is independent of which particular irreducible character χ in X k is chosen. Indeed, if χ and χ′ belong to X k , then there exists an automorphism φ:GG belonging to H such that χ=χ′∘φ. Therefore,

$$ \chi(\hat{K}_j)= \sum_{g \in K_j} \chi(g) = \sum_{g\in K_j} \chi' \bigl(\varphi(g) \bigr) = \sum_{h \in\varphi(K_j)} \chi'(h) =\sum _{h \in K_j} \chi'(h) = \chi'(\hat{K}_j) $$

since φ permutes the conjugacy classes that constitute the superclass K j . In the following, we now write w j,k in place of \(w_{j,k}^{\chi}\).

Substituting (4.6) into (4.5), we find that

$$ \pi_k(\hat{K}_j) =\bigoplus _{\chi\in X_k} w_{j,k} I_{\chi(1)}. $$
(4.8)

Taking the trace of the preceding yields

$$ \operatorname{tr}\pi_k(\hat{K}_j) = \sum _{\chi\in X_k} \chi(1) w_{j,k} = \sum _{\chi\in X_k} \chi(1)\chi(\hat{K}_j) = \sigma_k(\hat{K}_j) $$

by the definition (2.1) of the supercharacter σ k . Returning to (4.4) and using the preceding, we find that

$$ |K_j| \sigma_k(K_j) = \sigma_k(\hat{K}_j) = w_{j,k} \sum _{\chi\in X_k} \chi(1), $$

from which we obtain the desired formula (4.2) for w j,k .

We now need to verify that the simultaneous diagonalization (4.3) holds. Applying π to (4.1) and using (4.8), we see that

$$ \biggl(\bigoplus_{\chi\in X_{\ell}} w_{i,\ell}I_{\chi(1)} \biggr) \biggl( \bigoplus_{\chi\in X_{\ell}} w_{j,\ell}I_{\chi(1)} \biggr) = \sum_{k=1}^r a_{i,j,k} \biggl( \bigoplus_{\chi\in X_{\ell}} w_{k,\ell}I_{\chi(1)} \biggr). $$

Considering the direct summand corresponding to an arbitrary χ in X , we see that

$$ w_{i,\ell} I_{\chi(1)} w_{j,\ell} I_{\chi(1)} =\sum _{k=1}^r a_{i,j,k} w_{k,\ell}I_{\chi(1)}, $$

which in turn implies that

$$ \sum_{k=1}^r a_{i,j,k} w_{k,\ell} = w_{j,\ell} w_{i,\ell}. $$

To conclude the proof, we observe that the preceding equation is simply the (j,) entry of the matrix equation (4.3). □

4.2 The matrices M i (p α)

We are now ready to discuss the matrices M i =M i (n), as defined in Theorem 4.2, which arise from the supercharacter theory for \(G=\mathbb{Z}/n\mathbb {Z}\) induced by \(H = \operatorname{Aut}(G)\) (described in Sect. 2.2). We do this first for prime powers n=p α.

Let us first note that the divisors of p α are the α+1 numbers d i =p i−1 for i=1,2,…,α+1. In light of (2.12), this yields the corresponding superclasses

$$ K_i = \bigl\{ a \in\mathbb{Z}/p^{\alpha} \mathbb{Z}: \bigl(a,p^{\alpha} \bigr) = p^{{\alpha}-i+1} \bigr\} = \bigl\{ xp^{\alpha-i+1} \in\mathbb{Z}/p^{\alpha}\mathbb{Z}: p \nmid x \bigr\}, $$
(4.9)

each of which satisfies

$$ |K_i| = \phi \bigl(p^{i-1} \bigr) $$

by (2.13). Fixing some z in K k , we let a i,j,k denote the number of solutions (x,y) in K i ×K j to the equation

$$ x+y = z. $$
(4.10)

Recall that since Lemma 4.1 concerns general groups, the corresponding equation xy=z was written in multiplicative notation. However, since we are considering only abelian groups, we choose now to employ additive notation.

We claim that the matrix \(M_{i}(p^{\alpha}) = [a_{i,j,k}]_{j,k=1}^{\alpha+1}\) is given by

(4.11)

where the ith row and column of M i are singled out. Although the computations involved are elementary, we feel compelled to provide a complete justification of (4.11) since so many of our upcoming results depend upon this formula. The reader is invited to consult Appendix for the details.

The matrix W=W(p α) described by Theorem 4.2 is somewhat easier to describe. In fact, we claim that

$$ W \bigl(p^{\alpha} \bigr) = S \bigl(p^{\alpha}\bigr), $$
(4.12)

the supercharacter table for \(\mathbb{Z}/p^{\alpha}\mathbb{Z}\) discussed in Sect. 3.1. To see this, simply note that

Finally, there are the diagonal matrices D i =D i (p α). By Theorem 4.2 we have

$$ \bigl[D_i \bigl(p^{\alpha} \bigr) \bigr]_{j,k} = \delta_{j,k} c_{d_i} \biggl( \frac{p^{\alpha}}{d_k} \biggr) = \delta_{j,k} c_{p^{i-1}} \bigl(p^{\alpha-k+1} \bigr). $$
(4.13)

Example 4

The case p=3 and α=4 yields the divisors d 1=1, d 2=3, d 3=9, d 4=27, and d 5=81 of p α=81. The corresponding matrices M 1,M 2,M 3,M 4,M 5 are displayed below.

These matrices each satisfy M i W=WD i where

$$ W = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 2 & 2 & 2 & 2 & -1 \\ 6 & 6 & 6 & -3 & 0 \\ 18 & 18 & -9 & 0 & 0 \\ 54 & -27 & 0 & 0 & 0 \end{bmatrix} $$

is the supercharacter table S(81) and

$$\begin{aligned} D_1 &= \operatorname{diag}(1,1,1,1,1) , \\ D_2 &= \operatorname{diag}(2,2,2,2,-1), \\ D_3 &= \operatorname{diag}(6,6,6,-3,0), \\ D_4 &= \operatorname{diag}(18,18,-9,0,0), \\ D_5 &= \operatorname{diag}(54,27,0,0,0). \end{aligned}$$

Before proceeding, let us make a few remarks about the matrices (4.11). First observe that M i (p α) is a multiple of a stochastic matrix since each column of M i (p α) sums to ϕ(p i−1). Indeed, we only need verify this for the ith column of M i (p α):

$$\begin{aligned} &1 + \phi(p) + \phi \bigl(p^2 \bigr) + \cdots+ \phi \bigl(p^{i-2} \bigr) + \bigl(p^{i-1} - 2p^{i-2} \bigr) \\ &\quad= 1 + (p-1) + \bigl(p^2-p \bigr)+\cdots+ \bigl(p^{i-2}-p^{i-3} \bigr) + \bigl(p^{i-1} - 2p^{i-2} \bigr) \\ &\quad= p^{i-1} - p^{i-2} \\ &\quad= \phi \bigl(p^{i-1} \bigr). \end{aligned}$$

Let us also note that

$$ \sum_{i=1}^{\alpha+1} M_i \bigl(p^{\alpha} \bigr) = \begin{bmatrix} 1 & 1 & \cdots& 1 \\ \phi(p) & \phi(p) & \cdots& \phi(p) \\ \vdots& \vdots& \ddots& \vdots\\ \phi(p^{\alpha}) & \phi(p^{\alpha}) & \cdots& \phi(p^{\alpha}) \\ \end{bmatrix} . $$
(4.14)

This can be deduced from (4.11) and the fact that for jk, the corresponding off-diagonal matrix entry [M i (p α)] j,k is nonzero for only a single value of i. Finally, we observe that

$$ 1\leq i<j \leq\alpha+1\quad\implies\quad M_i M_j = \phi \bigl(p^{i-1} \bigr)M_j = M_j M_i $$
(4.15)

follows from a straightforward computation.

4.3 The matrices M i (n) for general n

Having computed the matrices M i (p α) for prime powers, we now turn to the problem of computing M i (n) for general n and harnessing the power of Theorem 4.2 to produce new identities for Ramanujan sums. The approach is straightforward enough, for the simultaneous diagonalization (4.3) “tensors” in the expected manner, much as the supercharacter tables did in Theorem 3.5. The difficulty in establishing this is mostly notational. We therefore take a moment to introduce the somewhat elaborate notation which is required.

Suppose that (m,n)=1 and let d 1,d 2,…,d τ(m) and e 1,e 2,…,e τ(n) be ordered lists of the divisors of m and n, respectively. Having specified an order to the divisors of m and n, we can generate the matrices M i (m),W(m),D i (m) and M i(n),W(n),D i(n), respectively, as defined by Theorem 4.2 (i.e., apply Theorem 4.2 separately to the two groups \(\mathbb{Z}/m\mathbb{Z}\) and \(\mathbb{Z}/n\mathbb{Z}\), each endowed with the supercharacter theory described in Sect. 2.2).

Next we observe that the divisors of mn are clearly the τ(mn)=τ(m)τ(n) numbers d i e i where 1≤iτ(m) and 1≤i′≤τ(n). However, we need the divisors of mn to be listed in some specific linear order since such an ordering will allow us to label the corresponding superclasses and supercharacters on \(\mathbb{Z}/mn\mathbb{Z}\). We therefore impose the lexicographic order on the set

$$ \bigl\{ d_i e_{i'} : 1 \leq i \leq\tau(m), 1 \leq i' \leq\tau(n) \bigr\} $$

of divisors of mn that is induced by the lexicographic ordering of the Cartesian product {1,2,…,τ(m)}×{1,2,…,τ(n)}. This gives rise to an order-preserving bijection

$$ \sigma: \bigl\{1,2,\ldots,\tau(m) \bigr\}\times \bigl\{1,2,\ldots ,\tau(n) \bigr \} \to \bigl\{1,2,\ldots,\tau(mn) \bigr\}, $$

which implicitly provides us with an ordered list of the divisors of mn. We now consider the matrices M 1(mn),M 2(mn),…,M τ(mn)(mn), the diagonal matrices D 1(mn),D 2(mn),…,D τ(mn)(mn), and the matrix W(mn) that intertwines them.

Lemma 4.3

Maintaining the notation and conventions described above,

$$\begin{aligned} \bigl[M_{\sigma(i,i')}(mn) \bigr]_{\sigma(j,j'), \sigma(k,k')} &= \bigl[M_i(m) \bigr]_{j,k} \bigl[M_{i'}(n) \bigr]_{j',k'}, \end{aligned}$$
(4.16)
$$\begin{aligned} \bigl[D_{\sigma(i,i')}(mn) \bigr]_{\sigma(j,j'), \sigma(k,k')} &= \bigl[D_i(m) \bigr]_{j,k} \bigl[D_{i'}(n) \bigr]_{j',k'}, \end{aligned}$$
(4.17)
$$\begin{aligned} \bigl[W(mn) \bigr]_{\sigma(j,j'), \sigma(k,k')} &= \bigl[W(m) \bigr]_{j,k} \bigl[W(n) \bigr]_{j',k'}, \end{aligned}$$
(4.18)

for all 1≤i,j,kτ(m) and 1≤i′,j′,k′≤τ(n). In other words, the simultaneous diagonalization (4.3) of Theorem 4.2 is compatible with Kronecker products in the sense that

$$ \bigl(\underbrace{M_i(m) \otimes M_{i'}(n)}_{M_{\sigma(i,i')}(mn)} \bigr) \bigl(\underbrace{W(m) \otimes W(n)}_{W(mn)} \bigr) = \bigl( \underbrace{W(m) \otimes W(n)}_{W(mn)} \bigr) \bigl( \underbrace{D_i(m) \otimes D_{i'}(n) }_{D_{\sigma(i,i')(mn)}} \bigr) $$

whenever (m,n)=1. In particular, W(mn)=S(mn), the corresponding supercharacter table for \(\mathbb{Z}/mn\mathbb{Z}\).

Proof

The given lists d 1,d 2,…,d τ(m) and e 1,e 2,…,e τ(n) provide us with corresponding superclasses

$$ K_i(m) = \biggl\{ a \in\mathbb{Z}/m\mathbb{Z}: (a,m) = \frac {m}{d_i} \biggr\},\qquad K_{i'}(n) = \biggl\{ b \in \mathbb{Z}/n\mathbb{Z}: (b,n) = \frac{n}{e_{i'}} \biggr\}, $$

in \(\mathbb{Z}/m\mathbb{Z}\) and \(\mathbb{Z}/n\mathbb{Z}\). For each pair (i,i′) satisfying 1≤iτ(m) and 1≤i′≤τ(n), we define

$$ K_{\sigma(i,i')}(mn) = \biggl\{ c \in\mathbb{Z}/mn\mathbb{Z}: (c,mn) = \frac{mn}{d_i e_{i'}} \biggr\}, $$

yielding a partition of \(\mathbb{Z}/mn\mathbb{Z}\) into τ(m)τ(n)=τ(mn) superclasses. By the Chinese Remainder Theorem, the map \(\varPhi: \mathbb {Z}/m\mathbb{Z}\times\mathbb{Z}/n\mathbb{Z}\to\mathbb {Z}/mn\mathbb{Z}\) defined by \(\varPhi( (a,b) ) = ab\, (\operatorname{mod} mn)\) is a ring isomorphism that satisfies

$$ \varPhi \bigl( K_i(m) \times K_{i'}(n) \bigr) = K_{\sigma(i,i')}(mn). $$
(4.19)

Fix elements z,z′ in K k (m) and K k(n), respectively, and let

$$\begin{aligned} a_{i,j,k}(m) &=\bigl| \bigl\{ (x,y) \in K_i(m) \times K_j(m) : x + y = z \bigr\}\bigr|, \\ a_{i',j',k'}(n) &=\bigl| \bigl\{ \bigl(x',y' \bigr) \in K_{i'}(n) \times K_{j'}(n) : x' + y' = z' \bigr\}\bigr|. \end{aligned}$$

Clearly the product a i,j,k (m)a i′,j′,k(n) equals the number of solutions to

$$ \bigl(x,x' \bigr) + \bigl(y,y' \bigr) = \bigl(z,z' \bigr), $$

where (x,x′) belongs to K i (mK i(n), and (y,y′) belongs to K j (mK j(n). Applying Φ to both sides of the preceding and using (4.19), it follows that a i,j,k (m)a i′,j′,k(n) equals the number of solutions to X+Y=Z where Z=Φ(z,z′) is a fixed element of K σ(k,k′)(mn), X belongs to K σ(i,i′)(mn), and Y belongs to K σ(j,j′)(mn), respectively. In other words,

$$ a_{\sigma(i,i'), \sigma(j,j'), \sigma(k,k')}(mn) = a_{i,j,k}(m)a_{i',j',k'}(n), $$
(4.20)

so that

$$ \bigl[M_{\sigma(i,i')}(mn) \bigr]_{\sigma(j,j'), \sigma(k,k')} = \bigl[M_i(m) \bigr]_{j,k} \bigl[M_{i'}(n) \bigr]_{j',k'}. $$

This establishes (4.16).

Turning our attention to (4.17), we recall from Theorem 4.2 that the matrix D σ(i,i′)(mn) is diagonal. In fact, its diagonal entries are precisely

The proof of (4.18) is similar, and in fact much easier, since W(m), W(n), and W(mn) do not depend upon the indices i,i′. The fact that W(mn)=S(mn) follows immediately from (4.12) and (3.21). □

Our primary interest in the preceding lemma lies in the following straightforward generalization. Let \(n=p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}} \cdots p_{r}^{\alpha_{r}}\) denote the factorization of n into distinct primes, and let d 1,d 2,…,d τ(n) denote the divisors of n. Noting that

$$ \tau(n) = \prod_{\ell=1}^r ( \alpha_{\ell}+1), $$

we let

$$ \sigma: \prod_{\ell=1}^r \{1,2,\ldots, \alpha_{\ell}+1\} \to \bigl\{1,2,\ldots,\tau(n) \bigr\} $$

be a bijection that preserves the lexicographic order on the Cartesian product \(\prod_{\ell=1}^{r} \{1,2,\ldots,\alpha_{\ell}+1\}\). According to this labeling scheme,

$$ d_{\sigma(i_1,i_2,\ldots,i_r)}=p_1^{i_1-1} p_2^{i_2-1}\cdots p_r^{i_r-1} $$
(4.21)

is the σ(i 1,i 2,…,i r )th divisor of n. In light of Lemma 4.3, it follows that

$$ D_{\sigma(i_1,i_2,\ldots,i_r)}(n) \cong\bigotimes_{\ell=1}^r D_{i_{\ell}} \bigl(p_{\ell}^{\alpha_{\ell}} \bigr) \sim\bigotimes_{\ell=1}^r M_{i_{\ell}} \bigl(p_{\ell}^{\alpha_{\ell}} \bigr) \cong M_{\sigma(i_1,i_2,\ldots,i_r)}(n) , $$
(4.22)

where ∼ denotes similarity, and ≅ denotes similarity by a permutation matrix. In particular, the eigenvalues of the diagonal matrix \(D_{\sigma (i_{1},i_{2},\ldots,i_{r})}(n)\) are precisely the τ(n) numbers

$$ c_{d_{\sigma(i_1,i_2,\ldots,i_r)}}(d_j) $$

for j=1,2,…,τ(n). We can therefore obtain from (4.22) a variety of formulas involving Ramanujan sums by utilizing the detailed information about the matrices \(M_{i_{\ell }}(p_{\ell}^{\alpha_{\ell}})\) we derived in Sect. 4.2.

4.4 Ramanujan sums and superclass constants

Fix a positive integer n and let a i,j,k =a i,j,k (n), M i =M i (n), and so forth. Now let us observe that

Comparing the (j,k) entry in the equality nM i =SD i S yields the formula

$$ n a_{i,j,k} = \sum_{d|n} c_{d_i}(d) c_{d_j}(d) c_{\frac{n}{d}} \biggl( \frac{n}{d_k} \biggr). $$

In light of (3.24), we may rewrite the preceding in the more symmetric form

$$ n \phi(d_k) a_{i,j,k} = \sum _{d|n} \phi \biggl( \frac{n}{d} \biggr) c_{d_i}(d) c_{d_j}(d) c_{d_k}(d). $$
(4.23)

Since the right side of (4.23) is symmetric in i,j,k, it follows that ϕ(d k )a i,j,k =ϕ(d j )a k,i,j . By definition, the superclass constants satisfy a k,i,j =a i,k,j , whence

$$ \phi(d_k) a_{i,j,k} = \phi(d_j) a_{i,k,j}. $$
(4.24)

Indeed, this pattern is evident in the structure (4.11) of the matrices M i (p α). Another consequence of (4.23) is the following result.

Theorem 4.4

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r and \(d = p_{1}^{\beta_{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) is a divisor of n, then

$$ \sum_{d'|n} \phi \biggl( \frac{n}{d'} \biggr) \bigl(c_d \bigl(d' \bigr) \bigr)^3 = \begin{cases} 0 & \text{\textit{if}\ $2|d$},\\ n\phi(d) \prod_{\ell=1}^r \lceil p_{\ell}^{\beta _{\ell}} - 2 p_{\ell}^{\beta_{\ell}-1} \rceil & \text{\textit{if}\ $2\nmid d$}, \end{cases} $$
(4.25)

where ⌈ ⋅ ⌉ denotes the ceiling function.

Proof

By Corollary 3.4 and the multiplicativity of the Euler totient function, it suffices to establish the desired identity when n=p α is a prime power. Let d=d i =p i−1 and recall from (4.11) that a i,i,i (p α)=1 if i=1 and a i,i,i (p α)=p i−1−2p i−2 otherwise. Setting i=j=k in (4.23), we obtain (4.25) for n=p α. □

4.5 Power sum identities

The preceding material now allows us to rapidly produce a wide variety of power sum identities involving Ramanujan sums, all of which appear to be novel. This highlights our larger argument, namely that supercharacter theory and superclass arithmetic are powerful tools, which can yield new insights, even when applied to the most elementary of groups (i.e., the cyclic groups \(\mathbb{Z}/n\mathbb{Z}\)).

Theorem 4.5

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r and \(d = p_{1}^{\beta _{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) is a divisor of n, then for s=0,1,2,…, we have

$$ \sum_{k|n} \bigl(c_d(k) \bigr)^s = \prod_{\ell=1}^{r} \bigl( (\alpha_{\ell} - \beta_{\ell} +1 ) \phi \bigl(p_{\ell}^{\beta_{\ell}} \bigr)^s + (-1)^s \bigl\lfloor p_{\ell}^{\beta_{\ell}-1} \bigr\rfloor^s \bigr). $$

Here ⌊ ⋅ ⌋ denotes the floor function.

Proof

By Corollary 3.4 it suffices to establish the desired formula when n=p α is a prime power. Recall from (4.11) that for 2≤iα+1, we have

$$ M_i \bigl(p^{\alpha} \bigr) = \begin{bmatrix}0&\mathbf{b}\\\mathbf{c}&x \end{bmatrix} \oplus\phi \bigl(p^{i-1} \bigr) I_{\alpha-i+1}, $$
(4.26)

where b and c are certain (i−1)×1 and 1×(i−1) matrices that satisfy

$$ \mathbf{c} \mathbf{b} = \phi \bigl(p^{i-1} \bigr) \bigl( 1 + (p-1) + \bigl(p^2-p \bigr) + \cdots+ \bigl(p^{i-2} - p^{i-3} \bigr)\bigr) = \phi \bigl(p^{i-1} \bigr)p^{i-2}, $$

and where x=p i−1−2p i−2. A short inductive argument confirms that for each s=0,1,2,…, there exist corresponding polynomials f 11,f 12,f 21,f 22 such that

$$ \begin{bmatrix}0&\mathbf{b}\\\mathbf{c}&x \end{bmatrix} ^s = \begin{bmatrix}f_{11}(\mathbf{b}\mathbf{c}) &f_{12}(\mathbf{b} \mathbf{c})\mathbf{b}\\\mathbf{c} f_{21}(\mathbf{b}\mathbf{c}) &f_{22}(\mathbf{c}\mathbf{b}) \end{bmatrix} . $$
(4.27)

In particular, note that the preceding formula also holds when b and c are replaced by positive real numbers b and c and when the matrices involved are regarded simply as 2×2 matrices with real entries. Letting b,c>0 satisfy

$$ cb = \mathbf{c}\mathbf{b}= \phi \bigl(p^{i-1} \bigr)p^{i-2}, $$
(4.28)

it follows from (4.27) and an elementary diagonalization argument that

$$\begin{aligned} \operatorname{tr} \begin{bmatrix}0&\mathbf{b}\\\mathbf{c}&x \end{bmatrix} ^s &= \operatorname{tr} f_{11}(\mathbf{b}\mathbf{c}) + \operatorname{tr}f_{22}( \mathbf {c}\mathbf{b}) \\ &= \operatorname{tr}f_{11}(\mathbf{c}\mathbf{b}) + \operatorname {tr}f_{22}(\mathbf{c}\mathbf{b}) \\ &= \operatorname{tr}f_{11}(cb) + \operatorname{tr}f_{22}(cb) \\ &= \operatorname{tr} \begin{bmatrix}0&b\\c&x \end{bmatrix} ^s \\ &= \biggl( \frac{x + \sqrt{4bc+x^2}}{2} \biggr)^s + \biggl( \frac {x - \sqrt{4bc+x^2}}{2} \biggr)^s \\ &= \biggl[ \frac{(p^{i-1} - 2p^{i-2}) + p^{i-1}}{2} \biggr]^s + \biggl[ \frac{(p^{i-1} - 2p^{i-2}) - p^{i-1}}{2} \biggr]^s \\ &= \bigl( p^{i-1}-p^{i-2} \bigr)^s + \bigl( -p^{i-2} \bigr)^s \\ &= \phi \bigl( p^{i-1} \bigr)^s + \bigl( -p^{i-2} \bigr)^s. \end{aligned}$$

Returning to (4.26), we find that

$$ \operatorname{tr}M_i^s \bigl(p^{\alpha} \bigr) = (\alpha- i + 2)\phi \bigl( p^{i-1} \bigr)^s + \bigl(-p^{i-2} \bigr)^s $$
(4.29)

when 2≤iα+1. Since M 1(p α) is the (α+1)×(α+1) identity matrix, setting β=i−1, it follows that

as required. □

In light of (4.15), it is not hard to generate more complicated variants of the preceding formula. For instance, since

$$ \operatorname{tr}M_i^s \bigl(p^{\alpha} \bigr) M_j^s \bigl(p^{\alpha} \bigr) = \begin{cases} \phi(p^{i-1})^s \operatorname{tr}M_j^s(p^{\alpha}) & \text{if $i < j$}, \\ \operatorname{tr}M_i^{2s}(p^{\alpha}) & \text{if $i = j$},\\ \phi(p^{j-1})^s \operatorname{tr}M_i^s(p^{\alpha}) & \text{if $i > j$}, \end{cases} $$

the quantity \(\operatorname{tr}M_{i}^{s}(p^{\alpha}) M_{j}^{s}(p^{\alpha})\) can be evaluated using similar methods. A little algebra then yields the following generalization of Theorem 4.5.

Theorem 4.6

Let \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) be the canonical factorization of n into distinct primes p 1,p 2,…,p r . If \(d = p_{1}^{\beta _{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) and \(d' = p_{1}^{\gamma_{1}} p_{2}^{\gamma_{2}} \cdots p_{r}^{\gamma_{r}}\) are divisors of n, then for s=0,1,2,…, we have

$$\begin{aligned} &\sum_{k|n} \bigl(c_d(k) c_{d'}(k) \bigr)^s \\ &\quad= \prod_{\ell=1}^{r} \bigl( \bigl( \alpha_{\ell} - \max\{\beta_{\ell},\gamma_{\ell}\} + 1 \bigr) \phi \bigl(p_{\ell}^{\beta_{\ell}} \bigr)^s \phi \bigl(p_{\ell}^{\gamma_{\ell}} \bigr)^s \\ &\qquad{}+ \bigl( \bigl\lfloor p^{\min\{\beta_{\ell},\gamma _{\ell}\}-1} \bigr\rfloor- (1- \delta_{\beta_{\ell},\gamma_{\ell}}) p^{\min\{\beta_{\ell },\gamma_{\ell}\}} \bigr)^s \bigl\lfloor p^{ \max\{ \beta_{\ell}, \gamma_{\ell}\}-1} \bigr\rfloor^s \bigr), \end{aligned}$$

where δ denotes the Kronecker delta function.

Although it might at first appear that the preceding results could be adapted to handle negative exponents s, there are a few minor obstacles. First is the fact that M i (p α) is invertible if and only if i=1 or i=2. This corresponds to the fact that c d (k) vanishes for certain values of k if d is not square-free. Indeed, this can be seen directly from von Sterneck’s formula (3.25) and the definition (3.2) of the Möbius μ-function. The correct adaptation of Theorem 4.5 for negative exponents is the following.

Theorem 4.7

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r and \(d = p_{1}^{\beta _{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) is a square-free divisor of n (i.e., 0≤β i ≤1 for i=1,2,…,r), then for s=0,1,2,…, we have

$$ \sum_{k|n} \frac{1}{ (c_{d}(k) )^s} = \prod_{\ell=1}^r \biggl( \frac{\alpha_{\ell}}{ (p_{\ell}-1)^{\beta _{\ell} s} } + (-1)^{\beta_{\ell}s} \biggr). $$

Proof

As before, it suffices to prove the desired formula when n=p α is a prime power. The upper-left 2×2 submatrix of the (α+1)×(α+1) matrix

has the eigenvalues −1 and p−1. On the other hand, M 1(p α) is the identity matrix and hence has the eigenvalue 1 with multiplicity α+1. Therefore,

as required. □

Along similar lines, we have the following.

Theorem 4.8

If \(n = p_{1}^{\alpha_{1}} p_{2}^{\alpha_{2}}\cdots p_{r}^{\alpha_{r}}\) is the canonical factorization of n into distinct primes p 1,p 2,…,p r and \(d = p_{1}^{\beta _{1}} p_{2}^{\beta_{2}} \cdots p_{r}^{\beta_{r}}\) is a square-free divisor of n (i.e., 0≤β i ≤1 for i=1,2,…,r), then for z in \(\mathbb{C}\), we have

$$ \sum_{k|n} \frac{1}{|c_{d}(k) |^z} = \prod_{\ell=1}^r \biggl(1+ \frac{\alpha_{\ell}}{ (p_{\ell}-1)^{\beta_{\ell} z} } \biggr). $$
(4.30)

Proof

Noting that \(|c_{d}(k)|^{z} = (c_{d}(k)^{2})^{\frac{z}{2}}\) and that

$$ \operatorname{tr} \bigl(M^2_{\beta+1} \bigl(p^{\alpha} \bigr) \bigr)^{-\frac{z}{2}} = 1+ \frac{\alpha}{ (p-1)^{\beta z} }, $$

the proof is similar to the proof of Theorem 4.5. □

Since the expression (4.30) bears some resemblance to the classical Riemann ζ-function and its variants, it is natural to ask whether this expression obeys the analogue of the Riemann hypothesis. The following result shows that this occurs if and only if n and d satisfy some rather peculiar hypotheses.

Corollary 4.9

The complex roots of the function (4.30) all lie on the line \(\Re z = \frac{1}{2}\) if and only if each prime p which divides d is of the form α 2+1 where p α is the highest power of p that divides n.

Proof

The complex roots of (4.30) are precisely the numbers

$$ z = \frac{ \log|\alpha_{\ell}| + i (2m+1)\pi}{\log (p_{\ell}-1)},\quad m\in\mathbb{Z}, $$

for those such that β =1 (i.e., for those primes p that divide d). The real part of the preceding clearly equals \(\frac{1}{2}\) if and only if \(p_{\ell}=\alpha_{\ell}^{2}+1\). □

Example 5

Let n=52×174×376=5,357,300,885,152,225 and d=5×17×37=3,145. Since 5=22+1, 17=42+1, and 37=62+1, it follows that the corresponding “ζ-function” (4.30) satisfies the Riemann hypothesis.

Using (4.15) and some of the preceding computations, it is not hard to explicitly evaluate the trace of any word composed using M 1(p α),M 2(p α),…,M α+1(p α), and M 2(p α)−1. Consequently, the motivated individual could in principle provide an explicit formula for the sum

$$ \sum_{k|n} \frac{ c_{f_1}(k)^{s_1} c_{f_2}(k)^{s_2} \cdots c_{f_{\eta }}(k)^{s_{\eta}} }{ c_{g_1}(k)^{t_1} c_{g_2}(k)^{t_2} \cdots c_{g_{\nu}}(k)^{t_{\nu}} }, $$

where f 1,f 2,…,f η are divisors of n, and g 1,g 2,…,g ν are square-free divisors of n. We make no attempt to do so here, having made our point that the arithmetic of superclasses can be used to deduce a variety of identities for Ramanujan sums.

5 Conclusion

We have demonstrated that reexamining even the most elementary of groups, namely the cyclic groups \(\mathbb{Z}/n\mathbb{Z}\), from the perspective of supercharacter theory can yield surprising results. In particular, almost the entire algebraic theory of Ramanujan sums can be derived, in a systematic manner, using this approach. Many of the familiar classical identities for these fascinating sums, along with a variety of new ones, can be obtained with minimal effort once the basic machinery has been developed.

All of our results follow directly from a general theoretical framework without ad hoc arguments. More importantly, many of the ideas developed in this note can be applied to arbitrary finite groups. In particular, the arithmetic of superclasses (Sect. 4) and the simultaneous diagonalization theorem (Theorem 4.2), which yielded some of the more elaborate identities for Ramanujan sums, hold in much greater generality. We therefore hope that revisiting other families of elementary groups (e.g., dihedral groups, Frobenius groups, symmetric groups, p-groups,…) from the perspective of supercharacter theory might yield further information about other exponential sums (e.g., Gauss sums, Kloosterman sums, Jacobi sums, and their variants), which are of interest in number theory (see [10]).