1 Introduction

Self-avoiding walks (SAWs) are the set of walks on a graph, typically the integer lattice \({{\mathbb {Z}}}^d\), where each step is between nearest neighbours, and no vertex is visited twice. The model has a long history, and has played a pivotal role in our understanding of polymers and critical phenomena more broadly, and in the development of conformal field theory, the lace expansion, and Schramm–Loewner Evolution [1, 20].

In three dimensions SAWs are in the same universality class as real polymers, and provide an ideal laboratory for studying universal properties such as critical exponents.

In four dimensions the model is no longer physically relevant, but it is nonetheless worthy of study, as it provides a useful test of renormalization group techniques which have made various predictions. It can also be regarded as a test-bed for physically relevant models which have logarithmic corrections to scaling; for example, the \(\theta \)-transition in three dimensions is believed to correspond to a tricritical point, and hence to have mean-field behavior with logarithmic corrections. Four-dimensional SAWs have been studied by Monte Carlo [14, 24] and enumeration [6, 11, 19] methods, and we note that rigorous results have recently been obtained for the four-dimensional continuous-time weakly self-avoiding walk via the rigorous renormalization group [2,3,4].

In this work we restrict ourselves to the study of SAWs on \({{\mathbb {Z}}}^4\). The key quantities of interest are the number of SAWs of N steps, which we denote \(c_N\), and the mean size of SAWs of length N. We can formally describe a walk of N steps as a mapping \(\omega \) from the integers \(0,1,\ldots ,N\) to sites of \({{\mathbb {Z}}}^4\), with \(\vert \omega (i+1) - \omega (i) \vert = 1\) for all \(0 \le i \le N-1\), and \(\omega (i) \ne \omega (j)\) for all \(i \ne j\). We calculate the two most common measures of size, the squared end-to-end distance, \(R_{\mathrm {E}}^2\), and the squared radius of gyration, \(R_{\mathrm {G}}^2\), which are defined as:

$$\begin{aligned} R_{\mathrm {E}}^2&= \vert \omega (N) - \omega (0) \vert ^2; \end{aligned}$$
(1)
$$\begin{aligned} R_{\mathrm {G}}^2&= \frac{1}{2 (N+1)^2} \sum _{i,j} \vert \omega (i)-\omega (j) \vert ^2. \end{aligned}$$
(2)
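
For concreteness: given a walk stored as an array of lattice sites, both observables can be computed directly. The following Python sketch (ours, purely illustrative) uses the centre-of-mass form \(R_{\mathrm {G}}^2 = \frac{1}{N+1} \sum _i \vert \omega (i)-{\bar{\omega }} \vert ^2\), which is equivalent to Eq. 2 and avoids the \(O(N^2)\) double sum.

```python
import numpy as np

def squared_end_to_end(walk):
    """Squared end-to-end distance |omega(N) - omega(0)|^2, Eq. (1)."""
    d = walk[-1] - walk[0]
    return int(d @ d)

def squared_radius_of_gyration(walk):
    """Squared radius of gyration, Eq. (2), computed via the
    equivalent centre-of-mass form to avoid the O(N^2) double sum."""
    com = walk.mean(axis=0)
    return float(((walk - com) ** 2).sum(axis=1).mean())

# Example: a 3-step walk on Z^4; rows are the sites omega(0), ..., omega(N).
walk = np.array([[0, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 1, 0, 0],
                 [1, 1, 1, 0]])
print(squared_end_to_end(walk), squared_radius_of_gyration(walk))
```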

Our goal is to understand the asymptotic behavior of the mean squared end-to-end distance and mean squared radius of gyration, in the long-chain limit. Renormalization group arguments [12] give the following asymptotic forms [12, 14] for the relevant quantities in four dimensions:

$$\begin{aligned} c_N&= A \mu ^N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))-3}{64\log (N/a)} + \cdots \right) ; \end{aligned}$$
(3)
$$\begin{aligned} \langle R_{\mathrm {E}}^2 \rangle&= D_{\mathrm{E}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+31}{64\log (N/a)} + \cdots \right) ; \end{aligned}$$
(4)
$$\begin{aligned} \langle R_{\mathrm {G}}^2 \rangle&= D_{\mathrm{G}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+97/3}{64\log (N/a)} + \cdots \right) . \end{aligned}$$
(5)

Here, \(\langle \cdots \rangle \) indicates the expectation of the observable over the set of SAWs of length N, \(\mu = 6.774\;043(5)\) [24] is the growth constant, A, \(D_{\mathrm {E}}\), and \(D_{\mathrm {G}}\) are lattice-dependent amplitudes, and a is a common scale factor. The overall factor of \([\log N]^{1/4}\) in Eqs. 4 and 5 is universal, and would occur for any observable which measures the squared size of the walk, such as the square of the hydrodynamic radius, or the mean squared monomer-to-end distance. While the amplitudes themselves are not universal, it is expected that the amplitude ratio \(D_{\mathrm {E}}/D_{\mathrm {G}}\) is a universal quantity. For SAWs in four dimensions this should have the same value as for simple random walks, i.e. we should have \(D_{\mathrm {E}}/D_{\mathrm {G}}= 6\). Note that for the four-dimensional continuous-time weakly self-avoiding walk, the leading-order behavior has recently been proved to be of the forms given in Eq. 3 (see [3]) and Eq. 4 (see [4]).

The logarithmic corrections arise because \(d=4\) is the upper critical dimension for SAWs. For \(d \ge 5\) SAWs are in the same universality class as simple random walks [15, 16], or Brownian motion, where there are no logarithmic corrections and, for example, \(\langle R_{\mathrm {E}}^2 \rangle = O(N)\).

Logarithmic corrections are notoriously hard to observe, but recent developments in Monte Carlo simulation methods have made it conceivable that we could reach the asymptotic regime and carefully test the asymptotic forms given in Eqs. 4 and 5. The pivot algorithm [18, 21] is a Markov chain Monte Carlo algorithm which is the most efficient known method for sampling SAWs of fixed length. Recent improvements [7, 8, 17] have reduced the CPU time necessary to attempt a pivot move to \(O(\log N)\), which makes it possible to reach the regime of truly large N, where the limit is due to computer memory (RAM) availability, rather than any constraint of available computer time.

We have applied the SAW-tree implementation [8] of the pivot algorithm for the first time to the problem of sampling four-dimensional SAWs, and so generated high precision estimates of \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \) for SAWs of up to one billion steps. In the process we also studied the behaviour of the SAW-tree implementation, and the characteristics of the pivot algorithm for sampling SAWs.

In the remainder of this paper, we first describe our Monte Carlo simulations in Sect. 2, which includes data on the performance of the SAW-tree implementation, analyse the results in Sect. 3, study the acceptance fraction of the pivot algorithm in Sect. 4, and conclude in Sect. 5.

2 Monte Carlo Simulation

The pivot algorithm is a Markov chain Monte Carlo algorithm that samples SAWs of fixed length N. The basic move is a pivot, which involves the application of a lattice symmetry (rotation or reflection) to one piece of a walk around a chosen site of the walk. A pivot move is successful if it results in a walk that is self-avoiding, and the algorithm thus generates a correlated sequence of SAW configurations. In the seminal work of Madras and Sokal [21], the pivot algorithm was proved to sample SAWs uniformly at random, and it was also shown to be remarkably efficient at sampling global observables, such as \(R_{\mathrm{E}}^2\), because each successful pivot move makes a large change to global observables. As mentioned in the introduction, recent improvements [7, 8, 17] have increased the efficiency of the pivot algorithm still further, to the point that it is now possible to sample SAWs with \(10^9\) steps. See [21] for many more details about the pivot algorithm.

For the present computer experiment, we chose pivot sites uniformly at random along the chain, and the pivot symmetry operations were chosen uniformly at random from amongst the 383 lattice symmetries of \({{\mathbb {Z}}}^4\) that do not correspond to the identity.
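
The SAW-tree data structure of [8] that we use is quite involved, but the underlying move is simple. The following Python sketch (ours, for illustration only) is the naive \(O(N)\)-per-attempt version of the move just described, rather than the \(O(\log N)\) SAW-tree implementation: it applies a random non-identity symmetry of \({{\mathbb {Z}}}^4\) to the portion of the walk beyond a randomly chosen pivot site, and accepts the move if the result is self-avoiding.

```python
import numpy as np
from itertools import permutations, product

def lattice_symmetries(d=4):
    """All signed-permutation matrices (the 384 symmetries of Z^4),
    with the identity discarded, leaving the 383 used for pivots."""
    identity = np.eye(d, dtype=int)
    mats = []
    for perm in permutations(range(d)):
        for signs in product([1, -1], repeat=d):
            m = np.zeros((d, d), dtype=int)
            for i, (p, s) in enumerate(zip(perm, signs)):
                m[i, p] = s
            if not np.array_equal(m, identity):
                mats.append(m)
    return mats

SYMS = lattice_symmetries()
rng = np.random.default_rng()

def pivot_attempt(walk):
    """One naive pivot attempt on a walk of N steps (an (N+1) x 4 array).
    Returns the new walk on success, or the unchanged walk on failure."""
    n = len(walk) - 1
    k = int(rng.integers(1, n))        # pivot site, away from the ends
    g = SYMS[rng.integers(len(SYMS))]  # random non-identity symmetry
    head = walk[:k + 1]
    # Rotate/reflect the tail about the pivot site walk[k].
    tail = walk[k] + (walk[k + 1:] - walk[k]) @ g.T
    occupied = set(map(tuple, head))
    if any(tuple(x) in occupied for x in tail):
        return walk                    # reject: result not self-avoiding
    return np.vstack([head, tail])
```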

To initialize the system we used the pseudo_dimerize procedure of the SAW-tree implementation [8], and to eliminate any initialization bias we then performed approximately 20N successful pivots.

For the longest walks, with \(N=10^9\), this initialization procedure took approximately 360 h (15 days) to complete. To reduce this computational burden, we used an additional trick. We generated seed walks for lengths from \(10^6\) to \(10^9\); we expect these walks to be indistinguishably close to equilibrium. We then used these seed walks for separate simulations (8, in the case of \(N=10^9\)) where, instead of performing 20N successful pivots to warm up the system, we performed only approximately N successful pivots. Even though this would not be enough pivot moves to bring an initial configuration such as a straight rod to equilibrium, given that we start at equilibrium it is more than sufficient to ensure that correlations between the estimates of global observables from different batches are completely negligible.

After 20N successful pivots had been applied for walks with \(N < 10^6\), or N successful pivots had been applied to seed walks for \(N \ge 10^6\), we started collecting data for our observables at each time step, aggregating the results in batches of \(10^8\) attempted pivots.

We sampled SAWs on \({{\mathbb {Z}}}^4\) over a range of lengths from 2000 to one billion steps. We used the SAW-tree implementation of the pivot algorithm as described in [8] to collect data for the observables \(\langle R_{\mathrm {E}}^2 \rangle \), \(\langle R_{\mathrm {G}}^2 \rangle \), the probability of a pivot move being successful, f, and the amount of CPU time for each batch. The computer experiment was run for 130,000 CPU hours on Dell PowerEdge FC630 machines with Intel Xeon E5-2680 CPUs. In total there were \(4.4 \times 10^5\) batches of \(10^8\) attempted pivots, and thus there were a grand total of \(4.4 \times 10^{13}\) attempted pivots across all walk sizes.

Our data for \(\langle R_{\mathrm {E}}^2 \rangle \), \(\langle R_{\mathrm {G}}^2 \rangle \), f, and mean CPU time per pivot attempt are collected in Tables 1 and 2 in Appendix A. We also include there the estimates for the universal amplitude ratio \(\langle R_{\mathrm {E}}^2 \rangle /\langle R_{\mathrm {G}}^2 \rangle \), as the positive correlation between the two observables results in variance reduction in the final estimates. That is, the error for the ratio is smaller than would naively be expected from the errors of the original observables.
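
A standard way to obtain the error bar on the ratio, while retaining the benefit of this correlation, is to resample over batches. A minimal jackknife sketch (ours; the per-batch arrays are hypothetical stand-ins for the batch means described above) is:

```python
import numpy as np

def ratio_with_jackknife(re2_batches, rg2_batches):
    """Estimate <R_E^2>/<R_G^2> and its standard error from per-batch
    means; the jackknife automatically accounts for the (positive)
    covariance between the two observables."""
    re2 = np.asarray(re2_batches, dtype=float)
    rg2 = np.asarray(rg2_batches, dtype=float)
    n = len(re2)
    full = re2.mean() / rg2.mean()
    # Leave-one-out estimates of the ratio.
    loo = (re2.sum() - re2) / (rg2.sum() - rg2)
    err = np.sqrt((n - 1) * np.mean((loo - loo.mean()) ** 2))
    return full, err
```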

We plot the mean CPU time per attempted pivot against N in Fig. 1, with a logarithmic scale on the x-axis. The small amount of visible scatter should not be taken too seriously: the machines executing the code are subject to job scheduling, which means that the measured CPU time may vary somewhat. However, the overall trend is consistent with the behavior demonstrated in our earlier work [8] on SAWs in two and three dimensions: the CPU time increases logarithmically with N, although there is some degradation in performance at large N. This is likely due to the larger memory demand, which in turn means that memory accesses are more frequently out of cache.

Interestingly, the equivalent plots in [8] demonstrated quite starkly the impact of memory cache on performance by exhibiting a clear kink, whereas here the degradation is far more gradual. It is plausible that this effect is due to changes in CPU architecture over the past seven years, i.e. since the earlier computer experiment was performed.

In absolute terms, the performance is exceptionally fast: each pivot attempt takes, on average, approximately \(5\,\mu \mathrm {s}\) for N of the order of thousands, and only ten times as long, roughly \(50\,\mu \mathrm {s}\), for walks of one billion steps.

Fig. 1: CPU time per attempted pivot move as a function of N

3 Analysis

Our method of analysis is to perform weighted non-linear fits of our data to the presumed asymptotic form, using the ‘nls’ routine of the statistical programming language R. We truncated our data by fitting only values with \(N \ge N_{\text {min}}\). We then varied \(N_{\text {min}}\) to obtain a sequence of estimates for which we expect the systematic error due to unfitted corrections to scaling to decrease. We varied \(N_{\text {min}}\) from 2000, the smallest value of N for which data were collected, to \(N_{\text {min}}= 2\times 10^{6}\), the largest value for which we felt that the parameter estimates were informative (i.e. with sufficiently small error bars that they aided in extrapolation).

We plotted our estimates against a variable of the same relative size as the first neglected correction-to-scaling term. In principle, this should result in linear convergence in the plots as the asymptotic regime is reached, although it may be too optimistic to hope for truly linear behavior in our case: the neglected correction terms are \(O([\log N]^{-1})\) and \(O([\log N]^{-2})\), which suggests that some curvature would likely still be apparent even for \(N=10^9\). We extrapolate the fits from the right to where they intersect the y-axis, which corresponds to the \(N_{\mathrm {min}} \rightarrow \infty \) limit. There is a significant degree of subjectivity in performing these extrapolations, so we present all of our fits graphically to allow readers to judge for themselves whether the extrapolations are suitably cautious.
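
To make the extrapolation step concrete, here is a sketch of the linear extrapolation (ours; the arrays are hypothetical, and we use \(1/\log N_{\text {min}}\) as a stand-in for the control variable):

```python
import numpy as np

def extrapolate_to_y_axis(x, estimates, n_last=6):
    """Fit a line to the final n_last (x, estimate) pairs, where x is
    a proxy for the first neglected correction term (e.g. 1/log(N_min))
    and the arrays are ordered by increasing N_min; return the value
    at x = 0, i.e. the N_min -> infinity limit."""
    slope, intercept = np.polyfit(x[-n_last:], estimates[-n_last:], 1)
    return intercept
```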

We make the general observation that such analysis is fraught with danger, especially when one “knows” the answer. It is exceedingly easy to exhibit confirmation bias [22] when selecting appropriate analysis methods. Given that various methods of analysis are available, as well as variants of the same method, it is natural that some of them will result in estimates which are closer to the expected answer. From there it is a slippery slope to the conclusion that somehow the methods which give the “correct” answer are the right methods for the problem at hand. At this point it is possible that only the results from the preferred method are presented in a research article, so introducing bias, or, equally insidiously, it may be that the confidence intervals for the preferred methods are estimated to be smaller than for the non-preferred “wrong” methods.

This is exactly our situation, as the asymptotic forms for \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \) given in Eqs. 4 and 5 respectively are widely believed to be correct. We have endeavored to reduce the likelihood of confirmation bias occurring by including estimates from all of the methods of analysis which have been used. We are fortunate that the power of the leading logarithm, found in Eqs. 4 and 5, occurs for two observables, which naturally gives us two somewhat independent estimates.

We now proceed with our analysis, starting with the raw data for \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \) from Table 1, to which we will fit the asymptotic forms in Eqs. 4 and 5. We start by fitting the leading behavior only, where we assume that there is indeed a factor of N, but where we fit for the amplitude D, the parameter a, and the power of the logarithm, which we denote as \(\kappa \). That is, our statistical models are:

$$\begin{aligned} \langle R_{\mathrm {E}}^2 \rangle&= D_{\mathrm {E}}N [\log (N/a)]^{\kappa }; \end{aligned}$$
(6)
$$\begin{aligned} \langle R_{\mathrm {G}}^2 \rangle&= D_{\mathrm {G}}N [\log (N/a)]^{\kappa }. \end{aligned}$$
(7)
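
We performed these fits in R with nls; an equivalent weighted fit in Python (a sketch, with scipy.optimize.curve_fit standing in for nls, and hypothetical arrays N, y, yerr holding the data of Table 1) would be:

```python
import numpy as np
from scipy.optimize import curve_fit

def leading_model(N, D, a, kappa):
    """Leading-order form of Eqs. (6)-(7): D * N * [log(N/a)]^kappa."""
    return D * N * np.log(N / a) ** kappa

def fit_leading(N, y, yerr, N_min):
    """Weighted fit restricted to N >= N_min, mirroring the truncation
    described above; returns parameter estimates and their errors."""
    m = N >= N_min
    popt, pcov = curve_fit(leading_model, N[m], y[m], sigma=yerr[m],
                           absolute_sigma=True, p0=(1.0, 1.0, 0.25))
    return popt, np.sqrt(np.diag(pcov))
```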

We find that this fitting form reproduces the Monte Carlo data quite well for both \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \), giving residual standard deviation values of approximately 15–20 for \(N_{\text {min}}= 2000\), and declining to approximately 1 for \(N_{\text {min}}\ge 5\times 10^{5}\). The fact that the statistical models are capable of reproducing the data to within our error bars is evidence in support of the correctness of the expressions given in Eqs. 4 and 5.

We show the estimates of \(\kappa \) from our fits in Fig. 2. The estimates appear to converge smoothly to a value above 1/4, and our best estimate from extrapolating the trend is \(\kappa = 0.258(2)\), which is shown on the y-axis of the plot. Taken at face value this would seem to exclude \(\kappa = 1/4\), but we know that the unfitted corrections to scaling may be large, and so are mindful that this may be misleading. (Admittedly, if we did not have prior information that \(\kappa = 1/4\), or of the details of the correction-to-scaling terms, then the smoothness of the trend would have given no hint that anything was amiss, and we would have been happy in that case to report \(\kappa = 0.258(2)\) as our final estimate.)

Fig. 2: Variation of fitted value of \(\kappa \) with \(N_{\mathrm{min}}\) when only the leading term is fitted. The lines of best fit to the final six values are shown, and our final estimate is plotted on the y-axis

We now fit the correction term, where we once again assume that there is indeed a factor of N, but fit D, the parameter a, and \(\kappa \). Our statistical models are:

$$\begin{aligned} \langle R_{\mathrm {E}}^2 \rangle&= D_{\mathrm{E}} N [\log (N/a)]^{\kappa } \left( 1 - \frac{17\log (4\log (N/a))+31}{64\log (N/a)} \right) ; \end{aligned}$$
(8)
$$\begin{aligned} \langle R_{\mathrm {G}}^2 \rangle&= D_{\mathrm{G}} N [\log (N/a)]^{\kappa } \left( 1 - \frac{17\log (4\log (N/a))+97/3}{64\log (N/a)} \right) . \end{aligned}$$
(9)
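
These models can be fitted in the same way as before; a sketch (again with hypothetical arrays, and the constant in the correction term held fixed at its predicted value) is:

```python
import numpy as np
from functools import partial
from scipy.optimize import curve_fit

def corrected_model(N, D, a, kappa, b):
    """Eqs. (8)-(9): the leading form multiplied by the predicted
    correction factor; b = 31 for <R_E^2> and b = 97/3 for <R_G^2>."""
    L = np.log(N / a)
    return D * N * L ** kappa * (1 - (17 * np.log(4 * L) + b) / (64 * L))

# Fit D, a, kappa with b held fixed, e.g. for <R_E^2>:
# popt, pcov = curve_fit(partial(corrected_model, b=31.0), N[m], y[m],
#                        sigma=yerr[m], absolute_sigma=True,
#                        p0=(1.3, 0.2, 0.25))
```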

We find that this fitting form reproduces the Monte Carlo data extremely well for both \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \). For \(\langle R_{\mathrm {E}}^2 \rangle \), the residual standard deviation is only 4.5 for \(N_{\text {min}}= 2000\), and hovers around 1 for \(N_{\text {min}}\ge 15{,}000\). For \(\langle R_{\mathrm {G}}^2 \rangle \), the residual standard deviation is only 3.7 for \(N_{\text {min}}= 2000\), and hovers around 1.3 for \(N_{\text {min}}\ge 70{,}000\). We feel that the most likely explanation for the spectacular fit for \(\langle R_{\mathrm {E}}^2 \rangle \) is happenstance, due perhaps to fortuitous cancellation between competing neglected correction-to-scaling terms over the range of N fitted. We plot \(\langle R_{\mathrm {E}}^2 \rangle /N\) versus N in Fig. 3, together with the fit from \(N_{\text {min}}= 15{,}000\), and \(\langle R_{\mathrm {G}}^2 \rangle /N\) versus N in Fig. 4, together with the fit from \(N_{\text {min}}= 70{,}000\). The quality of the data is such that the error bars are invisibly small, and the quality of the fits is such that they pass through every data point to within visible precision.

Fig. 3: Variation of \(\langle R_{\mathrm {E}}^2 \rangle /N\) with N, together with the fit with correction term from \(N_{\text {min}}= 15{,}000\). Note that error bars for the data points are invisible on the scale of the plot

Fig. 4: Variation of \(\langle R_{\mathrm {G}}^2 \rangle /N\) with N, together with the fit with correction term from \(N_{\text {min}}= 70{,}000\). Note that error bars for the data points are invisible on the scale of the plot

We show the estimates of \(\kappa \) from our fits in Fig. 5. There we observe that the estimates are much closer to 1/4, and if we extrapolate the trends we obtain an estimate of \(\kappa = 0.2516(14)\), which is plotted on the y-axis. The value of 1/4 is only just outside our confidence interval, and we conclude that \(\kappa = 1/4\) is consistent with these fits.

Fig. 5: Variation of fitted value of \(\kappa \) with \(N_{\mathrm{min}}\) when the correction term is fitted. The lines of best fit to the final six values are shown, and our final estimate is plotted on the y-axis

We now proceed to obtain estimates of the amplitudes by biasing our fits based on the assumption that \(\kappa = 1/4\). Once again we include the correction term, and in this case our statistical models are:

$$\begin{aligned} \langle R_{\mathrm {E}}^2 \rangle&= D_{\mathrm{E}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+31}{64\log (N/a)} \right) ; \end{aligned}$$
(10)
$$\begin{aligned} \langle R_{\mathrm {G}}^2 \rangle&= D_{\mathrm{G}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+97/3}{64\log (N/a)} \right) . \end{aligned}$$
(11)

We find that this fitting form reproduces the Monte Carlo data quite well for both \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \), but not as well as when \(\kappa \) was allowed to vary. For \(\langle R_{\mathrm {E}}^2 \rangle \), the residual standard deviation is 31 for \(N_{\text {min}}= 2000\), and declines to 2.3 for \(N_{\text {min}}= 2\times 10^{6}\). For \(\langle R_{\mathrm {G}}^2 \rangle \), the residual standard deviation is 11 for \(N_{\text {min}}= 2000\), and hovers around 1.5 for \(N_{\text {min}}\ge 50{,}000\). We see from Fig. 5 that for \(\langle R_{\mathrm {G}}^2 \rangle \) the unbiased estimates of \(\kappa \) are very close to 1/4 over a wide range of values of \(N_{\text {min}}\), and so it makes sense that biasing \(\kappa = 1/4\) degrades the fits for \(\langle R_{\mathrm {G}}^2 \rangle \) to a lesser extent than for \(\langle R_{\mathrm {E}}^2 \rangle \).

We plot estimates for the amplitudes \(D_{\mathrm {E}}\) and \(D_{\mathrm {G}}\) in Figs. 6 and 7. For \(D_{\mathrm {E}}\), we see smooth convergence, and some visible upwards curvature leads us to estimate \(D_{\mathrm {E}}= 1.3112(3)\), a little above the linear extrapolation shown. For \(D_{\mathrm {G}}\), the curvature over the plot range is somewhat larger, and it is difficult to tell to what extent the trend has really turned upwards (estimates with successive values of \(N_{\text {min}}\) are strongly correlated, so one must be careful not to read too much into “trends” from just a few data points). Thus in this case our central estimate of \(D_{\mathrm {G}}= 0.21849(2)\) is somewhat below the linear extrapolation, to guard against the possibility that the upward trend is an artifact of statistical noise in the fitted data points. N.B., we expect the curvature to decrease as we approach the large-N limit, as unfitted corrections to scaling become relatively smaller in magnitude; this is another reason to think that the actual curvature may be somewhat less than it appears to the eye.

We calculate the ratio of amplitudes, with our confidence interval calculated by treating them as statistically independent quantities, obtaining \(D_{\mathrm {E}}/D_{\mathrm {G}}= 6.0012(14)\). This confirms our expectation that the amplitude ratio assumes the same value as for simple random walks. The estimates of \(D_{\mathrm {E}}\) and \(D_{\mathrm {G}}\) were obtained separately, and so the consistency of the ratio with the exact value of 6 verifies, to some extent, the soundness of the method of analysis.

Fig. 6: Variation of fitted value of \(D_{\mathrm {E}}\) with \(N_{\mathrm{min}}\) when the correction term is fitted. The line of best fit to the final six values is shown, and our final estimate is plotted on the y-axis

Fig. 7: Variation of fitted value of \(D_{\mathrm {G}}\) with \(N_{\mathrm{min}}\) when the correction term is fitted. The line of best fit to the final six values is shown, and our final estimate is plotted on the y-axis

We now seek to calculate the amplitude ratio \(D_{\mathrm {E}}/D_{\mathrm {G}}\) directly from the ratio \(\langle R_{\mathrm {E}}^2 \rangle /\langle R_{\mathrm {G}}^2 \rangle \), where the corresponding data are found in Table 1. First we note that

$$\begin{aligned} \frac{\langle R_{\mathrm {E}}^2 \rangle }{\langle R_{\mathrm {G}}^2 \rangle }&= \frac{D_{\mathrm{E}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+31}{64\log (N/a)} + \cdots \right) }{D_{\mathrm{G}} N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))+97/3}{64\log (N/a)} + \cdots \right) } \end{aligned}$$
(12)
$$\begin{aligned}&= \frac{D_{\mathrm{E}}}{D_{\mathrm{G}}} \left( 1 + \frac{1}{48\log (N/a)} + \cdots \right) . \end{aligned}$$
(13)
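
The coefficient \(1/48\) can be checked directly: writing the two correction factors as \(1-x\) and \(1-y\), we have \((1-x)/(1-y) = 1 + (y-x) + \cdots \), with

$$\begin{aligned} y - x = \frac{97/3 - 31}{64\log (N/a)} = \frac{4/3}{64\log (N/a)} = \frac{1}{48\log (N/a)}. \end{aligned}$$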

In Fig. 8 we plot the ratio \(\langle R_{\mathrm {E}}^2 \rangle /\langle R_{\mathrm {G}}^2 \rangle \) versus \((\log N)^{-1}\); from the equation above we expect that the ratio will converge linearly to \(D_{\mathrm{E}}/D_{\mathrm{G}}\), and by performing a linear extrapolation we obtain the estimate \(D_\mathrm{E}/D_{\mathrm{G}} = 5.997(2)\).

We now fit the correction term in Eq. 13, allowing the parameter a to vary as before. For these fits, the residual standard deviation value declines from 70 when \(N_{\text {min}}= 2000\), to 1.7 when \(N_{\text {min}}= 2\times 10^{6}\), indicating that there is still a degree of systematic error due to neglected corrections to scaling even for the largest values of \(N_{\text {min}}\). Extrapolating the fits we obtain the estimate \(D_{\mathrm{E}}/D_{\mathrm{G}} = 6.00001(10)\) which is consistent with the expected value of 6 from the simple random walk, and considerably more accurate than the value of 6.0012(14) which was obtained from the ratios of the independently estimated amplitudes (Fig. 9).

Fig. 8: Variation of \(\langle R_{\mathrm {E}}^2 \rangle /\langle R_{\mathrm {G}}^2 \rangle \) with N. The line of best fit to the final six values is shown, and our final estimate is plotted on the y-axis. All data up to \(N=10^9\) are shown, and no error bars are visible because they are vanishingly small on the scale of the plot

Fig. 9: Variation of fitted value of \(D_{\mathrm {E}}/D_{\mathrm {G}}\) with \(N_{\mathrm{min}}\) when the correction term is fitted. The line of best fit to the final six values is shown, and our final estimate is plotted on the y-axis

Finally, we note that estimates for a vary dramatically between fits; this was already observed by Grassberger et al. [14]. In principle, a should have the same value regardless of the observable, and regardless of the number of terms used in the fits. We will not report the values obtained in full, but note that for the fits of \(\langle R_{\mathrm {E}}^2 \rangle \) the estimates of a change monotonically from 2.56 to 3.53 for the leading-order fits as \(N_{\text {min}}\) is increased, from 0.201 to 0.207 for the fits with a correction term, and from 0.219 to 0.231 for the fits with a correction term and \(\kappa \) biased to 1/4. The corresponding trends for the same fits of \(\langle R_{\mathrm {G}}^2 \rangle \) are 3.54 to 4.45, 0.273 to 0.254, and 0.265 to 0.263 respectively (the last sequence of estimates is not monotonic). We plot estimates for \(\kappa \) versus estimates for a for the fits with a correction term in Fig. 10, as these are the fits which most accurately describe the data. There we observe approximately linear covariance between the estimates, perhaps indicating that \(\kappa \) and a are simultaneously capturing the effect of unfitted higher-order corrections to scaling. The approximate agreement between estimates coming from \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \) could mean that these estimates of a are meaningful and converging to their true values, but we do not have a high degree of confidence in this statement. If the next correction-to-scaling term could be determined via theory, then the resulting fits would likely reveal whether convergence is truly occurring.

Fig. 10: Variation of fitted value of \(\kappa \) with fitted value of a when the correction term is fitted

4 Analysis of the Pivot Algorithm Acceptance Fraction

We now consider the behavior of f, the probability that a pivot move is successful, as a function of N. In the literature f is referred to as the acceptance fraction [21].

In [21], it is argued that a good first approximation to the probability that a pivot on a walk of 2N steps is successful is the probability that two independently chosen N-step walks are self-avoiding when concatenated. For SAWs in dimension \(d \ne 4\), it is believed that \(c_N \sim N^{\gamma -1} \mu ^N\), with \(\gamma = 43/32 = 1.34375\) for \(d=2\) [23], \(\gamma = 1.15695300(95)\) for \(d=3\) [9], and \(\gamma = 1\) for \(d \ge 5\) [15, 16]. This argument suggests that f is \(O(N^{1-\gamma })\), but in practice it is observed that \(f = O(N^{-p})\) with \(p \approx 0.19\) for \(d=2\) (cf. \(\gamma -1 \approx 0.344\)) and \(p \approx 0.11\) for \(d=3\) (cf. \(\gamma -1 \approx 0.157\)). Thus it seems that for \(d=2\) and \(d=3\) this heuristic is overly pessimistic, and the two parts of the walk are less likely to intersect than two independently chosen SAWs which are concatenated.

For \(d \ge 5\) the above argument strongly suggests that the probability of a pivot move being successful should be O(1), although to the best of our knowledge this has not been tested.

However, it is less clear what should happen for \(d=4\), and hence we attempt to extract information about the asymptotic behavior of f from the data. Firstly, the probability that two independently chosen N-step walks are self-avoiding when concatenated is exactly \(c_{2N}/c_N^2\). From Eq. 3 we expect that the asymptotic behavior of this ratio is given by:

$$\begin{aligned} \frac{c_{2N}}{c_N^2}&= \frac{A \mu ^{2N} [\log (2N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (2N/a))-3}{64\log (2N/a)} + \cdots \right) }{ \left( A \mu ^N [\log (N/a)]^{1/4} \left( 1 - \frac{17\log (4\log (N/a))-3}{64\log (N/a)} + \cdots \right) \right) ^2} \end{aligned}$$
(14)
$$\begin{aligned}&= \frac{1}{A}[\log (N/a)]^{-1/4} \left( 1 + \cdots \right) . \end{aligned}$$
(15)
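
The second line follows because the factors of \(\mu ^{2N}\) cancel, one factor of A survives in the denominator, and \(\log (2N/a) = \log (N/a) + \log 2\), so that

$$\begin{aligned} \frac{[\log (2N/a)]^{1/4}}{[\log (N/a)]^{1/2}} = [\log (N/a)]^{-1/4} \left( 1 + \frac{\log 2}{\log (N/a)} \right) ^{1/4} = [\log (N/a)]^{-1/4} \left( 1 + O([\log N]^{-1}) \right) . \end{aligned}$$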

This expression, together with the heuristic argument given above, implies that f should be no larger than \(O([\log N]^{-1/4})\) for \(d=4\).

Now, it is possible to calculate correction terms for Eq. 15, but because the probability of a pivot move being successful is, at best, only loosely related to \(c_{2N}/c_N^2\), there would not be any point in doing so. Instead, we fit f with a statistical model which is inspired by the functional form of Eq. 15, given by:

$$\begin{aligned} f&= [\log (N/a)]^{-\lambda } \left( 1 + O([ \log N ]^{-1}) \right) . \end{aligned}$$
(16)
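
This fit can be set up in the same manner as the earlier sketches (again with scipy.optimize.curve_fit standing in for R's nls, and hypothetical arrays N, f, ferr holding the data of Table 2):

```python
import numpy as np
from scipy.optimize import curve_fit

def acceptance_model(N, a, lam):
    """Leading behaviour of Eq. (16): f = [log(N/a)]^(-lambda)."""
    return np.log(N / a) ** (-lam)

# m = N >= N_min, as in the earlier fits:
# popt, pcov = curve_fit(acceptance_model, N[m], f[m], sigma=ferr[m],
#                        absolute_sigma=True, p0=(1.0, 0.25))
```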

We perform fits of the data for f in Table 2 for a and \(\lambda \), finding that the residual standard deviation of the fits declines from 66 when \(N_{\text {min}}= 2000\), to 2.1 when \(N_{\text {min}}= 2\times 10^{6}\). Clearly there are still significant systematic errors even for the largest value of \(N_{\text {min}}\), but the fact that the residual standard deviation is steadily declining, presumably towards 1, suggests that the statistical model is correct, and that deviations from it are due to large unfitted correction terms. The assumption that the first neglected term is of magnitude \(O([ \log N ]^{-1})\) appears reasonable: the fitted values behave, to a good approximation, linearly as they approach the y-axis. Extrapolating the trend results in the estimate \(\lambda = 0.257(2)\) (Fig. 11).

Fig. 11: Variation of \(\lambda \) with \(N_{\mathrm{min}}\) from fitting data for f

If the previous heuristic argument holds in a weakened sense, so that a pivot move is at least as likely to be successful as the concatenation of two independent SAWs, then this would imply that \(\lambda \le 1/4\). Since \(\lambda \) is very close to 1/4, and we are not able to imagine any plausible reason for fine tuning with \(\lambda \) close to, but different from, 1/4, we conjecture that \(\lambda = 1/4\) exactly.

As was the case for our leading-order fits of \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \), our estimated value is somewhat different from the conjectured exact value; but as there are likely to be strong corrections to scaling, it is perfectly reasonable to suppose that a better statistical model (such as we have for \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \)) would yield an estimate for \(\lambda \) closer to 1/4.

5 Discussion and Conclusion

We found that model fits are far poorer for four-dimensional SAWs than for comparable computer experiments involving SAWs in three dimensions [9, 10]. In those cases, fitting the leading correction-to-scaling term (or eliminating it by constructing an improved observable) resulted in “perfect” fits with residual standard deviation of approximately 1 for \(N_{\text {min}}\) of the order of thousands, whereas here we found that although fits were acceptable, there were still significant systematic errors even for \(N_{\text {min}}= 2\times 10^{6}\). This is perhaps not surprising given the occurrence of logarithmic corrections, but it does make the task of extracting reliable parameter estimates considerably more difficult.

Nonetheless, by designing the computer experiment to collect high quality data for N as high as \(10^9\), we have found that we are able to get a good handle on these logarithmic corrections. This was especially so when correction-to-scaling terms were included in the fits, but we had modest success in our analyses for the leading exponent of \(\langle R_{\mathrm {E}}^2 \rangle \) and \(\langle R_{\mathrm {G}}^2 \rangle \), \(\kappa \), and the exponent for the probability of a successful pivot move, \(\lambda \), even when only the leading behavior was fitted.

Our best estimate of \(\kappa = 0.2516(14)\) is consistent with the renormalization group estimate of \(\kappa = 1/4\). If further evidence were needed, the coincidence of our best estimate of the ratio \(D_{\mathrm {E}}/D_{\mathrm {G}}= 6.00001(10)\) with the simple random walk value of 6 is confirmation that four-dimensional SAWs are indeed at the boundary of the simple random walk universality class.

We should mention that the parameter a, found in Eqs. 3–5, which was expected to be a constant, turned out to vary between different choices of fits and of observables. This is perhaps not terribly surprising given the logarithmic corrections to scaling, but it does warrant further investigation. If a further correction-to-scaling term could be determined via theory, this may help resolve the question of whether estimates of a from different observables are converging.

We studied the behavior of the pivot algorithm for four-dimensional SAWs. We found that the probability of a pivot move being successful, f, for a SAW of length N, is consistent with \(f = O([\log N]^{-1/4})\), and conjectured that this relation is exact.

Finally, one of our key motivations for studying four-dimensional SAWs was to act as a test-case for attacking problems with logarithmic corrections to scaling, of which the \(\theta \)-transition in three dimensions [5, 13, 25] is another example. Although the capacity for quantitative understanding of the logarithmic corrections for four-dimensional SAWs would seem to be a necessary pre-condition for coming to grips with the \(\theta \)-transition, it is not sufficient. This is because the pivot algorithm for SAWs is remarkably efficient, whereas at the present time no comparably efficient method exists for sampling interacting self-avoiding walks (ISAWs) in the vicinity of the \(\theta \)-transition. If progress is to be made for the quantitative understanding of the \(\theta \)-transition, then significant improvements will need to be made in the efficiency of Monte Carlo sampling algorithms.