
1 Introduction

The formation and characterization of self-assembled monolayers (SAMs) on solid surfaces have been extensively studied for several decades. The easy preparation of SAMs with different terminal chemical functionalities has made them convenient for numerous and far-reaching applications, including bio-related technologies such as biosensors and medical implants, nano- and microfabrication, nanodevices, and corrosion protection. Experimental microscopy studies have long shown that SAMs have high concentrations of defects [1–3]; in some cases, as with the nanofabrication method of microcontact printing, naturally occurring imperfections in the SAMs were shown to play a beneficial role in the process [4]. In most cases, however, defects in the monolayers can have unexpected and perhaps undesirable consequences. Two commonly occurring defects arise from imperfections in the substrate (leading to increased surface roughness after self-assembly) and imperfections in the self-assembly process (i.e., the so-called film defects).

Though molecular simulation can offer unique insights into the consequences of SAM structural imperfections, such studies have only rarely been performed [5–9]; limitations of small simulation cell sizes and/or insufficient sampling times have prevented the explicit exploration of defects in typical SAM modeling studies [4]. We have employed the enhanced sampling method parallel tempering metadynamics using the well-tempered ensemble (PTMetaD-WTE), which we have applied successfully in several prior studies of peptide/protein adsorption at interfaces [10–12]. A description of other simulation approaches to studying these types of problems can be found elsewhere [11].

In this work, we build on our prior simulations [11] of the model peptide LKα14 [13] adsorbing onto an ideal SAM. Past work focused on obtaining structural and thermodynamic information on adsorbed peptides, with a specific emphasis on quantitative comparison to experimental measurements of side chain orientation. However, the systems studied were highly idealized because they lacked SAM structural imperfections. Here, we take the logical next step by studying the impact of incorporating surface defects and provide new insights into the consequences of SAM imperfections on the structure and binding thermodynamics of adsorbed biomolecules. We have performed a series of molecular dynamics (MD) simulations with PTMetaD-WTE of LKα14 adsorbing onto a carboxyl-terminated alkanethiol SAM with naturally occurring substrate and film defects incorporated to mimic experimental observations. In addition to the simplicity of the peptide (the alpha helix organizes the side chains into a hydrophobic face and a charged, hydrophilic face, with sequence LKKLLKLLKKLLKL), this combination of surface and peptide was chosen owing to the ease with which future experiments could be performed to further understand defects in SAMs. With an idealized SAM as a control, two types of defects are introduced: a gold depression that creates shortened alkyl chain lengths to mimic a characteristic defect in the underlying gold substrate, and a characteristic film defect arising from faulty packing of the SAM (i.e., chains pointed toward and away from each other), creating domain boundary effects. We also used an advanced clustering analysis and reweighting technique to reveal large differences in surface-bound peptide conformations caused by the presence and type of incorporated SAM defect. As we discuss, this analysis is quite general and can be applied to any type of biased protein/surface simulation.

2 Methods

2.1 System Setup

System specifications are reported in Table 1, including information from a control simulation without defects from our past work [11]. Systems consisted of one LKα14 peptide, a SAM surface functionalized with a carboxylic acid/carboxylate head group, explicit TIP3P waters, and sodium ions to achieve system charge neutrality. The LKα14 peptide structure was generated with the VMD psfgen plug-in [14], and the defect-free SAM surface was based on our prior studies. LKα14 was capped with a deprotonated carboxylate group to match experimental conditions [15–23], giving the peptide an overall charge of +5. Two types of defects were introduced into the SAM surfaces. The first type of defect mimics an experimentally observed defect in the underlying gold substrate where depressions in the gold layer lead to shortened alkyl chain lengths (hereafter referred to as a “Type I” defect, see Fig. 1).

Table 1 Setup of PTMetaD-WTE simulations
Fig. 1

Side view of LKα14 (side chains shown in space-filling representation and hydrogen not included) on a SAM surface with a Type I substrate defect causing areas of shortened alkyl chain lengths. The +z direction is orthogonal to the SAM surface and the +x direction is out of the plane of the page. Chains are colored to highlight frozen atoms (silver frozen CH2 atoms) and head group atoms allowed to remain free during MD simulation (teal carbon, yellow hydrogen, and red oxygen)

The original surface consisted of 100 randomly alternating protonated and deprotonated chains in a 1:1 ratio to mimic a bulk pH of 7.4 [24]. Fifty chains were randomly mutated to reduce their alkyl chain lengths from 12 to 8 carbons. The same force field parameters were used for the head groups of both the healthy and mutated chains, leaving the overall surface charge of −50 unaffected. Force field parameters were taken from the AMBER99SB-ILDN force field [25] (i.e., COOH/COO⁻ from glutamic acid/glutamate). Triplicate systems were set up in this manner; distributions of the healthy/mutated chains for the three systems are shown in Fig. 2.

Fig. 2

Distribution of healthy to defective (i.e., short) chains for the three Type I defect simulation trials. The +z direction is out of the plane of the page. Cyan and magenta represent healthy and defective chains, respectively

The second type of defect mimics a characteristic film defect that occurs during SAM self-assembly, where alkyl chains pointing in opposite directions lead to domain boundary effects (hereafter, “Type II” defect, see Fig. 3). To introduce this defect while still maintaining the original R3 geometry and 30° normal tilt angle of the SAM chains [26], it was necessary to remove 30 of the original 100 chains. A portion of the remaining chains was then rotated about the chains’ centers of mass (minus the head groups), creating both the outward and the inward defects shown from left to right in Fig. 3. To prevent spurious interactions with the thiol group exposed at the base of the inward boundary defect, thiol groups were removed from the original surface. As all simulations used periodic boundary conditions in the x, y, and z dimensions to allow for electrostatic calculations with the particle mesh Ewald (PME) method [27], the peptide could interact with water in the triangular regions marked in blue in Fig. 3.

Fig. 3

Side view of LKα14 (side chains shown in space-filling representation and hydrogen not included) on a SAM surface with a Type II film self-assembly defect causing inward and outward boundary effects. The +z direction is orthogonal to the SAM surface and the +y direction is out of the plane of the page. Chains are colored to highlight frozen atoms (silver frozen CH2 atoms) and head group atoms allowed to remain free during MD simulation (teal carbon, yellow hydrogen, and red oxygen)

Simulations used the GROMACS 4.6.5 MD engine [28] with the AMBER99SB-ILDN force field [25] and the PLUMED 2.0 plug-in [29]. Box heights were chosen to permit diffusion of the peptide beyond the short-range van der Waals and Coulombic cutoff distances of 1.0 nm to experience a bulk-like state. The peptide was prevented from interacting with the image of the surface by placing a harmonic restraint on its center of mass that began acting on the peptide at a z-distance of 4.5 nm from the top of the surface. Energy minimization was performed on all surfaces with a steepest descent algorithm for 40,000 steps, followed by the minimization of the solvated peptide/surface systems where the first 6 and 10 CH2 groups were frozen for the mutated and healthy SAM chains, respectively. Chains were frozen to prevent diffusion or melting at high temperatures and remained frozen in all ensuing simulations while movement of the head groups was unrestricted.

2.2 PTMetaD-WTE Simulations

Due to the strong binding forces that exist between the peptide and surfaces, the use of a multiscale modeling algorithm to overcome sampling challenges is essential. This type of algorithm, as applied to protein adsorption, should (1) have strong atomistic detail (e.g., be based on MD or other molecular techniques), (2) be scalable to systems of practical size, and (3) allow for quantitative comparison with experiments (e.g., in resolving the conformation and orientation of adsorbed proteins for comparison with, for example, SFG results). A method that can address all of these challenges is metadynamics (MetaD) [30, 31], which works by applying a history-dependent bias to one or more collective variables (CVs) that describe the underlying changes in a system (e.g., interfacial versus solution state structure of biomolecules in an adsorption process) in reduced dimension:

$$V\left( s(r),t \right) = W\sum\limits_{t^{\prime} = \tau_{\text{G}}, 2\tau_{\text{G}}, \ldots}^{t^{\prime} < t} \prod\limits_{i = 1}^{N_{\text{CV}}} \exp \left[ - \frac{\left( s_{i}(r) - s_{i}(r(t^{\prime})) \right)^{2}}{2\sigma_{i}^{2}} \right]$$
(1)

The bias potential, V(s, t), is added to the overall potential energy and is repulsive, Gaussian-shaped, and centered on the CV at the time of addition. This results in a net force that discourages the system from revisiting previously explored states and instead encourages it to explore new regions of CV space. To achieve smooth convergence of the bias potential, we use the well-tempered variant of metadynamics (WTM) [32]:

$$W^{\prime } = \omega^{*} \exp \left[ { - \frac{V(s,t)}{{k_{\text{B}} \Delta T}}} \right]$$
(2)

In Eq. (1), the number of CVs is given by N_CV, and the value of each CV is defined by a functional mapping, s(r), that relates the CV to the system’s geometry. Gaussian “hills” are added every τ_G time steps with characteristic height W and width σ. WTM leads to an exponential decrease in the amount of bias added to previously explored regions of phase space (Eq. 2), where k_B is the Boltzmann constant. The instantaneous hill height, W′, equals ω* wherever no bias has yet been deposited (V = 0), so ω* plays the role of the initial hill height; its decay is controlled by an adjustable parameter ΔT that is related to the characteristic barrier heights in the system. In a post-processing manner, the cumulative bias from the simulation can be inverted to obtain the underlying free energy surface (FES) as projected onto the CVs [33].
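The interplay of Eqs. (1) and (2) can be illustrated with a minimal one-CV sketch. This is not the PLUMED implementation used in this work; the function names `bias` and `deposit` are hypothetical, and the numerical values (hill width, initial height, ΔT) are chosen only for illustration. It shows how repeatedly depositing hills at the same CV value makes each new hill smaller than the last.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def bias(s, hills, sigma):
    """Sum of deposited Gaussians evaluated at CV value s (Eq. 1 with one CV)."""
    return sum(h * math.exp(-(s - c) ** 2 / (2.0 * sigma ** 2)) for c, h in hills)

def deposit(s, hills, sigma, w0, delta_t):
    """Add one hill centered at s, with its height scaled as in Eq. 2."""
    height = w0 * math.exp(-bias(s, hills, sigma) / (KB * delta_t))
    hills.append((s, height))
    return height

# Illustrative values only (assumptions, not the simulation settings):
# delta_t = 2700 K corresponds to a bias factor of 10 at T = 300 K.
hills = []
heights = [deposit(1.0, hills, sigma=0.05, w0=2.0, delta_t=2700.0) for _ in range(5)]
```

In this toy case the first hill has the full initial height, and every later hill is smaller because the accumulated bias at that CV value has grown.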

Despite its capacity to greatly enhance conformational sampling, MetaD is limited by the inability of the chosen CVs to account for hidden degrees of freedom in the system. This can be addressed with the use of parallel tempering (PT) [34, 35], which manipulates some or all degrees of freedom in a more general way (e.g., by increasing the temperature of the system). PT runs many parallel simulations, or “replicas”, of the system that span a wide temperature range and exchange configurations periodically according to the Metropolis criterion. In this way, PT can be combined with metadynamics (PTMetaD [36]) to both increase the exploration of CV space and overcome hidden energy barriers.
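For readers unfamiliar with the exchange step, the Metropolis criterion for swapping two temperature replicas can be sketched as follows. This is the textbook form for unbiased parallel tempering, given here as an assumption-laden illustration: in PTMetaD the bias potentials of the two replicas also enter the acceptance ratio, which this sketch omits. The function name `exchange_probability` is hypothetical.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def exchange_probability(u_i, u_j, t_i, t_j):
    """Acceptance probability for swapping the configurations of replicas i and j.

    u_i, u_j: potential energies (kJ/mol); t_i, t_j: temperatures (K).
    Unbiased PT only; metadynamics bias terms are deliberately left out.
    """
    beta_i = 1.0 / (KB * t_i)
    beta_j = 1.0 / (KB * t_j)
    delta = (beta_i - beta_j) * (u_i - u_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Toy energies: the colder replica holding the higher energy is always swapped.
p_always = exchange_probability(-1000.0, -1010.0, 300.0, 310.0)
p_partial = exchange_probability(-1010.0, -1000.0, 300.0, 310.0)
```

The acceptance falls off exponentially as the energy gap between neighboring replicas grows, which is why overlap between the energy distributions of adjacent temperatures matters.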

The addition of sampling in the well-tempered ensemble (WTE) [37] provides an efficiency boost to the method, which has been discussed elsewhere [10]. The WTE algorithm works by using the potential energy itself as a CV and amplifying energy fluctuations (while leaving average energies of the original ensemble untouched) to increase overlap in the energy distributions of adjacent temperatures. This in turn increases the frequency of exchange between replicas and thus increases the overall efficiency of the method. The degree of amplification of the energy fluctuations is controlled via the same adjustable parameter ΔT. However, the WTE bias of the simulation is generally constructed with a different value of this parameter. Commonly, ΔT is rewritten as γ, called the bias factor, where γ = (ΔT + T)/T [31].
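As a worked example of the bias-factor relation, rearranging γ = (ΔT + T)/T gives ΔT directly. The bias factors of 20 (WTE stage) and 10 (production) are those quoted for the simulations in this work; evaluating them at T = 300 K is only an illustrative choice, since each replica runs at its own temperature.

$$\Delta T = (\gamma - 1)\,T, \qquad \gamma = 20:\ \Delta T = 19 \times 300\ \text{K} = 5700\ \text{K}, \qquad \gamma = 10:\ \Delta T = 9 \times 300\ \text{K} = 2700\ \text{K}$$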

PTMetaD-WTE was used with the same procedure described in past work [11], including the use of a new functionality in PLUMED 2.0 [29] to provide a slight improvement to the method. Spanning a range of 300–450 K, 12 configurationally identical replicas were simulated in a short, 1 ns NVT PT simulation to equilibrate each replica at its respective temperature. A 10 ns WTM simulation biasing the potential energy was then performed to establish the WTE to increase sampling efficiency through increasing the spread in the system’s potential energy. A bias factor of 20 was used in all WTM simulations with Gaussian hills added every ps with a width of 200 kJ/mol at an initial height of 2.0 kJ/mol.

Production runs biased two CVs for LKα14 with an additional two-dimensional MetaD bias potential. As with past work [11], the first CV biased the distance between LKα14’s center of mass (COM) and the surface, whereas a second conformational CV biased the number of backbone α-helical hydrogen bond contacts. A switching function with a reference bond length of 0.25 nm was used to define the degree of the contact, which was defined between α-helical hydrogen bond donor/acceptor pairs along the peptide backbone (i.e., i, i + 4 pairs). The distance and conformational CVs were biased with Gaussian hill widths of 0.05 nm and 0.1 (the contact count is dimensionless), respectively. A bias factor of 10 was used in all PTMetaD-WTE simulations with Gaussian hills added every ps at an initial bias deposition rate of 3.0 kJ/mol/ps.
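To make the contact CV concrete, the sketch below sums a smooth switching function over a list of donor/acceptor distances. The functional form is an assumption: the text states only the reference length of 0.25 nm, so a rational switching function with exponents 6 and 12 is used here for illustration. The names `switching` and `helical_contacts` and the toy distances are hypothetical.

```python
def switching(r, r0=0.25, n=6, m=12):
    """Rational switching function: near 1 for r << r0, near 0 for r >> r0.

    The form and the exponents n, m are assumed; only r0 = 0.25 nm is from the text.
    """
    x = r / r0
    if abs(x - 1.0) < 1e-12:
        return n / m  # analytic limit at r = r0, avoiding 0/0
    return (1.0 - x ** n) / (1.0 - x ** m)

def helical_contacts(distances, r0=0.25):
    """Continuous count of i, i + 4 hydrogen bond contacts from pair distances (nm)."""
    return sum(switching(r, r0) for r in distances)

# Toy distances in nm: two formed hydrogen bonds and one broken one.
n_contacts = helical_contacts([0.20, 0.20, 0.60])
```

Because each pair contributes a value between 0 and 1 rather than a hard 0/1 count, the CV is differentiable, which is what allows a bias force to be applied to it.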

3 Results and Discussion

3.1 Convergence of MetaD Simulations

To assess convergence of the PTMetaD-WTE simulations, the free energy difference between the adsorbed and solvated states was calculated as a function of time. Convergence was established when the change in the free energy difference became negligible with time. Figure 4 shows the change in the Helmholtz binding energy as a function of simulation time for each of the systems listed in Table 1. All simulations were initially run for 200 ns per replica, and all Type I defect simulations were deemed converged by the end of that time period. The Type II defect simulation was extended by 50 ns per replica to achieve convergence. Figure 4 shows that both the type of defect (i.e., Type I vs. Type II) and the distribution of the defects (i.e., Type I, trials I–III) impact the final value of the free energy change upon binding as compared to the control simulation.
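One common way to obtain such a free energy difference from a one-dimensional free energy profile is to integrate the Boltzmann weight over the adsorbed and solvated regions. The sketch below assumes a uniformly spaced distance grid and a user-chosen dividing distance `z_cut`; both the function name `binding_free_energy` and the toy profile are hypothetical, and the exact state definitions used for Fig. 4 are not specified here.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def binding_free_energy(z, fes, z_cut, temperature=300.0):
    """Free energy of the adsorbed state (z < z_cut) relative to the solvated state.

    z: distances on a uniform grid (nm); fes: free energies (kJ/mol).
    A negative value means adsorption lowers the free energy.
    """
    beta = 1.0 / (KB * temperature)
    f_min = min(fes)  # shift energies for numerical stability
    p_ads = sum(math.exp(-beta * (f - f_min)) for zi, f in zip(z, fes) if zi < z_cut)
    p_sol = sum(math.exp(-beta * (f - f_min)) for zi, f in zip(z, fes) if zi >= z_cut)
    return -math.log(p_ads / p_sol) / beta

# Toy profile: a flat 20 kJ/mol well near the surface, flat bulk beyond the cutoff.
delta_f = binding_free_energy([0.5, 1.0, 2.0, 3.0], [-20.0, -20.0, 0.0, 0.0], z_cut=1.5)
```

Evaluating this quantity on the free energy estimate at successive simulation times gives a curve that flattens once the bias has converged.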

Fig. 4

Convergence of free energy differences between solvated and adsorbed states for PTMetaD-WTE simulations at 300 K. The negative value implies a decrease in free energy upon adsorption

3.2 Clustering of Surface-Bound Structures

Figure 5 shows the Helmholtz energy as a function of distance between LKα14 (Cα center of mass (COM)) and the surface (frozen C10 atom) for each simulation listed in Table 1. Figure 5c shows the minimum peptide/surface distance for the control simulation is approximately 1 nm; therefore, any minima in Fig. 5a, b below 1 nm represent binding to defective areas of the SAM surfaces.

Fig. 5

Helmholtz free energy as a function of LKα14/SAM distance for PTMetaD-WTE simulations at 300 K: (a) Type I defect simulations, trials I–III; (b) Type II defect simulation, energy minima highlighted in inset; and (c) control simulation. Note that the relative energy scale is arbitrary owing to the trivial constant introduced in the estimation of the free energy from the MetaD bias potential

To determine the effect of the defects on peptide adsorption, an RMSD-based clustering algorithm [38] was used to extract the most dominant structures in each of the wells in Fig. 5. The algorithm works by first removing external translational and rotational motions so that only the internal structural fluctuations are characterized. A least-squares alignment between all unique pairs of structures is then performed and an RMSD value is calculated for each pair. For each structure, other structures that fall below a given cutoff value in RMSD are assigned as “neighbors”. The structure with the largest number of neighbors, together with all of its neighbors, is assigned a cluster number and removed from the pool of structures. The process is then repeated for the remaining structures until each has been assigned to a cluster.
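The neighbor-counting procedure described above can be sketched as follows, assuming the pairwise RMSD matrix has already been computed after alignment. This is a minimal illustration and not the implementation of [38]; the function name `neighbor_clustering` is hypothetical, ties are broken by lowest index as an arbitrary choice, and the toy "RMSD" matrix is built from points on a line.

```python
def neighbor_clustering(rmsd, cutoff):
    """Cluster structures from a symmetric pairwise RMSD matrix (list of lists).

    Returns (labels, centers): a cluster index for every structure and the
    index of the central structure of each cluster, in order of creation.
    """
    remaining = set(range(len(rmsd)))
    labels = [-1] * len(rmsd)
    centers = []
    while remaining:
        # Pick the unassigned structure with the most unassigned neighbors.
        center = max(
            sorted(remaining),
            key=lambda i: sum(1 for j in remaining if rmsd[i][j] < cutoff),
        )
        members = {j for j in remaining if rmsd[center][j] < cutoff} | {center}
        for j in members:
            labels[j] = len(centers)
        centers.append(center)
        remaining -= members  # remove the whole cluster from the pool
    return labels, centers

# Toy data: absolute differences between points stand in for RMSD values (nm).
points = [0.0, 0.05, 0.1, 1.0, 1.05, 3.0]
rmsd = [[abs(a - b) for b in points] for a in points]
labels, centers = neighbor_clustering(rmsd, cutoff=0.2)
```

The sketch recounts neighbors on every pass, so its cost grows quickly with the number of structures; for trajectories with tens of thousands of frames an optimized analysis tool is the practical choice.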

An important point should not be overlooked. The clusters obtained in the manner described above are obtained from biased MD trajectories. Therefore, it is impossible to directly compute relative cluster weights or probabilities only using the output of a clustering analysis. Instead, we employed a previously demonstrated reweighting technique [39] that makes use of the classic Torrie-Valleau umbrella sampling reweighting approach [40] with statistical weights calculated according to Eq. (3):

$$w = \exp \left( { V_{\text{bias}} \beta } \right)$$
(3)

where β = 1/(k_B T) and V_bias is the final MetaD bias potential, treated as a static biasing potential. We note for interested readers that this analysis is readily performed within PLUMED/GROMACS by using the “-rerun” functionality of the MD engine along with the final MetaD bias (e.g., the “HILLS” file) and the MD trajectory (i.e., in this case, the 300 K replica trajectory from the sampling scheme). Care should be taken to avoid using the portion of the trajectory that corresponds to the MetaD transient. In this case this is not an issue, as we clustered only the second half of the trajectories, far beyond the end of the transient period. With the proper statistical weights in hand for the trajectory of surface-bound structures, the final probability of each cluster is calculated by summing the individual weights (Eq. 3) of the members of each cluster and normalizing by the total weight of all clustered structures.
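The final step, turning per-frame weights into cluster probabilities, can be sketched as below. It assumes the bias energy of each clustered frame has already been evaluated (for example, from a rerun over the trajectory) and that each frame carries a cluster label; the function name `cluster_probabilities` and the toy inputs are hypothetical.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def cluster_probabilities(labels, v_bias, temperature=300.0):
    """Reweighted probability of each cluster from per-frame bias energies (kJ/mol).

    Each frame receives the weight exp(beta * V_bias) of Eq. 3; weights are
    summed per cluster and normalized by the total weight.
    """
    beta = 1.0 / (KB * temperature)
    v_max = max(v_bias)  # constant shift cancels on normalization, avoids overflow
    weights = [math.exp(beta * (v - v_max)) for v in v_bias]
    total = sum(weights)
    probs = {}
    for label, w in zip(labels, weights):
        probs[label] = probs.get(label, 0.0) + w / total
    return probs

# Toy case: cluster 0 has two frames, cluster 1 has one frame with a higher bias.
toy_bias = [0.0, 0.0, KB * 300.0 * math.log(2.0)]
probs = cluster_probabilities([0, 0, 1], toy_bias)
```

In this toy case the raw frame counts favor cluster 0 by two to one, yet the reweighted probabilities are equal, which illustrates why cluster populations from a biased trajectory cannot be read off directly.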

The analysis was first performed on the trial III Type I defect simulation; since Fig. 5a shows similar free energy profiles for the three trials, we deemed analysis of a single trial to be sufficient. Skipping every second frame to reduce computation time, surface-bound structures (defined as peptide/surface distances below 1.2 nm) were clustered with an RMSD cutoff value of 0.2 nm. As noted above, we used only the second half of the trajectory for the clustering analysis to eliminate the transient part of the MetaD bias potential. Among 39,696 structures, 78 clusters were determined. The control simulation was analyzed in a similar manner, resulting in 29 clusters from 29,848 surface-bound structures. The central conformation of each cluster, the so-called cluster centers, for the top three weighted clusters for each of these simulations, along with their respective weights, is shown in Fig. 6. Both top and side views are included for the Type I defect simulation to highlight binding to either normal or shortened alkyl chain lengths.

Fig. 6

Top three surface-bound cluster center conformations from a clustering analysis of the Type I, trial III defect simulation compared to the control simulation with no chain defects. Secondary structure is indicated by peptide backbone color: Purple designates an α-helix, magenta a turn, and cyan a random coil. Silver and pink represent healthy and defective chains, respectively

The first thing to note is the difference in cluster distribution between the defect and the control simulations: Conformations in the top three clusters of the defect simulation make up about 81 % of the total probability of surface-bound states, whereas conformations in the first cluster alone in the control simulation have a similar probability of existing on the surface of just over 78 %. As Fig. 6 shows, this is because areas of shortened alkyl chain lengths caused by depressions in the gold substrate below the SAM surface dramatically disrupt the helical structure that LKα14 normally adopts at interfaces, leading to a wide array of unfolded structures. Nearly all secondary structure, indicated by the color of the peptide’s backbone (i.e., magenta, cyan, and purple indicate turns, coils, and alpha helical residues, respectively), is lost with the addition of the surface defects. Unlike the central cluster conformations from the control simulation, those from the defect simulation appear to have little in common apart from a tendency toward unstructured coils, which makes sense as defective chains are randomly distributed across the surface.

The same analysis was performed on the Type II defect simulation for each of the three energy minima highlighted in Fig. 5b (i.e., A, B, and C). These minima are related to the presence of the outward boundary defect (see Fig. 3); the inward boundary defect appears to have little influence on binding. Structures within ±σ of each minimum were clustered with an RMSD cutoff of 0.2 nm. This resulted in 9,885 structures in 11 clusters for minimum A, 41,203 structures in 23 clusters for minimum B, and 14,710 structures in 9 clusters for minimum C. The cluster centers of the three highest-weighted clusters for each minimum are shown in Fig. 7.

Fig. 7

Top three surface-bound cluster center conformations from a clustering analysis of the Type II defect simulation for each energy minimum highlighted in Fig. 5b. Secondary structure is indicated by peptide backbone color: Purple designates an α-helix, blue a 3₁₀-helix, magenta a turn, and cyan a random coil

Similar to the Type I defect results, conformations in the first cluster of energy minimum A make up about 60 % of all surface-bound states. As the distance between the peptide and the surface increases to correspond to energy minima B and C, however, the cluster distributions become tighter (i.e., over 95 % of all surface-bound structures reside in the top weighted cluster), similar to what was observed with the control simulation. The trends make sense given that the results for energy minimum C should most closely represent those of the control simulation due to the particular peptide/surface distance.

Deep in the hydrophobic cleft (i.e., minimum A) highly extended conformations of LKα14 are stabilized compared to structures in the control simulation, which we believe is due to the shape of the defect. Figure 5b shows binding in the pocket of minimum A is stronger than that for minimum B and much stronger than that for minimum C on top of the surface, which, as mentioned earlier, should most closely resemble the control simulation. Some α-helicity is retained on top of the surface (i.e., minimum C), as indicated by the purple color of the peptide’s backbone in the cluster center conformations. However, even the mere presence of the defect causes the peptide to extend over the edge of the surface into the cavity, thereby affecting the normally helical structure of LKα14.

4 Summary/Conclusion

The enhanced sampling method PTMetaD-WTE was employed to simulate the adsorption of LKα14 to a model hydrophilic SAM with a carboxylate/carboxylic acid-terminated head group and two types of induced surface defects. Naturally occurring defects were chosen to best mimic what has been observed experimentally and included both a substrate defect and a characteristic SAM film defect. Results of free energy versus peptide/surface distance showed a difference in the location of the free energy minima for the surfaces with defects compared to a control surface with no defects. The results also indicated binding to the surface with the characteristic film defect (“Type II” defect) is much stronger than binding to the control surface, which we hypothesized is due to the specific shape of the hydrophobic cleft defect.

A clustering analysis was performed to elucidate structural differences in the bound peptide caused by the surface defects. Results showed the presence of either type of defect heavily disrupts the helical structure that LKα14 normally adopts at interfaces. In performing this analysis, peptide structures were extracted from basins, aligned, and clustered, and thus, orientation of the peptides with respect to the surface was not taken into account, only the conformation. In this case, it was not important to distinguish between orientations because charged or hydrophobic side chains dominate the surface-bound orientations. However, prior to reweighting it would be trivial to extend the clustering analysis to distinguish between orientations by subdividing further to, for example, distinguish between hydrophobic/hydrophilic patches on a peptide or protein or using other directional descriptors to account for protein orientation in conjunction with the conformational clusters.

This work will also have implications for future experimental work. Surface-guided self-assembly of proteins is growing in interest; the observed effects on peptide structure from relatively small changes in surface roughness suggest careful design of the electrostatic and van der Waals interactions at the protein/surface interface may be required. Additionally, this method could be used as a means to reverse engineer protein structure by designing and incorporating specific surface defects to control the structure of biomolecules upon adsorption.

Finally, we note that the predictions from these simulations could be directly probed with surface spectroscopies such as sum frequency generation (SFG) spectroscopy [16]. Provided that SAMs of mixed chain lengths could be self-assembled, we predict that LKα14 adsorbed on them would show no appreciable SFG signal, in contrast to LKα14 on neat SAMs, which reveals the expected helical structure. Likewise, using a combination of techniques such as surface plasmon resonance (SPR) and atomic force microscopy (AFM) [41], we propose it would be possible to study the expected increases in binding energy due to the film formation defects. Of course, this would depend on being able to synthesize the film-type defects in a controlled way.