Background

Tumor necrosis factor (TNF) belongs to the cytokines, a group of cell signalling proteins with complex biological functions. It has been extensively investigated for more than 40 years, since it was found to have anti-tumoral effects in vivo. TNF is able to induce fever, apoptotic cell death, tumor regression, cachexia and inflammation [1], to inhibit tumorigenesis and viral replication, and to respond to sepsis via IL1- and IL6-producing cells [2–4]. Its function is so versatile that even a relationship between serum TNF activity and insulin resistance in dairy cows with fatty liver has been found [5]. Although its multiple functions in vivo are mostly understood, the underlying mechanisms at the cellular and, even more so, the molecular level are not yet completely understood. TNF exerts its function through interaction with its membrane receptor, TNF-R. Because TNF has such complex but vital functions, it is crucial to identify the features within the TNF molecule that are relevant to each of these functions. Once these features are identified, it becomes possible to predict mutations that enhance or diminish the observed function(s), as well as to design new peptides that mimic a specific protein function. The resonant recognition model (RRM) is a unique computational approach that can fulfil this task [6]. The RRM is a physical/mathematical model based on the finding that certain periodicities in the distribution of free electron energies along protein, DNA and RNA are strongly correlated with the biological function of these macromolecules [7–10]. Using this finding, it is possible to computationally analyse protein biological function(s), predict bioactive mutations, and design new peptides/proteins with a desired biological function [7–9].

Methods

Resonant Recognition Model

All proteins can be considered as a linear sequence of their constitutive elements, i.e. amino acids, and the biological function of a protein is determined primarily by this linear sequence. The RRM [7–9] interprets this linear information by transforming a protein sequence into a numerical series and then into the frequency domain using a digital signal processing method, the Fast Fourier Transform (FFT).

Protein primary structure can be presented as a numerical series by assigning the relevant physical parameter value to each amino acid. Our investigations have shown that the best correlation is achieved with parameters related to the energy of delocalised electrons of each amino acid [calculated as the electron–ion interaction potential (EIIP)], as electrons delocalised from a particular amino acid have the strongest impact on the electronic distribution of the whole protein [7–9, 11]. The resulting numerical series represents the distribution of the free electron energies along the protein molecule.

Such numerical series are then analysed by digital signal analysis methods, using the FFT, in order to extract information pertinent to the biological function. As the distance between amino acid residues in a polypeptide chain is 3.8 Å, it can be assumed that the points in the numerical sequence are equidistant. For further numerical analysis, the distance between points in these numerical sequences is set at an arbitrary value of d = 1. Therefore, the maximum frequency in the spectrum is F = 1/(2d) = 0.5. The total number of points in the sequence influences only the resolution of the spectrum: for an N-point sequence, the resolution is equal to 1/N, and the n-th point in the spectral function corresponds to the frequency f = n/N.
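The transformation described above can be illustrated with a minimal Python sketch. The EIIP values below are those commonly tabulated in the RRM literature and are an assumption here; they should be checked against the table used in the cited work. The function names `eiip_series` and `rrm_spectrum` are hypothetical, and subtracting the mean before the transform is our own choice to suppress the zero-frequency component.

```python
import numpy as np

# EIIP values (in Rydberg) as commonly tabulated in the RRM literature.
# Assumed here for illustration; verify against the cited sources.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def eiip_series(sequence):
    """Map a one-letter amino acid sequence to its EIIP numerical series."""
    return np.array([EIIP[aa] for aa in sequence], dtype=float)

def rrm_spectrum(sequence):
    """Return (frequencies, amplitudes) of the single-sequence spectrum."""
    x = eiip_series(sequence)
    x = x - x.mean()                        # assumption: drop the constant offset
    amplitudes = np.abs(np.fft.rfft(x))     # one-sided amplitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0)  # f = n/N with d = 1, up to 0.5
    return freqs, amplitudes

# Example with the 60-residue designed sequence quoted later in the text.
freqs, amps = rrm_spectrum(
    "IEPKWQTRDDDDRCQYHPGINEHAWMRDDDDDRMWAHENLNPHYQCRDDDDRFQWKPEII"
)
```

For this 60-point sequence the spectrum has a resolution of 1/60 and ends at the maximum frequency of 0.5, in line with the relations given above.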

In order to extract common spectral characteristics of sequences having the same or similar biological function, the multiple cross-spectral function is used. Peak frequencies in such a multiple cross-spectral function represent frequency components common to all sequences analysed. Such common frequency components are found to be related to the common biological function of the analysed proteins, leading to the conclusion that each specific biological function within a protein, DNA or RNA is characterised by one frequency [7–10, 12–21].
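As a rough illustration of the multiple cross-spectral function, the sketch below multiplies the amplitude spectra of several numerical series point by point, so that only frequency components present in all of them survive. Zero-padding every series to the length of the longest one is our assumption for handling unequal lengths, and the names `multiple_cross_spectrum`, `peak_frequency` and `toy_series` are hypothetical. The input is synthetic, not protein data.

```python
import numpy as np

def multiple_cross_spectrum(series_list):
    """Point-wise product of the amplitude spectra of several series."""
    n = max(len(s) for s in series_list)
    product = np.ones(n // 2 + 1)
    for s in series_list:
        x = np.asarray(s, dtype=float)
        x = x - x.mean()                        # drop the constant offset
        product *= np.abs(np.fft.rfft(x, n=n))  # rfft zero-pads to length n
    return np.fft.rfftfreq(n, d=1.0), product

def peak_frequency(freqs, spectrum):
    """Frequency of the highest peak, ignoring the zero-frequency bin."""
    return float(freqs[1 + np.argmax(spectrum[1:])])

def toy_series(length, other_freq):
    """Synthetic series: a shared component at 0.05 plus one unshared component."""
    k = np.arange(length)
    return np.cos(2 * np.pi * 0.05 * k) + 0.5 * np.cos(2 * np.pi * other_freq * k)

freqs, spectrum = multiple_cross_spectrum(
    [toy_series(100, 0.2), toy_series(80, 0.3), toy_series(120, 0.4)]
)
peak = peak_frequency(freqs, spectrum)  # the shared component at 0.05
```

Each toy series has its own second component, but only the shared one at 0.05 remains prominent in the product.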

Each biological function and/or process is driven by proteins that selectively interact with other proteins, DNA regulatory segments or small molecules. Using the RRM, it has been shown that proteins and their targets share the same matching characteristic frequency [7–10]. The matching of periodicities within the distribution of energies of free electrons along the interacting proteins can be regarded as resonant recognition and is highly selective. Thus, the RRM frequencies characterise not only protein function, but also recognition and interaction between a particular protein and its targets: receptors, binding proteins and inhibitors. In addition, it has also been shown that interacting proteins have opposite phases at their characteristic recognition frequency [7–9, 17, 22]. Every frequency can be represented by one sinusoid characterised by its frequency, amplitude and phase. The phase is expressed in radians and lies between −π and +π (−3.14 and +3.14). A phase difference of, or close to, 3.14 rad is considered an opposite phase. The phase value can be presented on a phase circle, where opposite phases are easier to observe graphically.
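The phase at a given frequency can be read from the complex Fourier coefficient at that frequency. The sketch below is a minimal illustration on a synthetic sinusoid; the function name `phase_at_frequency` is hypothetical, and the cosine-referenced phase convention of the discrete Fourier transform is an assumption that may differ from the convention used in the RRM software.

```python
import numpy as np

def phase_at_frequency(series, freq):
    """Phase (radians, between -pi and +pi) of the component nearest to freq."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    k = int(round(freq * n))  # spectral point n corresponds to f = n/N
    return float(np.angle(np.fft.fft(x)[k]))

# Synthetic series with a known phase of -1.73 rad at frequency 0.05.
k = np.arange(100)
toy = np.cos(2 * np.pi * 0.05 * k - 1.73)
phase = phase_at_frequency(toy, 0.05)
```

For a real protein series the same call would be applied to the EIIP numerical series at the characteristic frequency.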

Prediction of the Key Amino Acids—“Hot Spots”

Knowing the characteristic frequency of a particular protein function makes it possible to predict which amino acids in the sequence predominantly contribute to this frequency and consequently to the observed function. This is achieved by making small alterations to the amplitude at the characteristic frequency in the single-protein spectrum and then observing which amino acids are most sensitive to this alteration [7–9, 19–21]. These sensitive amino acids (“hot spots”) are related to the characteristic frequency and consequently to the corresponding biological function. “Hot spot” predictions using the RRM have already been applied to a number of protein and DNA examples, including interleukin-2, SV40 enhancer, epidermal growth factor (EGF), Ha-ras p21 oncogene product, glucagons, haemoglobins, myoglobins and lysozymes [7–9, 19–21].

It has been experimentally documented, for the example of influenza virus, that such predicted amino acids denote residues crucial for protein function [23]. In addition, these “hot spot” amino acids are found to be spatially clustered in the protein tertiary structure and positioned in and around the protein active site [19–21].
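One possible reading of this procedure is sketched below, purely for illustration: the amplitude at the characteristic frequency is changed by a small fraction, the series is transformed back, and sequence positions are ranked by how much their value changes. This is a simplified stand-in and not necessarily the algorithm used in the cited RRM work; the function name `hot_spot_positions`, the default change of −5 % and the ranking by absolute change are all assumptions. The input is a synthetic series.

```python
import numpy as np

def hot_spot_positions(series, freq, change=-0.05, top=3):
    """Rank positions (1-based) by sensitivity to an amplitude change at freq."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    spectrum = np.fft.fft(x)
    k = int(round(freq * n))
    scaled = spectrum.copy()
    scaled[k] *= 1.0 + change
    if 0 < k < n - k:
        scaled[n - k] *= 1.0 + change  # mirror bin keeps the series real
    perturbed = np.fft.ifft(scaled).real
    delta = np.abs(perturbed - x)      # per-position change in the series
    order = np.argsort(delta)[::-1][:top]
    return [int(i) + 1 for i in order]

# Synthetic series: one cosine with two full periods over 40 points.
k = np.arange(40)
toy = np.cos(2 * np.pi * 0.05 * k)
positions = hot_spot_positions(toy, 0.05, change=-0.05, top=4)
```

In this toy case the most sensitive positions are simply the extrema of the cosine (1, 11, 21 and 31); for a protein, the ranking would be applied to its EIIP series.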

Bioactive Peptide Design

Once the characteristic frequency of a particular protein biological function is identified, it is possible to design new proteins with the desired frequency components and consequently with the desired biological functions [7–9, 13–18]. The process of bioactive peptide design is as follows:

  1. Determination of RRM characteristic frequency using multiple cross-spectral function for a group of protein sequences that share common biological function (interaction);

  2. Determination of phases for the characteristic frequencies of a particular protein which is selected as the parent for agonist/antagonist peptide;

  3. Calculation using Inverse Fourier Transform of the signal with characteristic frequency and phase. The minimal length of the designed peptide is defined by the characteristic frequency f as 1/f;

  4. Determination of resulting amino acid sequence using tabulated EIIP parameter values.

This approach has already been successfully applied and experimentally tested in the design of FGF [7–9, 18], an HIV envelope protein analogue [7–9, 17] and a peptide to mimic myxoma virus oncolytic function [13, 14].
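Steps 3 and 4 of the design process can be sketched as follows, under stated assumptions. The inverse Fourier transform of a spectrum containing a single frequency is a sinusoid, so the sketch generates that sinusoid directly. Rescaling it onto the EIIP range and choosing the amino acid with the nearest EIIP value are our own illustrative choices, as are the function name `design_peptide` and the EIIP table itself (values as commonly tabulated in the RRM literature). The output is not claimed to reproduce any published designed sequence.

```python
import numpy as np

# EIIP values (in Rydberg) as commonly tabulated in the RRM literature.
# Assumed here for illustration; verify against the cited sources.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def design_peptide(freq, phase, length):
    """Build a sequence whose EIIP series follows one sinusoid."""
    k = np.arange(length)
    signal = np.cos(2 * np.pi * freq * k + phase)  # single-frequency signal
    # Assumption: rescale from [-1, 1] onto the EIIP range [0, max EIIP].
    target = (signal + 1.0) / 2.0 * max(EIIP.values())
    # Pick the amino acid with the nearest EIIP value (first one on ties).
    return "".join(min(EIIP, key=lambda aa: abs(EIIP[aa] - t)) for t in target)

short = design_peptide(0.05, 0.0, 20)        # one period, 1/f = 20 residues
candidate = design_peptide(0.0508, -1.73, 60)
```

Because several amino acids share the same or very similar EIIP values, the mapping back to a sequence is not unique; the tie-breaking rule used here is arbitrary.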

Here, we applied the RRM approach to analyse TNF with the aim of identifying distinct characteristic frequencies for (a) receptor recognition and (b) inflammatory activity related to the induction of interleukin 1 (IL-1). Once these characteristic frequencies were identified, the amino acids that contribute most to the receptor recognition characteristic frequency, and are thus critical for receptor recognition, were identified as well. In addition, based on the characteristic frequencies initially identified, we designed a peptide that is proposed to be able to recognise the receptor without having inflammatory side effects.

Results and Discussion

To identify the characteristic frequencies relevant for TNF biological activities, we applied the RRM multiple cross-spectral function, as described above, to the following seven TNF protein molecules: Q06599 (TNFA_BOVIN), P51742 (TNFA_CANFA), P01375 (TNFA_HUMAN), P06804 (TNFA_MOUSE), P23563 (TNFA_PIG), P04924 (TNFA_RABIT), and P16599 (TNFA_RAT).

The resulting multiple cross-spectral function shows one very prominent peak at a frequency of f1 = 0.0508 ± 0.0043 and a much smaller peak at a frequency of f2 = 0.0391 ± 0.0043, as presented in Fig. 1.

Fig. 1

TNF characteristic frequency f1 = 0.0508 ± 0.0043, as identified by RRM

To identify the biological meaning of these two frequencies, we first analysed five TNF receptors: O19131 (TNR1A_BOVIN), P19438 (TNR1A_HUMAN), P25118 (TNR1A_MOUSE), P50555 (TNR1A_PIG), and P22934 (TNR1A_RAT). As expected, the most prominent peak in the RRM multiple cross-spectral function of the TNF receptors is at a frequency of f1 = 0.0488 ± 0.0022, which overlaps with the most prominent peak for TNF, as presented in Fig. 2.

Fig. 2

TNF receptor’s characteristic frequency f1 = 0.0488 ± 0.0022, as identified by RRM

To confirm that this is the common frequency between TNFs and TNF receptors, we calculated the multiple cross-spectral function between the 7 TNFs and 5 TNF receptors and found that there is indeed only one prominent peak, at a frequency of f1 = 0.0508 ± 0.0043, as presented in Fig. 3. According to the RRM, this result confirms that this common frequency characterises the interaction between TNFs and their receptors.

Fig. 3

Cross-spectral function between 7 TNFs and 5 TNF receptors showing there is only one prominent peak at frequency of f1 = 0.0508 ± 0.0043

TNFs are involved in inflammation and the immune response through IL1-producing cells. Thus, to identify the meaning of the second peak frequency in the TNF spectra, we analysed twelve interleukin-1 (IL1) proteins: P08831 (IL1A_BOVIN), P09428 (IL1B_BOVIN), P01583 (IL1A_HUMAN), P01584 (IL1B_HUMAN), P01582 (IL1A_MOUSE), P10749 (IL1B_MOUSE), P18430 (IL1A_PIG), P26889 (IL1B_PIG), P04822 (IL1A_RABIT), P14628 (IL1B_RABIT), P16598 (IL1A_RAT), and Q63264 (IL1B_RAT). Interestingly, we found that the IL1 cytokines have their most prominent peak at a frequency of f2 = 0.0391 ± 0.0038, which coincides with the second prominent peak for TNFs, as presented in Fig. 4. From this finding we can conclude that TNFs recognise and interact with their receptor at a frequency of f1 = 0.0508 ± 0.0043, while they act synergistically with IL1 at a frequency of f2 = 0.0391 ± 0.0038. This synergistic function is possibly related to the rapid response to sepsis and inflammation.

Fig. 4

IL-1 characteristic frequency f2 = 0.0391 ± 0.0038, as identified by RRM

Furthermore, when the TNF characteristic frequencies are compared with our earlier results for oncogene and proto-oncogene proteins, an interesting correlation can be found [7–9, 12, 16, 19]. In RRM analysis, the oncogene proteins, which are involved in oncogenic transformation of the cell, showed their most prominent characteristic at a frequency of fo = 0.0322 ± 0.0040, with a less prominent characteristic at a frequency of fp = 0.0537 ± 0.0040. Proto-oncogene proteins are highly homologous to oncogene proteins, but do not cause cell transformation and are only involved in cell growth. Proto-oncogene proteins show their most prominent characteristic at fp = 0.0537 ± 0.0040 and a less prominent one at fo = 0.0322 ± 0.0040. The conclusion of this earlier work was that the frequency fo characterises the process of cell transformation, while the frequency fp characterises cell growth without transformation. In addition, it is possible that the frequency fp can even prevent cell transformation. Interestingly, the two TNF characteristic frequencies overlap with both fo and fp, with the one near fp being the most prominent. Thus, we can propose that TNF behaves similarly to proto-oncogene proteins. This result could be relevant for TNF involvement in the inhibition of tumorigenesis and supports the idea that the frequency range around 0.05 could be crucial for keeping cell growth under control, while the frequency range around 0.03 is crucial for cell transformation. Thus, a protein that has only the frequency of 0.05, with the same phase as TNF at this frequency, could be a good candidate for inhibition of tumorigenesis and cancer treatment without negative side effects such as inflammation.
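The overlap statements above can be checked with simple arithmetic. The sketch below reads each quoted tolerance as a symmetric ± half-width and calls two frequencies overlapping when their intervals intersect; both this reading and the helper name `overlaps` are our assumptions, not a criterion stated in the text.

```python
def overlaps(center_a, tol_a, center_b, tol_b):
    """True if the intervals center ± tol intersect."""
    return abs(center_a - center_b) <= tol_a + tol_b

# Frequencies and tolerances as quoted in the text.
tnf_f1 = (0.0508, 0.0043)
tnf_f2 = (0.0391, 0.0038)
receptor_f1 = (0.0488, 0.0022)
oncogene_fo = (0.0322, 0.0040)
proto_oncogene_fp = (0.0537, 0.0040)

f1_vs_receptor = overlaps(*tnf_f1, *receptor_f1)  # |0.0508 - 0.0488| = 0.0020
f1_vs_fp = overlaps(*tnf_f1, *proto_oncogene_fp)  # |0.0508 - 0.0537| = 0.0029
f2_vs_fo = overlaps(*tnf_f2, *oncogene_fo)        # |0.0391 - 0.0322| = 0.0069
f1_vs_f2 = overlaps(*tnf_f1, *tnf_f2)             # |0.0508 - 0.0391| = 0.0117
```

Under this reading, f1 overlaps with both the receptor frequency and fp, f2 overlaps with fo, and f1 and f2 remain distinct from each other.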

As the main characteristic frequency of TNF activity and its recognition of the receptor is identified to be f1 = 0.0508 ± 0.0043, it is now possible to identify the amino acids that contribute most to this frequency in a single protein. For this analysis we chose the mature human TNF protein, P01375 (TNFA_HUMAN, residues 77–233), which has the following sequence:

VRSSSRTPSDKPVAHVVANPQAEGQLQWLNRRANALLANGVELRDNQLVVPSEGLYLIYSQVLFKGQGCPSTHVLLTHTISRIAVSYQTKVNLLSAIKSPCQRETPEGAEAKPWYEPIYLGGVFQLEKGDRLSAEINRPDYLDFAESGQVYFGIIAL.

Using the procedure described above with a −5 % change at the characteristic frequency f1 = 0.0508 ± 0.0043, we identified the following amino acids as contributing most to the TNF characteristic frequency: G54, T77, G153. Although these amino acids are not sequentially adjacent, they are close to each other in the TNF tertiary structure, as presented in Fig. 5. This is in accordance with our previous results showing that amino acids related to a functional RRM frequency are clustered together in the protein 3D structure and lie around the active site [19–21].

Fig. 5

“Hot spot” amino acids related to TNF receptor recognition, shown in the TNF 3D structure (1TNF.pdb). Although not sequentially connected, these amino acids are spatially close to each other in the TNF 3D structure

The next step was to design a de novo peptide that has only the frequency f1 = 0.0508 ± 0.0043. For this purpose, we used the pair human TNF and human TNF receptor as the model interaction. The main characteristic frequency for this recognition is f1 = 0.0508 ± 0.0043. To design a new peptide or protein, it is necessary to identify the phases of both interacting proteins at this characteristic frequency. According to the RRM, the phases at the characteristic frequency within the interacting molecules should be opposite. Using the RRM, we found that the phase at f1 = 0.0508 ± 0.0043 is −1.73 rad within the human TNF protein (P01375) and +2.04 rad within its receptor (P19438), as presented in the phase circle in Fig. 6. Thus, the phase difference is 3.77 rad, which is close to 3.14 rad, meaning that these phases are almost opposite. This is further evidence that the frequency f1 = 0.0508 ± 0.0043 is crucial for the interaction of the TNF protein with the TNF receptor.
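The phase comparison can be reproduced with a few lines of arithmetic. The helper below, whose name `opposite_phase_deviation` is hypothetical, measures how far a phase difference on the circle is from exactly opposite (π rad); the phase values are those quoted above.

```python
import math

def opposite_phase_deviation(phase_a, phase_b):
    """Distance (radians) of the phase difference from exactly opposite (pi)."""
    diff = abs(phase_a - phase_b) % (2 * math.pi)  # difference on the circle
    return abs(diff - math.pi)

raw_difference = abs(-1.73 - 2.04)                   # 3.77 rad
deviation = opposite_phase_deviation(-1.73, 2.04)    # about 0.63 rad from pi
```

The two phases are therefore about 0.63 rad (roughly 36 degrees) away from exactly opposite.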

Fig. 6

Phase estimation, presented on a phase circle, for human TNF (left) and human TNF receptor (right) at the frequency f1 = 0.0508 ± 0.0043. The phase circle shows that the phase difference is 3.77 rad, which is close to opposite

The aim here is to design a protein that has only the receptor recognition frequency, as a possible cell control frequency, while lacking the IL1 synergistic frequency, which is possibly related to cell transformation. Thus, following the RRM procedure described above, we designed a protein with the characteristic frequency f1 = 0.0508 ± 0.0043 and a phase of −1.73 rad. The minimal length of this protein is determined as 1/f1 and is 20 amino acids. However, to be able to distinguish between the frequencies f1 = 0.0508 ± 0.0043 and f2 = 0.0391 ± 0.0038, and to make sure that these two frequencies do not overlap, the necessary minimal length of the newly designed sequence is calculated as 1/(f1 − f2), and it is 53 amino acids. Accordingly, we designed the following 60-amino-acid protein, TNFRRM60:

IEPKWQTRDDDDRCQYHPGINEHAWMRDDDDDRMWAHENLNPHYQCRDDDDRFQWKPEII

The RRM spectrum of the newly designed protein is presented in Fig. 7.

Fig. 7

RRM spectrum for newly designed 60 amino acid sequence TNFRRM60

As has been shown previously for the examples of FGF [7–9, 18], HIV envelope protein [7–9, 17] and myxoma virus analogues [13, 14] designed by the RRM, it is expected that this TNF analogue, designed using the same RRM procedure, will express only the desired function, i.e. TNF receptor recognition and activation.

Conclusion

We have applied here the RRM, a physical/mathematical computational model, to analyse the complex and multifunctional TNF protein. We identified within the TNF sequence the features (frequencies) specific to each biological function of TNF:

  • frequency of f1 = 0.0508 ± 0.0043, which is found to be characteristic of TNF–TNF receptor recognition, and

  • frequency of f2 = 0.0391 ± 0.0038, which is found to be relevant to TNF–IL1 synergistic activity.

We also propose, based on our earlier work with oncogene and proto-oncogene proteins [7–9, 12], that these two frequencies not only characterise specific TNF functions, but also describe the role of TNF in the wider functional cascade of cell growth without oncogenic transformation (f1) and with oncogenic transformation (f2). Concentrating on the frequency f1, which also describes receptor recognition, and using human TNF as the model protein, we identified the specific phases for the TNF–TNF receptor interaction, identified the key amino acids for this interaction, and designed a de novo peptide that is proposed to have only the receptor recognition function, without side effects.

It has been shown here that the RRM is a powerful tool in computational analysis of complex multifunctional proteins like TNF. In addition, the RRM can also be used in identifying functional key amino acids, “hot spots”, as well as in design of de novo peptides/proteins with desired biological function.