Introduction

Parkinson’s disease (PD) is a progressive neurodegenerative disorder of the central nervous system characterized by muscle rigidity, bradykinesia, and tremor. The disease mainly affects older people; about 2–3% of the population aged ≥ 65 years is affected [1]. It is associated with loss of dopaminergic neurons in the substantia nigra, Lewy body formation, and abnormal aggregation of α-synuclein protein, and its burden is linked to increasing life expectancy. Hence, effective research into the treatment of neurodegenerative diseases is a vital clinical need. Current PD therapy restores dopamine in the striatum of the brain with levodopa. However, the dosage has to be increased over time to maintain the therapeutic level, and this does not prevent the underlying neuronal loss [2, 3]. In addition, such long-term treatment may cause adverse effects, including levodopa-induced dyskinesia and behavioral disturbances [4, 5].

Adenosine receptor antagonists can be considered an alternative medication for PD with fewer adverse effects. Adenosine, a purine nucleoside, is an endogenous modulator of different physiological functions in the peripheral tissues as well as the central nervous system (CNS). It acts through four receptor subtypes: A1, A2A, A2B, and A3. A2A receptors are highly expressed in the striatum (a dopamine-rich area of the CNS), where they are co-located with dopamine D2 receptors on GABAergic striatopallidal neurons [6]. Upon stimulation, the A2A receptor interacts antagonistically with the D2 receptor, decreasing the affinity of the D2 receptor for dopamine and exerting the opposite effect on motor function [7]. Thus, adenosine A2A receptor blockade may improve motor function, as shown in many animal models [8,9,10,11].

Positron emission tomography (PET) [12] and single-photon emission computed tomography (SPECT) [13] are non-invasive methods that use the dynamic distribution of radiotracers to provide a 3D map of the brain and quantify biological processes. These imaging agents help detect and quantify dopamine and adenosine receptors in the brain, thereby opening a path to early detection of the disease. PET is superior to SPECT in accuracy and in measuring the time course of radioactivity together with its regional distribution. Agonists and antagonists labeled with positron-emitting radioisotopes can be introduced in vivo to obtain 3D images of the receptors, which has been helpful in CNS diagnosis. PET tracers can be used as in vivo imaging agents to improve the pharmacokinetics, physicochemical properties, and mapping of the receptor of interest. As the search for new compounds with a desired activity is time-consuming and expensive, pharmaceutical companies have great interest in theoretical approaches to compound design.

Quantitative structure–activity relationship (QSAR) modeling has gained considerable attention in the molecular modeling field because it is cost-effective and requires few human resources [14, 15]. It attempts to correlate chemical structures with a well-defined activity: chemical structures and physiological properties are expressed in numerical form, and a mathematical correlation is developed between them. This relationship can then be used to predict the biological response of other chemical structures. QSAR-based studies have found useful applications in drug discovery and design, medicinal chemistry, molecular modeling, pharmaceutical and predictive toxicity modeling, pharmacokinetics/toxicokinetics, data mining, environmental toxicity (ecotoxicity), chemical or drug property modeling, food science, agricultural sciences, pesticide toxicity, fragrance, nanoscience (Nano-QSAR), and many other fields [16,17,18,19,20,21,22,23,24]. QSAR is also used to predict the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug-like compounds [25, 26]. It has also become an effective tool for understanding and determining the major biochemical features associated with Parkinson’s disease [27, 28].

In the present study, we have developed QSAR models for xanthine-based PET tracers acting as adenosine A2A receptor (A2AR) antagonists, using only 2D descriptors, to explore the structural features required for binding affinity towards A2AR and for the selectivity of the tracers between A2B and A2A receptors.

Fig. 1 Observed vs predicted A2AR binding affinity scatter plot

Materials and methods

Dataset

The experimental binding affinity and selectivity data of 35 xanthine ligand–based PET tracers were taken from the literature [29] and used for QSAR modeling to determine the structural features essential for binding affinity and those required in the antagonists for selectivity towards A2A adenosine receptors. The experimental binding affinity (Ki) and selectivity values ranged from 0.1–20 nM and 7.84–16,500, respectively; the details are provided in Supplementary Material I (Table S1). The experimental values were converted to a negative logarithmic scale for modeling and used as the response (dependent) variables. No compounds were removed from the binding affinity dataset, but four compounds (14, 32, 33, and 34) without experimental selectivity values were excluded from the selectivity dataset. Binding affinity and selectivity were modeled separately as endpoints. The compounds of both datasets were drawn in the MarvinSketch software [30] with proper aromatization and addition of hydrogens as necessary.
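The logarithmic conversion can be illustrated with a minimal sketch. It assumes that Ki is expressed in nM and that the logarithm is taken directly on that value, which is inferred from the pA2AR(BA) values quoted in the docking discussion (1.000 and − 1.301 for the extremes); the function name is hypothetical.

```python
import math

def p_value_from_ki(ki_nm: float) -> float:
    """Return the negative base-10 logarithm of a Ki value given in nM.

    Assumption: no conversion to molar units is applied before taking the log.
    """
    if ki_nm <= 0:
        raise ValueError("Ki must be positive")
    return -math.log10(ki_nm)

# Extremes of the reported binding affinity range (0.1 nM and 20 nM).
highest = p_value_from_ki(0.1)
lowest = p_value_from_ki(20.0)
print(round(highest, 3), round(lowest, 3))
```

Under this assumption, a smaller Ki (tighter binding) gives a larger response value, so the most potent tracers sit at the top of the modeled scale.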

Molecular descriptors

In the present study, QSAR models were developed using selected classes of two-dimensional molecular descriptors: E-state indices, connectivity, constitutional, functional group, 2D atom pair, ring, atom-centered fragment, and molecular property descriptors, together with extended topochemical atom (ETA) indices. The ETA descriptors were calculated using the PaDEL-Descriptor software [31], whereas the non-ETA descriptors were calculated using the Dragon 7 software [32]. Before model development, intercorrelated (|r| > 0.95), constant (variance < 0.0001), and other uninformative or redundant descriptors were removed using an in-house software tool available at http://dtclab.webs.com/software-tools.
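The two numerical filters named above can be sketched as follows. This is not the in-house tool; it is a minimal illustration that assumes population variance for the constant filter and a greedy first-come rule for the correlation filter. The descriptor values in the toy table are hypothetical.

```python
from statistics import pvariance

def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def pretreat(descriptors, var_cut=1e-4, r_cut=0.95):
    """Return the names of descriptors that survive both filters.

    descriptors: dict mapping a descriptor name to its list of values.
    """
    # Step 1: drop (near-)constant columns.
    varying = [name for name, col in descriptors.items()
               if pvariance(col) >= var_cut]
    # Step 2: keep a column only if |r| <= r_cut against every column kept so far.
    kept = []
    for name in varying:
        if all(abs(pearson_r(descriptors[name], descriptors[k])) <= r_cut
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy table: one duplicate-like column and one constant column.
toy = {
    "nBnz": [0, 1, 1, 2, 0, 1],
    "nBnz_scaled": [0, 2, 2, 4, 0, 2],  # perfectly correlated with nBnz
    "constant": [1, 1, 1, 1, 1, 1],     # zero variance
    "nCIR": [3, 2, 5, 4, 3, 6],
}
kept = pretreat(toy)
print(kept)
```

Which member of a correlated pair is retained depends on column order in this sketch; a production tool may apply a different tie-breaking rule.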

Dataset division

Dataset division is a crucial step in QSAR modeling for developing a properly validated and robust model. Rational data division ensures an unbiased external validation along with uniform data distribution [33]. For both the binding affinity and selectivity endpoints, the dataset was divided into a training set (~ 70%) and a test set (~ 30%) using the random division method [34]. The training set was used for model development, and the test set was used for model validation.
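A random division of this kind can be sketched in a few lines. The seed and the compound labels are hypothetical, and the test set size of 10 is taken from the model statistics reported in the results; the actual split used in the study is not reproduced here.

```python
import random

def random_split(compound_ids, n_test, seed=0):
    """Shuffle the identifiers and return (training_ids, test_ids)."""
    rng = random.Random(seed)        # fixed seed keeps the split reproducible
    shuffled = list(compound_ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# 35 compounds for the binding affinity endpoint (labels are illustrative).
ids = list(range(1, 36))
train_ids, test_ids = random_split(ids, n_test=10, seed=42)
print(len(train_ids), len(test_ids))
```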

Variable selection and model development

Prior to model development, we applied variable selection strategies, namely a Genetic Algorithm (GA) [35, 36] for binding affinity and stepwise regression [35, 37] for selectivity, to extract the important and influential descriptors and create a reduced descriptor pool. Models were then developed from this pool. For binding affinity, the best model, with five descriptors, was obtained using the spline option in the GA run in Discovery Studio version 4.1. For A2AR selectivity, four models with four descriptors each were selected by the Best Subset Selection (BSS) method based on MAE criteria [38]. Furthermore, to improve the quality of the external predictions via “intelligent” selection of multiple models, we applied the “Intelligent consensus predictor” tool [39] developed in our laboratory [40].
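The idea behind best subset selection can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not the BSS tool itself: every subset of a fixed size is fitted by ordinary least squares, and subsets are ranked by the mean absolute error of leave-one-out predictions, which is one plausible reading of "MAE criteria". The data are synthetic and all function names are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def loo_mae(rows, y):
    """Mean absolute error of leave-one-out predictions."""
    errors = []
    for i in range(len(y)):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        pred = coef[0] + sum(c * v for c, v in zip(coef[1:], rows[i]))
        errors.append(abs(y[i] - pred))
    return sum(errors) / len(errors)

def best_subset(x, y, size):
    """Score every descriptor subset of the given size; return (MAE, columns)."""
    scored = []
    for cols in combinations(range(len(x[0])), size):
        rows = [[r[c] for c in cols] for r in x]
        scored.append((loo_mae(rows, y), cols))
    return min(scored)

# Synthetic data: the response depends only on columns 0 and 2.
x = [[0, 1, 2], [1, 0, 1], [2, 3, 0], [3, 1, 4],
     [4, 2, 1], [5, 0, 3], [1, 4, 2], [2, 2, 5]]
y = [1 + 2 * a - c for a, _, c in x]
best_mae, best_cols = best_subset(x, y, size=2)
print(best_cols)
```

Exhaustive search is only practical for a reduced descriptor pool, which is why a pre-selection step such as stepwise regression precedes it.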

Statistical validation metrics

The statistical quality of the models developed in the present study was rigorously examined using multiple approaches to check their robustness and predictivity. All the models were validated both internally and externally. Parameters such as the determination coefficient (\( R^2 \)), adjusted determination coefficient (\( R_{adj}^2 \)), variance ratio (F), and standard error of estimate (s) were computed. Internal predictivity parameters such as the predicted residual sum of squares (PRESS) and the leave-one-out cross-validated correlation coefficient (\( Q_{LOO}^2 \)) were also calculated, along with external predictivity parameters such as \( R_{pred}^2 \) or \( {Q}_{F1}^2 \), \( {Q}_{F2}^2 \), and the concordance correlation coefficient (CCC) [41]. Consensus models have been reported to perform better than individual models [41]. Therefore, we also performed “Intelligent Consensus Prediction (ICP)” using multiple models to see whether the quality of predictions could be improved through intelligent selection.
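The external metrics named above can be written out explicitly. The sketch below uses their commonly cited definitions (the training set mean as the reference for \( {Q}_{F1}^2 \), the test set mean for \( {Q}_{F2}^2 \), and Lin's concordance correlation coefficient); the observed and predicted values are hypothetical.

```python
def q2_f1(y_test, y_pred, train_mean):
    """1 - PRESS / sum of squared deviations of test values from the TRAINING mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    ss = sum((o - train_mean) ** 2 for o in y_test)
    return 1.0 - press / ss

def q2_f2(y_test, y_pred):
    """1 - PRESS / sum of squared deviations of test values from the TEST mean."""
    test_mean = sum(y_test) / len(y_test)
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    ss = sum((o - test_mean) ** 2 for o in y_test)
    return 1.0 - press / ss

def ccc(y_obs, y_pred):
    """Concordance correlation coefficient between observed and predicted values."""
    n = len(y_obs)
    mo, mp = sum(y_obs) / n, sum(y_pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(y_obs, y_pred))
    so = sum((o - mo) ** 2 for o in y_obs)
    sp = sum((p - mp) ** 2 for p in y_pred)
    return 2.0 * cov / (so + sp + n * (mo - mp) ** 2)

# Hypothetical test set values.
y_test = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
train_mean = 2.0
print(q2_f1(y_test, y_pred, train_mean), q2_f2(y_test, y_pred), ccc(y_test, y_pred))
```

Because the test set mean minimizes the reference sum of squares, \( {Q}_{F2}^2 \) can never exceed \( {Q}_{F1}^2 \) for the same predictions.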

Applicability domain

The applicability domain (AD) [42] is a theoretical region in chemical space, defined by the model descriptors and the modeled response of the training set, within which the developed model can make predictions with acceptable reliability. Here, we checked the AD using the standardization approach with the tool developed in our laboratory [40].
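One common formulation of a standardization-based AD check is sketched below. The thresholds (3 standard deviations and a multiplier of 1.28) and the use of the sample standard deviation are assumptions that are not stated in this section and should be verified against the cited method description; the descriptor rows are hypothetical.

```python
from statistics import mean, stdev

def inside_domain(train, query, limit=3.0, factor=1.28):
    """Return True if the query compound is judged to lie inside the AD.

    train: list of descriptor rows for the training set.
    query: one descriptor row, same column order as train.
    Requires at least two model descriptors when the third rule is reached.
    """
    columns = list(zip(*train))
    # Standardize each query descriptor with the training mean and deviation.
    s = [abs(q - mean(col)) / stdev(col) for q, col in zip(query, columns)]
    if max(s) <= limit:      # every descriptor is within the limit
        return True
    if min(s) > limit:       # every descriptor is beyond the limit
        return False
    # Mixed case: decide on an upper bound of the standardized values.
    return mean(s) + factor * stdev(s) <= limit

# Hypothetical two-descriptor training set.
train = [[1, 10], [2, 12], [3, 11], [4, 13], [5, 14]]
print(inside_domain(train, [3, 12]), inside_domain(train, [20, 30]))
```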

Molecular docking

Molecular docking analysis was carried out in the present work to understand the intermolecular interactions between the PET tracer antagonists and the A2A receptor. The protein structure of the adenosine A2A receptor was retrieved from the Protein Data Bank (PDB ID: 3UZA) [43]. The X-ray crystal structure of the protein contains the bound ligand T4G, 6-(2,6-dimethylpyridin-4-yl)-5-phenyl-1,2,4-triazin-3-amine (formula: C16H15N5). Before docking the target PET tracers, the protein was prepared by correcting any missing residues, adding explicit hydrogens, and generating the docking site. The active docking site was generated in the BIOVIA Discovery Studio platform from the binding domain of the bound ligand T4G, by selecting the ligand and defining the site with the “from current selection” option in the receptor-ligand interaction module of the software. After the ligand-binding site was generated, the bound ligand was removed so that new molecules could be docked. For ligand preparation, the PET tracers were processed through the small molecule module of the Discovery Studio platform, where a series of ligand conformers was generated. Each of these conformers was then docked using the CDOCKER module, which employs the CHARMm force field [44]. The CDOCKER interaction energy (kJ/mol) was checked for all the receptor-ligand complexes, and the top-scoring (most negative, thus most favorable to binding) poses were kept.
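The final pose selection step can be sketched as follows, assuming the docking results have been exported as (pose, interaction energy) pairs per ligand and following the sign convention stated above (most negative is best). The ligand labels and energies are hypothetical and are not results from this study.

```python
def top_poses(scores):
    """Keep, for each ligand, the pose with the most negative interaction energy.

    scores: dict mapping a ligand label to a list of (pose_id, energy) tuples.
    """
    return {ligand: min(poses, key=lambda pose: pose[1])
            for ligand, poses in scores.items()}

# Hypothetical exported scores.
scores = {
    "ligand_1": [("pose_1", -38.2), ("pose_2", -45.7), ("pose_3", -41.0)],
    "ligand_2": [("pose_1", -29.4), ("pose_2", -27.9)],
}
best = top_poses(scores)
print(best)
```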

Results and discussion

Based on the binding affinity and selectivity endpoints of 35 xanthine PET tracer antagonists of the adenosine A2A receptor, we developed one model for binding affinity (\( Q^2 \) = 0.85, \( R^2 \) = 0.90, \( Q_{F1}^2 \) = 0.80) and four models (\( Q^2 \) = 0.80–0.87, \( R^2 \) = 0.87–0.91, \( Q_{F1}^2 \) = 0.84–0.85) for selectivity. All the models were validated externally and internally, and the statistical results showed model robustness and good predictivity. We also checked the \( r_m^2 \) parameters for both internal sets (\( \overline{r_{m(loo)}^2},\varDelta {r}_{m(loo)}^2 \)) and external sets (\( {r}_{m(test)}^2 \) and \( \varDelta {r}_{m(test)}^2 \)); the results satisfied the threshold values, supporting the reliability of the models. To improve the quality of the external predictions for selectivity, we also performed “Intelligent Consensus Prediction” of the multiple MLR models using the ICP tool [39] and found that the consensus predictions were better than the predictions of the individual MLR models. The winning model was consensus model 0 (CM0).

Table 1 Definition and contribution of all the descriptors obtained from the MLR models (models developed by using binding affinity)

Modeling binding affinity of PET tracers towards adenosine (A2A) receptor

The model for binding affinity consists of five descriptors (C-025, F09[N-O], nBnz, NRS, and nCIR), which significantly influence the binding of the antagonists to the adenosine A2A receptor. The five-descriptor MLR model (Eq. 1), developed using the Genetic Function Algorithm (GFA), could predict 85.0% of the variance of the training set and 80.0% of that of the test set. The values of all descriptors appearing in the model for the training and test set compounds are given in Supplementary Material II (Excel file), and the scatter plot of the observed vs. predicted binding affinity is shown in Fig. 1.

$$ pKi\left({\mathrm{A}}_{2\mathrm{A}}\mathrm{R}\right)=-0.849\left(\pm 0.2167\right)-0.36271\left(\pm 0.06190\right)\ \mathrm{C}-025+0.17693\left(\pm 0.05895\right)\mathrm{F}09\left[\mathrm{N}-\mathrm{O}\right]-0.52109\left(\pm 0.07616\right)\mathrm{NRS}+0.81699\left(\pm 0.09908\right)\mathrm{nBnz}+0.3024\left(\pm 0.03363\right)\mathrm{nCIR} $$
$$ {\mathrm{n}}_{\mathrm{training}}=25,{R}^2=0.901,{R}_{\mathrm{adj}}^2=0.875,{Q}^2=0.850,S=0.170027,F=34.62,\mathrm{PRESS}=0.833306,\overline{r_{\mathrm{m}\left(\mathrm{LOO}\right)}^2}=0.790,{\Delta r}_{\mathrm{m}\left(\mathrm{LOO}\right)}^2=0.072,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Moderate} $$
$$ {\mathrm{n}}_{\mathrm{test}}=10,{Q}_{\mathrm{F}1}^2=0.80,{Q}_{\mathrm{F}2}^2=0.681,\overline{r_{\mathrm{m}\left(\mathrm{test}\right)}^2}=0.54,{\Delta r}_{\mathrm{m}\left(\mathrm{test}\right)}^2=0.23,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Good} $$
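The training set statistics reported with the binding affinity model can be cross-checked against one another using standard MLR identities. The sketch below takes only the reported values (n = 25, five descriptors, \( R^2 \) = 0.901, S = 0.170027, PRESS = 0.833306) as inputs; the function names are hypothetical, and small deviations are expected because \( R^2 \) is reported to three decimals.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n compounds and p descriptors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def f_ratio(r2, n, p):
    """Variance ratio F of the regression."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

def q2_from_press(press, s, r2, n, p):
    """Q^2 = 1 - PRESS / TSS, with TSS recovered from S and R^2."""
    rss = s ** 2 * (n - p - 1)   # residual sum of squares from the standard error
    tss = rss / (1.0 - r2)       # total sum of squares
    return 1.0 - press / tss

n, p = 25, 5
r2, s, press = 0.901, 0.170027, 0.833306
print(round(adjusted_r2(r2, n, p), 3))             # compare with reported 0.875
print(round(f_ratio(r2, n, p), 2))                 # compare with reported 34.62
print(round(q2_from_press(press, s, r2, n, p), 3)) # compare with reported 0.850
```

The three recomputed values agree with the reported ones to within rounding, which indicates that the training set statistics are internally consistent.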

Essential features required for binding and receptor interaction

The descriptors in the QSAR model (Table 1) give insight into the mechanism of interaction during binding of the xanthine PET tracer antagonists to the adenosine A2A receptor. Unsaturation and aromaticity play a dominant role in regulating the receptor binding affinity, as is evident from the occurrence of the descriptors C-025, nBnz, NRS, and nCIR. The descriptors nBnz and nCIR have a positive influence on adenosine A2A receptor binding (Fig. 2), whereas C-025 and NRS have a negative effect on the binding affinity of the PET tracers (Fig. 3). The opposite influence of these similar types of descriptors appears contradictory, but it leads to the conclusion that aromaticity provided by a benzene nucleus (as in compounds A-32 and A-23) is more important for binding, whereas heterocyclic aromatic rings and fused-ring systems decrease the overall binding affinity of the radiotracer molecule (as in compounds A-1, A-2, and A-20).

Fig. 2 Features increasing the binding affinity (pKi) value

Fig. 3 Features decreasing the binding affinity (pKi) value

The 2D atom pair descriptor F09[N-O] gives information about the electronegativity of the compounds, and its positive coefficient suggests that a higher frequency of nitrogen and oxygen atom pairs at topological distance 9 enhances the binding affinity, as seen in compounds A-4 and A-32. Electronegative atoms in a chemical structure can influence binding to the receptor through hydrogen bonding [45].
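How a frequency-type atom pair descriptor such as F09[N-O] is counted can be shown on a hydrogen-suppressed molecular graph. The sketch below is not the Dragon implementation; it assumes that topological distance means the number of bonds on the shortest path between two atoms, and the ten-atom chain is a hypothetical toy structure.

```python
from collections import deque

def topological_distances(adjacency, start):
    """Breadth-first search: shortest bond-count distance from start to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

def atom_pair_frequency(elements, adjacency, a, b, distance):
    """Count a-b atom pairs separated by exactly `distance` bonds."""
    count = 0
    for i, element in enumerate(elements):
        if element != a:
            continue
        dist = topological_distances(adjacency, i)
        count += sum(1 for j, d in dist.items()
                     if d == distance and elements[j] == b)
    # Identical element types would otherwise be counted from both ends.
    return count // 2 if a == b else count

# Hypothetical linear chain N-C-C-C-C-C-C-C-C-O (hydrogens omitted).
elements = ["N"] + ["C"] * 8 + ["O"]
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(elements)]
             for i in range(len(elements))}
f09_no = atom_pair_frequency(elements, adjacency, "N", "O", 9)
print(f09_no)
```

In this toy chain the single nitrogen and the single oxygen are nine bonds apart, so the count is one; ring closures in a real structure shorten the shortest path and change the count accordingly.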

Molecular docking

Molecular docking helped in understanding the optimized conformation of the complex between the imaging agent and the A2A receptor and gave evidence on the orientation of the imaging agents in the binding site of the receptor. The major goal was to understand the molecular interactions taking place during radiotracer binding and to correlate these findings with the QSAR analysis. The docking analysis showed the predominance of different types of π interactions and hydrogen-bonding interactions. In the more active compounds (Fig. 4), such as A-4, A-8, and A-25 (pA2AR(BA) = 0.699, 1.000, and 0.398, respectively), the interaction forces mainly include hydrogen-bonding interactions (conventional hydrogen bonds and carbon-hydrogen bonds) and π interactions (π-cation, π-donor hydrogen bond, π-π stacked, π-π T-shaped, and π-alkyl). Other interactions include halogen and alkyl interactions in compound A-4 and salt bridge formation in compound A-8. The higher number of interacting residues is consistent with the higher binding affinity of these compounds. Compounds with binding affinity in the medium range (Fig. 5), such as A-14 and A-27 (pA2AR(BA) = − 0.301 and − 0.255, respectively), make fewer interactions with the adenosine receptor, but the types of interaction remain similar, i.e., π interactions and hydrogen-bonding interactions. The least active compounds (Fig. 5), such as A-20 and A-35 (pA2AR(BA) = − 1.301 and − 1.204, respectively), show the fewest interactions. All the details of binding, including the interacting residues and types of binding interaction, are given in Table 2.

Fig. 4 Docking interactions for compounds having higher binding affinity (pKi)

Fig. 5 Docking interactions for compounds having medium (A-14) and low (A-35) binding affinity (pKi)

Table 2 Details of interacting residues and different types of binding interaction occurring between the PET imaging agents and the target protein (adenosine A2A receptor)

Relationship with QSAR models

The docking study shows different types of π interactions between the PET radiotracer molecules and the adenosine A2A receptor. This observation supports the occurrence of the nBnz and nCIR descriptors in the QSAR model. Aromatic rings such as benzene can enhance binding through aromatic π-π stacking interactions with the phenyl/imidazole residues of the receptor [46]. The interaction of these antagonists through π-π stacking eventually blocks the receptor in the indirect pathway, thus blocking GABA-mediated influence in the globus pallidus pars externa (GPe). This helps PD patients regain motor function by restoring the balance between the direct and indirect pathways. Nitrogen and oxygen are capable of hydrogen bond formation, and various types of hydrogen bonding were observed in both the more active and the less active compounds; this can be correlated with the F09[N-O] descriptor, which gives an idea about the electronegativity of the molecule.

Modeling selectivity of PET tracers towards adenosine (A2A) receptor

In the current work, we developed four MLR models to understand the selectivity of the PET tracer molecules towards the adenosine A2A receptor. A single QSAR model may not be sufficient for predicting activity, since the properties of molecules cannot be captured by a limited number of features. Using multiple models in a consensus approach helps reduce model uncertainty by improving the prediction quality for the external set and reducing the prediction errors [38]. The four MLR models are given below:

Model 1

$$ \log {A}_{2A}R(Sel)=0.5875\left(\pm 0.4130\right)+0.4643\left(\pm 0.1574\right)\ \mathrm{C}-027-0.8679\left(\pm 0.1797\right)\mathrm{C}-040+0.7245\left(\pm 0.1006\right)\mathrm{F}09\left[\mathrm{N}-\mathrm{O}\right]+0.8382\left(\pm 0.01749\right)\mathrm{ETA}\_\mathrm{Beta}\_\mathrm{s} $$
$$ {n}_{\mathrm{training}}=21,{R}^2=0.915,{R}_{\mathrm{adj}}^2=0.893,{Q}^2=0.867,S=0.234982,F=42.88, $$
$$ \mathrm{PRESS}=1.37546,\overline{r_{\mathrm{m}\left(\mathrm{LOO}\right)}^2}=0.81227,{\Delta r}_{\mathrm{m}\left(\mathrm{LOO}\right)}^2=0.07373,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Moderate} $$
$$ {n}_{\mathrm{test}}=10,{Q}_{\mathrm{F}1}^2=0.84,{Q}_{\mathrm{F}2}^2=0.81,\overline{r_{\mathrm{m}\left(\mathrm{test}\right)}^2}=0.7682,{\Delta r}_{\mathrm{m}\left(\mathrm{test}\right)}^2=0.11949,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Good} $$

Model 2

$$ \log {A}_{2A}R(Sel)=0.36359\left(\pm 0.43605\right)-0.76227\left(\pm 0.18863\right)\mathrm{C}-040-0.05224\left(\pm 0.02421\right)\mathrm{T}\left(\mathrm{F}..\mathrm{Cl}\right)+0.71046\left(\pm 0.11057\right)\mathrm{F}09\left[\mathrm{N}-\mathrm{O}\right]+0.09777\left(\pm 0.01808\right)\mathrm{ETA}\_\mathrm{Beta}\_\mathrm{s} $$
$$ {n}_{\mathrm{training}}=21,{R}^2=0.90,{R}_{\mathrm{adj}}^2=0.87,{Q}^2=0.82,S=0.274853,F=35.21, $$
$$ \mathrm{PRESS}=1.05627,\overline{r_{\mathrm{m}\left(\mathrm{LOO}\right)}^2}=0.7526,{\Delta r}_{\mathrm{m}\left(\mathrm{LOO}\right)}^2=0.05874,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Moderate} $$
$$ {\mathrm{n}}_{\mathrm{test}}=10,{Q}_{\mathrm{F}1}^2=0.84,{Q}_{\mathrm{F}2}^2=0.82,\overline{r_{\mathrm{m}\left(\mathrm{test}\right)}^2}=0.7737,{\Delta r}_{\mathrm{m}\left(\mathrm{test}\right)}^2=0.04197,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Good} $$

Model 3

$$ \log {A}_{2A}R(Sel)=0.9642\left(\pm 0.4535\right)+0.31245\left(\pm 0.08846\right)\ \mathrm{nCIC}+0.4848\left(\pm 0.1856\right)\mathrm{C}-027-0.9394\left(\pm 0.2114\right)\mathrm{C}-040+0.6662\left(\pm 0.1184\right)\mathrm{F}09\left[\mathrm{N}-\mathrm{O}\right] $$
$$ {n}_{\mathrm{training}}=21,{R}^2=0.883,{R}_{\mathrm{adj}}^2=0.854,S=0.274853,F=30.27, $$
$$ \mathrm{PRESS}=1.72765,{Q}^2=0.833,\overline{r_{\mathrm{m}\left(\mathrm{LOO}\right)}^2}=0.76,{\Delta r}_{\mathrm{m}\left(\mathrm{LOO}\right)}^2=0.12,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Moderate}, $$
$$ {n}_{\mathrm{test}}=10,{Q}_{\mathrm{F}1}^2=0.84,{Q}_{\mathrm{F}2}^2=0.82,\overline{r_{\mathrm{m}\left(\mathrm{test}\right)}^2}=0.77,{\Delta r}_{\mathrm{m}\left(\mathrm{test}\right)}^2=0.13,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Good} $$

Model 4

$$ \log {A}_{2A}R\left(\mathrm{Sel}\right)=1.3245\left(\pm 0.2988\right)-0.6702\left(\pm 0.2119\right)\ \mathrm{C}-040+0.10445\left(\pm 0.04427\right)\mathrm{SsssN}+0.05519\left(\pm 0.01932\right)\mathrm{F}07\left[\mathrm{C}-\mathrm{C}\right]+0.5954\left(\pm 0.1263\right)\mathrm{F}09\left[\mathrm{N}-\mathrm{O}\right] $$
$$ {n}_{\mathrm{training}}=21,{R}^2=0.872,{R}_{\mathrm{adj}}^2=0.84,S=0.287861,F=27.24, $$
$$ \mathrm{PRESS}=2.09555,{Q}^2=0.827,\overline{r_{\mathrm{m}\left(\mathrm{LOO}\right)}^2}=0.717,{\Delta r}_{\mathrm{m}\left(\mathrm{LOO}\right)}^2=0.131,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Moderate}, $$
$$ {\mathrm{n}}_{\mathrm{test}}=10,{Q}_{\mathrm{F}1}^2=0.85,{Q}_{\mathrm{F}2}^2=0.83,\overline{r_{\mathrm{m}\left(\mathrm{test}\right)}^2}=0.78,{\Delta r}_{\mathrm{m}\left(\mathrm{test}\right)}^2=0.07,\mathrm{MAE}-\mathrm{based}\ \mathrm{criteria}=\mathrm{Good} $$

The significant descriptors obtained from the four MLR models (M1–M4) contributing to A2A receptor selectivity are C-040, C-027, F09[N-O], ETA_Beta_s, nCIC, T(F..Cl), SsssN, and F07[C-C]. As identified from the regression coefficients, all the descriptors contribute positively to A2A receptor selectivity except C-040 and T(F..Cl), which carry negative coefficients. We also checked the applicability domain of the developed MLR models. The models showed good predictive ability as per the statistical results. The details of the descriptors, their contributions, and their frequency of appearance in the four models are given in Table 3. The values of all descriptors appearing in the models for the training and test set compounds are given in Supplementary Material II (Excel file), and the scatter plots of the observed vs. predicted selectivity values are given in Fig. 6.

Fig. 6 Observed vs predicted A2AR selectivity plots for all four MLR models

Table 3 Definition, frequency, and contribution of all the descriptors obtained from the MLR models

Mechanistic interpretation

The descriptors obtained in the four models and their frequency of occurrence give an idea of their importance in modeling the selectivity of the PET tracers towards the adenosine A2A receptor. Descriptors such as C-027, F09[N-O], SsssN, T(F..Cl), and ETA_Beta_s give information about the electronic features of the compounds and are essential when receptor selectivity is considered (Fig. 7). Electronegativity is a chemical property that describes the tendency of an atom to draw electrons towards itself. If a compound contains a higher number of electronegative atoms in its structure, its selectivity for the A2A receptor also increases.

Fig. 7 Features affecting the adenosine A2A selectivity

The presence of atom-centered fragments like C-027 (R--CH--X) in compounds such as A-23 and A-25 increases the antagonist selectivity of the PET compounds. Since ‘X’ represents any electronegative atom (O, N, S, P, Se, or a halogen), the presence of heteroatoms increases the selectivity of the compounds towards the A2A receptor. The descriptor F09[N-O] is the frequency of nitrogen–oxygen atom pairs at topological distance 9, and its positive regression coefficient indicates its favorable influence on the antagonistic behavior of the imaging agents (as seen in compounds A-4 and A-27). A similar kind of descriptor is T(F..Cl), the sum of topological distances between F and Cl atoms in the chemical structure. These descriptors give information about the electronegative atoms, i.e., nitrogen and oxygen in F09[N-O] and fluorine and chlorine in T(F..Cl). ETA_Beta_s (Σβs) is an extended topochemical atom (ETA) descriptor, which can be represented as the sum of the βs values of all non-hydrogen vertices divided by 2. The term ′βs′ can be denoted as

$$ \sum {\beta}_s=\sum x\sigma $$

Here, x represents the contribution of sigma bonds and σ signifies parameters related to sigma bonds. During the computation of β values, the sigma bond contribution is taken as 0.5 for a bond between two electronegative atoms of similar type and 0.75 for a bond between dissimilar electronegative atoms. Sigma bonds connecting different heteroatoms therefore give higher descriptor values, which suggests that compounds bearing dissimilar heteroatoms have greater selectivity for the A2A receptor than those with similar heteroatoms, as seen in compounds A-25, A-23, and A-4. The E-state descriptor SsssN (> N-) encodes the intrinsic electronic state of the nitrogen atom as perturbed by the electronic influence of the other atoms in the molecule, within the context of the molecular topology. The electronegative contribution of nitrogen is well depicted in this descriptor, and the positive regression coefficient shows that an increase in the number of tertiary nitrogen atoms benefits receptor selectivity, as seen in compounds A-30 and A-4.

Other descriptors that contribute significantly to A2A receptor selectivity are nCIC, F07[C-C], and C-040. These descriptors give information about the number of rings, the types of bonds, and the size of the antagonists showing selectivity towards the receptor. The nCIC descriptor gives the number of rings (cyclomatic number) in the structure. Its positive regression coefficient suggests that a high number of rings increases the selectivity towards the A2A receptor, as observed in compounds A-25 and A-4. F07[C-C], a 2D atom pair descriptor, is the frequency of C–C atom pairs at topological distance 7 and provides information about the size (chain length) of the molecule. Thus, as the number of such pairs increases, i.e., as the carbon chain grows, the selectivity towards the A2A receptor increases (as in compounds A-4 and A-25). The atom-centered fragment descriptor C-040 (Table 3) gives the number of carbon atoms attached to heteroatoms by single, double, or triple bonds in a straight chain. Its negative regression coefficient suggests that an increase in the number of such fragments decreases the selectivity of the compound towards the A2A receptor, as seen in compounds A-6, A-7, and A-35. As this fragment implies a high number of double and triple bonds attached to the carbon, it can be concluded that unsaturation in the straight chain of the antagonists is unfavorable for receptor selectivity.

Intelligent consensus predictions

To further refine the predictions obtained from the individual models, we applied intelligent consensus modeling. Consensus modeling enhances the prediction performance of the models and reduces the test set errors. The consensus predictions for the test set compounds (Table 4) were better in terms of both the MAE-based criteria and the predicted \( R^2 \) parameter. Four different consensus approaches were used, employing the “Intelligent Consensus Prediction” tool [39]: CM0 (simple average of predictions), CM1 (average of predictions from the ‘qualified’ individual models), CM2 (weighted average predictions (WAPs) from the ‘qualified’ individual models), and CM3 (best selection of predictions (compound-wise) from the ‘qualified’ individual models). Of the four consensus models obtained, CM0 was found to be the best.
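The averaging step behind CM0, and a weighted variant in the spirit of CM2, can be sketched as follows. The rule that decides which models are ‘qualified’ is not described in this section and is therefore omitted; the inverse-error weights are an assumption made for illustration, and all numbers are hypothetical.

```python
def simple_average(predictions):
    """CM0-style consensus: mean of the per-model predictions for each compound."""
    return [sum(values) / len(values) for values in zip(*predictions)]

def weighted_average(predictions, weights):
    """Weighted mean of per-model predictions for each compound."""
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, values)) / total
            for values in zip(*predictions)]

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical observed values and predictions from four models.
observed = [2.0, 3.0, 1.5]
models = [
    [2.2, 2.9, 1.4],
    [1.8, 3.3, 1.7],
    [2.1, 2.8, 1.6],
    [1.9, 3.0, 1.3],
]
cm0 = simple_average(models)

# Assumed weighting: inverse of each model's cross-validated MAE (hypothetical).
cv_mae = [0.12, 0.15, 0.14, 0.16]
weighted = weighted_average(models, [1.0 / m for m in cv_mae])

print(mae(observed, cm0), [round(mae(observed, m), 3) for m in models])
```

In this toy case the individual errors partly cancel, so the simple average has a lower MAE than any single model; such cancellation is not guaranteed in general, which is why the consensus variants are compared on the test set.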

Table 4 Detailed summary of the QSAR models and consensus models obtained for the adenosine A2A selectivity of the PET tracer compounds (the quality of the best model CM0 is shown in italics)

Applicability domain

The applicability domain (AD) is an important tool for the reliable application of QSAR models. It can be considered a “theoretical region in chemical space defined by the respective model descriptors and responses in which the predictions are reliable” [42, 47]. We checked the AD of all the models using the standardization approach [48] to determine whether any molecule in the test set lies outside the AD of a model. The analysis found no test set compounds outside the AD and no outliers in the training set (see Supplementary Material II, Excel file).

Comparison with a previously published model

A direct comparison between the current models and a previously published model [29] is not feasible because of differences in the composition of the training and test sets. However, the current models can be considered advantageous, since they were developed using simple and easily interpretable two-dimensional descriptors, which do not require any conformational analysis or energy minimization before their calculation.

Conclusion

Parkinson’s disease is a neurodegenerative disease affecting elderly people around the world. An important target for its treatment is the adenosine A2A receptor, which is co-located with the D2 receptor and has the opposite pharmacological effect on motor function. Many studies suggest that blocking the A2A receptor would be a beneficial strategy in the treatment of PD. This work therefore uses QSAR analysis to correlate chemical structures with biological activity, with the aim of identifying the essential chemical features of an antagonist for selectivity and binding affinity to the A2A receptor. The computational approach consists first of calculating the molecular descriptors and second of correlating these descriptors with binding affinity and selectivity using different chemometric tools, such as the Genetic Function Algorithm (GFA), the Best Subset Selection (BSS) method, and the Intelligent Consensus Predictor (ICP) tool. The statistical quality of the models was checked using traditional metrics, both internally and externally. We have also discussed the contributions of the descriptors in the light of known binding mechanisms, such as π-π stacking, hydrophobic interaction, and hydrogen bonding with the different protein residues in the receptor binding site. From these insights, we found that electronegative atoms and aromatic rings such as benzene are favorable for enhancing the binding affinity to the A2A receptor. For selectivity as well, the electronegativity and aromaticity of the compounds play essential and influential roles. Furthermore, the docking studies supported the conclusions derived from the QSAR studies. In conclusion, the study highlights the pharmacophoric features mainly responsible for antagonizing adenosine receptors, which can be further modified for better binding and selectivity to the A2A receptor.
The simple two-dimensional (2D) descriptors appearing in all the models are easy to compute and require no conformational analysis or energy minimization. Thus, this information should help in the future development and synthesis of newer PET tracers targeting the adenosine receptor.