1 Introduction

Atmospheric processes in the troposphere directly impact human well-being. The thermodynamical variables (surface temperature, T, and relative humidity, RH), dynamical variables (wind speed, WS, and direction, WD), and the thermodynamic structure of the shallow atmosphere (mixing height) govern the transport of contaminants. T and RH directly influence thermal comfort at the workplace, thereby affecting economic growth, and also play a vital role in smart agricultural management. The sensible and latent heat fluxes from various land use categories influence the near-surface T and RH and directly affect the evolution of the atmospheric boundary layer (ABL). Pollutant dispersion is greatly affected by the mixing depth of the ABL. WS and WD play a significant role in the advection of pollutants. Photochemical reactions and photolysis rates of pollutants are affected by the radiation balance and cloud cover. Given the dynamic nature of the various feedbacks that influence the transport of air pollutants, it is necessary to model the meteorological variables more accurately for applications in air quality modeling (Boadh et al. 2016; Sathyanadh et al. 2017) and heat mitigation studies (Kadaverugu et al. 2021).

Air quality management has become a top priority in most countries as the latent health issues associated with poor air quality are conspicuous and pervasive. It is estimated that globally 90% of the children under 15 are exposed to particulate matter concentrations above the WHO permissible limits (WHO 2018). Out of the estimated 9 million premature deaths annually, 2.5 million are from India, and 1.8 million are from China (Landrigan et al. 2018). According to Greenstone and Fan (2018), poor air quality reduces life expectancy by four years as a global average, and by as much as ten years in some metropolitan cities like New Delhi. In developing countries, air quality is measured through manual methods (which have poor temporal resolution) and also monitored through networks of automated sensors (which are sparse). For instance, in India, the metro cities (area) like Delhi National Capital Region (1500 km2), Hyderabad (650 km2), Mumbai (600 km2), and Kolkata (205 km2) have just 30, 6, 3, and 4 automated monitoring stations, respectively, maintained by the Central Pollution Control Board (CPCB, https://cpcb.nic.in/), as of 2020. The coarse spatial–temporal resolution of meteorological and air quality data is insufficient for studying the dispersion and mapping of non-attainment zones (Kadaverugu et al. 2019). To fill the gap and complement the existing monitoring networks, several researchers have stressed the need for air quality modeling at multiple scales ranging from regional to urban and then to building scale (Kadaverugu et al. 2019). The need for improved air quality models can only be met with more accurate meteorological modeling. Hence, we aimed to study the suitability of the widely used Weather Research and Forecasting (WRF) model and tested its performance in the central Indian region.

The WRF model is widely used globally for multipurpose numerical weather predictions, which provides a platform for chemical transport modeling with the WRF-Chem model (https://www.mmm.ucar.edu/weather-research-and-forecasting-model). For instance, the WRF modeling platform is applied for studying aerosol impacts over the Mediterranean region (Georgiou et al. 2018), tropospheric ozone in Brazil (Gavidia-Calderón et al. 2018), and climate modeling in the USA (Yahya et al. 2017). Also, the WRF model in complement with other air quality models is used for source apportionment studies (Wu et al. 2018) and air quality mapping due to forest fires (Jose et al. 2017). Several studies using the WRF model are also reported from India on the spatial–temporal variation of O3 over Hyderabad (Sheel et al. 2016), characterization and source apportionment in Delhi (Sharma et al. 2016), and source apportionment in Visakhapatnam (Police et al. 2016). Although a WRF-Chem model is the most advanced mesoscale model featuring atmospheric dynamics, physics, and trace gas chemistry schemes (Kadaverugu et al. 2019), its relevance and suitability over various geo-climatic zones is not thoroughly understood. Model performance significantly varies with the choice of gaseous or aerosols schemes (Knote et al. 2015; Yang et al. 2018), planetary boundary layer (PBL) schemes (Perez et al. 2006; Cuchiara et al. 2014; Banks and Baldasano 2016), and gridded emission inventory (Saikawa et al. 2017). Especially, the PBL parameterization schemes are significant from a meteorological point of view. The boundary layer processes due to surface forcing are represented through various PBL parameterizations in a numerical weather prediction model.

The effect of PBL schemes on the WRF model’s performance is essential to be tested over diverse environments and geographical locations (Sathyanadh et al. 2017). Several researchers have evaluated the performance of the WRF model on different Indian domains. For instance, Panda and Sharan (2012) have reported on rather poor performance of the WRF model with almost all PBL and land surface model (LSM) parameterizations over northern India, but they observed a relatively better model performance with Noah-LSM and MY-E (Mellor-Yamada Eta) schemes over the Delhi region. Madala et al. (2014) have observed that rainfall and surface pressure over the Gandaki region are under-predicted by many PBL schemes, except MY-E and Grell Devenyi. Mohan and Bhati (2011) reported that Pleim-Xiu-ACM (Asymmetric Convective Model) and MM5-YSU (Multi-scale Model version5—Yonsei University) schemes suit best for the Delhi region in simulating temperature, relative humidity, and wind speed. The choice of PBL schemes also significantly affects the diurnal evolution of the mixing layer (Hariprasad et al. 2014), thereby influencing the dispersion of pollutants and other meteorological variables (Banks and Baldasano 2016).

The three PBL schemes viz. YSU (Hong et al. 2006), ACM2 (Pleim 2007a, b), and MY-E (based on Mellor and Yamada 1982) are widely studied over Indian domains. YSU and ACM2 are first-order non-local schemes (see Xie et al. 2012; Hariprasad et al. 2014), while MY-E is a one-and-a-half-order local Turbulent Kinetic Energy (TKE)-based scheme and is also referred to as the Mellor-Yamada Eta scheme (see Janjić 1994; Mesinger 1993a, b, 2010). Panda and Sharan (2012) observed that the WRF model with the MY-E scheme coupled with Noah LSM produced better results than other PBL schemes over western Indian regions. According to Shrivastava et al. (2014), the mixing height is quite accurately predicted over Mangalore City for both cold and dry seasons with the MY-E scheme combined with the Eta surface layer scheme (Mesinger 1993a; Janjić 1994, 1996). The WRF model applied in Ranchi City, situated in the mid-eastern part of India, showed that boundary layer variables are better simulated with ACM2 parameterization (Madala et al. 2015). Gunwani and Mohan (2017) showed that the WRF model with the ACM2 scheme produced better model performance across all Indian climatic zones. The ACM2 scheme also provides better meteorological forcing for mesoscale air quality modeling (according to Xie et al. 2012).

In the present study, we have analyzed the WRF model performance by varying the PBL schemes over three different geographical settings representing the rural, semi-urban, and urban environments of the central Indian domain. We have further compared the model performance during January and March, representing the winter and summer seasons. In this context, the objectives of the present study are (a) to understand the performance of the WRF model in capturing the diurnal variations, the spatial variations among urban, rural, and semi-urban environments, and the seasonal variations, and (b) to test the WRF model accuracy by varying the PBL schemes viz. ACM2, YSU, and MY-E. The WRF model-simulated meteorological variables (T, RH, WS, and WD) were compared with the observed data over the rural, semi-urban, and urban environments. The thermodynamic profile of the upper air was validated with the radiosonde data measured over the urban location. The description of the study domain, monitoring locations, data sources for the WRF model, and settings used in the study are presented in Sect. 2. The results and discussion on the WRF model-simulated surface and vertical thermodynamic variables are provided in Sect. 3, followed by the conclusions in Sect. 4.

2 Methods

2.1 Study area

The study area occupies 69,389 km2 (shown as Domain-2 in Fig. 1A) in the eastern part of Maharashtra state (also known as the Vidarbha region) of central India. It extends between 19.25–21.76° N and 77.41–79.78° E with an altitude varying from 13 to 1000 m above the mean sea level. The region experiences a hot tropical climate with temperatures ranging between 3.5–11.6 °C (minimum in winter, from November to February) and 43–47 °C (maximum in summer, from March to May). It receives an annual rainfall of 1200 mm (from June to August) during the southwest monsoon. Around 77.7% of the area is under cultivation, 17.5% is covered with forest, 2.7% is covered by shrubs/grasslands, and built-up area occupies nearly 0.8% (Fig. 1B) (derived using land cover maps having 300 m resolution from CRDP (http://www.esa.int/ESA)). Almost 90% of the soil in the study area is Vertisols, followed by 3% of Luvisols as per the FAO system (derived from a 250 m resolution soil cover map downloaded from www.soilgrids.org). The domain envelops the administrative boundaries of five districts (Nagpur, Wardha, Yavatmal, Chandrapur, and Amravati) of Maharashtra state, two districts (Betul and Chhindwara) of Madhya Pradesh state (towards the north), and partially covers Adilabad district of the neighboring state of Telangana (towards the south). Nearly 15.3 million people live in the study domain, of whom 38% reside in urban areas, which is higher than the national average of 30% (Census 2011).

Fig. 1

A WRF model nested domains (D1 and D2) overlaid over digital elevation model map of India, B Land use land cover over the inner domain representing the study area is shown. The thick black dots represent the meteorological monitoring locations (https://cpcb.nic.in/). S1: rural, S2: semi-urban, and S3: urban environment

The central Indian region envelops notable cities, including Nagpur, Chandrapur, Amravati, Yavatmal, and Nanded. The first two cities are especially loci for thermal power plants and mining activities. More than 65% of the thermal power generated in Maharashtra (10,170 MW) is from three power plants located in the study area, one at Chandrapur, and the remaining two are near Nagpur (at Koradi and Khaparkeda) (Mahagenco 2019). Chandrapur city is also a hub for many large-scale industries such as opencast coal mining and cement industries. India's Maharashtra state is one of the top economic and industrial powerhouses having a growth rate of 10% during 2016–2017 (DES 2018). The state in general and specifically the study area has been drawing attention from the investors due to the 'ease of doing business' policies (DES 2018) and rich mineral wealth (DGM 2016). Also, by virtue of its location at India's geographical center, the region has a potential for multi-modal connectivity, which is accelerating economic development.

Three locations representative of rural (S1), semi-urban (S2), and urban (S3) environments are considered within the Domain-2 of the study area (Fig. 1B). Agricultural fields and forest areas surround the S1 location. The semi-urban environment and agricultural fields surround the S2 location. The S3 is in the heart of the urban built-up area situated in Nagpur City (discussed in Sect. 2.3).

2.2 WRF V3.9 model

The Advanced Research WRF-v3.9 (ARW) core of the model was used in the present study. It employs Arakawa-C grid format upon which the governing equations depicting conservation of mass, momentum, and energy are discretized and solved using 2nd- and 3rd-order Runge–Kutta schemes for time integration and higher-order schemes for advection (Skamarock et al. 2008). Some of the prognostic variables solved by the model are 3D wind fields, perturbation potential temperature, surface pressure, geopotential pressure, turbulent kinetic energy, etc.

Two computational domains were set up, namely outer domain (D1) and inner domain (D2), covering the Indian peninsular region and the study area (eastern Maharashtra of Central India/Vidarbha region), respectively. These one-way nested domains D1 and D2 have grid resolutions of 12 and 4 km, respectively; they consist of 100 and 64 grid points in the east–west direction and 132 and 76 grid points in the north–south direction, respectively (Fig. 1A). Both domains have 28 vertical terrain-following hybrid-sigma levels up to 60 hPa, extending to an altitude of ~ 18.5 km from the ground level. National Centers for Environmental Prediction (NCEP) final analysis (FNL) 6-hourly meteorological gridded data of 1° spatial resolution were downloaded (NCEP FNL 2000) and used for providing initial and boundary conditions. The land use land cover static layer classified by the United States Geological Survey (USGS), having 24 categories, was used in the study. The WRF model parameterizations depicting various physical phenomena used in the study are summarized in Table 1.
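As a rough consistency check on the nested-domain setup, the horizontal extent of each domain can be estimated as the number of grid points times the grid spacing. The sketch below assumes that D1 uses 100 × 132 points and D2 uses 64 × 76 points (east–west × north–south), which is one reading of the grid counts given above; the function name is illustrative and is not part of WRF.

```python
def domain_extent_km(nx, ny, dx_km):
    """Approximate east-west length, north-south length, and area of a regular grid."""
    ew = nx * dx_km
    ns = ny * dx_km
    return ew, ns, ew * ns


# Assumed grid counts (east-west, north-south) and grid spacing in km.
domains = {"D1": (100, 132, 12), "D2": (64, 76, 4)}

for name, (nx, ny, dx) in domains.items():
    ew, ns, area = domain_extent_km(nx, ny, dx)
    print(f"{name}: {ew} km x {ns} km, about {area} km2")

# Parent-to-nest grid ratio implied by the 12 km and 4 km spacings.
print("nest ratio:", 12 // 4)
```

Under these assumptions, D2 spans roughly 256 km by 304 km (about 77,800 km2), which is of the same order as the 69,389 km2 study area it encloses.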

Table 1 The parameterization schemes used in the WRF model

Spatio-temporal profiles of the prognostic meteorological variables were simulated in the study region during January and March 2018. A one-week period with fair weather and no significant synoptic activity was selected in each month, viz. 11–18 January 2018 and 4–11 March 2018. The first 24 h of each simulation was treated as spin-up.

2.3 Surface and radiosonde observations

The simulated surface meteorological variables were validated against the available in situ data maintained by the CPCB. The hourly observations of meteorological data (Temperature, Relative Humidity, Wind Speed, and Direction) from the monitoring stations (Fig. 1B and Table 2) were downloaded from the web portal (http://cpcb.nic.in/). Monitoring stations located at S1, S2, and S3 represent three different environmental settings: rural, semi-urban (having industrial complexes), and urban contexts. The timestamps of the observed data were converted to UTC (local time minus 0530 h) to match the WRF model simulations.
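The timestamp correction described above can be sketched with Python's standard library. The timestamp format string and the function name are assumptions made for illustration; the format actually used in the downloaded station files is not specified in the text.

```python
from datetime import datetime, timedelta, timezone

# Indian Standard Time is UTC+05:30.
IST = timezone(timedelta(hours=5, minutes=30))


def ist_to_utc(stamp, fmt="%Y-%m-%d %H:%M"):
    """Parse a local (IST) timestamp string and return an aware UTC datetime."""
    local = datetime.strptime(stamp, fmt).replace(tzinfo=IST)
    return local.astimezone(timezone.utc)


# 05:30 local time corresponds to 00:00 UTC on the same day.
print(ist_to_utc("2018-01-12 05:30"))
```

Observations taken before 05:30 local time fall on the previous UTC day, which matters when the first 24 h of spin-up are excluded by date.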

Table 2 Surface monitoring stations in the study area for meteorological variables

Validation of upper air meteorological simulations with radiosonde data is a well-established method (Boadh et al. 2016). The radiosonde data for Nagpur station (id: 42867) falling in the study domain were obtained from the University of Wyoming (http://weather.uwyo.edu/upperair/sounding.html) portal. The observation site is at Nagpur City's airport, represented by the S3 location (urban setting). The vertical meteorological profile is measured twice a day at 0000 and 1200 UTC by the India Meteorological Department (IMD). The WRF model-simulated meteorological variables (Potential Temperature, Virtual Temperature, Temperature, Relative Humidity, and Wind Speed) were compared with the radiosonde data to determine the model performance in depicting the vertical meteorological profile over the urban location.
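For readers unfamiliar with the derived quantities compared above, the following sketch computes potential temperature and virtual temperature from basic sounding variables. It uses the standard textbook approximations (Poisson's equation with a 1000 hPa reference pressure, and Tv ≈ T(1 + 0.61 w)); the text does not state which formulas or constants were used in the study, so these are assumptions.

```python
P0_HPA = 1000.0  # reference pressure in hPa (conventional choice)
KAPPA = 0.286    # approximate R_d / c_p for dry air


def potential_temperature(t_kelvin, p_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return t_kelvin * (P0_HPA / p_hpa) ** KAPPA


def virtual_temperature(t_kelvin, mixing_ratio):
    """Virtual temperature (K); mixing_ratio is water vapor in kg per kg of dry air."""
    return t_kelvin * (1.0 + 0.61 * mixing_ratio)


# Hypothetical sounding level: 250 K at 500 hPa with 1 g/kg of water vapor.
print(potential_temperature(250.0, 500.0))
print(virtual_temperature(250.0, 0.001))
```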

2.4 Model performance indicators and post-processing

The WRF model accuracy in simulating the surface meteorological variables was tested with model performance metrics such as mean bias (MB), normalized mean bias (NMB), mean gross error (MGE), normalized mean gross error (NMGE), and Pearson's correlation coefficient (r). The mathematical definition of the metrics is presented in Appendix A. A positive MB indicates over-prediction of the variable by the model and vice versa. MGE is the mean of the absolute differences between modeled and observed values, which indicates the level of deviation. The normalized metrics NMB and NMGE indicate the relative deviations with respect to the observed values. The correlation coefficient, r, indicates the strength of the linear relationship between the modeled and observed variables. A perfect model would have r = 1, and MB, NMB, MGE, NMGE = 0. Several researchers have also used these model evaluation metrics (for example, Gunwani and Mohan 2017; Georgiou et al. 2018). One-way ANOVA (analysis of variance) was performed to identify whether there is any significant difference among the means of the variables simulated with the three PBL schemes. The first 24 h of each simulation (January 11th and March 4th), treated as spin-up time, were not considered in the calculation of model evaluation metrics. We used the openair (http://www.openair-project.org) library of the R statistical programming language to calculate the performance metrics and to develop the graphical representation of the data. These metrics were calculated for all variables at the three locations in both seasons. The soccer plot between the NMB and NMGE model metrics classifies the model performance with respect to different goalposts. The inner goalpost has low bias and low error, while the outermost goalpost has high bias and high error. Model accuracy in the simulation of upper air meteorological variables was assessed using Pearson's correlation coefficient (r). Before the correlation analysis, the values were interpolated to match the same vertical levels for both modeled and measured data. NCL-v6.3 (https://www.ncl.ucar.edu/index.shtml), QGIS-ver-2.18 (https://qgis.org/en/site/) and R-v3.4.3 (R Core Team 2017) were used to develop the graphics.
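The surface evaluation metrics named above can be illustrated with a short sketch. The study itself used the openair library in R; the Python version below is only an illustration and assumes the commonly used definitions (MB and MGE as means of the differences and absolute differences, NMB and NMGE as the corresponding sums divided by the sum of observations), since Appendix A is not reproduced here. The function name and sample values are hypothetical.

```python
import math


def model_metrics(modeled, observed):
    """Return MB, NMB, MGE, NMGE and Pearson's r for paired values.

    Pairs in which either value is missing (None) are dropped first.
    """
    pairs = [(m, o) for m, o in zip(modeled, observed)
             if m is not None and o is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid pairs")

    diffs = [m - o for m, o in pairs]
    abs_sum = sum(abs(d) for d in diffs)
    sum_obs = sum(o for _, o in pairs)

    mb = sum(diffs) / n        # mean bias
    mge = abs_sum / n          # mean gross error
    nmb = sum(diffs) / sum_obs  # normalized mean bias
    nmge = abs_sum / sum_obs    # normalized mean gross error

    mean_m = sum(m for m, _ in pairs) / n
    mean_o = sum_obs / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    r = cov / math.sqrt(var_m * var_o)

    return {"MB": mb, "NMB": nmb, "MGE": mge, "NMGE": nmge, "r": r}


# Hypothetical wind speeds (m/s): the model is 1 m/s too high everywhere.
print(model_metrics([2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0]))
```

In this toy case the correlation is perfect even though the model is biased, which is why r is read together with the bias and error metrics.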

3 Results and discussion

3.1 Surface variables

In brief, the results obtained according to the parameterizations and configurations used in the study conclude that the WRF model performed better in simulating the thermodynamical variables (T and RH) compared to the dynamical variables (WS and WD at 10 m). Gunwani and Mohan (2017) also reported similar observations over different climatic zones of India. The model has captured the diurnal variations in T and RH (at 2 m), but with a slightly warm and cold bias in T and consistently negative bias in the RH simulations. The differences in surface variables simulated at S1, S2, and S3 representing three different environmental contexts: rural, semi-urban, and urban settings, respectively, are also significant. We have also observed instances of statistically significant differences in the modeled variables according to the choice of PBL schemes considered in this study.

The WRF model has simulated T-2 m with correlation values in the range of 0.85–0.95 at urban (S3), 0.93–0.96 at semi-urban (S2), and 0.21–0.72 at rural (S1) stations inclusive of both months. A high degree of correlation in predicting surface temperatures is also reported by Hariprasad et al. (2014) and Boadh et al. (2016). Despite the high correlation, the modeled surface temperatures are slightly lower than the observed values, with a marginal negative MB, except at the rural station (S1) during March. The NMB and NMGE values have consistently remained in the range of −0.01 to 0.01, which indicates that the deviations are small relative to the observed values (an indication of good model performance). Further, the results indicate no significant difference among the model predictions obtained by varying the PBL schemes during March. However, a significant variation (p < 0.01) is observed with ACM2 during January for the S1 and S2 stations. The application of ACM2 has produced relatively higher ME at the S1 and S2 stations for both months. In contrast, it showed lower ME at the S3 station. Overall, the model error in simulating T is relatively higher at the S1 location for all three PBL schemes. Regarding the trends in diurnal and seasonal variation in T, the differences in average values are also well captured by the model (Fig. 2). During March (summer), T at S1 and S2 varied between 290 and 310 K, and at S3 it ranged between 295 and 310 K. In January (winter), T at S1 and S2 varied between 280 and 305 K, and at S3 it varied between 287 and 305 K. The range of T is lower in January than in March, and a relatively higher minimum T is simulated at the urban station (S3) (Fig. 2). A slight warm bias in daytime temperatures and a moderate cold bias in nighttime temperatures are observed at the S1 and S2 stations using the ACM2 scheme. Similar biases are reported by Hariprasad et al. (2014) and Madala et al. (2015) using ACM2 and all three schemes, respectively. Mohan and Bhati (2011) have also reported over-forecasting of T during daytime and under-forecasting during nighttime over the Delhi region. They suggested selecting different land surface models according to the intended application of the model.
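The significance statements above rest on a one-way ANOVA across the three PBL schemes. As a minimal illustration of what that test computes, the sketch below derives the F statistic from between-group and within-group sums of squares. The function name and the sample values are hypothetical and are not results from the study; obtaining a p-value additionally requires the F distribution, which a statistics package (for example, `scipy.stats.f_oneway`) provides.

```python
def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 2 m temperatures (K) from three schemes at one station.
acm2 = [291.0, 292.0, 293.0]
ysu = [292.0, 293.0, 294.0]
mye = [293.0, 294.0, 295.0]
print(one_way_anova_f([acm2, ysu, mye]))
```

A large F relative to the critical value for the given degrees of freedom suggests that at least one scheme produces a different mean.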

Fig. 2

Time series plot (in UTC) of measured and simulated Temperature (T in Kelvin, at 2 m from surface) at three stations (S1, S2, and S3). Lines represent the simulated values, while black dots represent surface measured data. S1: rural, S2: semi-urban and S3: urban environment

Although the WRF model in this study has simulated RH-2 m with a moderate degree of correlation, with r values in the range of 0.55–0.76 at rural (S1), 0.63–0.79 at semi-urban (S2), and 0.54–0.70 at urban (S3) stations inclusive of both months, the values of MB are consistently negative. The NMB during March varied from −0.41 to −0.53 inclusive of all stations, which indicates that the simulated values are under-forecast by roughly 41 to 53%. During January, the NMB varied from −0.07 to −0.39 inclusive of all stations, which indicates under-forecasting by 7–39%. The results suggest that the model performed considerably better in simulating RH during January than in March. Similar observations are reported by Hariprasad et al. (2014) in India, Misenis and Zhang (2010) over Mississippi, Wang et al. (2019) over China, and Garcia-Diez et al. (2013) over Europe using the PBL schemes viz. ACM2, YSU, and MY-E. Studies by Sathyanadh et al. (2017) and Mohan and Bhati (2011) over the Northern Indian region have also reported an under-forecasting of RH, especially during the summer season. In contrast, Hu et al. (2010) have reported an over-prediction of RH over the USA using the ACM2 and MY-E PBL schemes. The diurnal variability and seasonal differences in the values are well captured at all stations (Fig. 3). The simulated RH values in March (10–40%) are lower than the January values (20–75%), as March is the beginning of the hot and dry summer season in central India (Boadh et al. 2016). The YSU scheme differed significantly (p < 0.001) during January at all stations and at S3 (p < 0.001) during March. Overall, the MY-E scheme has performed relatively better than the other schemes (having lower ME), and the urban station has higher ME than the rest. Dang et al. (2016) observed a significant negative correlation (r = −0.34 at p < 0.05) between the surface RH and the height of the planetary boundary layer. As the PBL height increases, water vapor dilution is predominant unless the release of water vapor from the earth's surface is significant (Wang et al. 2016). Especially in hot summer seasons over Central India, the sensible heat fluxes from the land surface drive higher mixing heights (see Sect. 3.3), which thereby cause a consistent under-forecasting of RH-2 m in the study area.

Fig. 3

Time series plot (in UTC) of measured and simulated Relative humidity (RH in %) at three stations (S1, S2, and S3). Lines represent the simulated values, while black dots represent surface measured data. S1: rural, S2: semi-urban and S3: urban environment

Unlike T and RH, WS is a highly dynamic variable, which is significantly influenced by local factors such as topographic features and building geometry. The model-simulated wind speed at surface level is a representative value of WS in a control volume having dimensions of 4 km by 4 km (an area of 16 km2). The results indicate that the WRF model-simulated WS (at 10 m) is not well correlated with the observed data (Fig. 4). The absolute value of the correlation coefficient varied in the range of 0.10–0.33 at rural (S1), 0.05–0.15 at semi-urban (S2), and 0.02–0.20 at urban (S3) station inclusive of both months. The majority of NMB and NMGE values are above 0.4 for all stations and months, which indicates an incorrect forecast and a poor match between the modeled and observed data. The measured values of WS are in the range of 0–2 m/s at all stations during both months, while the simulated values are in the range of 0–4 m/s, and in some instances, it has reached up to 6 m/s during March. The MY-E scheme at S3 has shown a relatively higher ME, while at S1, the scheme performed better than the rest. Further, the extreme values recorded at S3 during January might have also resulted in inadequate model validation.

Fig. 4

Time series plot (in UTC) of measured and simulated Wind Speed (WS in m/s, at the surface level) at three stations (S1, S2, and S3). Lines represent the simulated values, while black dots represent surface measured data. S1: rural, S2: semi-urban and S3: urban environment

Similarly, the results indicate a poor correlation between the modeled and observed wind direction (WD) at 10 m above the ground (Fig. 5). The absolute value of the correlation coefficient varied in the range of 0.04–0.23 at rural (S1), 0.01–0.21 at semi-urban (S2), and 0.02–0.25 at urban (S3) stations inclusive of both months. The NMGE values are greater than 0.47 for all PBL schemes, months, and stations, indicating an absolute error > 47% in the model simulations. Overall, the observations suggest that WS and WD are simulated somewhat better in March than in January, although poorly in both months. Over-forecasting of surface-level WS has been reported by several other studies carried out over different geographical settings (Hariprasad et al. 2014; Madala et al. 2015; Sathyanadh et al. 2017; Ferrero et al. 2018). The WRF model with the settings used in this study has failed to capture the southeast winds during January and the low-intensity winds in almost all directions during March. A bulk shift in the wind direction pattern is reported in earlier studies (Hariprasad et al. 2014; Madala et al. 2015) and is attributed to poor accounting of surface drag parameters and roughness factors. Urban environments in particular are characterized by low wind speed conditions due to the complex surface interactions (Ferrero et al. 2018). The over-estimation of WS by the model can be attributed to the inadequate representation of surface topography (Duan et al. 2018) and land surface processes in the model parameterizations. Other PBL schemes have also been evaluated elsewhere: the MYNN2 (Mellor–Yamada–Nakanishi–Niino Level 2.5) parameterization was found to have the smallest error in simulating surface variables over the Ganga region in Uttar Pradesh state of India (Sathyanadh et al. 2017), and Madala et al. (2015) reported that ACM2 is relatively better in simulating surface meteorology over Ranchi, India.

Fig. 5

Time series plot (in UTC) of measured and simulated Wind Direction (WD in degrees from North) at three stations (S1, S2, and S3). Lines represent the simulated values, while black dots represent surface measured data. S1: rural, S2: semi-urban and S3: urban environment

The scatter plots between each variable's modeled and observed values show that T is quite well forecast, with the smallest bias and error among the variables (Fig. 6), while RH is under-predicted and WS is over-predicted. The soccer plot provides a visual interpretation of the NMB and NMGE percentages of all variables, PBL schemes, and seasons (Fig. 7).

Fig. 6

Scatter plot between the measured and the simulated meteorological variables–Temperature (T), Relative Humidity (RH), and Wind Speed (WS) cumulative data of all seasons and stations. The dashed line indicates the y = x line

Fig. 7

Soccer plot between NMB (normalized mean bias, in %) and NMGE (normalized mean gross error, in %) derived from simulated and measured data of meteorological variables. S1: rural, S2: semi-urban and S3: urban environments

The uncertainty and error in the model simulations might have percolated from the improper land use classification (Karlický et al. 2017) and inadequacies in the land surface processes. The physical effects of land surface elements and land use scenarios (for example, irrigation schedule of crops that affect the latent heat flux, T, RH, etc.) will also have to be accounted for while setting up a regional scale model. A decadal study over Delhi by Sati and Mohan (2017) indicated that the increase in urban land use had increased surface heat fluxes, thereby severely affecting the atmospheric dynamics. They also observed that increase in the built-up area has also resulted in lower surface winds and relative humidity. Further, inconsistencies in WS simulations might be mainly due to its dependency on local topography (Duan et al. 2018) and building geometry (Kadaverugu et al. 2019), which are nearly impossible to accommodate even in a 1 km resolution grid. Although the Urban Canopy Model (UCM) parameterizations are not included in the present study, its inclusion might improve the model performance, especially over urban locations (Bhati and Mohan 2016). It is also reported that there is further no significant improvement in the model accuracy due to refining the grid resolution from 4 to 1 km (Pay et al. 2014). Mohan and Bhati (2011) reported that the WRF model accuracy did not improve significantly by increasing the grid resolution from 18 to 6 km to 2 km over the Delhi region, India. The WS and WD forecast might be improved by further downscaling the regional scale variables to building scale using the Computational Fluid Dynamics (CFD) models (Kadaverugu et al. 2019). The local factors like building configuration, vegetation, and open spaces play a vital role in channelizing the urban surface wind flow, which are accounted for in the CFD modeling.

The rationale for the identification of monitoring locations is debatable. Air quality and meteorological monitoring equipment is usually located in easily accessible places such as next to roads and in public offices. Such sites are generally prone to biases from local factors and fail to represent the regional background. Inconsistencies also stem from validating volume-averaged simulated meteorological variables against point measurements collected from surface monitoring instruments. The limitations in forecasting dynamic variables like wind speed and direction are mostly inevitable. Further studies are required to explore the need for downscaling the mesoscale wind flow simulations to building scale through the integration of localized CFD models, depending on the need of the study.

3.2 Vertical profile

The vertical profiles of temperature (T), relative humidity (RH), wind speed (WS), potential temperature (PT), and virtual temperature (VT) simulated by the WRF model at various vertical levels (0–18 km) were validated against weather balloon radiosonde data at Nagpur City (S3, urban location) for two different seasons, January (Fig. 8) and March (Fig. 9). Values at the same vertical level were averaged over the simulation period and compared with the averaged values of the available radiosonde data measured at 0000 and 1200 UTC. As radiosonde data for the bottom and top levels were not available during March, only the data at overlapping vertical levels (between modeled and balloon heights) were used for computing the model performance metrics. The model captured the stable boundary layer in the early hours (0000 UTC/0530 LT) quite accurately. The evolution of the stable boundary layer into the mixing layer is also evident in the simulated gradients of T and RH at 1200 UTC. Similar observations could not be made for March because measured data at the surface level were not available. The results show that the choice of PBL scheme has a negligible effect on the simulated thermodynamic structure of the atmosphere. Hence, only one PBL scheme (YSU) is considered in the correlation analysis with the measured data at overlapping vertical levels. The results indicate a high correlation (> 0.85) for all variables during January and a moderate to high correlation, varying between 0.45 and 0.94, during March. The January results indicate that the model predicted the vertical profiles of all variables quite accurately (0.01 < NMGE < 0.12), except for an over-forecast of about 20% in RH at higher altitudes (NMGE = 0.365). Although the surface WS is prone to high uncertainty, its vertical profile exhibited a high degree of correlation with a slight negative bias (0.12 < NMGE < 0.26, inclusive of both months). Further investigation is required to assess the deviations in the model, especially during March. All three PBL schemes simulated the upper air profile over Nagpur City quite accurately, although Boadh et al. (2016) reported that the YSU scheme performed relatively better than the rest.
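The comparison at overlapping vertical levels can be illustrated with a minimal sketch. It assumes that the modeled profile is linearly interpolated to the radiosonde heights and that NMGE follows the common definition, the sum of absolute model-minus-observation differences divided by the sum of observations; the study's own processing may differ. The function name `profile_metrics` and its arguments are hypothetical.

```python
import numpy as np

def profile_metrics(z_model, v_model, z_obs, v_obs):
    """Compare a modeled profile with radiosonde data on overlapping levels.

    Assumptions (not taken from the study): linear interpolation of the model
    to the balloon heights, and NMGE = sum|M - O| / sum(O), which is only
    meaningful for positive quantities such as T in K, RH in % or WS in m/s.
    """
    z_model = np.asarray(z_model, dtype=float)  # must be increasing
    v_model = np.asarray(v_model, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    v_obs = np.asarray(v_obs, dtype=float)

    # Keep only balloon levels that lie inside the model height range
    # and carry a valid measurement (the "overlapping" levels).
    ok = (z_obs >= z_model.min()) & (z_obs <= z_model.max()) & np.isfinite(v_obs)
    z, obs = z_obs[ok], v_obs[ok]

    # Bring the model onto the observed heights.
    mod = np.interp(z, z_model, v_model)

    r = np.corrcoef(mod, obs)[0, 1]                  # Pearson correlation
    nmge = np.sum(np.abs(mod - obs)) / np.sum(obs)   # normalized mean gross error
    nmb = np.sum(mod - obs) / np.sum(obs)            # normalized mean bias (signed)
    return {"n": int(z.size), "r": float(r), "nmge": float(nmge), "nmb": float(nmb)}
```

Reporting the signed bias alongside NMGE helps separate systematic over- or under-forecasting from scatter, since NMGE by itself is always non-negative.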

Fig. 8

Vertical profiles of Temperature (T in Kelvin), Relative Humidity (RH in %), Wind Speed (WS in m/s), Potential Temperature (PT in Kelvin) and Virtual Temperature (VT in Kelvin) during 11–18 January, 2018 at 0000 UTC and 1200 UTC over the Nagpur urban area

Fig. 9

Vertical profiles of Temperature (T in Kelvin), Relative Humidity (RH in %), Wind Speed (WS in m/s), Potential Temperature (PT in Kelvin) and Virtual Temperature (VT in Kelvin) averaged during 04–11 March, 2018 at 0000 UTC and 1200 UTC over the Nagpur urban area
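The derived quantities plotted in Figs. 8 and 9 can be obtained from T, RH, and pressure with standard textbook relations. The sketch below is illustrative only: it assumes a reference pressure of 1000 hPa, a dry-air exponent R_d/c_p of about 2/7, and the Bolton (1980) approximation for saturation vapor pressure. The function names are hypothetical, and the study may have used different formulations.

```python
import math

KAPPA = 2.0 / 7.0   # R_d / c_p for dry air (approximation)
EPSILON = 0.622     # ratio of molar masses, water vapor to dry air
P0_HPA = 1000.0     # reference pressure for potential temperature

def potential_temperature(t_k, p_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return t_k * (P0_HPA / p_hpa) ** KAPPA

def saturation_vapor_pressure(t_k):
    """Saturation vapor pressure over water (hPa), Bolton (1980) approximation."""
    t_c = t_k - 273.15
    return 6.112 * math.exp(17.67 * t_c / (t_c + 243.5))

def virtual_temperature(t_k, rh_percent, p_hpa):
    """Virtual temperature (K) from temperature (K), RH (%) and pressure (hPa)."""
    e = rh_percent / 100.0 * saturation_vapor_pressure(t_k)  # vapor pressure, hPa
    w = EPSILON * e / (p_hpa - e)                            # mixing ratio, kg/kg
    return t_k * (1.0 + w / EPSILON) / (1.0 + w)
```

For dry air (RH = 0%), the virtual temperature equals the actual temperature, and at the reference pressure the potential temperature equals the actual temperature, which gives two quick sanity checks.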

3.3 PBL/mixing layer height

Friction velocity and sensible heat flux are known to drive the evolution of the planetary boundary layer. The WRF model in this study reasonably captured the temporal evolution of the planetary boundary layer height (PBLH). The temporal variation of PBLH for Nagpur City (S3) is shown in Fig. 10. Local time is Indian Standard Time (IST), which is UTC + 05:30. The results indicate that PBLH begins to rise from 0800 IST, peaks around 1530 IST, and subsides by 1730 IST. From late evening to early morning, the stable boundary layer height is 50–60 m in March and 40–60 m in January. The diurnal trend in PBLH is similar in January and March. However, the mixing layer is deeper during March (~ 3500 m) than in January (~ 1800 m). A deep mixing layer in summer and a relatively shallow boundary layer in winter were also observed by Madala et al. (2015), and the same trend for Nagpur City was noted by Boadh et al. (2016). The deep mixing layer might have resulted from the high surface heat fluxes (Sathyanadh et al. 2017) during the summer season, when the soil moisture is lower than in other seasons. The model-predicted maximum upward sensible heat flux values are in the range of 418.60–481.74 W m−2 during March and 324.78–364.71 W m−2 during January (not plotted). In the present study, the ACM2 scheme simulated deeper convective layers than the other schemes during March over the S1 (rural) and S2 (semi-urban) stations. Sathyanadh et al. (2017) also reported higher PBLH associated with ACM2. In contrast, MY-E consistently simulated deeper mixing layers during January over all three stations.
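The diurnal description above relies on converting model output times from UTC to IST and averaging PBLH by local hour. A minimal sketch of that step follows, assuming hourly model output with timezone-aware UTC timestamps; the function names `mean_diurnal_cycle` and `peak_hour` are hypothetical and do not come from the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Indian Standard Time is UTC + 05:30.
IST = timezone(timedelta(hours=5, minutes=30))

def mean_diurnal_cycle(times_utc, pblh_m):
    """Average PBLH (m) by IST hour of day.

    times_utc must be timezone-aware datetimes in UTC. Output written on the
    UTC hour falls at half past the hour in IST, so bucket 15 holds 1530 IST.
    """
    buckets = defaultdict(list)
    for t, h in zip(times_utc, pblh_m):
        buckets[t.astimezone(IST).hour].append(h)
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}

def peak_hour(cycle):
    """IST hour bucket with the largest mean PBLH."""
    return max(cycle, key=cycle.get)
```

Averaging over all simulation days before locating the peak reduces the influence of a single anomalous day on the reported timing.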

Fig. 10

Daily averaged values of Planetary Boundary Layer Height (PBLH) simulated at three different locations. S1: rural, S2: semi-urban and S3: urban environments

4 Conclusions

We emphasize the need for more accurate meteorological modeling as a prerequisite for good regional air quality modeling and weather forecasting. In this context, we studied the Weather Research and Forecasting (WRF) model over a central Indian domain with an area of 69,000 km2, using nested grid resolutions of 12 km and 4 km. The surface and upper air meteorological variables were simulated with three planetary boundary layer (PBL) schemes, viz. ACM2, YSU, and MY-E, over three locations representing rural, semi-urban, and urban settings, and were validated against observed data collected during January and March 2018. The study addressed two questions: a) whether the meteorological variables simulated by the WRF model over the central Indian region are comparable with the measured data at rural, semi-urban, and urban settings, and b) whether there is any significant difference in the meteorological variables simulated by the three PBL schemes.

Overall, the results indicate that the surface thermodynamic variables (temperature and relative humidity) are simulated more accurately than the dynamic variables (wind direction and speed). Surface temperature and relative humidity are simulated with low bias and error at all three stations during both months. However, the results indicated slight cold and warm biases in the nighttime and daytime temperatures, respectively. The surface wind speed is over-predicted, and the wind direction is poorly correlated with the observations. These discrepancies in the simulation of wind speed and direction might be due to the inadequate representation of surface drag and roughness parameters in mesoscale models. The normalized mean bias and error metrics showed that all three PBL schemes produced very similar outcomes for wind speed and direction. With the YSU and MY-E schemes, the WRF model simulated the surface temperature relatively better at the rural and semi-urban locations for both seasons, whereas ACM2 performed relatively better at the urban location. Overall, the MY-E scheme performed better in simulating relative humidity. Further, the MY-E scheme performed relatively better in wind speed simulation at the rural and semi-urban locations but poorly at the urban location. Nevertheless, the relative performance of the three PBL schemes in simulating the surface wind speed and direction could not be evaluated conclusively, owing to the WRF model's poor forecast of these variables within the settings used in the study.

The vertical thermodynamic structure (temperature, potential temperature, virtual temperature, relative humidity, and wind speed) is simulated more accurately during January than during March. The model captured the diurnal trends and seasonal variation in the boundary layer mixing heights. The results show that ACM2 simulated deeper convective layers during March at the rural and semi-urban locations, while the MY-E scheme simulated the deeper convective layer at the urban site. Overall, the MY-E scheme consistently simulated deeper mixing layers during January at all locations.

The results indicate that the WRF model, with the parameterizations and settings used in this study, is largely suitable for simulating the thermodynamic meteorological variables across different environmental contexts (rural, semi-urban, and urban). Further studies are required to understand the factors behind the inconsistencies in capturing the surface wind speed and direction. The results are more encouraging for applications of the WRF model in agrometeorology and fog-related studies than for air quality studies, owing to significant inaccuracies in the simulation of surface winds. The efficacy of dynamical downscaling of the wind flow using CFD models remains to be tested for accurate air quality applications.