
1 Introduction


Tropical cyclones (TCs) are among the most feared and deadly weather systems in the world. The extreme weather caused by TCs has a tremendous impact on human life, property, and production activities [1]. The storm surges, huge waves, and other marine disasters caused by TCs pose major safety hazards and cause serious economic losses to maritime operations and transportation. The west coast of the Pacific Ocean, where China is located, is one of the regions with the most TC activity and one of the most severely affected regions in the world. Therefore, improving TC forecasting and warning capability in the Northwest Pacific is of great significance for marine disaster prevention and mitigation, resource development, rights protection, law enforcement, and national security.

Track and intensity prediction is the key to TC warning and disaster prevention. Thanks to improvements in satellite [2] and dropsonde [3,4,5] observations as well as a better understanding of the physical processes that control the motion of TCs [6], considerable progress has been made in TC track forecasts in the past three decades. However, predictions of TC intensity and its change have improved only marginally [7]. The principal reason is that a stand-alone atmospheric numerical model fails to describe reasonably the influence of the intense ocean-atmosphere interaction on intensity change. For example, TC surface wind stress induces upwelling and strong turbulent mixing in the upper ocean, which cools the ocean surface through the entrainment of cold water from the thermocline into the mixed layer [8,9,10]. The reduction of sea surface temperature (SST) reduces the enthalpy flux from the ocean to the atmosphere, resulting in a decrease in TC intensity. Moreover, strong winds under TC conditions can generate a layer of foam at the air-sea interface [11] and inject enormous amounts of sea spray into the atmospheric boundary layer. The former impedes the transfer of momentum from the wind to the ocean, and the latter enhances the sensible and latent heat transfer [12,13,14], so both modify TC intensity. In contrast, an air-sea coupled numerical model offers a more accurate description of the physical processes governing intensity variation.

Based on internationally advanced atmospheric, oceanic, and ocean wave component models, a coupled numerical prediction system for TCs in the Northwest Pacific was developed [15, 16] and has been run operationally at the National Marine Environmental Forecasting Center since 2015. A schematic diagram of the coupled model physical concept is shown in Fig. 1. For operational use, the prediction system should spend as little time as possible on the whole forecasting workflow. However, it is difficult to reduce the time spent on receiving initial data and on pre- and postprocessing. Therefore, parallel optimization of the coupled model is the natural way to achieve efficient TC forecasting and early warning.

Fig. 1. Schematic diagram of the coupled model physical concept

2 Model Setup

2.1 Atmospheric Model Setup

The Weather Research and Forecasting (WRF) Model version 3.4 was used in this study as the atmospheric component of the Tropical-Cyclone Coupled model. The WRF model is initialized on 5 July 2018, with initial and boundary conditions taken from 6-hourly Climate Forecast System Reanalysis (CFSR) data at 1/2 degree spatial resolution. The WRF domain covers the northwest Pacific and China’s adjacent seas with a horizontal resolution of 1/12 degree and 890 × 846 grid points. In the vertical, 61 sigma levels are used. This domain includes the area of the west Pacific warm pool where typhoons are generated and the region through which they usually pass. This resolution can resolve typhoons with diameters as small as several tens of kilometers.

2.2 Hydrodynamic Model Setup

The Regional Ocean Modeling System (ROMS) [17], svn revision 455, was used in this study as the ocean component of the Tropical-Cyclone Coupled model. To resolve mesoscale oceanic processes, such as mesoscale eddies, that might influence typhoon activity, the ROMS horizontal grid is set to 1339 × 1391 points. There are 30 layers in the vertical direction.

The ROMS model is initialized on the 5th of July 2018 using the fields of currents, salinity, temperature, and sea surface height from the CFSR ocean dataset with a 1/2 degree spatial resolution. The lateral boundary conditions of the coupled model (including currents, salinity, and temperature) are provided by the same dataset.

Fig. 2. Model domain in atmospheric model WRF (a) and the bathymetry in ocean model ROMS (b)

2.3 Ocean Wave Model Setup

The Simulating WAves Nearshore (SWAN) model version 40.81, developed by Delft University of Technology, was used in this study as the ocean wave component of the Tropical-Cyclone Coupled model. Over the ocean, SWAN uses the same grid as the ROMS model. The wind forcing of the SWAN model is the National Centers for Environmental Prediction (NCEP) Final Analysis (FNL) data, with 1/2 degree spatial resolution and 6 h temporal resolution. The SWAN time step was set to 3600 s, and the model was initialized from a steady state.

2.4 HPC Facilities

Considering the workflow of the coupled model system and the number and performance of the test platform’s processors, the system is deployed on the Lenovo cluster of the National Marine Environmental Forecasting Center. The specific configuration is shown in Table 1.

Table 1. HPC facilities used for the tests

2.5 Coupled Variables

We used the Model Coupling Toolkit (MCT) [18, 19] to exchange variables between the component models. Figure 3 illustrates the coupling variables. The WRF atmospheric model provides the 10-m surface wind to the SWAN wave model. The ROMS ocean model receives the heat fluxes and momentum fluxes calculated by the atmospheric model. The ROMS ocean model provides SST to the WRF atmospheric model, and sea surface currents, sea surface elevation, and ocean bathymetry to the SWAN wave model. The SWAN wave model provides significant wave height and wavelength to the WRF atmospheric and ROMS ocean models. The SWAN wave model also provides the ROMS ocean model with the wave direction, surface and bottom wave periods, percentage of wave breaking, wave energy dissipation, and bottom orbital velocity.
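The exchanges described above can be summarized as a small lookup table. The sketch below is only an illustrative transcription of the text, not the MCT interface; the dictionary layout and the helper name received_by are assumptions made for this example.

```python
# Exchange map transcribed from the paragraph above.
# Keys are (source model, destination model); values are the exported fields.
# This is an illustrative data structure, not part of MCT or of any component model.
COUPLING = {
    ("WRF", "SWAN"): ["10-m surface wind"],
    ("WRF", "ROMS"): ["heat flux", "momentum flux"],
    ("ROMS", "WRF"): ["SST"],
    ("ROMS", "SWAN"): ["sea surface currents", "sea surface elevation", "bathymetry"],
    ("SWAN", "WRF"): ["significant wave height", "wavelength"],
    ("SWAN", "ROMS"): [
        "significant wave height", "wavelength", "wave direction",
        "surface wave period", "bottom wave period",
        "percentage of wave breaking", "wave energy dissipation",
        "bottom orbital velocity",
    ],
}


def received_by(model):
    """Return {source: [fields]} for everything `model` imports (hypothetical helper)."""
    return {src: fields for (src, dst), fields in COUPLING.items() if dst == model}
```

For instance, received_by("ROMS") lists the fields ROMS imports from WRF and from SWAN, which makes it easy to check that every exchange named in the text has both a sender and a receiver.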

Fig. 3. Coupled variables of the model

3 Scaling Experiments

3.1 Parallel Tests Analysis

Fig. 4. Intel® Trace Collector data of the coupled model (blue stands for computing and red for MPI communication)

Figure 4a shows the ROMS–WRF–SWAN coupled model, in which each component runs under MPI. Analysis with the Intel® Trace Collector revealed two main problems in the parallel computation of the typhoon-coupled model. First, the internal algorithm of the SWAN ocean wave model leads to low parallel efficiency. SWAN performs only two-dimensional operations, so in principle its computational cost should be much smaller than that of the atmospheric and ocean models, which perform three-dimensional operations. However, as seen in Fig. 4a, the internal computational efficiency of the SWAN model is much lower than that of the WRF atmospheric and ROMS ocean models.

The second problem is load imbalance between the component models of the TC coupled model, which lowers the overall efficiency, as shown in Fig. 4b. The computational efficiency of the ocean wave model is much lower than that of the atmosphere and ocean models, yet the three component models are assigned the same number of CPU cores in the current operational configuration, which causes load imbalance among them. Before each coupling exchange, the atmosphere and ocean models wait a long time for the wave model, so the overall efficiency of the coupled model is low.
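The effect of this imbalance can be illustrated with a short calculation. The following is a minimal sketch that assumes every component must reach the coupling point before the exchange proceeds and that all components hold the same number of cores; the function names and the timings in the usage note are hypothetical, not measurements from this study.

```python
def idle_time_per_exchange(compute_seconds):
    """Seconds each component waits at a coupling point for the slowest one."""
    slowest = max(compute_seconds.values())
    return {name: slowest - t for name, t in compute_seconds.items()}


def imbalance_fraction(compute_seconds):
    """Fraction of the total core-time between two exchanges that is spent waiting.

    Assumes equal core counts per component, as in the configuration
    described above.
    """
    slowest = max(compute_seconds.values())
    total = slowest * len(compute_seconds)
    return sum(idle_time_per_exchange(compute_seconds).values()) / total
```

With illustrative timings of 60 s for WRF, 50 s for ROMS and 100 s for SWAN between two exchanges, WRF would idle for 40 s and ROMS for 50 s, so 30% of the allocated core-time would be spent waiting.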

3.2 SWAN Model Parallel Algorithm Optimization

Fig. 5. Parallel algorithms: (a) four-color method and (c) block wavefront method [20]; Intel® Trace Collector results (b) and (d) for (a) and (c), respectively (blue stands for computing and red for waiting)

The SWAN ocean wave model uses an implicit scheme, which is more challenging to parallelize than an explicit scheme. SWAN offers two parallel strategies [20]. One is the four-color method, which colors the subdomains with four different colors, such as red, orange, green and blue, as seen in Fig. 5a. Within the same sweep, each colored subdomain starts its updates in a different order, and the unknowns are updated over four sweeps according to the color of the subdomain. The numerical overhead and the number of synchronization points can thereby be reduced while a high degree of parallelism is retained [21, 22].

The other strategy is the block wavefront method [23]. It decomposes the computational domain into stripes. In each sweep, the CPUs that own different stripes communicate with each other. The unknowns updated in each stripe are shown as circles in Fig. 5c, and the algorithm proceeds as follows.

  • Step 1: The unknowns N(i, j, l, m) along j = 1 in CPU1 are updated.

  • Step 2: The unknowns N(i, j, l, m) along j = 2 in CPU1 and j = 1 in CPU2 are updated in parallel. The values along j = 1 in CPU1 are already known.

  • Step 3: The unknowns N(i, j, l, m) along j = 3 in CPU1, j = 2 in CPU2, and j = 1 in CPU3 are updated in parallel. The values along j = 1, 2 in CPU1 and j = 1 in CPU2 are already known.
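The step pattern listed above can be sketched as a schedule generator. This is a minimal illustration that encodes only the ordering given in the steps (CPU p updates row j at step j + p - 1); it is not SWAN's implementation, and the function name wavefront_schedule is an assumption made for this example.

```python
def wavefront_schedule(n_cpus, n_rows):
    """Return, for each step, the (cpu, j) pairs that are updated in parallel.

    CPUs and rows are numbered from 1, as in the steps above. CPU p starts
    one step after CPU p-1, so it updates row j at step j + p - 1.
    """
    steps = []
    for step in range(1, n_rows + n_cpus):  # n_rows + n_cpus - 1 steps in total
        active = [
            (cpu, step - cpu + 1)
            for cpu in range(1, n_cpus + 1)
            if 1 <= step - cpu + 1 <= n_rows
        ]
        steps.append(active)
    return steps
```

For three CPUs with three rows each, the first three entries reproduce Steps 1 to 3, and the pipeline drains after five steps in total. The start-up and drain phases, during which some CPUs are idle, become relatively shorter as the number of rows per stripe grows.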

Zijlema [24] examines in detail the numerical and computational efficiency of the four-color and block wavefront methods. In this paper, we used the block wavefront method to improve the efficiency of the coupled model. As Fig. 5b and 5d show, the block wavefront method reduced the communication time and gave a well-balanced load. With 196 CPU cores, the SWAN calculation time for a 1-day simulation was reduced from 250 s to 211 s, an 18.5% performance improvement (250/211 ≈ 1.185).

3.3 Ocean Model Grid Optimization

Fig. 6. Ocean model grid with nine tiles (after https://www.myroms.org/wiki/Parallelization)

A parallel domain of ROMS with nine tiles is shown in Fig. 6, with one color per tile. For MPI jobs, each tile is handled by one MPI process. The number of tiles in each direction is set by NtileI and NtileJ in the input file, and their product must equal the total number of MPI processes. The overlapping areas, two grid points wide, are called ghost points or halo points and are used for the exchange between neighboring tiles. The exchange in the east–west direction is performed before the exchange in the north–south direction. Figure 6b, c and d show the points before and after an update. The interior points are colored according to their tile, and the halo points are colored gray.
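The tiling constraint can be illustrated with a simplified partitioner. The sketch below only splits the index ranges evenly and checks that NtileI × NtileJ equals the number of MPI processes; it does not reproduce ROMS's own tile bounds or halo layout, and the function names and the rank ordering are assumptions made for this example.

```python
def tile_bounds(n_points, n_tiles):
    """Split n_points into n_tiles contiguous ranges (0-based, end exclusive)."""
    base, extra = divmod(n_points, n_tiles)
    bounds, start = [], 0
    for t in range(n_tiles):
        size = base + (1 if t < extra else 0)  # spread the remainder over the first tiles
        bounds.append((start, start + size))
        start += size
    return bounds


def decompose(ni, nj, ntile_i, ntile_j, n_procs):
    """Return one ((i0, i1), (j0, j1)) interior range per MPI process."""
    if ntile_i * ntile_j != n_procs:
        raise ValueError("NtileI * NtileJ must equal the number of MPI processes")
    i_ranges = tile_bounds(ni, ntile_i)
    j_ranges = tile_bounds(nj, ntile_j)
    # Assumed ordering for this sketch: tiles are numbered along I first, then J.
    return [(i_rng, j_rng) for j_rng in j_ranges for i_rng in i_ranges]
```

As a usage example, decompose(1339, 1391, 3, 3, 9) returns nine interior ranges for the grid size used in this study, taking 1339 as the I-dimension purely for illustration; a mismatched process count raises an error instead of silently leaving part of the grid unassigned.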

The ROMS model is written in Fortran. In Fortran's column-major memory order, the I index varies fastest. Therefore, it is advantageous to have more partitions in the J-direction (NtileJ) than in the I-direction (NtileI), which keeps the innermost loops long and facilitates vectorization. In addition, the mesh configuration of the coupled model was optimized: instead of sharing the same mesh as the WRF model, the ocean component now uses a different mesh. While keeping the prediction accuracy unchanged, this reduces the computational cost of the component model and further reduces the parallel computing time of the coupled model.
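The preference for NtileJ over NtileI can be turned into a simple enumeration of candidate tilings. This is a hedged sketch of that rule of thumb only; the helper name candidate_tilings is hypothetical, and the best pair for a given machine still has to be found by timing runs.

```python
import math


def candidate_tilings(n_procs):
    """List all (NtileI, NtileJ) pairs with NtileI * NtileJ == n_procs and NtileJ >= NtileI."""
    pairs = []
    for ntile_i in range(1, math.isqrt(n_procs) + 1):
        if n_procs % ntile_i == 0:
            pairs.append((ntile_i, n_procs // ntile_i))
    return pairs
```

For 196 processes the candidates are (1, 196), (2, 98), (4, 49), (7, 28) and (14, 14). Pairs near the start of the list keep the I-extent of each tile long, which favors vectorization, at the price of thin tiles with relatively more halo exchange, so the choice is a trade-off rather than a fixed rule.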

4 Parallel Test Results

The parallel algorithm of the SWAN wave model and the parallel partition of the ROMS ocean model were optimized as described in Sects. 3.2 and 3.3. We then allocated cores to each component in proportion to its computational load instead of assigning the same number to every component. With up to about 400 CPU cores, the allocation of cores among the components of the TC coupled model makes no significant difference. With more than 400 CPU cores, a better speedup can be achieved if the number of cores per component is properly configured. Before the core allocation was optimized, load imbalance between the component models kept the overall efficiency of the TC coupled model low. Finally, we obtained a speedup of up to 107 times relative to serial execution for each component, and the MPI communication time was significantly reduced (Tables 2 and 3, Fig. 7).
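Proportional allocation of cores can be sketched as follows. This is a minimal illustration using largest-remainder rounding, which is an assumption of this example rather than the procedure used for Table 3; the function name allocate_cores and the relative costs in the usage note are hypothetical.

```python
def allocate_cores(total_cores, relative_cost):
    """Split total_cores among components in proportion to relative_cost.

    Every component receives at least one core; the rest are distributed
    proportionally, with leftover cores going to the largest remainders.
    """
    if total_cores < len(relative_cost):
        raise ValueError("need at least one core per component")
    total_cost = sum(relative_cost.values())
    spare = total_cores - len(relative_cost)  # one core is reserved per component
    exact = {k: spare * c / total_cost for k, c in relative_cost.items()}
    alloc = {k: 1 + int(v) for k, v in exact.items()}
    leftover = total_cores - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - int(exact[k]), reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc
```

With illustrative relative costs of 5 : 3 : 2 for WRF, ROMS and SWAN and 400 cores in total, the sketch assigns 200, 120 and 80 cores. In practice the relative costs would be taken from profiling, and the resulting counts may still need adjusting so that each component's tiling constraints are met.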

Table 2. The coupling model uses the same number of cores
Table 3. The coupling model uses the optimized number of cores
Fig. 7. Speedup of the Tropical-Cyclone Coupled Numerical Model (data from Tables 2 and 3)

5 Model Results Discussion

Figure 2 shows the configuration of the Tropical-Cyclone Coupled Numerical Model domain. The domain of the TC model covers a large area of the western North Pacific, where the components of the ocean and atmosphere are fully interactive. Mesoscale air-sea interaction weather phenomena, such as typhoons, occur frequently in this area.

Fig. 8. Spatial distributions of sea surface temperature (°C) at 12 am, July 06, 2018, (a) from the coupled model output (tropical-cyclone coupled numerical model) and (b) from the ERA5 dataset.

Figure 8 shows the sea surface temperature (SST) simulated by the Tropical-Cyclone Coupled model at 12 am on July 6, 2018. The features of SST are well captured by the Tropical-Cyclone Coupled Numerical Model. The spatial pattern correlation coefficient (PCC) of SST between the Tropical-Cyclone Coupled model and the ERA5 dataset is 0.98, which is statistically significant at the 5% level. The Tropical-Cyclone Coupled Numerical Model overestimates the SST over the western Pacific warm pool, where the largest SST bias is approximately 2 °C. SST biases such as those in Fig. 8 are common in the western Pacific region in ocean–atmosphere coupled models [25,26,27,28].
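A pattern correlation coefficient of this kind can be computed as sketched below. The text does not state whether the PCC was centered or area-weighted, so this example assumes a centered, unweighted Pearson correlation over grid points that are valid in both fields; the function name and the use of None for masked (for example, land) points are assumptions of the sketch.

```python
import math


def pattern_correlation(model, reference):
    """Centered, unweighted Pearson correlation between two flattened fields.

    Points where either field is None (e.g. land in an SST field) are skipped.
    """
    pairs = [(m, r) for m, r in zip(model, reference) if m is not None and r is not None]
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_r = sum((r - mean_r) ** 2 for _, r in pairs)
    return cov / math.sqrt(var_m * var_r)
```

Both fields have to be on the same grid before they are flattened, so the model output would first be interpolated to the ERA5 grid or vice versa. On a latitude–longitude grid, weighting each point by the cosine of its latitude is a common refinement that this sketch omits.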

Fig. 9. Spatial distributions of 10 m wind (vectors, units: m/s) and sea-level pressure (shading, units: hPa) at 12 am, July 06, 2018, (a) from the coupled model output (tropical-cyclone coupled numerical model) and (b) from the ERA5 dataset.

The spatial distributions of the observed (ERA5) and simulated (Tropical-Cyclone Coupled Numerical Model) low-level winds at 10 m and the associated sea-level pressure at 12 am on July 6, 2018, are shown in Fig. 9. The observations are characterized by the subtropical high north of 20°N and a low-pressure center near 17°N, 140°E with a cyclonic circulation anomaly. These features are captured well by the Tropical-Cyclone Coupled Numerical Model, although the simulated pressure over the western North Pacific is lower, which may be caused by the warm SST biases. The spatial pattern correlation coefficient of sea-level pressure between the coupled model output and the ERA5 dataset is 0.98, which is statistically significant at the 5% level.

6 Conclusion

The current trends in tropical cyclone numerical model development are higher resolution, higher accuracy and multicomponent coupling. We implemented an operational real-time tropical cyclone coupled model of the ocean, atmosphere, and waves using the ROMS–WRF–SWAN models. Through optimization of the SWAN parallel algorithm, the ROMS grid partition and the allocation of cores among the components of the coupled model, we obtained speedups of up to 108 times relative to serial execution for each component. The speedup was significantly improved, and a well-balanced performance of the coupled system was obtained.