Hello,
Problem: I noticed that the calculation of hysplit trajectories (and probably also dispersion runs, which I did not test) slows down significantly once the trajectory approaches the dateline or even crosses it. The slowdown is very notable when using a high-resolution global meteorological input, such as ERA5 (which I interpolated to a global 0.25x0.25 grid). In this case trajectory computation slows down by a factor of 10-20 when compared to trajectories that do not cross the dateline.
Diagnostics: After going through the code and doing some debugging, I do think I have a grasp of why this is happening. I give a short description so you can correct me if I’m wrong with anything:
For efficiency, hsyplit usually works on a moving subgrid instead of using the entire meteorological input file. When the dateline is approached, both cells with an ‘x index’ close to the maximum number of grid cells and cells with an ‘x index’ close to 1 are needed, and the model handles this by expanding the subgrid to cover the WHOLE extent of the input file (it ‘goes global’, it cannot take a small slice close to xmax and another small slice close to 1). To make things worse, the size of the subgrid is not allowed to shrink during runtime, only to grow, so once the subgrid ‘went global’ there is no way back. This means that even if after some time the trajectory is again ‘far’ from the dateline and using the global extent would no longer be needed, the model continues to perform all operations on the entire input grid.
It took me a while to figure out why exactly this slows down things so importantly. I first thought it was a memory issue (each daily global ERA5 input data set is over 6 Gb after all), but when I split the meteorological input into 8-hour increments (dividing their size by 4) there was no speedup at all. On the other hand, I also noted that there was no notable speedup when using a larger model timestep. I was a bit puzzled and did more rigorous benchmarking and found that most of the computation time was spend in the PRFCOM subroutine. In its header, the subroutine is described as follows:
“routine examines the profile at each sub-grid node and converts the input data to common units and interpolates the sounding to the internal model sigma system.”
Looking at the routine, it caught my attention that it performs a loop over all cells in the subgrid and performs a vertical (??) interpolation routine on the cell. As ERA5 in a 0.25x0.25 resolution has many cells (> 1E6, 16 times as many as the commonly used GDAS1), this appears to take up a lot of computation time. As far as I can tell, the PRFCOM routine is performed every time a new time step in the meteorological input file is loaded. This would also explain why I found no significant relationship between computation time and chosen model timestep: Independently of the timestep, PRFCOM would get called once every hour, which is the time resolution in the ERA5 fields.
Possible solution: While it would probably best to treat the dateline issue in a way that does not involve loading the entire extent of the data, I found it quite complicated to implement such a solution. However, I did have good success with modifying the PRFCOM subroutine itself in the following way: I passed the array of the current position of all particles (in both x and y) to the subroutine as an optional parameter and then drew a sort of ‘safety margin’ around them, assuming that no particle moves more than 500 km/h and taking into account the spherical shape of the earth. For all cells in the subgrid that are out of the ‘safety margin’ of all particles, I skipped all further expensive computation and interpolation (by just cycling).
With this fairly easy fix I achieved a speedup of about 8-10 times for trajectories that are emitted close to the dateline (or close to the poles). I confirmed for several starting locations that the resulting trajectories are identical to those I obtain without applying the fix. The computation time is identical for trajectories that never come close to the dateline. The only thing that bothers me a bit is that the current position of particles (SPOT%XP, SPOT%YP) must be passed to PRFCOM (through ADVPNT), which means I had to not only modify prfcom.f, but also do very slight modifications to advpnt.f, DEFARG1.INC, DEFARG2.INC, hymodelt.F, and hymodelc.F.
I can provide the code of my fix, if this is of interest. I didn’t include the adapted code yet because I wasn’t sure if the forum rules permit posting part of the HYSPLIT source code.
Very slow computation of ERA5-based HYSPLIT trajectories close to the dateline and a possible way of solving it
-
- Posts: 1
- Joined: September 20th, 2024, 7:45 am
- Registered HYSPLIT User: Yes
Re: Very slow computation of ERA5-based HYSPLIT trajectories close to the dateline and a possible way of solving it
Hello,
thanks for the in depth analysis of the problem.
We appreciate it if you don't post the source code here.
Please email arl.webmaster@noaa.gov as we would like to consider incorporating changes
into future versions of HYSPLIT.
thanks for the in depth analysis of the problem.
We appreciate it if you don't post the source code here.
Please email arl.webmaster@noaa.gov as we would like to consider incorporating changes
into future versions of HYSPLIT.