How to set up control file for multiprocessing

dlu
Posts: 6
Joined: September 3rd, 2021, 5:43 pm
Registered HYSPLIT User: Yes

Post by dlu »

Hi,

I have multiple power plant locations and stack heights that I'd like to run a dispersion simulation for. In general, I'm looking for guidance on the most efficient way to generate results for all n of my power plant locations over the same time period (a 1-hour dispersion run every 6 hours, every day, for about 10 years).
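The schedule described above can be enumerated up front. This is a minimal sketch, assuming one run per start time; the function names are hypothetical, and the `YY MM DD HH` layout is copied from the first line of the CONTROL file posted in this thread.

```python
from datetime import datetime, timedelta


def release_times(start, end, step_hours=6):
    """Yield start times from `start` (inclusive) to `end` (exclusive)."""
    t = start
    while t < end:
        yield t
        t += timedelta(hours=step_hours)


def control_start_line(t):
    """Format a time as 'YY MM DD HH', the layout used in this thread's CONTROL file."""
    return t.strftime("%y %m %d %H")


# One day of start times at a 6-hour spacing.
one_day = list(release_times(datetime(2006, 1, 1), datetime(2006, 1, 2)))
print([control_start_line(t) for t in one_day])

# Number of start times in ten calendar years (2006 through 2015).
ten_years = list(release_times(datetime(2006, 1, 1), datetime(2016, 1, 1)))
print(len(ten_years))
```

Ten years at this spacing comes to 14608 start times, so whatever is done per start time has to be scripted rather than edited by hand.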

I am assuming the best way to do this is to edit the CONTROL file to take multiple locations (questions below), but please let me know if that's not the case.

In the CONTROL file below, I've included 2 power plant locations. Is there a maximum number of starting locations I can put in the file?

06 01 01 6
2
31.0069 -88.0103 182.88
70.0069 -73.0021 130.77
6
0
20000
1
data/hysplit_met/reanalysis/
RP200601.gbl
1
3
100
0.01
06 01 01 06 00
1
0.0 0.0
0.01 0.01
180 360
./
output.bin
1
5000
06 01 01 6 00
06 01 01 23 00 
00 01 00
1
15 0 0
0.002 0 0 0 0
0 0 0
0
0
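For many locations, the top of a CONTROL file like the one above (start time, number of starting locations, then one `lat lon height` line per location) could be generated rather than typed. The following is only a sketch of that first block: `start_block` is a hypothetical helper, and the rest of the file would still have to be appended unchanged.

```python
def start_block(start_time, sources):
    """Build the first lines of a CONTROL file: start time, count, one line per source."""
    lines = [start_time, str(len(sources))]
    for lat, lon, height in sources:
        # Same column order as in the thread: latitude, longitude, height.
        lines.append(f"{lat:.4f} {lon:.4f} {height:.2f}")
    return "\n".join(lines)


# The two locations from the CONTROL file above.
sources = [
    (31.0069, -88.0103, 182.88),
    (70.0069, -73.0021, 130.77),
]
block = start_block("06 01 01 6", sources)
print(block)
```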

With this CONTROL file and a SETUP.CFG file (not attached) in the same directory, I run the command below in the terminal to generate results with `hycm_std`. Following the `run_mpi.sh` template included with the code, I was able to run `hycm_std` on multiple processors. However, I end up with 5 different sets of results for the same input. I expected the inputs to be distributed across the processors, rather than the same set of inputs being repeated on each one. Could you help me understand what's happening?

time mpirun -np 5 /scratch/midway2/daisylu/data/hysplit.v5.1.0_UbuntuOS20.04.2LTS/exec/hycm_std; /hysplit.v5.1.0_UbuntuOS20.04.2LTS/exec/parhplot -iPARDUMP.001 -a1
 Configured for multiple processors:            5

A subset of the results looks like this (showing 1 of the 6 files generated for each of the 5 processors):

$ cat GIS_part_001_ps.txt
00001, -88.0291,  31.1404,     141.
00002, -72.7968,  69.9548,     109.
END
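A file like the one above can be read into Python for further processing. This is a minimal sketch; the column meaning (particle index, longitude, latitude, height) is an assumption inferred by comparing the values with the starting locations in the CONTROL file, not taken from the parhplot documentation, and `parse_gis_particles` is a hypothetical helper.

```python
def parse_gis_particles(text):
    """Parse 'index, lon, lat, height' records up to the END marker."""
    particles = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "END":
            break
        index, lon, lat, height = (field.strip() for field in line.split(","))
        particles.append((int(index), float(lon), float(lat), float(height)))
    return particles


# The file contents shown above.
sample = """00001, -88.0291,  31.1404,     141.
00002, -72.7968,  69.9548,     109.
END"""

particles = parse_gis_particles(sample)
for particle in particles:
    print(particle)
```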
Is there a way to identify which particle corresponds to which starting location?

Thank you.
alicec
Posts: 411
Joined: February 8th, 2016, 12:56 pm
Registered HYSPLIT User: Yes

Post by alicec »

There is no limit on the number of starting locations.

The PARDUMP files are not considered the main HYSPLIT outputs. When running with hycm_std, one PARDUMP file is produced per processor, because each processor is 'running' different computational particles. However, the particles are not assigned to processors according to their source, which is what it sounds like you are expecting.
Only one concentration output file is produced, however.

If you are looking at PARDUMP output, then you can keep track of particles through the sort_index, as long as you ensure that no particles are removed from the simulation.

If you want to keep track, in the concentration output file, of which emissions came from which source location, the simplest way is to generate a separate HYSPLIT run for each source location. The main drawback is that you have to keep track of more files.
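One way to organize separate runs is one directory per source, each with its own single-location CONTROL file. The following is only a sketch under that assumption: the helper, the directory layout, and the names `plant_a` and `plant_b` are hypothetical, and `control_tail` stands for the remaining lines of your own CONTROL file (truncated here for brevity).

```python
import tempfile
from pathlib import Path


def write_per_source_controls(root, start_time, sources, control_tail):
    """Write <root>/<name>/CONTROL with exactly one starting location per file."""
    written = []
    for name, (lat, lon, height) in sources.items():
        run_dir = Path(root) / name
        run_dir.mkdir(parents=True, exist_ok=True)
        control = run_dir / "CONTROL"
        control.write_text(f"{start_time}\n1\n{lat} {lon} {height}\n{control_tail}")
        written.append(control)
    return written


# Hypothetical source names; coordinates are the two locations from this thread.
sources = {
    "plant_a": (31.0069, -88.0103, 182.88),
    "plant_b": (70.0069, -73.0021, 130.77),
}
# Placeholder: only the first three of the remaining CONTROL lines.
control_tail = "6\n0\n20000\n"

with tempfile.TemporaryDirectory() as root:
    paths = write_per_source_controls(root, "06 01 01 6", sources, control_tail)
    texts = {path.parent.name: path.read_text() for path in paths}

print(texts["plant_a"])
```

Any relative paths inside the CONTROL file (for example a relative meteorology directory) would need checking if the model is then run from inside each subdirectory.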

An alternative would be to use a different particle species for each source location and use an emit-times file to specify the emissions. You can then separate out the emissions from each source location in the concentration output file based on the species ID. The main advantage is that fewer files are generated. The main drawback is that generating the emit-times file for multiple source locations and species is somewhat complicated.
dlu
Posts: 6
Joined: September 3rd, 2021, 5:43 pm
Registered HYSPLIT User: Yes

Post by dlu »

Thank you for the response. You mentioned that PARDUMP files are not considered the main HYSPLIT outputs -- in that case, what is?

I've been using the parhplot executable to get results from the last snippet I sent (it generates the files GIS_part_001_ps.txt, GIS_part_002_ps.txt, GIS_part_003_ps.txt, GIS_part_004_ps.txt, GIS_part_005_ps.txt, and GIS_part_006_ps.txt). I have 6 files because I am releasing n particles over an hour for 6 different timestamps through the day -- at hour 0, 6, 12, and 18, so each of these files has n particles in it.

Is this the right way to get the lat, long, height, and timestamp of the particle simulation?

As for what hycm_std is doing, what do you mean by each processor is "running" different computational particles? For example, if I am releasing 2500 particles each at 10 different locations for a given time, does hycm_std distribute the 2500 * 10 = 25000 particles across the 5 processors I'm running?

I was under the impression that each of the 5 processors was running the same 25000 particles and generating its own set of results for those 25000 particles. I assumed this because each processor had its own PARDUMP file, and when I used parhplot to parse the PARDUMP files, I got 5 sets of similar-looking results (PARDUMP.001 yielded one set of GIS_part_001_ps.txt through GIS_part_006_ps.txt files, PARDUMP.002 yielded another set, and so on). Am I interpreting these results correctly, i.e. is each set of GIS files from a PARDUMP file a simulation of all 25000 particles?

I would assume that if 25000 particles were being distributed across the 5 processors, then each of the PARDUMP.00* files would hold a distinct portion of the results (perhaps 25000/5 = 5000 particles each). However, each of the PARDUMP files yields 25000 particles for each of the GIS part files.

Perhaps I'm looking at the wrong output altogether -- what is the one concentration output file you are referring to from hycm_std?
alicec
Posts: 411
Joined: February 8th, 2016, 12:56 pm
Registered HYSPLIT User: Yes

Post by alicec »

The files that contain gridded concentrations (on a user-defined grid) are considered the main output of HYSPLIT.
Sometimes we refer to these as cdump files.
https://www.ready.noaa.gov/documents/Tu ... _eqns.html

The user-defined grid, averaging time, filename, and so forth are all defined in the CONTROL file.
More than one cdump file can be generated per simulation (e.g. if you want a coarse and fine grid or a fine grid at several locations).
https://www.ready.noaa.gov/hysplitusersguide/S313.htm
christopher.loughner
Posts: 81
Joined: August 15th, 2017, 3:59 pm
Registered HYSPLIT User: No

Post by christopher.loughner »

Each processor is simulating different particles. If you ask HYSPLIT to run with 10 processors and simulate 2500 particles, then each processor will simulate 250 particles. After running HYSPLIT with multiple processors and before running parhplot, run parmerge to merge the multiple PARDUMP files into one file.
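The arithmetic in this reply, and in the 25000-particle case asked about earlier, can be written out as a short sketch. How HYSPLIT handles a particle count that does not divide evenly is not stated in this thread, so the remainder handling below is purely an assumption for illustration.

```python
def particles_per_processor(total_particles, n_processors):
    """Split a particle count as evenly as possible across processors."""
    base, extra = divmod(total_particles, n_processors)
    # Assumption: any leftover particles go one each to the first `extra` ranks.
    return [base + (1 if rank < extra else 0) for rank in range(n_processors)]


# The example from the reply above: 2500 particles on 10 processors.
print(particles_per_processor(2500, 10))

# The earlier question: 2500 particles at each of 10 locations, on 5 processors.
print(particles_per_processor(2500 * 10, 5))
```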
dlu
Posts: 6
Joined: September 3rd, 2021, 5:43 pm
Registered HYSPLIT User: Yes

Post by dlu »

Thank you both for the replies.

I am trying to use parmerge to combine my PARDUMP files, listed below. This is the command I'm using in the directory with my files:
`/hysplit.v5.1.0_UbuntuOS20.04.2LTS/exec/parmerge -iPARDUMP.00* -oPARDUMP`

-rw-rw-r-- 1 dlu dlu   26136 Oct 12 09:24 PARDUMP.001
-rw-rw-r-- 1 dlu dlu   26136 Oct 12 09:24 PARDUMP.002
-rw-rw-r-- 1 dlu dlu   26136 Oct 12 09:24 PARDUMP.003
-rw-rw-r-- 1 dlu dlu   26136 Oct 12 09:24 PARDUMP.004
-rw-rw-r-- 1 dlu dlu   26136 Oct 12 09:24 PARDUMP.005
What is the proper way to merge the files? I am getting output (snippet below - it's the same thing and just keeps repeating) that doesn't make much sense to me:

           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
           0  particles at            0           5       65535  1064927906       32697
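One thing worth checking in the parmerge command above is the wildcard. A shell matches the whole word `-iPARDUMP.00*` against file names, and no file in the listing starts with `-i`, so with default bash settings the pattern is passed to parmerge unexpanded, asterisk included. The sketch below emulates that matching with Python's `fnmatch`. Whether this explains the repeated output, and whether `-i` expects only the base name (for example `-iPARDUMP`), are assumptions that should be checked against parmerge's own usage message.

```python
from fnmatch import fnmatchcase

# The files from the directory listing above.
files = [f"PARDUMP.{n:03d}" for n in range(1, 6)]


def shell_matches(pattern, names):
    """Return the names a shell-style wildcard pattern would match."""
    return [name for name in names if fnmatchcase(name, pattern)]


# The word as typed on the command line: nothing matches, so it stays literal.
print(shell_matches("-iPARDUMP.00*", files))

# Without the option prefix, the same wildcard matches all five files.
print(shell_matches("PARDUMP.00*", files))
```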
For reference, here is my control file:

06 01 01 6
1
31.0069	-88.0103 182.88
6
0
20000
1
/data/hysplit_met/reanalysis/
RP200601.gbl
1
3
1
1
06 01 01 06 00
1
0.0 0.0
0.01 0.01
180 360
./
output.bin
1
5000
06 01 01 6 00
06 01 01 23 00 
00 01 00
1
15 0 0
0.002 0 0 0 0
0 0 0
0
0

Here is my SETUP.CFG file. (I set a negative number of particles just to test different inputs; I remember reading about it in some documentation. I think I ultimately got 120 computational particles released and followed over time.)

&SETUP
tratio = 0.75,
initd = 0,
kpuff = 0,
khmax = 9999,
kmixd = 0,
kmix0 = 250,
kzmix = 0,
kdef = 0,
kbls = 2,
kblt = 2,
conage = 48,
numpar = -100,
qcycle = 0.0,
efile = '',
tkerd = 0.18,
tkern = 0.18,
ninit = 1,
ndump = 1,
ncycl = 1,
pinpf = 'PARINIT',
poutf = 'PARDUMP',
mgmin = 10,
kmsl = 0,
maxpar = 10000,
cpack = 1,
cmass = 0,
dxf = 1.0,
dyf = 1.0,
dzf = 0.01,
ichem = 0,
maxdim = 1,
kspl = 1,
krnd = 6,
frhs = 1.0,
frvs = 0.01,
frts = 0.10,
frhmax = 3.0,
splitf = 1.0,
tm_pres = 0,
tm_tpot = 0,
tm_tamb = 0,
tm_rain = 0,
tm_mixd = 0,
tm_relh = 0,
tm_sphu = 0,
tm_mixr = 0,
tm_dswf = 0,
tm_terr = 0,
/