
download error when parallelizing dispersion runs using batch script

Posted: March 29th, 2023, 9:45 am
by rmadhok
I need to run HYSPLIT for 575 point sources monthly from 1970-2017 (i.e. 575*47*12 = 324,300 times). I am using a supercomputer to parallelize the process. I wrote one R script that parallelizes the monthly runs for a single point source. Then I submitted 575 different jobs (one for each point source).

The output of each run is saved in a different working directory, so that there are no conflicts. This process worked fine for about 400 point sources, and then I started getting various timeout, download, and connection errors.

Is it possible that NOAA has blocked the connection from the supercomputer node due to too many attempts? Is there a way to resolve that?

Note that I am using the splitR package as a wrapper.
Thank you.

Re: download error when parallelizing dispersion runs using batch script

Posted: March 30th, 2023, 7:21 am
by alicec
Is it the meteorological data that you are trying to download from the FTP server that is causing issues?
Can you download all the data first and then complete the runs?
Please provide examples of the error messages.
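
A rough sketch of why downloading first helps, assuming the archive holds one meteorological file per calendar month keyed by a YYYYMM stamp, and assuming the year range is inclusive. The helper name monthly_keys is hypothetical and is not part of HYSPLIT or splitR. Every point source reuses the same monthly files, so the number of distinct downloads is in the hundreds, not the hundreds of thousands.

```python
from itertools import product


def monthly_keys(first_year, last_year):
    """Return 'YYYYMM' stamps for every month from first_year through last_year (inclusive)."""
    years = range(first_year, last_year + 1)
    months = range(1, 13)
    return [f"{year}{month:02d}" for year, month in product(years, months)]


# Assumption: inclusive 1970-2017 range, 575 point sources (from the original post).
keys = monthly_keys(1970, 2017)
n_sources = 575

print("distinct monthly files to download:", len(keys))
print("runs that would share those files: ", len(keys) * n_sources)
```

Downloading each distinct file once into a shared directory, before any jobs are submitted, means the parallel runs never need to contact the FTP server at all.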

Re: download error when parallelizing dispersion runs using batch script

Posted: April 1st, 2023, 9:13 am
by rmadhok
Yes, it is the meteorological data download that is giving me issues.
I am not able to download the data as I am getting errors like:

Error in download.file(url, ...) :
cannot open URL 'ftp://arlftp.arlhq.noaa.gov/archives/re ... 198504.gbl'
In addition: Warning message:
In download.file(url, ...) :
URL 'ftp://arlftp.arlhq.noaa.gov/archives/re ... 198504.gbl': Timeout of 500 seconds was reached

Strangely, it did work for the first several thousand iterations, but now I am getting these errors. Sometimes it is a timeout error, sometimes it is a connection error.

If I run HYSPLIT on my Mac, it works. The problem only arises when I submit jobs in parallel through a Slurm batch script.
Can I provide the IP address and perhaps someone can check if it was blocked for too many attempts?
Thanks

Re: download error when parallelizing dispersion runs using batch script

Posted: April 3rd, 2023, 7:12 am
by sonny.zinn
Send the IP address to ARL webmaster. We will check it with our IT department.

Re: download error when parallelizing dispersion runs using batch script

Posted: April 4th, 2023, 2:38 pm
by sonny.zinn
Your IP address was blocked and our IT department is looking into unlocking it. When you regain the FTP access, please limit your concurrent FTP connections to 2.
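
One possible way to respect the limit of 2 concurrent connections is sketched below. It is only an illustration: fetch_all and MAX_CONNECTIONS are hypothetical names, the URL list has to be supplied by the user, and the cap only holds if this runs as a single process (for example once on a login node before the batch jobs are submitted). If each of the 575 jobs ran it separately, the total number of connections would still exceed 2.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlretrieve

MAX_CONNECTIONS = 2  # limit requested by ARL in the reply above


def fetch_all(urls, dest_dir, fetch=urlretrieve, max_connections=MAX_CONNECTIONS):
    """Download each URL into dest_dir with at most max_connections in flight.

    Files already present in dest_dir are skipped, so re-running is cheap.
    Returns the local paths in the same order as urls.
    """
    os.makedirs(dest_dir, exist_ok=True)

    def fetch_one(url):
        path = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
        if not os.path.exists(path):
            tmp = path + ".part"
            fetch(url, tmp)        # download under a temporary name first
            os.replace(tmp, path)  # rename only after the download finished
        return path

    # The pool size is what caps the number of simultaneous connections.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch_one, urls))
```

The fetch argument defaults to the standard library's urlretrieve, which accepts ftp:// URLs; it is a parameter so that a different downloader can be swapped in. Once the files are cached locally, the parallel runs can read them from disk instead of opening their own FTP connections.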