I need to run HYSPLIT for 575 point sources monthly from 1970-2017 (i.e. 575*47*12 = 324,300 runs). I am using a supercomputer to parallelize the process. I wrote one R script that parallelizes the monthly runs for a single point source, then submitted 575 separate jobs (one for each point source).
The output of each run is saved in a different working directory, so that there are no conflicts. This process worked fine for about 400 point sources, and then I started getting various timeout, download, and connection errors.
Is it possible that NOAA has blocked the connection from the supercomputer node due to too many attempts? Is there a way to resolve that?
Note that I am using the splitr package as a wrapper.
Thank you.
download error when parallelizing dispersion runs using batch script
Re: download error when parallelizing dispersion runs using batch script
Is it the meteorological data that you are trying to download from the FTP server that is causing issues?
Can you download all the data first and then complete the runs?
Please provide examples of the error messages.
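Downloading everything up front could be sketched as a single serial pre-fetch step, run once before any jobs are submitted. This is only a sketch under assumptions: the base URL and the RPYYYYMM.gbl naming are placeholders, since the full path is truncated in the error message below; substitute the real archive path and file pattern. The 1970-2016 range here gives the 564 months implied by the question's arithmetic; adjust the end year as needed.

```shell
#!/bin/bash
# Pre-fetch every monthly met file once, serially, before any HYSPLIT runs.
# BASE_URL is a placeholder -- the real path was truncated in the error log.
BASE_URL="ftp://arlftp.arlhq.noaa.gov/archives/..."  # assumption, replace

met_files() {  # print one assumed RPYYYYMM.gbl name per month
  for year in $(seq 1970 2016); do
    for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
      echo "RP${year}${month}.gbl"
    done
  done
}

mkdir -p met
if [ "${DO_DOWNLOAD:-0}" = "1" ]; then  # set DO_DOWNLOAD=1 to actually fetch
  met_files | while read -r f; do
    # -c resumes partial files; --tries/--waitretry retry with a pause
    wget -c --tries=5 --waitretry=60 -P met "${BASE_URL}/${f}"
  done
fi
```

Each of the 575 jobs would then read the met files from the shared `met/` directory instead of opening its own FTP connection.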
Re: download error when parallelizing dispersion runs using batch script
Yes it is the meteorological data download that is giving me issues.
I am not able to download the data as I am getting errors like:
Error in download.file(url, ...) :
cannot open URL 'ftp://arlftp.arlhq.noaa.gov/archives/re ... 198504.gbl'
In addition: Warning message:
In download.file(url, ...) :
URL 'ftp://arlftp.arlhq.noaa.gov/archives/re ... 198504.gbl': Timeout of 500 seconds was reached
Strangely, it did work for the first several thousand iterations, but now I am getting these errors. Sometimes it is a timeout error, sometimes it is a connection error.
If I run HYSPLIT on my Mac, it works. The problem only arises when I submit jobs in parallel through a SLURM batch script.
Can I provide the IP address and perhaps someone can check if it was blocked for too many attempts?
Thanks
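One way to reproduce the single-machine behavior on the cluster is to serialize the downloads across the parallel jobs, so that 575 SLURM tasks never hit the FTP server at once. Below is a minimal sketch using `flock`; the directory and file names are hypothetical placeholders, not the actual paths from the runs.

```shell
#!/bin/bash
# Serialize shared met-data downloads across concurrent SLURM jobs.
MET_DIR="${MET_DIR:-./met}"   # shared directory, placeholder path
mkdir -p "$MET_DIR"

fetch_met() {                 # fetch one file unless it is already present
  local f="$1" url="$2"
  [ -s "$MET_DIR/$f" ] && return 0
  (
    flock 9                   # only one job holds the lock and downloads
    [ -s "$MET_DIR/$f" ] || wget -c --tries=5 -O "$MET_DIR/$f" "$url"
  ) 9> "$MET_DIR/.download.lock"
}
```

With this pattern, the first job to need a given month downloads it while the others wait on the lock, then find the file already present and skip the FTP call entirely.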
Re: download error when parallelizing dispersion runs using batch script
Send the IP address to the ARL webmaster. We will check it with our IT department.
Re: download error when parallelizing dispersion runs using batch script
Your IP address was blocked, and our IT department is looking into unblocking it. When you regain FTP access, please limit your concurrent FTP connections to 2.
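A download step that respects the 2-connection cap could be sketched with `xargs -P 2`, which runs at most two fetch processes at a time. The `urls.txt` file here is hypothetical; it would list one met-file URL per line, generated however suits the workflow.

```shell
#!/bin/bash
# throttle: run the given command once per input line,
# with at most two processes in flight at a time (xargs -P 2).
throttle() {
  xargs -P 2 -n 1 "$@"
}

# Hypothetical usage, once urls.txt exists (one URL per line):
# throttle wget -c --tries=5 --waitretry=60 -P met < urls.txt
```

Keeping all fetches behind one throttled step like this, rather than letting each of the 575 jobs open its own connection, should stay within NOAA's requested limit.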