How Best to Download Climate Data?

I’ll be gathering this data for the Dakotas (bounding box: north 49, south 42, east -96, west -105), and then summarizing it to the level of the sample blocks created earlier in the project. I tried OPeNDAP first, but I’m not quite sure how to use it. NetcdfSubset looked like it would do what I want, but my request is too big for its online retrieval system. Instead, they suggested I use nccopy.

This is the giant URI of the climate data I need, broken up for readability in this blog post. In other words, I added the line breaks, and they shouldn’t be in the real URI you pass to nccopy.

http://cida.usgs.gov/thredds/dodsC/dcp/conus_t?lon[163:1:230],time[0:1:51099],lat[137:1:192],
ccsm-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-a1fi-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-a1fi-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
ccsm-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t47-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cgcm3_t63-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
cnrm-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
csiro-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echam5-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
echo-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-0-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-0-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-0-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-0-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a1fi-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a1fi-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
gfdl_2-1-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
giss_aom-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
giss_aom-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
giss_aom-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
giss_aom-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a1fi-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a1fi-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadcm3-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadgem-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadgem-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadgem-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
hadgem-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_hi-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_hi-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_hi-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_hi-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
miroc_med-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
mri_cgcm2-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a1b-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a1b-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a1fi-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a1fi-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a2-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-a2-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-b1-tmax-NAm-grid[0:1:51099][137:1:192][163:1:230],
pcm-b1-tmin-NAm-grid[0:1:51099][137:1:192][163:1:230]

To get the file(s) you pass…

nccopy -u <URI> whatever.nc
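Since the URI above was split across lines only for display, it has to be rejoined before nccopy sees it. A minimal sketch, assuming the broken-up URI has been saved to a file (the helper name join_uri and the file name uri_parts.txt are made up for illustration):

```shell
# join_uri FILE
# Print the contents of FILE with all line breaks (and any stray
# carriage returns) removed, so a broken-up URI becomes one string.
join_uri() {
  tr -d '\r\n' < "$1"
}

# Usage (uri_parts.txt is a hypothetical file name):
#   nccopy -u "$(join_uri uri_parts.txt)" whatever.nc
```

Quoting the result matters, because the square brackets in the URI would otherwise be open to interpretation by the shell.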

When I pass the whole URI at once, nccopy fails. Perhaps the request is so big that it comes across a missing file somewhere in the list? This is the error I get (running nccopy v. 4.4.0 on Ubuntu Xenial Xerus):

NetCDF: file not found
Location: file /build/netcdf-StLR0y/netcdf-4.4.0/ncdump/nccopy.c; line 1355
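For a sense of scale, here is a rough back-of-the-envelope calculation of how much data the full URI asks for. The index ranges and the count of 96 variables come straight from the URI above; the 4 bytes per value is my assumption (single-precision floats), not something I have confirmed for this dataset.

```shell
# Index ranges are inclusive, so each dimension has (last - first + 1) cells.
time_steps=$(( 51099 - 0 + 1 ))
lat_cells=$(( 192 - 137 + 1 ))
lon_cells=$(( 230 - 163 + 1 ))

n_vars=96            # variable lines in the URI above
bytes_per_value=4    # ASSUMPTION: values are stored as 4-byte floats

values_per_var=$(( time_steps * lat_cells * lon_cells ))
bytes_per_var=$(( values_per_var * bytes_per_value ))
total_bytes=$(( bytes_per_var * n_vars ))

echo "values per variable: $values_per_var"
echo "approx. MB per variable: $(( bytes_per_var / 1000000 ))"
echo "approx. GB in total: $(( total_bytes / 1000000000 ))"
```

Under that assumption, each variable is a little under 0.8 GB uncompressed and the whole request is on the order of 75 GB, which makes it less surprising that a single request struggles.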

I think the best approach is to script the request so that each dataset is fetched with its own URI. The friendly data portal manager has been a big help here: going by his example, a standalone URI for each model would be something of the form (an example of the first dataset):
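The example itself is missing from this post. Judging from the script further down, a standalone URI for the first dataset would presumably keep the three coordinate variables and request just one data variable. A sketch of that reconstruction (build_uri is a hypothetical helper name, not part of any tool):

```shell
# build_uri VARIABLE
# Print a request URI for a single variable, using the same coordinate
# subsets and index ranges as the big URI above.
build_uri() {
  base="http://cida.usgs.gov/thredds/dodsC/dcp/conus_t"
  dims="lon[163:1:230],time[0:1:51099],lat[137:1:192]"
  idx="[0:1:51099][137:1:192][163:1:230]"
  printf '%s?%s,%s%s\n' "$base" "$dims" "$1" "$idx"
}

# First dataset in the list:
build_uri "ccsm-a1b-tmax-NAm-grid"
```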

So here’s what I did: I copied the broken-up URI and pasted it into a text file called “climate_structure”. Then I deleted the first line (the one starting with http) and extracted the names of the climate models, meaning everything before the first “[” on each line, into a new file, “climate_models”, like so:
cut -d'[' -f1 climate_structure > climate_models
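The same step can be done without hand-editing the file first. This is only a sketch: extract_models is a made-up helper name, and it assumes the pasted file has the base URL on a line starting with http and one variable per line after that.

```shell
# extract_models FILE
# Print one variable name per line: drop the line holding the base URL,
# keep the text before the first "[" on each remaining line, and
# discard any blank lines left over from pasting.
extract_models() {
  grep -v '^http' "$1" | cut -d'[' -f1 | grep -v '^[[:space:]]*$'
}

# Usage:
#   extract_models climate_structure > climate_models
#   wc -l < climate_models
```

For the list in this post, the line count should come out to 96, which is a quick way to check that nothing was lost or duplicated.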

Then I set up a simple bash script to loop over the model and variable combinations now in “climate_models”, download each one into a separate *.nc file, and name each file after its model and variable combination.

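#!/bin/bash
# Loop over the list of climate model/variable names (one per line)
# and download each into its own NetCDF file.

while IFS= read -r climate_model; do
  # Skip blank lines so we never request an empty variable name
  [ -z "$climate_model" ] && continue
  echo "Now downloading $climate_model"
  nccopy -u "http://cida.usgs.gov/thredds/dodsC/dcp/conus_t?lon[163:1:230],time[0:1:51099],lat[137:1:192],${climate_model}[0:1:51099][137:1:192][163:1:230]" "${climate_model}.nc"
done < climate_models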
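With 96 separate downloads, it is likely that some will fail or that the run will get interrupted. Here is a sketch of a variant that can be re-run safely: it skips files that already exist and keeps a list of the ones that failed. The function name fetch_models and the log file failed_models.txt are my own inventions, and treating a non-empty .nc file as "already downloaded" is an assumption that a partial file never survives a failed run.

```shell
#!/bin/bash
# fetch_models LIST_FILE
# Download one NetCDF file per name in LIST_FILE. Names whose output
# file already exists (and is non-empty) are skipped; names whose
# download fails are appended to failed_models.txt.
fetch_models() {
  local list_file=$1 model
  local base="http://cida.usgs.gov/thredds/dodsC/dcp/conus_t"
  local dims="lon[163:1:230],time[0:1:51099],lat[137:1:192]"
  local idx="[0:1:51099][137:1:192][163:1:230]"

  while IFS= read -r model; do
    [ -z "$model" ] && continue            # ignore blank lines
    if [ -s "${model}.nc" ]; then
      echo "Skipping $model (already downloaded)"
      continue
    fi
    echo "Now downloading $model"
    # </dev/null keeps the download from consuming the list on stdin
    if ! nccopy -u "${base}?${dims},${model}${idx}" "${model}.nc" </dev/null; then
      echo "$model" >> failed_models.txt
      rm -f "${model}.nc"                  # do not keep a partial file
    fi
  done < "$list_file"
}

# Usage:
#   fetch_models climate_models
#   fetch_models failed_models.txt   # retry only the failures later
```

If you retry from failed_models.txt, move or rename it first, since new failures get appended to the same file.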

Next I’ll need to get the precipitation.

A friend suggested I try GrADS to handle the data request instead, and knows many people who have used it successfully and often. It can be scripted, so I’m looking into that now! In the meantime, please comment if you’ve had any experience with big climate data and gathering lots of NetCDF files!