Google Earth Engine Image Exporting for Download

We were breaking apart our region of interest to facilitate downloads, but I decided instead to let Google tile the image for me. Some things I noticed: even when you provide a region for export, its bounding extent is used rather than an actual “clip” of the image. However, when you clip the image earlier in the script, it appears that the extra area in that extent is just filled with 0’s (so perhaps you can download a smaller file of the same extent by providing the clip).
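
Here’s a minimal sketch of what I mean; the asset IDs and variable names (wetlands, myRegion) are placeholders, not our actual data.

var wetlands = ee.Image('users/someuser/wetlands_2000');           // placeholder asset ID
var myRegion = ee.FeatureCollection('users/someuser/PPR_regions')  // placeholder asset ID
    .filter(ee.Filter.eq('name', 'Alberta_PPR'))
    .geometry();

// Clipping masks/zeroes everything outside the region...
var clipped = wetlands.clip(myRegion);

// ...but the export still covers the region's bounding extent
Export.image.toDrive({
  image: clipped,
  description: 'Alberta_PPR_2000',
  region: myRegion.bounds(),
  scale: 30,
  maxPixels: 1e13
});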

Some notes on odd differences between the *.tifs, surely resulting from something I’ve changed:

  • the mask that keeps only the 1’s as part of the image: some things I was trying before failed, seemingly because of the mask, so unmasking and exporting an image of 0’s and 1’s seemed to help

What results from both is an image of 0’s and 1’s, but oddly, “out of the box” the first TIFF displays as expected (i.e. in ArcGIS, even though the auto color scheme is stretched, you can see the wetlands). However, the one with the bullet-point changes does not: it looks all black (the value assigned at the 0 end) until you display it with discrete colors; then the images look analogous. Looking closer, the original script exports with unsigned integer, 8-bit pixel depth, whereas my new script exports with floating point, 32-bit pixel depth. This makes the newer images way bigger (and, I think, unnecessarily so).
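
If the pixel type is the culprit, one hedged fix is to cast the 0/1 image back to an unsigned 8-bit integer before exporting (wetlands is again just a placeholder name):

// Fill masked pixels with 0, then cast to unsigned 8-bit so the exported
// GeoTIFF matches the original script's Byte pixel depth instead of Float32
var wetlandsByte = wetlands.unmask(0).toUint8();
// ...then pass wetlandsByte to Export.image.toDrive() as before.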

What I’m Doing in Google Earth Engine

Right now, there’s a data set we want that’s distributed exclusively on Earth Engine. So, to break it into manageable/meaningful chunks for our analysis, we’ve created regions. We basically need to generate the raster (which requires a “max” over the time period of interest) for each year, then clip the raster to each region and download it. All of this is done exclusively in the JavaScript IDE on the site.
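
A rough sketch of that workflow, with placeholder collection and asset IDs standing in for the actual dataset and regions:

var collection = ee.ImageCollection('users/someuser/wetland_dataset'); // placeholder
var regions = ee.FeatureCollection('users/someuser/PPR_regions');      // placeholder
var year = 2000;

// "max" composite over the time period of interest for that year
var annualMax = collection
  .filterDate(year + '-01-01', year + '-12-31')
  .max();

// Clip to one region and queue the download as a task
var region = regions.filter(ee.Filter.eq('name', 'Alberta_PPR')).geometry();
Export.image.toDrive({
  image: annualMax.clip(region),
  description: 'Alberta_PPR_' + year,
  region: region.bounds(),
  scale: 30,
  maxPixels: 1e13
});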

Already, some data distribution hurdles for this platform are apparent (to its credit, it sort of isn’t built for that purpose; it has many other strengths, e.g. actually being able to do processing). Possibly one of the funkiest things about Earth Engine is the distinction (and difference in how it’s coded) between “client-side” and “server-side” functions. Whenever you make something “ee.”, it’s server-side: the server is processing whatever it is into the thing you want. So, you have to deal with the resulting objects differently depending on which “side” you’re on.
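
A toy example of the distinction (my own illustration, not from any particular script):

// Server-side number: the math happens on Google's servers
var serverNumber = ee.Number(10).add(5);
print(serverNumber);  // prints 15, but serverNumber is still an ee.Number object

// getInfo() pulls the value to the client (it blocks the browser, so use sparingly)
var clientNumber = serverNumber.getInfo();
print(clientNumber + 1);  // now it's a plain JavaScript number: 16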

Some Processing I’ve Done in GDAL

I have GDAL installed in Linux (i.e. the easiest way to install/implement it), so the following examples represent command-line usage. I have used a smattering of different GDAL utilities, and the links in the descriptions go to the manual page for the utility used in each example. I have incorporated these example commands into various bash scripts when I need to iterate over multiple files.

This is an example of resampling a raster to larger pixel sizes (in this case, to a lower resolution of 0.09 degrees from the original 0.0002 degrees [30 m Landsat pixels]) by taking the mean of the input pixels encompassed by each coarser output pixel.

gdalwarp -tr 0.09 0.09 -r average -ot Float32 file.tif fileout.tif
This is the output of the above command, where a 30 m resolution image was scaled up to a 10 km resolution image. The new pixel values are the averages of the binary raster pixels (0, 1) that contributed to them. All maps are plotted in R 3.4.1.

I have used GDAL to set NoData values (in this case, I reclassified 0 as the NoData value).

gdal_translate -of GTiff -a_nodata 0 PPR/2000/Playas_2000.tif region_img/Playas.tif

Here’s an example of stitching two TIFFs together into a larger area, setting the output NoData value to be where the 0’s were in the original input files.

gdal_merge.py -o region_img/Canada.tif -a_nodata 0 PPR/2000/Alberta_PPR_2000.tif PPR/2000/SAK_MAN_PPR_2000.tif
Notice blue is the only color, representing the only value left in the raster (1).

If you want to stitch shape files together into a new file, you have to initialize the new shape file first with one of your input files and then add to it.

ogr2ogr regions/Canada.shp shapefiles/Alberta_PPR_2001.shp
ogr2ogr -update -append regions/Canada.shp shapefiles/SAK_MAN_PPR_2001.shp -nln Canada

If you’re going to, say, turn your raster into polygons, you can get rid of clumps below a certain threshold before doing so (in this case, I’m getting rid of single pixels in my unary raster when using an 8-neighbor clumping rule).

gdal_sieve.py -st 2 -8 file.tif

Then, I can make my (now simplified) polygon layer. In the second line, I project the shapefile to Albers Equal Area (EPSG:5070).

gdal_polygonize.py -8 file.tif -f "ESRI Shapefile" shapefile/file.shp 
ogr2ogr -f "ESRI Shapefile" -progress outfile.shp shapefile/file.shp -t_srs "EPSG:5070"
Here’s a shape file of wetlands created from the TIFF of the Canadian Prairie Pothole Region!

Waterfowl & Wetlands Literature Review

Studying waterfowl with large extant datasets is intimidating because I often have the sneaking suspicion “someone has done this before.” I’m in the process of figuring out which of my suspicions are correct.

  • Are there more ducks where there are more wetlands in the surrounding landscape? If so, what scale is relevant to predict waterfowl abundance (e.g. are there more waterfowl when the 10km surrounding them have more wetlands)?
    • No…
      • duck abundance on a given pond is lower when there are more wetlands in the surrounding landscape (Bartzen et al. 2017)
        • The Waterfowl Breeding Population and Habitat Survey (WBPHS) 1993-2002
        • Canadian PPR
        • generalized least-squares regression models
        • compound symmetry covariance structure to account for repeated annual counts on ponds
        • AIC model selection

Literature Cited

Bartzen, B., Dufour, K. W., Bidwell, M. T., Watmough, M. D. and Clark, R. G. (2017), Relationships between abundances of breeding ducks and attributes of Canadian prairie wetlands. Wildl. Soc. Bull., 41: 416–423. doi:10.1002/wsb.794

Learning to use Google Earth Engine

I’m poking my way around Google Earth Engine and seeing what’s here, as well as learning some basics of the web IDE. They have image and feature collections, accessible through constructors…

  • ee.ImageCollection()
  • ee.FeatureCollection()

You can pass the name of a collection to ee.FeatureCollection() to query it. Sometimes these are actually Fusion Tables. It’s kind of weird to get to the Fusion Tables, but let’s say you have a script or something else with a reference to the table’s ID. You can open it at…

https://fusiontables.google.com/data?docid=<whatever that long ID is>

Variables are modified through methods called with .function() notation, and lines end with semicolons.
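
For example (the Fusion Table ID and property names here are purely hypothetical):

// Load a collection by its ID (a Fusion Table in this hypothetical example)
var states = ee.FeatureCollection('ft:1234567890abcdef');  // placeholder ID

// Chain methods to filter the collection and tag each feature with a property
var prairieStates = states
  .filter(ee.Filter.inList('Name', ['North Dakota', 'South Dakota', 'Montana']))
  .map(function(f) { return f.set('region', 'PPR'); });

print(prairieStates.size());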

GIS for Fun: My Digitized BBS Route

The North American Breeding Bird Survey (BBS) was meant to be a stratified random sample of the continent’s terrestrial bird populations, so we could monitor the pulse of populations and trends. When I went to the American Ornithologists’ Union/Cooper Ornithological Society (AOU/COS) conference (back when it was still called that) in 2015, I presented in a symposium dedicated to the BBS. The opening talk was about the history of the BBS, which helped give me some historical context for the data set. It added a dimension to my understanding of the time periods of ecological research: the data set was developed before landscape ecology really took hold in the U.S., and with it an explicit awareness of spatial analysis.

The spatial data we have for the BBS is what I believe to be digitized route paths from old highlighted paper route maps. As such, it’s rife with errors: I’ve found routes in the layer with widely varying lengths, some egregiously too long. Because it’s a volunteer effort, it’s at times hard to get clarification on what’s going on here. I’ve done and continue to do analyses based on these route paths, so their accuracy matters to me. Side note: if the United States Geological Survey – Patuxent would have me, I’d be happy to travel the country in something like the “Google Street View car” and help digitize these routes during BBS season! Just putting that out there.

I decided to pull up my route in GIS to see what calculated stops would look like, by first placing stops every 0.5 mi. (probably the more correct method) and then placing 50 equally spaced stops along the route. Since the route is a little long, the 50 stops end up probably being more spread out than they should be. Also, as route runners know, it’s not an exact science, in that you might need to move a stop up to 0.1 mi. for safety concerns.

Anyway, when I lined this up with road data, I found that the route was significantly off from the latest road paths, probably because the latest version is more than 4 years old now. The road data I’m using is quite intensive, and I’m trying to figure out what to do next. How will I line up the paths, and will this little experiment even matter in the long run?

Misadventures in Open Source GIS

When most ecologists conceptualize “GIS,” we think of a desktop GIS program, most likely ArcGIS. When we don’t want to use ArcGIS (or can’t), we most likely ask, “is there an R package for that?” In many cases there is, and the repository keeps growing! The ever-growing number of R spatial packages reflects the general trend in GIS right now: it’s booming in lateral growth. Everywhere you turn there’s something new, and my motivation for writing this post is really just to try to keep up.

Going back “old school,” perhaps the most famous and long-running (i.e. older than I am) open source program is GRASS GIS. Here’s where the open source GIS movement falls short, though: the point-and-click functionality and friendliness of interfacing with the program stops before you even get to the gate. If I go to the Ubuntu software center, it points me to an outdated version of the program that won’t even launch, so I had to remove it before installing the newer version (where the version number is in the command name). Luckily, though, it points me to a broader repository whereby I can access the “Ubuntu GIS” suite.

# Add Ubuntu Unstable PPA when running LTS Ubuntu release
sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable

The real problem I’ve had with GRASS from the beginning, though, is the beginning: you can’t just open the program and load some spatial data like you can in ArcGIS. You have to start by selecting a home folder, defining a “location” (which must have a uniform spatial reference), and then making a mapset within it. It’s where I admittedly gave up on GRASS long ago, in favor of the ease of just launching ArcGIS and adding a shape file.

QGIS solves more of this problem, but there are again some hangups on the install: it’s not in the Ubuntu software center, and installing it “the old-fashioned way” seems to get an older version of the program (and the terminal errors will tell you so if you launch it from the terminal).

I’m thinking about finally trying to learn PostgreSQL in conjunction with PostGIS. I’m trying to retrace my thoughts on that to last year, and I think it was because I found out that SQL was a solid way to build my own GIS tools. I remember settling on PostGIS as probably my best bet for learning open source GIS, and coincidentally I just saw a post-doc advertisement that specifically requested PostGIS knowledge.

Want to keep up? I compiled a certainly-not-exhaustive-but-pretty-comprehensive list of people I could find tweeting about open source GIS.

Let me know if you should be on that list!

Extracting NetCDF Values to a Shape File

Here’s a script that loops over climate NetCDF bricks in a folder and, for each file, extracts the values of every layer in the brick, in this case averaged over the polygons in a shape file.

rm(list = ls())
library(raster)
library(rgdal)
library(ncdf4)
library(reshape2)
library(stringr)
library(data.table)

setwd("where your climate files are")
your_shapefile <- readOGR("path to your shapefile", "the layer name for your shapefile")

climate <- rbindlist(lapply(list.files(pattern = "nc$"), function(climate_file)
{
  # in my case, the weather variable is in the filename
  climate_var <- ifelse(grepl("pr", climate_file), "pr",
                        ifelse(grepl("tmax", climate_file), "tmax", "tmin"))
  print(climate_var)
  climate_variable <- brick(climate_file, varname = climate_var, stopIfNotEqualSpaced = FALSE)
  # I needed to shift the x-axis to be on a -180 to 180 scale
  extent(climate_variable) <- c(xmin(climate_variable) - 360, xmax(climate_variable) - 360,
                                ymin(climate_variable), ymax(climate_variable))
  # project the polygons to match the climate raster
  blocks <- spTransform(your_shapefile, CRS(projection(climate_variable)))
  # mean of the raster cells within each polygon, for every layer in the brick
  r.vals <- extract(climate_variable, blocks, fun = mean, na.rm = TRUE, df = TRUE,
                    layer = 1, nl = nlayers(climate_variable))
  r.vals <- melt(r.vals, id.vars = c("ID"), variable.name = "date")
  r.vals$climate <- climate_var
  # depending on the filename convention of your files, derive a variable for your
  # climate model and name it "modelname" (placeholder here)
  r.vals$model <- modelname
  r.vals
}))

At this point, you have a data frame called “climate” that has all your data long-form. You can cast this as you see appropriate depending on your needs!