Category Archives: Practicum

Options for Open Source, Public Access for ATBI Data

When I worked for the NBII we used Rackspace and Amazon’s cloud service, along with MySQL for the “Species of Greatest Conservation Need” database that held a similar number of species, but no spatial data.
Since your data has a spatial component, something like PostgreSQL or SQLite might be worthwhile to look at. I took a class in environmental information management last summer and was introduced to SQLite – which apparently has some advantages over databases created with Access, including native support for GIS applications like GRASS and QGIS.
SQLite works with a web framework called Django; Scott Simmerman suggested I look at it for improvements to the ATBI mapping project’s Web interface.  There are some free hosting services for small or dev projects: https://wiki.python.org/moin/FreeHosts. That might be worthwhile to look at. Another “free” hosting site is here: https://www.pythonanywhere.com
Finally Tom Colson mentioned something he’s working on with GSMIT – the Otter Spotter. Tom said Google Earth Engine might be worthwhile looking at for that.  Google offers grants and hosting of data for non-profits to use Earth Engine, so I’m curious how much of the ATBI database data might be translatable to KML.
When I worked for the NBII I was interested in serving up spatial data in KML files – I like them because you can open KML in robust GIS packages along with popular virtual globe tools like Google Earth or ArcGIS Explorer – things the everyday person has access to.
Even if not all the data can translate, it’s another “view” of the ATBI data that’s pretty useful for outreach, if not doing actual science.
I will explore this issue as I have time and hopefully help as I am able.  I am a student of RDBMS and the ATBI is one of the more interesting datasets to learn with.

Eastern Hemlock (Tsuga canadensis) model on PC versus Supercomputer

I recently ran a model of Hemlock on my personal computer.

I modeled ~489 records; the Nautilus model used over 2,000.

From the model output:

This is a representation of the Maxent model for Tsuga_canadensis. Warmer colors show areas with better predicted conditions. White dots show the presence locations used for training, while violet dots show test locations.

 


This is not really a fair comparison, but the difference between the models with 489 records and 2000+ is interesting for comparing the predictions.

The EECS model for Eastern hemlock is different.  It uses more data, and it was run 20 times, each time with 10% of the records reserved, before the runs were synthesized into one image.
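To make the "20 runs with 10% reserved" idea concrete, here is a minimal sketch of repeated random hold-out splitting. It is only my assumption that the runs used random subsampling; the function name `holdout_splits`, the seed, and the integer stand-ins for occurrence records are all hypothetical, and the MaxEnt fitting and averaging steps are not shown.

```python
import random


def holdout_splits(records, runs=20, test_fraction=0.10, seed=0):
    """Yield (training, test) lists, one pair per run.

    Each run shuffles a copy of the records and reserves the first
    test_fraction of them as test locations.
    """
    rng = random.Random(seed)
    n_test = max(1, round(len(records) * test_fraction))
    for _ in range(runs):
        shuffled = list(records)
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


# Integer IDs standing in for roughly 2,000 occurrence records (made up).
example_records = list(range(2000))
for run_number, (training, test) in enumerate(holdout_splits(example_records), 1):
    print(run_number, len(training), len(test))
```

Each run would then train a model on the training list and score it on the test list before the per-run outputs are combined.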

I will post some additional comparisons for other trees in some later posts.

Possible Spatial Data Inputs for MaxEnt Species Distribution Models in Great Smoky Mountains N.P.

Environmental layers are available to the public via IRMA.

Source metadata are not available at http://tiny.utk.edu/atbi.

I have attempted to map or cross-walk the layers listed in the Simmerman et al. paper to the names of datasets available for download from IRMA.

Table. Mapping from UTK names to IRMA names.

No. | UTK name | IRMA name | IRMA reference
1 | Soil Organic Type | Soil Classification | https://irma.nps.gov/App/Reference/Profile/2198007
2 | Topographic Convergence Index | Topographic Wetness Index | https://irma.nps.gov/App/Reference/Profile/2208650
3 | Solar Radiation Data | 30-m Potential Solar Radiation | https://irma.nps.gov/App/Reference/Profile/2208716
4 | Terrain Shape Index | 30-m Topographic Shape Index | https://irma.nps.gov/App/Reference/Profile/2208684
5 | Terrain Shape Index | 30-m Topographic Ruggedness Index Model | https://irma.nps.gov/App/Reference/Profile/2182017
6 | Digital Elevation Model | 30-m Lidar Digital Elevation Model | https://irma.nps.gov/App/Reference/Profile/2180606
7 | Slope in Degrees | 30-m Lidar Slope Model | https://irma.nps.gov/App/Reference/Profile/2180632
8 | Understory Density Classes | Understory Vegetation at GRSM | https://irma.nps.gov/App/Reference/Profile/1047499
9 | Leaf On Canopy Cover | Overstory Vegetation at GRSM | https://irma.nps.gov/App/Reference/Profile/1047498
10 | Vegetation Classes | Vegetation Classification Great Smoky Mountains NP Vegetation Classification | https://irma.nps.gov/App/Reference/Profile/1021458

Note: I am grateful to http://www.textfixer.com/html/csv-convert-table.php which made it possible to easily create this table from plain text.  I expect to add this to my “toolkit” of useful items and it saved me a lot of time.

ATBI Mapping Program: Species Distribution Models for Great Smoky Mountains National Park

Spatial Data Diversity Supporting Herpetofaunal Research in Great Smoky Mountains National Park

2014 North Carolina PARC Poster.

Reduced file size image of poster presented at North Carolina Partners in Amphibian and Reptile Conservation (NCPARC) meeting, March 2014.

Poster presented at 2014 North Carolina Partners in Amphibian and Reptile Conservation Meeting.

Jessel, Tanner; Super, Paul E.; Colson, Thomas (2014): Spatial Data Diversity Supporting Herpetological Research in Great Smoky Mountains National Park. figshare.

http://dx.doi.org/10.6084/m9.figshare.978500

SDM re-projected for Google Earth, OSM with gdal2tiles

I think I have stumbled upon the solution for tiling the png image.
To re-project the PNG image, we can geo-reference the images with GDAL, then warp the geo-referenced image to the correct projection, also with GDAL.
The process is described here:
First, enter “gdalinfo Abies_fraseri.png” into a terminal to get the pixel bounds of the PNG image.
This yields the following output:
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0, 1302.0)
Upper Right ( 2899.0,    0.0)
Lower Right ( 2899.0, 1302.0)
Center      ( 1449.5,  651.0)
A template and implementation for our PNG files is demonstrated here:
Template:
gdal_translate -of VRT -a_srs EPSG:4326 -gcp 0 0 ULlong ULlat -gcp UPPERRIGHTPx 0 URlong URlat -gcp LOWERRIGHTPx LOWERRIGHTPy LRlong LRlat Abies_fraseri.png Abies_fraseri.vrt
Implementation: 
gdal_translate -of VRT -a_srs EPSG:4326 -gcp 0 0 -84.000683874 35.7889383688 -gcp 2899.0 0 -83.0424855 35.7889383688 -gcp 2899.0 1302.0 -83.0424855 35.426963641 Abies_fraseri.png Abies_fraseri.vrt

For “ULlong” (Upper Left Long) and so forth I used a bounding box tool (http://boundingbox.klokantech.com) to determine the following bounds of the PNG – however, if there is a more “official” known boundary, it would be wise to use that instead.
    35.426963641,-84.000683874 (bottom left / lower left)
    35.426963641,-83.0424855 (bottom right / lower right)
    35.7889383688,-83.0424855 (top right / upper right)
    35.7889383688,-84.000683874 (top left / upper left)
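As a rough sketch, the template can be filled in programmatically from the image size reported by gdalinfo and the four bounding-box values. The helper name `build_translate_cmd` and its parameter names are my own (hypothetical); the command it assembles is the same gdal_translate call shown above, and it is only built here, not run.

```python
def build_translate_cmd(png, vrt, width, height, west, south, east, north):
    """Return the gdal_translate argument list with three corner GCPs."""
    gcps = [
        (0, 0, west, north),           # upper left pixel  -> ULlong, ULlat
        (width, 0, east, north),       # upper right pixel -> URlong, URlat
        (width, height, east, south),  # lower right pixel -> LRlong, LRlat
    ]
    cmd = ["gdal_translate", "-of", "VRT", "-a_srs", "EPSG:4326"]
    for pixel_x, pixel_y, lon, lat in gcps:
        cmd += ["-gcp", str(pixel_x), str(pixel_y), str(lon), str(lat)]
    cmd += [png, vrt]
    return cmd


cmd = build_translate_cmd(
    "Abies_fraseri.png", "Abies_fraseri.vrt",
    width=2899, height=1302,
    west=-84.000683874, south=35.426963641,
    east=-83.0424855, north=35.7889383688,
)
print(" ".join(cmd))
```

With GDAL installed, the list could be handed to `subprocess.run(cmd, check=True)` instead of being printed.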
Next, take the .vrt file and warp it:
    gdalwarp -of VRT -t_srs EPSG:4326 Abies_fraseri.vrt Abies_fraseri_2.vrt
Tiling the warped .vrt file with gdal2tiles then creates a folder with tiles for a KML network link.
See “doc.kml” in the attached zip folder, which you may open in Google Earth.
There is also an OpenLayers HTML page in the attached zip folder.
To add trails in Google Earth
My opinion is it would be more interesting and informative to have the species occurrences added as a separate layer.  Therefore, I would prefer to create placemarkers / waypoints for species locations rather than black squares on the images.  Is it possible to generate PNG images with the likely distribution but not the black squares?  If so, would that take a long time to re-do to make these tiled overlays?
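A separate occurrence layer could be a small KML file with one Placemark per record. The sketch below uses only the Python standard library; the function name `occurrences_to_kml` and the two sample coordinates are made up for illustration and are not real ATBI records.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def occurrences_to_kml(species, points):
    """Return a KML string with one Placemark per (lon, lat) pair."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    ET.SubElement(document, f"{{{KML_NS}}}name").text = species
    for index, (lon, lat) in enumerate(points, start=1):
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = f"{species} record {index}"
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML writes coordinates as longitude,latitude
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical coordinates inside the bounding box used for the overlay.
sample_points = [(-83.45, 35.6), (-83.5, 35.56)]
print(occurrences_to_kml("Abies fraseri", sample_points))
```

Saved with a .kml extension, the output should open in Google Earth alongside the tiled overlay.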
Also, although I “hacked” the PNG bounds, I’m still worried that the overlay’s georeferencing may be “off” a bit, and I would love to use a more trustworthy set of coordinates than those derived from my best judgement and http://boundingbox.klokantech.com.
From here, I think it would be smart to document a workflow for doing a lot of PNG to VRT to KML at a time.  I haven’t tried processing a whole directory of images at once yet – just playing with the Abies_fraseri.png for now.
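One possible shape for that batch workflow is sketched below. It assumes every PNG shares the same pixel size and bounding box as Abies_fraseri.png (I have not verified this), and it assumes the tiling step is gdal2tiles.py with the geodetic profile and forced KML output. The function names and output layout are hypothetical. The script only lists the commands; it does not run them.

```python
from pathlib import Path

# Assumption: all model PNGs share the Abies_fraseri.png size and bounds.
GCP_ARGS = [
    "-gcp", "0", "0", "-84.000683874", "35.7889383688",
    "-gcp", "2899", "0", "-83.0424855", "35.7889383688",
    "-gcp", "2899", "1302", "-83.0424855", "35.426963641",
]


def plan_commands(png_path, out_dir):
    """Return the three GDAL commands (as argument lists) for one PNG."""
    stem = png_path.stem
    gcp_vrt = out_dir / f"{stem}.vrt"
    warped_vrt = out_dir / f"{stem}_2.vrt"
    tile_dir = out_dir / stem
    return [
        ["gdal_translate", "-of", "VRT", "-a_srs", "EPSG:4326",
         *GCP_ARGS, str(png_path), str(gcp_vrt)],
        ["gdalwarp", "-of", "VRT", "-t_srs", "EPSG:4326",
         str(gcp_vrt), str(warped_vrt)],
        ["gdal2tiles.py", "-p", "geodetic", "-k",
         str(warped_vrt), str(tile_dir)],
    ]


def plan_batch(png_dir, out_dir):
    """Map each PNG file name in png_dir to its list of commands."""
    out_dir = Path(out_dir)
    return {
        png.name: plan_commands(png, out_dir)
        for png in sorted(Path(png_dir).glob("*.png"))
    }


for name, commands in plan_batch(".", "tiles").items():
    for command in commands:
        print(" ".join(command))
```

Printing the commands first makes it easy to inspect them for one species before passing each list to `subprocess.run` for a whole directory.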
The doc.kml and virtual raster dataset work with the Google Maps API and an OpenLayers map.  I think they both run on JavaScript.

Environmental Layers for HPC Maximum Entropy Species Distribution Models in Great Smoky Mountains National Park

Currently there are 10 environmental layers that were used by Simmerman et al. in the demonstration project, “Exploring similarities among many species distributions.”

  1. Bedrock geology
  2. Digital elevation model
  3. Slope measured in degrees
  4. Solar radiation data
  5. Soil organic type
  6. Terrain shape index
  7. Topographic convergence index
  8. Leaf on canopy cover
  9. Understory density classes
  10. Vegetation classes


The contribution of each environmental variable to a “Maximum Entropy” Species Distribution Model (MaxEnt SDM) is accessible from each model under the “Environmental Layers” tab.

The help icon accompanying each model provides this text:

This is a species distribution model (SDM) produced by MaxEnt. This SDM is actually a composite from ten cross-validation runs for each species (see cross-validation results tab for more information). Original ATBI record locations are shown with black dots. The color scale goes from 0 probability of presence (dark brown) to 100% probability (dark green).

Color distribution depicting a probability between 0 (no chance of finding a species) and 1 (100% chance of finding a species)

Below is a screen capture of the model generated for the Fraser fir (Abies fraseri). This model is based on 474 occurrence records.  The full size image is available online at <http://seelab.eecs.utk.edu/alltaxa/maps/Abies_fraseri.png>


SDM model generated by the ATBI mapping project / University of Tennessee SEElab.

Interestingly, the MaxEnt model output suggests that the “Digital Elevation Model” contributed 88.8% to the Species Distribution Model for Abies fraseri.  This makes sense, since the Fraser fir is a species of conifer favoring cold environments that inhabits only the highest elevations of the Park.  The remaining layers each contribute less than 5% to the model.

I’m copying out the taxonomic classification from Wikipedia:

Scientific classification
Kingdom: Plantae
Division: Pinophyta
Class: Pinopsida
Order: Pinales
Family: Pinaceae
Genus: Abies
Species: A. fraseri

The purpose is to access the records from the ATBI database, where I don’t see the Fraser fir listed in the ATBI “plants” kingdom. From <http://tremont22.campus.utk.edu/ATBI_Query.cfm> I searched by “order” for “pinales.”

Fraser fir is accessible: <http://tremont22.campus.utk.edu/ATBI_Species.cfm?genus=Abies&epithet=fraseri&subspecies=%7E>.  Interestingly, the number of specimens in the database is 866.  Contrast this with the 474 records that were used to generate the model.  There are two possible explanations. The model may simply be older: there does not appear to be a timestamp for the model, either from “Get Info” or from “Properties” in GIMP, for both the large 2899×1302 pixel PNG file and the small 600×269 pixel PNG. Alternatively, not all 866 occurrence records may have spatial coordinates associated (e.g., they are “references from the literature”).

Poster for North Carolina Partners in Amphibian and Reptile Conservation

Poster might highlight the MaxEnt model as an example of the outcome of research in the park and emphasize the need for more occurrence data to populate the ATBI database. 
As of Feb. 24, 2014, no reptile species have yet been modeled by the All Taxa mapping project.
Some simple facts might help illustrate the need for more reptile data:
  • total herp species in park
  • total herp species in the ATBI database
  • total herp species in the ATBI database that have n = 30 or more
  • total herp species in the ATBI database that need more data to be modeled
I think n of 30 or more was settled on by convention in hopes of having a normal distribution (Central limit theorem).
I just looked at this issue for plants – ATBI has ~1,600 vascular plants, but only around 500 have data points that can be modeled.  And of those, 257 have the requisite 30 or more occurrence records.
Total herp species in the Park according to ATBI database:
Class Amphibia – 63 species
Class Reptilia – 41 species
I’m not sure where to get a “species checklist” for total herp species in the park.  Since the ATBI database is “All Taxa,” it seems that yes, all herp species would be included.  However I don’t know that this is true and it would be nice to verify against another source.
It is interesting that the species with the highest number of records is the northern copperhead (Agkistrodon contortrix mokasen), with 233 specimens, followed by the northern black racer (Coluber constrictor constrictor) at 181 specimens.
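The "n = 30 or more" tallies for the poster could be computed from a table of per-species record counts, roughly as sketched here. The function name `split_by_threshold` is hypothetical, and only the copperhead and black racer counts come from the ATBI database; the other two entries are made-up placeholders. Specimen counts may also overstate what can be modeled if some records lack coordinates.

```python
MIN_RECORDS = 30  # conventional minimum number of occurrence records


def split_by_threshold(record_counts, minimum=MIN_RECORDS):
    """Split species into those with enough records and those needing more."""
    modelable = {s: n for s, n in record_counts.items() if n >= minimum}
    needs_data = {s: n for s, n in record_counts.items() if n < minimum}
    return modelable, needs_data


counts = {
    "Agkistrodon contortrix mokasen": 233,   # from the ATBI database
    "Coluber constrictor constrictor": 181,  # from the ATBI database
    "Hypothetical species A": 12,            # made-up placeholder
    "Hypothetical species B": 29,            # made-up placeholder
}

modelable, needs_data = split_by_threshold(counts)
print("n >= 30:", len(modelable))
print("need more data:", len(needs_data))
```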
I took some .tiff images, saved in “Documents/NPS-NICS-Practicum/herpetofauna/ATBI-reptiles-022414”, and converted them to PNG using GIMP (GNU Image Manipulation Program).
File – Export
(Command + Shift + E)
Dialog:
Select Filetype by Extension
-Choose PNG
Created files:
black-racer.png
northern-copperhead.png
Black racer species records in Great Smoky Mountains National Park.

Northern copperhead species records in Great Smoky Mountains National Park.

Total herp species in the ATBI database that have n = 30 or more

Total herp species in the ATBI database that need more data to be modeled