With the ongoing expand of the available geospatial databases in raster format (GeoTiff) it is getting easier to analyse the relationship between any complex morphology, captured in landmark data, and i.e. climate or precipitation data with the help of partial least squares (PLS). R has a wonderful raser, sp and gdal packages that facilitate the extraction of the geospatial data from GeoTiff raster grids. In this post the geoTiff used will be from the WorldClim website, and the climate data for European square 16, where some of my real samples originate from. In order to get this data you can follow the downloads section of the WorldClim website and choose download data by tile, 30 arc-seconds resolution. When a map opens the dataset for this post would be under the map, when clicked on the square zone 16 and finally download the Mean Temperature. This is a .zip file with twelve GeoTiffs, one for each month. Unpack the files in a pre-destined directory, and declare this a working directory in the R session.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
After reading in the basic data, raster package offers several formats in which to keep the data, besides the basic raster. The most useful one is a raster stack which really is just the piled-up rasters that can be manipulated together. Also one useful visual help is the addition of the country boundaries followed by zooming in to the extent of the country or region where the samples originate from. Country boundaries .shp files is freely avalilable from thematicmapping. When downloaded they should be in the same directory as the GeoTiffs.
1 2 3 4 5 6 7 8 9 10 |
|
Before final extraction of the climate data it can be useful to plot each month`s mean temperature for the selected extent of the rasater. This can be easily achieved by using the rasterVis package levelplot and the extentR variable, which can be used to cut through all raster layers in the stack.
1 2 3 4 5 6 |
|
Slightly more efficient than the drawExtent is the spatialPolygons function from the sp package that will be used for definition of the real sampling localities, by simply drawing boundaries of the localities by hand or using the known positional data. It is best if approximate latitude/longitude coordinates are known in advance so that the defining polygon can be drawn connecting several corners-points, that form the broader sampling area border. If coordinates are not known in advance then the drawExtent should be used first for finding the latitude/longitude of the spatial polygon points, and then making the SpatialPolygon object out of them, as described next.
1 2 3 4 5 |
|
After all spatial polygons are defined, final step involves the extraction of the mean temperature values from the points encircled by the polygon in question, for all months. Since GeoTiffs are raster formats, they are carrying the information on the temperature in the points of the bitmap grid, so if the polygon is too small, small will be the number of the grid-points inside, maybe smaller than the number of individuals. To avoid this make sampling area broader, and in this case polyTara has some 220 individual grid points inside, which means 220 values for the mean temperature in the area. If we have i.e. ten individuals per population it is necessery to have also 10 mean temperature values, so that both blocks in subsequent PLS would have same dimensionality. R`s basic sample function can be used to extract random values from the i.e. 220 points within the population (Tara and R1) polygons. This simulates random sampling of individuals from the sampling locality and after that any analysis can be done, from linear models to PLS, which will be shown in future posts.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
Figure 5. shows the locations of tara and R1 spatial polygons within the zoomed extent of the zone 16, while in Figure 6. barplots are shown for 10 random points for all months, and for both localities, for comparison.
It is obvious that the Tara locality has higer mean monthly temperature, on average for every month, and also it has less temperature variaton than R1.