Week 9 Raster
Week 9: Rasterio and Geopandas
1. Introduction¶
Rasterio is a Python library designed for reading, writing and analyzing geospatial raster data. It leverages the powerful GDAL (Gspatial Data Abstraction Library) to offer a streamlined interface for working with raster datasets like satellite imagery, digital elevation models (DEMs), and other gridded data. Rasterio makes it easier to manipulate and analyze raster data, particulary when integrated with otehr Python tools like Numpy, pandas, and matplotlib, make it valuable tool for geospatial data processing and analysis.
Geopandas is an open source project to processing geospatial data in python environment easier and efficient. GeoPandas extend the functionality from pandas to allow spatial operation on geometric types.
2. Installation¶
Creating a clean virtual envionrment is not strictly necessary, but install all packages in the same global environment may cause dependency conflicts. So it's recommend to install the geopandas in a clean virutual environment
- Create a virtual environment with name as 'gpd' and specify python version as 3.9
conda create --name gpd python=3.9
- Activate virtual environment 'gpd'
activate gpd
- Install the latest version of geopandas, rasterio and ipykernel
conda install -c conda-forge geopandas rasterio fiona
# Windows:
codna install ipykernel
# Mac:
pip install ipykernel
3. Get to know Geospatial Data¶
4. Importing libraries¶
import geopandas as gpd
import rasterio
import os
import rasterio.plot
5. Reading Raster Data¶
os.getcwd()
dem_path = os.path.join(os.getcwd(), "data\GSM_tiff.tif")
dem_path
dem = rasterio.open(dem_path)
print(dem)
6. Extracting Basic Information about Raster¶
6.1 File name¶
dem.name
6.2 File Mode¶
6.3 File mode¶
The raster data can be opened in read-only (r
) or write (w
) mode.
dem.mode
6.4 Metadata¶
The metadata comes from a GeoTIFF file, which is a common format for storing geospatial raster ata, such as DEM.
- Nodata: the value used to represent 'No data' or missing data.
- Width & Height: The number of column and row in the raster
- Affine transformation show the information about the pixel
- a,b,c,d,e,f (a and e are pixel width and height, c and f are the coordinates of the upper left corner of thr raster, b and d are rotation)
dem.meta
6.3 Spatial Resolution¶
dem.res
6.4 Spatial Dimensions: Width and Height¶
dem.width
dem.height
6.5 Spatial Bounds¶
dem.bounds
6.6 Number of bands¶
dem.count
7.Plotting Raster Data¶
rasterio.plot.show() is a simple way to display a raster image, showing the first band of the raster by default
rasterio.plot.show(dem)
7.1 Customizing Plots¶
You can further enhance your plots with color maps and titles.
The cmap parameter in the function allows you to specify a color map to apply to the image, which controls how different values are represented in color
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(8, 8))
rasterio.plot.show(dem, cmap="terrain", ax=ax, title="Digital Elevation Model (DEM) Great Smoky Mountain")
plt.show()
7.2 Overlay vector on Raster¶
In geographic data visualization, overlaying vector data (such as polygons) on raster data is a common technique to provide spatial context and enhance interpretation. For instance, to highlight the boundary of the Great Smoky Mountains, you can overlay a polygon vector on a raster DEM. The vector layer defines distinct boundaries or features, while the raster layer provides continuous data (elevation, temperature, etc.). This combination allows users to see how geographic features relate to underlying spatial patterns.
vector_path = os.path.join(os.getcwd(), "data\GSM_boundary\GSM_boundary.shp")
gdf = gpd.read_file(vector_path)
fig, ax = plt.subplots(figsize=(8, 8))
rasterio.plot.show(dem, cmap="terrain", ax=ax, title="Digital Elevation Model (DEM) & Boundary")
gdf.plot(ax=ax, edgecolor="red", facecolor="none", lw=2)
8. Masking a raster using a shapefile¶
Masking a raster with a shapefile is a process that extracts the raster data within a specified geographic boundary defined by the shapefile. This is useful for focusing on a particular area of interest, such as limiting a DEM to the boundary of the Great Smoky Mountains. The shapefile (a vector format) acts as a "cookie-cutter" that trims the raster, retaining only the values inside the shapefile's polygons and setting areas outside the boundary to "no data."
import fiona
import rasterio.mask
8.1 Read shapefile¶
with fiona.open(vector_path, "r") as shapefile:
shapes = [feature["geometry"] for feature in shapefile]
shapes
8.2 Clip raster¶
out_image, out_transform = rasterio.mask.mask(dem, shapes, crop=True)
8.3 Update information about clipped raster¶
out_meta = dem.meta
out_image.shape
out_meta.update(
{
"driver": "GTiff",
"height": out_image.shape[1],
"width": out_image.shape[2],
"transform": out_transform,
}
)
8.4 Writing a masked or modified raster image to a new file called "dem_clip.tif"¶
with rasterio.open("dem_clip.tif", "w", **out_meta) as dst:
dst.write(out_image)
clip_dem_path = os.path.join(os.getcwd(), 'dem_clip.tif')
clip_dem = rasterio.open(clip_dem_path)
fig, ax = plt.subplots(figsize=(8, 8))
rasterio.plot.show(clip_dem, cmap="terrain", ax=ax, title="Digital Elevation Model (DEM) Great Smoky Mountain")
plt.show()