#### [SOLVED] Issue Trying to create Zonal Statistics using Gdal and Python.

I am trying to create zonal statistics using Python and Gdal. I have a polygon shapefile and a raster file, and in order to do so, I am using a piece of code I found in StackExchange.

The raster and shapefile used can be found here

Here is the code I am using:

``````#!/usr/bin/python
#coding: utf-8

## Code from Stack Exchange: http://gis.stackexchange.com/questions/21888/how-to-overlay-shapefile-and-raster
## User: ustroetz

# Calculates statistics (mean) on values of a raster within the zones of an polygon shapefile

import gdal, ogr, osr, numpy

def zonal_stats(input_value_raster, input_zone_polygon):

# Open data
raster = gdal.Open(input_value_raster)
driver = ogr.GetDriverByName('ESRI Shapefile')
shp = driver.Open(input_zone_polygon)
lyr = shp.GetLayer()

# get raster georeference info
transform = raster.GetGeoTransform()
xOrigin = transform[0]
yOrigin = transform[3]
pixelWidth = transform[1]
pixelHeight = transform[5]

# reproject geometry to same projection as raster
sourceSR = lyr.GetSpatialRef()
targetSR = osr.SpatialReference()
targetSR.ImportFromWkt(raster.GetProjectionRef())
coordTrans = osr.CoordinateTransformation(sourceSR,targetSR)
feat = lyr.GetNextFeature()
geom = feat.GetGeometryRef()
geom.Transform(coordTrans)

# Get extent of geometry
ring = geom.GetGeometryRef(0)
numpoints = ring.GetPointCount()
pointsX = []; pointsY = []
for p in range(numpoints):
lon, lat, z = ring.GetPoint(p)
pointsX.append(lon)
pointsY.append(lat)
xmin = min(pointsX)
xmax = max(pointsX)
ymin = min(pointsY)
ymax = max(pointsY)

# Specify offset and rows and columns to read
xoff = int((xmin - xOrigin)/pixelWidth)
yoff = int((yOrigin - ymax)/pixelWidth)
xcount = int((xmax - xmin)/pixelWidth)+1
ycount = int((ymax - ymin)/pixelWidth)+1

# create memory target raster
target_ds = gdal.GetDriverByName('MEM').Create('', xcount, ycount, gdal.GDT_Byte)
target_ds.SetGeoTransform((
xmin, pixelWidth, 0,
ymax, 0, pixelHeight,
))

# create for target raster the same projection as for the value raster
raster_srs = osr.SpatialReference()
raster_srs.ImportFromWkt(raster.GetProjectionRef())
target_ds.SetProjection(raster_srs.ExportToWkt())

# rasterize zone polygon to raster
gdal.RasterizeLayer(target_ds, [1], lyr, burn_values=[1])

# read raster as arrays
banddataraster = raster.GetRasterBand(1)
dataraster = banddataraster.ReadAsArray(xoff, yoff, xcount, ycount).astype(numpy.float)

# mask zone of raster

# calculate mean of zonal raster
return numpy.mean(zoneraster)
``````

However, when I implement it, the output (numpy.mean(zoneraster)) is just a number and not an array or a raster. What am I missing? Is there something wrong with the files imput? The code?

#### @ninnghazad 2014-03-18 11:08:47

With multiprocessing, for fastness! Has a little different output-formatting.

``````#!/usr/bin/python
import gdal, ogr, osr, numpy, sys
from multiprocessing import Pool

# Raster dataset
input_value_raster = sys.argv[1]

# Vector dataset(zones)
input_zone_polygon = sys.argv[2]

# Open data
rast = gdal.Open(input_value_raster)
shp = ogr.Open(input_zone_polygon)

# Get raster georeference info
transform = rast.GetGeoTransform()
xOrigin = transform[0]
yOrigin = transform[3]
pixelWidth = transform[1]
pixelHeight = transform[5]

def proc(fid):

# Each process needs its own pointer it seems.
shape = ogr.Open(input_zone_polygon)
lyr = shape.GetLayer()
feat = lyr.GetFeature(fid)
raster = gdal.Open(input_value_raster)

# Get extent of feat
geom = feat.GetGeometryRef()
if (geom.GetGeometryName() == 'MULTIPOLYGON'):
count = 0
pointsX = []; pointsY = []
for polygon in geom:
geomInner = geom.GetGeometryRef(count)
ring = geomInner.GetGeometryRef(0)
numpoints = ring.GetPointCount()
for p in range(numpoints):
lon, lat, z = ring.GetPoint(p)
pointsX.append(lon)
pointsY.append(lat)
count += 1
elif (geom.GetGeometryName() == 'POLYGON'):
ring = geom.GetGeometryRef(0)
numpoints = ring.GetPointCount()
pointsX = []; pointsY = []
for p in range(numpoints):
lon, lat, z = ring.GetPoint(p)
pointsX.append(lon)
pointsY.append(lat)

else:
sys.exit()

xmin = min(pointsX)
xmax = max(pointsX)
ymin = min(pointsY)
ymax = max(pointsY)

# Specify offset and rows and columns to read
xoff = int((xmin - xOrigin)/pixelWidth)
yoff = int((yOrigin - ymax)/pixelWidth)
xcount = int((xmax - xmin)/pixelWidth)+1
ycount = int((ymax - ymin)/pixelWidth)+1

# Create memory target raster
target_ds = gdal.GetDriverByName('MEM').Create('', xcount, ycount, gdal.GDT_Byte)
target_ds.SetGeoTransform((
xmin, pixelWidth, 0,
ymax, 0, pixelHeight,
))

# Create for target raster the same projection as for the value raster
raster_srs = osr.SpatialReference()
raster_srs.ImportFromWkt(raster.GetProjectionRef())
target_ds.SetProjection(raster_srs.ExportToWkt())

# Rasterize zone polygon to raster
gdal.RasterizeLayer(target_ds, [1], lyr, burn_values=[1])

# Read raster as arrays
banddataraster = raster.GetRasterBand(1)
dataraster = banddataraster.ReadAsArray(xoff, yoff, xcount, ycount).astype(numpy.float)

# Mask zone of raster

# Calculate statistics of zonal raster
value = numpy.max(zoneraster)
#print value
return str(fid)+": "+str(value)

# Start the processes
layer = shp.GetLayer()
featList = range(layer.GetFeatureCount())
pool = Pool(processes=24)
print pool.map(proc,featList,8)

``````

#### @ustroetz 2013-11-20 17:44:53

If you want to get zonal statistics for several features in one shapefile, you have to loop over the `zonal_stats` function. You can write the results of the loop for example to a dictionary. Below is the modified `zonal_stats` function together with a loop, looping over the input shapefile. As an output you get a Dictionary containing for each Feature ID the mean of the covered raster.

For your sample dataset the dictionary for the first five features looks like this

``````{0: 114.57909872798288, 1: 21.889622561136882, 2: 287.79623686237102, 3: 35.350804240486838, 4: 19.63043032511781}
``````

The two functions:

``````import gdal, ogr, osr, numpy

def zonal_stats(feat, input_zone_polygon, input_value_raster):

# Open data
raster = gdal.Open(input_value_raster)
shp = ogr.Open(input_zone_polygon)
lyr = shp.GetLayer()

# Get raster georeference info
transform = raster.GetGeoTransform()
xOrigin = transform[0]
yOrigin = transform[3]
pixelWidth = transform[1]
pixelHeight = transform[5]

# Get extent of feat
geom = feat.GetGeometryRef()
if (geom.GetGeometryName() == 'MULTIPOLYGON'):
count = 0
pointsX = []; pointsY = []
for polygon in geom:
geomInner = geom.GetGeometryRef(count)
ring = geomInner.GetGeometryRef(0)
numpoints = ring.GetPointCount()
for p in range(numpoints):
lon, lat, z = ring.GetPoint(p)
pointsX.append(lon)
pointsY.append(lat)
count += 1
elif (geom.GetGeometryName() == 'POLYGON'):
ring = geom.GetGeometryRef(0)
numpoints = ring.GetPointCount()
pointsX = []; pointsY = []
for p in range(numpoints):
lon, lat, z = ring.GetPoint(p)
pointsX.append(lon)
pointsY.append(lat)

else:
sys.exit()

xmin = min(pointsX)
xmax = max(pointsX)
ymin = min(pointsY)
ymax = max(pointsY)

# Specify offset and rows and columns to read
xoff = int((xmin - xOrigin)/pixelWidth)
yoff = int((yOrigin - ymax)/pixelWidth)
xcount = int((xmax - xmin)/pixelWidth)+1
ycount = int((ymax - ymin)/pixelWidth)+1

# Create memory target raster
target_ds = gdal.GetDriverByName('MEM').Create('', xcount, ycount, gdal.GDT_Byte)
target_ds.SetGeoTransform((
xmin, pixelWidth, 0,
ymax, 0, pixelHeight,
))

# Create for target raster the same projection as for the value raster
raster_srs = osr.SpatialReference()
raster_srs.ImportFromWkt(raster.GetProjectionRef())
target_ds.SetProjection(raster_srs.ExportToWkt())

# Rasterize zone polygon to raster
gdal.RasterizeLayer(target_ds, [1], lyr, burn_values=[1])

# Read raster as arrays
banddataraster = raster.GetRasterBand(1)
dataraster = banddataraster.ReadAsArray(xoff, yoff, xcount, ycount).astype(numpy.float)

# Mask zone of raster

# Calculate statistics of zonal raster
return numpy.mean(zoneraster)

def loop_zonal_stats(input_zone_polygon, input_value_raster):

shp = ogr.Open(input_zone_polygon)
lyr = shp.GetLayer()
featList = range(lyr.GetFeatureCount())
statDict = {}

for FID in featList:
feat = lyr.GetFeature(FID)
meanValue = zonal_stats(feat, input_zone_polygon, input_value_raster)
statDict[FID] = meanValue
return statDict

# Raster dataset
# Vector dataset(zones)
input_zone_polygon = 'borders_tribes.shp'

print loop_zonal_stats(input_zone_polygon, input_value_raster)
``````

#### @Doon_Bogan 2013-11-20 18:55:57

I have another issue with the code in the section `#get extent of geometry`. My features can be multiple polygons. Here's my output for `geom.GetGeomRef(i)`, where `i' is each value of `geom.GetGeometryCount()`: `DATABASE: 7 LINEARRING (0.980468987882861 8.28964924042946, ..., 0.980468987882861 8.28964924042946) DATABASE: 8 POLYGON ((43.148330027378911 11.721389243497216,..., 11.72443924429683,43.148330027378911 11.721389243497216))` I modified the code to deal with these special cases (code written in next comment) but typically, it won't deal with polygons like "Database 8".

#### @Doon_Bogan 2013-11-20 18:57:07

My modified section is as follows: `# Get extent of geometry geonumbers=geom.GetGeometryCount() xmin_all=[] (...) ymax_all=[] for i in range(geonumbers): ring_i = geom.GetGeometryRef(i) numpoints = ring_i.GetPointCount() print ring_i pointsX_i = []; pointsY_i = [] for p in range(numpoints): lon, lat, z = ring_i.GetPoint(p) pointsX_i.append(lon) pointsY_i.append(lat) xmin_all.append(min(pointsX_i)) xmax_all.append(max(pointsX_i)) ymin_all.append(min(pointsY_i)) ymax_all.append(max(pointsY_i)) xmin=min(xmin_all) xmax=max(xmax_all) ymin=min(ymin_all) ymax=max(ymax_all) `

#### @ustroetz 2013-11-21 05:34:16

@Doon_Bogan I modified the function, so that it accepts polygons and multipolygons. I test run it with your sample dataset and it worked fine. Thanks for catching that bug. Please let me know if it works now for you.

#### @Deep 2014-07-18 17:26:01

How to the "# Get extent of feat " step is different than using GetEnvelope() on feature geometry object ? I think it will automatically take care of "Polygon" or "MultiPolygon".

### Layer spatial reference system units EPSG::32633

• 2019-02-01 14:12:56
• RemoteSensing
• 55 View
• 1 Score
• Tags:   qgis python gdalogr

### [SOLVED] Multiple Subsetting

• 2014-10-30 18:18:21
• user37082
• 174 View
• 0 Score
• Tags:   python r