domingo, 11 de noviembre de 2012

Hexagons, trees and more

One thing I like to do (and that I will probably be doing in other entries in this blog) is to find examples of spatial analysis and to replicate them with SEXTANTE. If the original example uses QGIS, a SEXTANTE-based solution not only involves obtaining the same result using the same software, but also having a better way of replicating that analysis with a different dataset or automating it, since it can be integrated into SEXTANTE tools such as the graphical modeler or the batch processing interface.

For today's example, I am going to replicate this post from Anita Graser's blog, one of the best QGIS blogs around, with plenty of interesting examples. I recommend reading the original entry, but basically we are going to work with the Viena Tree Cadastre, create an hexagonal grid and find the number of trees in each hexagon, to get a nice rendering of how green each area of the city is. After that, we will go a little further and add some extra calculations, just to show some additional SEXTANTE elements.

The first thing is to create the grid. The MMQGISX Create Grid algorithm can be used for that, but the extent of the grid has do be defined manually with the coordinates of the center of the grid and its width and height. That might make it difficult to use it later in an automated context. For this reason, we are going to create our own algorithm to generate the grid based on the bounds of a layer. Of course, we will not do it from scratch, but reusing the MMQGISX algorithm. Here is how to do it:

Create a new script algorithm, selecting the Create new script tool in the SEXTANTE toolbox.

Add the following code to the script. Comments explain most of it.

#Definition of inputs and outputs
##cellsize=number 1000.0
##grid=output vector

#Algorithm body
from sextante.core.QGisLayers import QGisLayers

# "input" contains the location of the selected layer.
# We get the actual object, so we can get its bounds
input = QGisLayers.getObjectFromUri(input)

# And now we calculate the bounds and call the MMQGISX algorithm
centerx = (input.extent().xMinimum() + input.extent().xMaximum()) / 2
centery = (input.extent().yMinimum() + input.extent().yMaximum()) / 2
width = (input.extent().xMaximum() - input.extent().xMinimum())
height = (input.extent().yMaximum() - input.extent().yMinimum())
Sextante.runalg("mmqgisx:creategrid", cellsize, cellsize, width, height, centerx, centery, 3, grid)

This creates a new algorithm that wraps the existing one to construct the grid layer, but makes it easier to enter the bounding box, since it takes it from an already loaded layer.

Executing the algorithm, we will see the following parameters dialog.

We select the tree cadastre layer as input, and a cellsize of 0.0025, and we will get the grid covering the input layer.

Now to count the number of trees in each hexagon, we combine the trees layer and the grid layer, using both of them as inputs in the fTools Count points in polygon algorithm.

The result, with a graduated color ramp based on the attribute containing the number of points, looks like this.

 Empty cells are rendered in white, and since the borders of each hexagon are also white, it seems that there are no cells in certain parts of the covered region. However, they are there, and it would be a good idea to remove them, so as to make the layer size smaller. We can use another MMQGISX algorithm (Select), to create a new layer that contains just the hexagon cells with at least one tree.

Cell borders are represented in black this time, just to show the missing ones.

If we put all that in a model, it will allow us to create the final grid in one single step, just selecting the input layer with elements to count (trees in this case). and the desired hexagon size.

As you can see, there is no problem reusing the algorithm we have created as a part of a model. In fact, we can use a model inside another model, or our script-based algorithm as part of another script, calling it just like we call any of the other algorithms.

Running the model, we should see the parameters dialog shown below.

We can represent not just the number of trees, but the diversity of them, showing how many different species can be found in each cell (the trees layer includes an attribute with the name of the specie)

There is no algorithm to get the number of unique points in a polygons, so we have to find a workaround. The first step for that is to add a new field to the points layer with the code of the cell it belongs to. We can use the SAGA Add polygon attributes to points algorithm, but we need a unique attribute for each cell. The Add autoincremental field algorithm does exactly that.

 Now we can run the Add polygon attributes to points algorithm.

Once the points have the ID of the polygon they belong to, we can make a small script to count the number of different tree species in each class. We will create a new SEXTANTE algorithm with the following script.

#Definition of inputs and outputs
##class_field=field input
##value_field=field input
##output=output vector

#Algorithm body
from sextante.core.QGisLayers import QGisLayers
from qgis.core import *
from PyQt4.QtCore import *
from sextante.core.SextanteVectorWriter import SextanteVectorWriter

# "input" contains the location of the selected layer.
# We get the actual object, so we can get its bounds
layer = QGisLayers.getObjectFromUri(input)
provider = layer.dataProvider()
allAttrs = provider.attributeIndexes() allAttrs )
fields = provider.fields()
fields[len(fields)] = QgsField("UNIQ_COUNT", QVariant.Int)
writer = SextanteVectorWriter(output, None, fields, provider.geometryType(), )

# Fields are defined by their names, but QGIS needs the index for the attributes map
class_field_index = provider.fieldNameIndex(class_field)
value_field_index = provider.fieldNameIndex(value_field)

inFeat = QgsFeature()
outFeat = QgsFeature()
inGeom = QgsGeometry()
nFeat = provider.featureCount()
nElement = 0
classes = {}

#Iterate over input layer to count unique values in each class
while provider.nextFeature(inFeat):
  #Printing a number to the console to indicate progress
  print int((50 * nElement)/nFeat)
  nElement += 1
  atMap = inFeat.attributeMap()
  clazz = atMap[class_field_index].toString()
  value = atMap[value_field_index].toString()
  if clazz not in classes:
      classes[clazz] = []
  if value not in classes[clazz]:

# Create output vector layer with additional attribute
while provider.nextFeature(inFeat):
  print int((500 * nElement)/nFeat)
  nElement += 1  
  inGeom = inFeat.geometry()
  outFeat.setGeometry( inGeom )
  atMap = inFeat.attributeMap()
  clazz = atMap[class_field_index].toString()
  outFeat.setAttributeMap( atMap )
  outFeat.addAttribute( len(provider.fields()), QVariant(len(classes[clazz])))
  writer.addFeature( outFeat )
del writer

Again, the comments in the source code should explain how it works.

 Running that on the trees layer with the extra field will generate a new points layer with another new field containing the number of  different species (the richness) in the polygon each tree belongs to. To pass that value to the corresponding hexagon cell in the grid layer, we can use the SAGA Point statistics in polygons algorithm. Since all points in a cell will have the same richness value (because it is the richness of the cell they belong to), calculating the average value will yield exactly that richness value.

 Below you can see the result.

The script we have created can be used as well to find out the street with the largest number of different trees. Just run it on the original trees layer (which contains an attribute with the name of the street where the tree is), using the street name as class attribute, and then sort the attributes table of the resulting layer based on the richness attribute.

Tuerkenschanzpark seems to be the one with the largest value (219). Nice place for a walk, I guess ;-)

A note

The process above is a bit cumbersome, although rather useful not just as a didactic example, but also from a practical point of view, since we have created two new algorithms that can be reused. However, an algorithm to count unique points would be a better solution. As I said, there was not one to do that, but I thought it would be a good idea, and it is pretty easy to create one, so I added it to the fTools group. You can now find it in the SEXTANTE toolbox if using the development version, and it will be available in the next SEXTANTE release.

Another note

The measure of richness we have used is rather poor. For better ones, check this wikipedia article on diversity indices, and if you feel adventurous, try to implement one yourself :-)


Another interesting way of representing points when the number of them is high is by aggregating them and replacing a group of points by a bigger one with a size related to the size of the group. This technique is used, for instance, in the GeoTools PointStacker transformation. We can create such a rendering by converting the hexagon cells into circles with a radius depending on the number of trees. First we convert the hexagons into points, by running the fTools Polygon centroids algorithm. The attributes table of the centroids layer is exactly like the one of the original polygon layer, so it will contains the field with the number of points. We can divide the values in groups and create a new value for each group (use the Field pyculator algorithm for that), representing the radius of the circle that represents that class. Now we can buffer each centroid using that radius as buffer distance. Feel free to try it yourself :-)

No hay comentarios:

Publicar un comentario en la entrada