I’ve been slowly exploring the NumPy Python library. Without delving into the technical details, which I don’t understand, suffice it to say that using NumPy arrays allows certain speed advantages over nested Python lists, and presumably, ArcPy cursors.

Take the Near Analysis. For every point in a feature class, calculate the distance to and record the object ID of the nearest point. Given 1,000 points and no fancy indexing algorithms, this means calculating 1,000,000 distances (or 999,000 – we don’t care about distance to self). 10,000 points? That explodes or distance calculations to 100,000,000.

Some options:

- We could run the out-of-the-box ArcGIS Near tool, but that requires Advanced licensing, which I don’t have.
- We could make our own Near tool, using two arcpy.da cursors.
- We could load our points into a NumPy array, using FeatureClassToNumPyArray, and use NumPy/SciPy matrix operators for the analysis.

Here’s the code I came up with for options 2 and 3:

>>> import arcpy, numpy, numpy.lib.recfunctions, scipy.spatial, random, cProfile
... fc = "bc_geoname_albers"
... def numpyNear():
... npArray = arcpy.da.FeatureClassToNumPyArray(fc,["OID@","SHAPE@XY"])
... distArray = scipy.spatial.distance.pdist(npArray['SHAPE@XY'],'euclidean')
... sqDistArray = scipy.spatial.distance.squareform(distArray)
... nearFID = npArray['OID@'][numpy.argsort(sqDistArray)[:,1]].transpose()
... npAppend = numpy.lib.recfunctions.append_fields(npArray,'NearFID',nearFID)
... nearDist = numpy.sort(sqDistArray)[:,1].transpose()
... npAppend = numpy.lib.recfunctions.append_fields(npAppend,'NearDist',nearDist)
... outFC = r'in_memory\outPts'+str(random.random()*1000)
... outFeat = arcpy.da.NumPyArrayToFeatureClass(npAppend,outFC,"SHAPE@XY")
... arcpy.CopyFeatures_management(outFC,r'in_memory\numpy_Pts')
... def arcpyNear():
... lyr = 'arcpy_Pts'
... arcpy.MakeFeatureLayer_management(fc,lyr)
... arcpy.AddField_management(lyr,'NearFID','LONG')
... arcpy.AddField_management(lyr,'NearDist','DOUBLE')
... with arcpy.da.UpdateCursor(lyr,['SHAPE@','OID@','NearFID','NearDist']) as cursor1:
... for row in cursor1:
... dist = float("inf")
... with arcpy.da.SearchCursor(lyr,['SHAPE@','OID@']) as cursor2:
... for row2 in cursor2:
... if row[1] != row2[1]:
... curDist = row[0].distanceTo(row2[0])
... if curDist < dist:
... dist = curDist
... nearFID = row2[1]
... row[2] = nearFID
... row[3] = dist
... cursor1.updateRow(row)
... cProfile.runctx('numpyNear()',None,locals())
... cProfile.runctx('arcpyNear()',None,locals())

Even though the result of each function is essentially the same, using 1,000 points the NumPy function completes in **0.516** seconds, while the ArcPy takes **202.062** seconds.

Side note: see use of cProfile for benchmarking method.

Here is the NumPy method, using three points as an example.

1. Create the NumPy Array (FeatureClassToNumPyArray):

npArray = arcpy.da.FeatureClassToNumPyArray(fc,["OID@","SHAPE@XY"])
[(0, [0.0, 0.0])
(1, [3.0, 0.0])
(2, [3.0, 4.0])]

2. Calculate all distances using SciPy pdist function (result = distance from 0 to 1, 0 to 2, and 0 to 3):

distArray = scipy.spatial.distance.pdist(npArray['SHAPE@XY'],'euclidean')
[ 3. 5. 4.]

3. Create a squareform of the distance array (row 1 is all distances from point 0, row 2 is all distances from point 1, etc.):

sqDistArray = scipy.spatial.distance.squareform(distArray)
[[ 0. 3. 5.]
[ 3. 0. 4.]
[ 5. 4. 0.]]

4. For each object ID in the original points, find the second nearest value’s index (so, not the distance to itself) – result is list of positions, by row, in squareform distance array holding the second-nearest value:

nearFID = npArray['OID@'][numpy.argsort(sqDistArray)[:,1]].transpose()
[1 0 1]

5. Append the array of NearFIDs to original point array, using numpy.lib.recfunctions.append_fields:

npAppend = numpy.lib.recfunctions.append_fields(npArray,'NearFID',nearFID)

6. Get second-nearest distance value for each row in squareform distance array:

nearDist = numpy.sort(sqDistArray)[:,1].transpose()
[ 3. 3. 4.]

7. Append the array of NearDist values to original point array:

npAppend = numpy.lib.recfunctions.append_fields(npAppend,'NearDist',nearDist)

8. Convert back to feature class and save (NumPyArrayToFeatureClass):

outFC = r'in_memory\outPts'+str(random.random()*1000)
outFeat = arcpy.da.NumPyArrayToFeatureClass(npAppend,outFC,"SHAPE@XY")
arcpy.CopyFeatures_management(outFC,r'in_memory\numpy_Pts')

Final attribute table, including original object ID, Near object ID, and Near Distance:

### Like this:

Like Loading...