Ditch the Cursor¶

Editing Feature Classes with Spatialy-Enabled DataFrames¶

ArcPy Is Great, And...¶

Problem one: `row[0]`¶

def _update_year_built(layer, year_fields):
    with arcpy.da.UpdateCursor(layer, year_fields) as cursor:
        for row in cursor:
            if row[0] is None or row[0] < 1 or row[0] == '':
                row[0] = f'{row[1]}_{row[2]}' 

            cursor.updateRow(row)

Problem 2: Transfering data between feature classes and tables¶

with arcpy.da.SearchCursor(new_data_fc, fields) as new_data_cursor, \
     arcpy.da.InsertCursor(current_data_fc, fields) as current_data_cursor:
    for row in new_data_cursor:
        current_data_cursor.insertRow(row)
        copied_records += 1

Problem 3: Renaming/reordering fields¶

fieldmappings = arcpy.FieldMappings()

fieldmappings.addTable(energov_parcels_fc)
fieldmappings.addTable(tville_parcels_fc)

fields_list = [
    ('PIN', 'parcel_id'),
    ('own_cityst', 'own_citystate'),
    ('own_zip_fo', 'own_zip_four'),
    ('prop_locat', 'prop_location'),
    ('property_t', 'property_type'),
    ('neighborho', 'neighborhood_code'),
    ('adjusted_p', 'adjusted_prcl_total'),
    #: ...
]

for field_map in fields_list:
    field_to_map_index = fieldmappings.findFieldMapIndex(field_map[0])
    field_to_map = fieldmappings.getFieldMap(field_to_map_index)
    field_to_map.addInputField(tville_parcels_fc, field_map[1])
    fieldmappings.replaceFieldMap(field_to_map_index, field_to_map)

Problem 4: Intermediate feature classes¶

ssa_summarized_roads = fr'{output_gdb}\ssa_bike_lanes_roads'
ssa_summarized_paths = fr'{output_gdb}\ssa_bike_lanes_paths'
ssa_summarized_lengths = fr'{output_gdb}b\SmallStatisticalAreas_2018_bike_lane_lengths'
tract_summarized_roads = fr'{output_gdb}\tract_bike_lanes_roads'
tract_summarized_paths = fr'{output_gdb}\tract_bike_lanes_paths'
tract_summarized_lengths = fr'{output_gdb}\census_tracts_2020_bike_lane_lengths'
buffered_tracts = fr'{output_gdb}\census_tracts_2020_buffered_30ft'
buffered_areas = fr'{output_gdb}\small_areas_buffered_200ft'
bike_lanes = fr'{output_gdb}\bike_lanes_20220111'
major_paths = fr'{output_gdb}\major_paths'

Enter the Pandas!¶

pandas gives you the tools to work with tables of data defined by rows and columns, called a DataFrame

In [1]:

import pandas as pd

medians_df = pd.read_csv('assets/median_age.csv')
medians_df.head()

Out[1]:

	County	Median_age	educationHighSchoolGraduate	educationBachelorOrGreater	Median HH income (dollars)	Mean HH income (dollars)	Mean Per Capita income	Avg_MonthlyIncome	TI_1	TI_2	TI_3	TI_4	TI_5	TI_6	TI_7	TI_8	TI_9	TI_10
0	BEAVER	33.4	90.3	20.4	54,212	64,003	22,558	2805	Gasoline stations	Elementary and secondary schools	Traveler accommodation	Executive, legislative and general government	Cattle ranching and farming	Building equipment contractors	Offices of physicians	Residential building construction	Depository credit intermediation	Other crop farming
1	BOX ELDER	32.5	93.2	23.3	59,937	73,085	23,998	3344	Restaurants and other eating places	Architectural and structural metals mfg.	General freight trucking	Building foundation and exterior contractors	Utility system construction	Justice, public order, and safety activities	Grain and oilseed milling	Gasoline stations	Automobile dealers	Offices of physicians
2	CACHE	25.1	93.1	37.8	56,840	72,148	22,666	3049	Restaurants and other eating places	Elementary and secondary schools	Dairy product manufacturing	Accounting and bookkeeping services	Other miscellaneous manufacturing	General merchandise stores, including warehous...	Offices of physicians	Scientific research and development services	Grocery stores	Electronic instrument manufacturing
3	CARBON	37.2	90.8	16.4	50,278	60,125	23,473	3436	Justice, public order, and safety activities	Machinery and supply merchant wholesalers	Executive, legislative and general government	Commercial machinery repair and maintenance	Offices of physicians	Gasoline stations	Building equipment contractors	Traveler accommodation	Offices of dentists	Home health care services
4	DAGGET	46.1	96.2	12.4	81,250	78,210	27,698	2252	Elementary and secondary schools	Administration of environmental programs	Postal service	Support activities for mining	Offices of real estate agents and brokers	Justice, public order, and safety activities	Executive, legislative and general government	Personal care services	Commercial machinery repair and maintenance	Special food services

We can access individual rows and columns using .loc (with index labels) or .iloc (with indices)

medians_df.loc[row labels, column labels]
medians_df.iloc[row indices, column indices]

In [2]:

medians_df.loc[[0, 1, 2, 5], 'County']

Out[2]:

0       BEAVER
1    BOX ELDER
2        CACHE
5        DAVIS
Name: County, dtype: object

In [3]:

medians_df.iloc[10:15, :4]

Out[3]:

	County	Median_age	educationHighSchoolGraduate	educationBachelorOrGreater
10	IRON	29.1	92.8	29.1
11	JUAB	30.4	92.2	17.1
12	KANE	41.7	92.8	27.3
13	MILLARD	35.8	88.7	22.4
14	MORGAN	32.4	97.6	39.5

We can also get just a few columns from all rows

In [4]:

medians_df[['Median_age', 'Avg_MonthlyIncome']].head()

Out[4]:

	Median_age	Avg_MonthlyIncome
0	33.4	2805
1	32.5	3344
2	25.1	3049
3	37.2	3436
4	46.1	2252

Extending pandas Spatially¶

The ArcGIS API for Python provides Spatially Enabled DataFrames, which include geometry information.

In [5]:

from arcgis.features import GeoAccessor, GeoSeriesAccessor

counties_fc_path = r'C:\Users\jdadams\AppData\Roaming\Esri\ArcGISPro\Favorites\opensgid.agrc.utah.gov.sde\opensgid.boundaries.county_boundaries'
counties_df = pd.DataFrame.spatial.from_featureclass(counties_fc_path)
counties_df.head()

Out[5]:

	xid	countynbr	entitynbr	entityyr	name	fips	stateplane	pop_lastcensus	pop_currestimate	globalid	fips_str	color4	SHAPE
0	1	20	2.010201e+09	2010.0	SANPETE	39.0	Central	28437	None	02C0C074-657F-44AE-A886-44ADB97263BD	49039	2	{"rings": [[[448347.6200000001, 4407163.6], [4...
1	2	11	2.010111e+09	2010.0	IRON	21.0	South	57289	None	2ACA2EB9-31B5-4D6A-A858-A0F6B16C64E2	49021	3	{"rings": [[[292688.5800000001, 4224956.960000...
2	3	13	2.010131e+09	2010.0	KANE	25.0	South	7667	None	A250C849-8914-4E00-A80F-C1B45CD8F0A7	49025	4	{"rings": [[[425313.88999999966, 4154648.07000...
3	4	29	2.010291e+09	2010.0	WEBER	57.0	North	262223	None	1757A80B-1895-4975-81DA-3AD2EBBA64C9	49057	1	{"rings": [[[422712.16000000015, 4554559.60999...
4	5	19	2.006191e+09	2006.0	SAN JUAN	37.0	South	14518	None	EC858EAC-D7E7-4748-B7A5-B2B744668178	49037	3	{"rings": [[[609343.4100000001, 4095382.08], [...

pandas lets you work on rows that meet a certain condition

In [6]:

counties_df.loc[counties_df['stateplane'] == 'Central', ['name', 'stateplane', 'fips_str']]

Out[6]:

	name	stateplane	fips_str
0	SANPETE	Central	49039
8	TOOELE	Central	49045
12	UINTAH	Central	49047
13	GRAND	Central	49019
15	MILLARD	Central	49027
16	WASATCH	Central	49051
17	JUAB	Central	49023
18	UTAH	Central	49049
19	DUCHESNE	Central	49013
25	EMERY	Central	49015
26	SEVIER	Central	49041
27	CARBON	Central	49007
28	SALT LAKE	Central	49035

You can easily add new columns

In [7]:

counties_df['emperor'] = 'Jake'
counties_df.head()

Out[7]:

	xid	countynbr	entitynbr	entityyr	name	fips	stateplane	pop_lastcensus	pop_currestimate	globalid	fips_str	color4	SHAPE	emperor
0	1	20	2.010201e+09	2010.0	SANPETE	39.0	Central	28437	None	02C0C074-657F-44AE-A886-44ADB97263BD	49039	2	{"rings": [[[448347.6200000001, 4407163.6], [4...	Jake
1	2	11	2.010111e+09	2010.0	IRON	21.0	South	57289	None	2ACA2EB9-31B5-4D6A-A858-A0F6B16C64E2	49021	3	{"rings": [[[292688.5800000001, 4224956.960000...	Jake
2	3	13	2.010131e+09	2010.0	KANE	25.0	South	7667	None	A250C849-8914-4E00-A80F-C1B45CD8F0A7	49025	4	{"rings": [[[425313.88999999966, 4154648.07000...	Jake
3	4	29	2.010291e+09	2010.0	WEBER	57.0	North	262223	None	1757A80B-1895-4975-81DA-3AD2EBBA64C9	49057	1	{"rings": [[[422712.16000000015, 4554559.60999...	Jake
4	5	19	2.006191e+09	2006.0	SAN JUAN	37.0	South	14518	None	EC858EAC-D7E7-4748-B7A5-B2B744668178	49037	3	{"rings": [[[609343.4100000001, 4095382.08], [...	Jake

pandas provides powerful built in grouping and aggregation tools, along with Spatially Enabled DataFrames' geometry operations

In [8]:

counties_df.groupby('stateplane').count()

Out[8]:

	xid	countynbr	entitynbr	entityyr	name	fips	pop_lastcensus	pop_currestimate	globalid	fips_str	color4	SHAPE	emperor
stateplane
Central	13	13	13	13	13	13	13	0	13	13	13	13	13
North	8	8	8	8	8	8	8	0	8	8	8	8	8
South	8	8	8	8	8	8	8	0	8	8	8	8	8

In [9]:

counties_df['acres'] = counties_df['SHAPE'].apply(lambda shape: shape.area / 4046.8564)
counties_df.groupby('stateplane')['acres'].sum()

Out[9]:

stateplane
Central    2.725686e+07
North      8.633955e+06
South      1.842508e+07
Name: acres, dtype: float64

pandas Solutions to our Arcpy Problems¶

`row[0]` Solution: Field Names¶

Let's make Erik the emperor of the small counties that use State Plane North

In [10]:

counties_df.loc[(counties_df['pop_lastcensus'] < 100000) & (counties_df['stateplane'] == 'North'), 'emperor'] = 'Erik'
counties_df[['name', 'pop_lastcensus', 'stateplane', 'emperor']].sort_values('name').head()

Out[10]:

	name	pop_lastcensus	stateplane	emperor
9	BEAVER	7072	South	Jake
10	BOX ELDER	57666	North	Erik
11	CACHE	133154	North	Jake
27	CARBON	20412	Central	Jake
20	DAGGETT	935	North	Erik

Joined Tables Solution: Merged DataFrames¶

Let's add census data to our counties

In [11]:

census_fc_path = r'C:\Users\jdadams\AppData\Roaming\Esri\ArcGISPro\Favorites\opensgid.agrc.utah.gov.sde\opensgid.demographic.census_counties_2020'
census_df = pd.DataFrame.spatial.from_featureclass(census_fc_path)
counties_with_census_df = counties_df.merge(census_df[['geoid20', 'aland20']], left_on='fips_str', right_on='geoid20')
counties_with_census_df.head()

Out[11]:

	xid	countynbr	entitynbr	entityyr	name	fips	stateplane	pop_lastcensus	pop_currestimate	globalid	fips_str	color4	SHAPE	emperor	acres	geoid20	aland20
0	1	20	2.010201e+09	2010.0	SANPETE	39.0	Central	28437	None	02C0C074-657F-44AE-A886-44ADB97263BD	49039	2	{'rings': [[[448347.6200000001, 4407163.6], [4...	Jake	1.024680e+06	49039	4.117901e+09
1	2	11	2.010111e+09	2010.0	IRON	21.0	South	57289	None	2ACA2EB9-31B5-4D6A-A858-A0F6B16C64E2	49021	3	{'rings': [[[292688.5800000001, 4224956.960000...	Jake	2.112861e+06	49021	8.537474e+09
2	3	13	2.010131e+09	2010.0	KANE	25.0	South	7667	None	A250C849-8914-4E00-A80F-C1B45CD8F0A7	49025	4	{'rings': [[[425313.88999999966, 4154648.07000...	Jake	2.627117e+06	49025	1.033391e+10
3	4	29	2.010291e+09	2010.0	WEBER	57.0	North	262223	None	1757A80B-1895-4975-81DA-3AD2EBBA64C9	49057	1	{'rings': [[[422712.16000000015, 4554559.60999...	Jake	4.220563e+05	49057	1.492537e+09
4	5	19	2.006191e+09	2006.0	SAN JUAN	37.0	South	14518	None	EC858EAC-D7E7-4748-B7A5-B2B744668178	49037	3	{'rings': [[[609343.4100000001, 4095382.08], [...	Jake	5.075154e+06	49037	2.025317e+10

Renaming/Reordering Fields Solution: `df.rename()` and `df.reindex()`¶

"Emperor" is too bold; let's use "Benevolent Dictator for Life" instead.

In [12]:

renames = {
    'name': 'County Name',
    'pop_lastcensus': 'Last Census Population',
    'emperor': 'Benevolent Dictator for Life',
    'acres': 'Acres',
    'aland20': 'Land Area',
}
counties_with_census_df.rename(columns=renames, inplace=True)
counties_with_census_df.head()

Out[12]:

	xid	countynbr	entitynbr	entityyr	County Name	fips	stateplane	Last Census Population	pop_currestimate	globalid	fips_str	color4	SHAPE	Benevolent Dictator for Life	Acres	geoid20	Land Area
0	1	20	2.010201e+09	2010.0	SANPETE	39.0	Central	28437	None	02C0C074-657F-44AE-A886-44ADB97263BD	49039	2	{'rings': [[[448347.6200000001, 4407163.6], [4...	Jake	1.024680e+06	49039	4.117901e+09
1	2	11	2.010111e+09	2010.0	IRON	21.0	South	57289	None	2ACA2EB9-31B5-4D6A-A858-A0F6B16C64E2	49021	3	{'rings': [[[292688.5800000001, 4224956.960000...	Jake	2.112861e+06	49021	8.537474e+09
2	3	13	2.010131e+09	2010.0	KANE	25.0	South	7667	None	A250C849-8914-4E00-A80F-C1B45CD8F0A7	49025	4	{'rings': [[[425313.88999999966, 4154648.07000...	Jake	2.627117e+06	49025	1.033391e+10
3	4	29	2.010291e+09	2010.0	WEBER	57.0	North	262223	None	1757A80B-1895-4975-81DA-3AD2EBBA64C9	49057	1	{'rings': [[[422712.16000000015, 4554559.60999...	Jake	4.220563e+05	49057	1.492537e+09
4	5	19	2.006191e+09	2006.0	SAN JUAN	37.0	South	14518	None	EC858EAC-D7E7-4748-B7A5-B2B744668178	49037	3	{'rings': [[[609343.4100000001, 4095382.08], [...	Jake	5.075154e+06	49037	2.025317e+10

Now that we've got it all looking good, let's reorder the fields and get rid of the ones we don't want

In [13]:

field_order = [
    'County Name',
    'Benevolent Dictator for Life',
    'Acres',
    'Land Area',
    'Last Census Population',
    'SHAPE'
]
final_counties_df = counties_with_census_df.reindex(columns=field_order)
final_counties_df.head()

Out[13]:

	County Name	Benevolent Dictator for Life	Acres	Land Area	Last Census Population	SHAPE
0	SANPETE	Jake	1.024680e+06	4.117901e+09	28437	{'rings': [[[448347.6200000001, 4407163.6], [4...
1	IRON	Jake	2.112861e+06	8.537474e+09	57289	{'rings': [[[292688.5800000001, 4224956.960000...
2	KANE	Jake	2.627117e+06	1.033391e+10	7667	{'rings': [[[425313.88999999966, 4154648.07000...
3	WEBER	Jake	4.220563e+05	1.492537e+09	262223	{'rings': [[[422712.16000000015, 4554559.60999...
4	SAN JUAN	Jake	5.075154e+06	2.025317e+10	14518	{'rings': [[[609343.4100000001, 4095382.08], [...

Intermediate Feature Classes: New DataFrame Variables¶

With everything we've done, we've not written a single feature class to either disk or in_memory

counties_df
counties_with_census_df
final_counties_df

Finally, Write It All To Disk¶

In [14]:

final_counties_df.spatial.to_featureclass(r'C:\gis\Projects\HousingInventory\HousingInventory.gdb\counties_ugic')

Out[14]:

'C:\\gis\\Projects\\HousingInventory\\HousingInventory.gdb\\counties_ugic'

Spatial Joins¶

centroids_df = pd.DataFrame.spatial.from_featureclass(centroids_fc)
walksheds_df = pd.DataFrame.spatial.from_featureclass(walksheds_fc)

walk_centroids_df = centroids_df.spatial.join(walksheds_df, 'left', 'within')

Update an AGOL Hosted Feature Layer¶

feature_layer_item = gis.content.get(feature_layer_itemid)
feature_layer = arcgis.features.FeatureLayer.fromitem(feature_layer_item)
live_dataframe = pd.DataFrame.spatial.from_layer(feature_layer)

#: maniuplate/transform the existing data
cleaned_dataframe = do_stuff(live_dataframe)

feature_layer.edit_features(updates=cleaned_dataframe.spatial.to_featureset())

Resources?¶

Official Docs¶

Pandas docs: https://pandas.pydata.org/docs/user_guide/index.html
ArcGIS API for Python Reference: https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor

Example Code¶

Erik's 2020 UGIC Intro to Pandas presentation: https://agrc.github.io/Presentations/UGIC/2020/pandas.pdf
Updating AGOL with dataframe: https://github.com/agrc/palletjack/blob/main/src/palletjack/updaters.py#L175
So much dataframe craziness: https://github.com/agrc/housing-unit-inventory/tree/first-dev/src/housing_unit_inventory

Ditch the Cursor¶

Editing Feature Classes with Spatialy-Enabled DataFrames¶

ArcPy Is Great, And...¶

Problem one: `row[0]`¶

Problem 2: Transfering data between feature classes and tables¶

Problem 3: Renaming/reordering fields¶

Problem 4: Intermediate feature classes¶

Enter the Pandas!¶

Extending pandas Spatially¶

pandas Solutions to our Arcpy Problems¶

`row[0]` Solution: Field Names¶

Joined Tables Solution: Merged DataFrames¶

Renaming/Reordering Fields Solution: `df.rename()` and `df.reindex()`¶

Intermediate Feature Classes: New DataFrame Variables¶

Finally, Write It All To Disk¶

Spatial Joins¶

Update an AGOL Hosted Feature Layer¶

Resources?¶

Official Docs¶

Example Code¶

jdadams@utah.gov ¶

gis.utah.gov/presentations ¶

Ditch the Cursor¶

Editing Feature Classes with Spatialy-Enabled DataFrames¶

ArcPy Is Great, And...¶

Problem one: row[0]¶

Problem 2: Transfering data between feature classes and tables¶

Problem 3: Renaming/reordering fields¶

Problem 4: Intermediate feature classes¶

Enter the Pandas!¶

Extending pandas Spatially¶

pandas Solutions to our Arcpy Problems¶

row[0] Solution: Field Names¶

Joined Tables Solution: Merged DataFrames¶

Renaming/Reordering Fields Solution: df.rename() and df.reindex()¶

Intermediate Feature Classes: New DataFrame Variables¶

Finally, Write It All To Disk¶

Spatial Joins¶

Update an AGOL Hosted Feature Layer¶

Resources?¶

Official Docs¶

Example Code¶

jdadams@utah.gov¶

gis.utah.gov/presentations¶

Problem one: `row[0]`¶

`row[0]` Solution: Field Names¶

Renaming/Reordering Fields Solution: `df.rename()` and `df.reindex()`¶

jdadams@utah.gov ¶

gis.utah.gov/presentations ¶