Introduction to geoplanar#

Welcome to geoplanar, a package for planar enforcement for polygon (multipolygon) GeoSeries/GeoDataFrames. In this notebook, we will demonstrate some of the basic functionality of geoplanar using the example of a researcher interested in integrating data from the United States and Mexico, to study the US-Mexico international border region.

[1]:
import geoplanar
import geopandas

[2]:
mexico = geopandas.read_file("../../geoplanar/datasets/mexico/lvl0/mex_admbnda_adm0_govmex_20210618.shp")
mexico.plot()
[2]:
<AxesSubplot:>
_images/usmex_3_1.png
[3]:
import libpysal
[4]:
us = libpysal.examples.load_example('us_income')
[5]:
us.get_file_list()
[5]:
['/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/spi_download.csv',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/usjoin.csv',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/states48.gal',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/README.md',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/us48.shx',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/us48.shp',
 '/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/libpysal/examples/us_income/us48.dbf']
[6]:
us = geopandas.read_file(us.get_path("us48.shp"))

us.crs = mexico.crs us = us.to_crs(mexico.crs)

[7]:
us.plot()
[7]:
<AxesSubplot:>
_images/usmex_9_1.png
[8]:
usmex = us.append(mexico)
[9]:
usmex.plot()
[9]:
<AxesSubplot:>
_images/usmex_11_1.png
[10]:
usmex.head()
[10]:
AREA PERIMETER STATE_ STATE_ID STATE_NAME STATE_FIPS SUB_REGION STATE_ABBR geometry Shape_Leng Shape_Area ADM0_ES ADM0_PCODE ADM0_REF ADM0ALT1ES ADM0ALT2ES date validOn validTo
0 20.750 34.956 1.0 1.0 Washington 53 Pacific WA MULTIPOLYGON (((-122.40075 48.22540, -122.4615... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 45.132 34.527 2.0 2.0 Montana 30 Mtn MT POLYGON ((-111.47463 44.70224, -111.48001 44.6... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 9.571 18.899 3.0 3.0 Maine 23 N Eng ME MULTIPOLYGON (((-69.77779 44.07407, -69.86044 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 21.874 21.353 4.0 4.0 North Dakota 38 W N Cen ND POLYGON ((-98.73006 45.93830, -99.00645 45.939... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 22.598 22.746 5.0 5.0 South Dakota 46 W N Cen SD POLYGON ((-102.78793 42.99532, -103.00541 42.9... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
[11]:
usmex.shape
[11]:
(49, 19)
[12]:
usmex.tail()
[12]:
AREA PERIMETER STATE_ STATE_ID STATE_NAME STATE_FIPS SUB_REGION STATE_ABBR geometry Shape_Leng Shape_Area ADM0_ES ADM0_PCODE ADM0_REF ADM0ALT1ES ADM0ALT2ES date validOn validTo
44 13.517 20.877 46.0 46.0 Arkansas 05 W S Cen AR POLYGON ((-94.46148 34.19666, -94.45241 34.508... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
45 11.225 32.570 47.0 47.0 Louisiana 22 W S Cen LA MULTIPOLYGON (((-93.70736 30.23937, -93.69921 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
46 13.348 41.085 48.0 48.0 Florida 12 S Atl FL MULTIPOLYGON (((-80.78589 28.78493, -80.76264 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
47 16.928 40.823 51.0 51.0 Michigan 26 E N Cen MI MULTIPOLYGON (((-88.49746 48.17392, -88.62525 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0 NaN NaN NaN NaN NaN NaN NaN NaN MULTIPOLYGON (((-92.77034 15.15128, -92.77107 ... 354.605876 173.513956 México MX Mexico None None 2020-06-23 2021-06-18 None

We have appended the Mexico gdf to the US gdf. For now, however, we are going to zoom in on a subset of the border region to investigate things further:

[13]:
from shapely.geometry import box

clipper = geopandas.GeoDataFrame(geometry =[box(-109, 25, -97, 33)])

[14]:
usborder = geopandas.clip(clipper, us)
mexborder = geopandas.clip(clipper, mexico)
/Users/ecv/mambaforge/envs/geoplanar_docs/lib/python3.7/site-packages/ipykernel_launcher.py:2: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries.
Use `to_crs()` to reproject one of the input geometries to match the CRS of the other.

Left CRS: None
Right CRS: EPSG:4326


[15]:
usborder.plot()
[15]:
<AxesSubplot:>
_images/usmex_18_1.png
[16]:
mexborder.plot()
[16]:
<AxesSubplot:>
_images/usmex_19_1.png
[17]:
usmex = usborder.append(mexborder)
usmex.reset_index(inplace=True)
[18]:
usmex.plot()
[18]:
<AxesSubplot:>
_images/usmex_21_1.png

Border discrepancies#

[19]:
base = usborder.plot(alpha=0.7, facecolor='none', edgecolor='blue')
_ = mexborder.plot(alpha=0.7, facecolor='none', edgecolor='red', ax=base)
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_23_0.png

Here we see an example of the kinds of problems that can occur when combining different geospatial datasets that have been constructed by different researchers. In this figure, a portion of the US-Mexico border is displayed, with the blue linestring indicating the border according to the US dataset, while the red linestring is the border according to the Mexican dataset.

There are two types of problems this induces. Consider a point that is situated to the south of the Mexican border but North of the US border. This can only occur when the Mexican linestring is north of the US linestring. Since under planar enforcement, a point can belong to at most a single polygon, this situation would be a violation - not to mention the kind of cartographic error that can lead to a border war.

A second error occurs when a point is north of the Mexican linestring, but south of the US linestring. In this case, the point is not contained by either the US or Mexico polygons. Here the point is within a gap polygon.

Fixing Overlaps/Overshoots#

[20]:
usmex = usborder.append(mexborder)
usmex.reset_index(inplace=True)
usmex['COUNTRY'] = ["US", "MEXICO"]

usmex.area
[20]:
0    42.837167
1    51.649657
dtype: float64
[21]:
border_overlaps_removed = geoplanar.trim_overlaps(usmex)
border_overlaps_removed.area # mexico gets trimmed
[21]:
0    42.837167
1    51.608919
dtype: float64
[22]:
border_overlaps_removed_1 = geoplanar.trim_overlaps(usmex, largest=False)
border_overlaps_removed_1.area # us gets trimmed
[22]:
0    42.796428
1    51.649657
dtype: float64

Fixing undershoots/holes#

Trimming the overlaps removes the areas where points belong to both national polygons. What remains after this correction are gaps (empty sliver polygons) where points belong to neither national polygon.

[23]:
base = border_overlaps_removed.plot(column='COUNTRY')
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_31_0.png
[24]:
gap_gdf = geoplanar.gaps(border_overlaps_removed)
[25]:
base = gap_gdf.plot()
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_33_0.png
[26]:
gap_gdf.shape
[26]:
(231, 1)

For the entire border region there are 231 gaps that exist. These can be corrected, by merging each gap with the larger neighboring national polygon:

[27]:
final = geoplanar.fill_gaps(border_overlaps_removed)
[28]:
base = final.plot(column='COUNTRY')
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_37_0.png
[29]:
h1 = geoplanar.gaps(final)
[30]:
h1.shape
[30]:
(0, 1)
[31]:
base = final.plot(edgecolor='k')
_ = usborder.plot(alpha=0.7, facecolor='none', edgecolor='white', ax = base)
_ = mexborder.plot(alpha=0.7, facecolor='none', edgecolor='red', ax=base)
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)


_images/usmex_40_0.png
[32]:
final.area
[32]:
0    42.837167
1    51.686067
dtype: float64

Changing the defaults#

[33]:
usmex = usborder.append(mexborder)
usmex.reset_index(inplace=True)
usmex['COUNTRY'] = ["US", "MEXICO"]

usmex.area
[33]:
0    42.837167
1    51.649657
dtype: float64
[34]:
border_overlaps_removed_mx = geoplanar.trim_overlaps(usmex, largest=False)

[35]:
base = border_overlaps_removed_mx.plot(column='COUNTRY')
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_45_0.png
[36]:
final_mx = geoplanar.fill_gaps(border_overlaps_removed_mx, largest=False)
[37]:
base = final_mx.plot(column='COUNTRY')
_ = base.set_xlim(-101.4, -101.3)
_ = base.set_ylim(29.6, 29.75)

_images/usmex_47_0.png
[ ]: