Testing multiple points of failure¶
This notebook forms the basis of “Hands-On 7” in the CCG course.
Show asset exposure based on flood return period maps, select all exposed assets in an area
note that if historic flood outlines are available, these could be used to select multiple assets instead
Demonstrate how the network effects of multiple asset failures would be disproportionately worse if there is no redundancy or potential for substitution.
Understand the risk of multiple failures across the network - target the combinations of failures of greatest consequence
By the end of this tutorial you should be able to: * Assess direct and some indirect impacts of multiple asset failures * Compare flooding within regions as source of multiple failure scenarios * Understand approaches to stress-testing the system under multiple failures
[ ]:
# Imports from Python standard library
import os
import warnings
from glob import glob
from math import factorial
from pathlib import Path
# Imports from other Python packages
import contextily as cx
import geopandas as gpd
import networkx as nx
import numpy as np
import pandas as pd
import seaborn as sns
from pyproj import Geod
from tqdm.notebook import tqdm
Change this to point to your data folder as in the previous tutorial:
[ ]:
dir = (
Path(os.getcwd()).resolve().parents[3]
) # get parent directory of snail package
data_folder = os.path.join(dir, "ghana_tutorial")
# data_folder = Path("YOUR_PATH/ghana_tutorial")
[ ]:
# TODO add a note / auto download the geoboundaries dataset
# https://data.humdata.org/dataset/geoboundaries-admin-boundaries-for-ghana/resource/e324acb9-fd6b-4f5b-96ee-e2bb11917942
0.5 Preliminary step¶
Go to the geoboundatries at: https://www.geoboundaries.org/countryDownloads.html
Search for Ghana in the Name field and download the data set from Source Open Street Map. It will show up as a folder in your downloads directory.
Put this folder in the data directory of this tutorial. The one you specify in the data_folder
variable. # TODO add these steps to do this automatically - https://www.geoboundaries.org/api.html#api
[ ]:
# https://www.geoboundaries.org/api/current/gbOpen/GHA/ADM1/
import requests
import json
from urllib.request import urlretrieve
import zipfile
url = "https://www.geoboundaries.org/api/current/gbOpen/GHA/ADM1/"
response = requests.post(url)
test = json.loads(response.content)
download_url = test["staticDownloadLink"]
file_path = os.path.join(data_folder, "geoBoundaries-GHA-ADM1-all.zip")
if not os.path.exists(Path(os.path.splitext(file_path)[0])):
# check if data already exists in this directory
# if len(os.listdir(Path(os.path.splitext(file_path)[0]))) == 0:
urlretrieve(download_url, file_path) # note: this can take a few minutes
with zipfile.ZipFile(file_path, "r") as zip_ref:
zip_ref.extractall(os.path.splitext(file_path)[0])
else:
print("data already exists")
1. Map exposure¶
[ ]:
roads = gpd.read_file(data_folder + "/GHA_OSM_roads.gpkg", layer="edges")
road_nodes = gpd.read_file(data_folder + "/GHA_OSM_roads.gpkg", layer="nodes")
[ ]:
regions = gpd.read_file(
os.path.join(
data_folder,
"geoBoundaries-GHA-ADM1-all",
"geoBoundaries-GHA-ADM1.shp",
)
)[["shapeName", "shapeISO", "geometry"]]
[ ]:
regions.head(5)
[ ]:
roads = gpd.sjoin(roads, regions).drop(columns="index_right")
Filter roads by region name to find all roads in Greater Accra:
[ ]:
accra_roads = roads[roads.shapeName == "Greater Accra Region"]
accra_roads.head(5)
[ ]:
exposure = gpd.read_parquet(
data_folder + "/results" + "/GHA_OSM_roads_edges___exposure.geoparquet"
)
[ ]:
exposure = gpd.sjoin(exposure, regions).drop(columns="index_right")
[ ]:
exposure.head(5)
Filter exposure by region, RCP and return period to find all roads exposed to a historical 100-year flood in Greater Accra:
[ ]:
accra_exposure = exposure[(exposure.shapeName == "Greater Accra Region")]
[ ]:
accra_exposure.head(5)
[ ]:
[col for col in accra_exposure.columns if "1980" in col]
[ ]:
accra_exposure.plot(
column="wri_aqueduct-version_2-inunriver_historical_000000000WATCH_1980_rp00100-gha"
)
[ ]:
flood_col = "wri_aqueduct-version_2-inunriver_historical_000000000WATCH_1980_rp00100-gha"
accra_exposure_100yr = accra_exposure[accra_exposure[flood_col] > 0.5].copy()
accra_exposure_100yr[["id", "road_type", "name", "length_m", flood_col]].head()
2. Multiple failures¶
Direct damage can be summed directly, if we assume that all roads are damaged in the same event:
[ ]:
(
"Total direct exposure, "
"in Accra under a historical 100-year flood, is estimated to be "
f"{int(accra_exposure_100yr.length_m.sum() // 1e3)}km (of {int(accra_exposure.length_m.sum() // 1e3)}km total roads)."
)
Indirect damage can be assessed in different ways, some beyond the scope of this notebook. In this section, we look at the effects of disruption on a single route across the Greater Accra region. In a fuller analysis, we could extend this to look at many trips made within the region, and calculate the number of passengers or value of freight disrupted, along with the effects on transport time and cost.
Start by creating a networkx graph from the roads, using from_id
, to_id
and length_m
:
[ ]:
G = nx.Graph()
G.add_edges_from(
(r.from_id, r.to_id, {"id": r.id, "weight": r.length_m})
for r in roads.itertuples()
)
Then find the shortest path from one node to another:
[ ]:
route_nodes = nx.algorithms.shortest_path(
G, "roadn_6700", "roadn_1011", weight="weight"
)
Then find the edges in the shortest path, and sum over their lengths to find the length of the route:
[ ]:
def edge_ids_from_nodes(G, route_nodes):
next_nodes = iter(route_nodes)
next(next_nodes)
return [G.edges[u, v]["id"] for u, v in zip(route_nodes, next_nodes)]
route_edge_ids = edge_ids_from_nodes(G, route_nodes)
[ ]:
route = roads[roads.id.isin(route_edge_ids)]
[ ]:
f"Best route: {round(route.length_m.sum() / 1e3, 2)}km"
[ ]:
ax = route.plot()
ax.set_title("Direct route")
cx.add_basemap(ax, crs=route.crs, source=cx.providers.CartoDB.Positron)
ax
Save figure to file:
[ ]:
fig = ax.get_figure()
fig.savefig(os.path.join(data_folder, "results", "direct_route.png"))
Define a function which runs the process we went through above: - build a graph - reweight all failed edges to have infinite cost - failed edges need to be provided as a list of ("from_id", "to_id")
tuples. - find the shortest route from source to target - return a dataframe with the route’s road segments
[ ]:
def calc_route(roads, failures, source, target):
G = nx.Graph()
G.add_edges_from(
(r.from_id, r.to_id, {"id": r.id, "weight": r.length_m})
for r in roads.itertuples()
)
reweight = {}
for from_id, to_id in failures:
reweight[(from_id, to_id)] = float("inf")
nx.set_edge_attributes(G, reweight, "weight")
route_nodes = nx.algorithms.shortest_path(
G, source, target, weight="weight"
)
route_edge_ids = edge_ids_from_nodes(G, route_nodes)
route = roads[roads.id.isin(route_edge_ids)]
return route
[ ]:
def plot_route(df):
ax = df.plot()
cx.add_basemap(ax, crs=route.crs, source=cx.providers.CartoDB.Positron)
return ax
Test a single road failure to find if disruption makes a difference to the overall route:
[ ]:
single_failures = [("roadn_8900", "roadn_9227")]
single_fail_route = calc_route(
roads, single_failures, "roadn_6700", "roadn_1011"
)
print(f"Best route: {round(single_fail_route.length_m.sum() / 1e3, 2)}km")
plot_route(single_fail_route)
The single road failure above has almost no effect. In our dataset, the lanes of this road are represented separately, so the routing algorithm finds a route which goes around the failed link by switching to the other lane, and the whole journey is only about 10m longer.
Let’s see the effect of both lanes flooded at the same time, which may be more realistic:
[ ]:
both_lanes_failures = [
("roadn_8900", "roadn_9227"),
("roadn_9226", "roadn_8899"),
]
both_lanes_fail_route = calc_route(
roads, both_lanes_failures, "roadn_6700", "roadn_1011"
)
print(f"Best route: {round(both_lanes_fail_route.length_m.sum() / 1e3, 1)}km")
plot_route(both_lanes_fail_route)
This results in a much longer route around the flooded link.
What if more than one road is disrupted at the same time? Let’s test what happens if we assume that all roads exposed to 1-in-100 year flood events anywhere in Greater Accra are impassible.
[ ]:
multi_failures = [
(road.from_id, road.to_id) for road in accra_exposure.itertuples()
]
multi_fail_route = calc_route(
roads, multi_failures, "roadn_6700", "roadn_1011"
)
print(f"Best route: {round(multi_fail_route.length_m.sum() / 1e3, 1)}km")
plot_route(multi_fail_route)
This gives a longer route again.
This is a quick way of coming up with a hypothetical flood event, but it is not a rigorous method of analysis. With historic flood outlines, we could test and validate this simple model against past events. With an event set output from a hydrological model (rather than just the return-period hazard map that we’ve been using), we could test an ensemble of potential events.
The next section looks at testing all possible combinations of failures, which doesn’t require any additional data or modelling.
3. Test combinations¶
We can calculate the number of possible combinations of failures, and it gets very large quite quickly.
For example, with three roads, {A, B, C}, there are three possible single failures ({only A}, {only B} or {only C}), three possible double failures ({A and B}, {B and C} or {A and C}), and one possible triple failure ({A, B and C}).
More formally, if a set has \(n\) elements, the number of ways of picking \(k\) elements from it can be shown to be:
and is zero when $ k > n $.
The function n_choose_k
calculates this:
[ ]:
def n_choose_k(n, k):
if k > n:
return 0
return int(factorial(n) / (factorial(k) * factorial(n - k)))
Try out a few values to see how the function behaves:
[ ]:
n_choose_k(3, 2)
[ ]:
n_choose_k(200, 2)
[ ]:
n_choose_k(200, 3)
Calculate some of the numbers of possible failure combinations within our road network:
[ ]:
n = len(roads)
print(f"With {n} roads")
for k in range(4):
print(
f"there are {n_choose_k(n, k):,} total possible combinations of {k} roads failing"
)
Use the np.random.choice
to sample failure combinations at random from all roads (regardless of whether they intersect with any hazard):
[ ]:
k = 500
ids = np.random.choice(roads.id, size=k, replace=False)
failed_roads = roads[roads.id.isin(ids)]
failed_roads.head(2)
[ ]:
random_failures = [
(road.from_id, road.to_id) for road in failed_roads.itertuples()
]
random_fail_route = calc_route(
roads, random_failures, "roadn_6700", "roadn_1011"
)
print(f"Best route: {round(random_fail_route.length_m.sum() / 1e3, 1)}km")
plot_route(random_fail_route)
Sample 100 different sets of 500 failures to test how the best route length (for this arbitrarily-chosen route) changes under random failure conditions:
[ ]:
k = 500
n_samples = 100
lengths = []
for _ in tqdm(range(n_samples)):
ids = np.random.choice(roads.id, size=k, replace=False)
failed_roads = roads[roads.id.isin(ids)]
random_failures = [
(road.from_id, road.to_id) for road in failed_roads.itertuples()
]
random_fail_route = calc_route(
roads, random_failures, "roadn_6700", "roadn_1011"
)
length = round(random_fail_route.length_m.sum() / 1e3, 1)
lengths.append(length)
[ ]:
sampled_failures = pd.DataFrame({"length_km": lengths})
Calculate basic summary statistics from this sample:
[ ]:
sampled_failures.describe()
Plot all the route lengths as a scatter plot, to get some visual idea of the distribution:
[ ]:
sampled_failures.reset_index().plot.scatter(x="index", y="length_km")
Plot the empirical cumulative distribution function to summarise the distribution in another way.
In the samples we’ve taken while testing this tutorial, it shows most of the time 500 random failures in the road network has little effect on the route length (around 55-60km), but some combinations of failures see a route length of up to around 180km.
[ ]:
sns.kdeplot(sampled_failures.length_km, cumulative=True)