Empirical Urbanist

Vision Zero Analysis at the Regional Scale

Bay Area HIN Index

Across the world, Vision Zero is a powerful and popular framework to conceptualize the safety of our cities' streets. It connects data to planning actions and recenters our conversations around designing cities for people. A critical task for cities considering any type of Vision Zero Plan is to identify which locations pose the highest risk of dangerous traffic collisions, so that they can be prioritized for improvements. To this end, a common strategy is to develop a High Injury Network (HIN), which identifies the streets with the largest concentration of collisions where a victim was killed or severely injured (KSI).

HINs are typically developed for a single city at a time. Consequently, our understanding of collision trends is often very focused, and we rarely have a chance to see what is going on across multiple jurisdictional boundaries. I have always been curious about what a regional HIN could show us, so, as part of recent projects, I have developed a series of Python-based tools to automate the development of robust, regional HINs. This article shares the details of a personal project: a regional HIN for the San Francisco Bay Area (Alameda, San Mateo, Santa Clara, and San Francisco Counties).

Breaking Down the Goals of a HIN

A HIN, at its core, is a prioritization analysis. Your goal as the HIN developer is to identify the highest priority locations along a network for further analysis and safety improvements. However, the HIN also serves as a communication tool: it shows that improving a small percentage of the city's streets can address the majority of KSI collisions occurring on the network.

For example, one of the first statistics presented on San Francisco's Vision Zero page is:

In San Francisco, more than 70 percent of severe and fatal traffic injuries occur on just 12 percent of city streets. 

This statistic focuses the problem definition of San Francisco's Vision Zero program.

So, some possible goals of a HIN could then be formulated as:

  • Identify the subset of the network where most collisions occur (>50%).

  • Limit the cumulative size of the network to a manageable subset. This subset should be realistically improvable over a short to intermediate time horizon (3-10 years). HINs identified as part of Vision Zero efforts usually cover less than 20% of the network, though the right threshold is debatable.

Metrics Matter: Developing a Multimodal Index

Urban planners familiar with prioritization analysis and performance measures understand the importance of establishing good metrics to connect data to desired outcomes. Developing an effective HIN quickly at an appreciable scale requires metrics that can be combined into a single index.

The approach I chose for the San Francisco Bay Area was the following:

  • Collect 5 years (2011-2016) of collision data for San Francisco, San Mateo, Alameda, and Santa Clara Counties from TIMS. Filter out collisions on highways.

  • Weight collision points by severity.

  • Associate weighted collision densities with a study network using kernel density estimation (KDE).

  • Compute percentile scores for bicycle, pedestrian, and vehicle collisions on the associated density values.

  • Compute a final index score combining percentile ranks of the associated density values.
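The last two steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the tooling behind the article: the density values are made up, the tie handling in `percentile_ranks` is one common convention, and combining modes with a simple mean is an assumption, since the combination rule is not specified above.

```python
from bisect import bisect_left, bisect_right


def percentile_ranks(values):
    """Return the percentile rank (0-100) of each value within the list.

    Ties share a rank: the share of values strictly below, plus half
    the share of values that are equal.
    """
    ordered = sorted(values)
    n = len(ordered)
    ranks = []
    for v in values:
        below = bisect_left(ordered, v)
        equal = bisect_right(ordered, v) - below
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


def combined_index(densities_by_mode):
    """Average the per-mode percentile ranks for each segment.

    The simple mean is an assumed combination rule for illustration.
    """
    per_mode = [percentile_ranks(vals) for vals in densities_by_mode.values()]
    return [sum(scores) / len(scores) for scores in zip(*per_mode)]


# Hypothetical weighted density values sampled on four network segments.
densities = {
    "bicycle":    [0.0, 0.2, 0.9, 0.4],
    "pedestrian": [0.1, 0.0, 1.5, 0.7],
    "vehicle":    [0.3, 0.1, 2.0, 0.2],
}
index = combined_index(densities)
# The third segment ranks highest in every mode, so it tops the index.
```

Working in percentile ranks rather than raw densities keeps one mode (typically vehicle collisions, which are far more numerous) from dominating the combined score.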

An unconventional aspect of this analysis is the use of KDE as a basis for a collision index. Typically, collision rates or indices are derived by spatially joining collisions to a study network. This ultimately runs into two problems, and both originate from trying to force a point pattern onto a predefined study network.

  • Modifiable Linear Unit Problem (MLUP): This is essentially the Modifiable Areal Unit Problem (MAUP) applied to networks. If you associate collisions with predefined segments, you can run into aggregation bias, because it is hard to define a non-arbitrary network segmentation. A commonly accepted way to prepare a study network is to break it at intersections, but this typically produces shorter segments in downtown grids than on rural roads. Even if you normalize by length or volume, the resulting crash rates still depend on how you aggregated collisions to each segment.

  • Segments, Not Intersections: Conventional collision analysis associates each collision with a single segment to avoid double counting. This simplifies later aggregation, but it forces collisions at intersections to be divided among the intersection's approaches. Each collision is implicitly attributed to one approach, not to the intersection itself.
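A small worked example makes the aggregation bias concrete. The corridor, collision positions, and breakpoints below are all hypothetical; the point is only that the same five collisions yield very different peak crash rates depending on where the network is split.

```python
def crashes_per_km(collision_positions_m, breakpoints_m):
    """Count collisions on each segment and normalize by its length in km."""
    rates = []
    for start, end in zip(breakpoints_m[:-1], breakpoints_m[1:]):
        count = sum(start <= p < end for p in collision_positions_m)
        rates.append(count * 1000.0 / (end - start))
    return rates


# Hypothetical collision locations, in metres along a 1 km corridor.
collisions = [450, 480, 520, 550, 900]

# Segmentation 1: one break at the midpoint, which splits the cluster.
even_split = crashes_per_km(collisions, [0, 500, 1000])
# -> [4.0, 6.0]

# Segmentation 2: a short segment that happens to contain the cluster.
shifted_split = crashes_per_km(collisions, [0, 400, 600, 1000])
# -> [0.0, 20.0, 2.5]
```

Both results are normalized by length, yet the highest rate is 6 crashes per km in one case and 20 in the other.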

KDE addresses these issues by using an estimation methodology specifically designed for point pattern analysis. KDE underlies some effective clustering algorithms, and many transportation studies use it to produce collision heat maps, which provide a quick visual understanding of collision clusters. However, the heat map is where most applications of KDE stop, and the underlying statistics are rarely incorporated into a transportation analysis directly. In essence, KDE offers a compromise: it identifies high injury segments and intersections without having to examine them separately.

Using KDE to generate a HIN requires sampling the resulting heat map values on small network segments. The results are useful both cartographically and statistically:

  • Cartographically, associating density values to a network provides options for visualization that are in a vector format. In addition, by focusing on the street network, it limits the visualization to what transportation planners want to communicate.

  • Statistically, at large scales, a high-resolution density raster tends to be dominated by low values, which skews its distribution. When the heat map is sampled to a network, the distributions are generally closer to normal. Sampling also captures intersection effects that would not be detected when each collision is associated with a single segment.
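The idea of sampling a severity-weighted density onto short segments can be sketched as follows. This is a simplified illustration under several assumptions: it evaluates a Gaussian kernel directly at each segment midpoint instead of building a raster first, and the severity weights, bandwidth, coordinates, and names such as `SEVERITY_WEIGHTS` and `sample_on_segments` are hypothetical rather than taken from the Bay Area analysis.

```python
import math

# Hypothetical severity weights; the article does not state the values used.
SEVERITY_WEIGHTS = {"fatal": 3.0, "severe": 2.0, "other": 1.0}


def weighted_density(x, y, collisions, bandwidth):
    """Gaussian kernel density at (x, y) from severity-weighted points."""
    total = 0.0
    for cx, cy, severity in collisions:
        dist_sq = (x - cx) ** 2 + (y - cy) ** 2
        kernel = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        total += SEVERITY_WEIGHTS[severity] * kernel
    return total / (2.0 * math.pi * bandwidth ** 2)


def sample_on_segments(segments, collisions, bandwidth):
    """Evaluate the density at the midpoint of each short segment."""
    sampled = {}
    for seg_id, ((x1, y1), (x2, y2)) in segments.items():
        mid_x, mid_y = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        sampled[seg_id] = weighted_density(mid_x, mid_y, collisions, bandwidth)
    return sampled


# Two collisions at an intersection at the origin (coordinates in metres).
collisions = [(0.0, 0.0, "fatal"), (0.0, 0.0, "severe")]

# Three short approach segments at the intersection and one distant block.
segments = {
    "north_approach": ((0.0, 0.0), (0.0, 50.0)),
    "east_approach": ((0.0, 0.0), (50.0, 0.0)),
    "south_approach": ((0.0, 0.0), (0.0, -50.0)),
    "distant_block": ((500.0, 0.0), (550.0, 0.0)),
}
density = sample_on_segments(segments, collisions, bandwidth=100.0)
```

Every approach receives the same elevated density, so the intersection as a whole stands out, whereas a one-collision-per-segment spatial join would have credited each collision to a single approach.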

Collision Accumulation

The final collision index determines the order in which we add street segments to the HIN. This approach provides a flexible framework for identifying parts of a network to prioritize for improvements or further study. For example, while most HINs are based purely on raw collision data, there is little to stop the inclusion of other variables, such as those that relate to equity.
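A rough sketch of that accumulation step, assuming each segment carries an index score, a KSI collision count, and a length: segments are added in descending index order until a target share of KSI collisions is covered, and the share of network length used is reported alongside. The dictionary keys, the `build_hin` name, and the sample values are hypothetical, and the sketch assumes the network has at least one KSI collision.

```python
def build_hin(segments, target_share=0.5):
    """Add segments in descending index order until the KSI target is met.

    Each segment is a dict with hypothetical keys:
    'id', 'index', 'ksi', and 'length_km'.
    Returns the selected ids, the KSI share covered, and the share of
    total network length those segments represent.
    """
    total_ksi = sum(s["ksi"] for s in segments)
    total_length = sum(s["length_km"] for s in segments)
    selected, ksi_sum, length_sum = [], 0, 0.0
    for seg in sorted(segments, key=lambda s: s["index"], reverse=True):
        if ksi_sum >= target_share * total_ksi:
            break
        selected.append(seg["id"])
        ksi_sum += seg["ksi"]
        length_sum += seg["length_km"]
    return selected, ksi_sum / total_ksi, length_sum / total_length


# Hypothetical segments: index score, KSI collision count, and length.
network = [
    {"id": "A", "index": 95.0, "ksi": 6, "length_km": 0.5},
    {"id": "B", "index": 80.0, "ksi": 3, "length_km": 0.5},
    {"id": "C", "index": 40.0, "ksi": 1, "length_km": 2.0},
    {"id": "D", "index": 10.0, "ksi": 0, "length_km": 7.0},
]
hin_ids, ksi_share, network_share = build_hin(network, target_share=0.7)
# Segments A and B cover 90% of KSI collisions on 10% of the network length.
```

Running the same function at several target shares produces the pairs of numbers behind statements like "X percent of severe and fatal injuries occur on Y percent of streets".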

While I expected KSI collisions to be concentrated at the scale of the Bay Area, the degree of that concentration was surprising.

The video below helps visualize this process on a map by showing the HIN at different collision accumulation percentages.

Ideas for Future Work

Vision Zero is more than just a plan; it is a framework that can be applied in multiple planning contexts and projects. In the future, I plan to focus on both communication and analysis strategies that can scale and fit into a scenario-oriented framework to address key planning uncertainties. Some future ideas include:

  • Identify potential similarity metrics to be used in a spatial clustering analysis (Mean Shift, DBSCAN, etc.) to identify preliminary groups of collisions. These collision clusters can be used in cross tabulations aimed at identifying trends in "why" or "what" caused collisions.

  • Integrate emerging data sources into future analysis from vendors such as Ecopia Tech, Mapillary, and Mobileye, which use advances in computer vision to create insight from photos and aerial imagery.

  • Connect summaries identifying collision causes and trends to potential improvements on key corridors. Based on the typical cross section, develop 3D models of potential improvements with a previously developed production process built on Python and CityEngine. The goal would be to connect abstract data analysis to concrete recommendations that are specific to a location's collision profile.

Limitations & Key Considerations

This process is not perfect by any means. The methodology is intended to serve as a robust, quick-response tool for generating HINs, so that safety considerations can be integrated into a more diverse portfolio of project types, such as corridor studies, studies for smaller cities, and regional-scale projects where conventional methods require a high degree of data preparation. The approach used for the Bay Area HIN, for example, does not normalize for exposure, and I did not smooth out the network with a dissolve. In addition, there are better KDE methods for networks that could have been used for this project, but I chose simplicity for this exercise. Regardless, it provides a quick way to prioritize locations for further consideration and potential improvements.