NYCycle

January / March 2020

While I was at Smart Design, I worked on a project that used data analysis to investigate bicycle safety in New York City. We aimed to answer the question:

Have bicycle routes made cycling safer?

Cycling Safety

Today

Vision Zero is a road traffic safety project, adopted by cities across the US, that aims to reduce traffic fatalities and serious injuries to zero. It became official policy in New York City in 2014.

‘In the last five years, DOT (NYC Department of Transportation) has expanded and enhanced the on-street bike network by more than 330 miles, including more than 82 protected lane miles, with 20 miles installed in 2018. DOT installed over 66 lane miles of bike facilities, including 55 lane miles of dedicated cycling space in 2018.’ – NYC Department of Transportation

Despite this, many people feel that not enough has been done, and more than 1,000 cyclists protested in Washington Square Park in 2019.

The Data

Collisions and Routes

Both datasets were sourced from NYC Open Data via a free API.
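As a rough illustration of what such a request can look like, the sketch below builds a query URL in the style of the Socrata Open Data API, which NYC Open Data is built on. The dataset identifier `xxxx-xxxx` is a placeholder and the column name `number_of_cyclist_injured` is an assumption; both should be checked against the dataset's own API documentation.

```python
from urllib.parse import urlencode


def socrata_url(domain, dataset_id, where=None, limit=1000):
    """Build a JSON query URL for a Socrata-hosted dataset."""
    params = {"$limit": limit}
    if where:
        # $where takes a filter expression written in Socrata's query language.
        params["$where"] = where
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"


# "xxxx-xxxx" is a placeholder id and the column name is assumed, not confirmed.
url = socrata_url(
    "data.cityofnewyork.us",
    "xxxx-xxxx",
    where="number_of_cyclist_injured > 0",
    limit=5,
)
print(url)
```

The resulting URL can be fetched with any HTTP client and returns a list of JSON records.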

Collisions

Routes

Route Types

Signed/Marked

Road painted with 'sharrows' - arrows and bicycle icons

Conventional

Typical lanes with painted white lines designating space for bicycles

Protected

A lane physically separated from other vehicles by parked cars or bollards

Visualisation

To visualise the datasets, we built a dashboard using Dash by Plotly. The interface offers different views of collisions and routes over time.

Dashboard views: routes, collisions, and routes with collisions

PREPROCESSING

These visualisations gave a good overview of the data we had. To understand some of the patterns in more depth, we analysed the data in Python using Jupyter Notebooks.

Matching Collisions to Routes

Brute Force

These two separate datasets needed to be combined to discover exactly which collisions took place on which routes. The most obvious way of doing this would be to compare every collision with every route, known as a brute force method.
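A minimal sketch of the brute force idea, not the project's actual code: it assumes each route is a single straight segment in planar coordinates (metres) and that a collision belongs to a route when it lies within a hypothetical `max_dist` of it. Real routes are polylines, and the route names and coordinates here are invented for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_brute_force(routes, collisions, max_dist):
    """Check every collision against every route."""
    matches = {}
    for cid, point in collisions.items():
        matches[cid] = [
            rid for rid, (a, b) in routes.items()
            if point_segment_distance(point, a, b) <= max_dist
        ]
    return matches


# Hypothetical data: two straight routes and three collisions, in metres.
routes = {"route_a": ((0, 0), (100, 0)), "route_b": ((0, 50), (0, 150))}
collisions = {"c1": (50, 5), "c2": (3, 100), "c3": (500, 500)}
matches = match_brute_force(routes, collisions, max_dist=10)
print(matches)
```

The number of distance calculations is the number of routes multiplied by the number of collisions, which is why this approach scales poorly to a whole city.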

R-Trees

A much more efficient method of pairing collisions with routes was to use an R-tree. This tree data structure subdivides the map space, reducing the size of the problem. By using bounding boxes to decide whether or not to search a subtree, the number of comparisons that need to be made is significantly reduced. To facilitate this, each route was surrounded by a bubble (a buffer), so that the collisions intersecting it could be identified.
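The bounding-box idea can be sketched with a simplified, bulk-loaded R-tree in plain Python. This is an illustration under assumptions rather than the implementation used in the project: the packing strategy, the `capacity` parameter, the 10 metre buffer and the route coordinates are all hypothetical. The query returns candidate routes only; an exact distance check would still follow.

```python
def union(boxes):
    """Smallest (minx, miny, maxx, maxy) box covering all the given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def chunks(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]


def build(entries, capacity=4):
    """Bulk-load a tree from (box, route_id) pairs.

    Each node is a tuple (box, children, is_leaf).
    """
    if not entries:
        raise ValueError("need at least one entry")
    # Sort by box centre so that neighbouring routes tend to share a node.
    entries = sorted(entries, key=lambda e: (e[0][0] + e[0][2], e[0][1] + e[0][3]))
    level = [(union([box for box, _ in group]), group, True)
             for group in chunks(entries, capacity)]
    while len(level) > 1:
        level = [(union([node[0] for node in group]), group, False)
                 for group in chunks(level, capacity)]
    return level[0]


def query(node, box):
    """Return ids of entries whose box intersects the query box."""
    node_box, children, is_leaf = node
    if not intersects(node_box, box):
        return []  # the whole subtree is skipped without further comparisons
    if is_leaf:
        return [rid for entry_box, rid in children if intersects(entry_box, box)]
    found = []
    for child in children:
        found.extend(query(child, box))
    return found


def buffered_box(a, b, radius):
    """Bounding box of segment a-b, grown by the buffer radius."""
    return (min(a[0], b[0]) - radius, min(a[1], b[1]) - radius,
            max(a[0], b[0]) + radius, max(a[1], b[1]) + radius)


# Hypothetical routes as straight segments in metres.
routes = {
    "r1": ((0, 0), (100, 0)),
    "r2": ((0, 50), (0, 150)),
    "r3": ((1000, 1000), (1100, 1000)),
    "r4": ((1000, 1200), (1000, 1300)),
    "r5": ((2000, 0), (2100, 0)),
}
tree = build([(buffered_box(a, b, 10), rid) for rid, (a, b) in routes.items()],
             capacity=2)
collision = (50, 5)
candidates = query(tree, (collision[0], collision[1], collision[0], collision[1]))
print(candidates)
```

In this example only one leaf is inspected for the collision; the subtrees holding the distant routes are rejected by a single box comparison each.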

Normalising the data

We anticipated that, as the number of cyclists in NYC has been increasing, this would have a direct influence on the number of collisions taking place. A counter-argument is the 'safety in numbers' theory, by which the more cyclists there are, the fewer accidents occur. For this investigation we decided to assume the first approach, with a linear relationship between the increase in cyclists and collisions. We therefore multiplied each year's collisions by a correction factor, calculated from the change in the number of cyclists over time.

Source: NYC DoT
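One way to express this correction, as a sketch: under the linear assumption, each year's collision count is scaled by the ratio of cyclists in a baseline year to cyclists in that year. The exact form of the factor and the choice of baseline year are assumptions, and the counts below are invented for illustration; they are not NYC DoT figures.

```python
def correction_factors(cyclists_by_year, baseline_year):
    """Factor that rescales each year to the baseline year's ridership."""
    baseline = cyclists_by_year[baseline_year]
    return {year: baseline / count for year, count in cyclists_by_year.items()}


def normalise(collisions_by_year, cyclists_by_year, baseline_year):
    """Multiply each year's collisions by that year's correction factor."""
    factors = correction_factors(cyclists_by_year, baseline_year)
    return {year: collisions_by_year[year] * factors[year]
            for year in collisions_by_year}


# Hypothetical counts, for illustration only.
cyclists = {2014: 100_000, 2016: 125_000, 2018: 200_000}
collisions = {2014: 400, 2016: 450, 2018: 600}
normalised = normalise(collisions, cyclists, baseline_year=2014)
print(normalised)
```

With these made-up numbers, a raw rise in collisions between 2014 and 2018 becomes a fall once the doubling in cyclists is taken into account.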

RESULTS

Route type makes a difference

On average, protected routes saw a decrease in collisions after installation, whereas signed/marked routes saw an increase.

Looking at the collision count per route, we calculated what percentage of each route type saw an increase or a decrease in collisions after installation:
Conventional routes are the most common and saw slightly more increases than decreases
38% of all signed/marked routes saw an increase
30% of protected routes saw a decrease, compared with 24% seeing an increase
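The percentages above can be computed along these lines. This is a sketch assuming each route has one collision count before and one after installation, taken over comparable periods; the records are hypothetical and do not reproduce the project's results.

```python
from collections import Counter, defaultdict


def classify(before, after):
    if after > before:
        return "increase"
    if after < before:
        return "decrease"
    return "no change"


def share_by_route_type(routes):
    """Percentage of routes of each type that saw each kind of change."""
    counts = defaultdict(Counter)
    for route_type, before, after in routes:
        counts[route_type][classify(before, after)] += 1
    return {
        route_type: {label: 100 * n / sum(c.values()) for label, n in c.items()}
        for route_type, c in counts.items()
    }


# Hypothetical (route type, collisions before, collisions after) records.
routes = [
    ("protected", 8, 5), ("protected", 4, 4),
    ("protected", 3, 6), ("protected", 7, 2),
    ("signed/marked", 2, 5), ("signed/marked", 6, 6),
]
shares = share_by_route_type(routes)
print(shares)
```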

Cluttered road markings may be a cause

We wanted to see why some of these patterns were occurring. Google Street View allowed us to see how streets changed over time. By looking at the best and worst performing routes, we found that many of the worst had extremely cluttered road markings, especially at large intersections.

Spring Street - Broadway, November 2017

Spring Street - Broadway, June 2019

Intersections see a lot of incidents

When looking at routes on the dashboard, it was clear that many had concentrations of incidents at intersections rather than mid-route. This pattern appears to hold across the city; however, there is no simple way to quantify it right now. (Read on to see how this was quantified in other cities.)

LOOKING WIDER: OTHER CITIES

By looking at the data available from other US cities, we could compare the different methods used to protect cyclists. We chose two cities with good data sources, Boston and San Francisco. These can also be seen on the dashboard. The same methods of preprocessing and data analysis were used.

Sharrow location matters

Both San Francisco and Boston showed a different pattern of results to NYC when it came to signed/marked routes. In NYC these routes saw an increase in the number of collisions after installation, whereas in Boston and San Francisco they appeared to have a positive impact.

BOSTON

SAN FRANCISCO

Using Google Street View to take a closer look at the design of signed/marked routes (routes that use painted arrow and bicycle symbols), we found that the exact positioning of the painted markings could be the reason for these differences.

NYC

In NYC, sharrows are placed to the side of the lane, implying that they form part of a dedicated bike lane.

Boston

In Boston, the sharrows are central in the lane, removing that possible implication, acting more as a reminder that bicycles are present. This could be why Boston sees much safer signed/marked routes.

San Francisco

San Francisco has also placed its sharrows centrally. Again, this could be a contributing factor to the safety of San Francisco's signed/marked routes.

Crash locations change over time

Unlike the NYC dataset, both the San Francisco and Boston datasets record whether each collision occurred on the street or at an intersection, allowing us to quantify the differences. In Boston, the ratio of crashes clearly shifted over time towards mid-street rather than intersections. In San Francisco the opposite pattern is shown, with more collisions taking place at intersections over time.
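With a per-collision intersection flag, the shift can be quantified as the share of each year's collisions that took place at an intersection. The sketch below assumes records reduced to a year and a boolean flag; the field layout and the data are hypothetical.

```python
from collections import defaultdict


def intersection_share_by_year(collisions):
    """Fraction of each year's collisions that occurred at an intersection."""
    totals = defaultdict(int)
    at_intersection = defaultdict(int)
    for year, is_intersection in collisions:
        totals[year] += 1
        at_intersection[year] += int(is_intersection)
    return {year: at_intersection[year] / totals[year] for year in sorted(totals)}


# Hypothetical records: (year, occurred at an intersection?)
collisions = [
    (2015, True), (2015, True), (2015, False), (2015, True),
    (2018, True), (2018, False), (2018, False), (2018, False),
]
shares = intersection_share_by_year(collisions)
print(shares)
```

A falling share, as in this made-up example, corresponds to the Boston pattern described above; a rising share corresponds to the San Francisco pattern.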

BOSTON

Using Google Street View, we could see that Boston streets have many more features for cyclists at intersections. These include conflict markings, bike boxes and two-stage turn boxes, all defined by clear design guidelines. By comparison, San Francisco has fewer of these and NYC has almost none. This could be the cause of the different location patterns.

SAN FRANCISCO

CONCLUSION

By analysing the New York dataset and comparing it with Boston and San Francisco, we came to three key findings:

The positioning of ‘sharrows’ on signed/marked routes is critical, which may explain why New York’s signed/marked routes are much more dangerous than those in San Francisco and Boston
Cluttered road markings may be a major contributor to collisions, as New York’s intersections perform poorly
Having specialist features at intersections appears to have increased the safety of junctions in Boston and could be applied elsewhere

Overall, these cities are only as smart as the data they collect. Perhaps NYC’s intersections perform poorly because the city isn’t collecting this data, leaving it unaware of the problem?
