


January / March 2020

While I was at Smart Design, I worked on a project investigating bicycle safety in New York City using data analysis. Our results were published in Fast Company. We aimed to find out whether bicycle routes have actually made cycling safer.
Cycling Safety
Vision Zero is a road traffic safety project across the US that aims to reduce fatalities and serious injuries to zero. It became official policy in New York City in 2014.


  ‘In the last five years, DOT (NYC Department of Transportation) has expanded and enhanced the on-street bike network by more than 330 miles, including more than 82 protected lane miles, with 20 miles installed in 2018. DOT installed over 66 lane miles of bike facilities, including 55 lane miles of dedicated cycling space in 2018.’ – NYC Department of Transportation


Despite this, many people feel not enough has been done, with more than 1000 cyclists protesting in Washington Square Park in 2019.

The Data
Collisions and Routes
Both datasets were sourced from NYC Open Data via a free API.
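As a rough illustration of what a request to the open data API can look like, the sketch below only builds a query URL and does not download anything. NYC Open Data is served through the Socrata (SODA) API, which accepts `$where`, `$limit` and `$offset` query parameters. The dataset id `xxxx-xxxx` is a placeholder, and the `crash_date` column name and the `soda_url` helper are assumptions made for this example, not details taken from the project.

```python
from urllib.parse import urlencode

# Base address of NYC Open Data resources on the Socrata (SODA) API.
BASE = "https://data.cityofnewyork.us/resource"


def soda_url(dataset_id, where=None, limit=1000, offset=0):
    """Build a SODA query URL for one page of JSON records (hypothetical helper)."""
    params = {"$limit": limit, "$offset": offset}
    if where:
        params["$where"] = where  # SoQL filter expression
    # safe="$" keeps the parameter names readable instead of encoding them as %24.
    return f"{BASE}/{dataset_id}.json?{urlencode(params, safe='$')}"


# "xxxx-xxxx" is a placeholder id and "crash_date" an assumed column name.
url = soda_url("xxxx-xxxx", where="crash_date >= '2019-01-01'", limit=500)
print(url)
```

Paging through a large dataset would mean repeating the request with an increasing `offset` until a page comes back empty.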
Route Types
Signed/marked: roads painted with 'sharrows' (arrows and bicycle icons)
Conventional: typical lanes with painted white lines designating space for bicycles
Protected: a lane physically separated from other vehicles by parked cars or bollards
To visualise the datasets, we built a dashboard using Dash by Plotly. The interface shows different views of collisions and routes over time.
These visualisations gave a good overview of the data we had. To understand some of the patterns in more depth, we analysed the data in Python using Jupyter Notebooks.
Matching Collisions to Routes
Brute Force
These two separate datasets needed to be combined to discover exactly which collisions took place on which routes. The most obvious way of doing this would be to compare every collision with every route, also known as a brute force method.
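To make the cost of the brute force approach concrete, here is a minimal sketch. It assumes collisions are points and routes are polylines in planar coordinates (metres), and that a collision belongs to a route when it lies within `max_dist` of any of its segments. The function names, the threshold and the toy data are all hypothetical, not the project's actual code.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_brute_force(collisions, routes, max_dist):
    """Compare every collision with every route: O(collisions x routes)."""
    matches = {}
    for cid, point in collisions.items():
        matches[cid] = [
            rid
            for rid, line in routes.items()
            if any(
                point_segment_distance(point, a, b) <= max_dist
                for a, b in zip(line, line[1:])
            )
        ]
    return matches


# Hypothetical toy data in metres, not taken from the NYC datasets.
routes = {"r1": [(0, 0), (100, 0)], "r2": [(0, 50), (100, 50)]}
collisions = {"c1": (50, 3), "c2": (50, 48), "c3": (50, 25)}
matches = match_brute_force(collisions, routes, max_dist=10)
print(matches)
```

Every collision is tested against every segment of every route, so the work grows with the product of the two dataset sizes.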
A much more efficient method of pairing collisions with routes was to use an R-Tree algorithm. This tree data structure subdivides the map space, reducing the size of the problem. By using bounding boxes to decide whether or not to search a subtree, the number of comparisons that need to be made is significantly reduced. To facilitate this, each route was surrounded by a bubble, so that we could see which routes each collision intersected with.
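The idea of pruning by bounding box can be sketched with a small, bulk-loaded R-tree in pure Python. This is an illustration under assumptions, not the implementation used in the project: the packing is a simple sort on box centres, the "bubble" is approximated by a route's bounding box grown by a hypothetical `radius`, and all names and data are made up for the example. `build` expects at least one entry.

```python
def union(boxes):
    """Smallest box (minx, miny, maxx, maxy) enclosing all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def intersects(a, b):
    """True when two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class Node:
    def __init__(self, children, leaf):
        self.children = children  # leaf: [(box, item_id)], otherwise: [Node]
        self.leaf = leaf
        boxes = [c[0] for c in children] if leaf else [c.box for c in children]
        self.box = union(boxes)  # bounding box of everything below this node


def build(entries, capacity=4):
    """Bulk-load a tree from a non-empty list of (box, item_id) entries."""
    # Sorting on box centres keeps neighbouring boxes in the same node.
    entries = sorted(entries, key=lambda e: (e[0][0] + e[0][2], e[0][1] + e[0][3]))
    level = [Node(entries[i:i + capacity], leaf=True)
             for i in range(0, len(entries), capacity)]
    while len(level) > 1:  # group nodes until a single root remains
        level = [Node(level[i:i + capacity], leaf=False)
                 for i in range(0, len(level), capacity)]
    return level[0]


def search(node, query):
    """Return ids whose boxes intersect query, skipping subtrees that cannot match."""
    if not intersects(node.box, query):
        return []  # the whole subtree is pruned with one comparison
    if node.leaf:
        return [item for box, item in node.children if intersects(box, query)]
    found = []
    for child in node.children:
        found.extend(search(child, query))
    return found


def buffered_box(line, radius):
    """Bounding box of a polyline, grown by radius on every side (the 'bubble')."""
    xs = [x for x, _ in line]
    ys = [y for _, y in line]
    return (min(xs) - radius, min(ys) - radius, max(xs) + radius, max(ys) + radius)


# Hypothetical toy routes in metres.
routes = {"r1": [(0, 0), (100, 0)], "r2": [(0, 50), (100, 50)],
          "r3": [(500, 500), (600, 500)]}
tree = build([(buffered_box(line, 10), rid) for rid, line in routes.items()])
x, y = 50, 3  # one collision, queried as a zero-size box
candidates = search(tree, (x, y, x, y))
print(candidates)
```

The search returns candidates by bounding box only, so an exact distance check against each candidate route would still be needed, but it runs on a handful of routes rather than all of them.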
Normalising the data
We anticipated that, as the number of cyclists in NYC has been increasing, this would have a direct influence on the number of collisions taking place. A competing argument is the 'safety in numbers' theory, by which the more cyclists there are, the fewer accidents occur. For this investigation we decided to assume the first approach, with a linear relationship between the increase in cyclists and collisions. We applied it by multiplying each year's collisions by a correction factor, calculated from the change in the number of cyclists over time.
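One way to express such a correction is sketched below. It assumes the factor for a year is the ratio of cyclists in a chosen base year to cyclists in that year, which is one reading of the linear assumption rather than the exact formula used in the project. The function name and all figures are invented for illustration.

```python
def normalise_collisions(collisions_by_year, cyclists_by_year, base_year):
    """Scale each year's collisions to the ridership level of base_year.

    Assumes collisions grow linearly with the number of cyclists, so the
    correction factor for a year is cyclists[base_year] / cyclists[year].
    """
    base = cyclists_by_year[base_year]
    return {
        year: count * base / cyclists_by_year[year]
        for year, count in collisions_by_year.items()
    }


# Hypothetical figures, not taken from the NYC data.
cyclists = {2014: 100_000, 2016: 125_000, 2018: 150_000}
collisions = {2014: 400, 2016: 450, 2018: 480}
normalised = normalise_collisions(collisions, cyclists, base_year=2014)
print(normalised)
```

In this toy example the raw counts rise every year, but the normalised counts fall, because ridership grows faster than collisions.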
Route type makes a difference
On average, protected routes saw a decrease in collisions after installation, whereas signed/marked routes saw an increase in collisions.
  • Looking at collision counts per route, we calculated what percentage of routes of each type saw an increase or a decrease in collisions after installation compared with before

  • Conventional routes are the most common and saw slightly more increases than decreases

  • 38% of all signed/marked routes saw an increase

  • 30% of protected routes saw a decrease, compared with 24% seeing an increase
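The calculation behind these percentages can be sketched as follows. The example assumes each route has already been reduced to a before count and an after count of (normalised) collisions. It also assumes a 'no change' bucket, since the percentages quoted above do not add up to 100. The names and toy numbers are hypothetical.

```python
from collections import Counter, defaultdict


def change_shares(routes):
    """routes: iterable of (route_type, collisions_before, collisions_after).

    Returns, per route type, the percentage of routes whose collision
    count increased, decreased or stayed the same after installation.
    """
    tallies = defaultdict(Counter)
    for route_type, before, after in routes:
        if after > before:
            outcome = "increase"
        elif after < before:
            outcome = "decrease"
        else:
            outcome = "no change"
        tallies[route_type][outcome] += 1

    shares = {}
    for route_type, counts in tallies.items():
        total = sum(counts.values())
        shares[route_type] = {
            outcome: 100 * counts[outcome] / total
            for outcome in ("increase", "decrease", "no change")
        }
    return shares


# Hypothetical toy data: (route type, collisions before, collisions after).
sample = [
    ("protected", 3, 1), ("protected", 2, 2),
    ("protected", 0, 1), ("protected", 4, 2),
    ("signed/marked", 1, 2), ("signed/marked", 1, 1),
]
shares = change_shares(sample)
print(shares)
```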

Cluttered road markings may be a cause

We wanted to see why some of these patterns were occurring. Google Street View allowed us to see how streets changed over time. By looking at the best- and worst-case examples, we found that many of the worst-performing routes had extremely cluttered road markings, especially at large intersections.


Spring Street - Broadway, November 2017


Spring Street - Broadway, June 2019

Intersections see a lot of incidents

When looking at routes on the dashboard, it was clear that many had concentrations of incidents at intersections rather than mid-route. This pattern appears to hold across the city, but there is currently no simple way to quantify it. (Read on to see how this was quantified in other cities.)


By looking at the data available from other US cities, we could compare the different methods used to protect cyclists. We chose two cities with good data sources: San Francisco and Boston. These can also be seen on the dashboard. The same methods of preprocessing and data analysis were used.

Sharrow location matters

Both San Francisco and Boston showed a different pattern of results from NYC for signed/marked routes. In NYC these routes saw an increase in the number of collisions after installation, whereas in Boston and San Francisco they appear to have had a positive impact.





Using Google Street View to take a closer look at the design of signed/marked routes (routes that use painted arrow/bicycle symbols), we found that the exact positioning of the painted markings could be the reason for these differences.



In NYC, sharrows are placed to the side, implying that they form part of a dedicated bike lane.


In Boston, the sharrows are central in the lane, which removes that implication; they act more as a reminder that bicycles are present. This could be why Boston's signed/marked routes are much safer.

San Francisco

San Francisco has also placed its sharrows centrally. Again, this could be a contributing factor to the safety of San Francisco's signed/marked routes.

Crash locations change over time

Unlike the NYC dataset, both the San Francisco and Boston datasets record whether each collision occurred mid-street or at an intersection, allowing us to quantify the differences. In Boston, the balance of crashes has shifted over time towards mid-street locations rather than intersections. In San Francisco the opposite pattern is shown, with a growing share of collisions taking place at intersections over time.
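A minimal sketch of how such a trend can be quantified is shown below. It assumes each crash record has been reduced to a year and a flag saying whether it happened at an intersection. The record layout, function name and numbers are hypothetical and not taken from the Boston or San Francisco datasets.

```python
from collections import defaultdict


def intersection_share_by_year(crashes):
    """crashes: iterable of (year, at_intersection) pairs.

    Returns the fraction of each year's crashes that occurred at an
    intersection, with years in ascending order.
    """
    totals = defaultdict(int)
    at_intersection = defaultdict(int)
    for year, flag in crashes:
        totals[year] += 1
        at_intersection[year] += bool(flag)
    return {year: at_intersection[year] / totals[year] for year in sorted(totals)}


# Hypothetical toy records: (year, crash happened at an intersection?).
records = [
    (2015, True), (2015, True), (2015, False), (2015, False),
    (2016, True), (2016, False), (2016, False), (2016, False),
]
share = intersection_share_by_year(records)
print(share)
```

A share that falls from year to year corresponds to the pattern described for Boston, and a rising share to the one described for San Francisco.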



Using Google Street View, we could see that Boston streets have many more features for cyclists at intersections. These include conflict markings, bike boxes and two-stage turn boxes, all defined by clear design guidelines. By comparison, San Francisco has fewer of these and NYC has almost none. We think this could be the cause of the different location patterns.




By analysing the New York dataset and comparing it with Boston and San Francisco, we came to three key findings:


  • The positioning of ‘sharrows’ on signed/marked routes is critical, and may explain why New York’s signed/marked routes are much more dangerous than those in San Francisco and Boston

  • Cluttered road markings may be a major detriment to road safety, as New York’s intersections perform poorly

  • Specialist features at intersections appear to have increased the safety of junctions in Boston and could be applied elsewhere

Overall, these cities are only as smart as the data they collect. Perhaps NYC’s intersections perform poorly because the city isn’t collecting this data, leaving it unaware of the problem?
