Areas most prone to crashes across New Zealand using spatial analysis

In this globalized era, vehicles have become essential for transporting goods, services and people. Whether public or private, they make travel largely hassle-free. Alongside these benefits, the growing number of vehicles on every mode of transport (road, marine, air and so on) brings a serious problem: crashes. Crashes cause many injuries and deaths. This project looks at crashes across New Zealand and their effects on people. The analysis identifies the location most prone to crashes and the year with the most crashes in the last two decades.

Background and Research

New Zealand is a developed country with good means of transport. Crashes are nevertheless common and lead to heavy losses, both human and material. This analysis takes a descriptive approach to the crashes recorded from 2000 to 2020. Because the data is concise, a regional map is a good fit for visualisation and lets the crash patterns be read in fine detail.

Data collection

The data comes from the NZ Transport Agency.

  • The data covers crashes that happened at different locations in New Zealand. It is available from 2000 to 2020.

To answer my research question, i.e. which location is most prone to crashes and in which year each type of crash peaked over the last two decades, I analysed the data by the Crash Location field in the dataset. The maximum for each crash category is also analysed, and the area with the highest count for each category is targeted. The method is descriptive analysis of the data. The process followed to reach the conclusion is shown as a flow chart in Figure 1.

First, the crash data from 2000 to 2020 is collected from the NZ Transport Agency. It consists of a shapefile of the crashes and comma-separated value (CSV) files with further information about them. The dataset is large: 731474 rows and 72 columns.

To make the data easier to understand, pivot tables are used for the analysis, since they give more flexibility than a plain table. A pivot table is built to show all crashes per year at each location. From it, the maximum and average of total crashes at each location over the last 20 years are calculated and saved to a separate CSV file with only three columns: crash location 1, maximum value and average value. Since there are four types of crashes, four pivot tables, one per type, are made to give the crash count for every location in every year. To get a single value per location, the maximum crash count between 2000 and 2020 is calculated. Four files are thus created, each containing locations and crash counts.

Figure 1 showing flowchart of analysis
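The pivot step described above can be sketched in pandas. This is only an illustration on made-up rows, not the spreadsheet workflow used in the project; the column names `crashLocation1` and `crashYear` are assumptions about how the fields might be named.

```python
import pandas as pd

# Toy stand-in for the crash records; rows and column names are hypothetical.
crashes = pd.DataFrame({
    "crashLocation1": ["SH 1N", "SH 1N", "SH 1N", "QUEEN ST", "QUEEN ST", "SH 2"],
    "crashYear": [2016, 2017, 2017, 2016, 2017, 2017],
})

# Location-by-year pivot of crash counts (missing combinations become 0).
pivot = pd.crosstab(crashes["crashLocation1"], crashes["crashYear"])

# Three-column summary: location, maximum yearly count, average yearly count.
summary = pd.DataFrame({
    "max_crashes": pivot.max(axis=1),
    "avg_crashes": pivot.mean(axis=1),
}).reset_index()

print(pivot)
print(summary)
```

To get one table per crash type, the same two steps could be repeated on a subset of the rows for each type, assuming the data has a column that records the type.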

The dataset contained many missing values, which could hinder the analysis. These are handled by replacing each missing value with the average value of that crash over the years, so that the data can be analysed. The ArcGIS project contains the regional file of New Zealand, the shapefile for all crashes and five comma-separated value files: one for each of the four crash types and one for the average value of all crashes across the years. The join operation is performed five times:

  1. Shapefile and total crashes comma-separated value file. This join is performed using crash location 1 as the key.

After each join is validated, the attribute table is checked to see the changes in the data frame. A map is then plotted for each category.
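The joins in this project were done in ArcGIS. As a rough equivalent, the sketch below fills missing counts with each location's own average over the years and then joins the summary onto a stand-in attribute table using pandas `merge`. All values, the `feature_id` column and the column name `crashLocation1` are hypothetical, and reading "average value over the years" as a per-location mean is my assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly counts per location, with gaps (NaN) to fill.
counts = pd.DataFrame(
    {2015: [10.0, 4.0], 2016: [np.nan, 6.0], 2017: [14.0, np.nan]},
    index=pd.Index(["SH 1N", "SH 2"], name="crashLocation1"),
)

# Replace each gap with that location's mean over the years that are present.
# fillna matches a Series to columns, so transpose, fill, transpose back.
filled = counts.T.fillna(counts.mean(axis=1)).T

summary = pd.DataFrame({
    "max_crashes": filled.max(axis=1),
    "avg_crashes": filled.mean(axis=1),
}).reset_index()

# Stand-in for the shapefile attribute table (one row per mapped feature).
features = pd.DataFrame({
    "feature_id": [1, 2, 3],
    "crashLocation1": ["SH 1N", "SH 2", "HIGH ST"],
})

# Left join on the location key; validate guards against duplicate keys
# in the summary table, which would silently multiply rows.
joined = features.merge(
    summary, on="crashLocation1", how="left", validate="many_to_one"
)
print(joined)
```

Features with no match in the summary (here `HIGH ST`) keep empty values after the join, which is the kind of thing worth looking for when checking the attribute table.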

Analysis of the maps

The map of total crashes across New Zealand is shown in Figure 2. Red marks the highest crash counts (between 1287 and 3300). Figure 4 shows the fatal injury crashes: the green symbols mark the regions with the highest number of fatal injury crashes (between 398 and 701). Crashes with minor injuries are shown in Figure 5, where the green symbols mark the highest number of minor injury cases over the last two decades, between 4991 and 11248. Areas with counts below this range are not highlighted, so that the most affected areas stand out. The third category is non-injury crashes, which may not have harmed people but did cause loss of property. These are shown in Figure 6, where green symbols indicate high counts ranging between 13395 and 40687. Points for locations with fewer than 13395 crashes are drawn small so that regions with more crashes are easier to see. The last category is serious injury crashes. Figure 7 shows the map of serious injuries, where the green symbols mark the highest counts, between 865 and 2179, and small points mark lower counts.

Figure 2 showing all crashes all over New Zealand
Figure 3 showing crashes at location SH1 from 2000 to 2020
Figure 4 showing fatal injury crashes all over New Zealand
Figure 5 showing minor injury crashes all over New Zealand
Figure 6 showing non-injury crashes all over New Zealand
Figure 7 showing serious injury crashes all over New Zealand

After plotting each map, the maximum crashes for each category are highlighted and their coordinates are noted. The statistics feature is used to check the ArcMap results.

During the analysis, SH 1N is found to be the most vulnerable place for crashes.

  • For crash location 1, SH 1N has the highest number of crashes, i.e. 3300 in the year 2017 (as shown in Figure 3).
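Finding the location and year with the highest count can be sketched as a lookup on a location-by-year table. The numbers below are invented for illustration and are not the project's results; the index name `crashLocation1` is an assumption.

```python
import pandas as pd

# Hypothetical location-by-year crash counts.
pivot = pd.DataFrame(
    {2016: [30, 4], 2017: [33, 5], 2018: [31, 6]},
    index=pd.Index(["SH 1N", "SH 2"], name="crashLocation1"),
)

# Location whose best year beats every other location's best year.
peak_location = pivot.max(axis=1).idxmax()
# Year in which that location reached its maximum.
peak_year = pivot.loc[peak_location].idxmax()
peak_count = pivot.loc[peak_location, peak_year]

print(peak_location, peak_year, peak_count)
```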

SH1 came up as the most dangerous area in terms of crashes. It is a state highway, so this might be the result of rash driving on the highway. Further, since fatal injury crashes are the most serious because lives are at stake, their percentage is calculated with the statistics feature in ArcMap. It is found that 1.28 percent of crashes involved fatal injuries, with SH1 the most prominent location.
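The percentage itself is simple arithmetic: fatal crashes divided by all crashes, times 100. The sketch below shows the calculation on invented counts, assuming the denominator is the total over all four crash types; the function name and the dictionary keys are hypothetical.

```python
def fatal_share(counts: dict[str, int]) -> float:
    """Return fatal crashes as a percentage of all crashes."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no crashes to compute a share from")
    return 100.0 * counts["fatal"] / total


# Invented counts per crash type, for illustration only.
example_counts = {"fatal": 5, "serious": 20, "minor": 75, "non_injury": 300}
share = fatal_share(example_counts)
print(f"{share:.2f} percent of crashes were fatal")
```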

Conclusion and Limitations

After the analysis, it is found that all types of crashes are more common in the North Island, in Auckland, and specifically on SH1. 2017 was the year with the highest number of crashes across New Zealand. Fatal cases make up roughly 1.3 percent of the total crash cases.

The analysis is based on the outcomes of crashes, i.e. fatal, serious injury, minor injury or non-injury, which mostly concern people. Further research could look at the causes of crashes, or at whether any particular group is responsible for them, which is not analysed here.




Masters in Applied data science, University of Canterbury, New Zealand. Data scientist who loves to play with the data and make sense from it.

Aki Kapoor
