Introduction

The Metropolitan Police District (MPD) is the area of Greater London policed by the Metropolitan Police (Met Police). It covers the 32 London boroughs and excludes the City of London.

I explored crime data provided by the Met Police to gain insight into the level of crime and type of recorded offences in the MPD. Furthermore, using number of recorded offences as an indicator of safety, I aimed to answer the question: what were the safest and most dangerous boroughs within the London Metropolitan Police District in 2018?

For the year 2018, Sutton was found to be the safest borough in the MPD and Westminster was found to be the most dangerous.

Crimes committed in the City of London were not covered as they are under the jurisdiction of the City of London Police. This project was carried out at the end of 2019. The analysis may be updated in the future to include the City of London, to use crime rates (as opposed to ‘number of recorded offences’) and to take more recent years into account.

Code and files used to complete this report can be found at https://github.com/jhfran/london-safest-boroughs

Objectives

To explore crime MPD-wide and ultimately find the safest and most dangerous boroughs within the MPD, I decided to:

  1. Start with an overview of the number of recorded offences in the MPD between 2008 and 2018, to find out whether it increased or decreased over that period

  2. Choose a recent year for further analysis

  3. Gain further insight into the crimes committed in the MPD by analysing the offence categories with the highest number of recorded offences for the chosen year

  4. Analyse the offence types to understand the number of offences at the offence type level

  5. Examine a heatmap and rank borough safety using the number of recorded offences as an indicator

  6. Group boroughs with similar safety levels through k-means clustering. Furthermore, I wanted to see if I could find a cluster of boroughs with the highest crime levels and a cluster of boroughs with the lowest crime levels

  7. Finally, look at the top 3 offence types committed in the safest and most dangerous MPD boroughs that were identified through clustering

Data

For this project, I analysed ‘MPS Borough Level Crime’, which was sourced from the London Datastore website (Greater London Authority, 2019). The data covers the number of recorded offences in the 32 MPD boroughs at the offence category and offence type levels for each month over the years 2008-2018. The dataset has 1,056 observations and 135 variables, which are a mix of categorical and numeric types. Because the data was provided by the Metropolitan Police Service, it does not cover offences recorded in the City of London, which fall under the jurisdiction of the City of London Police.

After loading the packages and reading the data into R, I carried out initial data wrangling. I began by renaming the columns to names that were more identifiable and easier to use in code. For example, the column ‘200801’ was changed to ‘Jan2008’ and ‘Major Category’ was changed to ‘Offence Category’. I then checked for missing data (using sapply) and, with the aid of the mutate function and pipes, created new columns showing the total offences for each year (summed from the month columns for that year). Additional wrangling was done throughout the analysis using functions such as group_by, filter, select, dcast, summarize and top_n.
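The renaming and yearly-total steps can be sketched roughly as follows. The original wrangling was done in R, so this Python version is only an illustration: the helper names (`rename_month_column`, `add_yearly_totals`), the ‘TotalYYYY’ column naming and the counts in the sample row are all assumptions, not taken from the project code.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def rename_month_column(code):
    """Turn a 'YYYYMM' column name such as '200801' into 'Jan2008'."""
    year, month = code[:4], int(code[4:])
    return f"{MONTHS[month - 1]}{year}"


def add_yearly_totals(row):
    """Return a copy of `row` with a hypothetical 'TotalYYYY' entry per year."""
    totals = {}
    for name, value in row.items():
        # Month columns look like 'Jan2008': a month abbreviation plus a year.
        if name[:3] in MONTHS and name[3:].isdigit():
            year = name[3:]
            totals[year] = totals.get(year, 0) + value
    out = dict(row)
    for year, total in totals.items():
        out[f"Total{year}"] = total
    return out


# Hypothetical row: the counts are made up for illustration only.
raw = {"Borough": "Sutton", "200801": 10, "200802": 12, "200901": 7}
renamed = {(rename_month_column(k) if k.isdigit() else k): v
           for k, v in raw.items()}
with_totals = add_yearly_totals(renamed)
print(with_totals)
```

In the real dataset each year has 12 month columns, so each yearly total would be the sum of 12 values rather than the one or two shown here.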

Results

Offences between 2008-2018

In Figure 1 below, I have plotted the total number of recorded offences in the MPD for each year between 2008 and 2018. 2008 saw the highest number of recorded offences (847,795), after which the number declined every year until 2013, which had the lowest total of the period (646,953 recorded offences). It should be noted that there was a sharp decline in crime of 17.99% between 2012 and 2013. As we can see in Figure 1, the number of recorded offences increased every year between 2013 and 2018, with the sharpest increase from 760,609 offences in 2016 to 824,620 offences in 2017 - an 8.42% increase.

To take a deeper look at crime within the MPD, I wanted to isolate a single year for further analysis. 2008 had the highest number of offences of the whole period and there was a significant drop in crime in 2013. However, preference was given to 2018 because it was the most recent year with data for all 12 months and it had the second highest number of recorded offences between 2008 and 2018 (839,422). Therefore, going forward, further analysis was only done for the year 2018.
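The year-on-year percentages quoted above follow from a simple relative-change calculation. As a small Python sketch (the function name `percent_change` is my own; the two totals are the 2016 and 2017 figures from the text):

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, expressed as a percentage."""
    return (new - old) / old * 100


offences_2016 = 760_609
offences_2017 = 824_620

change = percent_change(offences_2016, offences_2017)
print(f"2016 to 2017: {change:.2f}% change in recorded offences")
```

A negative result from the same function would indicate a decline, as between 2012 and 2013.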

Offences at the category level

One of the things I wanted to understand was the types of crime taking place and how often each was recorded. As displayed in Figure 2, theft and handling was the most prevalent offence category. It accounted for 37.89% of the total recorded offences in 2018 in the MPD area - a total of 318,019 recorded offences. Violence against the person accounted for 30.86% of the total recorded offences - 259,086 recorded offences in 2018. Together, these two categories contained approximately 68.75% of all recorded offences in 2018. Each of the 7 other offence categories had fewer than 100,000 recorded offences in 2018.
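The category shares can be checked directly from the counts given above. A short Python sketch, using only the 2018 total and the two category counts quoted in the text (the variable names are my own):

```python
total_2018 = 839_422
category_counts = {
    "Theft and handling": 318_019,
    "Violence against the person": 259_086,
}

# Share of all recorded offences in 2018, as a percentage.
shares = {name: count / total_2018 * 100
          for name, count in category_counts.items()}
combined = sum(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
print(f"Combined: {combined:.2f}%")
```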

In addition to crime at the category level, I decided to analyse crime in the MPD at the type level in order to deduce if single offence types were responsible for the high levels of recorded offences in the theft and handling and violence against the person offence categories.

Offences at the type level

According to Figure 3 below, the theft and handling category had four offence types in the top 10 offence types committed in 2018:

• Other theft - 13.41% of overall crime and the most recorded offences of any offence type

• Theft from motor vehicle - 7.87% of overall crime

• Theft from shops - 5.26% of overall crime

• Other theft person - 5.12% of overall crime

Violence against the person had three of the top 10 offence types committed in 2018:

• Harassment - 10.83% of overall crime

• Common assault - 8.59% of overall crime

• Assault with injury - 6.10% of overall crime

Other theft is a ‘catch-all’ term for thefts that are not in other categories. Examples of other theft include ‘theft by an employee, blackmail and making off without payment’ (Met Police, 2019). Additionally, it includes personal theft where there was no contact between the offender and the victim, profiting from or concealing knowledge of the proceeds of crime, theft of mail and even theft of electricity (Home Office, 2011).

‘Harassment is when a person is subject to persistent threatening or abusive behaviour’ (Met Police, 2019). It can cause significant distress and can be done in person, over the phone, online and through stalking. Harassment can go on for years without the victim knowing who the perpetrator is. With so many ways to carry out harassment (especially anonymously), it is unsurprising that it is the most recorded offence type within violence against the person.

Offences at the borough level and dissimilarity matrix

Figure 4A shows the recorded offences for each borough. Westminster has by far the most recorded offences (62,995) and Sutton has the fewest (12,092). Even before clustering, this shows that they are clearly the most dangerous and the safest borough respectively.

I also decided to use a distance matrix to further visualise the distance between the boroughs in Figure 4B. Similar boroughs are closer in distance to each other and have a lighter blue colour in the square of the matrix. As boroughs become more dissimilar, the colour becomes red (the brighter the colour, the bigger the difference). With the exception of itself, Westminster is red for every borough. The other boroughs, however, have a mixture of red and blue and are more varied than Westminster. For example, Tower Hamlets, Lambeth, Southwark and Newham show quite large dissimilarities with Westminster, Barking and Dagenham, Havering, Bexley, Harrow, Sutton, Richmond upon Thames, Kingston upon Thames and Merton. However, they have relatively small dissimilarities with other boroughs.
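To make the idea of a distance matrix concrete, here is a minimal Python sketch. It assumes Euclidean distance between boroughs, which the text does not state explicitly, and it uses made-up counts for three placeholder boroughs across three offence categories; none of these numbers come from the dataset.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors of counts."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(points):
    """Pairwise distances as a nested dict: matrix[p][q] is the p-to-q distance."""
    names = list(points)
    return {p: {q: euclidean(points[p], points[q]) for q in names}
            for p in names}


# Hypothetical counts per offence category (illustration only).
toy_boroughs = {
    "Borough A": [100, 50, 20],
    "Borough B": [110, 55, 25],
    "Borough C": [400, 300, 90],
}

matrix = distance_matrix(toy_boroughs)
for name, row in matrix.items():
    print(name, {other: round(d, 1) for other, d in row.items()})
```

In this toy example Borough C plays the role of an outlier: its distance to every other borough is large, which is the pattern described for Westminster.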

Clustering Boroughs

Additional investigation was done to ascertain whether or not it was possible to divide boroughs (our data points) into clusters using k-means clustering. It is worth highlighting that these data points are 9-dimensional - the 9 dimensions are the offence categories.

Only 31 of the 32 boroughs were used in the cluster analysis. As discovered in Figure 4A and Figure 4B, Westminster is a clear outlier, having 27,054 more offences than the borough with the second highest number of offences - Newham. The k-means method is sensitive to outliers because it aims to minimise the within-cluster sum of squares. The algorithm updates each cluster centre using the mean of the data points assigned to it. Outliers with large deviations therefore have a significant impact on the mean of a cluster, pulling the centroid towards them. As such, Westminster was excluded from this analysis.
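The effect of an outlier on a cluster mean can be shown with a few numbers. In the Python sketch below, only two values come from the text: Westminster's 62,995 and the 35,941 implied for Newham (62,995 minus 27,054). The other three totals are hypothetical stand-ins for high-crime boroughs, and the one-dimensional setup is a simplification of the 9-dimensional data.

```python
from statistics import mean

# Total recorded offences for a hypothetical high-crime cluster.
# Only the last value (implied for Newham) is derived from the text.
cluster = [30_000, 32_000, 34_000, 35_941]
westminster = 62_995

centroid_without = mean(cluster)
centroid_with = mean(cluster + [westminster])

print(f"Centroid without the outlier: {centroid_without:,.2f}")
print(f"Centroid with the outlier:    {centroid_with:,.2f}")
print(f"Shift caused by one point:    {centroid_with - centroid_without:,.2f}")
```

With the outlier included, the centroid ends up above every other member of the cluster, so it no longer represents a typical borough in that group.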

Additionally, before performing k-means cluster analysis, it was necessary to decide the best number of clusters. To find that number, I plotted the within-cluster sum of squares against the number of clusters in Figure 5A. I then used the elbow method to determine the number of clusters (K) at the ‘elbow’ of the plot. We can see from this plot that the elbow was found at K = 3.

Ideally, I would have liked to see a slower and smoother decline after K = 3, but it is clear enough from Figure 5A that 3 is the best number of clusters for this data.
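The elbow method can be sketched in plain Python as below. This is an illustration under stated assumptions, not the R code used for Figure 5A: the points are made-up two-dimensional data with three obvious groups, the function names are my own, and the centroids are initialised with a deterministic farthest-point rule so the output is reproducible (library implementations of k-means typically use random starts instead).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-point initialisation: a deterministic simplification."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans_wss(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical data: three tight groups of four points each.
toy_points = [(0, 0), (1, 0), (0, 1), (1, 1),
              (10, 10), (11, 10), (10, 11), (11, 11),
              (20, 0), (21, 0), (20, 1), (21, 1)]

wss = {k: kmeans_wss(toy_points, k) for k in range(1, 5)}
for k, value in wss.items():
    print(f"K = {k}: within-cluster sum of squares = {value:.2f}")
```

On this toy data the within-cluster sum of squares falls steeply up to K = 3 and barely changes afterwards, which is the ‘elbow’ shape looked for in the plot.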

Using a K of 3, three distinct clusters were identified (without overlap) as shown in Figure 5B. As expected, these findings match our results in Figure 4A. Figure 6 also shows that Cluster 1 (8 boroughs) had the lowest means across all offence categories. Cluster 2 (15 boroughs) had the second lowest means across all offence categories except other notifiable offences, where its mean was the highest, slightly above that of Cluster 3 (8 boroughs), which had the highest means in every other category. It should also be noted that, although there are 9 dimensions, principal component analysis was carried out and the data points in Figure 5B were plotted against principal components one and two, which accounted for 73% and 9.9% of the variance in the data respectively.

Figure 6: Table showing cluster means for 9 variables

Offence Category               Cluster 1    Cluster 2    Cluster 3
Burglary                         1884.38      3144.40      3579.13
Criminal Damage                  1263.00      1969.40      2088.38
Drugs                             516.37      1075.33      1686.00
Fraud or Forgery                  17.880       40.800       48.625
Other Notifiable Offences        320.000      615.870      607.625
Robbery                           440.75       881.33      1655.88
Sexual Offences                   381.63       642.87       820.75
Theft and Handling               4961.88      9286.07     12984.75
Violence Against the Person       4966.3       8470.0       9853.5

Safest and most dangerous boroughs

Sutton, Kingston upon Thames, Richmond upon Thames, Merton, Harrow, Bexley, Barking and Dagenham and Havering, identified in Cluster 1, are the safest boroughs in the MPD.

Although other theft accounted for the highest number of offences MPD-wide, harassment was the offence type with the most recorded offences in each of the safest boroughs, as shown in Figure 7A. Other theft was the second highest offence type for 7 of the 8 safest boroughs, the exception being Richmond upon Thames. Surprisingly, other theft wasn’t one of the top 3 offence types in Richmond upon Thames, where theft from motor vehicle was the second highest offence type and burglary in a dwelling was the third.

Westminster and the 8 boroughs in Cluster 3 - Newham, Southwark, Camden, Lambeth, Tower Hamlets, Hackney, Haringey, Croydon and Brent - were found to be the most dangerous boroughs in the MPD.

Unlike the safest boroughs, other theft was the offence type with the highest number of offences across 8 of the 9 most dangerous boroughs (Figure 7B). In Croydon, however, harassment was the highest recorded offence followed by other theft and common assault.
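Finding the top 3 offence types per borough is a group-then-rank operation. The original analysis used R functions such as group_by and top_n; the Python sketch below shows the same idea with a hypothetical helper, `top_n_by_borough`, and made-up counts for two placeholder boroughs.

```python
from collections import defaultdict


def top_n_by_borough(records, n=3):
    """Sum counts per (borough, offence type) and keep the n largest per borough."""
    totals = defaultdict(lambda: defaultdict(int))
    for borough, offence_type, count in records:
        totals[borough][offence_type] += count
    return {
        borough: sorted(by_type.items(), key=lambda kv: kv[1], reverse=True)[:n]
        for borough, by_type in totals.items()
    }


# Hypothetical monthly records: (borough, offence type, recorded offences).
toy_records = [
    ("Borough A", "Harassment", 120), ("Borough A", "Harassment", 130),
    ("Borough A", "Other theft", 200), ("Borough A", "Common assault", 90),
    ("Borough A", "Theft from shops", 40),
    ("Borough B", "Other theft", 500), ("Borough B", "Harassment", 300),
    ("Borough B", "Common assault", 250), ("Borough B", "Drugs", 100),
]

top3 = top_n_by_borough(toy_records)
for borough, ranking in top3.items():
    print(borough, ranking)
```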

Limitations

One of the limitations of the dataset was that the analysis had to be restricted to the Metropolitan Police District. The dataset included 32 boroughs but did not include the City of London. This is because the City of London has its own police service, the City of London Police, so its crimes were not included in the dataset obtained from the Metropolitan Police.

Additionally, like Westminster, the City of London is a major tourist and commuter destination with a prominent nightlife. Had the analysis covered Greater London as a whole, the City of London might also have emerged as a dangerous borough.

Another limitation was the vagueness of some of the offence types, e.g., other theft which is the top offence in the MPD. It makes it more difficult to assess the exact crimes committed in these boroughs.

Conclusion

References

Home Office. (2011). User Guide to Home Office Crime Statistics. [online] Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/116226/user-guide-crime-statistics.pdf [Accessed 05 Oct. 2020].

Met.police.uk. (2019). Harassment | Crime prevention | The Met. [online] Available at: https://www.met.police.uk/cp/crime-prevention/harassment/af/Harassment/ [Accessed 05 Oct. 2020].

Greater London Authority. (2019). Recorded Crime: Geographic Breakdown – London Datastore. [online] Available at: https://data.london.gov.uk/dataset/recorded_crime_summary [Accessed 05 Oct. 2020].

Police.uk. (2019). Advice and crime prevention. [online] Available at: https://www.police.uk/pu/advice-crime-prevention/ [Accessed 05 Oct. 2020].