A Checklist for International Gun Death Comparisons - Possible Machine Learning Project?


Very often comparisons with other countries are made in order to imply that the US is more violent due to lax gun laws/high gun ownership.

This post is a checklist of common errors made in such comparisons, all of which can be detected with supervised machine learning if enough sample articles are used to train a classifier (scikit-learn alone is sufficient, though neural net libraries can be put to use too).
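To make the idea concrete, here is a minimal sketch of what one such binary classifier could look like in scikit-learn. It checks a single criterion (whether a text mentions a per-capita metric). The six training snippets and their labels are hand-written placeholders rather than real articles, and the TF-IDF plus logistic regression combination is just one reasonable assumption, not a settled design.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical, hand-written snippets. Real training data would be
# full articles labeled by a human reader.
texts = [
    "The homicide rate per 100,000 residents fell over the decade.",
    "Deaths per capita are lower once population is accounted for.",
    "Adjusted for population, the rate per 100,000 people is similar.",
    "The country recorded thousands more gun deaths than its neighbor.",
    "Total gun deaths dwarf the totals reported anywhere else.",
    "More people were shot last year than in all of Europe.",
]
# 1 = mentions a per-capita metric, 0 = does not
labels = [1, 1, 1, 0, 0, 0]


def build_classifier():
    """Return an untrained TF-IDF + logistic regression pipeline."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
        LogisticRegression(max_iter=1000),
    )


clf = build_classifier()
clf.fit(texts, labels)

# Classify an unseen sentence; the result is an array holding 0 or 1.
predictions = clf.predict(["Gun deaths per 100,000 people were compared."])
```

One such pipeline per checklist item would give a yes/no answer for each criterion. With only a handful of examples the predictions mean very little, which is why the amount of training data matters so much.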

If my memory serves me right, a very condensed version of this first appeared on the now defunct “Defcad” forums half a decade ago. It was expanded by a data scientist in training who is considering using this as a test case for an open source project in the works (@jonstokes, if your Keybase account is still accessible they can get in touch immediately).

Judging by statements the founder of this site has made on social media (and my own observations), it looks like these errors are made by prohibitionists a lot: https://twitter.com/jonst0kes/status/973964184827383809

(FYI @jonstokes, that error being discussed in that Twitter thread is #9 in the list below)

So why not put a pipeline together that has been trained to classify whether a document does or doesn’t contain any of the following information? Here are some crucial things any statistical comparison with other countries must account for.



The Criteria: This is a full list of things that can be checked for using binary text classification alone. All citations are either raw data or use raw data to reach their conclusions.


#1-2 are obvious basics, #3-5 all deal with how countries are selected and compared, #6-8 deal with how concentrated violent crime tends to be (regardless of the gun laws in a given area of the US), and #9-11 have to do with how before/after comparisons are made between countries as gun laws changed in the last few decades. #12 is especially open for discussion - what omitted variables might there be that can impact violent crime rates even after all the previous steps have been accounted for?

Right now, all it takes is a two-feature test (population differences, and the use of total homicides as a metric to account for changes in criminal behavior) to find that there may be no correlation between how strict a state’s gun laws are (by The Brady Campaign’s grade) and what the Justice Department’s homicide rate for that state is likely to be: https://tinyurl.com/zerocorrelationgunlaws That means it would take additional factors for any conclusion to be drawn one way or another.
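As a rough sketch of what that two-feature test involves: convert total homicides to a per-capita rate, then compute a correlation coefficient against a numeric law-strictness score. The rows below are made-up placeholders, not the Brady or Justice Department figures, and mapping letter grades to numbers is an assumption of this example.

```python
from math import sqrt


def rate_per_100k(deaths, population):
    """Total deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder rows:
# (law strictness score, total homicides, population)
states = [
    (1, 100, 2_000_000),
    (2, 300, 10_000_000),
    (3, 240, 4_000_000),
    (4, 50, 1_000_000),
]

strictness = [score for score, _, _ in states]
homicide_rates = [rate_per_100k(h, p) for _, h, p in states]

# r near 0 means no linear relationship; near +1 or -1 means a strong one.
r = pearson_r(strictness, homicide_rates)
```

Note that the placeholder state with the most homicides in absolute terms (300) ends up with the lowest rate once population is divided out, which is the whole point of criterion #1.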

This is a draft set of criteria to help highlight how wrong many comparisons between countries can be. State-by-state comparisons can be complex enough, but the way that international gun death comparisons are done by news outlets and academic studies is especially fraught with methodological errors.

Here are the criteria:


1. POPULATION DIFFERENCES: An analysis must use per capita/person metrics of some kind, or mention something about dividing total deaths by the population of a given country. This is too obvious for any responsible analysis to ignore.



2. SUBSTITUTION EFFECT: Total murders must be used rather than just ones committed with guns, at least if we want to know what the impact of stricter gun laws in the US might be. People might switch weapons in some instances, and a new set of people may decide to commit crimes because they see the risk of retaliation as being lower.




3. CHOICE OF COUNTRIES: An analysis should avoid cherry picking which countries are included in comparisons. A good overview of this issue can be found here, and another follow-up by the same author can be found here. Additionally, this analysis takes a look at the same issue.






4. MURDER DEFINITIONS: Believe it or not, different countries often have different definitions when it comes to what gets reported as a murder. The most obvious example of this is that the UK only reports a murder as such if someone is actually convicted for the killing. The US murder rate would be only 65% of what it really is by that standard.






5. GUN LAW DESCRIPTIONS: How gun laws are portrayed between countries is relevant. It may not be justified to blame a mass shooting in the US on “lax gun laws” if the shooting took place in an area that specifically banned them. Also, what the laws are in any given country often varies within short periods of time. Here’s just one example:




6. CITY POPULATION FACTOR: Murder rates double in cities with over 250,000 people (and quadruple when the figure exceeds 500,000), which is something discussed in this video that was widely circulated after the Sandy Hook shooting. A more in-depth look at this factor can be found here. The US has far more such cities than other nations.






7. SPATIAL CONCENTRATION: Violent crime in the US is heavily concentrated in specific parts of the country, and taking this into account can reverse the apparent “link” between more guns in a given area and more crime. Once localized crime data is used (county/city level rather than state-wide), gun laws/ownership is inversely correlated with crime.




8. CRIME NETWORK EFFECTS: At the most detailed level of analysis, it appears that firearm crime is highly concentrated among fewer than 10% of individuals in the most violent cities of the US. In other words, who you know can matter more in predicting your likelihood of being a crime victim than what the laws in a given area are.






9. TIME-SERIES ANALYSIS: To sort out cause and effect (or at least try), you have to look at murder trends across time between different countries/states. If crime in other countries was lower to begin with (before gun bans), you can’t attribute low crime rates to the gun control itself. Right to carry is inversely correlated with crime by this measure.






10. THE START/END POINTS: Often when doing the step mentioned above, people might cherry-pick what the starting and ending points are in drawing a trend line. Picking a high-crime starting point and a low-crime ending point to make a gun ban look successful can be misleading. People on the pro-gun side make this mistake too.


11. ORDER OF CAUSATION: High crime can lead to people buying more guns, rather than more guns leading to more crime. This can be especially apparent at the individual level where victims of crimes become more motivated to acquire guns.





12. MORE OMITTED VARIABLES: Alright, so what else is there that could affect crime rates but has nothing to do with gun laws or rates of ownership? Honor cultures in the south (as well as the climate in southern states) might be significant factors. And there is some evidence that a third of the decrease in murders over the last few decades can be attributed to better ER practices in cities like LA (meaning that guns/permits may have played less of a role than some people have argued in reducing crime over that time frame). This can be inferred from the change in the survival rate of people who get shot. And the cocaine epidemic of the 90’s might be a factor as well (though this could be an argument that drug legalization would reduce crime more than gun control).
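To illustrate the start/end point problem from #10, here is a small sketch that computes the percent change in a rate for every possible pair of start and end years. The series is made up purely for illustration, and the function names are my own.

```python
from itertools import combinations


def percent_change(start_value, end_value):
    """Percent change from start_value to end_value."""
    if start_value == 0:
        raise ValueError("start_value must be non-zero")
    return (end_value - start_value) * 100 / start_value


def change_by_window(series):
    """Map every (start_year, end_year) pair to its percent change.

    `series` is a list of (year, rate) tuples sorted by year.
    """
    return {
        (year_a, year_b): percent_change(rate_a, rate_b)
        for (year_a, rate_a), (year_b, rate_b) in combinations(series, 2)
    }


# Made-up placeholder series: (year index, homicides per 100,000)
rates = [(1, 4.0), (2, 5.0), (3, 4.0), (4, 2.0)]

windows = change_by_window(rates)
for (start, end), change in sorted(windows.items()):
    print(f"year {start} -> year {end}: {change:+.1f}%")
```

With the same four data points, the window from year 2 to year 4 shows a steep decline, year 1 to year 3 shows no change at all, and year 1 to year 2 shows an increase. An article that reports only one window without explaining why it was chosen is a candidate for failing this check.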




Raw Links in Order of Appearance:


  1. https://www.washingtonpost.com/news/volokh-conspiracy/wp/2015/10/06/zero-correlation-between-state-homicide-rate-and-state-gun-laws/

  2. https://mises.org/blog/mistake-only-comparing-us-murder-rates-developed-countries

  3. https://www.mises.org/blog/guns-how-ny-times-manipulates-data

  4. http://crimeresearch.org/2014/03/comparing-murder-rates-across-countries/

  5. http://rboatright.blogspot.com/2013/03/comparing-england-or-uk-murder-rates.html

  6. http://www.npr.org/2015/03/30/395069137/open-cases-why-one-third-of-murders-in-america-go-unresolved

  7. https://blogs.wsj.com/law/2014/11/18/israel-to-ease-gun-laws-in-wake-of-deadly-attack-says-security-minister/

  8. https://www.youtube.com/watch?v=Ooa98FHuaU0

  9. http://blog.nycdatascience.com/student-works/pressure-cooker-higher-population-densities-increase-crime/

  10. http://crimeresearch.org/2017/04/number-murders-county-54-us-counties-2014-zero-murders-69-1-murder/

  11. http://www.sciencedirect.com/science/article/pii/S0277953614000987

  12. https://www.youtube.com/watch?v=fxGSoKF-m98

  13. http://abcnews.go.com/TheLaw/FedCrimes/story?id=6773423

  14. http://crimeresearch.org/2013/12/murder-and-homicide-rates-before-and-after-gun-bans/

  15. https://commons.wikimedia.org/wiki/File:Rtc.gif

  16. http://www.pewsocialtrends.org/2013/05/07/gun-homicide-rate-down-49-since-1993-peak-public-unaware/

  17. https://link.springer.com/article/10.1057/jphp.2009.26

  18. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=878132

  19. http://www.gallup.com/poll/199235/crime-victims-likely-own-guns.aspx

  20. https://www.psychologytoday.com/blog/the-human-beast/200904/is-southern-violence-due-culture-honor

  21. https://www.bjs.gov/content/pub/pdf/htius.pdf


The Psychology Today article you linked actually attempts to rebut the claims of the Nisbett study, so I would take a closer look at that.

Otherwise, this is a fantastic resource. Thanks for pulling it all together.


This is such an awesome collection, thank you!


This list does not take a side one way or another on the subject of “honor cultures” in the south. #12 is really meant to be a grab bag of possible things to look into further. The PsychToday piece seems like a good overview of the subject (in particular its weaknesses).

I doubt that variable (if it’s significant at all) is serious enough to consider in an early text classifier iteration.

That said, most of the things on this list are things any international comparison should at least mention as things to consider. To get an idea of what a garbage article looks like, Vox has us covered:



I don’t mean to bash Vox as a whole here, I’m just using this as a relevant example. Out of the dozen or so things listed in the original post above, Vox only took the first issue (population differences) into consideration.

There’s a brief mention of how Australia was already less violent prior to their major restrictions being passed (so it partially mentions the need for time series analysis), but naturally this is buried in a single paragraph that most readers/people who pass the link along would overlook:

It’s difficult to know for sure how much of the drop in homicides and suicides was caused specifically by the gun buyback program and other legal changes. Australia’s gun deaths, for one, were already declining before the law passed.

As Dylan Matthews noted for Vox, the drop in homicides wasn’t statistically significant because Australia has a pretty low number of murders already.

Let all this sink in. That Vox article is one of the most shared pieces on the subject of international gun laws/deaths and they barely make an effort to do just a few things on that list above.

Again, binary text classification alone would be enough to check for these omissions, we just need a way to gather training data…
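One possible way to bootstrap that training data, sketched below, is to pre-label articles with crude keyword patterns and then correct the labels by hand. This is my own assumption about how to get started, not something settled; the pattern names and regular expressions are hypothetical and cover only the first two criteria.

```python
import re

# Hypothetical keyword patterns, one per checklist criterion.
# These are rough first guesses and will produce false positives/negatives.
CRITERION_PATTERNS = {
    "population_differences": re.compile(
        r"\bper[ -]capita\b|\bper 100,?000\b", re.IGNORECASE
    ),
    "substitution_effect": re.compile(
        r"\b(?:total|overall) (?:homicides?|murders?)\b", re.IGNORECASE
    ),
}


def weak_labels(text):
    """Return {criterion: 0 or 1} based on keyword matches in `text`."""
    return {
        name: int(bool(pattern.search(text)))
        for name, pattern in CRITERION_PATTERNS.items()
    }


sample = "The rate per 100,000 people fell, but total homicides did not."
labels = weak_labels(sample)
```

Labels produced this way are noisy, so they would only be a starting point for human review before training anything on them.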


This looks awesome. I’ve been out of town at a long-range shooting clinic, and am about to drive across the country with the family, but will try to take a look this week.

My Keybase account should still be working. I’ll set it up and PM you @Reduce-Incarceration


I’m currently doing some reading on web scraping and the preprocessing of PDFs, that’s the only missing step in the pipeline of what I have in mind.
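For the HTML half of that step, something like the following standard-library sketch might be enough to turn a saved article page into plain text for the classifier. It does not fetch pages or handle PDFs, and the class and function names are just placeholders I made up.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping script and style blocks."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>Rates fell.</p><script>var x = 1;</script><p>Per capita.</p>"
    "</body></html>"
)
article_text = html_to_text(page)
```

Real news pages also carry navigation menus, comment sections, and ads, so a dedicated extraction library would probably do a better job than this on anything beyond simple pages.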

Acquiring and wrangling data always takes the most time of any machine learning project so I guess I’m not surprised.

FYI, I’m not seeking to dismiss news articles or academic studies on the basis of their findings but rather on their methods. A sound analysis needs to demonstrate that there’s an association between two variables, that the association isn’t the result of confounding variables, and that the order of causation truly involves changes in gun ownership/laws coming before changes in crime rates.