The Mystery of AI Gunshot-Detection Accuracy Is Finally Unraveling

How accurate are gunshot detection systems, really? For years, it's been a secret, but new reports from San Jose and NYC show these systems have operated well below their advertised accuracy rates.
Bullet fired from a handgun
Photograph: moodboard/Getty Images

Liz González’s neighborhood in East San Jose can be loud. Some of her neighbors apparently want the whole block to hear their cars, others like to light fireworks for every occasion, and occasionally there are gunshots.

In February 2023, San Jose began piloting AI-powered gunshot detection technology from the company Flock Safety in several sections of the city, including González’s neighborhood. During the first four months of the pilot, Flock’s gunshot detection system alerted police to 123 shooting incidents. But new data released by San Jose’s Digital Privacy Office shows that only 50 percent of those alerts were actually confirmed to be gunfire, while 34 percent were confirmed false positives, meaning the Flock Safety system incorrectly identified other sounds—such as fireworks, construction, or cars backfiring—as shooting incidents. The remaining 16 percent could not be confirmed either way. After Flock recalibrated its sensors in July 2023, 81 percent of alerts were confirmed gunshots, 7 percent were false alarms, and 12 percent could not be determined one way or the other.

For two decades, cities around the country have used automated gunshot detection systems to quickly dispatch police to the scenes of suspected shootings. But reliable data about the accuracy of the systems and how frequently they raise false alarms has been difficult, if not impossible, for the public to find. San Jose, which has taken a leading role in defining responsible government use of AI systems, appears to be the only city that requires its police department to disclose accuracy data for its gunshot detection system. The report it released on May 31 marks the first time it has published that information.

The false-positive rate is of particular concern in communities of color, where some residents fear that gunshot detection systems are unnecessarily sending police into neighborhoods expecting gunfire. Nonwhite Americans are more often subjected to surveillance by the systems and are disproportionately killed in interactions with police. “For us, any interaction with police is a potentially dangerous one,” says González, an organizer with Silicon Valley De-Bug, a community advocacy group based in San Jose.

San Jose did not attempt to quantify how many shooting incidents in the covered area the Flock system failed to detect, also known as the false-negative rate. However, the report says that “it is clear the system is not detecting all gunshots the department would expect.”

Flock Safety says its Raven gunshot detection system is 90 percent accurate. SoundThinking, whose ShotSpotter system is the most popular gunshot detection technology on the market, claims a 97 percent accuracy rate. But the data from San Jose and a handful of other communities that have used the technologies suggest the systems—which use computer algorithms, and in SoundThinking’s case human reviewers, to determine whether the sounds captured by their sensors are gunshots—may be less reliable than advertised.

Last year, journalists with CU-CitizensAccess obtained data from Champaign, Illinois, showing that only 8 percent of the 64 alerts generated by the city’s Raven system over a six-month period could be confirmed as gunfire. In 2021, the Chicago Office of Inspector General reported that, over a 17-month period, only 9 percent of the 41,830 ShotSpotter alerts that received a police disposition could be connected to evidence of a gun-related crime. SoundThinking has criticized the Chicago OIG report, saying it relied on “incomplete and irreconcilable data.”

This week, New York City’s comptroller published a similar audit of the city’s ShotSpotter system showing that only 13 percent of the alerts the system generated over an eight-month period could be confirmed as gunfire. The auditors noted that while the NYPD has the information necessary to publish data about ShotSpotter’s accuracy, it does not do so. They described the department’s accountability measures as “inadequate” and “not sufficient to demonstrate the effectiveness of the tool.”

Champaign and Chicago have since canceled their contracts with Flock Safety and SoundThinking, respectively.

“Raven is over 90 percent accurate at detecting gunshots with around the same accuracy percentage at detecting fireworks,” Josh Thomas, Flock Safety senior vice president of policy and communications, tells WIRED in a statement. He adds that this is “based on measurements across the board,” meaning data from all of Flock’s customers. “And critically, Raven alerts officers to gun violence incidents they never would have been aware of. In the San Jose report, for example, of the 111 true positive gunshot alerts, SJPD states that only 6 percent were called in to 911.”

Eric Piza, a professor of criminology at Northeastern University, has conducted some of the most thorough studies available on gunshot detection systems. In a recent study of shooting incidents in Chicago and Kansas City, Missouri, his team’s analysis showed that police responded faster to shooting incidents, stopped their vehicles closer to the scene of shootings, and collected more ballistic evidence when responding to automated gunshot alerts compared to 911 calls. However, there was no reduction in gun-related crimes, and police were no more likely to solve gun crimes in areas with gunshot sensors than in areas without them. That study only examined confirmed shootings; it did not include false-positive incidents where the systems incorrectly identified gunfire.

In another study in Kansas City, Piza found that shots-fired reports in areas with gunshot sensors were 15 percent more likely to be classified as unfounded compared to shots-fired reports in areas without the systems, where police would have relied on calls to 911 and other reporting methods.

“If you look at the different goals of the system, research shows that [gunshot detection technology] typically tends to result in quicker police response times,” Piza says. “But research consistently has shown that gun violence victimization doesn’t reduce after gunshot detection technology has been introduced.”

The New York City comptroller recommended the NYPD not renew its current $22 million contract with SoundThinking without first conducting a more thorough performance evaluation. In its response to the audit, the NYPD wrote that “non-renewal of ShotSpotter services may endanger the public.”

In its report, San Jose’s Digital Privacy Office recommended that the police department continue looking for ways to improve accuracy if it intends to keep using the Raven system.

Pointing to the report’s finding that only 6 percent of the confirmed gunshots detected by the system were reported to police via 911 calls or other means, police spokesperson Sergeant Jorge Garibay tells WIRED the SJPD will continue to use the technology. “The system is still proving useful in providing supplementary evidence for various violent gun crimes,” he says. “The hope is to solve more crime and increase apprehension efforts desirably leading to a reduction in gun violence.”

The department’s media relations division could not immediately locate information about how much SJPD pays Flock Safety for the Raven system or how long its contract is for.

Darcie Green, a community health advocate and member of San Jose’s former Reimagining Public Safety Community Advisory Group, says that disclosing data about the system’s accuracy is a good step. But she says the city also needs to examine whether the technology is actually making the city safer, or whether the money and human resources devoted to responding to real and false gunshot alerts could be better invested elsewhere.

“Obviously we want to reduce gun violence, that’s everybody’s goal,” she says. “For these dollars that we’re spending on enforcement and punishment, we should be spending three-, four-, fivefold on programming and improvement. There are so many people for whom calling the police is not a solution.”

González, the East San Jose community organizer, says the lack of 911 calls about shootings reflects how little she and others in the community trust police responses in their neighborhood.

“There was a time when a helicopter was searching for somebody in our neighborhood, and I could swear that they were in my backyard, but I didn’t want to call the police because potentially they could shoot me or my family mistaking me for that person,” she says. “I would rather have the person in my backyard than the police looking for them.”

Updated 6/25/2024, 3:55 pm ET: Added additional detail about how Flock Safety calculates its accuracy levels.