Two years ago, the Spanish national police introduced a tool named Veripol in police stations to help detect false complaints, such as a person declaring a robbery that never happened. It is the first time such a tool has been used in Spain, and probably worldwide.
Veripol is a computer program that scans complaints for robbery, pickpocketing and purse snatching. It assesses the likelihood that the complaints are not true. When it was introduced in 2018, it was meant to help police investigators decide on a report’s veracity. The main goal is to save time and prevent insurance fraud – for example, when someone falsely reports the theft of a mobile phone.
According to data the Ministry of Interior provided to AlgorithmWatch, Veripol has been used on approximately 84,000 complaints since its launch in October 2018.
Simulating a crime is itself a crime in Spain. It covers people who claim, for example, to have been victims of a theft without accusing a specific person of it. In 2019, the only complete year for which data is available, 49,702 complaints were processed by Veripol, 2,338 of which resulted in a case for simulation of crime. The classification of a complaint as a simulation is not based on Veripol’s result alone.
A source at the Ministry of Interior told AlgorithmWatch that Veripol flags about one in 20 complaints as simulations of crime. Overall, the Ministry said that about a third of all complaints in these crime categories were processed by Veripol.
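As a quick check, the 2019 figures can be set against the Ministry's "one in 20" estimate. The sketch below uses only the two numbers quoted above; the variable names are illustrative. The two figures do not measure quite the same thing, since a case for simulation of crime is not based on Veripol's result alone, but they are of the same order.

```python
# Figures reported for 2019, the only complete year with data.
processed_2019 = 49_702        # complaints processed by Veripol
simulation_cases_2019 = 2_338  # of which resulted in a case for simulation of crime

share = simulation_cases_2019 / processed_2019
one_in_n = processed_2019 / simulation_cases_2019

print(f"{share:.1%} of processed complaints, or about one in {one_in_n:.0f}")
```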
The Ministry also said that, since Veripol was introduced, the number of complaints for robbery and specific types of theft had decreased, and that fewer cases of simulation of crime had been detected. The data on registered crimes from the National Statistics Institute only runs until 2019, but tends to confirm the Ministry’s view that the number of simulations decreased. However, overall crime numbers have remained constant while the use of Veripol has decreased, which could indicate that Veripol is becoming less and less popular among police officers.
Not so widespread
Veripol is available in about 240 police stations operated by the National Police (Catalonia and the Basque Country have their own police forces, and policing duties are shared with the Civil Guard). For now, Veripol is only used by the National Police, but sources from the Ministry have confirmed plans to use it for a broader range of crimes and to introduce it to the Civil Guard’s systems.
On paper, the algorithm is used by trained agents only. However, several interviews with police officers who are members of the association “Police for the 21st century”, an advocacy group that is at odds with the government on some issues, and of the Federal Police Union, reveal a different picture. Many agents have yet to receive proper training, and some of them find the system hardly accurate.
Many said that Veripol’s general use was quite limited. They consider it the type of tool that works for a “theoretical approach” but that does not integrate with their daily work at the police station. The first reason is the lack of training. The second, many officers said, is that the way reports are filed is at odds with how Veripol is meant to be used.
When a person files a complaint at a police station, a police officer transcribes the complainant’s words in his or her own terms. Even when the complaint is pre-recorded online, officers may change it when the complainant goes to the station to sign it.
“We never do a strict transcription of what the complainant says”, a police officer told AlgorithmWatch. “For example, we need to start most sentences with the word ‘that’. The public servant writes what he has understood from your story with his own words,” the officer added. Part of the training on Veripol consists of teaching officers how to write in a way the software can process.
A source from the Ministry of Interior said that Veripol was a complementary tool and that its use depended on the ability and expertise of agents. Further investigation must be done once the algorithm gives a result, and that requires human capacity, the source added. A complaint is only catalogued as false once the accuser admits it was untrue.
Trained by one agent
Although Veripol is not classified as secret, few details have been made public since its launch. The algorithm it is based on was developed by a former police officer who holds a doctorate in mathematics and by three Spanish researchers. They co-authored a scientific study published in the journal Knowledge-Based Systems. In it, they claim that Veripol wrongly labeled only 5 of 100 true complaints as false in the training data set, while mislabeling 11 out of 100 false complaints as true.
It was trained on a corpus of 1,122 robbery reports filed in Spain in 2015 consisting of 534 true reports and 588 false ones. All of them were anonymized.
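To give a sense of scale, the sketch below applies the two error rates the authors report to the corpus sizes. It assumes the rounded "5 of 100" and "11 of 100" figures apply to all 1,122 reports, which the article does not confirm, so the resulting counts are approximations rather than figures from the study.

```python
# Corpus sizes as reported in the article.
true_reports = 534
false_reports = 588

# Reported error rates (rounded figures; applying them to the whole
# corpus is an assumption made here for illustration).
true_labeled_false_rate = 0.05  # true complaints wrongly labeled as false
false_labeled_true_rate = 0.11  # false complaints wrongly labeled as true

true_mislabeled = true_reports * true_labeled_false_rate
false_mislabeled = false_reports * false_labeled_true_rate

total = true_reports + false_reports
accuracy = (total - true_mislabeled - false_mislabeled) / total

print(f"About {true_mislabeled:.0f} true reports labeled as false")
print(f"About {false_mislabeled:.0f} false reports labeled as true")
print(f"Implied overall accuracy: {accuracy:.1%}")
```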
For the labeling of true and false reports, an officer with “extensive experience in interrogation, lie detection and investigation” was involved. He reviewed and classified the 1,122 reports over a two-year period. The authors argue that this was the best possible methodology to build a corpus, since “the real ratio of false reports” could not be known.
Based on the official statistics at the time, only one in 25 robbery reports registered in Spain in 2015 was false. In a twist of questionable logic, the researchers claim that, because about four in five robbery cases remain unsolved, the real number of false complaints must be higher.
They base their estimate on the fact that “the most successful Police Department in clearing false robbery reports” had a “falsehood ratio” of 57% for all the reports they filed in 2015.
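The gap between these two figures matters for how a flag from Veripol should be read. The sketch below is a back-of-the-envelope calculation, not part of the study. It assumes that the error rates reported for the training data (11 of 100 false complaints missed, 5 of 100 true complaints flagged) would hold in everyday use, which is far from certain, and the function name is hypothetical. It then asks what share of flagged complaints would actually be false under each assumed ratio of false reports.

```python
def share_of_flags_that_are_false(prevalence, detection_rate=0.89, false_alarm_rate=0.05):
    """Share of flagged complaints that are really false (Bayes' rule).

    prevalence: assumed share of all complaints that are false.
    detection_rate: share of false complaints that get flagged (1 - 0.11).
    false_alarm_rate: share of true complaints that get flagged.
    """
    flagged_and_false = detection_rate * prevalence
    flagged_but_true = false_alarm_rate * (1 - prevalence)
    return flagged_and_false / (flagged_and_false + flagged_but_true)


scenarios = {
    "official statistics (1 in 25)": 1 / 25,
    "most successful department (57%)": 0.57,
}

for label, prevalence in scenarios.items():
    share = share_of_flags_that_are_false(prevalence)
    print(f"{label}: {share:.0%} of flagged complaints would be false")
```

Under these assumptions, fewer than half of the flagged complaints would be false if the official ratio were right, against more than nine in ten at the 57% ratio.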
A test for bias?
Reports submitted to Veripol are written by trained agents, which means that the text passes through a first filter before it is processed by the software. That, and the way Veripol was trained (a specialized agent defined reports as true or false), shows that the “parameters analyzed are not objective”, according to Sheila Queralt, a sworn legal-linguistic expert and head of SQ Forensic Language, a company based in Barcelona.
Ms Queralt said that the issue of false reports was already riddled with uncertainty. For her, Veripol simply turned the human uncertainty into machine uncertainty. “We don’t know how the software is manipulated since it does not work with the direct testimony of the complainant but with the transcription or summary that a police agent wrote”, she added.
Veripol works by identifying words and phrases that linguistic studies have shown to be identifiers of possible lies, Ms Queralt said. For example, using many adjectives or refusing to describe a scene are considered hints that a complainant is lying. “What matters most to me, apart from knowing to what extent it works, is how the corpus from which the algorithm is extracted was built”, she explained.
Indeed, specific words seem to carry much weight. One of the police officers interviewed for this article said that the word “pocketknife” could be enough for Veripol to classify a report as true.
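How a single word could tip the balance can be illustrated with a toy word-weighting scheme. This is not Veripol's model, whose details have not been made public. The weights, the threshold and the sample sentences below are all invented for illustration.

```python
import re

# Hypothetical weights, invented for this example only.
# Positive values push a report toward "true", negative toward "false".
WEIGHTS = {
    "pocketknife": 3.0,
    "insurance": -1.0,
    "phone": -0.5,
}


def score(report: str) -> float:
    """Sum the weights of all known words in the report."""
    words = re.findall(r"[a-z]+", report.lower())
    return sum(WEIGHTS.get(word, 0.0) for word in words)


def classify(report: str, threshold: float = 0.0) -> str:
    """Label a report "true" if its score exceeds the threshold."""
    return "true" if score(report) > threshold else "false"


without_knife = "That someone took his phone and that he needs the report for the insurance"
with_knife = without_knife + ", and that the man showed a pocketknife"

print(classify(without_knife))
print(classify(with_knife))
```

In a scheme like this, adding one heavily weighted word is enough to flip the label, whatever else the report says.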
Lie detectors have a long history of malfunctioning, although fraud detection through text analysis in a police department is something new, said Fieke Jansen, a researcher at the Data Justice Lab.
Ms Jansen said that there is a gender, generational, class and ethnic aspect to this matter as well. Unless the police ensure that the algorithm does not discriminate unfairly, the system could unconsciously look for specific words or sentence structures that are most often used by certain demographics.