Improving name matching in high-stakes, high-velocity scenarios

17th April 2023 NeilWalker

Until recently, when envisioning Antwerp and Rotterdam, people thought of art nouveau architecture, wide boulevards, and tulip gardens.

But drug trafficking has transformed these Belgian and Dutch port cities. Antwerp and Rotterdam have morphed into European gateways for the Latin American cocaine trade. In Antwerp, police seized more than 110 tons of cocaine in 2022, up more than 100 percent from 2018. In Rotterdam, more than 50 tons of cocaine were confiscated in 2022 alone. Officials believe that drugs seized represent less than 10 percent of the total trafficked.

The drugs wrought violence. Antwerp has suffered dozens of grenade attacks. Belgium’s justice minister now lives in a safehouse. Murders have plagued the Netherlands.

To combat the flow of drugs and other illicit cargo — including weapons and trafficked humans — border security agencies in Europe are following the lead of similar agencies in the United States and the United Kingdom. They’re implementing artificial intelligence (AI) technologies for name matching and entity recognition. These capabilities help officials halt the flow of contraband — often before it leaves the docks.

The lay of the sea
Belgium and the Netherlands are small countries with limited arable land. It’s no surprise, therefore, that food is a major import. Illegal drugs are often hidden in consignments of fruit.
But where?

Antwerp and Rotterdam are both major ports. At the Port of Antwerp (the second largest in Europe) more than 231 million tons of cargo are loaded or unloaded each year. How can border officials determine which container ship has cocaine hidden among its avocados?

By examining data.
Required onboard paperwork lists each ship’s point of origin and destination; the name of the export company using it; that company’s address; a cargo manifest; and other identifying documents. The truck picking up the cargo carries vehicle registration and a driver’s license.

Effectively examining and enhancing this information, then comparing it against watchlists and aligning it with existing intelligence, can aid border officials in spotting and stopping drug traffickers and other criminals. AI capabilities, including name matching and named entity recognition (NER), help in this effort.

But too often, border security organizations lack these capabilities.

The limitations of search platforms
Currently, many border security organizations use full-text search engines for name matching and entity recognition. These search engines may be powerful and well-respected, but they were not built for these tasks.

When used for name matching, search engine capabilities lie somewhere between match/no match determinations and fuzzy matching. Older systems rely on complex rule sets to determine whether one name matches another, but they can’t account for every possible name variation.

Fuzzy matching — a computing approach that considers degrees of truth — is an improvement over exact name matching systems in several ways. By taking into account real-world issues such as typos, misspellings, alternate spellings, and disordered data components, it is much more likely to accurately match names across two or more datasets. But fuzzy matching still can’t analyze the many ways that names vary across languages, scripts, and cultures. Missing potential matches is simply not good enough for border security. That’s why AI-powered name matching is needed.
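To make the contrast concrete, here is a minimal sketch of conventional fuzzy matching using only Python's standard library. It is an illustration under stated assumptions, not the method of any particular search engine or vendor: the helper names and the 0.85 threshold are hypothetical. It tolerates typos, accents, and reordered name parts, but it has no notion of how names translate across languages.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and sort tokens so word order does not matter."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = without_accents.lower().replace(",", " ").split()
    return " ".join(sorted(tokens))


def fuzzy_score(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """The threshold is an assumed value; real systems tune it on labeled data."""
    return fuzzy_score(a, b) >= threshold


# A typo and a reordered, unaccented record both clear the threshold.
typo_found = is_match("Jahn Herrera", "Juan Herrera")
reordered_found = is_match("Juan Andrés Herrera", "Herrera, Juan Andres")

# A cross-language rendering of the same name does not.
translation_found = is_match("Juan Herrera", "John Smith")
```

The last comparison is the gap described above: character similarity alone cannot tell that two differently spelled names may be renderings of the same name in different languages.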

The value of AI name matching technologies
Imagine you’re screening for a known bad actor named Juan Andrés Herrera. Search platforms may only find instances of his full name, spelled correctly, in Spanish. AI-powered name matching technologies, meanwhile, can return instances of John Andrew Smith (Juan’s name in English) and Jean André Favre (French). They can find instances of his name rendered in non-Latin scripts.

They can detect aliases or nicknames such as Juju Herrera, and misspellings such as Jahn Herrera.
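As a rough illustration of the idea, the sketch below maps name variants to a shared canonical form before comparing. The variant table is hand-written and hypothetical, covering only the examples in this article. Production systems learn these relationships from linguistic data and also handle non-Latin scripts, which this sketch does not attempt.

```python
import unicodedata

# Hypothetical, hand-written equivalence classes for illustration only.
VARIANTS = {
    "juan": {"juan", "john", "jean", "juju"},
    "andres": {"andres", "andrew", "andre"},
    "herrera": {"herrera", "smith", "favre"},
}

# Invert the table: every variant points at its canonical form.
CANONICAL = {
    variant: canon for canon, variants in VARIANTS.items() for variant in variants
}


def strip_accents(text: str) -> str:
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def canonical_tokens(name: str) -> list[str]:
    """Replace each name part with its canonical form when one is known."""
    tokens = strip_accents(name).lower().split()
    return [CANONICAL.get(token, token) for token in tokens]


def same_name(a: str, b: str) -> bool:
    """True when every part of the shorter name appears in the longer one."""
    shorter, longer = sorted((canonical_tokens(a), canonical_tokens(b)), key=len)
    return bool(shorter) and set(shorter) <= set(longer)


english = same_name("Juan Andrés Herrera", "John Andrew Smith")
french = same_name("Juan Andrés Herrera", "Jean André Favre")
nickname = same_name("Juan Andrés Herrera", "Juju Herrera")
unrelated = same_name("Juan Andrés Herrera", "Maria Herrera")
```

A lookup table like this cannot scale to real watchlists, which is why the article's point about trained models matters. The sketch only shows what "matching across languages and nicknames" means mechanically.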

AI name matching technologies also outperform search engines and older name matching approaches when matching addresses and corporate names.

Corporate names present a particular challenge. Names might differ by a synonym such as “PennyLuck Drugs” and “PennyLuck Pharmaceuticals,” or regularly use initialisms. Name matching technologies that handle organizational names well can spot these. You may regularly dine at your favorite steak house, “WBM,” and pick up your prescriptions at a chain that calls itself “VFP.” Name matching can link these initialisms to the companies’ official names, “World’s Best Meats LLC” and “Very Fine Pharmaceuticals, Inc.”
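The two corporate-name cases above can be sketched in a few lines. This is a simplified illustration, not a description of any product: the legal-suffix list and the one-entry synonym table are assumptions made for the example.

```python
import re

# Assumed, incomplete list of legal-form suffixes to ignore when comparing.
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "corp", "co", "plc", "gmbh", "bv", "nv"}

# Hypothetical synonym table; a real system would use a much larger resource.
SYNONYMS = {"drugs": "pharmaceuticals"}


def core_tokens(company: str) -> list[str]:
    """Lowercase, drop possessives and punctuation, and remove legal suffixes."""
    cleaned = re.sub(r"[’']s\b", "", company.lower())
    tokens = re.findall(r"[a-z0-9]+", cleaned)
    return [token for token in tokens if token not in LEGAL_SUFFIXES]


def initialism(company: str) -> str:
    """First letter of each core word, e.g. for linking a short trading name."""
    return "".join(token[0] for token in core_tokens(company)).upper()


def matches_initialism(short_name: str, official_name: str) -> bool:
    return short_name.replace(".", "").upper() == initialism(official_name)


def canonical_company(company: str) -> list[str]:
    """Core tokens with known synonyms mapped to one preferred term."""
    return [SYNONYMS.get(token, token) for token in core_tokens(company)]


steakhouse = matches_initialism("WBM", "World’s Best Meats LLC")
pharmacy = matches_initialism("VFP", "Very Fine Pharmaceuticals, Inc.")
same_chain = canonical_company("PennyLuck Drugs") == canonical_company(
    "PennyLuck Pharmaceuticals"
)
```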

Linking entities to real-world identities
Named entity recognition (NER) identifies the people, places, and organizations mentioned in text. Entity linking capabilities built on it find records in different data sets that refer to the same entity, then connect those records to the real-world person, place, or organization, taking into account the context in which the entity is mentioned.

But there are a lot of Juan Andrés Herreras in the world. How do you avoid needlessly consuming investigators’ time with a barrage of info on all the Juan Andrés Herreras you don’t care about?

NER capabilities automatically narrow the field, presenting investigators with the information relevant to the “Juan Andrés Herrera” they’re querying. They automatically reject mentions of Juan Andrés Herrera, a public relations executive living in New York City. They reject obituaries written about Juan Andrés Herrera. They reject news stories about Juan Andrés Herrera winning a mathematics award at his 8th grade graduation in 2022. Concurrently, NER software flags reports of a Juan Andres Herrera, 32, of Bogota twice arrested for drug trafficking.
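One simple way to picture this filtering is to score each mention by how much its surrounding context overlaps with what is already known about the person of interest. The sketch below assumes the context has already been reduced to a set of keywords, which in practice is the hard part and is done by trained models. The `Mention` class, the keyword sets, and the 0.3 threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    name: str
    context: set[str] = field(default_factory=set)


def relevance(mention: Mention, target_context: set[str]) -> float:
    """Jaccard overlap between a mention's context and the target profile."""
    if not mention.context or not target_context:
        return 0.0
    shared = mention.context & target_context
    combined = mention.context | target_context
    return len(shared) / len(combined)


def triage(mentions: list[Mention], target_context: set[str], threshold: float = 0.3):
    """Keep mentions at or above the assumed threshold, most relevant first."""
    scored = [(relevance(m, target_context), m) for m in mentions]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return [m for _, m in sorted(kept, key=lambda pair: pair[0], reverse=True)]


# What investigators already know about the person they are querying.
target = {"bogota", "drug trafficking", "arrested"}

mentions = [
    Mention("Juan Andrés Herrera", {"public relations", "new york city"}),
    Mention("Juan Andrés Herrera", {"obituary"}),
    Mention("Juan Andrés Herrera", {"mathematics award", "graduation"}),
    Mention("Juan Andres Herrera", {"bogota", "drug trafficking", "arrested"}),
]

flagged = triage(mentions, target)
```

With this toy data, only the arrest report survives the filter; the other three mentions share no context with the target profile.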

Faster data triage with named entity recognition
In 2021, Belgian police were able to break the encryption used by Sky ECC, a secure messaging app widely used by drug traffickers. Roughly 164,000 drug traffickers around Antwerp were using Sky ECC to exchange 1.5 million messages daily. Cracking Sky ECC gave authorities access to 1 billion messages. But Belgian authorities simply did not have the manpower to analyze them all in a timely manner. The Belgian federal prosecutor estimated that it would take the police 685 years to read every message.

To analyze all those chats in a day, the Belgian police could have leveraged NER to simultaneously extract the people, places, and organizations mentioned. Instead of searching for specific names, which they wouldn’t have known in the first place, the authorities could review a list of extracted names and prioritize which messages to review first.
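The triage workflow described here, extract first and then decide what to read, can be sketched as an index from extracted names to the messages that mention them. The extractor below is a deliberately crude stand-in (a regular expression for runs of capitalized words) so the example stays self-contained; it is not how real NER works, and the sample messages are invented.

```python
import re
from collections import defaultdict

# Toy stand-in for a trained NER model: two or more consecutive capitalized
# words. It misses single-word, accented, and lowercase names.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_entities(message: str) -> list[str]:
    return NAME_PATTERN.findall(message)


def build_index(messages: list[str]) -> dict[str, list[int]]:
    """Map each extracted name to the positions of the messages that mention it."""
    index = defaultdict(list)
    for position, message in enumerate(messages):
        for entity in set(extract_entities(message)):
            index[entity].append(position)
    return dict(index)


def prioritize(index: dict[str, list[int]]) -> list[str]:
    """Names mentioned in the most messages come first; ties sort alphabetically."""
    return sorted(index, key=lambda entity: (-len(index[entity]), entity))


# Invented sample messages for illustration.
messages = [
    "container arrives friday, Juan Herrera will collect",
    "tell Juan Herrera the fruit ships via Very Fine Pharmaceuticals",
    "see you at the match tonight",
]

index = build_index(messages)
review_order = prioritize(index)
```

Reviewers would start from `review_order` and jump straight to the indexed messages, never opening chats such as the third one that mention no entities at all.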

Adopting AI name matching for border security
AI-powered name matching technology is currently in use to check the names of millions of people crossing international borders. By using advanced name matching technology, border security agencies can improve the accuracy of verification operations, reduce the number of false positives that require extra screening, and enhance overall border security by reducing the risk that a true match is missed.

By Declan Trezise, Director of Global Solutions Engineering, Babel Street