European research about using Artificial Intelligence and Interoperability for Solving Challenges of OSINT and Cross-Border Investigations
By Amr el Rahwan, Police Colonel Engineer (Ret.), Global Security Expert.
On the 22nd of November 2022, the European Law Enforcement Research Bulletin of the European Union Agency for Law Enforcement Training (CEPOL) published the practical research paper titled “Artificial Intelligence and Interoperability for Solving Challenges of OSINT and Cross-Border Investigations” conducted and written by the former Police Colonel Engineer Amr el Rahwan.
The paper describes a newly proposed Person-Centric OSINT approach using Artificial Intelligence (AI) and interoperability to solve the challenges that emerge during investigations, such as multiple identities, identity fraud, the exchange of Cross-Border information, and the complexity of OSINT investigations.
As described by el Rahwan, the war between Russia and Ukraine created the essential need for exchanging Cross-Border information and for preventing, detecting, and investigating terrorism and serious crime across Europe and the neighboring countries. The complexity of OSINT is challenging because only officers with strong information technology skills and backgrounds can obtain optimum results from OSINT investigations for revealing the identity of target persons of interest, such as criminals, imposters, terrorists, and spies.
Furthermore, the paper highlights the relevant technologies that could be used for solving the mentioned challenges, especially using interoperability and pre-trained Artificial Intelligence algorithms.
Moreover, understanding the existing technology limitations is essential for obtaining good results and recommending the best practice for achieving optimal results. Importantly, the newly introduced Person-Centric OSINT approach will allow detectives and investigators with basic IT skills to achieve good results in identifying suspects and victims of terrorism and serious crimes without being overwhelmed with learning advanced IT or OSINT.
Finally, the paper describes the training required for law enforcement officers in each Member State. It concludes by recommending training for compliance with EU interoperability standards; support for purchasing and implementing AI, interoperability, and a Single Search Interface; capacity building for technical, functional, and operational officers; and essential AI training on Facial Recognition and Person-Centric OSINT for Cross-Border investigations.
Although the eu-LISA will implement the interoperability framework in 2023, new challenges will emerge, such as investigating multiple-identity and identity frauds due to the different formats and structures of data, low quality of biographic and biometric data, and low accuracy of matching algorithms. For example, when the border authorities receive the Advance Passenger Information (API) and the Passenger Name Record (PNR) of air and sea passengers, it is difficult to exchange and match the identity of one passenger with his/her records stored in the EES, ETIAS, SIS, and VIS due to lack of interoperability and different data structures and formats.
It is important to mention that the central EU systems have gaps in covering all the persons of interest living in or travelling to the Member States of the European Union. The gaps can be summarised as three types of persons of interest: short-stay visa-exempt third-country nationals (TCNs), permanent foreign residents, and EU citizens. The eu-LISA will implement the ETIAS system and units to close the gap for visa-exempt TCNs. However, none of the existing or newly established central European information systems will close the gap for permanent TCN residents and EU citizens. Each Member State is responsible for closing that gap by creating national systems and achieving interoperability between the national and central information systems as per the EU regulations for interoperability.
Fraudulent actions and wrong matches are further issues caused by the lack of interoperability and the low accuracy of some biometric modalities. For example, the fingerprints of a third-country national could be enrolled in the VIS system with specific identity information, while the fingerprints of the same third-country national might be enrolled in the EURODAC system using different identity information. A second example is that different facial images of the same third-country national could be enrolled in the VIS and EURODAC systems. When a facial query is submitted to both systems, the results could be two lists of candidates, instead of one “hit/no hit” from each system, due to the low quality of facial images and the low accuracy of facial recognition algorithms.
Cross-Border information exchange is required when revealing the identity of a suspect or victim depends on identity information or criminal information that resides in a foreign country outside the borders of European countries. Furthermore, immigration authorities need to exchange cross-border information for the identification and security clearance of TCN asylum seekers and travellers. Cross-Border investigations are challenging because there is no proper technical solution for exchanging cross-border information, and the officers in the EU countries do not have access to the cross-border databases and information systems. The third case in the paper's cases section simulates the challenge and the solution in a hypothetical cross-border investigation scenario.
Using OSINT tools and methods is challenging because they involve various information technology elements such as domains, websites, protocols, headers, codes, scripts, IP addresses, certificates, hashes, and usernames. Strong IT skills are required to obtain optimum results in revealing the identities of suspects or victims related to terrorism or serious crime.
Furthermore, officers do not get optimum results from the OSINT tools unless they understand the tools’ mechanisms, accuracy, and demographics. In many cases, they also may not differentiate between image recognition and facial recognition.
Finally, the different encounters of the same identity are not linked across the different data sources, creating multiple-identity and fraudulent identity challenges due to lack of interoperability and the variations of names and languages.
Artificial Intelligence technology and interoperability are keys to solving the multiple-identity and fraudulent identity issues.
Named Entity Recognition, or NER, will help the investigators understand and target the information useful for investigations, such as names, jobs, and addresses while decreasing the focus on the less useful or less relevant information.
Natural Language Processing (NLP) AI algorithms can be used for biographic matching across multiple information systems. These algorithms can be trained to link between the different name variations of similar identities. They can detect identity fraud when the same person’s fingerprints are enrolled in two systems or more under different identities.
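The fraud check described above can be illustrated with a minimal Python sketch. It assumes that a biometric matcher has already assigned the same identifier to fingerprints belonging to the same person; the Enrolment class, the fingerprint_id field, and all sample records are hypothetical and are not taken from the paper or from any EU system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrolment:
    system: str          # source system label, e.g. "VIS" or "EURODAC"
    fingerprint_id: str  # hypothetical ID shared by matching fingerprints
    full_name: str
    date_of_birth: str   # ISO 8601 date string


def find_identity_conflicts(records):
    """Return fingerprint IDs that are enrolled under more than one identity."""
    by_fingerprint = defaultdict(list)
    for record in records:
        by_fingerprint[record.fingerprint_id].append(record)

    conflicts = {}
    for fingerprint_id, group in by_fingerprint.items():
        # Compare names case-insensitively so trivial variations do not count.
        identities = {(r.full_name.casefold(), r.date_of_birth) for r in group}
        if len(identities) > 1:
            conflicts[fingerprint_id] = group
    return conflicts


# Invented sample data, used only to exercise the function.
records = [
    Enrolment("VIS", "FP-001", "Alexander Petrov", "1985-03-14"),
    Enrolment("EURODAC", "FP-001", "Sasha Ivanov", "1987-01-02"),
    Enrolment("VIS", "FP-002", "Maria Rossi", "1990-07-21"),
    Enrolment("SIS", "FP-002", "maria rossi", "1990-07-21"),
]

for fingerprint_id, group in find_identity_conflicts(records).items():
    print(fingerprint_id, [(r.system, r.full_name) for r in group])
```

In this sketch only exact differences in name or date of birth raise a conflict; a real deployment would more plausibly combine this grouping with fuzzy biographic matching so that spelling variants of one name are not reported as fraud.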
Name matching is essential for linking or unlinking identities. Yet, understanding names is challenging because the same name is written and pronounced differently across different languages, and the name may have variations due to regional and cultural effects. Furthermore, there are no clear rules for defining nicknames, and a nickname may sound very far from the original name, such as Sasha, a nickname for Alexander. Understanding name variations across different languages and using AI algorithms for fuzzy biographic matching will improve investigation results and help solve the problems of multiple identities and fraud.
In the Arabic language, for example, it is easy for Arabic speakers to identify persons of interest with Arabic names. Still, it is challenging for non-Arabic speakers because Arabic names have many variations when transliterated into other languages. Another challenge is that many Arabic letters don’t have any phonetic equivalent in Latin-based languages.
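As a rough illustration of fuzzy biographic matching, the following Python sketch compares names with the standard library's difflib.SequenceMatcher after stripping case and diacritics. This is a simple string-similarity stand-in, not the trained AI algorithms the paper refers to; the function names and the 0.7 threshold are assumptions chosen for the example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(name):
    """Case-fold, strip diacritics, and keep only letters and single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c for c in without_marks.casefold() if c.isalpha() or c.isspace())
    return " ".join(kept.split())


def name_similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def is_possible_match(a, b, threshold=0.7):
    """Flag a pair for human review; the threshold is an arbitrary example value."""
    return name_similarity(a, b) >= threshold


pairs = [
    ("Mohamed", "Mohammed"),   # spelling variants of one transliterated name
    ("Mohammed", "Muhammad"),  # vowel variation between transliterations
    ("Sasha", "Alexander"),    # nickname with little spelling overlap
]
for a, b in pairs:
    print(a, b, round(name_similarity(a, b), 2), is_possible_match(a, b))
```

The transliteration variants score high, while the Sasha and Alexander pair scores low. This shows why nicknames need a curated lookup or a trained model rather than spelling similarity alone.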
The newly proposed Person-Centric OSINT builds the missing bridge between OSINT and biometrics, especially facial recognition. The Person-Centric OSINT approach uses open-source data to investigate persons of interest and assemble their identity footprints in order to reveal their identities on the internet. The searches will be limited to a biometric search using a facial image and a biographic search using first name and family name, email address, or telephone number.
Searches will always have a single starting point in the Person-Centric approach, either starting with a facial search or a biographic search. The elements of the results will be submitted for successive iterations of searches until the identity is revealed or more information is gained.
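The iterative search described above can be sketched in Python as a breadth-first expansion from one starting selector. The sketch rests on several assumptions: the person_centric_search function, the (kind, value) selector pairs, the max_rounds limit, and the toy index are all hypothetical, and the search callable merely stands in for whatever facial or biographic search tools an investigator actually uses.

```python
from collections import deque


def person_centric_search(seed, search, max_rounds=3):
    """Expand one seed selector into a set of linked selectors.

    seed is a (kind, value) pair such as ("email", "x@example.org").
    search is a caller-supplied function that returns the (kind, value)
    pairs found for a selector.
    """
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        selector, depth = frontier.popleft()
        if depth >= max_rounds:
            continue  # stop expanding once the round limit is reached
        for found in search(selector):
            if found not in seen:  # avoid re-searching known elements
                seen.add(found)
                frontier.append((found, depth + 1))
    return seen


# Toy in-memory "open sources", invented only to exercise the loop.
TOY_INDEX = {
    ("face", "image-001"): [("name", "Jane Doe")],
    ("name", "Jane Doe"): [("email", "jane.doe@example.org")],
    ("email", "jane.doe@example.org"): [
        ("phone", "+00 000 0000"),
        ("name", "Jane Doe"),
    ],
}


def toy_search(selector):
    return TOY_INDEX.get(selector, [])


footprint = person_centric_search(("face", "image-001"), toy_search)
print(sorted(footprint))
```

Tracking the selectors already seen keeps the iterations from looping when two sources point at each other, and the round limit bounds the work when no definitive identity emerges.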
Providing high-quality training for law enforcement officers is an essential step for solving the investigation challenges. Importantly, the training programs should contain Artificial Intelligence mechanisms, limitations, and demographics, and it is recommended to cover the proposed Person-Centric OSINT approach.
Moreover, the training programs for each EU and non-EU Member State are recommended to include the following: training for compliance with the EU interoperability regulations and standards and the new EU systems such as the EES, ETIAS, and ESP; support for purchasing and implementing Artificial Intelligence, interoperability, and the “Single Search Interface” (SSI); capacity building for the technical, functional, and operational officers of the border security and law enforcement agencies; and training on facial recognition, facial OSINT, and Person-Centric OSINT for cross-border investigations.
Finally, the training tools should include mock trials and criminal case simulation, and the training syllabuses should cover the use of modern technologies and digital skills for solving the challenges of multiple identities, fraud, and cross-border investigation. The image below depicts the recommendations.