Screening checks are a different challenge to normal text searches: your query is supposed to describe a person or company in some detail to allow the OpenSanctions API to check if that entity (or a similar one) is flagged.
OpenSanctions' API server is a powerful way to query and access the entities in our database. The simplest way to use the API is to sign up for an API key and start screening your counterparties (customers, suppliers, etc.) against international watchlists. Commercial license customers can also use an on-premise deployment to perform the same process within their own infrastructure.
The most basic way to do those bulk searches might be running simple text queries against the /search endpoint - but this will lead to imprecise and incomplete results. Instead, this guide will show you how to use the /match endpoint to get more precise screening results using query-by-example to do multi-attribute lookups. You can experiment with the /match endpoint via the advanced screening search.
Let's say, for example, that you have a customers dataset that specifies the name, birth date, nationality and perhaps a national ID number for each person you want to check.
The first step would then be to implement a piece of code that formats each of these entries to conform with the entity format used by OpenSanctions, assigning each of the columns in your source data to one of the fields specified in the data dictionary. (This, of course, works not just for people, but also for companies, vessels, even crypto wallets.)
Here's an example entity in JSON format:
{
  "schema": "Person",
  "properties": {
    "firstName": ["Arkadii"],
    "fatherName": ["Romanovich"],
    "lastName": ["Rotenberg", "Ротенберг"],
    "birthDate": ["1951"],
    "nationality": ["Russia"]
  }
}
A few things to note:
schema defines the type of entities to match this example against. Of course, the schema could also be Company, or a LegalEntity (a more general entity type that matches both people and companies!).
[…] ru) for the entity.
Generating this JSON form of your records should be a simple exercise. Do not worry too much about details like whether a country name should live in the country or jurisdiction properties: the matching happens by data type (in this case: country), not precise field name.
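As a rough sketch of that formatting step, the function below maps one customer record onto the entity format. The column names (first_name, last_name and so on) and the use of idNumber for the national ID are assumptions made for illustration; check the data dictionary for the properties that fit your own data.

```python
# Hypothetical mapping from source columns to Person properties.
# Adjust both sides to match your data and the data dictionary.
COLUMN_TO_PROPERTY = {
    "first_name": "firstName",
    "last_name": "lastName",
    "birth_date": "birthDate",
    "nationality": "nationality",
    "national_id": "idNumber",
}


def to_entity(row: dict) -> dict:
    """Convert one customer record into a query-by-example entity."""
    properties = {}
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = (row.get(column) or "").strip()
        # Skip empty cells so the query only contains known attributes.
        if value:
            # Property values are always given as lists of strings.
            properties[prop] = [value]
    return {"schema": "Person", "properties": properties}


customer = {
    "first_name": "Arkadii",
    "last_name": "Rotenberg",
    "birth_date": "1951",
    "nationality": "Russia",
    "national_id": "",
}
entity = to_entity(customer)
```

Leaving out empty values keeps the example entity limited to what you actually know about the customer.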
OpenSanctions combines watchlists from hundreds of different data sources - some are sanctions lists, others databases of national politicians, even entities involved in crime. For this introduction, we'll query the whole database by using the default collection endpoint. This will produce results across all available data categories. In order to reduce false alarm rates, you will need to apply specific query filters later.
Which data sources and collections are queried is determined by the URL of the matching endpoint used in your integration, e.g. https://api.opensanctions.org/match/default.
In order to avoid the overhead of sending thousands upon thousands of HTTP requests, you can group the entities to be matched into batches, sending a few of them at a time. A good batch size is 20 or 50, not 5000.
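One possible way to build such batches is sketched below. The helper name make_batches and the q0, q1, ... query IDs are made up for this example; the queries body shape follows the request format used by the script in this guide.

```python
from itertools import islice
from typing import Iterable, Iterator


def make_batches(entities: Iterable[dict], size: int = 50) -> Iterator[dict]:
    """Yield request bodies that each hold at most `size` queries."""
    numbered = iter(enumerate(entities))
    while True:
        chunk = list(islice(numbered, size))
        if not chunk:
            return
        # Each query gets an ID (hypothetical scheme: q<position>) so that
        # the results can be tied back to the source record later.
        yield {"queries": {f"q{index}": entity for index, entity in chunk}}


# Placeholder entities, standing in for your formatted customer records.
entities = [
    {"schema": "Person", "properties": {"name": [f"Person {number}"]}}
    for number in range(120)
]
batches = list(make_batches(entities, size=50))
```

Each yielded dictionary can then be posted to the matching endpoint as one request.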
Below is an example Python script that demonstrates how to use the matching API. Note that when running this for your own data, you'll need to add a data source, and a place to store the highest-scoring matches for analyst review.
Note: This example uses authentication to access the hosted OpenSanctions API. If you're running the yente application, you can remove the API key header.
import os
import requests
from pprint import pprint
# The OpenSanctions service API. This endpoint will only do sanctions checks.
URL = "https://api.opensanctions.org/match/sanctions"
# Read an environment variable to get the API key:
API_KEY = os.environ.get("OPENSANCTIONS_API_KEY")
# Create an HTTP session which manages connections and defines shared header configuration:
session = requests.Session()
session.headers['Authorization'] = f"ApiKey {API_KEY}"
# A query for a person by name and birth date. Person names can be given as name parts
# (ideal - shown here), or as a single "name" value (see company search below, less precise).
EXAMPLE_1 = {
    "schema": "Person",
    "properties": {
        "firstName": ["Arkadii"],
        "fatherName": ["Romanovich"],
        "lastName": ["Rotenberg", "Ротенберг"],
        "birthDate": ["1951"],
    },
}
# Similarly, a company search using just a name and jurisdiction.
EXAMPLE_2 = {
    "schema": "Company",
    "properties": {
        "name": ["Stroygazmontazh"],
        "jurisdiction": ["Russia"],
    },
}
# We put both of these queries into a matching batch, giving each of them an
# ID that we can recognize it by later:
BATCH = {"queries": {"q1": EXAMPLE_1, "q2": EXAMPLE_2}}
# This configures the scoring system by selecting the matching algorithm.
PARAMS = {"algorithm": "logic-v2"}
# Send the batch off to the API and raise an exception for a non-OK response code.
response = session.post(URL, json=BATCH, params=PARAMS)
response.raise_for_status()
responses = response.json().get("responses")
# The responses will include a set of results for each entity, and a parsed version of
# the original query:
example_1_response = responses.get("q1")
example_2_response = responses.get("q2")
# You can use the returned query to debug if the API correctly parsed and interpreted
# the queries you provided. If any of the fields or values are missing, it's an
# indication their format wasn't accepted by the system.
pprint(example_2_response["query"])
# The results are a list of entities, formatted using the same structure as your
# query examples. By default, the API will at most return five potential matches.
for result in example_2_response["results"]:
    pprint(result)
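To keep the highest-scoring matches for analyst review, the responses can be filtered by score. The sketch below assumes that each result carries id, caption and score fields, and the 0.7 cut-off is an arbitrary placeholder rather than a recommended setting. It runs on a hand-written sample, not on real API output.

```python
def select_for_review(responses: dict, threshold: float = 0.7) -> list:
    """Collect results at or above the threshold, best score first."""
    hits = []
    for query_id, response in responses.items():
        for result in response.get("results", []):
            # Assumed field: a numeric score on each result.
            score = result.get("score", 0.0)
            if score >= threshold:
                hits.append(
                    {
                        "query_id": query_id,
                        "entity_id": result.get("id"),
                        "caption": result.get("caption"),
                        "score": score,
                    }
                )
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)


# Hand-written sample shaped like the responses object in the script above.
sample_responses = {
    "q1": {
        "results": [
            {"id": "example-1", "caption": "Example Person", "score": 0.95},
            {"id": "example-2", "caption": "Other Person", "score": 0.4},
        ]
    },
    "q2": {"results": []},
}
review_queue = select_for_review(sample_responses)
```

Keeping the query ID next to each hit lets a reviewer trace a candidate back to the customer record that produced it.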
The results returned by the /match API contain basic information about each candidate or matching entity. For sanctioned entities, take note of the programId property: it describes the sanctions program under which the designation was made.
If you want to retrieve additional details regarding an entity, you can use the /entities/<id> endpoint to retrieve a nested representation that includes details about family and business relationships, and detailed sanctions designations. Each sanction-designated entity (Person, Company, Vessel etc.) can be tied to several Sanction objects. A Sanction object describes details about the sanctions imposed by an authority against an entity: the start and end dates, name and country of the authority, and the programId, which can be expanded into additional details on the relevant policy regime.
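A minimal sketch of reading those designations out of a nested entity follows. It assumes that the nested representation lists the Sanction objects under a sanctions property, and that each of them exposes authority, programId, startDate and endDate properties; verify this against an actual /entities/<id> response. The sample entity is invented for illustration.

```python
def sanction_programs(entity: dict) -> list:
    """List the program and authority details of each nested Sanction."""
    programs = []
    # Assumed location of the nested Sanction objects.
    for sanction in entity.get("properties", {}).get("sanctions", []):
        # Only nested objects carry details; skip plain ID references.
        if not isinstance(sanction, dict):
            continue
        props = sanction.get("properties", {})
        for program_id in props.get("programId", []):
            programs.append(
                {
                    "programId": program_id,
                    "authority": props.get("authority", []),
                    "startDate": props.get("startDate", []),
                    "endDate": props.get("endDate", []),
                }
            )
    return programs


# Invented sample; all IDs and values are placeholders, not real data.
sample_entity = {
    "id": "example-entity",
    "schema": "Person",
    "properties": {
        "name": ["Example Person"],
        "sanctions": [
            {
                "id": "example-sanction",
                "schema": "Sanction",
                "properties": {
                    "authority": ["Example Authority"],
                    "programId": ["EXAMPLE-PROGRAM"],
                    "startDate": ["2020-01-01"],
                },
            }
        ],
    },
}
programs = sanction_programs(sample_entity)
```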
Of course, you can also view the OpenSanctions entity page (https://opensanctions.org/entities/<id>) for each result to see their documented connections to other items.
If one of your queries returns a result, this is not immediately cause for alarm: the database for politically exposed persons in particular contains many individuals with common names, and matches will be fairly frequent. Instead, you should invest time to fine-tune the configuration of the matching system, and eventually also set up a process for human review.
The following strategies can be used to reduce error rates in results returned from the API: