Many (especially regulated) deployments of yente are behind a firewall. For this to work, both the software itself and the data it uses for screening need to be available inside the secure environment.
It's hard to give generalised advice on this that applies to all secured environments, but some general hints include:
- The software is distributed as a Docker image (ghcr.io/opensanctions/yente), which can be mirrored in a local Docker registry before being deployed.
- The data is fetched from two hosts: data.opensanctions.org and mirror.opensanctions.net. Permitting the deployed yente container to access these two hosts is the easiest firewall configuration option.
By default, yente will regularly fetch metadata and data updates from the domain names listed above. If this is not an option, you can separate the data download from the operational environment used by yente. This requires some environment-specific design, but in general has two components:
- Fetch the data files required by yente and place them in a location that is accessible to the Python application at runtime. This could be, for example, a Docker volume mount, or a Docker image layered on top of the official image that contains the data.
- Make yente consume these local files instead of trying to access the internet.
The basic commands for fetching the metadata and data files are as follows:
wget -O index.json https://data.opensanctions.org/datasets/latest/index.json
wget -O entities.ftm.json https://data.opensanctions.org/datasets/latest/default/entities.ftm.json
Note that these downloads need to be repeated at a regular interval, e.g. using a crontab on a bridge/jump host.
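For the scheduled job, it can help to download each file to a temporary name and only swap it into place once the transfer has succeeded, so that yente never sees a half-written file. The following is a minimal sketch under stated assumptions, not an official script: the DATA_DIR placeholder and the helper names fetch_atomic and refresh_all are hypothetical, and the URLs are the ones from the commands above.

```shell
#!/bin/sh
# Sketch of a refresh script for the bridge/jump host.
# ASSUMPTIONS: DATA_DIR is a placeholder for the directory that is later made
# available to the yente container; fetch_atomic and refresh_all are
# hypothetical helper names, not part of yente.
DATA_DIR="${DATA_DIR:-/path/to/dmz}"
BASE_URL="${BASE_URL:-https://data.opensanctions.org/datasets/latest}"

# Download URL ($1) to DEST ($2) through a temporary file in the same
# directory, so a failed or empty transfer never replaces the last good copy.
fetch_atomic() {
    url="$1"
    dest="$2"
    tmp="$dest.tmp.$$"
    if wget -q -O "$tmp" "$url" && [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "download failed, keeping previous file: $url" >&2
        return 1
    fi
}

# Refresh both files; stops at the first failure.
refresh_all() {
    fetch_atomic "$BASE_URL/index.json" "$DATA_DIR/index.json" &&
        fetch_atomic "$BASE_URL/default/entities.ftm.json" "$DATA_DIR/entities.ftm.json"
}

# Only download when invoked as "script run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    refresh_all
fi
```

Saved under a name of your choosing and called with the argument "run" from a crontab entry, this would repeat the download on whatever schedule suits your environment. If a download fails, the previous files stay in place and the script exits with a non-zero status, which most cron setups can report.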
You can then use a custom manifest file which points to the location where the docker container can access the files that have been fetched:
catalogs:
  # The catalog file is loaded in order to get all the metadata for the datasets
  # included in the database.
  - url: "file:///path/to/dmz/index.json"
datasets:
  - name: offline
    title: Wrapper dataset for offline data
    children:
      - default
    entities_url: "file:///path/to/dmz/entities.ftm.json"
    load: true
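Because the manifest refers to the files by absolute file:// paths, those paths must resolve inside the container, not just on the host that downloaded them. A small pre-flight check can catch a wrong mount point before yente starts. The sketch below is an illustration only: check_manifest_files is a hypothetical helper, and it uses a simple text match rather than a YAML parser, so it will also pick up file:// URLs that appear in comments.

```shell
# check_manifest_files MANIFEST
# Hypothetical helper: reports every file:// path mentioned in MANIFEST that is
# missing or unreadable from where the check runs. Returns non-zero if any is.
# ASSUMPTION: paths contain no spaces or quote characters.
check_manifest_files() {
    manifest="$1"
    status=0
    # Extract the file:// URLs by text match and strip the scheme prefix.
    paths=$(grep -o "file://[^\"' ]*" "$manifest" | sed 's|^file://||')
    for path in $paths; do
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running the check from inside the container (or with the same mounts as the container) is what makes it meaningful, since a path that exists on the bridge host may be mounted at a different location in the operational environment.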
OpenSanctions is free for non-commercial users. Businesses must acquire a data license to use the dataset.