Over the last 18 months, we’ve collaborated with numerous companies to deploy our on-premise screening API, yente, within their infrastructure. yente enables these organizations to establish a flexible and effective sanctions and PEP screening process, offering cost efficiency and robust data privacy.
Two of the most commonly requested features for yente have been rapid, incremental data updates in the service and the ability to deploy yente to the AWS cloud using the OpenSearch service as a backend.
In response, yente 4.0 focuses on easing the operational workload by adding support for incremental updates and the OpenSearch backend.
Amazon’s Not-Quite-ElasticSearch
Due to a licensing dispute with Elastic, Amazon stopped offering a hosted ElasticSearch service in 2021 and created its own engine, OpenSearch. While both engines perform similar functions, they are gradually diverging in API and scope.
yente 4.0 now supports OpenSearch in self-hosted, AWS-hosted, and serverless modes, making it easy to deploy as a cloud-native application (e.g., using Fargate) and to achieve auto-scaled performance with minimal operational complexity.
What’s Changing?
Previously, yente would download a complete release of the OpenSanctions dataset multiple times a day. However, as our database has grown, this process has become increasingly burdensome. Version 4.0 changes this by downloading the full dataset only once to build the initial backend index, and then updating itself using change data files that describe the differences between each database version and the next.
This improvement allows for much faster, less resource-intensive updates, ensuring the data used for screening is updated as soon as we publish it. Moreover, the delta update mechanism developed for yente can be used by other customers of the bulk data product as an easy way to keep their data synchronized with upstream changes.
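For bulk data customers who want a feel for how such a delta mechanism could be consumed, here is a minimal Python sketch. It is an illustration under assumptions, not the actual yente implementation: the line-based JSON layout, the ADD/MOD/DEL operation names, the sortable version strings, and the helper names apply_delta and update_index are all hypothetical.

```python
import json
from typing import Dict, Iterable, Optional

# Hypothetical operation names; the real change data format may differ.
ADD, MOD, DEL = "ADD", "MOD", "DEL"


def apply_delta(index: Dict[str, dict], delta_lines: Iterable[str]) -> Dict[str, int]:
    """Apply one delta file (one JSON object per line) to an in-memory index.

    Each line is assumed to look like {"op": "ADD", "entity": {"id": ...}}.
    Returns the number of operations applied per type.
    """
    counts = {ADD: 0, MOD: 0, DEL: 0}
    for line in delta_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        change = json.loads(line)
        op = change["op"]
        entity = change["entity"]
        if op in (ADD, MOD):
            # Added and modified entities both replace the stored record.
            index[entity["id"]] = entity
        elif op == DEL:
            index.pop(entity["id"], None)
        else:
            raise ValueError(f"Unknown operation: {op!r}")
        counts[op] += 1
    return counts


def update_index(
    index: Dict[str, dict],
    current_version: Optional[str],
    deltas: Dict[str, Iterable[str]],
) -> Optional[str]:
    """Apply all deltas newer than current_version, in version order.

    Versions are assumed to be strings that sort chronologically.
    Returns the version the index reflects afterwards.
    """
    for version in sorted(deltas):
        if current_version is not None and version <= current_version:
            continue  # already applied
        apply_delta(index, deltas[version])
        current_version = version
    return current_version
```

The index is built once from a full export, after which each call to update_index only processes the changes published since the last applied version. Calling it again with the same deltas is a no-op, which makes it safe to run on a schedule.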
What’s Staying the Same?
As noted above, this major release aims to reduce the operational burden of running yente on your infrastructure. To simplify migration to this new version, we have decided not to make changes to the core element: scoring. The algorithms used to rank and score results remain unchanged from yente 3.8.
While we plan to enhance the scoring and matching systems of yente in the future, our primary goal with 4.0 was to solidify the fundamental workings of the service.
How to Update to yente 4.0
The new version of yente
is available as a docker image at ghcr.io/opensanctions/yente:4.0.0
. Customers using the default docker-compose.yml
can trigger an update by setting the new tag, and running docker compose pull && docker compose restart app
.
In order to reflect OpenSearch compatibility, we've renamed some of the environment variables used to configure the search index. See the settings documentation for the relevant changes.
As always, we're keen for any feedback you may have!