Status

Approved

Zhivko Angelov, Ivan Kunyankin, Denis Iudovich, Kiril Stoilov

Context

We need a Matching engine component that connects Candidates with Open positions. The Matching engine is essential to simplify the hiring process for employers and to increase the probability that a candidate finds a suitable position.

The proposed solution should require as little effort from candidates/employers as possible and yet ensure relevancy of the candidate-job posting pairs.

Decision

The proposed approach matches candidates to open positions based on the semantic similarity between resumes and job postings.

Proposed approach

The proposed Matching algorithm consists of the following steps:

We first extract the essential data from a resume/job posting (key responsibilities, skills, stack, summary) using one of the LLMs available through AWS Bedrock. The model is instructed to order the extracted items by their priority for the job posting.
We then embed this data using an embedding model from AWS Bedrock.
We then use the pgvector extension on AWS Aurora to find the most semantically similar entries (nearest neighbors) for each data point.
We apply a similarity threshold to decide whether a neighbor counts as a match. For example, if we extract “team leadership” from a job posting and the candidate’s resume does not contain the same phrase but does contain “team management”, the two should be treated as a match in this context.
For each candidate-job posting pair, we calculate the intersection of matched entities and compute the overall score as a weighted sum of the intersecting items.
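The last two steps (thresholding and weighted scoring) can be illustrated with a minimal sketch. It assumes the embeddings have already been produced and computes cosine similarity in plain Python rather than through pgvector. The function names, the default threshold of 0.8, and the linearly decreasing priority weights are illustrative assumptions; the ADR does not fix any of them.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_score(job_items, resume_items, threshold=0.8):
    """Score one candidate-job posting pair.

    job_items:    list of (label, embedding), ordered by priority (most important first).
    resume_items: list of (label, embedding).
    Returns (score in [0, 1], list of matched (job_label, resume_label) pairs).
    """
    n = len(job_items)
    # Assumed weighting scheme: the first (highest-priority) item gets weight n,
    # the last gets weight 1.
    weights = [n - i for i in range(n)]
    total = sum(weights)

    matched = []
    gained = 0.0
    for weight, (job_label, job_vec) in zip(weights, job_items):
        # Nearest neighbor of this job item among the resume items.
        best_label, best_sim = None, -1.0
        for resume_label, resume_vec in resume_items:
            sim = cosine_similarity(job_vec, resume_vec)
            if sim > best_sim:
                best_label, best_sim = resume_label, sim
        # The neighbor only counts as a match if it clears the threshold.
        if best_label is not None and best_sim >= threshold:
            matched.append((job_label, best_label))
            gained += weight

    score = gained / total if total else 0.0
    return score, matched


# Toy 3-dimensional vectors, invented for illustration (real embeddings are much larger).
job = [
    ("team leadership", [1.0, 0.0, 0.0]),
    ("python", [0.0, 1.0, 0.0]),
    ("kubernetes", [0.0, 0.0, 1.0]),
]
resume = [
    ("team management", [0.9, 0.1, 0.0]),
    ("python", [0.0, 1.0, 0.0]),
]

score, matched = match_score(job, resume)
print(round(score, 3), matched)
```

In this toy example "team leadership" and "team management" match despite the different wording, while "kubernetes" has no neighbor above the threshold, so the pair earns 5 of the 6 available weight points.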

Alternatives

Force candidates and employers to fill in the data using predefined dropdown lists. This doesn’t take into account the summary sections of resumes and job postings, adds complexity for both candidates and employers, and can reduce conversion.
Instruct the LLM to extract data and convert it to a common format (in the same or a separate request). This doesn’t take summaries into account, increases costs, and scales poorly as the number of unique skills, tools, etc. grows.

Consequences

Positive

Robustness to different wordings used in resumes / job postings
Robustness to different languages (we can use a multilingual model and avoid the risk of losing meaning in translation)
We can optimise costs by caching embeddings for known entities
Access to models from different providers
Fully-managed service - low operating costs
Possibility to increase security by switching to a closed network
The same service can be used for language modelling
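The cost optimisation from caching embeddings for known entities could look roughly like the sketch below. The `EmbeddingCache` class and the `embed_fn` callable are hypothetical; `embed_fn` stands in for whatever call is made to the embedding model. The in-memory dictionary is only for illustration, and a persistent store would be needed in practice.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Caches embeddings so that a known entity is embedded only once."""

    def __init__(self, embed_fn: Callable[[str], List[float]]):
        # embed_fn is a placeholder for the real embedding call (assumption).
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times the embedding model was actually called

    @staticmethod
    def _key(text: str) -> str:
        # Normalize case and whitespace so "Python" and " python " share one entry.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text: str) -> List[float]:
    """Stand-in for a real embedding model; returns a trivial vector."""
    return [float(len(text))]


cache = EmbeddingCache(fake_embed)
cache.get("Python")
cache.get("  python ")  # served from the cache after normalization
print(cache.misses)
```

Normalizing before hashing is a design choice: it raises the hit rate for trivially different spellings, at the cost of treating case as insignificant.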

Negative

The threshold value has to be tuned and maintained, which can become a burden
Additional costs for embedding and storage
Reliability of matching results must be thoroughly tested
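One way to make the threshold and reliability concerns measurable is to sweep candidate thresholds over a set of hand-labelled pairs and compare precision and recall. The sketch below assumes such a labelled set exists; the function names are hypothetical and the similarity values are invented for illustration, not measured results.

```python
def precision_recall(scored_pairs, threshold):
    """scored_pairs: list of (similarity, is_true_match) tuples.

    Returns (precision, recall) when every pair with similarity >= threshold
    is predicted to be a match. By convention, an empty denominator yields 1.0.
    """
    tp = sum(1 for sim, label in scored_pairs if sim >= threshold and label)
    fp = sum(1 for sim, label in scored_pairs if sim >= threshold and not label)
    fn = sum(1 for sim, label in scored_pairs if sim < threshold and label)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


def sweep(scored_pairs, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    return [(t, *precision_recall(scored_pairs, t)) for t in thresholds]


# Invented toy data: (similarity, human judgement of whether it is a real match).
labelled = [
    (0.95, True),
    (0.85, True),
    (0.82, False),
    (0.70, True),
    (0.60, False),
]

for t, p, r in sweep(labelled, [0.6, 0.8, 0.9]):
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold trades recall for precision, so the acceptable operating point depends on whether missed matches or irrelevant matches are the more costly error.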

Reversibility

This decision is somewhat reversible. There are no points of no return, but there are no half-way alternatives either: switching to a different approach would likely require implementing it from scratch.
