
Candidate Sourcing & Matching

A sourcing and matching workflow for ScaledOps that turns incomplete freelancer profiles into candidate shortlists a person can review quickly.

Status: Selected project
Type: Matching Systems
Link: External project

The problem

Freelancer profiles often contain partial or inconsistent information. The useful question was not simply whether two strings matched, but whether several weak signals could be combined into a shortlist that a person could review quickly.

What I built

The workflow combined public-profile collection, search queries, deterministic scoring, and optional LLM reranking. It used signals such as name variants, location, title phrases, skills, companies, and education.
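As an illustration of the deterministic scoring stage, here is a minimal sketch of how several weak signals might be combined into one ranked shortlist. The `Profile` fields, the `WEIGHTS` values, and the name-variant rule are all assumptions made for this example, not the project's actual configuration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    name: str
    location: str = ""
    title: str = ""
    skills: frozenset = frozenset()
    companies: frozenset = frozenset()
    education: frozenset = frozenset()

# Hypothetical weights; a real system would tune these against labeled data.
WEIGHTS = {"name": 0.35, "location": 0.15, "title": 0.15,
           "skills": 0.15, "companies": 0.10, "education": 0.10}

def jaccard(a, b):
    """Set overlap in [0, 1]; empty inputs score 0 rather than 1."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a and b else 0.0

def tokens(text):
    return text.lower().replace(",", " ").split()

def lowered(items):
    return {item.strip().lower() for item in items}

def name_similarity(a, b):
    """Assumed rule: first names must match; a bare last initial
    ("Jane D." vs "Jane Doe") counts as a partial match."""
    ta, tb = tokens(a.replace(".", "")), tokens(b.replace(".", ""))
    if not ta or not tb or ta[0] != tb[0]:
        return 0.0
    if ta[-1] == tb[-1]:
        return 1.0
    if ta[-1][0] == tb[-1][0] and 1 in (len(ta[-1]), len(tb[-1])):
        return 0.7
    return 0.3

def score(source, candidate):
    """Return (total, per-signal scores) so the reviewer can see why."""
    signals = {
        "name": name_similarity(source.name, candidate.name),
        "location": jaccard(tokens(source.location), tokens(candidate.location)),
        "title": jaccard(tokens(source.title), tokens(candidate.title)),
        "skills": jaccard(lowered(source.skills), lowered(candidate.skills)),
        "companies": jaccard(lowered(source.companies), lowered(candidate.companies)),
        "education": jaccard(lowered(source.education), lowered(candidate.education)),
    }
    total = sum(WEIGHTS[key] * value for key, value in signals.items())
    return total, signals

def shortlist(source, candidates, k=5):
    """Rank candidates by total score, keeping the signal breakdown."""
    ranked = [(*score(source, c), c) for c in candidates]
    ranked.sort(key=lambda row: row[0], reverse=True)
    return ranked[:k]
```

Returning the per-signal breakdown alongside the total keeps the ranking inspectable: a reviewer can see whether a candidate ranked highly because of the name, the skills, or both.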

The public Upwork-to-LinkedIn matcher documents one measured slice of the work. On a 51-profile golden dataset, the main bottleneck was search coverage rather than the final semantic selection step: the correct profile was more often missing from the search results than present but passed over. That changed the next iteration: improve candidate discovery before spending more effort on reranking.
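One way to separate a coverage problem from a selection problem is to score the two stages independently against the golden set. The sketch below assumes three hypothetical mappings keyed by source profile ID; the function name and data shapes are illustrative, not the matcher's real interface.

```python
def stage_metrics(golden, retrieved, selected):
    """Split end-to-end accuracy into search coverage and selection accuracy.

    golden:    source_id -> correct target ID
    retrieved: source_id -> list of candidate IDs returned by search
    selected:  source_id -> ID chosen by the final step (or None)
    """
    total = len(golden)
    # Covered: the correct target appeared somewhere in the search results.
    covered = [s for s, target in golden.items() if target in retrieved.get(s, [])]
    # Correct: covered, and the final step actually picked the right target.
    correct = [s for s in covered if selected.get(s) == golden[s]]
    return {
        "search_coverage": len(covered) / total if total else 0.0,
        "selection_accuracy_when_covered": len(correct) / len(covered) if covered else 0.0,
        "end_to_end_accuracy": len(correct) / total if total else 0.0,
    }
```

If `search_coverage` is low while `selection_accuracy_when_covered` is high, better reranking cannot help much, because the right answer never reaches the reranker.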

Related work

The broader sourcing workflow also included a Playwright-based Upwork profile collector with persistent sessions, small resumable runs, deduplication, normalized fields, and CSV export.
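The post-collection steps (normalized fields, deduplication, CSV export) can be sketched without the browser layer. The field names, the URL-based dedup key, and the `seen` set used to resume across runs are assumptions for illustration; the Playwright collection itself is omitted.

```python
import csv
import io

FIELDS = ["profile_url", "name", "title", "location", "skills"]  # assumed columns

def normalize(raw):
    """Map one raw scraped record onto the fixed column set."""
    # Assumption: query strings, trailing slashes, and case do not
    # distinguish profiles, so they are stripped before deduplication.
    url = raw.get("profile_url", "").strip().split("?")[0].rstrip("/").lower()
    skills = sorted({s.strip().lower() for s in raw.get("skills", []) if s.strip()})
    return {
        "profile_url": url,
        "name": " ".join(raw.get("name", "").split()),
        "title": " ".join(raw.get("title", "").split()),
        "location": " ".join(raw.get("location", "").split()),
        "skills": ";".join(skills),
    }

def dedupe(raw_records, seen=None):
    """Normalize records and drop any whose URL was already seen.

    Passing the `seen` set from an earlier run lets a small run resume
    without re-emitting profiles that were already collected.
    """
    seen = set() if seen is None else seen
    rows = []
    for raw in raw_records:
        row = normalize(raw)
        key = row["profile_url"]
        if not key or key in seen:
            continue
        seen.add(key)
        rows.append(row)
    return rows

def to_csv(rows):
    """Serialize rows to CSV text with a stable header order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping `seen` outside the function (for example, loaded from the previous export) is what makes short, interrupted runs safe to repeat.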

Provider matching

Another part of the workflow turned messy project briefs into inspectable provider shortlists:

  1. Convert Markdown briefs into structured requirements with a strict schema.
  2. Normalize categories and skills into a shared vocabulary.
  3. Expand important skills through a curated synonym map.
  4. Normalize capabilities from multiple provider sources.
  5. Filter by visible constraints such as skill overlap, budget, recency, and timezone.
  6. Loosen category enforcement once when a strict pass produces no candidates, preserving recall for human review.
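Steps 3, 5, and 6 above can be sketched as a small deterministic filter. The `Requirement` and `Provider` shapes, the `SYNONYMS` entries, and the budget field are hypothetical, and the recency and timezone checks are left out to keep the example short.

```python
from dataclasses import dataclass

# Hypothetical curated synonym map: canonical skill -> accepted variants.
SYNONYMS = {
    "react": {"react", "reactjs", "react.js"},
    "postgres": {"postgres", "postgresql"},
}

@dataclass(frozen=True)
class Requirement:
    category: str
    skills: frozenset          # canonical, lowercase skill names
    max_hourly_rate: float
    min_skill_overlap: int = 1

@dataclass(frozen=True)
class Provider:
    name: str
    category: str
    skills: frozenset
    hourly_rate: float

def matched_skills(req, provider):
    """Required skills the provider covers, counting each skill once
    even if the provider lists several of its variants."""
    have = {s.strip().lower() for s in provider.skills}
    return sorted(s for s in req.skills if SYNONYMS.get(s, {s}) & have)

def shortlist(req, providers):
    """Run a strict pass; if it is empty, loosen the category check once."""
    def run(enforce_category):
        rows = []
        for p in providers:
            if enforce_category and p.category != req.category:
                continue
            if p.hourly_rate > req.max_hourly_rate:
                continue
            matched = matched_skills(req, p)
            if len(matched) < req.min_skill_overlap:
                continue
            # Keep the evidence visible for the human reviewer.
            rows.append({"provider": p.name, "matched_skills": matched,
                         "category_enforced": enforce_category})
        return rows

    strict = run(enforce_category=True)
    return strict if strict else run(enforce_category=False)
```

Recording `category_enforced` on every row tells the reviewer whether a candidate came from the strict pass or from the single loosened retry.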

The language model is useful for extracting structured facts from unstructured briefs. The shortlist layer remains deterministic and inspectable: matched skills, thresholds, categories, and source signals stay visible to the reviewer.

What I learned

Matching systems are easiest to improve when each stage is observable. Separate discovery, filtering, scoring, and semantic selection so an evaluation can tell you where recall or precision is actually being lost.