02. The Matching Engine
The matching engine answers a deceptively complex question: given a candidate's full profile and a set of open roles, which roles is this person genuinely a top-1% fit for, and which roles can we confidently rule out?
A naive approach would be to ask an LLM to evaluate every (candidate, role) pair. This is the architecture most "AI recruiting" startups ship. It does not scale, it produces inconsistent results across runs, and it is dramatically more expensive than necessary.
Refery's matching engine instead operates as a three-stage pipeline:
- Signal engine computes deterministic, structured signals from the candidate's history. These are computed once and cached.
- Multi-vector retriever uses semantic embeddings combined with hard filters to narrow millions of candidate-role pairs to the top 20-30 candidates per role.
- Adversarial panel (covered in chapter 03) runs the expensive LLM evaluation only on this narrowed set.
This chapter covers stages 1 and 2.
The signal engine
Before any vector math happens, Refery extracts a set of structured, deterministic signals from each candidate's history. These signals are the bedrock of the matching system. They are computed by code, not by LLMs, which means they are reproducible, debuggable, and free to compute.
Logo tier (raw → modified pipeline)
Each company on a candidate's resume is scored on a tier list. The tier list is function-aware: engineering candidates are scored against an engineering tier list; sales candidates against a sales tier list. A small subset of companies sit at the very top of one list but not the other, and that asymmetry matters.
// signal-engine/logo-tier.ts
type Tier = 'S+' | 'S' | 'A' | 'B' | 'C' | 'D';
const ENG_TIERS: Record<Tier, string[]> = {
'S+': ['Google', 'Meta', 'Apple', 'Amazon', 'Netflix', 'OpenAI',
'Anthropic', 'DeepMind', 'Stripe', 'Databricks', 'Figma'],
'S': ['Airbnb', 'Uber', 'Coinbase', 'Notion', 'Linear', 'Vercel',
'Plaid', 'Ramp', 'Brex', 'Scale', 'Cursor', 'Perplexity'],
'A': ['Atlassian', 'Slack', 'Snowflake'],
'B': [], // computed dynamically: lesser-known funded startups
'C': [], // unknown startups, generic tech
'D': [], // non-tech enterprises, traditional consulting
};
const TIER_SCORE: Record<Tier, number> = {
'S+': 6, 'S': 5, 'A': 4, 'B': 3, 'C': 2, 'D': 1,
};
export function computeRawLogoTier(
companies: CompanyExperience[],
fnRole: 'eng' | 'sales'
): { average: Tier; peak: Tier; reasoning: string } {
const tierList = fnRole === 'eng' ? ENG_TIERS : SALES_TIERS;
// Guard: the reduce() below throws on an empty history
if (companies.length === 0) {
return { average: 'D', peak: 'D', reasoning: 'no company history' };
}
const scored = companies.map(c => ({
company: c.name,
tier: classifyCompany(c.name, tierList),
yearsAt: c.tenure,
}));
const weightedAvg = weightedAverage(scored.map(s => ({
score: TIER_SCORE[s.tier],
weight: s.yearsAt,
})));
return {
average: scoreToTier(weightedAvg),
peak: scored.reduce((max, s) =>
TIER_SCORE[s.tier] > TIER_SCORE[max.tier] ? s : max
).tier,
reasoning: formatReasoning(scored),
};
}
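The weightedAverage and scoreToTier helpers referenced above are not shown in the excerpt. One plausible sketch, restating the Tier type for self-containment (the rounding behavior is an assumption, not confirmed by the text):

```typescript
type Tier = 'S+' | 'S' | 'A' | 'B' | 'C' | 'D';

// Tenure-weighted mean of tier scores.
function weightedAverage(items: { score: number; weight: number }[]): number {
  const totalWeight = items.reduce((sum, i) => sum + i.weight, 0);
  if (totalWeight === 0) return 0;
  return items.reduce((sum, i) => sum + i.score * i.weight, 0) / totalWeight;
}

// Map the continuous average back onto the discrete tier ladder
// (scores run 1..6, so score - 1 indexes the ladder from 'D' up).
function scoreToTier(score: number): Tier {
  const ladder: Tier[] = ['D', 'C', 'B', 'A', 'S', 'S+'];
  const idx = Math.min(ladder.length - 1, Math.max(0, Math.round(score) - 1));
  return ladder[idx];
}
```

For example, four years at an S+ company (score 6) plus two years at a B company (score 3) averages to (6·4 + 3·2) / 6 = 5, which maps back to 'S'.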
The pedigree modifier
Raw logo tier is intentionally biased toward well-known names. But a 50-person Sequoia-backed startup is materially different from a 50-person bootstrapped shop, and the logo tier alone misses that. The pedigree modifier corrects for it.
// signal-engine/pedigree.ts
const TOP_TIER_VCS = new Set([
'Sequoia', 'Andreessen Horowitz', 'a16z', 'Benchmark', 'Founders Fund',
'Accel', 'Greylock', 'Index', 'Lightspeed', 'Bessemer', 'Kleiner',
'Khosla', 'Thrive', 'NEA', 'GV', 'Conviction', 'Radical', 'Spark',
'Ribbit', 'Initialized', 'USV',
]);
export function applyPedigreeModifier(
rawTier: Tier,
company: CompanyExperience
): { modifiedTier: Tier; lift: number; reason: string } {
let lift = 0;
const reasons: string[] = [];
// Top-tier VC backing → +1 tier
const topInvestors = company.investors.filter(i => TOP_TIER_VCS.has(i));
if (topInvestors.length > 0) {
lift += 1;
reasons.push(`backed by ${topInvestors.join(', ')}`);
}
// Funding band thresholds
if (company.totalRaised >= 200_000_000) {
lift += 1;
reasons.push(`$${(company.totalRaised / 1e6).toFixed(0)}M raised`);
} else if (company.totalRaised >= 50_000_000) {
lift += 1;
reasons.push(`well-funded at $${(company.totalRaised / 1e6).toFixed(0)}M`);
}
// Cap lift at +2 to avoid runaway scoring
lift = Math.min(lift, 2);
return {
modifiedTier: shiftTier(rawTier, lift),
lift,
reason: reasons.join(', '),
};
}
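The shiftTier helper is referenced above but not shown. A plausible sketch that clamps at the top of the ladder, restating the Tier type for self-containment:

```typescript
type Tier = 'S+' | 'S' | 'A' | 'B' | 'C' | 'D';

// Move a tier up the ladder by `lift` steps, clamping at S+ (and at D
// if a negative lift is ever passed).
function shiftTier(tier: Tier, lift: number): Tier {
  const ladder: Tier[] = ['D', 'C', 'B', 'A', 'S', 'S+'];
  const idx = ladder.indexOf(tier);
  return ladder[Math.min(ladder.length - 1, Math.max(0, idx + lift))];
}
```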
The AI bonus
The AI bonus stacks on top of the pedigree modifier. Foundation labs (OpenAI, Anthropic, Mistral, Cohere, xAI, DeepMind) get a full +1 tier lift. AI infrastructure companies (Scale, Pinecone, Weights & Biases, Modal, Replicate, LangChain, Together, Hugging Face) also get +1. AI-native applications (Cursor, Perplexity, Harvey, Sierra, Decagon, Glean, Hebbia, ElevenLabs, Suno) get +0.5 to +1 depending on traction.
This is intentionally Refery-tuned bias. The Refery client roster is heavily AI-weighted, so the matching system biases toward candidates with relevant AI exposure. The bias is surfaced explicitly in candidate briefs so it can be reviewed and recalibrated.
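A minimal sketch of the bonus logic, using the category lists from the text (the function name and the highTraction signal are illustrative assumptions):

```typescript
// Hypothetical sketch of the AI bonus. Fractional lifts (+0.5) accumulate
// and get resolved when the final score is mapped back onto the discrete
// tier ladder.
const FOUNDATION_LABS = new Set(['OpenAI', 'Anthropic', 'Mistral', 'Cohere', 'xAI', 'DeepMind']);
const AI_INFRA = new Set(['Scale', 'Pinecone', 'Weights & Biases', 'Modal',
  'Replicate', 'LangChain', 'Together', 'Hugging Face']);
const AI_NATIVE_APPS = new Set(['Cursor', 'Perplexity', 'Harvey', 'Sierra',
  'Decagon', 'Glean', 'Hebbia', 'ElevenLabs', 'Suno']);

function aiBonus(company: string, highTraction: boolean): number {
  if (FOUNDATION_LABS.has(company)) return 1;
  if (AI_INFRA.has(company)) return 1;
  if (AI_NATIVE_APPS.has(company)) return highTraction ? 1 : 0.5;
  return 0;
}
```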
Trajectory analysis
Trajectory is a one-line summary of how a candidate has moved across company stages over their career. It captures something that pure tier scoring misses entirely: whether someone has actually ridden a stage transition or has only ever joined post-IPO.
// signal-engine/trajectory.ts
type Stage = 'pre-seed' | 'seed' | 'series-a' | 'series-b'
| 'series-c' | 'late-stage' | 'public' | 'non-tech';
export function computeTrajectory(history: CompanyExperience[]): string {
const transitions = history.map(c => ({
company: c.name,
joinedAt: classifyStageAtTime(c.name, c.startDate),
leftAt: classifyStageAtTime(c.name, c.endDate ?? new Date()),
yearsAt: c.tenure,
}));
const stageRides = transitions.filter(t => t.joinedAt !== t.leftAt);
const earlyStageCount = transitions.filter(t =>
['pre-seed', 'seed', 'series-a'].includes(t.joinedAt)
).length;
if (stageRides.length >= 2 && earlyStageCount >= 2) {
return `${stageRides.length} stage rides, ${earlyStageCount} early-stage joins. Pure builder DNA.`;
}
if (transitions.every(t => t.joinedAt === 'public')) {
return `Always joined post-IPO. No 0-to-1 exposure.`;
}
// ... full classification logic continues
}
Non-tech flag
A boolean that surfaces a structural concern about a candidate's fit for early-stage tech. Raised when three conditions hold simultaneously:
- More than 50% of career was at non-tech companies
- No notable shipped product or build accomplishments
- No clear pivot narrative explaining the shift
The flag is suppressed if the candidate led a real digital transformation, was non-tech only early in their career with 5+ years in tech since, was in a tech-adjacent function (quant or data science at a bank still signals strongly for ML roles), or founded their own venture during any gap.
This is not a veto. The flag surfaces the concern to the panel. The Skeptic persona owns interpretation.
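The raise-then-suppress structure is easy to express directly; a sketch with hypothetical field names on the candidate profile:

```typescript
// Sketch of the non-tech flag. All three raise conditions must hold
// simultaneously; any single suppression condition clears the flag.
// Field names are illustrative assumptions.
interface NonTechInputs {
  nonTechCareerFraction: number;   // share of career at non-tech companies
  hasShippedAccomplishments: boolean;
  hasPivotNarrative: boolean;
  ledDigitalTransformation: boolean;
  yearsInTechSinceNonTech: number;
  techAdjacentFunction: boolean;   // e.g. quant / data science at a bank
  foundedVentureDuringGap: boolean;
}

function computeNonTechFlag(c: NonTechInputs): boolean {
  const raised =
    c.nonTechCareerFraction > 0.5 &&
    !c.hasShippedAccomplishments &&
    !c.hasPivotNarrative;
  const suppressed =
    c.ledDigitalTransformation ||
    c.yearsInTechSinceNonTech >= 5 ||
    c.techAdjacentFunction ||
    c.foundedVentureDuringGap;
  return raised && !suppressed;
}
```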
Sales client profile
For sales/GTM candidates only, an additional 3-dimensional classification:
- Client segment: Strategic Enterprise, Enterprise, Mid-market, SMB, Startup/PLG
- ACV band: Strategic ($1M+), Major ($250K-$1M), Mid ($50K-$250K), SMB ($10K-$50K), Velocity (under $10K)
- Industry concentration: fintech, healthcare, public sector, retail/ecom, manufacturing, tech/SaaS, media, real estate, legal, or "horizontal"
This is what catches the "Top 5% sales rep at the wrong segment" failure mode. A candidate who sold $20K Velocity SMB deals to startups is structurally a poor fit for an Enterprise role at $500K ACV, no matter how good their numbers are.
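The classification above can be captured as a structured type. A sketch (type and value names are illustrative; the band thresholds come from the list above):

```typescript
// Structured representation of the 3-dimensional sales client profile.
type ClientSegment = 'strategic_enterprise' | 'enterprise' | 'mid_market' | 'smb' | 'startup_plg';
type AcvBand = 'strategic' | 'major' | 'mid' | 'smb' | 'velocity';
type Industry = 'fintech' | 'healthcare' | 'public_sector' | 'retail_ecom'
  | 'manufacturing' | 'tech_saas' | 'media' | 'real_estate' | 'legal' | 'horizontal';

interface SalesClientProfile {
  segment: ClientSegment;
  acvBand: AcvBand;
  industries: Industry[];
}

// Map a typical deal size onto the ACV bands from the text.
function acvBandFor(acvUsd: number): AcvBand {
  if (acvUsd >= 1_000_000) return 'strategic';
  if (acvUsd >= 250_000) return 'major';
  if (acvUsd >= 50_000) return 'mid';
  if (acvUsd >= 10_000) return 'smb';
  return 'velocity';
}
```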
The multi-vector retriever
Once signals are computed, candidates and roles are embedded into a vector space for semantic retrieval. Refery's embedding strategy is multi-vector: each candidate and each role is represented by several distinct embeddings, one per signal axis, rather than one single concatenated vector.
Why multi-vector
A single dense embedding mashes everything together: skills, experience, comp, location, work mode, stage fit. This loses information. Two candidates with similar skills but vastly different stage fit produce vectors that look similar to each other, even though they should be ranked very differently for the same role.
Multi-vector retrieval keeps each axis independent and aggregates the per-axis similarity scores at query time, with axis-specific weights tuned per role type.
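The aggregation itself is simple once per-axis similarities exist. A sketch in TypeScript (the weights match the retrieval query later in this chapter; the helper name is illustrative):

```typescript
// Query-time aggregation: each axis contributes its cosine similarity,
// scaled by a role-specific weight. Missing axes contribute zero.
const AXIS_WEIGHTS: Record<string, number> = {
  skills: 0.30, trajectory: 0.20, stage_fit: 0.20,
  founder_dna: 0.10, comp_signals: 0.08, geo: 0.07, work_mode: 0.05,
};

function weightedScore(similarities: Record<string, number>): number {
  return Object.entries(AXIS_WEIGHTS).reduce(
    (sum, [axis, w]) => sum + (similarities[axis] ?? 0) * w, 0);
}
```

Because the weights sum to 1.0, a candidate identical to the role on every axis scores exactly 1, and each axis's influence is directly readable from its weight.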
Schema
Embeddings are stored in pgvector tables, co-located with relational data in Supabase.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE candidate_embeddings (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
candidate_id uuid NOT NULL REFERENCES candidates(id) ON DELETE CASCADE,
axis text NOT NULL,
embedding vector(1536) NOT NULL,
computed_at timestamptz NOT NULL DEFAULT now(),
UNIQUE (candidate_id, axis)
);
CREATE INDEX ON candidate_embeddings
USING hnsw (embedding vector_cosine_ops);
CREATE TABLE job_embeddings (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
job_id uuid NOT NULL REFERENCES jobs(id) ON DELETE CASCADE,
axis text NOT NULL,
embedding vector(1536) NOT NULL,
computed_at timestamptz NOT NULL DEFAULT now(),
UNIQUE (job_id, axis)
);
CREATE INDEX ON job_embeddings
USING hnsw (embedding vector_cosine_ops);
The seven axes Refery currently embeds:
| Axis | What it captures |
|---|---|
| skills | Technical and functional skills surface area |
| trajectory | Career arc, stage exposure, builder vs scaler |
| comp_signals | Comp expectation, negotiation leverage, equity tolerance |
| geo | Location preferences, time zone willingness |
| work_mode | Remote / hybrid / onsite preference |
| stage_fit | What company stage this person operates best at |
| founder_dna | For senior roles: alignment with founder operating style |
Each axis is embedded by feeding a small, structured prompt to OpenAI's embedding model. The prompt is tightly scoped: the skills axis sees only the skills section, the trajectory axis sees only the structured trajectory summary, and so on. This keeps each embedding focused.
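A sketch of how those tightly scoped inputs might be built before being sent to the embedding API (field and function names are illustrative; the specific model is an assumption, not stated in the text):

```typescript
// Each axis sees only its own slice of the profile. The returned string is
// what would be sent to the embedding model (e.g. OpenAI's
// text-embedding-3-small, which produces 1536-dim vectors matching the
// vector(1536) columns above -- model choice is an assumption).
interface ProfileSlices {
  skills: string[];
  trajectorySummary: string;
}

function buildAxisInput(axis: 'skills' | 'trajectory', p: ProfileSlices): string {
  switch (axis) {
    case 'skills':
      // Only the skills surface area -- nothing about comp, geo, or stage.
      return `Candidate skills: ${p.skills.join(', ')}`;
    case 'trajectory':
      // Only the structured trajectory summary from the signal engine.
      return `Career trajectory: ${p.trajectorySummary}`;
  }
}
```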
Retrieval query
A retrieval against the seven axes for a single role looks like this:
WITH job_axes AS (
SELECT axis, embedding
FROM job_embeddings
WHERE job_id = $1
)
SELECT
c.id AS candidate_id,
c.name,
-- Per-axis cosine similarity (1 - distance)
AVG(1 - (ce.embedding <=> ja.embedding)) AS avg_similarity,
-- Weighted score using role-specific axis weights
SUM(
CASE ce.axis
WHEN 'skills' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.30
WHEN 'trajectory' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.20
WHEN 'stage_fit' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.20
WHEN 'founder_dna' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.10
WHEN 'comp_signals' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.08
WHEN 'geo' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.07
WHEN 'work_mode' THEN (1 - (ce.embedding <=> ja.embedding)) * 0.05
END
) AS weighted_score
FROM candidate_embeddings ce
JOIN candidates c ON c.id = ce.candidate_id
JOIN job_axes ja ON ja.axis = ce.axis
WHERE c.status IN ('active', 'reviewing')
AND c.do_not_contact = false
GROUP BY c.id, c.name
ORDER BY weighted_score DESC
LIMIT 30;
The <=> operator is pgvector's cosine distance. Pure nearest-neighbor queries against the HNSW index run in approximately O(log N); the grouped join here scores the full filtered candidate set, and at Refery's current data volume it returns in single-digit milliseconds.
The do_not_contact = false filter is non-negotiable. The blacklist (currently including Resolve.ai and a handful of others, each with do_not_contact = true) is enforced at the SQL layer to make it impossible for downstream code to accidentally surface a blacklisted candidate.
Hard filters
Before vector similarity is computed, hard filters eliminate structurally infeasible matches. These are deterministic and free.
-- Visa filter, applied per role
WHERE
CASE j.visa_requirement
WHEN 'us_authorized' THEN c.work_authorization @> ARRAY['us_authorized']
WHEN 'eu_authorized' THEN c.work_authorization @> ARRAY['eu_authorized']
WHEN 'sponsor_available' THEN true -- any candidate fits
WHEN 'global_remote' THEN true
ELSE false
END
-- Comp floor filter
AND j.salary_max >= c.salary_expectation_min
-- Location / remote-mode filter
AND (
j.remote_policy = 'remote'
OR (j.remote_policy = 'hybrid' AND c.location_metro = j.location_metro)
OR (j.remote_policy = 'onsite' AND c.location_metro = j.location_metro)
)
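The same predicates can be mirrored in TypeScript for unit testing outside the database. A sketch using the field names from the SQL above:

```typescript
// TypeScript mirror of the SQL hard filters. Deterministic and free, like
// the SQL version; field names follow the schema used in the queries above.
interface JobFilters {
  visa_requirement: 'us_authorized' | 'eu_authorized' | 'sponsor_available' | 'global_remote';
  salary_max: number;
  remote_policy: 'remote' | 'hybrid' | 'onsite';
  location_metro: string;
}
interface CandidateFilters {
  work_authorization: string[];
  salary_expectation_min: number;
  location_metro: string;
}

function passesHardFilters(j: JobFilters, c: CandidateFilters): boolean {
  const visaOk =
    j.visa_requirement === 'sponsor_available' ||
    j.visa_requirement === 'global_remote' ||
    c.work_authorization.includes(j.visa_requirement);
  const compOk = j.salary_max >= c.salary_expectation_min;
  const locationOk =
    j.remote_policy === 'remote' || c.location_metro === j.location_metro;
  return visaOk && compOk && locationOk;
}
```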
Placement constraint engine
After retrieval, every candidate has a 3-axis "reach" score against the live job board:
// matching/placement-constraint.ts
export async function computePlacementConstraint(
candidate: Candidate,
openJobs: Job[]
): Promise<PlacementConstraint> {
const total = openJobs.length;
// Guard: an empty board would make every reach ratio NaN
if (total === 0) {
throw new Error('cannot compute placement constraint against an empty job board');
}
const compReach = openJobs.filter(j =>
j.salary_max >= candidate.salary_expectation_min
).length / total;
const locationReach = openJobs.filter(j =>
j.remote_policy === 'remote' ||
j.location_metro === candidate.location_metro ||
candidate.relocation_open
).length / total;
const visaReach = candidate.work_authorization.includes('us_authorized')
? 1.0
: openJobs.filter(j =>
['global_remote', 'sponsor_available', 'eu_authorized']
.includes(j.visa_requirement)
).length / total;
return {
compReach,
locationReach,
visaReach,
flags: [
compReach < 0.30 ? 'comp_floor_too_high' : null,
locationReach < 0.40 ? 'location_too_narrow' : null,
visaReach < 0.20 ? 'visa_too_restrictive' : null,
].filter(Boolean),
verdict: classifyPlacementDifficulty(compReach, locationReach, visaReach),
};
}
This gets surfaced in the candidate brief as something like:
Placement constraint: $325K floor + remote Portland + non-US authorized = 10% of board reachable. Structurally hard to place.
This is a critical signal for triage. A "Top 5%" candidate with 10% reach is a different operational reality from a "Top 25%" candidate with 80% reach.
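classifyPlacementDifficulty is referenced in the code above but not shown. One plausible sketch; the verdict names and thresholds are assumptions, loosely anchored to the per-axis flag thresholds in computePlacementConstraint:

```typescript
// Sketch of the verdict classifier. Multiplying the three reach ratios
// approximates joint reach under an independence assumption -- the true
// joint reach over the live board can be lower.
function classifyPlacementDifficulty(
  compReach: number,
  locationReach: number,
  visaReach: number
): 'easy' | 'moderate' | 'hard' | 'structurally_hard' {
  const reach = compReach * locationReach * visaReach;
  if (reach >= 0.5) return 'easy';
  if (reach >= 0.25) return 'moderate';
  if (reach >= 0.1) return 'hard';
  return 'structurally_hard';
}
```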
Why this is novel
The combination of these techniques in a recruiting platform constitutes proprietary technical know-how:
- Multi-vector decomposition tuned for hiring axes specifically. Most production retrieval systems use a single dense embedding. Refery's seven-axis decomposition is custom-designed for the specific signal structure of senior tech hiring.
- Function-aware logo tier with pedigree and AI modifiers. Each modifier is independently auditable, surfaced in the brief, and tunable. There is no equivalent open-source or commercial library for this.
- Trajectory as a first-class structured signal. Free-text resumes do not capture stage transitions; Refery extracts them explicitly and they become a top-3 weight in the matching score.
- Placement constraint engine as a triage primitive. Computing reach across the live board for every candidate, on every match run, gives the operator a quantitative basis for prioritization that no ATS provides.
- Co-located vector + relational + RLS architecture. All retrieval, filtering, and access-control happens in a single SQL query, eliminating an entire class of consistency bugs.
The result is a system that converts a candidate's full history into a small set of high-signal structured fields and embeddings, which then drive every downstream decision. Continued in chapter 03, where the panel takes over for the genuinely ambiguous calls.