AI Skill Extraction

How It Works

When you enter a job title in the search wizard, Recruitier’s AI analyzes it to identify the skills, technologies, and competencies that candidates in that role typically need. This extraction happens automatically as part of Step 2 of the search wizard, giving you an intelligent starting point for your search criteria.

The extraction process uses a large language model (Gemini 3 Flash Preview) guided by a specialized prompt (JOB_TITLE_SKILL_EXTRACTION_PROMPT) designed for the recruitment domain. It knows, for example, that a “Full Stack Developer” typically needs JavaScript, React, Node.js, and database experience, even though none of those skills appear in the job title itself.

The AI extracts between 5 and 10 skills per job title, ordered by relevance. Core skills for the role appear first with higher confidence scores, while nice-to-have or adjacent skills appear lower in the list.

Confidence Scores

Each extracted skill comes with a confidence score that indicates how strongly the AI associates that skill with the job title. Understanding these scores helps you make better decisions about which skills to confirm, and which to remove or exclude.

Confidence Level	Score Range	What It Means	Default Behavior
High	0.90 to 1.00	Core or essential skill for this role. Almost always relevant.	Pre-selected as confirmed
Medium	0.70 to 0.89	Commonly required for this role. Usually worth confirming.	Shown for review
Low	0.50 to 0.69	Nice-to-have or tangentially related. Review before confirming.	Shown as suggestions

The confidence score is displayed visually as a bar or indicator next to each skill chip in the wizard. Skills with high confidence are pre-selected by default, but you can always override the AI’s judgment.
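The score bands in the table can be expressed as a small lookup. This is only an illustrative sketch: the function name confidence_level is hypothetical, and how Recruitier maps scores to bands internally is not documented here.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score to the band named in the table.

    Hypothetical helper for illustration; thresholds are taken from the table.
    """
    if not 0.50 <= score <= 1.00:
        raise ValueError(f"score {score} is outside the documented 0.50-1.00 range")
    if score >= 0.90:
        return "High"
    if score >= 0.70:
        return "Medium"
    return "Low"


# Scores taken from the Python Developer example in this article.
for name, score in [("Python", 0.98), ("Django", 0.85), ("AWS", 0.65)]:
    print(f"{name}: {confidence_level(score)}")
```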

Confidence scores are calibrated based on how frequently a skill appears in actual job listings for that title across the Dutch market. A score of 0.95 means the AI found this skill in the vast majority of similar job listings it was trained on. A score of 0.55 means it appeared in about half of them.

Examples of Extraction

Here are examples of what the AI extracts for common job titles:

Python Developer

Python (0.98) — the primary programming language
Django (0.85) — the most common Python web framework
FastAPI (0.80) — a modern Python API framework
PostgreSQL (0.75) — frequently used database
REST APIs (0.75) — standard communication pattern
Docker (0.70) — containerization tool
Git (0.70) — version control
AWS (0.65) — cloud platform experience

Frontend Developer

JavaScript (0.95) — the foundational language
React (0.90) — the dominant frontend framework
TypeScript (0.88) — increasingly required
CSS (0.85) — core styling technology
HTML (0.82) — foundational markup
Responsive Design (0.75) — layout competency
Next.js (0.70) — popular React framework
Figma (0.60) — design tool collaboration

Data Analyst

SQL (0.95) — data querying language
Python (0.85) — common analysis language
Excel (0.82) — spreadsheet proficiency
Data Visualization (0.80) — reporting competency
Power BI (0.75) — Microsoft analytics tool
Tableau (0.72) — visualization platform
Statistical Analysis (0.70) — analytical method
ETL (0.60) — data pipeline knowledge

DevOps Engineer

Docker (0.95) — containerization platform
Kubernetes (0.90) — container orchestration
CI/CD (0.88) — continuous integration/deployment
AWS (0.85) — cloud platform
Terraform (0.80) — infrastructure as code
Linux (0.78) — operating system
Python (0.70) — scripting language
Monitoring (0.65) — observability tools

The AI adapts its extraction based on the full job title context. “Senior Python Developer” may include additional skills like “System Architecture” or “Team Leadership” that would not appear for a junior-level search. Similarly, “Python Data Engineer” produces a different skill set than “Python Web Developer” because the AI understands the domain context from the full title.

Reviewing and Modifying Skills

The AI provides a strong starting point, but you know your candidate’s profile and the market better than any algorithm. The skills review step gives you full control over which skills are included, excluded, removed, or added.

The 3-State Skill Cycle

Each skill chip supports a 3-state click cycle. Click on a skill to advance it to the next state:

Included (green, with checkmark) — The skill is included in your search query and directly influences which jobs are ranked higher. The more included skills that match a job listing, the higher that listing will score.
Excluded (red, with ban icon and strikethrough) — The skill actively penalizes jobs that mention it. Excluded skills push down listings that contain them.
Inactive (dimmed, neutral) — The skill is neither included nor excluded. It has no effect on search results.

Clicking an Inactive skill returns it to the Included state, completing the cycle. All AI-extracted skills start in the Included state by default.
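The click cycle behaves like a three-state machine. The sketch below models it in Python; the names SkillState and click are hypothetical and are not taken from Recruitier’s code.

```python
from enum import Enum


class SkillState(Enum):
    INCLUDED = "included"
    EXCLUDED = "excluded"
    INACTIVE = "inactive"


# Click order described in this section: Included -> Excluded -> Inactive -> Included.
_NEXT_STATE = {
    SkillState.INCLUDED: SkillState.EXCLUDED,
    SkillState.EXCLUDED: SkillState.INACTIVE,
    SkillState.INACTIVE: SkillState.INCLUDED,
}


def click(state: SkillState) -> SkillState:
    """Return the state a skill chip moves to after one click."""
    return _NEXT_STATE[state]


state = SkillState.INCLUDED  # AI-extracted skills start in the Included state
for _ in range(3):
    state = click(state)
    print(state.value)
```

Three clicks bring a chip back to where it started, so a skill can always be restored without removing and re-adding it.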

Since skills account for 45% of the total relevance score, confirming the right skills is the single most impactful action you can take to improve search quality. Focus on keeping the 4-6 skills that are most critical for your candidate’s specific profile in the Included state.

Removing Skills

Remove a skill entirely by hovering over the chip and clicking the X icon that appears. This is different from making it inactive — removing a skill deletes it from the list completely. This is useful when the AI suggests a skill that is not relevant to your specific search. For example, if you are searching for a “Full Stack Developer” but your candidate only does backend work, you might remove frontend-specific skills like React or CSS.

Excluding Skills

Click on an included skill once to change it to the excluded state (red, with ban icon and strikethrough). Excluded skills actively penalize listings that contain them. This is helpful when you want to avoid certain technologies. For example, you might exclude “SAP” from a “Software Developer” search if your candidate has no interest in enterprise ERP systems.

Use exclusions sparingly. Excluding a skill does not just ignore it — it actively penalizes jobs that mention it. Over-excluding can inadvertently filter out good opportunities where the excluded skill was mentioned as “nice to have” rather than required.

Adding Custom Skills

Use the text input field below the skill chips to add your own skills that the AI did not suggest. This is common when:

Your candidate has a niche specialization (e.g., “Terraform” for a DevOps role)
The job title is ambiguous and you want to steer results toward a specific domain
You want to add soft skills like “Stakeholder Management” or “Agile”
Your candidate has certifications (e.g., “AWS Solutions Architect,” “PMP”)

Custom skills you add are treated with the same weight as AI-extracted skills. There is no penalty for adding your own — they integrate seamlessly into the search query and contribute to the 45% skills component of the relevance score.

Skills Graph Suggestions

Below the extracted skills, Recruitier displays graph suggestions — related skills derived from a knowledge graph of professional competencies. The skills graph understands relationships between technologies and roles:

If you confirm “React,” the graph might suggest “Redux,” “Next.js,” or “Jest”
If you confirm “Python” and “Data,” it might suggest “Pandas,” “NumPy,” or “Jupyter”
If you confirm “Kubernetes,” it might suggest “Helm,” “Docker,” or “Prometheus”

These suggestions update dynamically as you confirm or remove skills, helping you discover relevant competencies you might not have considered. The graph is built from real-world co-occurrence patterns in job listings, so the suggestions reflect actual market demand.
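To make the co-occurrence idea concrete, here is a minimal sketch of how related skills could be ranked from listings. It is an assumption-heavy illustration: the listings are made-up sample data, the function names build_cooccurrence and suggest are hypothetical, and the actual knowledge graph may be built and queried very differently.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(listings: list[set[str]]) -> dict[str, Counter]:
    """Count how often each pair of skills appears in the same listing."""
    graph: dict[str, Counter] = {}
    for skills in listings:
        for a, b in permutations(sorted(skills), 2):
            graph.setdefault(a, Counter())[b] += 1
    return graph


def suggest(graph: dict[str, Counter], confirmed: set[str], limit: int = 3) -> list[str]:
    """Rank skills that co-occur with the confirmed ones, most frequent first."""
    scores: Counter = Counter()
    for skill in confirmed:
        scores.update(graph.get(skill, Counter()))
    for skill in confirmed:
        scores.pop(skill, None)  # never suggest a skill that is already confirmed
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Made-up sample listings, for illustration only.
listings = [
    {"React", "Redux", "Jest"},
    {"React", "Next.js", "Redux"},
    {"React", "Next.js"},
    {"Kubernetes", "Helm", "Docker"},
]
graph = build_cooccurrence(listings)
print(suggest(graph, {"React"}))
```

Because suggestions are recomputed from the confirmed set, confirming or removing a skill changes the ranking, which matches the dynamic behavior described above.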

Caching and Speed

Skill extractions are cached in Redis for 24 hours. This means:

The first time you search for a specific job title, the AI processes the extraction in real time (typically 1-2 seconds).
Subsequent searches for the same job title return cached results instantly.

The cache key is skill_extract:{title}, where the title is normalized first (lowercased and trimmed), so “Python Developer,” “python developer,” and “ Python Developer ” all share the same cached extraction.

If you frequently search for the same roles, you will notice that skill extraction becomes nearly instantaneous after the first time. This is by design — the cache ensures you do not wait for AI processing on repeated queries.
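The behavior described here is a cache-aside pattern with a time-to-live. The sketch below uses a small in-memory class as a stand-in for Redis so that it runs without a server; TTLCache, get_skills, and fake_extract are hypothetical names, and the real service presumably talks to Redis directly.

```python
import time
from typing import Callable, Optional

TTL_SECONDS = 24 * 60 * 60  # the 24-hour lifetime stated in this section


class TTLCache:
    """Tiny in-memory stand-in for a Redis key with an expiry (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: dict[str, tuple[float, list[dict]]] = {}

    def get(self, key: str) -> Optional[list[dict]]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave as if the key never existed
            return None
        return value

    def set(self, key: str, value: list[dict], ttl: float = TTL_SECONDS) -> None:
        self._store[key] = (self._clock() + ttl, value)


def get_skills(title: str, cache: TTLCache,
               extract: Callable[[str], list[dict]]) -> list[dict]:
    """Return cached skills when present, otherwise extract and cache them."""
    key = "skill_extract:" + title.strip().lower()
    cached = cache.get(key)
    if cached is not None:
        return cached
    skills = extract(title)  # slow path: the model call
    cache.set(key, skills)
    return skills


def fake_extract(title: str) -> list[dict]:
    """Placeholder for the model call; returns fixed sample data."""
    print(f"extracting for {title!r}")
    return [{"name": "Python", "confidence": 0.98}]


cache = TTLCache()
get_skills("Python Developer", cache, fake_extract)    # calls the extractor
get_skills(" python developer ", cache, fake_extract)  # served from the cache
```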

How Skills Improve Search Accuracy

Skills play a central role in how Recruitier ranks search results. The recommendation engine uses confirmed skills in three ways:

Embedding generation: Skills are included in the vector embedding that represents your search query, pulling the semantic search toward listings that discuss these competencies. The skills_vector accounts for 45% of the total vector weight.
Keyword matching: Skill terms are used in the BM25 keyword search to find exact matches in job titles and descriptions. A skill keyword that appears in a job title receives an additional 4% title keyword boost. A skill keyword that appears in a listing’s listed skills receives a 2% skills keyword boost.
Filter-based pre-selection: When you confirm skills, they can be used to pre-filter the job database in PostgreSQL before the vector search runs. This narrows the candidate set to jobs that mention your specific skills, improving both relevance and performance.

This triple reinforcement means that carefully curating your skills list has a direct and measurable impact on the quality of your search results.

Best Practices

Always review the skills list — Even 30 seconds of review can improve results
Remove irrelevant high-confidence skills — The AI sometimes suggests skills that do not apply to your specific search context
Add niche skills — If your candidate specializes in something specific, add it manually to boost relevant results
Use exclusions sparingly — Only exclude skills when you specifically want to avoid certain types of roles
Check graph suggestions — They often surface relevant skills you may have overlooked
Focus on 4-6 key skills — Confirming too many skills dilutes the signal. Prioritize the skills that truly differentiate your candidate

Advanced

AI Model and Prompt Architecture

The skill extraction system uses Gemini 3 Flash Preview as the underlying language model, chosen for its speed and accuracy on structured extraction tasks. The model is called with a specialized prompt (JOB_TITLE_SKILL_EXTRACTION_PROMPT) that instructs it to:

Analyze the job title in the context of the Dutch and European recruitment market
Identify 5-10 skills that are most commonly associated with the role
Assign a confidence score between 0.50 and 1.00 to each skill
Return the results as structured JSON with name and confidence fields per skill
Order skills from highest to lowest confidence

The structured JSON output ensures consistent parsing and display in the UI. The model never returns free-form text — always a structured list of skills with scores.
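A consumer of this output would typically validate it before display. The sketch below checks the constraints listed above (5 to 10 skills, name and confidence fields, scores between 0.50 and 1.00, descending order). The exact response shape is an assumption: it treats the payload as a top-level JSON array, and parse_extraction is a hypothetical name.

```python
import json


def parse_extraction(raw: str) -> list[dict]:
    """Validate model output against the constraints listed in this section."""
    skills = json.loads(raw)
    if not isinstance(skills, list) or not 5 <= len(skills) <= 10:
        raise ValueError("expected a JSON list of 5 to 10 skills")
    for skill in skills:
        if not isinstance(skill, dict):
            raise ValueError("each skill must be a JSON object")
        name = skill.get("name")
        confidence = skill.get("confidence")
        if not isinstance(name, str) or not name.strip():
            raise ValueError("each skill needs a non-empty 'name'")
        is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
        if not is_number or not 0.50 <= confidence <= 1.00:
            raise ValueError(f"confidence for {name!r} must be between 0.50 and 1.00")
    # Re-sort defensively so the UI order never depends on the model's ordering.
    return sorted(skills, key=lambda s: s["confidence"], reverse=True)


# Sample payload using scores from the Python Developer example in this article.
raw = json.dumps([
    {"name": "Django", "confidence": 0.85},
    {"name": "Python", "confidence": 0.98},
    {"name": "FastAPI", "confidence": 0.80},
    {"name": "PostgreSQL", "confidence": 0.75},
    {"name": "Docker", "confidence": 0.70},
])
for skill in parse_extraction(raw):
    print(skill["name"], skill["confidence"])
```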

How Skills Connect to the Vector Search

When the recommendation engine processes your search, confirmed skills are woven into the search at multiple levels:

Skills vector generation — Your confirmed skills are combined into a text representation and converted into a 384-dimensional vector embedding. This embedding is matched against the skills_vector stored for each job in Qdrant.
Title vector generation — The job title plus key skills are embedded together for the title_vector comparison (35% weight).
Description vector generation — The full context (title + skills + description keywords) forms the description_vector comparison (20% weight).
Reciprocal Rank Fusion — Results from all three vector comparisons are merged using RRF: score = 1/(k + rank) summed across sources. This ensures that a job ranking highly on skills but moderately on title still gets a strong combined score.
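The RRF formula above can be sketched in a few lines. This is a plain, unweighted version for illustration: the constant k = 60 is a commonly used default and an assumption here, the job IDs are made up, and the sketch does not model how the 45%, 35%, and 20% weights are applied.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: dict[str, list[str]],
                           k: int = 60) -> list[tuple[str, float]]:
    """Merge ranked lists with score = sum over sources of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, job_id in enumerate(ranked_ids, start=1):  # ranks are 1-based
            scores[job_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up rankings from the three vector comparisons.
rankings = {
    "skills": ["job_a", "job_b", "job_c"],
    "title": ["job_b", "job_a", "job_c"],
    "description": ["job_b", "job_c", "job_a"],
}
for job_id, score in reciprocal_rank_fusion(rankings):
    print(f"{job_id}: {score:.5f}")
```

Here job_b comes out on top because it ranks first in two of the three lists, even though job_a wins on skills alone.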

Caching Architecture

The Redis cache uses the key format skill_extract:{normalized_title} with a 24-hour TTL (Time To Live). Key normalization includes:

Conversion to lowercase
Trimming leading and trailing whitespace
Collapsing multiple internal spaces to a single space

This means " Senior Python Developer " and "senior python developer" hit the same cache entry. The 24-hour TTL balances freshness (the AI model may improve over time) with performance (most searches for the same title happen within a single workday).
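The three normalization steps can be sketched as follows. The function name cache_key is hypothetical, and treating tabs and newlines the same as spaces is an assumption of this sketch rather than documented behavior.

```python
def cache_key(title: str) -> str:
    """Build the cache key from a job title using the normalization steps above."""
    # split() with no arguments trims both ends and collapses runs of whitespace.
    normalized = " ".join(title.lower().split())
    return f"skill_extract:{normalized}"


print(cache_key(" Senior Python Developer "))
print(cache_key("senior   python developer"))
```

Both calls produce the same key, so the second lookup would be served from the cache.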

Edge Cases

Very generic job titles

Titles like “Manager” or “Consultant” produce broad skill sets because these roles span many domains. The AI will extract general management or consulting skills rather than domain-specific ones. For better results, always add domain context: “IT Manager,” “Management Consultant (Finance),” or “Technical Program Manager.”

Compound or unusual titles

For non-standard titles like “Growth Hacker” or “People & Culture Lead,” the AI draws on patterns from similar roles it has seen in training data. Results may be less precise than for standard titles. Review the skills carefully and add any missing ones manually.

Dutch vs. English titles

The AI handles both languages. “Software Ontwikkelaar” produces similar skills to “Software Developer.” However, Dutch titles may occasionally produce slightly different confidence distributions because the AI’s training data includes both languages with different frequencies.

Title with seniority prefix

Adding “Senior,” “Lead,” or “Junior” to a title changes the extraction. Senior roles may include skills like “Architecture,” “Mentoring,” or “System Design” that would not appear for junior searches. The AI understands that seniority implies different competency expectations.

Business Logic: How Skills Affect Match Classification

After the vector search returns results, each job is classified based on keyword matching:

EXCELLENT_MATCH: All search keywords (including skill terms) are found in the job listing
GOOD_MATCH: At least 80% of the keywords, but not all of them, are found
POOR_MATCH: Less than 80% of the keywords are found

Poor matches can be automatically removed in bulk. Because every confirmed skill adds to the keyword list, confirming too many skills makes the 80% threshold harder to meet and can classify good jobs as poor matches. Focus on confirming the most critical skills to keep the threshold achievable.
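The classification rules can be sketched as a short function, which also shows why a longer keyword list is harder to satisfy. This is an illustration under assumptions: keywords are matched as case-insensitive substrings, classify_match is a hypothetical name, and the listing text is made up.

```python
def classify_match(keywords: list[str], listing_text: str) -> str:
    """Classify a listing by the share of search keywords it contains."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    text = listing_text.lower()
    found = sum(1 for keyword in keywords if keyword.lower() in text)
    if found == len(keywords):
        return "EXCELLENT_MATCH"
    # Integer comparison avoids floating-point edge cases at exactly 80%.
    if found * 100 >= 80 * len(keywords):
        return "GOOD_MATCH"
    return "POOR_MATCH"


listing = "Python developer with Django, PostgreSQL and Docker experience"
base = ["Python", "Django", "PostgreSQL", "Docker"]

print(classify_match(base, listing))                          # 4 of 4 found
print(classify_match(base + ["AWS"], listing))                # 4 of 5 found
print(classify_match(base + ["AWS", "Kubernetes"], listing))  # 4 of 6 found
```

The same listing drops from an excellent match to a poor one as two extra skills are confirmed, even though nothing about the job changed.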

Related

Creating a Search: Step 2 of the wizard where skills are reviewed
How Search Technology Works: How skills feed into the hybrid search engine
Search Results: How skills affect result ranking and match scores
