micro1

(Coding/Agentic AI) Member of Technical Staff

United Kingdom

Posted 2 days ago

Member of Technical Staff, Coding Research

Job Type: Full-time
Location: Remote

The Role

We are seeking a Member of Technical Staff to help advance the evaluation and development of frontier coding agents. Sitting at the intersection of AI research, software engineering, and model evaluation, you will design the benchmarks, methodologies, and data systems that shape how next-generation coding models are measured and improved.

What You'll Do

Design and own evaluation frameworks for coding agents, including benchmark specifications, scoring methodologies, rubrics, and quality standards.
Lead end-to-end research initiatives focused on measuring and improving coding model performance across diverse software engineering tasks.
Develop high-quality datasets, golden examples, and evaluation protocols that enable reliable assessment of frontier coding systems.
Analyze model behavior and failure modes, identifying systematic weaknesses and translating findings into actionable improvements for training and evaluation.
Build tooling and infrastructure that support large-scale experimentation, data generation, review workflows, and evaluation pipelines.
Establish best practices for coding-agent assessment, ensuring methodological rigor, reproducibility, and measurement quality.
Partner closely with researchers, engineers, and applied AI teams to design experiments and evaluate emerging model capabilities.
Contribute to technical reports, benchmark studies, and client-facing research initiatives that communicate model performance and insights.

What We're Looking For

Strong software engineering background with expertise in Python, C++, or comparable programming languages.
3+ years of experience in software engineering, machine learning, AI research, evaluation, or related technical disciplines.
Experience designing, reviewing, or validating technical assessments, benchmarks, coding tasks, or evaluation methodologies.
Familiarity with large language models, coding agents, reinforcement learning, model evaluation, or related AI systems.
Proven ability to build tooling, automate workflows, and improve technical processes through systematic experimentation.
Strong analytical skills, with the ability to investigate model behavior and derive insights from complex technical systems.
Excellent written and verbal communication skills, including the ability to clearly articulate technical findings to diverse audiences.
Comfortable operating in fast-moving research environments with significant ambiguity and evolving priorities.

Preferred

Experience working on frontier AI systems, coding agents, or model evaluation research.
Deep interest in understanding how data, evaluations, and feedback mechanisms influence model capabilities.
Track record of independently driving ambiguous technical or research projects from conception to execution.
Experience designing benchmarks or datasets for machine learning systems at scale.
Familiarity with agentic workflows, tool use, reinforcement learning, or post-training methodologies.
Publications, open-source contributions, or demonstrated technical leadership in AI, machine learning, or software engineering.

Skills

Software Engineering

Python

C++

Machine Learning

AI Research

Model Evaluation

Benchmarking

Data Systems

Analytical Skills

Technical Communication

Reinforcement Learning

Experimentation

Tooling

Automation

Evaluation Methodologies

Coding Tasks

Location

United Kingdom