
Kallikor

AI/ML Engineer

London
Posted about 2 months ago


Production Engineering Lead – Domain-Specific Language Model (DSLM) & Project Genome

At Kallikor, we're building the future of supply chain intelligence through AI-powered simulation digital twins. We create living digital representations of real-world operations—warehouses, distribution networks, and global logistics—that help organisations make better decisions faster.

We're at an inflection point: moving from AI-assisted tools to domain-specific AI that understands supply chains as deeply as our best engineers do. You'll be instrumental in building our first domain-specific language model (DSLM) and the foundation for Project Genome, an ambitious initiative to capture and synthesise the world’s supply chain knowledge into actionable intelligence.


About the Role

This is a production engineering role first. You’ll build robust Python systems that happen to train and serve LLMs—not the other way around. We need someone who:

  • Writes production-quality code
  • Debugs complex distributed systems
  • Thinks about reliability
  • Treats ML/LLMs as engineering tools, not monolithic black boxes

You’ll work across our entire AI stack, building FastAPI services, training pipelines, inference endpoints, and integrating everything into our existing Python backend. The ML is important—but engineering discipline is what makes it production-ready.

Learn more at kallikor.ai.


Your Opportunity

Build Production AI Systems

  • Design and implement full-stack systems (from FastAPI endpoints to inference services)
  • Own the architecture, not just model weights
  • Ship incrementally with production-grade reliability

Train & Deploy Our DSLM

  • Fine-tune models using Unsloth/Axolotl, and build the infrastructure around them
  • Develop data pipelines, evaluation frameworks, and deployment systems
  • Hit <200ms latency targets through engineering—not just chasing bigger GPUs

Integrate ML Into Our Backend

  • Extend FastAPI, PydanticAI, FastMCP, Memgraph with ML capabilities
  • Ensure clean abstractions, proper error handling, and observability
  • Avoid treating ML as a separate "service"; it should be a native part of our backend

Shape Project Genome’s Foundation

  • Work with our Principal Engineer to architect supply chain data ingestion
  • Design data pipelines, graph database structures, and incremental learning strategies
  • Focus on systems design as much as ML (data pipelines ≥ model size)


Mentor Through Code Review & Pairing

  • Raise the bar on code quality, testing, and production practices
  • Teach mid/junior engineers how to build ML systems that don’t fall over

Why You’re Made for This

  • You’re a strong production Python engineer who:

    • Writes clean, maintainable, tested code
    • Understands async/await, knows when to use generators rather than lists, and profiles bottlenecks
    • Builds FastAPI services for production traffic
    • Handles code reviews calmly, without drama
  • You’ve integrated LLMs in production and dealt with:

    • Streaming responses, rate limits, retries, intelligent caching
    • Prompt engineering, context management, error handling, cost control
  • You’ve trained or fine-tuned models and understand:

    • Data quality, training workflows, evaluation metrics, overfitting
    • Debugging why a model isn’t learning as expected
  • You think like a systems engineer:

    • Design for failure, add instrumentation, consider edge cases
    • Know that "works on my laptop" ≠ production-ready (monitoring, logging, alerting > demo)
    • Favour graceful degradation over brittle solutions
  • You navigate the ML landscape pragmatically:

    • Know enough about transformers/attention to make informed trade-offs
    • Ship simple heuristics if they beat complex models
    • Advocate for realism in production
  • You balance velocity + quality:

    • Ship incrementally but refactor proactively
    • Write tests that matter and leave the codebase better than you found it
  • You communicate trade-offs clearly:

    • Can explain why we’re choosing LoRA over full fine-tuning
    • Justify Fireworks vs. self-hosting or 7B vs. 70B models
    • Help the team make technology decisions confidently

What We’re Looking for

Must Have (Critical)

✔ 5+ years building production Python systems (backend services, APIs, data processing)

✔ Strong software engineering fundamentals:

  • Design patterns, testing, debugging, profiling

✔ Experience integrating LLMs in production:

  • OpenAI/Anthropic APIs, prompt engineering, streaming, rate limits
  • Frameworks like PydanticAI

✔ Understanding of ML training workflows (even as a practitioner, not a researcher; build the tools, not the math)

✔ Docker, CI/CD, production deployment experience

✔ Can read and understand PyTorch code (you don’t need to write novel architectures)


Nice to Have (Bonus)

🔥 Fine-tuning experience: LoRA, full fine-tuning, QLoRA
🔥 Distributed training basics: DeepSpeed, FSDP
🔥 Graph databases: Memgraph, Neo4j
🔥 Supply chain/logistics domain knowledge (helpful but not mandatory)
🔥 Agent frameworks experience: LangChain, PydanticAI


What You’ll Work With

Stack & Tools

  • Backend Stack: Python, FastAPI, PydanticAI, FastMCP, Memgraph, PostgreSQL
  • ML Stack: PyTorch, Unsloth/Axolotl (training), vLLM (inference), Weights & Biases
  • Models Used: Qwen 2.5, Llama 3.1, GPT-4 (OpenAI), Claude (Anthropic)
  • Infrastructure: AWS (flexible deployments), Docker, Kubernetes, GPU acceleration
  • Team: Your Principal Engineer (architectural partner), Mid Data/ML Engineer (data pipeline partner), Junior AI Engineer (learning mentor)

Example Projects You’ll Own

🛠 Build a FastAPI service that:

  • Handles streaming LLM responses
  • Implements error handling + retries
  • Optimises for network latency

🔬 Create a training pipeline that:

  • Processes production logs
  • Validates data quality
  • Triggers fine-tuning runs automatically

🚀 Deploy a 7B model with vLLM that:

  • Beats GPT-4 latency
  • Maintains quality on our domain
  • Hits <200ms targets

🔗 Design Project Genome’s ingestion architecture:

  • Process papers, documentation, operational data
  • Scale data pipelines efficiently
  • Ensure incremental learning

📊 Implement evaluation frameworks that:

  • Catch model regressions before production
  • Validate training improvements
  • Enable A/B testing in deployment

About Us

Kallikor fosters an environment where people can excel and belong. We believe:

✅ Healthy culture drives success
✅ Inclusion fuels innovation
✅ Diverse perspectives strengthen results

We commit to zero discrimination—all employees are valued for their contributions.


Skills

Python
FastAPI
LLM Integration
PyTorch
Docker
CI/CD
Model Fine-tuning
vLLM
PydanticAI
System Design
Distributed Systems
Prompt Engineering
API Development
Data Pipelines
Performance Profiling
Monitoring

Location

London, England, United Kingdom
