Rodeo

Mappa

Senior Data Engineer (AI-First Platform)

London
$6k/month
Posted 3 days ago

Senior Data Engineer

Full-Time | Remote/Hybrid | £60,000

About the Company

We are an AI-native distribution company for creators and media IP, specialising in content optimisation through automation. Our platform intelligently processes raw content, determines optimal posting strategies, and refines performance insights — akin to Palantir for social media in the creator economy.

The platform is organised by content vertical (video, podcasts, and future expansions). Each vertical feeds into a unified content and performance graph that grows smarter with each campaign. The team balances YC-style agility with Palantir-level rigour in systems and data, fostering rapid iteration and deep engineering discipline.


Position Overview

We’re seeking a Senior Data Engineer with a passion for AI/ML systems, responsible for designing and scaling data-driven AI infrastructure at enterprise scale. Experience with high-volume, unstructured data pipelines and AI-forward datasets is essential.

Your role: Build and maintain robust data layers capable of ingesting GBs per day of video, image, and performance data, while creating LLM/agent-friendly datasets to power internal AI models.

Key focus areas: In-house social API integrations, scalable data infrastructure, and AI-aligned pipeline optimisation.


Core Responsibilities

1. In-House Social API Integration

  • Redesign and own data ingestion pipelines to transition from third-party social APIs to a fully self-managed, scalable solution.
  • Engineer API integrations and bridge data gaps for valid, actionable social media signals.

2. High-Volume Data Infrastructure & Architecture

  • Scale and optimize data pipelines for massive volumes of unstructured data (video, metadata, performance logs).
  • Build fault-tolerant, performance-optimized storage using PostgreSQL for structured data and specialized storage for AV/ML asset tracking.
  • Develop scheduling, ETL, and orchestration frameworks to handle concurrent team campaigns without bottlenecking.

3. LLM & Agent Optimised Datasets

  • Act as the technical bridge between massive-scale data and internal AI agents.
  • Structure and pre-process natural-language, semi-structured, and unstructured datasets (Markdown, text snippets, raw prompts, logs) for effective inference.
  • Implement data versioning, lineage, and attributes for high-context AI agents, enabling ML scenarios such as auto-tuning content discovery.

Requirements & Qualifications

Technical

  • 5+ years as a Data Engineer with AI/ML-adjacent experience, preferably in fast-moving startups or enterprise-grade creative/platform companies.
  • Python: Intermediate to advanced, with a bias toward clean, production-grade code and scalability.
  • PostgreSQL: Production-ready expertise in optimisation, partitioning, and transactional consistency.
  • Data Platforms: Hands-on experience with:
    • Orchestration: Airflow, Prefect, or other workload schedulers.
    • Batch/Stream: Kafka, Flume, Spark, or cloud solutions (e.g., Dataflow).
    • Storage: Variants of Parquet/ORC/CSVs, object storage, and managed DBs (e.g., Snowflake, BigQuery).
    • ETL: Batch/near-real-time processes, debatching, rigour in data quality, and clean field transitions.
  • AI/LLM Alignment: Prior hands-on experience (or deep curiosity) in tasks like:
    • LLM fine-tuning data preparation or dataset design.
    • Legal metadata organisation for compliance-intensive AI ingestion use cases.
    • Formatting and ingesting SOTA datasets in custom pipelines.

Creative & Cultural Fit

  • English Proficiency: C1/C2 level, or near-native. Must intuitively navigate academic/peer reviews (e.g., design docs, architectural diagrams) and inform non-technical leaders with clarity and excitement.
  • Problem-Solving: No deadlines are too hard; no waste tolerated. Think end-to-end when taking ownership of components.
  • Startup Mindset: Thrives in small teams, is continually challenged to scale elegantly, and treats failure as a catalyst for growth.

Deliverables & Soft Metrics

  • Pipeline latency and uptime: SLA of at least 99.9%.
  • Social data metadata quality: >95% of fields scrubbed and normalized.
  • Agent/ML influence test scores: 20% improvement in campaign time-to-accuracy.

Location & Flexibility

Hybrid (UK-Based, South London - WeWork near Waterloo Station)

  • Model: 3 days on-site (flexible roster) during office hours, with a late shift to accommodate overlap with the US team.
  • Preferred but not mandatory for candidates outside the UK (all interviews are remote).

Fully Remote (Latam-Based)

  • Timezone: Works on Pacific Standard Time (PST) hours; no local office requirement.

Compensation: £60,000 (candidate-grade benches offset via):

  • Early equity
  • Performance bonus 15% once year-2 hinges (social maturity)

Flexible final arrangements (per participation), but no trailing work hours monoxide.

Skills

Data Engineering
Python
PostgreSQL
Data Orchestration
Database Optimization
Data Pipelines
AI
LLM
Natural Language Processing
Content Management
Performance Data
Social Media
Infrastructure
Scalability
Data Processing
Collaboration

Location

London, England, United Kingdom
