IMU Biosciences

Senior Software Engineer, Semantics

London

Posted 27 days ago

About IMU Biosciences

IMU Biosciences has developed proprietary platform technologies that generate and translate vast system-level immune data into actionable insights and tools to drive the development of precision medicines across a variety of diseases. Built on over a decade of research at King’s College London and the Francis Crick Institute, IMU leverages advanced immune profiling with proprietary AI and machine learning analytics to uncover novel clinical immune signatures. IMU continues to establish partnerships with leading pharma and biotech companies to advance disease diagnosis, optimise product selection, and improve patient stratification and monitoring, while also building its own pipeline of innovative products.

About The Role

IMU is applying cutting-edge immune system science, data engineering, and machine learning to understand human health in a deeper and more actionable way. As our work scales, we are building the platform capabilities needed to make complex biological and computational data structured, traceable, reproducible, and reusable across the organisation.

We are looking for a Senior Software Engineer, Semantics to help build the semantic and metadata capabilities of the core Data Platform.

This is not a standalone ontology or knowledge graph initiative. The role sits directly within the Data Platform team and focuses on building practical platform systems that help scientists, computational immunologists, and engineers work with trusted and well-structured scientific data.

You will work closely with the Computational Immunology team to understand how analytical and machine learning workflows produce data, and help ensure those outputs are consistently structured, versioned, lineage-aware, discoverable, and reusable inside the platform.

A core part of the role is helping turn fragmented scientific and computational outputs into usable data products that can be reliably found, assembled, interpreted, and reused by laboratory, project management, Computational Immunology, and Data Platform teams.

The work includes metadata systems, lineage and provenance capture, dataset contracts, FAIR data practices, data catalog capabilities, and semantic integration between pipelines, datasets, and scientific outputs.

This is a hands-on engineering role for someone who enjoys working across platform engineering, scientific workflows, metadata systems, and cloud-native infrastructure, while staying grounded in practical delivery and operational ownership.

The role is hybrid, with an expectation of working from our London office a couple of days per week.

Team and ways of working

You will join a small, growing Data Platform function working closely with Computational Immunology, wet lab, and clinical-facing teams.

This role sits within the platform layer: helping build the shared foundations that allow teams to ingest, structure, govern, discover, process, assemble, and reuse scientific data reliably.

You will not be working in isolation. The role is deeply connected to the wider platform effort and will involve close collaboration with engineers responsible for ingestion, orchestration, infrastructure, security, transformation, and platform operations.

As a senior engineer, you will be expected to:

Own substantial technical problems from discovery through implementation and operation.

Work directly with stakeholders to gather requirements and translate them into practical platform capabilities.

Influence platform architecture and engineering standards alongside the wider Data Platform team.

Make pragmatic technical trade-offs in environments with evolving scientific and operational requirements.

Contribute to shaping how the platform evolves as usage, scale, and regulatory expectations grow.

You should be comfortable balancing hands-on implementation work with technical leadership, collaboration, and operational ownership.

What you will do

Build and maintain metadata, semantic, lineage, and provenance capabilities within the core Data Platform.

Work closely with the Computational Immunology team to understand analytical workflows and translate them into dataset contracts, metadata standards, and reusable platform capabilities.

Develop systems for structuring, validating, registering, versioning, and governing scientific datasets and computational outputs.

Build ingestion pathways that return outputs from analytical and machine learning pipelines to the Data Platform as governed and reusable scientific datasets.

Help establish practical patterns for discovering, assembling, and delivering trusted datasets to laboratory, Computational Immunology, and operational teams.

Improve data discoverability, usability, lineage, provenance, auditability, reproducibility, and reuse through well-structured, named, versioned, archived, and analysis-ready data products aligned with practical FAIR data principles.

Contribute to data catalog and scientific knowledge graph capabilities that connect datasets, workflows, biological entities, analytical outputs, and scientific conclusions.

Build AWS-native services, APIs, automation, and platform tooling using modern engineering practices.

Work closely with scientists, computational immunologists, software engineers, and platform users to turn real scientific and data problems into reliable platform capabilities.

Improve observability, reliability, documentation, maintainability, and operational maturity across semantic and metadata services.

What we are looking for

Core Experience

We do not expect every candidate to have used every technology in our stack. We are mainly looking for strong engineering judgement, practical delivery experience, and evidence of building reliable systems in complex data environments.

Strong software engineering experience in Python.

Practical experience building and operating cloud-native systems on AWS.

Experience working with data platforms, metadata systems, or data-intensive distributed systems.

Experience with APIs, distributed services, infrastructure-as-code, CI/CD, and production engineering practices.

Experience working with data lineage, metadata, schema management, versioning, validation, or data governance concepts.

Experience supporting or integrating analytical, machine learning, or scientific workflows into production systems.

Technologies we use or value highly

The closer your experience is to this stack, the faster you are likely to be productive, but we care more about engineering depth and judgement than keyword matching.

AWS ecosystem

Pulumi or equivalent infrastructure-as-code

GitHub Actions or equivalent CI/CD

Nextflow or equivalent scientific workflow orchestration systems

Data lakes, metadata systems, lineage systems, schema management, and governed dataset patterns

FAIR data principles, provenance, auditability, and reproducibility

Knowledge graphs, semantic modelling, ontologies, RDF / OWL, or related technologies

Data catalogs and metadata management systems

Containers and modern DevOps practices

Nice to have

Experience in scientific, bioinformatics, computational biology, immunology, clinical, healthcare, or regulated data environments.

Experience with large-scale analytical or biomedical datasets.

Experience implementing data catalogs, lineage systems, or knowledge graph capabilities.

Experience with modern data lakehouse architectures, including open table formats such as Delta Lake or Apache Iceberg; query engines such as AWS Athena or DuckDB; processing frameworks such as Apache Spark; and orchestration/compute platforms such as AWS Batch.

Experience with regulated software, security, quality, or healthcare frameworks such as ISO 27001, IEC 62304, HIPAA, or similar.

Experience building self-service platform capabilities for scientific or data-intensive teams.

What Success Looks Like

You will help make the Data Platform a dependable foundation for scientific discovery and computational research.

Success means computational outputs are no longer treated as disconnected pipeline artifacts, but as trusted and reusable scientific assets with clear structure, metadata, provenance, lineage, and lifecycle management.

Laboratory and Computational Immunology teams should be able to reliably find, assemble, interpret, and reuse trusted datasets without needing bespoke manual data wrangling for each project.

The platform should increasingly support discoverable, versioned, semantically connected scientific datasets that can be reused across workflows, studies, and future AI systems.

Why join us

IMU is building a company around a big scientific idea: that a deeper, data-driven understanding of the immune system can change how we understand, monitor, and ultimately improve human health.

We work with rich, complex biological and computational data and combine scientific expertise, machine learning, and modern platform engineering to generate insight from the immune system. As our work moves closer to the clinic, the systems we build now need to support both rapid scientific discovery and the discipline required for reproducibility, governance, and future regulated use.

This is a rare opportunity to help shape the semantic and metadata foundations of a modern scientific data platform while the architecture and operating model are still being defined.

You will not be maintaining a legacy metadata system or building isolated ontology models disconnected from real workflows. You will help build practical platform capabilities that directly support computational science, reproducibility, AI-readiness, and scientific reuse at scale.

The team is collaborative, scientifically curious, and pragmatic. We value strong engineering, sensible architecture, operational ownership, and people who can work effectively across disciplines while staying focused on practical delivery and real scientific outcomes.

The Data Platform is central to IMU’s future, which means this role offers genuine scope to influence technical direction, shape engineering standards, and help define how scientific data is structured, governed, discovered, and reused across the organisation as the platform scales.

You will join while foundational decisions are still being made, with the opportunity to build systems and patterns that can grow with the company from research-scale workflows toward future clinical and regulated environments.

We support hybrid working, with time in our London office a couple of days per week, and flexibility around how people do their best work.

Skills

Python

Cloud-Native Systems

Data Platforms

Metadata Systems

APIs

Distributed Services

Infrastructure-As-Code

CI/CD

Data Lineage

Data Governance

Machine Learning

Scientific Workflows

Data Catalogs

Semantic Modelling

Knowledge Graphs

DevOps Practices

Location

London, England, United Kingdom