
OLIX

Architect, Staff & Senior Systems Software Engineer

London
Posted about 2 months ago

OLIX – Architect, Staff & Senior Systems Software Engineer

About OLIX

AI is growing faster than any technology in history, and the explosion in demand has created a massive infrastructure gap. We’re unable to build chips or power stations quickly enough to keep up. The industry still relies on a ten-year-old hardware blueprint that has hit its limits. A faster, more efficient paradigm is the biggest economic opportunity of the next century and will create the most important company of the decade.

The OLIX Decode Accelerator 1 (DX-1) is the first accelerator designed exclusively for decode, enabling a step change in system-level performance through rack-scale co-design of logic, data movement, packaging, optics, and interconnect.

The Role

We’re searching for Architect, Staff & Senior Systems Software Engineers to own the software stack that brings our next-generation DX-1 accelerator to life as a production inference platform.

The DX-1 is a dataflow architecture built for decode in a disaggregated inference environment. Your mission is to optimise large AI models at rack scale by designing the runtime and serving stack that connects PyTorch and JAX down to the hardware.

This is a whole-stack systems role:

  • Work at the intersection of runtime, network, and accelerator
  • Partner closely with hardware, compiler, and modelling teams
  • Optimise serving performance and shape the platform’s direction

You’ll contribute not just through a list of responsibilities, but through the standards you set, the systems you build, and the teams you grow.


Responsibilities

  • Own the runtime & serving stack: Design, build, and extend a distributed inference and serving stack (e.g. vLLM, SGLang, NVIDIA Triton) on DX-1, tackling challenges holistically rather than treating components as black boxes.
  • Scale distributed inference: Define how inference scales across many accelerators, including tensor/pipeline/data parallelism, collective communication patterns, KV-cache management and offload, and memory-aware scheduling across disaggregated topologies.
  • Engineer for reliability at scale: Ensure distributed inference remains dependable through failure handling, graceful degradation, load balancing, and recovery, while establishing standards for observability, tracing, and tooling that enable teams to diagnose problems across the runtime, network, and accelerator.
  • Drive bring-up: Evaluate system behaviour pre-silicon through simulation, emulation, FPGA prototyping, or analytical modelling, root-cause issues during bring-up, and influence hardware/software trade-offs early.
  • Set standards across teams: Identify the highest-impact systems challenges and ensure they’re addressed. Hold and articulate a clear technical standard while raising the bar through code review, pairing, and constructive challenge. Build leveraged impact by creating shared systems and frameworks rather than assuming sole ownership.
  • Shape direction: Bring clarity to ambiguous cross-team problems, push for structural system improvements, and influence long-term platform strategy based on external research, competitive analysis, and industry trends.


Requirements

Must-Have:

  • Deep expertise in systems software, with hands-on experience in C/C++ and a deep-rooted understanding of runtimes, networking, and accelerators.
  • Proven ownership of a difficult end-to-end systems challenge—preferably in distributed inference/serving stack development (vLLM, SGLang, NVIDIA Triton, or TensorRT-LLM).
  • Ability to connect accelerators to PyTorch/JAX without treating them as opaque black boxes.
  • Strong understanding of parallelism strategies, collective communication, KV-cache, memory management, and cluster-scale reliability.
  • Whole-stack debugging skills, including end-to-end tracing, profile-guided optimisation, load replay, and reasoning from architectural constraints such as SRAM limits, host-device latency, KV footprint, memory bandwidth, and collective latency.
  • Solid judgement in weighing trade-offs (speed, cost, quality), with a track record of handling late-stage risks calmly and methodically.
  • Strong communication skills, enabling cross-functional alignment (hardware, compilers, modelling) without relying on managerial authority.

  • Degree in Computer Science, Electrical Engineering, Mathematics, or a related field.

Nice-to-Have:

  • Experience with dataflow architectures or non-GPU accelerators.
  • Pre-/post-silicon bring-up on custom hardware (ASIC or FPGA).
  • Production observability at scale (hardware counters, Prometheus/Grafana-style exports).
  • Familiarity with HPC cluster design, high-speed networking, and distributed systems.

Perks and Benefits

  • Competitive salary: Paid in line with experience, skills, and location.
  • Equity & ownership: Meaningful stock options awarded as recognition of your contribution and partnership in OLIX’s growth.
  • Proximity bonus: A Living-Local Bonus is paid annually if you live within 20 minutes of our office.
  • Retirement benefits: Matched contributions to the company’s retirement plans for your long-term security.

Note on eligibility: Due to U.S. export control restrictions, we’re able to offer roles only to candidates whose most recent permanent residency or citizenship isn’t in certain restricted jurisdictions (e.g., China, Russia). The full eligibility criteria may be reviewed upon request.


Skills

Systems Software
C/C++
Distributed Inference
Serving Stack
Parallelism Strategies
Collective Communication
KV-Cache Management
Memory Management
Reliability
Whole-Stack Debugging
Observability
Communication
Technical Standards
Cross-Functional Collaboration
Dataflow Architecture
HPC Cluster Design

Location

London, England, United Kingdom
