Anthropic

Staff Engineer, Datacenter Server Lifecycle

London

£325k – £390k/yr

Posted 2 days ago

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role

Anthropic is expanding beyond cloud infrastructure, and this role sits at the heart of that effort. As a Staff Engineer on the Datacenter Server Lifecycle team, you will own the end-to-end operational journey of every machine in our facility — from initial provisioning and deployment, across its working life, through maintenance and refresh, and all the way to decommissioning. This is greenfield work: you will help define the processes, tooling, and operational standards that govern how we run and retire hardware at scale.

A distinguishing aspect of this role is its deep intersection with security. The machines in our datacenter handle some of the most sensitive workloads in AI — training frontier models and serving millions of users interacting with Claude. Ensuring that every machine in the fleet is trusted, attested, and operating with a verified chain of integrity from the hardware up is a core part of the job, not an afterthought. You will partner closely with our Infrastructure Security team to define and enforce trusted compute standards across the lifecycle, from secure provisioning through end-of-life handling.

Key Responsibilities

Lead the build-out of automation to support datacenters containing tens of thousands of servers.
Define and own the end-to-end server lifecycle strategy — from provisioning and deployment through operation, maintenance, refresh, and decommissioning — and maintain automation and operational procedures for common lifecycle events (e.g., hardware failures, firmware upgrades, fleet rotations).
Partner closely with Infrastructure Security to design and enforce trusted compute standards across the server lifecycle.
Work closely with our Networking team to ensure end-to-end connectivity across all sites.
Build and maintain tooling to track machine health, configuration, and operational status across the full datacenter fleet.

Minimum Qualifications

Hands-on experience with server hardware, including rack deployment, cabling, troubleshooting, and understanding failure modes at scale.
End-to-end understanding of hardware lifecycle management: asset tracking, provisioning workflows, maintenance scheduling, and decommissioning practices.
Proficiency in at least one programming language (e.g., Python, Rust, Go, or Java).
Working knowledge of modern cloud infrastructure, including Kubernetes, Infrastructure as Code, AWS, and GCP.
Ability to communicate clearly and build consensus with a wide range of stakeholders.
Comfort navigating ambiguity and making progress on complex, cross-functional problems.
Willingness to travel occasionally to datacenter sites across North America.

Preferred Qualifications

8+ years of experience in datacenter operations, hardware infrastructure management, or a closely related discipline.
Hands-on experience with GPU or AI accelerator hardware (e.g., NVIDIA A100/H100, AMD MI300, Google TPUs, or AWS Trainium) and an understanding of their operational demands.
Familiarity with modern provisioning tooling such as coreboot, LinuxBoot, or u-root.
Experience building or contributing to datacenter automation or fleet management platforms.
Experience building and deploying server operating system distributions across thousands of hosts.
Background in large-scale capacity planning and hardware refresh strategy, ideally at a hyperscaler or large cloud provider.
Experience with trusted compute and hardware security concepts such as secure boot, TPM, hardware attestation, and firmware verification — or a strong desire to develop deep expertise in this area.

Compensation

The annual salary range for this role is £325,000 – £390,000 GBP.

Logistics and Additional Information

Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience.

Required field of study: A field relevant to the role, as demonstrated through coursework, training, or professional experience.

Years of experience: Requirements will correlate with the internal job level of the position.

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time, though some roles may require more frequent office attendance.

Visa sponsorship: We do sponsor visas! If we make you an offer, we will make every reasonable effort to support visa requirements.

We encourage you to apply even if you do not feel you meet every qualification. Underrepresented groups are particularly encouraged to apply.

Career Highlights

The easiest way to understand our research directions is to read our recent publications, including contributions from before Anthropic such as:

GPT-3
Circuit-Based Interpretability
Multimodal Neurons
Scaling Laws
AI & Compute
Concrete Problems in AI Safety
Learning from Human Preferences

How We’re Different

Big science approach: We value impact over incremental progress and work as a cohesive team on large-scale efforts.
Collaborative environment: Regular research discussions ensure all individuals prioritize the most impactful work.
Safety-focused: We believe responsible AI research is an empirical science akin to physics and biology.
Representation matters: We embrace diverse perspectives in our quest to advance trustworthy AI.

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.

Key Benefits

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
A collaborative and supportive work environment in San Francisco

Come work with us! Apply here

Skills

Server Hardware
Hardware Lifecycle Management
Programming
Cloud Infrastructure
Networking
Automation
Security Standards
Machine Health Tracking
GPU Hardware
Datacenter Operations
Capacity Planning
Firmware Verification
Trusted Compute
AI Accelerators
Provisioning Tooling
Fleet Management

Location

London, England, United Kingdom