Aceolution
Senior Site Reliability Engineer

Senior Lead Site Reliability Engineer – Observability
We are seeking an experienced Lead/Senior Site Reliability Engineer – Observability to join a high-performing engineering team responsible for designing, building, and operating large-scale observability platforms that underpin mission-critical cloud services. The role involves architecting highly scalable monitoring, logging, alerting, and telemetry systems while collaborating closely with software engineering, platform, and infrastructure teams.
About the Role
This opportunity is ideal for engineers who thrive on solving complex infrastructure challenges, working with large-scale distributed systems, and building resilient cloud platforms.
Key Responsibilities
- Design, implement, and maintain scalable and highly available observability platforms.
- Build and operate enterprise-scale monitoring, logging, and alerting solutions.
- Design and optimise Prometheus-based monitoring architectures for large-scale environments.
- Deploy and manage high-performance Elasticsearch clusters for centralised log storage and analytics.
- Build and maintain high-throughput event streaming pipelines using Kafka.
- Develop self-service APIs, libraries, and tools that enable engineering teams to manage their own observability.
- Automate infrastructure deployment with Terraform, following Infrastructure as Code (IaC) practices.
- Partner with engineering teams to enhance system reliability, monitoring capabilities, and operational excellence.
- Troubleshoot production issues, conduct root-cause analysis, and implement long-term preventive solutions.
- Participate in an on-call rotation to diagnose and address production disruptions.
- Drive automation, operational excellence, and continuous improvement across cloud platforms.
Required Skills & Experience
- At least 5 years designing, deploying, and operating medium-to-large-scale distributed systems in Linux environments (Debian, Ubuntu, etc.).
- At least 2 years of programming experience in one or more of the following:
  - Go
  - Python
  - Ruby
  - Scala
  - Bash
- Deep understanding of Site Reliability Engineering (SRE) principles and best practices.
- Experience building and supporting highly available cloud infrastructure.
- Strong analytical, troubleshooting, and problem-solving skills.
- Demonstrated ability to collaborate effectively in cross-functional environments.
Technical Skills (Preferred Expertise)
Familiarity with several of the following:
- SRE & Observability (Monitoring, Logging, Alerting)
- Prometheus, Thanos, Cortex, Grafana, Graphite
- ELK Stack components (Elasticsearch, Logstash, Kibana)
- Kafka for real-time event streaming
- Terraform & Infrastructure as Code (IaC)
- Ansible for configuration management
- Consul for service mesh and service discovery
- Snowflake for data warehousing
- Linux administration
- DevOps practices (CI/CD pipelines, GitOps, etc.)


Preferred Qualifications
- Hands-on experience with large-scale observability platforms.
- Strong background in distributed systems and cloud-native infrastructure.
- Ability to process high-volume monitoring, logging, and telemetry data.
- Dedication to automation, scalability, and operational excellence.
- Adaptability to thrive in a remote-first, collaborative environment.
Eligibility
This role requires unrestricted work eligibility in the United Kingdom (no employer sponsorship required).
Why Join Us?
- Collaborate with high-performing engineers on cutting-edge cloud infrastructure.
- Address complex technical challenges while contributing to scalable, automated, and resilient platforms.
- Work in a remote-first environment with flexible, mission-aligned opportunities.