Jobgether
Senior Support Engineer

Senior Support Engineer – Cloud & AI Infrastructure (UK-based) – "Critical Role in AI/ML & Distributed Systems Support"
About the Role
This position is available through a partner company based in the United Kingdom. As a Senior Support Engineer, you’ll operate at the heart of production-grade cloud environments, focusing on AI, distributed computing, and GPU workloads.
This hands-on role demands deep technical expertise in diagnosing, escalating, and resolving complex infrastructure issues across:
- Linux systems
- Kubernetes and containerised environments
- Networking, storage, and GPU-based architectures
You’ll act as the lead escalation point for critical incidents, collaborating directly with engineering and customers to restore system stability, conduct root cause analysis, and drive permanent improvements. Beyond traditional support, you’ll contribute to:
- Enhancing observability and monitoring tools
- Automating troubleshooting workflows
- Optimising operational maturity across large-scale cloud platforms
This role suits individuals who thrive in dynamic environments and enjoy the challenge of ambiguous, high-stakes technical problem-solving.
Key Accountabilities
Diagnosis & Resolution
- Investigate, troubleshoot, and resolve high-impact production issues with root cause analysis as a top priority.
- Debug multi-layered systems, including:
- Linux environments (performance, logging, misconfigurations)
- Kubernetes clusters (node performance, pod behaviour, scaling)
- Networking layers (latency, packet loss, distributed traffic issues)
- Storage systems (I/O bottlenecks, cluster replication)
- GPU-accelerated workloads (driver issues, resource contention)
Escalation & Collaboration
- Serve as the senior escalation point for critical incidents, ensuring rapid resolutions.
- Work closely with engineering teams to:
- Reproduce and mitigate issues
- Identify systemic dependencies and drive long-term fixes
- Support customer-facing incidents, including AI/ML pipelines and inference/training workloads.
Tooling & Automation
- Develop and improve internal tools (automation scripts, dashboards) in:
- Python, Bash, Go, or equivalent languages.
- Enhance scripting efficiency for repetitive troubleshooting tasks.
Observability & Reliability
- Contribute to operational excellence by:
- Advocating for better monitoring and alerting mechanisms.
- Streamlining debugging workflows through structured post-incident reviews.
- Improve platform reliability and scalability.
Incident Response
- Participate in 24/7 incident-response rotations, including weekend on-call shifts.
Requirements
Technical Expertise
- Strong Linux administration skills (RHEL, Ubuntu, kernel and log debugging).
- Kubernetes expertise, including:
- Deep knowledge of resource scheduling, networking (CNI), and cluster management.
- Experience with self-hosted vs cloud-managed clusters.
- Cloud infrastructure proficiency:
- AWS, GCP, Azure, or OpenStack.
- Hands-on experience with orchestration, scaling, and cross-service dependencies.
- Networking fundamentals, with skills in:
- Troubleshooting complex distributed networks (routing, DNS, load balancing).
- Latency/connectivity patterns in containerised and cloud-native environments.
Troubleshooting & Debugging
- Ability to successfully reproduce and correct incidents under pressure.
- Capability to ascertain root causes from logs, metrics, and distributed tracing.
- Strong collaboration in cross-team scenarios (DevOps, SRE, platform teams).


Automation & Scripting
- Scripting in Python/Bash/Go for:
- Automating repetitive tasks (diagnostics, infrastructure validation).
- Building lightweight monitoring or alerting scripts.
AI & GPU Experience (Highly Desirable)
- Prior experience with GPU-based computing (CUDA, drivers, resource contention).
- Experience working on AI/ML workloads, including:
- Model training pipelines and inference.
- Debugging end-to-end distributed AI models.
Soft Skills
- Analytical thinking for ambiguous, high-pressure scenarios.
- Clear communication of technical issues to non-technical audiences.
- Ability to document findings for internal learning and improvement.
Benefits & Perks
- Competitive compensation (aligned with experience and expertise).
- Career growth opportunities, with emphasis on learning and technical development.
- Flexible working, high autonomy, and ownership over systems.
- Exposure to cutting-edge technology, including:
- Large-scale distributed systems.
- AI-driven infrastructure.
- Collaborative environment with skilled, internationally diverse engineering teams.
- Impact-oriented role, operating at the intersection of modern cloud and AI infrastructure.
- Inclusive and innovation-driven culture, focused on:
- Continuous improvement.
- Best practices, not process bureaucracy.
How Jobgether Works
This position is managed through an AI-matching process for fair and efficient candidate reviews. Applicants are screened for technical fit, then shortlisted directly for the partner company’s internal hiring team, which handles interview scheduling, assessments, and next steps.