Platform Support Engineer (EMEA)
London, England, United Kingdom
Who We Are
Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems—designed to take ideas from research to production with less friction.
Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in.
We serve solo researchers, startups, and large enterprises. Lightning AI operates globally with offices in New York City, San Francisco, Seattle, and London, and is backed by Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.
What We’re Looking For
Lightning AI is looking to hire Platform Support Engineers to join our EMEA Customer Experience team, supporting ML engineers running large-scale training and inference workloads across cloud infrastructure, Kubernetes, and GPU platforms in production environments.
This role is not a ticket router or traditional support engineer. You are a technical partner to ML teams, helping diagnose failures, improve reliability, and guide customers through complex distributed systems problems.
The problems range from Kubernetes scheduling and GPU orchestration to distributed PyTorch failures, inference latency, networking bottlenecks, storage performance, and platform reliability. You’ll gain exposure to a wide variety of real-world AI workloads across industries and help shape the infrastructure powering the next generation of ML applications.
We are currently hiring for two EMEA shifts (9AM–7PM CET/CEST):
- Sunday–Wednesday
- Saturday–Tuesday OR Thursday–Sunday
This role is hybrid out of our London office, with an in-office requirement of at least 2 days per week and occasional team and company offsites. We are not able to provide visa sponsorship for this role at this time.
What You'll Do
Work Directly With ML Engineers
- Partner directly with customer engineering teams running training and inference workloads in production
- Help customers diagnose and resolve complex distributed systems and ML infrastructure issues
- Act as a technical advisor during high-impact incidents and platform degradation events
- Translate infrastructure-level issues into actionable guidance for ML engineers
- Build credibility with customers through strong technical reasoning and clear communication
Debug ML Infrastructure & Distributed Workloads
- Investigate failures involving distributed training, Kubernetes orchestration, GPU allocation, networking, and storage systems
- Troubleshoot PyTorch, CUDA, NCCL, and inference-serving issues
- Analyze logs, metrics, traces, and system behavior to isolate root causes
- Debug containerized workloads running across Kubernetes and bare-metal GPU environments
- Support customers scaling workloads across multi-node GPU systems
- Diagnose performance bottlenecks involving compute, memory, networking, or storage
Improve Reliability & Platform Operations
- Identify recurring patterns across customer issues and drive long-term reliability improvements
- Contribute to post-incident reviews and operational improvements
- Build internal tooling, automation, documentation, and runbooks
- Partner closely with infrastructure, networking, and platform engineering teams
- Help improve observability, operational visibility, and troubleshooting workflows
- Improve the customer experience through better processes and technical guidance
What This Role Is Not
To set clear expectations:
- This is not a traditional help desk or ticket-routing support role
- This is not purely customer success or account management
- This is not a backend engineering role
- This is not a passive escalation position
This role is for engineers who enjoy solving difficult technical problems while working closely with other engineers.
What You’ll Need
Required Qualifications
Infrastructure & Systems
- Strong software engineering and systems troubleshooting background
- Experience with Kubernetes and containerized environments
- Linux systems knowledge, including networking, storage, process management, and performance tuning
- Experience with cloud infrastructure and distributed systems
- Experience with observability and debugging tools such as Prometheus, Grafana, or OpenTelemetry
ML Infrastructure Experience
- Hands-on experience operating machine learning workloads in production or research environments
- Experience with distributed ML systems and tooling such as PyTorch, CUDA, or NCCL
- Familiarity with GPU infrastructure and orchestration
- Experience troubleshooting performance, reliability, or scaling issues in ML infrastructure
- Understanding of the operational challenges involved in running ML systems at scale
Collaboration
- Strong communication skills and ability to work directly with highly technical customers and engineering teams
- Comfortable operating in fast-moving, highly ambiguous environments
- Enjoys solving complex technical problems collaboratively
Nice-to-Haves
- Experience with large-scale model training or distributed inference systems
- Familiarity with Ray, Kubeflow, Slurm, or similar distributed scheduling platforms
- Experience with InfiniBand, RDMA, or high-performance networking
- Experience operating bare-metal infrastructure
- Familiarity with storage systems commonly used in ML environments
- Experience working at an AI infrastructure, cloud, MLOps, or developer tooling company
- Contributions to platform engineering, developer infrastructure, or operational tooling projects
- Experience writing automation, tooling, or scripts in Python or similar languages
Compensation
We are committed to offering competitive compensation that reflects the value each team member brings to our mission. Final offers are based on factors such as experience, skills, geographic location, and role expectations. In addition to base salary, our total rewards package for eligible roles includes a discretionary bonus, a meaningful equity component, and comprehensive benefits.
The anticipated annual base salary range for this role is:
£75,000 - £95,000 GBP
Benefits and Perks
We offer a comprehensive and competitive benefits package designed to support our employees’ health, well-being, and long-term success. Benefits may vary by location, team, and role.
Benefits include:
- Comprehensive medical, dental and vision coverage (U.S.); Private medical and dental insurance (U.K.)
- Retirement and financial wellness support (U.S.); Pension contribution (U.K.)
- Generous paid time off, plus holidays
- Paid parental leave
- Professional development support
- Wellness and work-from-home stipends
- Flexible work environment
At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.
Sector: AI/ML; Subsectors: Open Source; Functions: Customer Service

