Senior Site Reliability Engineer Job Description Template
Our company is looking for a Senior Site Reliability Engineer to join our team.
Responsibilities:
- Defining and implementing self-healing and self-management for the platforms;
- Ensuring that the Error Budgets in place are tracked and defended;
- Designing and measuring Service Level Objectives for our platforms, ensuring that they are effective measures of our clients’ success;
- Bringing expertise to bear on the design and engineering of the product to ensure reliability and high availability concerns are addressed up front;
- Maintaining and improving our observability tools such as Prometheus, Grafana, Thanos, and Splunk;
- Facilitating stand-ups and discussions with developers, engineering teams and project managers;
- Configuring and deploying regular software releases using a continuous delivery pipeline;
- Improving observability of our platform and applications to make the troubleshooting process straightforward;
- Ensuring our engineering processes have a focus on security, scalability and performance;
- Ensuring adherence to best practices for trustworthy computing and the secure development and implementation lifecycle;
- Breaking requirements down into stories and tasks, along with work estimates;
- Reporting directly to our SRE Architecture and Development Lead;
- Researching and gathering project requirements;
- Providing expertise and guidance to design and develop a wide range of key systems;
- Designing, developing and implementing solutions that improve the stability, scalability, availability, and performance of Cookpad’s Global service.
Requirements:
- Clear understanding of SRE principles and eagerness to put them into practice;
- History of mentoring junior engineers, and ability to influence across engineering teams;
- Substantial experience in a platform engineering role, with exposure to infrastructure and middleware platforms;
- Bachelor’s degree or equivalent qualification/relevant work experience;
- Strong communication skills in English and the ability to build working relationships with coworkers in locations around the globe;
- SRE/DevOps experience and comfort operating software in a Linux-based environment;
- Passion for solving problems using open source software;
- Experience in software engineering and automation;
- Familiarity with at least one cloud environment, for example AWS, GCP, or Azure;
- Strong coding skills in Ruby and Golang.
Benefits:
- Sports and social activities;
- Interest-free loans to buy a bike or a season ticket, so it’s even easier for you to get to work and start making a difference;
- Volunteering and charitable giving;
- Learning and training opportunities, including coaching, mentoring, events, community meetups and lots more;
- Flexible working and family-friendly policies.