Senior Site Reliability Engineer - Experis IT Jobs

Site Reliability Engineer
£550 - £600 per day (Umbrella)
London (Hybrid)
6 Month Contract

Role Summary

Our client is searching for an experienced Site Reliability Engineer to join the team in London and support its current business transformation. The client is building its future on smoke-free products that are a better choice than continued smoking.

We are looking for a highly experienced Site Reliability Engineer with experience in developing processes, tools and automation for managing distributed systems in production environments. Our team combines software and systems engineering with system administration practices to develop creative engineering solutions to operations problems.

In this position, you will be an integral and highly visible member of the wider technology function, working across multiple teams to deliver reliable solutions and drive both efficiency and effectiveness. It is essential that the role holder is a highly collaborative individual. We are seeking individuals who enjoy automating and reducing manual work. Quality and time to market are important to us, so it's key that we have people who truly believe in this direction.

TECHNICAL REQUIREMENTS:

Proficiency in coding/scripting languages such as Python, Java, or Go, used to automate deployment, configuration, management, and monitoring processes.
Strong understanding and practical experience with observability/monitoring tooling like New Relic, Dynatrace, or Splunk. Ability to define and create monitors/alerts at both infrastructure and application layers.
Demonstrated automation skills, showcasing how you've applied automation to solve problems and reduce manual effort and activity.
Knowledge of distributed computing and cloud-native applications, including proficiency in AWS, Terraform, the ELK stack (alongside the monitoring tools mentioned above), PagerDuty/OpsGenie or similar, and Jenkins.

NON-TECHNICAL REQUIREMENTS:

Awareness of Site Reliability Engineering (SRE) principles, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
Understanding of development and operations models, with a holistic view of monitoring systems and processes.
Experience working in Agile environments, with the ability to adapt to changing requirements and priorities.

OTHER DESIRABLE SKILLS:

Exceptional problem-solving skills with an investigative mindset, capable of identifying root causes and implementing effective solutions.
Strong collaboration skills, able to work effectively with cross-functional teams to achieve common goals.

Safwaan Ibrahim