Site Reliability Engineer
6-Month Contract
We are looking for world-class Site Reliability Engineers with experience in developing processes, tools and automation for managing distributed systems in production environments. Our team combines software and systems engineering with system administration practices to develop creative engineering solutions to operations problems.
Partnering closely with our technology run partners, Infosys, and our colleagues in Domain Architecture and Solution Management, we focus on automating operations for our growing footprint of deployments, building self-service products to empower internal customers and increasing the reliability and scalability of services with application and systems-level improvements.
We expect our SREs to live and breathe customer solutions. Our technology footprint is large and unwieldy and needs to be consolidated, managed, automated and simplified - our constant optimization of it is of huge importance. The delivery team comprises Site Reliability Engineers and DevOps Engineers who help craft solutions that will delight our markets, regions, functions and, ultimately, our consumers. The team interacts very closely with our vendor population (especially Amazon Web Services, Salesforce, Adobe and SAP) to deliver change to the technology estate and integrate technology where vendors are less mature.
This position is an integral part of the wider technology function. The role holder will be highly visible and will work across multiple teams to deliver reliable solutions and drive both efficiency and effectiveness. It is essential that the role holder is a highly collaborative individual. We are seeking individuals who enjoy automating and reducing manual work - quality and time to market are important to us, so it's key that we have people who truly believe in this direction.
Your 'day to day'
- Collaborate with different technology groups to deliver services and solutions for the technology stack
- Design and implement logging, monitoring and alerting solutions, increasing systems visibility and enabling faster recovery from incidents
- Automate systems management, focusing on performance and scalability, improving utilization and reducing toil
- Design and manage cloud architectures, guaranteeing high availability, top performance and reliability
- Optimise systems for performance and scalability, building infrastructure and eliminating work through automation
- Ensure all services and solutions designed are built in adherence to my client's InfoSec policies and are fully industrialized for consumption by customers and technology groups
- Research and develop tooling and/or processes to enable delivery, operations and infrastructure teams
- Enable consumption of our features and functions via self-service
- Be involved in engineering and applications operations
- Design cost-effective solutions and services ensuring that value is measured, tracked and realized at the end of key delivery points
- Drive on-boarding and adoption of our core automation tools, services and apps
- Participate in and promote my client's DevOps and Technical Communities
- Work in a data-driven environment to set, monitor and meet ambitious quality goals
Who we're looking for
- Deep understanding of performance monitoring and web application profiling
- Extensive experience of integrating logging, monitoring and alerting technologies, such as ELK, New Relic, PagerDuty, Grafana and CloudWatch, and of driving significant change in the customer experience
- Excellent skills and experience in configuration management via Puppet, Chef, Ansible, or others
- Understanding of Internet infrastructure services including DNS, DHCP, LDAP, server virtualization, server monitoring and cloud services
- Demonstrated history in automating operations processes
- Consistent track record of troubleshooting and resolving issues in live production environments and implementing strategies to eliminate them
- Driven approach to continually improving service levels
- Experience with DNS and Content Distribution Networks (Akamai is highly desirable)
- Extensive experience of deploying technology solutions across cloud-native platforms. We predominantly use Amazon Web Services, Salesforce Cloud, Adobe Cloud and SAP Hybris Cloud
- Demonstrable experience of integrating and industrializing technology platforms (SaaS, PaaS, IaaS) on a global scale, reducing operational and process waste
- Fluency in one or more high-level programming languages like Java, Python, Go, Ruby or equivalent
- Knowledge of data platforms, including but not limited to: Apache Kafka, Solr, Redis, MySQL, Cassandra, Hadoop
- Experience with microservices architectures
- Strong ability and enthusiasm to learn new technologies in a short period of time. We seek a self-starter, visionary person with leadership capabilities
- Demonstrable understanding of security and networking principles in a cloud-native environment
- Has worked with delivery methodologies such as Scrum, Kanban and Waterfall, and appreciates the need to apply them appropriately to the work being delivered
What we offer
Our success depends on the men and women who come to work every single day with a sense of purpose and an appetite for progress. Join my client and you too can:
- Seize the freedom to define your future and ours. We'll empower you to take risks, experiment and explore.
- Be part of an inclusive, diverse culture, where everyone's contribution is respected; collaborate with some of the world's best people and feel like you belong.
- Pursue your ambitions and develop your skills with a global business - our staggering size and scale provide endless opportunities to progress.
- Take pride in delivering our promise to society: to improve the lives of a billion smokers.