Lead Platform Engineer
Up to £120k
I have an exciting opportunity for a Lead Platform Engineer to join a thriving digital organisation. This is a principal technical role and you'll be reporting to the UK&I Digital Security Head of Platform Management.
Lead Platform Engineer Responsibilities:
- Manage FCAPS scenarios in production environment.
- Track and manage platform tickets to meet SLA requirements.
- Monitor and manage cluster capacity based on customer count and events per second.
- Perform Hadoop administration tasks.
- Guide platform engineers to fix day-to-day operational issues.
- Set up and manage the HDP platform, handling all Hadoop environment builds, performance tuning and ongoing monitoring.
- Develop scripts and tools to automate platform maintenance activities.
- Work with sustenance engineering on emergency fixes.
- Debug day-to-day job issues on the Hadoop platform and provide solutions.
- Perform software release management tasks.
- Monitor the health of multiple HDP clusters using centralized dashboards, covering Hadoop services, overall server health, and custom applications running on the clusters.
- Troubleshoot log collection and ingestion via Apache NiFi into our MDR platform from various network devices (such as firewalls, switches, routers, proxies, IPS and WAF), servers, and cloud resources.
- Coordinate with Network, Infrastructure, and other organizations as required.
- Perform root cause analysis on failed components and implement corrective measures.
- Configure high-level and low-level HDP parameters to fine-tune cluster performance.
- Manage escalations on FCAPS issues.
Lead Platform Engineer Benefits:
- Remote working model
- 25 days annual leave
- Great culture
- Supportive team
- Progression opportunities
- Private medical scheme
- Up to £120k
- On-call pay
Skills needed as a Lead Platform Engineer:
- A minimum of 5 years' experience within an application role.
- Experience in designing and operationalizing FCAPS (Fault, Configuration, Availability, Performance, Security) for Hadoop clusters.
- Experience in designing automated Hadoop installations.
- Deep expertise in managing Hadoop ecosystem components in large production clusters.
- Expertise in HDP/Cloudera platforms.
- Application deployment using Java and Python APIs.
- Good scripting knowledge of Bash, Python, Anaconda and Ansible.
- Knowledge of automation/DevOps tools such as GitHub, Jenkins, Docker and Kubernetes.
- Data ingestion, data access and data storage using Hadoop big data tools such as HBase, Flume, Kafka, NiFi and Elasticsearch.
- Good hands-on experience of Linux, including its commands and scripting, is a must.
- General operational excellence. This includes good troubleshooting skills, an understanding of system capacity and bottlenecks, memory management, and performance tuning and optimization for Linux and Hadoop.
- Configuration management and deployment exposure in open-source environments.
- Knowledge of Kerberos and Apache Ranger for configuring security.
- Excellent communication skills.
- Critical thinker and good problem-solver.
- Ability to work independently with a strong attention to detail.
- Exposure to and experience of Azure administration.
- Knowledge of Core Java.
- Ability to work well in a global team environment.
- Able to manage time and prioritise workload.
- UK Security Clearance, or eligibility and willingness to obtain security clearance to SC level, as required by specific projects.
- Ability to achieve Office for Nuclear Regulation clearance (currently requires UK Born Citizen/UK National).
The role requires occasional out-of-hours on-call support as part of a support rota. This attracts extra pay on a retention and call-out basis, in addition to the base reward package.