Lead Platform Engineer

Job Type:
Cloud & Infrastructure
Job reference:
24 days ago

Up to £120k

I have an exciting opportunity for a Lead Platform Engineer to join a thriving digital organisation. This is a principal technical role and you'll be reporting to the UK&I Digital Security Head of Platform Management.

Lead Platform Engineer Responsibilities:

  • Manage FCAPS scenarios in the production environment.
  • Track and manage platform tickets to meet SLA requirements.
  • Monitor and manage cluster capacity based on customer count and events per second.
  • Perform Hadoop administration tasks.
  • Guide platform engineers to fix day-to-day operational issues.
  • Set up and manage the HDP platform, handling all Hadoop environment builds, performance tuning and ongoing monitoring.
  • Develop scripts and tools to automate platform maintenance activities.
  • Work with sustenance engineering on emergency fixes.
  • Debug day-to-day job issues on the Hadoop platform and provide solutions.
  • Perform software release management tasks.
  • Monitor the health of multiple HDP clusters using centralized dashboards, covering Hadoop services, overall server health, and custom applications running on the cluster.
  • Troubleshoot log collection and ingestion via Apache NiFi to our MDR platform from various network devices (such as firewalls, switches, routers, proxies, IPS, WAF, etc.), servers, and cloud resources.
  • Coordinate with Network, Infrastructure, and other organizations as required.
  • Perform root cause analysis on failed components and implement corrective measures.
  • Configure high-level and low-level HDP parameters to fine-tune cluster performance.
  • Manage escalations on FCAPS issues.

Lead Platform Engineer Benefits:

  • Remote working model
  • Mon-Fri
  • 25 days annual leave
  • Great Culture
  • Supportive team
  • Progression opportunities
  • Private Medical scheme
  • Up to £120k
  • On call pay

Skills needed as a Lead Platform Engineer:

  • 5 years minimum experience within an application role.
  • Experience in designing and operationalizing FCAPS (Fault, Configuration, Availability, Performance, Security) for Hadoop clusters.
  • Experience in designing automated Hadoop installations.
  • Deep expertise in managing Hadoop ecosystem components in large production clusters.
  • Expertise in HDP platform/Cloudera.
  • Application deployment using Java and Python APIs.
  • Good scripting knowledge of Bash, Python, Anaconda, and Ansible.
  • Knowledge of automation/DevOps tools: GitHub, Jenkins, Docker, Kubernetes.
  • Data ingestion, data access, and data storage using Hadoop big data tools such as HBase, Flume, Kafka, NiFi, and Elasticsearch.
  • Good hands-on experience with Linux, its commands, and scripting is a must.
  • General operational excellence. This includes good troubleshooting skills, an understanding of system capacity and bottlenecks, memory management, and performance tuning and optimization for Linux and Hadoop.
  • Configuration management and deployment exposure in an open-source environment.
  • Knowledge of Kerberos and Apache Ranger for configuring security.
  • Excellent communication skills.
  • Critical thinker and good problem-solver.
  • Ability to work independently with a strong attention to detail.


  • Exposure and experience of Azure administration
  • Knowledge of Core Java.
  • Ability to work well in a global team environment
  • Able to manage time and prioritise workload
  • UK Security Clearance or eligibility and willingness to obtain security clearance to SC level as required by specific projects
  • Ability to achieve Office for Nuclear Regulation clearance (currently requires UK Born Citizen/UK National)

One requirement is to provide occasional out-of-hours on-call support as part of a support rota. This attracts extra pay on a retention and call-out basis, in addition to the base reward package.
