Infrastructure/Platform Engineer - Apache

Location: London
Job Type: Contract
Industry: Cloud & Infrastructure
Job reference: BBBH420046_1754580799
Posted: about 18 hours ago

Role Title: Infrastructure/Platform Engineer - Apache
Duration: 9 Months
Location: Remote
Rate: £ - Umbrella only

Would you like to join a global leader in consulting, technology services and digital transformation?

Our client is at the forefront of innovation to address the entire breadth of opportunities in the evolving world of cloud, digital and platforms.

Role purpose / summary:

• Refactor prototype Spark jobs into production-quality components, ensuring scalability, test coverage, and integration readiness (a minimal sketch of this structure follows this list).
• Package Spark workloads for deployment via Docker/Kubernetes and integrate with orchestration systems (e.g., Airflow, custom schedulers).
• Work with platform engineers to embed Spark jobs into InfoSum's platform APIs and data pipelines.
• Troubleshoot job failures, memory and resource issues, and execution anomalies across various runtime environments.
• Optimize Spark job performance and advise on best practices to reduce cloud compute and storage costs.
• Guide engineering teams on choosing the right execution strategies across AWS, GCP, and Azure.
• Provide subject matter expertise on using AWS Glue for ETL workloads and integration with S3 and other AWS-native services.
• Implement observability tooling for logs, metrics, and error handling to support monitoring and incident response.
• Align implementations with InfoSum's privacy, security, and compliance practices.
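
By way of illustration, a minimal PySpark sketch of what "production-quality" refactoring can look like: the transform is a pure DataFrame-in/DataFrame-out function that can be unit-tested with a local SparkSession, and adaptive query execution is enabled to handle shuffle skew at runtime. The job name and S3 paths are placeholders, not InfoSum resources, and this is one of several reasonable structures.

    from pyspark.sql import SparkSession, DataFrame
    from pyspark.sql import functions as F

    def transform(df: DataFrame) -> DataFrame:
        # Pure function over a DataFrame: unit-testable without a cluster.
        return (df
                .filter(F.col("event_ts").isNotNull())
                .withColumn("event_date", F.to_date("event_ts")))

    def main() -> None:
        spark = (SparkSession.builder
                 .appName("events-etl")  # hypothetical job name
                 # Adaptive query execution coalesces small shuffle partitions
                 # and splits skewed ones at runtime.
                 .config("spark.sql.adaptive.enabled", "true")
                 .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
                 .getOrCreate())
        df = spark.read.parquet("s3://example-bucket/raw/events/")  # placeholder path
        (transform(df)
            .write.mode("overwrite")
            .partitionBy("event_date")
            .parquet("s3://example-bucket/curated/events/"))        # placeholder path
        spark.stop()

    if __name__ == "__main__":
        main()

Packaged into a Docker image, the same entry point can run under spark-submit on Kubernetes or be triggered from an Airflow DAG, which is the deployment path the responsibilities above describe.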
Required Skills and Experience:
• Proven experience with Apache Spark (Scala, Java, or PySpark), including performance optimization and advanced tuning techniques.
• Strong troubleshooting skills in production Spark environments, including diagnosing memory usage, shuffles, skew, and executor behavior (see the skew-mitigation sketch after this list).
• Experience deploying and managing Spark jobs in at least two major cloud environments (AWS, GCP, Azure).
• In-depth knowledge of AWS Glue, including job authoring, triggers, and cost-aware configuration.
• Familiarity with distributed data formats (Parquet, Avro), data lakes (Iceberg, Delta Lake), and cloud storage systems (S3, GCS, Azure Blob).
• Hands-on experience with Docker, Kubernetes, and CI/CD pipelines.
• Strong documentation and communication skills, with the ability to support and coach internal teams.
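
To illustrate the kind of troubleshooting this role involves, here is a hedged sketch of one standard skew mitigation, key salting, in PySpark. Column names, paths, and the bucket count are assumptions for the example; in practice Spark 3's adaptive skew-join handling is often tried before manual salting.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("skew-demo").getOrCreate()
    SALT_BUCKETS = 16  # assumption: tune to the observed skew

    facts = spark.read.parquet("s3://example-bucket/facts/")     # large side, skewed on customer_id
    dims = spark.read.parquet("s3://example-bucket/customers/")  # smaller dimension table

    # Spread each hot key across SALT_BUCKETS partitions by appending a random salt.
    salted_facts = facts.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))

    # Replicate the dimension once per salt value so every salted key still matches.
    salts = spark.range(SALT_BUCKETS).select(F.col("id").cast("int").alias("salt"))
    salted_dims = dims.crossJoin(salts)

    joined = salted_facts.join(salted_dims, ["customer_id", "salt"]).drop("salt")

The trade-off is that the crossJoin inflates the dimension side by a factor of SALT_BUCKETS, so this only pays off when the dimension is small relative to the skewed fact table.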
Key Indicators of Success:
• Spark jobs are performant, fault-tolerant, and integrated into InfoSum's platform with minimal overhead.
• Cost of running data processing workloads is optimized across cloud environments.
• Engineering teams are equipped with best practices for writing, deploying, and monitoring Spark workloads.
• Operational issues are rapidly identified and resolved, with root causes clearly documented.
• Work is delivered with a high level of independence, reliability, and professionalism.

All profiles will be reviewed against the required skills and experience. Due to the high number of applications, we will only be able to respond to successful applicants in the first instance. We thank you for your interest and the time taken to apply!
