Data Engineer
6 months initially
London/Remote
Inside IR35 - Umbrella only
BPSS eligible
Role Description
A strong Data Engineer is required to work on the Data Transformation workstream in the Breast Screening Pathway team within NHSE Digital Screening. The goal is to modernise and transform the end-to-end Breast Screening service in England with modern data-driven digital capabilities. A small team of data experts is required to analyse existing system datasets spread across multiple system instances, to help identify and address data issues, and to shape target datasets and structures for the future. The core focus of this role is to build secure, repeatable data ingestion and transformation pipelines, to implement data cleansing rules, and to produce auditable, reproducible outputs.
Required Skills
- Ability to establish import/export patterns, including handling data extracts, schema discovery, incremental loads, and multiple source instances
- Capability in data transformation-heavy pipelines, from data profiling to cleansing to standardisation, conformance and publishing
- Advanced knowledge of SQL for profiling, joins/merges, deduplication, anomaly detection, and performance tuning
- Scripting knowledge in Python for automation, parsing, rules engines, and data quality checks, with the ability to write maintainable code
- Experience using Python packages for data wrangling (e.g. Pandas, Polars), modelling (e.g. scikit-learn) and visualisation (e.g. matplotlib)
- Experience with modern data tooling (for example, Spark, Azure Data Factory) or ability to implement equivalents with code
- Proven experience working with geospatial data, including handling spatial formats (e.g., vector, raster, GeoJSON, shapefiles), coordinate reference systems, and spatial analysis workflows
- Strong ability to interpret and apply geographical context in data processing pipelines, with demonstrated capability to aggregate, upscale, or translate local/regional geospatial insights into broader national or regional-level datasets and analytical outputs
- Experience working with publicly available official datasets, particularly Office for National Statistics (ONS) open data products (e.g., census boundaries, geographic lookups, deprivation indices, or mid-year population estimates)
- Ability to build rules for completeness, validity and consistency, and to implement exception handling

