API Reference
NHS Practice Pipeline Package
A data processing pipeline for NHS practice-level crosstabs, built with the oops-its-a-pipeline framework.
SummarisationStage
Bases: PipelineStage
Pipeline stage for creating descriptive statistics and summary tables.
This stage processes the combined appointment data to generate comprehensive statistical summaries and NHS performance metrics suitable for analysis and reporting purposes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing analysis parameters and specifications. | required |
Attributes:
| Name | Type | Description |
|---|---|---|
| config | NHSPracticeAnalysisConfig | The configuration object passed during initialisation. |
Methods:
| Name | Description |
|---|---|
| run | Execute the summarisation stage and generate statistical summaries. |
Notes
Generated summaries include:

- Monthly appointment trends by status (attended, DNA, cancelled)
- Healthcare professional type distribution and workload analysis
- Appointment mode analysis (face-to-face, telephone, online)
- Regional performance comparisons and geographical analysis
- Booking time analysis and access pattern evaluation
- Key NHS performance indicators and completion rates

All metrics follow NHS performance monitoring standards and include appropriate statistical measures for different analytical purposes.
Examples:
>>> config = NHSPracticeAnalysisConfig()
>>> stage = SummarisationStage(config)
>>> context = {"combined_data": dataframe}
>>> results = stage.run(context)
Source code in practice_level_gp_appointments/analytics.py
__init__(config)
Initialize the summarisation stage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing analysis parameters. | required |
Source code in practice_level_gp_appointments/analytics.py
run(context)
Create descriptive statistics and summary tables.
This method processes the combined appointment data to generate multiple summary tables and key performance indicators for NHS practice level analysis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | dict | Pipeline execution context containing combined appointment data. | required |
Returns:
| Type | Description |
|---|---|
| dict | Updated pipeline context containing summary statistics and metrics. |
Notes
The method generates seven main summary categories:

1. Monthly trends by appointment status
2. Healthcare professional type analysis
3. Appointment mode temporal analysis
4. Regional performance summaries
5. Booking time access analysis
6. Overall descriptive statistics
7. Key NHS performance metrics (DNA rates, completion rates)
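
To make the first and last categories concrete, here is a minimal, standard-library sketch of a monthly status breakdown and a DNA rate. The field names (`data_month` aside, which the joining stage documents) and the definition of DNA rate as DNA appointments divided by all appointments are assumptions for illustration. The package's own column names and metric definitions may differ.

```python
from collections import defaultdict

# Hypothetical rows: "appt_status" and "count" are assumed field names,
# not taken from the package's schema.
rows = [
    {"data_month": "jun_25", "appt_status": "Attended", "count": 90},
    {"data_month": "jun_25", "appt_status": "DNA", "count": 10},
    {"data_month": "jul_25", "appt_status": "Attended", "count": 150},
    {"data_month": "jul_25", "appt_status": "DNA", "count": 30},
    {"data_month": "jul_25", "appt_status": "Cancelled", "count": 20},
]


def monthly_status_totals(rows):
    """Sum appointment counts per month and per status."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row["data_month"]][row["appt_status"]] += row["count"]
    return {month: dict(by_status) for month, by_status in totals.items()}


def dna_rate(by_status):
    """Share of appointments marked DNA (assumed definition: DNA / all statuses)."""
    total = sum(by_status.values())
    return by_status.get("DNA", 0) / total if total else 0.0


totals = monthly_status_totals(rows)
rates = {month: dna_rate(by_status) for month, by_status in totals.items()}
```

Returning plain dictionaries keeps the sketch easy to drop into a context dict. A dataframe-based implementation would more likely use a group-by for the same result.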
Source code in practice_level_gp_appointments/analytics.py
NHSPracticeAnalysisConfig
Bases: PipelineConfig
Simple configuration for NHS Practice Level Crosstabs pipeline.
Source code in practice_level_gp_appointments/config.py
create(zip_file_stem='jul_25')
classmethod
Create configuration with date-specific paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| zip_file_stem | str | Date identifier for input data (e.g., "jul_25", "jun_25"). | "jul_25" |
Returns:
| Type | Description |
|---|---|
| NHSPracticeAnalysisConfig | Configured instance with date-specific paths. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the specified zip file does not exist. |
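
The factory pattern described here can be sketched as follows. `ExampleConfig`, its fields, the `data_dir` parameter, and the directory layout are all hypothetical stand-ins, since the real attributes of NHSPracticeAnalysisConfig are not listed in this reference. Only the `zip_file_stem` parameter and the FileNotFoundError behaviour come from the documentation above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExampleConfig:
    """Hypothetical stand-in for NHSPracticeAnalysisConfig."""

    zip_path: Path
    output_dir: Path

    @classmethod
    def create(cls, zip_file_stem="jul_25", data_dir="data/raw"):
        # Derive date-specific paths from the stem (assumed layout).
        zip_path = Path(data_dir) / f"{zip_file_stem}.zip"
        if not zip_path.exists():
            raise FileNotFoundError(f"Zip file not found: {zip_path}")
        return cls(zip_path=zip_path, output_dir=Path("output") / zip_file_stem)


# Callers can handle a missing month explicitly.
try:
    config = ExampleConfig.create("jun_25", data_dir="no_such_directory")
except FileNotFoundError:
    config = None
```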
Source code in practice_level_gp_appointments/config.py
DataJoiningStage
Bases: PipelineStage
Pipeline stage for joining monthly data and combining with mapping data.
This stage combines monthly crosstab datasets into a unified dataframe and merges with geographical mapping information to enable regional analysis and reporting.
Methods:
| Name | Description |
|---|---|
| run | Execute the data joining stage and store results in pipeline context. |
Notes
The joining process includes:

- Concatenation of monthly crosstab data with data_month identifier
- Left join with mapping data using gp_code as the key
- Addition of geographical information (ICB, region details)
- Validation of join results and data quality checks

The resulting dataset contains all original crosstab fields plus:

- data_month: Identifier for the source month
- icb_code, icb_name: Integrated Care Board information
- region_code, region_name: NHS regional information

Examples:
>>> stage = DataJoiningStage()
>>> context = {"raw_data": loaded_datasets}
>>> updated_context = stage.run(context)
Source code in practice_level_gp_appointments/data_processing.py
__init__()
Initialize the data joining stage.
The stage is configured to consume raw_data from the loading stage and produce combined_data for downstream analysis stages.
Source code in practice_level_gp_appointments/data_processing.py
run(context)
Join monthly data and combine with mapping data.
This method performs the core data joining operations to create a unified dataset suitable for comprehensive analysis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | dict | Pipeline execution context containing raw_data from loading stage. | required |
Returns:
| Type | Description |
|---|---|
| dict | Updated pipeline context containing the joined dataset. |
Raises:
| Type | Description |
|---|---|
| ValueError | If no monthly data is found in the input datasets. |
Notes
Processing steps:

1. Extract monthly datasets and add data_month identifier
2. Concatenate all monthly data into single dataframe
3. Merge with mapping data on gp_code field
4. Validate join results and log summary statistics
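
The processing steps can be illustrated with a small standard-library sketch that uses lists of dictionaries in place of dataframes. The function name, the input shapes, and the sample values are assumptions. The field names `gp_code`, `data_month`, `icb_code`, `icb_name`, `region_code`, and `region_name` are the ones documented for this stage.

```python
GEO_FIELDS = ("icb_code", "icb_name", "region_code", "region_name")


def join_monthly_with_mapping(monthly, mapping):
    """Left-join monthly rows to mapping rows on gp_code.

    monthly: dict of month identifier -> list of row dicts (assumed shape).
    mapping: list of row dicts, one per practice (assumed shape).
    """
    if not monthly:
        raise ValueError("No monthly data found in the input datasets")
    lookup = {row["gp_code"]: row for row in mapping}
    combined = []
    for month, rows in monthly.items():
        for row in rows:
            match = lookup.get(row["gp_code"], {})
            joined = {**row, "data_month": month}
            # Left join: unmatched practices are kept, with empty geography.
            for field in GEO_FIELDS:
                joined[field] = match.get(field)
            combined.append(joined)
    return combined


monthly = {
    "jun_25": [{"gp_code": "A00001", "appointments": 120}],
    "jul_25": [
        {"gp_code": "A00001", "appointments": 135},
        {"gp_code": "Z99999", "appointments": 8},
    ],
}
mapping = [
    {"gp_code": "A00001", "icb_code": "ICB01", "icb_name": "Example ICB",
     "region_code": "R01", "region_name": "Example Region"},
]

combined = join_monthly_with_mapping(monthly, mapping)
# A simple validation step: practices with no mapping row.
unmatched = [row for row in combined if row["icb_code"] is None]
```

Counting unmatched rows after a left join is one way to perform the validation in step 4, because an inner join would silently drop those practices instead.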
Source code in practice_level_gp_appointments/data_processing.py
DataLoadingStage
Bases: PipelineStage
Pipeline stage for loading extracted CSV files.
This stage loads monthly crosstab CSV files and mapping data that have been extracted by the DataExtractionStage, using NHS_HERBOT for standardised processing and column normalisation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing data directory paths and processing parameters including sample size limits. | required |
Methods:
| Name | Description |
|---|---|
| run | Execute the data loading stage and store results in pipeline context. |
Notes
The stage loads:

- Monthly practice level crosstab files from raw data directory
- Practice mapping/lookup data for geographical information

All data is processed through NHS_HERBOT for column normalisation.

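
As a rough picture of what column normalisation means in practice, the sketch below reads a CSV with the standard library and converts headers to snake_case. It does not use or reproduce the NHS_HERBOT API, which is not shown in this reference. The helper names, the normalisation rule, and the sample header row are assumptions for illustration only.

```python
import csv
import io
import re


def normalise_column(name):
    """Lower-case a header and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def load_csv(handle):
    """Read CSV rows into dicts keyed by normalised column names."""
    reader = csv.DictReader(handle)
    return [{normalise_column(key): value for key, value in row.items()}
            for row in reader]


# An in-memory file stands in for an extracted monthly CSV.
sample = io.StringIO("GP Code,Appt Status,Count Of Appointments\nA00001,DNA,12\n")
loaded_rows = load_csv(sample)
```

Normalising headers at load time means later stages can rely on stable keys such as `gp_code` regardless of how each monthly file capitalises its columns.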
Examples:
>>> config = NHSPracticeAnalysisConfig()
>>> stage = DataLoadingStage(config)
>>> context = {"extracted_files": file_paths}
>>> updated_context = stage.run(context)
Source code in practice_level_gp_appointments/data_processing.py
__init__(config)
Initialize the data loading stage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing data paths and parameters. | required |
Source code in practice_level_gp_appointments/data_processing.py
run(context)
Load NHS practice level crosstab data from extracted CSV files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | dict | Pipeline execution context containing extracted file paths. | required |
Returns:
| Type | Description |
|---|---|
| dict | Updated pipeline context containing loaded datasets. |
Source code in practice_level_gp_appointments/data_processing.py
OutputStage
Bases: PipelineStage
Stage for saving outputs (tables, figures, reports).
Source code in practice_level_gp_appointments/output.py
run(context)
Save outputs (tables, figures, reports).
Source code in practice_level_gp_appointments/output.py
NHSPracticeAnalysisPipeline
Bases: Pipeline
NHS Practice Level Crosstabs Analysis Pipeline.
This pipeline processes NHS practice level appointment data through five sequential stages to produce comprehensive analysis outputs including summary statistics, visualizations, and reports.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing data paths, processing parameters, and output specifications. | required |
Attributes:
| Name | Type | Description |
|---|---|---|
| config | NHSPracticeAnalysisConfig | The configuration object passed during initialisation. |
Methods:
| Name | Description |
|---|---|
| run_analysis | Execute the complete pipeline and return exit code. |
Notes
The pipeline stages are:

1. Data Loading Stage - Load CSV files from raw data directory
2. Data Joining Stage - Combine monthly data with mapping information
3. Summarisation Stage - Generate statistical summaries and metrics
4. Graphing Stage - Create visualizations and charts
5. Output Stage - Save processed data, figures, and reports

Examples:
>>> config = NHSPracticeAnalysisConfig()
>>> pipeline = NHSPracticeAnalysisPipeline(config)
>>> exit_code = pipeline.run_analysis()
>>> print(f"Pipeline completed with exit code: {exit_code}")
Source code in practice_level_gp_appointments/pipeline.py
__init__(config)
Initialize the NHS Practice Analysis Pipeline.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | NHSPracticeAnalysisConfig | Configuration object containing all pipeline parameters. | required |
Source code in practice_level_gp_appointments/pipeline.py
run_analysis()
Run the complete NHS practice level analysis pipeline.
This method orchestrates the execution of all pipeline stages and handles validation, error management, and logging.
Returns:
| Type | Description |
|---|---|
| int | Exit code: 0 for success, 1 for failure. |
Raises:
| Type | Description |
|---|---|
| Exception | Any unhandled exception during pipeline execution is caught and logged, and the method returns exit code 1. |
Notes
The method performs the following operations:

- Validates pipeline configuration and stages
- Generates unique run identifier with timestamp
- Executes all pipeline stages in sequence
- Handles errors and provides appropriate logging
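
The control flow described above (timestamped run identifier, sequential execution, exit code 0 or 1) might look roughly like the following sketch. Stages are modelled as plain callables taking and returning a context dict. The function name, the run identifier format, and the logger name are assumptions and do not reflect the oops-its-a-pipeline framework's own API.

```python
import logging
from datetime import datetime

logger = logging.getLogger("example_pipeline")


def run_stages(stages, context=None):
    """Run stages in order and return 0 on success, 1 on any failure."""
    # Timestamped run identifier (format is an assumption).
    run_id = datetime.now().strftime("run_%Y%m%d_%H%M%S")
    context = dict(context or {}, run_id=run_id)
    try:
        for stage in stages:
            logger.info("Running %s", getattr(stage, "__name__", repr(stage)))
            context = stage(context)
    except Exception:
        # Log the traceback and convert the failure into an exit code.
        logger.exception("Pipeline run %s failed", run_id)
        return 1
    return 0


def load(context):
    return {**context, "raw_data": [1, 2, 3]}


def summarise(context):
    return {**context, "total": sum(context["raw_data"])}


def broken(context):
    raise RuntimeError("simulated stage failure")


ok_code = run_stages([load, summarise])
fail_code = run_stages([load, broken, summarise])
```

Returning an integer instead of re-raising lets a command-line entry point pass the result straight to `sys.exit`.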
Source code in practice_level_gp_appointments/pipeline.py
GraphingStage
Bases: PipelineStage
Stage for creating visualizations and graphs.
Source code in practice_level_gp_appointments/visualization.py
run(context)
Create visualizations and graphs.
Source code in practice_level_gp_appointments/visualization.py
Module Documentation
| Module | Description | Key Components |
|---|---|---|
| Data Processing | Data loading, cleaning, and transformation | DataLoadingStage, DataJoiningStage |
| Analytics | Statistical analysis and summaries | SummarisationStage |
| Visualization | Chart generation and plotting | Visualization functions |
| Output | Data export and report generation | OutputStage |
| Pipeline | Pipeline orchestration and workflow | NHSPracticeAnalysisPipeline |