Horizon CDT Research Highlights

Automated Feature Engineering, AutoML, and Decision-Focused Learning for Improved Energy Consumption Forecasting

  Nasser Alkhulaifi (2021 cohort)

The rising cost of, and demand for, energy, coupled with the need to meet environmental sustainability goals, present significant challenges that require a multifaceted approach. Energy Consumption Forecasting (ECF) is a useful tool for informed energy management because it predicts future consumption patterns. It enables decision-makers to take pre-emptive action and implement effective planning strategies that reduce energy use while minimising waste and emissions. Although Machine Learning (ML) methods have been widely adopted for ECF, their development remains highly dependent on domain expertise.

A major challenge in developing accurate ML models for ECF is Feature Engineering (FE). This process is necessary because raw energy data often require preprocessing and transformation before algorithms can learn effectively. Additionally, real-world datasets are usually small due to data collection limitations, privacy issues, or resource constraints. In such cases, FE can compensate for these limitations by extracting and selecting useful features, thereby maximising the utility of available data and improving predictive performance. However, in addition to being expert-dependent, this process is manual, prone to human error, and time-consuming. Although state-of-the-art Automated ML (AutoML) frameworks have streamlined ML development through automated model selection and hyperparameter tuning, they assume that data preparation and feature generation have been completed, leaving FE largely dependent on human practitioners. This has led to growing interest in automated FE (AFE) methods. Nevertheless, current AFE approaches are general-purpose and fail to capture the domain-specific temporal patterns and consumption characteristics inherent in energy data. These patterns arise from multi-scale dynamics and exogenous drivers such as operational fluctuations and weather. Moreover, current ML research for ECF problems focuses on minimising forecasting errors rather than optimising downstream decision-making (i.e., decisions based on these forecasts), allowing prediction errors to cascade into suboptimal operational decisions in energy management systems. Emerging Decision-Focused Learning (DFL) methods aim to mitigate this shortcoming by integrating prediction and optimisation, yet they remain relatively nascent and have been tested primarily on synthetic datasets or small-scale problems, leaving a gap in their practical evaluation.

This thesis, therefore, addresses these challenges by developing ML models for ECF that minimise reliance on domain knowledge and can be applied across diverse energy systems while maximising both forecasting accuracy and downstream decision quality. Three key research objectives guide this thesis: (1) establishing a comprehensive pipeline for FE in ECF; (2) developing an AFE method tailored for ECF that minimises the domain expertise needed and seamlessly integrates with AutoML for fully automated ECF modelling; and (3) leveraging AFE to optimise downstream tasks beyond predictive performance in a DFL framework.

The first contribution of this thesis fulfils objective (1) by establishing a comprehensive ML pipeline capable of predicting one week into the future to maximise usability. The developed pipeline was evaluated on two novel real-world energy datasets and is suitable for small datasets. Through extensive investigation of feature extraction and selection techniques, this pipeline demonstrates that domain knowledge plays a critical role in FE, revealing it as the most expert-dependent and time-consuming task in developing ML models for ECF. These findings form a robust empirical foundation for the subsequent development of AFE.

The second contribution pursues objective (2) by introducing AutoEnergy, a novel AFE algorithm specifically tailored for ECF problems. AutoEnergy automatically generates interpretable features from timestamps and past consumption values through rule-based transformations. This method was extensively evaluated across eighteen diverse real-world energy datasets spanning residential, commercial, industrial, renewable, and grid power domains, demonstrating robust generalisation potential. On average, AutoEnergy achieved forecasting error reductions of 19.52% to 84.72% compared with baseline AutoML and established AFE methods, while running 1.31 to 4.41 times faster than benchmark methods.
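As a rough illustration of what rule-based feature generation from timestamps and past consumption values can look like, the sketch below builds calendar, cyclical, lag, and rolling-mean features using only the Python standard library. The function name `make_features`, the chosen lags, and the window size are hypothetical choices made for this example; they are not AutoEnergy's actual rules or interface, which are described in the publications listed below.

```python
from datetime import datetime, timedelta
import math


def make_features(timestamps, values, lags=(1, 24), window=3):
    """Build one feature row per time step from timestamps and past values only.

    Illustrative rules (assumptions, not AutoEnergy's rule set): calendar
    fields, a cyclical hour encoding, lagged values, and a rolling mean.
    Steps without enough history are skipped.
    """
    first = max(max(lags), window)
    rows = []
    for i in range(first, len(values)):
        ts = timestamps[i]
        row = {
            "hour": ts.hour,
            "day_of_week": ts.weekday(),          # Monday = 0
            "is_weekend": int(ts.weekday() >= 5),
            "hour_sin": math.sin(2 * math.pi * ts.hour / 24),
            "hour_cos": math.cos(2 * math.pi * ts.hour / 24),
        }
        for lag in lags:
            row[f"lag_{lag}"] = values[i - lag]
        # Rolling mean over the previous `window` observations; the current
        # value is excluded so that the target does not leak into the features.
        row[f"roll_mean_{window}"] = sum(values[i - window:i]) / window
        row["target"] = values[i]
        rows.append(row)
    return rows


# Synthetic hourly series covering 48 hours (made-up data for the example).
start_time = datetime(2024, 1, 1, 0, 0)
timestamps = [start_time + timedelta(hours=h) for h in range(48)]
values = [float(h) for h in range(48)]
features = make_features(timestamps, values)
```

Every feature here is a named, deterministic function of the timestamp or of earlier observations, which is what keeps such features interpretable; the resulting rows could then be passed to any AutoML framework as a tabular dataset.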

The final contribution targets objective (3) by leveraging AutoEnergy to strengthen DFL, itself still a nascent approach, in ECF applications such as Battery Energy Storage System (BESS) optimisation. In this work, an AFE–DFL framework is developed that forecasts electricity prices and demand while jointly optimising BESS operations to minimise costs. The framework was validated on a novel real-world UK property dataset. Incorporating AutoEnergy reduces operating costs by 22.9–56.5% relative to the same DFL models without AFE, demonstrating enhanced DFL performance, while simultaneously providing empirical evidence of DFL's practical viability in real-world settings.
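The gap between forecast accuracy and decision quality that motivates DFL can be shown with a toy example. The sketch below is not a DFL training procedure and not the thesis's BESS model: it assumes a hypothetical lossless 1 kWh battery that charges in one hour and discharges in a later one, and it only evaluates decision regret, the quantity that DFL methods aim to reduce. All prices and function names are made up for illustration.

```python
from itertools import combinations


def best_schedule(prices):
    """Return (charge_hour, discharge_hour), charge before discharge, that
    maximises the price spread for a lossless 1 kWh toy battery."""
    return max(combinations(range(len(prices)), 2),
               key=lambda pair: prices[pair[1]] - prices[pair[0]])


def profit(schedule, prices):
    """Arbitrage profit of a schedule when settled at the given prices."""
    charge, discharge = schedule
    return prices[discharge] - prices[charge]


def regret(forecast, actual):
    """Profit lost by planning on the forecast instead of the true prices."""
    return (profit(best_schedule(actual), actual)
            - profit(best_schedule(forecast), actual))


def mse(forecast, actual):
    """Mean squared forecasting error."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


actual = [10.0, 12.0, 30.0, 28.0]
forecast_a = [10.0, 12.0, 28.5, 29.5]  # small errors, but the peak hours are swapped
forecast_b = [15.0, 17.0, 35.0, 33.0]  # constant +5 bias, price ranking preserved

for name, forecast in (("A", forecast_a), ("B", forecast_b)):
    print(name, "MSE:", mse(forecast, actual), "regret:", regret(forecast, actual))
```

In this toy setting forecast A has the lower mean squared error yet leads to the worse schedule, while the heavily biased forecast B yields the optimal one. A model trained only to minimise forecasting error would prefer A; a decision-focused objective would prefer B.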

In summary, the thesis demonstrates that: (A) fully automated ECF modelling can be achieved through the integration of AutoEnergy with AutoML, thereby reducing dependence on domain expertise while maximising forecasting accuracy; and (B) AFE-enhanced DFL methods deliver tangible operational benefits by translating forecasting gains into improved decision-making for ECF applications. These contributions have broader implications for energy management systems in settings with limited expertise and small datasets, providing valuable decision-support tools for the transition to smart, automated energy systems.

Publications

  1. Alkhulaifi, N., Bowler, A.L., Pekaslan, D., Watson, N.J. and Triguero, I., 2025. AutoEnergy: An Automated Feature Engineering Algorithm for Energy Consumption Forecasting with AutoML. Knowledge-Based Systems, p.114300. https://doi.org/10.1016/j.knosys.2025.114300
  2. Alkhulaifi, N., Ismail G. D., Timothy R. C., Bowler, A.L., Pekaslan, D., Watson, N.J. and Triguero, I., 2025. Decision-Focused Learning Enhanced by Automated Feature Engineering for Energy Storage Optimisation. arXiv preprint arXiv:2509.05772. https://doi.org/10.48550/arXiv.2509.05772
  3. Alkhulaifi, N., Bowler, A.L., Pekaslan, D., Triguero, I. and Watson, N.J., 2024. Exploring Automated Feature Engineering for Energy Consumption Forecasting with AutoML. In 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Malaysia, pp.2993-2998. https://doi.org/10.1109/SMC54092.2024.10831959
  4. Alkhulaifi, N., Bowler, A.L., Pekaslan, D., Serdaroglu, G., Closs, S., Watson, N.J. and Triguero, I., 2024. Machine Learning Pipeline for Energy and Environmental Prediction in Cold Storage Facilities. IEEE Access. https://doi.org/10.1109/ACCESS.2024.3482572
  5. Alagoz, B.B., Keles, C., Ates, A., Özdemir, E. and Alkhulaifi, N., 2025. Optimal deep neural network architecture design with improved generalization for data-driven cooling load estimation problem. Neural Computing and Applications, pp.1-20. https://doi.org/10.1007/s00521-025-11212-7
  6. Canatan, M., Alkhulaifi, N., Watson, N. and Boz, Z., 2025. Artificial Intelligence in Food Manufacturing: A Review of Current Work and Future Opportunities. Food Engineering Reviews, pp.1-31. https://doi.org/10.1007/s12393-024-09395-1