Forecasting U.S. Fossil Energy Consumption: Advancing Accuracy with Multilayer Perceptron and Residual Learning

Yiwu Hao *

School of Science, Southwest University of Science and Technology, Mianyang, China.

Qingping He

School of Science, Southwest University of Science and Technology, Mianyang, China.

*Author to whom correspondence should be addressed.


Accurate midterm forecasts of fossil fuel consumption are essential for effective energy planning, economic management, and resource allocation. Machine learning models handle large-scale nonlinear datasets well, but many of them, including Multilayer Perceptrons (MLPs), suffer performance degradation as depth increases. Recent studies have shown that Residual Networks (ResNets) can mitigate or even overcome this problem. In this paper, we propose a Weighted Residual Network based on the MLP to improve predictive performance. We train the model with the Adam algorithm and tune its hyperparameters with grid search. In the application section, we build predictive models for three case studies: natural gas, petroleum, and total fossil fuel consumption. We validate the proposed model and compare it with ten other machine learning models. The proposed model outperforms the others in all three cases, underscoring its strong performance in midterm forecasting of fossil fuel consumption.

Keywords: weighted residual network, multilayer perceptron, adaptive moment estimation, grid search, energy consumption
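
As a rough illustration of the idea summarized in the abstract, the sketch below shows one way a weighted residual block built on an MLP could be written. It is not the authors' implementation: the scalar weight `alpha`, the two-layer ReLU branch, the layer sizes, and all function names are hypothetical choices made for illustration, and training with Adam and hyperparameter tuning with grid search are omitted.

```python
import math
import random


def dense(x, W, b):
    """Affine map W x + b for a list-based vector x and row-major matrix W."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def weighted_residual_block(x, W1, b1, W2, b2, alpha):
    """Return x + alpha * F(x), where F is a two-layer MLP branch.

    alpha is an assumed scalar weight on the residual branch; the paper's
    actual weighting scheme may differ.
    """
    hidden = relu(dense(x, W1, b1))
    branch = dense(hidden, W2, b2)
    return [xi + alpha * fi for xi, fi in zip(x, branch)]


def init_layer(n_out, n_in, rng):
    """Small uniform random weights and zero biases (illustrative only)."""
    scale = 1.0 / math.sqrt(n_in)
    W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


rng = random.Random(0)       # fixed seed so the sketch is reproducible
dim, hidden_dim = 4, 8       # hypothetical input and hidden widths
W1, b1 = init_layer(hidden_dim, dim, rng)
W2, b2 = init_layer(dim, hidden_dim, rng)

x = [0.5, -1.0, 2.0, 0.0]    # made-up input vector, not data from the paper
y_identity = weighted_residual_block(x, W1, b1, W2, b2, alpha=0.0)
y_weighted = weighted_residual_block(x, W1, b1, W2, b2, alpha=0.5)
```

With `alpha=0.0` the block reduces to the identity mapping, which is the property that lets residual architectures add depth without forcing each layer to relearn its input; larger values of `alpha` let the MLP branch contribute more to the output.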

How to Cite

Hao, Yiwu, and Qingping He. 2024. “Forecasting U.S. Fossil Energy Consumption: Advancing Accuracy With Multilayer Perceptron and Residual Learning”. Journal of Energy Research and Reviews 16 (6):13-26.



