
Forecasting container throughput of Qingdao Port with a hybrid model. (English) Zbl 1310.93065

Summary: This paper proposes a hybrid method for forecasting the container throughput of Qingdao Port. To eliminate the influence of outliers, the Local Outlier Factor (LOF) is extended to detect outliers in time series, and different dummy variables are then constructed, based on domain knowledge, to capture the effects of those outliers. Next, a hybrid forecasting model combining Projection Pursuit Regression (PPR) with a Genetic Programming (GP) algorithm is proposed. Finally, the hybrid model is applied to forecasting the container throughput of Qingdao Port, and the results show that the proposed method significantly outperforms the ANN, SARIMA, and PPR models.
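
The outlier-handling step in the summary can be illustrated with a minimal sketch. It is not the paper's extension of LOF, whose details the summary does not give. It assumes that each observation is embedded as a (time index, value) point, that the standard LOF definition of Breunig et al. is applied with k = 3, and that a single pulse dummy is set wherever the score exceeds a hypothetical threshold of 1.5. The function name `lof_scores`, the toy series, and the threshold are all illustrative assumptions.

```python
import math


def lof_scores(points, k):
    """Standard Local Outlier Factor of each point (larger = more outlying)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    k_dist, neigh = [], []
    for i in range(n):
        ranked = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = ranked[k - 1][0]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        # Neighbourhood keeps every point tied at the k-distance.
        neigh.append([j for d, j in ranked if d <= kd])

    def lrd(i):
        # Local reachability density: inverse mean reachability distance.
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        return len(reach) / max(sum(reach), 1e-12)  # guard for duplicate points

    dens = [lrd(i) for i in range(n)]
    return [
        sum(dens[j] for j in neigh[i]) / (len(neigh[i]) * dens[i])
        for i in range(n)
    ]


# Toy data (assumption): a linear trend with one injected drop at t = 12.
series = [100.0 + 2.0 * t for t in range(24)]
series[12] -= 60.0

# Assumed embedding: each observation becomes a (time, value) point.
points = [(float(t), y) for t, y in enumerate(series)]
scores = lof_scores(points, k=3)

# Hypothetical cut-off; in practice it would be tuned to the data.
THRESHOLD = 1.5
dummy = [1 if s > THRESHOLD else 0 for s in scores]
flagged = [t for t, d in enumerate(dummy) if d]
print(flagged)
```

The resulting `dummy` list could then enter a forecasting model as an extra regressor. The summary indicates that the authors construct different dummy variables from domain knowledge, so a single pulse dummy like this one is only the simplest case.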

MSC:

93C95 Application models in control theory
93E03 Stochastic systems in control theory (general)
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
90B06 Transportation, logistics and supply chain management
90C59 Approximation methods and heuristics in mathematical programming

Software:

Orca; LOF

References:

[1] Hawkins D, Identification of Outliers, Chapman and Hall, London, 1980. · Zbl 0438.62022 · doi:10.1007/978-94-015-3994-4
[2] Pettit L I and Smith A F M, Outliers and Influential Observations in Linear Models, Bayesian Statistics 2 (eds. by Bernardo J M, DeGroot M H, Lindley D V, and Smith A F M), North-Holland, Amsterdam, 1985. · Zbl 0671.62031
[3] McCulloch R E and Tsay R S, Bayesian analysis of autoregressive time series via the Gibbs sampler, Journal of Time Series Analysis, 1994, 15(2): 235-250. · Zbl 0800.62549 · doi:10.1111/j.1467-9892.1994.tb00188.x
[4] Chaloner K and Brant R, A Bayesian approach to outlier detection and residual analysis, Biometrika, 1988, 75(4): 651-659. · Zbl 0659.62037 · doi:10.1093/biomet/75.4.651
[5] De Giuli M E, Maggi M A, and Tarantola C, Bayesian outlier detection in capital asset pricing model, Statistical Modelling, 2010, 10(4): 379-390. · Zbl 07256830 · doi:10.1177/1471082X0901000402
[6] Shotwell M S and Slate E H, Bayesian outlier detection with Dirichlet process mixtures, Bayesian Analysis, 2011, 6(4): 665-690. · Zbl 1330.62153
[7] Sardy S, Tseng P, and Bruce A, Robust wavelet denoising, IEEE Transactions on Signal Processing, 2001, 49(6): 1146-1152. · doi:10.1109/78.923297
[8] Struzik Z R and Siebes A P J M, Wavelet transform based multifractal formalism in outlier detection and localization for financial time series, Physica A, 2002, 309(3-4): 388-402. · Zbl 0995.62081 · doi:10.1016/S0378-4371(02)00552-6
[9] Ranta R, Louis-Dorr V, Heinrich C, and Wolf D, Iterative wavelet-based denoising methods and robust outlier detection, IEEE Signal Processing Letters, 2005, 12(8): 557-560. · doi:10.1109/LSP.2005.851267
[10] Bilen C and Huzurbazar S, Wavelet-based detection of outliers in time series, Journal of Computational and Graphical Statistics, 2002, 11(2): 311-327. · doi:10.1198/106186002760180536
[11] Grané A and Veiga H, Wavelet-based detection of outliers in financial time series, Computational Statistics and Data Analysis, 2010, 54(11): 2580-2593. · Zbl 1284.91585 · doi:10.1016/j.csda.2009.12.010
[12] Knorr E M, Ng R T, and Tucakov V, Distance-based outliers: Algorithms and applications, International Journal on Very Large Data Bases, 2000, 8(3-4): 237-253. · doi:10.1007/s007780050006
[13] Bay S D and Schwabacher M, Mining distance-based outliers in near linear time with randomization and a simple pruning rule, 2003.
[14] Angiulli F, Basta S, and Pizzuti C, Distance-based detection and prediction of outliers, IEEE Transactions on Knowledge and Data Engineering, 2006, 18(2): 145-160. · doi:10.1109/TKDE.2006.29
[15] Pasha M Z and Umesh N, A comparative study on outlier detection techniques, International Journal of Computer Applications, 2013, 66(24): 23-27.
[16] Baragona R, Battaglia F, and Calzini C, Genetic algorithms for the identification of additive and innovation outliers in time series, Computational Statistics & Data Analysis, 2001, 37(1): 1-12. · Zbl 1030.62063 · doi:10.1016/S0167-9473(00)00058-X
[17] Tolvi J, Genetic algorithms for outlier detection and variable selection in linear regression models, Soft Computing, 2004, 8(8): 527-533. · Zbl 1061.62103 · doi:10.1007/s00500-003-0310-2
[18] Alma Ö G, Kurt S, and Uğur A, Genetic algorithms for outlier detection in multiple regression with different information criteria, Journal of Statistical Computation and Simulation, 2011, 81(1): 29-47. · Zbl 1206.62126 · doi:10.1080/00949650903136782
[19] Raja P V and Bhaskaran V M, An effective genetic algorithm for outlier detection, International Journal of Computer Applications, 2012, 38(6): 30-33. · doi:10.5120/4614-6836
[20] Markou M and Singh S, Novelty detection: A review-part 1: Statistical approaches, Signal Processing, 2003, 83(12): 2481-2497. · Zbl 1145.94402 · doi:10.1016/j.sigpro.2003.07.018
[21] Markou M and Singh S, Novelty detection: A review-part 2: Neural network based approaches, Signal Processing, 2003, 83(12): 2499-2521. · Zbl 1145.94403 · doi:10.1016/j.sigpro.2003.07.019
[22] Beckman R J and Cook R D, Outliers in statistical data, Technometrics, 1983, 25(2): 119-149. · Zbl 0514.62041
[23] Hawkins D M, Bradu D, and Kass G V, Location of several outliers in multiple regression data using elemental sets, Technometrics, 1984, 26(3): 197-208. · doi:10.1080/00401706.1984.10487956
[24] Barnett V and Lewis T, Outliers in Statistical Data, John Wiley & Sons, Chichester, 1984. · Zbl 0638.62002
[25] Patcha A and Park J M, An overview of outlier detection techniques: Existing solutions and latest technological trends, Computer Networks, 2007, 51(12): 3448-3470. · doi:10.1016/j.comnet.2007.02.001
[26] Cousineau D and Chartier S, Outlier detection and treatment: A review, International Journal of Psychological Research, 2010, 3(1): 58-67.
[27] Hodge V J and Austin J, A survey of outlier detection methodologies, Artificial Intelligence Review, 2004, 22(2): 85-126. · Zbl 1101.68023 · doi:10.1023/B:AIRE.0000045502.10941.a9
[28] Singh K and Upadhyaya S, Outlier detection: Applications and techniques, International Journal of Computer Science Issues, 2012, 9(1): 1694-0814.
[29] Zhang J, Advancements of outlier detection: A survey, ICST Transactions on Scalable Information Systems, 2013, 13(1-3): 1-24.
[30] Pahuja D and Yadav R, Outlier detection for different applications: Review, International Journal of Engineering Research & Technology, 2013, 2(3): 1-13.
[31] Friedman J H and Stuetzle W, Projection pursuit regression, Journal of the American Statistical Association, 1981, 76(376): 817-823. · doi:10.1080/01621459.1981.10477729
[32] Du H, Wang J, Zhang X, Yao X, and Hu Z, Prediction of retention times of peptides in RPLC by using radial basis function neural networks and projection pursuit regression, Chemometrics and Intelligent Laboratory Systems, 2008, 92(1): 92-99. · doi:10.1016/j.chemolab.2007.12.005
[33] Diaconis P and Shahshahani M, On nonlinear functions of linear combinations, SIAM Journal on Scientific and Statistical Computing, 1984, 5(1): 175-191. · Zbl 0538.41041 · doi:10.1137/0905013
[34] Aldrin M, Moderate projection pursuit regression for multivariate response data, Computational Statistics & Data Analysis, 1996, 21(5): 501-531. · Zbl 0900.62334 · doi:10.1016/0167-9473(94)00029-8
[35] Lingjærde O C and Liestøl K, Generalized projection pursuit regression, SIAM Journal on Scientific Computing, 1998, 20(3): 844-857. · Zbl 0985.62028 · doi:10.1137/S1064827595296574
[36] Posse C, Projection pursuit exploratory data analysis, Computational Statistics & Data Analysis, 1995, 20(6): 669-687. · Zbl 0875.62206 · doi:10.1016/0167-9473(95)00002-8
[37] Du H Y, Wang J, Zhang X Y, Yao X J, and Hu Z D, Prediction of retention times of peptides in RPLC by using radial basis function neural networks and projection pursuit regression, Chemometrics and Intelligent Laboratory Systems, 2008, 92(1): 92-99. · doi:10.1016/j.chemolab.2007.12.005
[38] Du H, Wang J, Hu Z, Yao X, and Zhang X, Prediction of fungicidal activities of rice blast disease based on least-squares support vector machines and project pursuit regression, Journal of Agricultural and Food Chemistry, 2008, 56(22): 10785-10792. · doi:10.1021/jf8022194
[39] Liu P and Long W, Current mathematical methods used in QSAR/QSPR studies, International Journal of Molecular Sciences, 2009, 10(5): 1978-1998. · doi:10.3390/ijms10051978
[40] Guo Q J and Yang J G, Application of projection pursuit regression to thermal error modeling of a CNC machine tool, International Journal of Advanced Manufacturing Technology, 2011, 55(5): 623-629.
[41] Guo Q J, Yu S S, and He L, Research on tool wear monitoring method based on project pursuit regression for a CNC machine tool, Research Journal of Applied Sciences, Engineering, and Technology, 2014, 7(3): 438-441.
[42] Fildes R and Stekler H, The state of macroeconomic forecasting, Journal of Macroeconomics, 2002, 24(4): 435-468. · doi:10.1016/S0164-0704(02)00055-1
[43] Huang A Q, Xiao J, and Wang S Y, A combined forecast method integrating contextual knowledge, International Journal of Knowledge and Systems Science, 2011, 2(4): 39-53. · doi:10.4018/jkss.2011100104
[44] Breunig M M, Kriegel H P, Ng R T, and Sander J, LOF: Identifying Density-Based Local Outliers, Proceedings of the 29th ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, 2000.
[45] Fox A J, Outliers in time series, Journal of the Royal Statistical Society, Series B, 1972, 34(3): 350-363. · Zbl 0249.62089
[46] Denby L and Martin R D, Robust estimation of the first order autoregressive parameter, Journal of the American Statistical Association, 1979, 74(365): 140-146. · Zbl 0407.62066 · doi:10.1080/01621459.1979.10481630
[47] Brézillon P and Pomerol J, Contextual knowledge sharing and cooperation in intelligent assistant systems, Le Travail Humain, 1999, 62(3): 223-246.
[48] Koza J R, Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, Cambridge, 1992. · Zbl 0850.68161
[49] Koza J R, Mydlowec W, Lanza G, Yu J, and Keane M A, Reverse engineering and automatic synthesis of metabolic pathways from observed data using genetic programming, Pacific Symposium on Biocomputing, 2001, 6: 434-445.
[50] Fonlupt C, Solving the ocean color problem using a genetic programming approach, Applied Soft Computing, 2001, 1(1): 63-72. · doi:10.1016/S1568-4946(01)00007-2
[51] Sugimoto M, Kikuchi S, and Tomita M, Reverse engineering of biochemical equations from time course data by means of genetic programming, BioSystems, 2005, 80(2): 155-164. · doi:10.1016/j.biosystems.2004.11.003
[52] Worzel W P, Yu J J, Almal A A, and Chinnaiyan A M, Applications of genetic programming in cancer research, International Journal of Biochemistry & Cell Biology, 2009, 41(2): 405-413. · doi:10.1016/j.biocel.2008.09.025
[53] Tsai H C, Using weighted genetic programming to program squat wall strengths and tune associated formulas, Engineering Applications of Artificial Intelligence, 2011, 24(3): 526-533. · doi:10.1016/j.engappai.2010.08.010
[54] Forouzanfar M, Doustmohammadi A, Hasanzade S, and Shakouri G H, Transport energy demand forecast using multi-level genetic programming, Applied Energy, 2012, 91(1): 496-503. · doi:10.1016/j.apenergy.2011.08.018
[55] Kumru M and Kumru P Y, Using artificial neural networks to forecast operation times in metal industry, International Journal of Computer Integrated Manufacturing, 2014, 27(1): 48-59. · doi:10.1080/0951192X.2013.800231
[56] Mombeni H A, Rezaei S, Nadarajah S, and Emami M, Estimation of water demand in Iran based on SARIMA models, Environmental Modeling & Assessment, 2013, 18(5): 559-565. · doi:10.1007/s10666-013-9364-4
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented/enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.