Comparison and Evaluation of Clustering Algorithms and Methods Applied to Weather Data for Defining Weather Event Profiles and Their Characteristics


  • Julian Erath, DHBW Stuttgart


Cluster analysis, clustering models, weather data, meteorology, weather data grouping, weather event profiles


This work performs cluster analyses on weather data from Ontario, Canada, using the algorithms KMeans, HAC (hierarchical agglomerative clustering), GMM (Gaussian mixture models), and DBSCAN, and follows the Design Science Research (DSR) approach. The goal is to identify weather event profiles. The resulting profiles could be applied in weather forecasts, dashboards, and anomaly detection. IBM Deutschland GmbH provides seven years of historical weather data, which also offer potential for future research.

