Clustering of wind resource data for the South African renewable energy development zones
This study investigates the use of clustering methodologies as a means of reducing spatio-temporal wind speed data into statistically representative classes of temporal profiles for further processing and interpretation. The clustering methodologies are applied to the high-resolution spatio-temporal, meso-scale renewable energy resource dataset produced for Southern Africa by the Council of Scientific and Industrial Research. This large dataset incorporates thousands of coordinates and represents a challenge from a computational perspective. This dataset can be reduced by applying clustering techniques to classify the temporal wind speed profiles into categories with similar statistical properties. Various clustering algorithms are considered, with the view to compare the performances of these algorithms for large wind resource datasets, namely k-means, partitioning around medoids, the clustering large applications algorithm, agglomerative clustering, the divisive analysis algorithm and fuzzy c-means clustering. Two distance measures are considered, namely the Euclidean distance and Pearson correlation distance. The validation metrics evaluated in the investigation includes the silhouette coefficient, the Calinski-Harabasz index and the Dunn index. Case study results are presented for the Komsberg Renewable Energy Development Zone, located in Western Cape, South Africa. This zone is selected based on the high mean wind speed and large standard deviation exhibited by the temporal wind speed profiles associated with the zone. The effects of seasonal variation in the temporal wind speed profiles are considered by partitioning the input dataset in accordance with the low and high demand seasons defined by the Megaflex Time of Use tariff. The clustered wind resource maps produced by the proposed methodology represent a valuable input dataset for further studies such as siting and the optimal geographical allocation of wind generation capacity to reduce the variability and ramping effects that are inherent to wind energy.
Council for Scientific and Industrial Research, National wind solar sea, Council for Scientific and Industrial Research, [Online]. Available: https://www.csir.co.za/national-wind-solar-sea. [Accessed 08 10 2018].
J. Michalakes, J. Dudhia, D. Gill, T. Henderson, J. Klemp, W. Skamarock and W. Wang, The weather reseach and forecast model: Software architecture and performance, in Use of high performance computing in meteorology, World Scientific Publishing Co Pte Ltd, 2018.
Fraunhofer IWES and The CSIR Energy Centre, Wind and solar PV resource aggregation study for South Africa, Fraunhofer IWES, South Africa, 2016.
Council for Scientific and Industrial Research, Strategic search for SA’s best wind and sun, Council for Scientific and Industrial Research, 2014.
C. Y. Janse van Vuuren, H. J. Vermeulen and J. C. Bekker, Clustering of wind resource Weibull characteristics on the South African renewable energy development zones, in The 10th International Renewable Energy Congress (IREC 2019), Sousse, Tunisia, 2019.
Eskom, tariffs & charges 018/2019, Eskom, 2018.
C. Martha, W. Milligan and G. Cooper, Methodology review: Clustering methods, Applied Ssychological Measurement, vol. 11, no. No.4, pp. 329-354, 1987.
A. Kassambara, Clustering distance measures, in Practical guide to cluster analysis in R, STHDA, 2017, pp. 25-27.
S. Ayramo and T. Karkkainen, Introduction to partitioning-based clustering methods with a robust example, University of Jyvaskyla Department of Mathematical Information Technology, Jyvaskyla, 2006.
K. Alsabti, S. Ranka and V. Singh, An efficient k-means clustering algorithm, in Electrical Engineering and Computer Science, 1997.
M. A. Mottalib and F. B. A. Abid, An accurate grid-based PAM clustering method for large dataset, International Journal of Computer Applications (0975 – 8887), vol. Volume 41, no. No.21, p. 0975 – 8887, 2012.
A. Bhat, k-medoids clustering using partitioning around medoids for performing face recognition, International Journal of Soft Computing, Mathematics and Control, vol. Vol. 3, no. No.3, 2014.
D. S. Wilks, Chapter 15 - Cluster analysis, in International geophysics, Elsevier, 2011.
G. Milligan, An examination of the effect of six types of error perturbation on fifteen clustering algorithms, Psychometrika, vol. 45, no. 3, pp. 325-342, 1980.
J. Keane, A. Stetco and . X.-J. Zeng, Fuzzy C-means++: Fuzzy C-means with effective seeding initialization, Expert Systems with Applications, vol. 42, p. 7541–7548, 2015.
Suranaree University of Technology, The clustering validity with silhouette and sum-of-squared errors, in Industrial Application Engineering, Japan, 2015.
B. Kim, J. Kim and G. Yi, Analysis of clustering evaluation considering features of item response data using data mining technique for setting cut-off scores, Symmetry MDIP, p. 8, 25 April 2017.
S. Saitta, B. Raphael and I. F. C. Smith, A comprehensive validity index for clustering, Intelligent Data Analysis, vol. 12, no. 6, pp. 529-548, November 2008.
F. Kovács, C. Legány and A. Babos, Cluster validity measurement techniques, Department of Automation and Applied Informatics,Budapest University of Technology and Economics, Budapest, Hungary.
S. Saitta, B. Raphael and I. F. Smith, A bounded index for cluster validity, in Ecole Polytechnique Fédérale de Lausanne, Switzerland.
I. Jolliffe, Principal component analysis, in International Encyclopedia of Statistical Science, Berlin, Heidelberg, Springer, 2011.
University of Reading, Wind profile program-Logarithmic wind profile, [Online]. Available: http://www.met.reading.ac.uk/~marc/it/wind/interp/log_prof/. [Accessed 2018 July 19].
A. Kassambara, K-means clustering, in Practical guide to cluster analysis in R, STHDA, 2017, pp. 36-38.
A. Kassambara, K-Medoids, in Practical guide to cluster analysis in R, STHDA, 2017, pp. 36-37.
A. Kassambara, Internal measures for cluster validation, in Practical guide to cluster analysis in R, STHDA, 2017, pp. 139-140.
S. Aranganayagi and K. Thangavel, Clustering categorical data using silhouette coefficient as a relocating measure, in International Conference on Computational Intelligence and Multimedia Applications, Sivakasi, Tamil Nadu, India, 2007.
F. Hoppner, F. Klawonn, R. Kruee and T. Runkler, Fuzzy cluster analysis, New York: John Wiley & Sons, LTD, 2000.
A. Kassambara, CLARA - Clustering large applications, in Practical guide to cluster analysis in R, STHDA, 2017, pp. 57-63.
M. B. Eisen, P. T. Spellman and P. O. Brown, Cluster analysis and display of genome-wide expression patterns, Proceedings of the National Academy of Sciences, p. 14863−14868, 08 November 1999.
R. Tibshirani, G. Walther and T. Hastie, Estimating the number if clusters in a data set via the gap statistic, Royal Statistical Society, USA, 2001.
A. Trevino, Introduction to K-means clustering, 06 December 2016. [Online]. Available: https://www.datascience.com/blog/k-means-clustering. [Accessed 07 October 2018].
R. Lletı́a, M. OrtizaL, A. Sarabiab and M. Sánchezb, Selecting variables for k-means cluster analysis by using a genetic algorithm that optimises the silhouettes, Analytica Chimica Acta, vol. 515, no. 1, pp. 87-100, 2003.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright remains with the author(s).
Publishing rights remain with the author(s)
All articles published in JESA can be re-used under the following CC license: CC BY-SA Creative Commons Attribution-ShareAlike 4.0 International License.