Inferential based statistical indicators for the assessment of solar resource data

The drive to reduce fossil fuel dependency led to a surge in interest in renewable energy as a replacement fuel source, which provided research opportunities for vastly different domains. Statistical modelling was used extensively to assist in research. This study applied two statistical techniques that can be used in conjunction with, or independently of, existing methods to validate solar resource data simulated from models. The case study, using a database from the Southern African Universities Radiometric Network, illustrated the benefits of the proposed methods while comparing them with some of the validation methods currently used. It was demonstrated that profile analysis plots are easy to interpret, as deviations between modelled and measured data over time are clearly observed, while traditional validation scatter plots are unable to distinguish these deviations.


Introduction
Kleissl (2013) asserted that the assessment of solar energy requires reliable data collected over a period of at least ten years. The reason for this lengthy period is that solar resources are highly variable, with seasonal fluctuations, long-term trends and environmental factors influencing the day-to-day data. In practice, said Kleissl, the duration of reliable data collected is often considerably less than deemed sufficient. There are many reasons for this: examples include sites where the recorded data are unreliable because maintenance of the measurement equipment has been inadequate; where ten-year collection periods are too lengthy, given the urgent demand for electrical energy; and where no recorded data are available because of the location of the sites and the lack of infrastructure to record reliable information (Gueymard and Myers, 2009; Kleissl, 2013; Msimanga and Sebitosi, 2014; Clohessy, 2017). These scenarios are not uncommon and are addressed by simulation models using specially designed algorithms that estimate the data expected at a site. Badescu et al. (2013) gave insight into the model types developed for solar radiation computations; the list is extensive.
The challenge investors face is to decide whether the data generated by a simulation model are reliable, as decisions have huge financial implications. In cases such as these, model validation is a method that researchers have used to justify the model choice (Davies et al., 1988; Lefèvre et al., 2007). Model validation is the methodology whereby data generated from a simulation process are compared with data collected using standard measurement systems. The data from these data-generating processes (DGPs) are defined as modelled and measured data respectively. There are several alternative approaches to data validation. Earlier research used descriptive methods of differences in means (Perez et al., 2002; Kudish et al., 2005; Amillo et al., 2018), correlation (Lopez et al., 2001; Amillo et al., 2018), and other descriptive statistics. Interest in distributional comparisons began in the 2000s with Espinar et al. (2009) using the Kolmogorov-Smirnov distance to define an index method, while Badescu (2013) advocated a categorical ranking approach to identify good models, as the concept of a 'best model' was claimed to be an elusive solution. Gueymard (2014) reviewed the statistical methods that were used to validate solar resource data and classified them into four groups, labelled indicators of dispersion of individual points, visual indicators, data distribution indicators, and measures of overall performance. Gueymard gave a detailed exposition of the classification lists; a summary is provided below to contextualise the new methods proposed in the present study.

Class A indicators of dispersion of individual points:
Class A indicators are the most commonly used measures. These include the mean bias error (MBE), root mean square difference (RMSD), standard deviation of the residuals (SD), the coefficient of determination (R²), and other elementary statistics.

Class B indicators of overall performance:
According to Gueymard (2014), the class B indicators are less common than those in class A. They are used to compare different statistical models for the estimation of solar resource data. The model with the highest value for the class B indicator is considered the better model. The present study assesses two data-collecting methods, hence the objective is not to rank models but to compare data-collecting methods.

Class C indicators of distributional similarity:
These compare the distribution of modelled data with a reference dataset. Class C indicators of distribution that have been used in solar energy studies include goodness-of-fit tests such as the Kolmogorov-Smirnov test, Espinar et al.'s (2009) OVER index and Gueymard et al.'s (2012) linear combination method, defined as a combined performance index.

Class D graphical visualisation indicators:
Class D indicators illustrate the relationship between the modelled and measured solar resource data graphically. The most commonly used visual plots are xy-scatter plots and box-whisker plots. In addition, the class A indicators RMSD, SD and R² are often displayed graphically for ease of interpretation.
For several years, class A and D indicators were used to validate the research models for predicting solar resource data, although support for these indicators had not been universal. Willmott and Matsuura (2005), Willmott et al. (2009) and Gueymard (2014) found that they yield contradictory findings. For example, the MBE can show whether a model is under-predicting (negative MBE) or over-predicting (positive MBE) the measured data by a certain percentage. However, this indicator can be misleading, as a zero percent result does not imply that the measured and modelled values coincide. Positive and negative deviations may cancel each other out, thus yielding a zero percent value. Another example of the limitations of a frequently used indicator is the overuse of the xy-scatter plot, a class D indicator. When the database is large, the image becomes difficult to interpret and adds little value for high-level decision-making. Despite the limitations of the indicators, they are often used by developers to make analytically informed decisions on whether to use modelled data in calculating their investment risks.
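The cancellation effect described above can be illustrated with a minimal sketch. The data values are invented for illustration, and expressing both indicators as a percentage of the mean measured value is an assumption about the normalisation, not a convention confirmed by this study.

```python
import math

def mbe_percent(modelled, measured):
    """Mean bias error (modelled minus measured) as % of the mean measured value."""
    n = len(measured)
    bias = sum(m - o for m, o in zip(modelled, measured)) / n
    return 100.0 * bias / (sum(measured) / n)

def rmse_percent(modelled, measured):
    """Root mean square error as % of the mean measured value."""
    n = len(measured)
    mse = sum((m - o) ** 2 for m, o in zip(modelled, measured)) / n
    return 100.0 * math.sqrt(mse) / (sum(measured) / n)

# Hypothetical irradiance values (W/m^2); the errors are +100, -100, +100, -100.
measured = [200.0, 400.0, 600.0, 800.0]
modelled = [300.0, 300.0, 700.0, 700.0]

mbe = mbe_percent(modelled, measured)    # deviations cancel
rmse = rmse_percent(modelled, measured)  # deviations do not cancel
print(f"MBE = {mbe:.1f}%, RMSE = {rmse:.1f}%")
```

Here every modelled value is wrong by 100 W/m², yet the bias is zero, which is why a second indicator such as the RMSE is usually reported alongside the MBE.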
Given the limitations of some validation methods and the dearth of inferential methods, the present study proposes two novel approaches to solar resource validation. The study was undertaken in a South African context to propose two methods for validating the model used to generate a database of three solar resource variables needed in energy assessments. Both methods were based on inferential principles and provided informative visualisation graphics for subjective decision-making (Jang, 2018).

Equipment and database
Three variables were considered appropriate for the analysis in this study. These variables, global horizontal irradiance (W/m²), diffuse irradiance (W/m²) and temperature (°C), have been identified in several studies (Iqbal, 2012; Sung et al., 2015; An et al., 2017; Opálková et al., 2018) as important measures in solar radiation assessments, hence the decision to select them. Modelled data was collated using Meteonorm, a software package that uses trade-restricted algorithms to simulate solar resource data subject to specific settings (Remund et al., 2014). The simulation model uses a database of measured data stored within Meteonorm to generate the modelled data. The database within the software includes measured data collected at different sites worldwide (Remund et al., 2012). If measured data were not available at a fixed site, an interpolation process was used to estimate the solar resource data. Software users decide on the settings for the model. For example, a user would determine the sites for which data are required, the types of data that are required and the time periods for which the data are required. There are many different selection choices and users are expected to have pre-determined requirements for their DGPs. The modelled data for this study used Meteonorm to estimate resource data for Port Elizabeth (PE), Eastern Cape province, South Africa (-34.00°N, 25.67°E). Two databases were required, as two separate validation methods were proposed. The first was created by simulating hourly observations for each variable for a typical year. The data were summarised as daily averages, followed by monthly average determinations. The first validation routine proposed required that each month had the same number of days, so 28 days were randomly chosen from each month to bring all the monthly values in line with February. The second database of modelled data was created by simulating hourly observations for each variable for a ten-year period, summarised as averages for each hour of the day for individual months of an average year. This summative method was required for the second validation method proposed.

Two databases for the measured data were required, similar to the databases prepared for the modelled data. The system used to collect the measured data is part of an online database by the Southern African Universities Radiometric Network (SAURAN), a reliable source of measured solar resource data for stations located throughout South Africa, Namibia, Botswana and the Réunion islands (Brooks et al., 2015). The Nelson Mandela University station is located at the outdoor research facility (ORF) of the Centre for Energy Research in Port Elizabeth. The measurement sensor used to collect the global horizontal irradiance and diffuse irradiance data was a Kipp and Zonen CMP11 (David et al., 2007). Temperature data was collected via a Campbell Scientific CS215 sensor. The measurement instrumentation was positioned on the roof of the outdoor research facility. Figure 1 shows an image of the measurement station at the ORF, with GPS coordinates summarised in Table 1. The measured data (or reference data) were recorded at the outdoor research facility for the calendar years 2013, 2014 and 2015. No data was available for analysis for January and February 2014 or for September to December 2015 because of a faulty recording system. Despite the incomplete dataset, the study deemed the available data sufficient for assessment purposes. The measuring instruments recorded the value of each variable every 15 seconds; these values were then stored as hourly averages. As with the modelled data, these observations were then transformed into monthly averages for the first validation method and into averages for each hour of the day for individual months for the second validation method. Once the databases for both modelled and measured data were available, the analytical aspect of the study was implemented.
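The summarisation steps for the first database (hourly values to daily averages, a random selection of 28 days, then a monthly average) can be sketched as follows. The function names, the fixed random seed and the synthetic input values are illustrative assumptions; the study does not state how the random selection was implemented.

```python
import random
from statistics import mean

def daily_means(hourly_by_day):
    """Average each day's hourly values; input maps day number -> list of values."""
    return {day: mean(values) for day, values in hourly_by_day.items()}

def sample_days(daily, n_days=28, seed=0):
    """Randomly keep n_days days so that every month has the same length."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    kept = sorted(rng.sample(sorted(daily), n_days))
    return {day: daily[day] for day in kept}

# Hypothetical 31-day month with synthetic hourly values.
month = {day: [float(day + hour) for hour in range(24)] for day in range(1, 32)}

daily = daily_means(month)
subset = sample_days(daily)
monthly_mean = mean(subset.values())
print(len(subset), round(monthly_mean, 2))
```

Sampling without replacement keeps each retained day intact, so the daily averages themselves are unchanged; only the number of days per month is equalised.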

Validation method one
Profile analysis is an inferential statistical routine with graphical representations that are easy to interpret. The method is a multivariate technique used to analyse the shape (profile) of variables across groups. The technique is well suited for energy assessments where shapes of summarised data can be compared over multiple periods. Profile analysis can demonstrate whether modelled data of solar irradiance and temperature follow the same shape and trend as measured data, thereby validating the use of the modelled data. This section introduces the mathematical theory of profile analysis. If the variable $\mathbf{X}$ is $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and its components are commensurate (measured in the same units and with approximately equal variance), the means $\mu_1, \mu_2, \ldots, \mu_p$ in $\boldsymbol{\mu}$ can be compared by plotting $\mu_1, \mu_2, \ldots, \mu_p$ as co-ordinates.
When these points are connected, the plot is referred to as a profile. Profile analysis is a comparison of two or more profiles (Rencher, 2003). Suppose that two independent groups or samples have the same number of mean points. Rather than testing the hypothesis that the group means are equal, an option could be to compare the profiles (Rencher, 2003). There are two types of profile tests that are of interest in this study:

Parallel test
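This compares two groups to determine whether each line segment is parallel across both groups. A parallel test can be defined in terms of its slopes. The two group profiles are parallel if the corresponding slopes for each line segment are the same. For the hypothesis test, instead of comparing whether the slopes of the line segments are the same, the increase from one mean to the next in a given group profile is compared (Rencher, 2003):

$$H_{01}: \mu_{1j} - \mu_{1,j-1} = \mu_{2j} - \mu_{2,j-1} \quad \text{for } j = 2, 3, \ldots, p,$$
$$H_{11}: \mu_{1j} - \mu_{1,j-1} \neq \mu_{2j} - \mu_{2,j-1} \quad \text{for at least one } j.$$

Equivalently, $H_{01}: \mathbf{C}\boldsymbol{\mu}_1 = \mathbf{C}\boldsymbol{\mu}_2$, where $\mathbf{C}$ is the $(p-1) \times p$ contrast matrix whose rows take the difference between successive means. Figure 2 illustrates a two-sample profile plot for a parallel test.

Consider $n_i$ measurements $x_{ijk}$ ($k = 1, 2, \ldots, n_i$) for the $j$th period in group $i$. The matrix of the measurements for group $i$ is given as

$$\mathbf{X}_i = \begin{pmatrix} x_{i11} & x_{i21} & \cdots & x_{ip1} \\ x_{i12} & x_{i22} & \cdots & x_{ip2} \\ \vdots & \vdots & & \vdots \\ x_{i1n_i} & x_{i2n_i} & \cdots & x_{ipn_i} \end{pmatrix},$$

where $i$ ($i = 1, 2$) denotes the group and $j$ ($j = 1, 2, \ldots, p$) the period. The vector of means used for constructing the group profile $i$ is given by $\bar{\mathbf{x}}_i = (\bar{x}_{i1}, \bar{x}_{i2}, \ldots, \bar{x}_{ip})'$. The test statistic is calculated as

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\,\left[\mathbf{C}(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)\right]' \left(\mathbf{C}\mathbf{S}_{pl}\mathbf{C}'\right)^{-1} \mathbf{C}(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)$$

and the critical point for rejection of $H_{01}$ is $T^2_{\alpha,\,p-1,\,n_1+n_2-2}$, with

$$\mathbf{S}_{pl} = \frac{(n_1 - 1)\mathbf{S}_1 + (n_2 - 1)\mathbf{S}_2}{n_1 + n_2 - 2},$$

where $\mathbf{S}_i$ is the sample covariance matrix of group $i$ (Rencher, 2003). The discriminant function is calculated using

$$\mathbf{a} = \left(\mathbf{C}\mathbf{S}_{pl}\mathbf{C}'\right)^{-1} \mathbf{C}(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2).$$

The line segment that produces the largest absolute value in the vector $\mathbf{a}$ is the line that contributes the most to the profiles not being parallel (Rencher, 2003).
The test statistic $F$ can also be used to test for parallelism. The test statistic is

$$F = \frac{n_1 + n_2 - p}{(p - 1)(n_1 + n_2 - 2)}\,T^2$$

and rejects $H_{01}$ if $F \geq F_{\alpha,\,p-1,\,n_1+n_2-p}$.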
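As an illustration of the parallel test, the sketch below computes $T^2$, the equivalent $F$ statistic and the discriminant vector for two small groups, assuming the standard two-sample form of the test in Rencher (2003). It is not the routine used in this study; the function names and data are hypothetical. Applying the contrast matrix to an observation amounts to taking successive differences, so the code works on differenced rows.

```python
def col_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter(rows):
    """Sums of squares and cross-products about the column means."""
    m = col_means(rows)
    q = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows)
             for b in range(q)] for a in range(q)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def parallel_test(g1, g2):
    """Return (T2, F, a) for the two-sample parallelism test."""
    # Successive differences of each row, i.e. the contrast matrix C applied to it.
    d1 = [[r[j + 1] - r[j] for j in range(len(r) - 1)] for r in g1]
    d2 = [[r[j + 1] - r[j] for j in range(len(r) - 1)] for r in g2]
    n1, n2, p = len(g1), len(g2), len(g1[0])
    w1, w2 = scatter(d1), scatter(d2)
    # Pooled covariance of the differenced data, i.e. C S_pl C'.
    s = [[(w1[a][b] + w2[a][b]) / (n1 + n2 - 2) for b in range(p - 1)]
         for a in range(p - 1)]
    diff = [u - v for u, v in zip(col_means(d1), col_means(d2))]
    a = solve(s, diff)  # discriminant function coefficients
    t2 = n1 * n2 / (n1 + n2) * sum(x * y for x, y in zip(diff, a))
    f = (n1 + n2 - p) / ((n1 + n2 - 2) * (p - 1)) * t2
    return t2, f, a

# Hypothetical data: 4 observations (rows) over p = 3 periods (columns).
measured = [[1.0, 2.0, 4.0], [2.0, 4.0, 5.0], [3.0, 3.0, 7.0], [4.0, 6.0, 6.0]]
shifted = [[v + 10.0 for v in row] for row in measured]  # same shape, higher level
tilted = [[r[0] + 10.0, r[1] + 10.0, r[2] + 15.0] for r in measured]  # last segment steeper

t2_par, f_par, _ = parallel_test(measured, shifted)
t2_tilt, f_tilt, a_tilt = parallel_test(measured, tilted)
print(t2_par, t2_tilt, f_tilt)
```

A constant shift leaves the statistic at zero because the test only looks at segment-to-segment increases; the difference in level is the subject of the separate same level test.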

Coincidental/same level test
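This test assesses whether the group profiles are at the same level by comparing the average level of profile one with the average level of profile two (Rencher, 2003). The null and alternative hypotheses for the same level test are given respectively as

$$H_{02}: \mathbf{j}'\boldsymbol{\mu}_1 = \mathbf{j}'\boldsymbol{\mu}_2 \quad \text{and} \quad H_{12}: \mathbf{j}'\boldsymbol{\mu}_1 \neq \mathbf{j}'\boldsymbol{\mu}_2,$$

and the term $\mathbf{j}$ is a $(p \times 1)$ vector of ones (Rencher, 2003). The same level test rejects $H_{02}$ if $|t| \geq t_{\alpha/2,\,n_1+n_2-2}$, where

$$t = \frac{\mathbf{j}'(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)}{\sqrt{\mathbf{j}'\mathbf{S}_{pl}\,\mathbf{j}\,\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},$$

$\bar{\mathbf{x}}_i$ is the vector of means for group $i$, $n_i$ is the number of observations in group $i$ and $\mathbf{S}_{pl}$ is the pooled sample covariance matrix.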
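Because $\mathbf{j}'\mathbf{x}$ is simply the sum of an observation over all periods, the same level test reduces to a pooled two-sample t-test on those sums. The sketch below uses that equivalence, assuming the standard form of the test in Rencher (2003); the function name and data are hypothetical.

```python
import math

def level_test(g1, g2):
    """Pooled two-sample t statistic on the row sums (j'x) of each group."""
    s1 = [sum(r) for r in g1]
    s2 = [sum(r) for r in g2]
    n1, n2 = len(s1), len(s2)
    m1, m2 = sum(s1) / n1, sum(s2) / n2
    ss = sum((v - m1) ** 2 for v in s1) + sum((v - m2) ** 2 for v in s2)
    pooled_var = ss / (n1 + n2 - 2)  # equals j' S_pl j
    t = (m1 - m2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, n1 + n2 - 2  # statistic and its degrees of freedom

# Hypothetical data: 4 observations (rows) over 3 periods (columns).
measured = [[1.0, 2.0, 4.0], [2.0, 4.0, 5.0], [3.0, 3.0, 7.0], [4.0, 6.0, 6.0]]
shifted = [[v + 10.0 for v in row] for row in measured]  # parallel but higher

t_same, df_same = level_test(measured, measured)
t_shift, df_shift = level_test(measured, shifted)
print(t_same, round(t_shift, 3), df_shift)
```

The shifted group is perfectly parallel to the first yet produces a large negative statistic, which mirrors the situation reported later for temperature: parallel profiles that are not at the same level.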

Validation method two
Confidence limits for comparing observed data are commonly used in many areas of applied statistics, of which statistical process control (Montgomery and Runger, 1993; Woodall and Montgomery, 2014) is possibly the best-known domain. The present study used confidence limits to illustrate the unique nature of solar irradiance data, in particular how the intervals constrict over the daylight period as the light dissipates. A review of the literature revealed no evidence of interval plots having been used in solar energy applications.
The modelled hourly observations are defined by $x_{i,j,k,l}$, where a comma separator is used in the subscript notation to ensure that the level identifiers are clearly marked. This differs from convention but is necessary for clarity, as some levels have double digits that could be ambiguous. The subscript notation identifies the month ($i$), the day of the month ($j$), the hour of the day ($k$) and the year ($l$), and the data are summarised accordingly. The sample mean and sample standard deviation across the ten years of modelled data are defined as

$$\bar{x}_{i,j,k} = \frac{1}{10}\sum_{l=1}^{10} x_{i,j,k,l} \quad \text{and} \quad s_{i,j,k} = \sqrt{\frac{1}{9}\sum_{l=1}^{10}\left(x_{i,j,k,l} - \bar{x}_{i,j,k}\right)^2}.$$

In summary, this study proposes a profile analytical approach to validate modelled versus measured data. This method compares both the shape and trend (profile) of the data and is easily visualised for interpretative purposes. In addition, the second validation method proposes the use of interval estimates to illustrate the coverage of measured data by modelled data. These plots are easy to interpret and provide users with a simple diagnostic tool to validate datasets.

In the scatter plots (Figures 4, 5 and 6), the measured data are plotted on the $x$-axis and the modelled data on the $y$-axis. The dashed red line indicates the best fitted line to the hourly data, with corresponding R² estimates. The solid blue line, $y = x$, is given as a zero-intercept reference line. Table 3 summarises the class A indicator results.
The scatter plots provide a visual comparison between the hourly data for the modelled and measured data. The modelled hourly global horizontal irradiance from the software shows a good correlation with the measured data (R² = 0.767). The modelled data for diffuse irradiance (R² = 0.445) and temperature (R² = 0.585) have a weaker correlation with the measured data. The scatter plots are difficult to interpret because of the large datasets (Gueymard, 2014). The linear trend between the measured and modelled data is not easily discernible in the plots. The data points on the scatter plots are highly dispersed from the fitted line; this is most evident in the diffuse irradiance plot. It would be difficult to come to a decision on the validity of the datasets based on these plots, and further analysis would be required. In addition, when the criteria defined by Badescu et al. (2013) were applied to the numerical statistics MBE and RMSE, conflicting results emerged. For the global horizontal irradiance data, the requirement that the absolute value of the MBE be less than 5% was satisfied, but the RMSE requirement of less than 15% was not met. Given the Badescu et al. rating of a model as 'bad' when RMSE > 20%, there is a possibility that the modelled data should be viewed with caution. In this case, a conservative decision would be to disregard the modelled data, as the validation process was inconclusive. These results lend support to the recommendation by Badescu et al. that a single indicator alone should not be used to assess model viability. In the cases of diffuse irradiance and temperature, the set criteria are not satisfied. Based on these findings, the conservative decision would be to disregard the modelled data. The analysis discussed is not exhaustive but provides evidence of support for the development of alternative assessment tools.
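The way the two numerical criteria can disagree may be made concrete with a small decision rule. This is a simplified sketch built only from the thresholds quoted above (|MBE| < 5%, RMSE < 15%, and 'bad' for RMSE > 20%); the category labels "acceptable" and "inconclusive" are illustrative names, not Badescu et al.'s terminology, and the input values are hypothetical rather than taken from Table 3.

```python
def rate_model(mbe_pct, rmse_pct):
    """Classify a model from its MBE and RMSE, both given in percent."""
    if abs(mbe_pct) < 5.0 and rmse_pct < 15.0:
        return "acceptable"    # both criteria satisfied
    if rmse_pct > 20.0:
        return "bad"           # RMSE alone is enough to flag the model
    return "inconclusive"      # criteria disagree or fall in between

# Hypothetical indicator values for three models.
for mbe_pct, rmse_pct in [(1.0, 10.0), (2.0, 18.0), (1.0, 25.0)]:
    print(mbe_pct, rmse_pct, rate_model(mbe_pct, rmse_pct))
```

The middle case passes the bias criterion while failing the RMSE criterion, which is the kind of conflicting outcome that motivates the inferential methods proposed in this study.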

Validation method one: Two-sample profile analysis
This analysis was made possible by using the package profileR (Bulut and Desjardins, 2013) in R (R Core Team, 2014). Results are extracted from the outputs and provided for interpretative purposes.
For discussion purposes, only the 2013 results are shown, but results for 2014 and 2015 are broadly comparable. Figures 7, 8 and 9 show the profile plots for global horizontal irradiance, diffuse irradiance and temperature respectively. Figure 7 shows that the modelled global horizontal irradiance data are parallel to, and at the same level as, the 2013 measured data. The month of September had a marginally larger monthly average value for the measured data when compared with that of the modelled data. On inspection of the source data it was found that several days in September had unusually large hourly averaged measured data. This was later discovered to be a result of less cloud cover in the month because of low rainfall and high wind speeds.
Figure 8 illustrates the profile plot for the 2013 modelled and measured diffuse irradiance data. Although the profiles tend to be parallel, there is a difference between the datasets for the months of September and November. As previously emphasised, September experienced less cloud cover, which in turn produced less radiation scattering, hence lower diffuse irradiance. No obvious reason for the November difference could be detected on inspection.
The final profile plot of the temperature averages for the modelled and measured data is shown in Figure 9. The plots are inclined to be parallel, with some deviation in July. The noticeable observation is that the levels of the datasets differ. From this plot, the DGPs raise a concern, as there is clearly a difference, indicating that caution is needed when using the temperature data. Given that the visualisation tools have provided some intuitive idea of the DGPs' integrity, the inferential assessments, using a conservative rejection rule of 5%, are provided. The inferential assessments of the hypotheses for global horizontal irradiance, diffuse irradiance and temperature are given in Table 4.
These results support the interpretations of the visual observations from Figures 7, 8 and 9. The null hypotheses that the global horizontal irradiance profiles were coincidental and parallel were not rejected at the 5% significance level. This lends support to the observation that the 2013 modelled and measured data are parallel and coincidental, providing evidence that the modelled dataset is valid, a finding which conflicts with some of the existing indicators discussed previously. For example, the RMSE indicated that modelled and measured data were different. The null hypothesis that the profiles were parallel for diffuse irradiance was not rejected at the 5% significance level, but the null hypothesis that the diffuse irradiance profiles were coincidental was rejected at the 5% significance level. Despite this finding, the profile visualisation was able to identify where the potential differences were, allowing for further investigation. This visualisation can be invaluable to decision-making when confronted by conflicting results. Finally, the hypothesis that the profiles for temperature were parallel was not rejected, while the hypothesis that they were coincidental was rejected. This corroborates the observation that the modelled and measured 2013 data for temperature were not at the same level. In summary, the graphical and inferential procedures are easily implementable in the comparison of solar resource data and complement each other in cases where differences are observed.

Validation method two: Confidence interval plots
Table 2 shows the functions used to estimate the means and the pooled standard deviations.

The measured data for the years 2013, 2014 and 2015 predominantly fell within the interval estimates of the modelled data. The interval estimates for March at 12:00 marginally underestimated the 2014 measured data. Similarly, the interval estimate for July marginally overestimated the 2015 measured data from 9:00 and 15:00. The intervals clearly illustrate the small variability expected for the hourly mean global horizontal irradiance values in the early mornings and late afternoons. The results show that the interval estimates provided good coverage of the measured data. This assessment is useful for developers, as it enables them to decide whether the modelled data are appropriate for further consideration. Figure 11 shows the diffuse irradiance interval estimate plots for PE.
Between the hours of 11:00 and 14:00, the interval underestimated the measured data for 2014 and, to a lesser extent, overestimated the measured data for 2013. These illustrations are very informative, as they tell the user that this type of data is highly variable and that decisions based on the modelled estimates need to be made with caution. Further illustrations are available in Clohessy (2017), but findings and interpretations are similar to those discussed for March. The results corroborated the belief that diffuse irradiance is highly variable and that caution should be exercised when it is used in risk assessments. Figure 12 shows the interval estimate plots for temperature.
The interval estimates for temperature do not cover the measured data for 2013, 2014 and 2015. Although the modelled interval estimates had the same shape and trend as the measured data, the datasets were different. This result supported the findings from the two-sample profile analysis, even though one method used monthly averages and the other used hourly averages. This finding is evidence that these validation methods, when used appropriately, can provide reliable information for the decision-making process.
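The coverage check behind the interval plots can be sketched numerically. The exact interval function used in the study is not recoverable from the text, so the limits below are assumed to take the normal-theory form mean ± z × pooled standard deviation with z = 1.96 for a 95% interval; the helper names and all numbers are hypothetical.

```python
import math

def pooled_sd(sds):
    """Pool equally sized groups: square root of the mean variance."""
    return math.sqrt(sum(s * s for s in sds) / len(sds))

def interval(mean, sd, z=1.96):
    """Assumed form of the limits: (LL, UL) = mean -/+ z * sd."""
    return mean - z * sd, mean + z * sd

def coverage(measured, limits):
    """Fraction of measured values lying inside their (LL, UL) limits."""
    inside = sum(1 for x, (lo, hi) in zip(measured, limits) if lo <= x <= hi)
    return inside / len(measured)

# Hypothetical modelled hourly means and pooled SDs for three hours of one month.
modelled_means = [100.0, 500.0, 800.0]
modelled_sds = [pooled_sd([3.0, 4.0]), 40.0, 60.0]
limits = [interval(m, s) for m, s in zip(modelled_means, modelled_sds)]

measured_hourly = [104.0, 590.0, 790.0]  # hypothetical measured hourly averages
share_covered = coverage(measured_hourly, limits)
print(limits[0], share_covered)
```

Reporting the covered fraction per month and variable would give a single number to accompany each interval plot, although the plots themselves show at which hours the measured data fall outside the limits.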

Discussion
This research emanated from calls by Kohler (2014) and Sanoh et al. (2014) to increase South African-based research into solar energy systems and their applications. As the knowledge base increases, important information will be available to industry role players to facilitate more informed decisions. There is a distinct lack of infrastructure resources in the country to collect long-term measured data, hence it is necessary to use modelled data to inform opinion on solar projects (Pegels, 2010; Msimanga and Sebitosi, 2014). This study allowed for the development of two methods, which are easily implementable, to assess the validity of data generated from a model.
Both techniques provided satisfactory evidence that modelled data adequately predict global horizontal irradiance for the Port Elizabeth region. The results showed that diffuse irradiance measurements have more variability than is computationally predicted by Meteonorm. The modelled temperature estimates were not found to be coincidental with the measured data for the years 2013, 2014 and 2015.

Conclusion
The conflicting results observed for global horizontal irradiance using traditional validation methods, set against the inferential acceptance by the profile analysis method, indicate that there is merit in users adding these techniques to their toolbox when conducting solar irradiance assessments. A limitation of the two-sample profile analysis was the necessity to fix the number of days per month, thereby requiring a randomisation approach to select 28 days for each month.

Figure 2: Illustration of a parallel profile analysis plot used as an indication of which slope contributes the most to the rejection of $H_{01}$.

Figure 3: Illustration of a same level profile analysis plot.
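The sample mean and the pooled standard deviation for each hour $k$ for month $i$ are defined as

$$\bar{x}_{i,k} = \frac{1}{d_i}\sum_{j=1}^{d_i} \bar{x}_{i,j,k} \quad \text{and} \quad s_{i,k} = \sqrt{\frac{1}{d_i}\sum_{j=1}^{d_i} s_{i,j,k}^2},$$

where $d_i$ is the number of days in month $i$, and $\bar{x}_{i,j,k}$ and $s_{i,j,k}$ are the sample mean and sample standard deviation of the modelled data for day $j$ and hour $k$ across the ten years. The variables are assumed to be normally distributed.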
Validation methods
The results from four selected indicators are reported in this section. The indicators (xy-scatter plots, R², RMSE and MBE) are commonly used and include graphical and numerical descriptive measures. Only the results of the analysis for the 2013 data are shown, to avoid repetitive discussion. Similar findings were observed for the 2014 and 2015 data. Figures 4, 5 and 6 show the scatter plots for the hourly global horizontal irradiance, diffuse irradiance and temperature data respectively.

Figure 7: Profile analysis plot comparing global horizontal irradiance of 2013 data.

Figure 8: Profile analysis plot comparing diffuse irradiance of 2013 data excluding January and February data.

Figure 9: Profile analysis plot comparing temperature of 2013 data.

Table 4: Two-sample profile analysis inferential assessment results for the comparison of global horizontal irradiance, diffuse irradiance and temperature 2013 data.
Columns: Test, F, p-value.

The means ($\mu$) and the pooled standard deviations ($\sigma$) were grouped together for their respective hours. Once estimated, the means and pooled standard deviations provided an average hourly confidence interval estimate using the function (), and the average hourly measured data were included to visually assess the coverage of the interval. The interval estimates were calculated for all 12 months of the year, but only March results are shown. A 95% confidence interval was considered adequate to illustrate the method, with the upper (UL) and lower (LL) limits represented by red dashed lines in Figures 10, 11 and 12. Figure 10 shows the global horizontal irradiance interval estimate plots for PE.

Figure 10: Modelled 95% interval estimate plot for global horizontal irradiance including the measured data for March 2013, 2014 and 2015, where UL = upper limit and LL = lower limit.

Figure 11: Modelled 95% interval estimate plot for diffuse irradiance including the measured data for March 2013, 2014 and 2015, where UL = upper limit and LL = lower limit.

Figure 12: Modelled 95% interval estimate plot for temperature including the measured data for March 2013, 2014 and 2015, where UL = upper limit and LL = lower limit.

Table 3: Class A indicator results for solar resource comparison for 2013.
R² = coefficient of determination, MBE = mean bias error, RMSE = root mean square error