Feature Correction

Occasionally, PV data can include features such as saturation, failure, or strong seasonality that should be removed prior to or during analysis. This package includes a feature correction method for each of these.

Saturation Removal

Inverter saturation clips power output at the inverter's limit, which can distort trends in the data. If you know the saturation limit of the inverter for your data, you can run plr_saturation_removal to remove saturated data points.

test_dfc_removed_saturation <- plr_saturation_removal(test_dfc, var_list, sat_limit = 3000, power_thresh = 0.99)

# fraction of data kept
nrow(test_dfc_removed_saturation)/nrow(test_dfc)
#> [1] 0.994224
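
To make the role of the two parameters concrete, here is a minimal base R sketch of a saturation filter. It is not the package's implementation: the toy data frame, the column name power, and the rule "drop rows at or above sat_limit * power_thresh" are all assumptions made for illustration.

```r
# Hypothetical toy data; 'power' stands in for the column named by var_list$power_var
toy <- data.frame(
  power      = c(1200, 2500, 2980, 3000, 2999, 1800),
  irradiance = c(400, 820, 990, 1050, 1100, 600)
)

sat_limit    <- 3000  # inverter saturation limit, in the same units as power
power_thresh <- 0.99  # fraction of the limit at which a point counts as saturated

# Assumed rule: keep only rows whose power is below sat_limit * power_thresh
cutoff   <- sat_limit * power_thresh
toy_kept <- toy[toy$power < cutoff, ]

# fraction of data kept
nrow(toy_kept) / nrow(toy)
```

Under this assumed rule, a power_thresh slightly below 1 also removes points that sit just under the nominal limit, which are likely clipped as well.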

System Failures and Soiling

If you know or suspect that the data contain a period of system failure, this package offers a way to identify and remove the affected data points. The method plr_failure_test filters the data for particularly low correlation between power and irradiance, then runs k-means clustering (with k = 2) to identify a cluster of points that may indicate soiling. It can group the data by month or examine all of the data at once. The function removes the data in the smaller cluster if that cluster shows much lower power production per unit of irradiance and accounts for a small portion (< 0.25) of the data.

# default values inserted for reference
#df_failure <- plr_failure_test(test_dfc, var_list, corr_thresh = 0.95, plot = FALSE, by_month = FALSE)

# fraction of data kept
#nrow(df_failure)/nrow(test_dfc)
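
The clustering step described above can be sketched with stats::kmeans from base R. This is only an illustration on synthetic data, not the internals of plr_failure_test: the toy columns, the use of the power-to-irradiance ratio as the clustering variable, and the 0.75 cutoff for "much lower" are assumptions.

```r
set.seed(42)  # kmeans uses random starting centers

# Hypothetical toy data: 180 healthy points and 20 points with depressed output
irradiance <- runif(200, min = 300, max = 1000)
efficiency <- c(rnorm(180, mean = 3.0, sd = 0.05),
                rnorm(20,  mean = 1.5, sd = 0.05))
toy <- data.frame(irradiance = irradiance, power = irradiance * efficiency)

# Cluster on power produced per unit of irradiance, with k = 2
toy$ratio <- toy$power / toy$irradiance
km <- stats::kmeans(toy$ratio, centers = 2, nstart = 10)

# Identify the smaller cluster and compare it with the larger one
small      <- which.min(km$size)
large      <- which.max(km$size)
small_frac <- km$size[small] / nrow(toy)
much_lower <- km$centers[small] < 0.75 * km$centers[large]  # illustrative cutoff (assumption)

# Drop the smaller cluster only if it is both small and clearly underperforming
if (small_frac < 0.25 && much_lower) {
  toy_clean <- toy[km$cluster != small, ]
} else {
  toy_clean <- toy
}

# fraction of data kept
nrow(toy_clean) / nrow(toy)
```

Requiring both conditions guards against discarding a legitimate operating regime: a large second cluster, or one with similar output per irradiance, is left in place.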

plr_failure_test also includes an option to plot the slopes of day-by-day linear models of power production against irradiance, with one slope per day. To plot, you must specify a file_path and file_name under which to save the plot; the plot is not returned within the R environment. A boxplot of those same slopes is also generated, which visually identifies outliers that may represent failures or soiling.
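
As a rough illustration of the day-by-day slope idea, the sketch below fits one linear model per day on synthetic data and applies the standard boxplot outlier rule to the slopes. The toy data, column names, and choice of which days underperform are hypothetical, and the code does not reproduce the plots that the package saves to file.

```r
set.seed(1)

# Hypothetical toy data: 30 days, 12 readings per day; days 10 and 11 underperform
day        <- rep(1:30, each = 12)
irradiance <- runif(length(day), min = 200, max = 1000)
gain       <- ifelse(day %in% c(10, 11), 1.5, 3.0)
power      <- gain * irradiance + rnorm(length(day), sd = 20)
toy        <- data.frame(day = day, irradiance = irradiance, power = power)

# Fit power ~ irradiance separately for each day and keep the slope
daily_slope <- sapply(split(toy, toy$day), function(d) {
  unname(coef(lm(power ~ irradiance, data = d))["irradiance"])
})

# Boxplot rule: slopes more than 1.5 * IQR beyond the hinges are flagged as outliers
outlier_slopes <- grDevices::boxplot.stats(daily_slope)$out
outlier_days   <- as.integer(names(daily_slope)[daily_slope %in% outlier_slopes])
outlier_days
```

Days whose slope falls far below the bulk of the distribution are candidates for failure or soiling and deserve a closer look before any data are removed.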

Seasonality Decomposition

Following power prediction, seasonality may still be apparent in the data. This is often the case with the XbX model, whose data-driven nature tends to leave seasonality in. Decomposition, a statistical method for separating the seasonal component from the data, can be performed on such power-predicted data.

test_xbx_wbw_decomp <- plr_decomposition(test_xbx_wbw_res, freq = 52, power_var = 'power_var', time_var = 'time_var', plot = FALSE, plot_file = NULL, title = NULL, data_file = NULL)

# generate a pretty table
knitr::kable(test_xbx_wbw_decomp[1:5, ], caption = "XbX Week-by-Week Decomposition: Resulting Data")

XbX Week-by-Week Decomposition: Resulting Data
raw	seasonal	trend	remainder	weights	sub.labels	interpolated	age	sigma	operating	power
2699.661	53.13425	2600.062	46.465313	1	subseries 1	FALSE	1	64.56543	1	2600.062
2686.040	93.35386	2599.413	-6.727034	1	subseries 2	FALSE	2	33.34219	2	2599.413
2711.840	96.67574	2598.748	16.415820	1	subseries 3	FALSE	3	33.58805	3	2598.748
2709.884	117.10601	2598.075	-5.296855	1	subseries 4	FALSE	4	34.55537	4	2598.075
2697.696	109.10639	2597.400	-8.809981	1	subseries 5	FALSE	5	49.33841	5	2597.400

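# make plots of the decomposed data
# aes(), geom_point(), geom_smooth(), and theme_bw() are called unqualified, so ggplot2 must be attached
library(ggplot2)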
raw_plot <- ggplot2::ggplot(test_xbx_wbw_decomp, aes(age, raw)) +
  geom_point() +
  geom_smooth( method = "lm") +
  theme_bw()

trend_plot <- ggplot2::ggplot(test_xbx_wbw_decomp, aes(age, trend)) +
  geom_point() +
  geom_smooth( method = "lm") +
  theme_bw()

seasonal_plot <- ggplot2::ggplot(test_xbx_wbw_decomp, aes(age, seasonal)) +
  geom_point() +
  geom_smooth( method = "lm") +
  theme_bw()

raw_plot
#> `geom_smooth()` using formula = 'y ~ x'

trend_plot
#> `geom_smooth()` using formula = 'y ~ x'

seasonal_plot
#> `geom_smooth()` using formula = 'y ~ x'
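
For readers who want to see the decomposition idea in isolation, the following sketch applies stats::stl from base R to a synthetic weekly series. It is a stand-in for illustration, not the routine that plr_decomposition calls, and the toy series, its slope, and its noise level are invented. The frequency of 52 mirrors the freq = 52 argument used above for week-by-week data.

```r
set.seed(7)

# Hypothetical toy series: 4 years of weekly power with a slow decline and an annual cycle
n_weeks <- 52 * 4
age     <- seq_len(n_weeks)
power   <- 2600 - 0.5 * age + 100 * sin(2 * pi * age / 52) + rnorm(n_weeks, sd = 15)

# frequency = 52 marks one seasonal cycle per 52 weekly observations
power_ts <- stats::ts(power, frequency = 52)

# STL splits the series into seasonal, trend, and remainder components
fit <- stats::stl(power_ts, s.window = "periodic")
decomp <- data.frame(
  age       = age,
  raw       = power,
  seasonal  = as.numeric(fit$time.series[, "seasonal"]),
  trend     = as.numeric(fit$time.series[, "trend"]),
  remainder = as.numeric(fit$time.series[, "remainder"])
)

# The three components add back up to the raw series
components_sum_ok <- isTRUE(all.equal(decomp$raw,
                                      decomp$seasonal + decomp$trend + decomp$remainder))
components_sum_ok

# Slope of the deseasonalized trend, in power units per week
trend_slope <- unname(coef(lm(trend ~ age, data = decomp))["age"])
trend_slope
```

Fitting a line to the trend column rather than to the raw column keeps the seasonal swing from influencing the estimated rate of change, which is the motivation for decomposing before regressing.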