For forecasting electricity generation from fickle renewable energy sources like wind and solar, help is coming from artificial intelligence. Machine learning and its subset, deep learning, are beginning to replace conventional forecasting models based on weather and satellite data, as well as statistical prediction models.

Deep learning models, from simple artificial neural networks to the more complex ‘long short-term memory’ (LSTM) architecture, which is particularly effective at making predictions from sequential data, are coming into play and improving the accuracy of forecasting.

Now three researchers from the Indian Institute of Engineering Science and Technology, Kolkata, Rakesh Mondal, Surajit Kumar Roy and Chandan Giri, have come up with an improved AI technique for forecasting solar generation. Instead of using a simple deep learning model, these scientists employed an ensemble of deep learning models, which they describe as “one step more advanced than simple deep learning models.” The result, they say, is higher accuracy.

The AI advantage

Not that ensemble models, which combine predictions from multiple individual deep learning models, are entirely new. In a paper published in Energy, the authors acknowledge that other researchers have tried the ensemble method but say that they have “included features that enhance accuracy of prediction” in their own research. These features include parameters such as the physical characteristics of solar panels (the number of cells in a panel, the panel’s maximum working temperature and its material type) as well as ambient temperature. “None of the existing techniques has considered these parameters for solar power prediction,” the authors say.

Mondal, Roy and Giri have used a technique called ‘Bi-directional Long Short-term Memory’ or BI-LSTM — a type of recurrent neural network (RNN) designed to handle sequential data. Unlike standard LSTM, which processes data in one direction (past to future), BI-LSTM processes data in both directions (past to future and future to past). This allows the model to have a better understanding of the context by considering both past and future information.
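The effect of the bidirectional wrapper is easy to see in code. The sketch below is a generic illustration with assumed shapes, not the authors’ model: in Keras, a Bidirectional LSTM runs one pass past-to-future and a second future-to-past and concatenates the two hidden states, so its output is twice as wide as a one-directional LSTM’s.

```python
# Generic illustration (assumed shapes, not the authors' model):
# Bidirectional runs one LSTM past-to-future and a second future-to-past,
# then concatenates their hidden states, doubling the output width.
import numpy as np
from tensorflow.keras import layers

x = np.random.rand(1, 10, 14).astype("float32")  # (batch, timesteps, features)

uni = layers.LSTM(16)                        # past -> future only
bi = layers.Bidirectional(layers.LSTM(16))   # both directions, concatenated

print(tuple(uni(x).shape), tuple(bi(x).shape))  # (1, 16) (1, 32)
```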

The researchers prepared a dataset by combining weather parameters and solar generation data and then enriched the dataset by bringing in meteorological data as well as physical characteristics of the solar panels deployed in the respective solar plants. The BI-LSTM model, they say, can predict the future solar power generation of a specific solar plant on both short and long horizons regardless of the geographical position of the solar plant.

“For short-term prediction, we can predict the generation of solar power for fifteen minutes to one hour ahead, and for long-term forecasting, we are able to predict PV power generation for 1-3 days ahead with noticeable accuracy,” the paper says.

Mondal, Roy and Giri compared their proposed model, on the same dataset, against multiple standard deep learning models and found that “our model produced better performance than traditional models.” They also validated their model using different solar plants in Durgapur, India. “For long-term forecasting, our model also outperformed the base model.”

From data to decisions

In an emailed response to quantum’s questions, Dr Giri said the researchers used a time series dataset containing 14 independent features and one dependent feature. The dataset contained data for every 15 minutes from January 1 to December 31, 2022. “We tested our trained model with other datasets collected from solar plants situated in Durgapur, West Bengal. Then we tested our model with a published dataset collected from Denmark. We found our model gives similar results.”

No model is flawless. “We faced some limitations during the test,” Dr Giri said, noting that when there were abrupt changes in the weather parameters, they got slightly different results.

Asked if the ensemble model would call for high computational power, Dr Giri said that their model “is quite lightweight,” containing only 1.2 million parameters. “We believe that it will not be an issue during large-scale implementation,” he said.

“We believe that our model is trained with a very small amount of data,” he said, adding that they were trying to extend the work with a larger amount of data to improve the model’s efficiency.

This work, Dr Giri told quantum, will help researchers explore other dimensions, looking beyond a specific dataset to the scientific knowledge of specific domains.

Full version of the emailed responses of Dr Chandan Giri to quantum’s queries:

Q

How did you train the BI-LSTM component of your model? What kind of data pre-processing was required?

We trained our BI-LSTM model using the Google Colab platform and the Python programming language. We used the Keras API to build and train our ensemble with the help of the TensorFlow, scikit-learn and NumPy libraries.

Our model has an input layer, two parallel BI-LSTM branches, an averaging layer and a final output. The model takes input with the input shape of the dataset. The two branches have different initial weights but are trained on the same input. Within each branch, the first bidirectional LSTM module is followed by a dropout layer; a second bidirectional LSTM module takes that output and passes it through another dropout layer, whose output is fed to a third bidirectional LSTM module; finally, the output passes through a dense layer. The final output is produced by averaging the outputs of the two branches. We used LeakyReLU in the LSTM modules and the ReLU activation function in the dense layer.
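An ensemble of this shape could be sketched in the Keras functional API roughly as follows. The layer widths, dropout rate and the input window (8 timesteps of 14 features) are illustrative assumptions, not the paper’s values, and the LSTM modules keep Keras’s default activations.

```python
# Hedged sketch of the described ensemble (illustrative sizes, not the
# paper's): two stacked BI-LSTM branches share one input but start from
# different random weights, and their dense outputs are averaged.
import numpy as np
from tensorflow.keras import layers, models

inp = layers.Input(shape=(8, 14))  # assumed: 8 timesteps, 14 features

def branch(x):
    # BI-LSTM -> dropout -> BI-LSTM -> dropout -> BI-LSTM -> dense head
    x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)
    x = layers.Dropout(0.2)(x)
    x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)
    x = layers.Dropout(0.2)(x)
    x = layers.Bidirectional(layers.LSTM(32))(x)
    return layers.Dense(1, activation="relu")(x)

# Averaging the two branches' predictions forms the ensemble output.
out = layers.Average()([branch(inp), branch(inp)])
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="mse")

pred = model.predict(np.random.rand(4, 8, 14).astype("float32"), verbose=0)
print(pred.shape)  # (4, 1): one power prediction per input window
```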

Then we combined the data collected from the IIEST solar hub and CPCBI, and augmented our dataset using the physical characteristics of the solar panels used in the IIEST solar hub. During data preprocessing, we first dealt with the missing values in the data and then with the outliers, to make the dataset trainable. We then scaled our independent features.
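The preprocessing steps described (filling missing values, taming outliers, scaling the independent features) might look like this in pandas. The column names, interpolation strategy and percentile thresholds are assumptions for the example, not taken from the paper.

```python
# Illustrative preprocessing sketch: fill missing values, clip outliers,
# then min-max scale the independent features. Columns and thresholds
# are assumed for the example, not the authors' actual choices.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "irradiance": [200.0, np.nan, 850.0, 5000.0],   # 5000 is an outlier
    "ambient_temp": [28.0, 30.0, np.nan, 33.0],
    "power_kw": [1.2, 2.5, 4.1, 4.0],               # dependent feature
})

# 1. Missing values: interpolate along the time axis, backfill any edges.
df = df.interpolate().bfill()

# 2. Outliers: clip each independent feature to its 1st-99th percentiles.
features = ["irradiance", "ambient_temp"]
for col in features:
    lo, hi = df[col].quantile([0.01, 0.99])
    df[col] = df[col].clip(lo, hi)

# 3. Scaling: min-max scale the independent features to [0, 1].
df[features] = (df[features] - df[features].min()) / (
    df[features].max() - df[features].min()
)

print(df.round(3))
```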

Q

Could you elaborate on the specific metrics you used to compare the performance of your model with traditional models?

We went through a rigorous comparative analysis using standard performance metrics, MSE (mean squared error), RMSE (root mean squared error), MAE (mean absolute error) and the R2 score (coefficient of determination), to compare the results of the proposed model with the traditional models.
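All four metrics are straightforward to compute. A minimal NumPy version, using a toy example rather than the paper’s data:

```python
# Minimal implementations of the four comparison metrics: MSE, RMSE,
# MAE and the R^2 (coefficient of determination) score.
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "R2": 1.0 - np.sum(err ** 2) / ss_tot,
    }

# Toy comparison (not the paper's data): each prediction is off by 0.5
truth = [10.0, 12.0, 14.0, 16.0]
print(regression_metrics(truth, [10.5, 11.5, 14.5, 15.5]))
# MSE 0.25, RMSE 0.5, MAE 0.5, R2 0.95
```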

Q

How did you ensure that the comparison between your model and other standard deep learning models was fair?

The comparison is fair because we followed standard rules for comparing performance: we used the same hyperparameters and the same dataset for our proposed model and the other models.

Q

What was the size and nature of the dataset used for training and evaluation? How did you handle any potential biases in the data?

We used a time series dataset containing fourteen independent features and one dependent feature. The dataset contained data with a frequency of fifteen minutes from 1 January 2022 to 31 December 2022 with an 11-hour window (6:00 a.m. to 5:00 p.m.) as most of the data outside this window is near zero due to very low solar irradiance.
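Restricting a 15-minute series to such a daylight window is a one-liner in pandas; the sketch below uses synthetic data for a single day rather than the real measurements.

```python
# Sketch of keeping only the 6:00 a.m. - 5:00 p.m. window of a
# 15-minute series, since night-time readings are near zero.
# Synthetic single-day data stands in for the real dataset.
import numpy as np
import pandas as pd

idx = pd.date_range("2022-01-01", periods=96, freq="15min")  # one day
series = pd.Series(np.random.rand(96), index=idx)

daylight = series.between_time("06:00", "17:00")
print(len(series), len(daylight))  # 96 readings, 45 in the 11-hour window
```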

Q

Can you provide more details on how you validated the model’s performance across different geographical positions?

We tested our trained model with other datasets collected from solar plants situated in Durgapur, West Bengal, India, about 150 km from our institute. We also tested our model with a published dataset collected from Denmark and found that our model gives similar results.

Q

Did you identify any limitations or weaknesses of your model during testing? How do you plan to address these in future work? Are there any particular conditions or types of data where your model may not perform as well?

Very few models, if any, are flawless. In our work, we faced some limitations during testing: when there were abrupt changes in the weather parameters, we got slightly different results, but overall performance was quite good.

We believe that our model is trained with a very small amount of data. We are trying to extend our work with a large amount of data to improve the efficiency of our model and also we are exploring other possible ways to make the model more robust.

Q

How do you plan to address the computational demands of training and running an ensemble model based on BI-LSTM, especially for large-scale applications?

Our model is quite lightweight, containing only 1.2 million parameters. So we believe that it will not be an issue during large-scale implementation, though this is subject to further investigation.

Q

What do you believe are the most novel aspects of your model compared to existing deep learning approaches?

Most of the deep learning models available for solar energy prediction generally use only weather parameters to predict solar power production, but we trained our model with some extra features derived from the physical characteristics of solar panels, which sets it apart from the state-of-the-art models.

Q

How do you see your model influencing future research in the field of time series forecasting and deep learning?

The first author of this work is a JRF in the Department of IT, IIEST, Shibpur, and is in the early stage of his PhD under the guidance of Dr Chandan Giri; this work was published just two months ago. We have also received some responses from other researchers, both in India and abroad. We believe this work will help researchers explore other dimensions, looking beyond a specific dataset to the background scientific knowledge of specific domains.