Wind Power Forecasting Model Through Data Analysis of Causes and Impacts of Global Warming

In this article, we aim to convey the need for change by informing the reader about global warming and its consequences. Global warming is a phenomenon in which average global temperatures rise due to the increase in greenhouse gases (GHG) emitted into and trapped in the atmosphere. These gases are distinctive in that they absorb heat that would otherwise escape, so more of the sun's energy stays within the atmosphere. The most influential of these gases is CO2, on which most of the data analysis in this article focuses. Many problems are attributed to global warming, and a wide array of data relates to the topic. By analyzing such data, one can identify the global trend in climate change and even come up with potential solutions to this increasingly threatening phenomenon. The lack of action taken by world leaders is due to general negligence and the absence of an effective and coherent solution to tackle climate change. A probable solution would be to find a substitute for the main cause of GHG emissions, which is the substantial exploitation of conventional (nonrenewable) energy sources. Renewable energy sources are excellent substitutes for conventional energy sources, and one of the most effective types is wind. This energy source is currently implemented in many areas around the globe and is producing better results than in the past when conventional energy sources were in use. Although wind turbines are more efficient than solar panels in producing electricity because they do not depend on the presence of sunlight, the amount of electricity generated can fluctuate due to various factors. To adjust to these varying conditions, machine learning regression algorithms can be used to create a wind power forecasting model that predicts how much electricity can be produced on a given day according to those factors, allowing people to be less dependent on them.
This article explores the trends in climate change and the avenues for change. It must be noted that averting one’s gaze from this urgent issue resolves nothing.


Introduction
Global warming is a phenomenon in which the average annual global temperature increases. Throughout Earth's history, there have been eras, such as the late Mesozoic, when the Earth's temperature was higher than it is now, at 35.5°C [1]. However, the global warming we face today differs in that it is not natural; rather, it is caused by human activity [2]. Even a small increase in the average annual global temperature can have detrimental environmental consequences that affect both humanity and nature. In terms of humanity, global warming will devastate many low-lying population centers and possibly entire nations. One example of a country under severe threat is Bangladesh: its flat, low-lying landscape makes it especially vulnerable to even a small rise in sea level, while its extremely high population density, with nowhere for people to go, will result in a humanitarian crisis on an unparalleled scale. Heat waves will get longer and more frequent, sea levels will rise, rainstorms will intensify, and the coral reefs will suffer irreparable damage [3]. Yet at present, global warming receives little attention precisely when concern is most needed. As COVID-19 spread, it drew attention away from almost all other problems, leaving them to cause damage unchecked. Among these, global warming stands out as one of the most calamitous: it lost even the limited awareness it once had, an awareness already so slight that the crisis risked being overlooked. Neglect, however, does not mean that the problem stands still; global warming persists and, even worse, is deteriorating [4]. The issue must be given notice by the whole world, and a feasible solution must be proposed and put into operation.
The greenhouse effect is the fundamental cause of global warming [5]. When the Earth receives thermal energy from the sun's radiation, the gases in the Earth's atmosphere prevent a certain amount of the heat from leaving, acting as insulation for the planet [6]. Carbon dioxide is an especially crucial factor in global warming, as it accounts for up to 80% of the total greenhouse gases (GHGs) in the atmosphere [7]. As more GHGs accumulate in the atmosphere, the amount of heat trapped increases, resulting in global warming. This paper discusses the urgency of global warming to bring it to the public's attention and puts forward a solution to the issue. The key to resolving the problem of global warming is to find an alternative to the causes of GHG emissions, which include the mass utilization of conventional energy sources. Hence, the substitute would be renewable energy sources (RES) [8]. Renewable energy allows the mitigation of GHG emissions and hence alleviates the severity of global warming [9]. As a 100% conversion from nonrenewable to renewable energy sources is possible via technological improvements, the world as a whole could gain ecological, social, and political benefits, among others [10].

The Problems
The figure above shows significant keywords related to global warming. Hundreds of articles per topic were drawn from Google News and imported as data. The topics are global warming, climate change, and carbon dioxide emission. Keywords were selected from 100 articles on each topic, combined, and placed in a cloud-shaped picture. The font size of a keyword denotes its significance. The most conspicuous keyword in the picture is 'climate change', which implies that the Earth is experiencing a serious transition. Additionally, China and Europe are notable keywords in the picture. This sheds light on the importance of China and Europe within the current global situation.
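The keyword counting behind such a word cloud can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three headlines, the stop-word list, and the helper names `keyword_counts` and `font_sizes` are hypothetical, and the sketch counts single words, whereas the figure also contains multi-word keywords such as 'climate change'.

```python
from collections import Counter
import re

# Hypothetical stand-in headlines; the study drew its articles from Google News.
articles = [
    "Climate change drives record heat across Europe",
    "China pledges new limits on carbon dioxide emission",
    "Global warming and climate change threaten coral reefs",
]

# Small illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"and", "on", "the", "across", "new", "of", "in"}


def keyword_counts(texts):
    """Count lowercase word occurrences across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


def font_sizes(counts, min_size=10, max_size=60):
    """Scale each count linearly so the most frequent word gets max_size."""
    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


counts = keyword_counts(articles)
sizes = font_sizes(counts)
print(counts.most_common(3))
```

The more often a word appears across the articles, the larger its font size, which is the sense in which font size denotes significance in the figure.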

Lack of Attention
As shown in Figure 2, 'Global Warming' and 'Greenhouse Gas' are receiving significantly less attention than other keywords in the United States. The 'pytrends' package in Python gives access to search-trend data, that is, how often a keyword was googled in a month in a certain region; in this graph, the region is the United States and the keywords are 'COVID-19', 'Black Lives Matter', 'BTS', 'Global Warming', and 'Greenhouse Gas'. 'COVID-19' had the highest search volume among the five keywords, followed by 'Black Lives Matter', 'BTS', 'Global Warming', and 'Greenhouse Gas', in that order. Searches for 'COVID-19' began around mid-February 2020 and peaked around August 2020. 'BTS', meanwhile, remained constant for the whole period shown in the graph. 'Global Warming' was searched significantly less than 'BTS', meaning that it received less attention, but it was given more concern before COVID-19 spread from its origin. The little fluctuation it had flattened to zero once the immediate catastrophe, COVID-19, became known to the public. Here, the inherent lack of awareness of 'Global Warming' and of the closely related 'Greenhouse Gas' is plainly visible. This is a clear indicator of the lack of public attention to 'Global Warming', despite its being of a different magnitude in severity and significance compared to other issues that are receiving more recognition.
Figure 3 shows the strength of correlation between the variables on the x and y axes, with 1 being the strongest relationship possible and 0 meaning no correlation whatsoever. All the boxes where the x and y variables are the same have a correlation strength of 1, since a variable is perfectly correlated with itself. However, there is also one off-diagonal combination with a correlation strength of 1: the correlation between carbon dioxide and nitrogen oxide. It also appears that the further toward the bottom right of the figure, the stronger the correlation.
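A correlation grid of this kind can be sketched as follows, assuming the Pearson correlation coefficient was used (the text does not name the measure). The column names and values are made up for illustration; they are not the study's data. Note that Pearson's coefficient ranges from -1 to 1, so a 0-to-1 scale would correspond to its magnitude or to data without negative relationships.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up illustrative values, not measurements from the study.
data = {
    "co2": [1.0, 2.0, 3.0, 4.0],
    "nox": [2.0, 4.0, 6.0, 8.0],    # exactly proportional to co2
    "other": [4.0, 1.0, 3.0, 2.0],  # only loosely related to co2
}
matrix = correlation_matrix(data)
```

The diagonal entries equal 1 because each variable is compared with itself, and two variables that move in exact proportion, like the hypothetical `co2` and `nox` columns here, also reach 1 off the diagonal.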

Severity
As depicted in Figure 4, both the frequency of heat waves (the average number of heat waves per year) and their intensity (the average temperature above the local threshold during heat waves) are increasing. Both graphs show an increase across the decades from the 1960s to the 2010s. In the 'Frequency by Decade' graph, a dramatic rise is visible from the lowest mark, 2.04 heat waves per year (2 d.p.), to the highest point, 6.02 heat waves per year (2 d.p.). This means that the average number of heat waves per year increased rapidly from the 1960s to the 2010s. In the 'Intensity by Decade' graph, on the other hand, a gradual rise is visible from the lowest point, 1.99°C (2 d.p.) in the 1960s, to a peak of 2.46°C (2 d.p.) in the 2010s. Whether dramatic or gradual, both the average number of heat waves per year and the average temperature above the local threshold during heat waves have risen over the past decades, and the global warming problem is worsening as time passes.
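The contrast between the dramatic rise in frequency and the gradual rise in intensity can be made concrete with a short calculation. The sketch below uses only the four values quoted above; the helper name `percent_change` is ours.

```python
def percent_change(start, end):
    """Relative change from start to end, expressed in percent."""
    return (end - start) / start * 100


# Values quoted in the text for the lowest and highest marks in Figure 4.
frequency_change = percent_change(2.04, 6.02)   # heat waves per year
intensity_change = percent_change(1.99, 2.46)   # degrees C above threshold

print(f"frequency: {frequency_change:.1f}%")
print(f"intensity: {intensity_change:.1f}%")
```

On these figures, heat wave frequency roughly tripled (an increase of about 195%), while intensity rose by about 24%.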

Emission Rates
As presented in Figure 5, various sectors together make up total GHG emissions. Figure 5 shows GHG emissions by sector, with the sub-sectors on the y-axis and each sub-sector's global share, in percent, on the x-axis; here a 'sub-sector' is a source of GHG emissions. Many of the sub-sectors involve the combustion of fossil fuels. Of the top five, three sub-sectors, 'Road', 'Residential', and 'Unallocated fuel combustion', relate directly to the burning of fossil fuels: the majority of automobiles on the road, domestic machinery, and the sources grouped under 'Unallocated fuel combustion' run on fossil fuels. The range of shares is wide, from the highest sub-sector, 'Road', which contributes 11.9%, to the lowest, 'Grassland', which accounts for 0.1% of total GHG emissions. Figure 6 shows the average annual greenhouse gas emission rates of China, the United States, South Korea, and the United Kingdom, together with the average of 193 countries. China and the United States show far higher emission rates than the other countries. Note also the abrupt increase in China's emission rate: the rate in 2018 is approximately 4 times the rate in 1991. Other countries such as South Korea and the United Kingdom show relatively static and low emission rates. Even so, the average emission rate is lower than those of both South Korea and the United Kingdom; since the average is based on 193 countries, this indicates that the majority of countries have relatively low emission rates. According to the Greenhouse Gas Equivalencies Calculator made by the EPA, the total emission is equal to the greenhouse gas emissions of 10,278 passenger vehicles driven for one year, and to the CO2 sequestered by 781,467 tree seedlings grown for 10 years.
Figure 7(A) has a shape similar to the logistic growth curve, which finds applications in a range of fields, including biology, economics, and geoscience. The inflection point, in 1997, implies that carbon dioxide emissions would not increase indefinitely, because growth slows after the inflection point and the curve approaches an asymptote. The plateau was in fact reached from 2003 to 2007, after which emissions gradually decreased. It is important to note that all 3 countries showed a more or less upward trend in average land temperatures, peaking in the 2000s. The United States' land temperatures show how human settlement changes the natural environment as the country developed and the industrial revolution molded the world. The data also show increasing stability over time: land temperatures used to fluctuate by as much as +800% or -1000%, but the closer to the present, the less extreme the increases and decreases become, amounting to an increase of 150% at most. However, it must be noted that the average land temperature has increased as a whole in the more recent decades.
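The logistic growth curve mentioned for Figure 7(A) can be sketched as below. Only the inflection year, 1997, comes from the text; the ceiling of 100 (arbitrary units) and the growth rate of 0.5 per year are assumptions chosen for illustration, not values fitted to the emissions data.

```python
import math


def logistic(t, ceiling, rate, midpoint):
    """Logistic growth curve: S-shaped rise that levels off at `ceiling`."""
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))


# Assumed parameters; only the 1997 inflection point is taken from the text.
CEILING, RATE, MIDPOINT = 100.0, 0.5, 1997

for year in (1987, 1992, 1997, 2002, 2007):
    print(year, round(logistic(year, CEILING, RATE, MIDPOINT), 1))

# Yearly growth is largest around the inflection point and shrinks afterwards.
growth_at_inflection = logistic(1998, CEILING, RATE, MIDPOINT) - logistic(1997, CEILING, RATE, MIDPOINT)
growth_later = logistic(2007, CEILING, RATE, MIDPOINT) - logistic(2006, CEILING, RATE, MIDPOINT)
```

At the inflection point the curve sits at half its ceiling and is rising fastest; after it, each year adds less than the one before, which is why an inflection point suggests an approaching plateau. A pure logistic curve never declines, so it describes the rise and plateau only, not the later decrease in emissions.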

Consequences
As demonstrated in Figure 9, the average glacier volume is gradually decreasing each year relative to that in 1950. Figure 9 shows the average volume of four glaciers (South Cascade Glacier, Gulkana Glacier, Wolverine Glacier, and Lemon Creek Glacier) with respect to that in the year 1950. There is a visible downward trend and in fact an acceleration in the rate at which the glaciers are melting, due to increasing energy demand and, consequently, increasing greenhouse gas emissions. There are specific periods, around 1997 and 2008, where the graph behaves anomalously and the rate of decrease slows. These roughly coincide with two major financial crises (the 1997 Asian financial crisis and the 2008 global financial crisis), when human activity was significantly impaired. This is an excellent piece of data showing that human activity such as manufacturing, and the gas emissions it causes, has direct consequences. As shown in Figure 10, the crude summer death rate is considerable, and this, in light of Figure 8, is due to the increase in global land temperature. Figure 10 presents the crude summer death rate per million for both the general population and the age 65+ population due to high temperature over the years 1999 to 2018. As shown, the death rate for the 'Age 65+ Population' is higher than for the 'General Population'. These deaths were caused by 'Heat-related Cardiovascular Disease', and more than a third of all heat deaths are attributable to climate change, which is due to global warming [11]. While the 'General Population' shows a moderate change in summer death rate over the period, the 'Age 65+ Population' fluctuates sharply. The summer death rate of the 'General Population' shows a rather small difference between its peak, 1.08% (2 d.p.) in 1999, and its lowest mark, 0.08% (2 d.p.) in 2004. On the other hand, the summer death rate of the 'Age 65+ Population' reached a peak of 5.98% (2 d.p.) in 1999 and fell to its lowest point, only 0.28% (2 d.p.), in 2004. A conspicuous similarity between the two populations is that the highest and lowest marks fall in the same years, 1999 and 2004, respectively.

Possible Solution for Reducing CO2 Emission
Renewable energy is energy obtained from resources that are replenished through natural means. It includes solar energy, wind energy, geothermal energy, and more [12], the most prominent of these being solar and wind. This technology is already being used today, and it is showing great results. Although solar panels are more popular, wind energy is the more efficient of the two overall. Not only do wind turbines generate more electricity than solar panels, but they also work day and night, unlike solar panels, and they release less CO2 into the air [13].

Theoretical and Actual Analysis of Wind Energy
The above figure represents the correlation between the theoretical wind data and the actual wind data. The closer the value of a cell is to 1, the more closely the two corresponding factors agree. The color of a cell becomes duller as its value moves further from 1. The main diagonal of the grid consists only of 1s, since each factor is perfectly correlated with itself. Other pairs of factors show weaker relations, but notice that the values are close to 1 near the center of the grid. Since the factors are arranged in no particular order, the pattern near the center is a mere coincidence.

Machine Learning Regression Algorithm
In this section, a machine learning prediction model was implemented to predict the power generated from the features in Figure 11 using three algorithms: K-nearest neighbors regression, linear regression, and XGB regression. K-nearest neighbors regression, often called KNN regression, approximates the outcome for a given value of the independent variable by assessing the data points in its neighborhood. It does this by averaging the outcomes of the data points in that neighborhood and assigning the average to the given input [14].
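The neighborhood averaging described above can be sketched in plain Python. This is a minimal illustration, not the implementation used in the study: the single feature (wind speed in m/s), the power values in kW, and the function name `knn_predict` are all hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict by averaging the targets of the k training points nearest to query."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical training data: wind speed (m/s) -> generated power (kW).
train_X = [(3.0,), (5.0,), (7.0,), (9.0,), (11.0,)]
train_y = [50.0, 200.0, 500.0, 900.0, 1300.0]

# With k=2, a query at 6.0 m/s averages the targets at 5.0 and 7.0 m/s.
prediction = knn_predict(train_X, train_y, (6.0,), k=2)
print(prediction)
```

Because the prediction is a local average, the choice of `k` trades smoothness against sensitivity to individual points: a small `k` follows the training data closely, while a large `k` flattens the prediction toward the overall mean.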
Linear regression is an algorithm with a function similar to KNN: it also predicts an outcome, the dependent variable, when it is given the independent variable. The main difference is that it fits a single linear trend to the whole data set. Its advantages are that it performs remarkably well when the relationship in the data is linear and that it is uncomplicated to train. Its deficiencies are that it captures only the linear relationship between the dependent and independent variables and that it is susceptible to noise, that is, to a low signal-to-noise ratio (SNR) [15].
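For a single feature, fitting a linear regression reduces to the ordinary least-squares formulas for slope and intercept, sketched below. The data points and the function name `fit_line` are illustrative assumptions; the study's model used the features in Figure 11.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # extrapolate to x = 5
print(slope, intercept, prediction)
```

When the data follow a straight line, as in this toy example, the fit is exact. When the underlying relationship is curved, the same two numbers cannot capture it, which is the limitation noted above.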
Another prediction algorithm is XGB regression, which uses gradient-boosted decision trees to calculate the most likely value of the dependent variable from an independent variable when given a data set [16]. Each tree makes a sequence of splitting decisions that narrow the dependent variable down to a more specific range, eventually arriving at an answer. This type of prediction algorithm is good for classification and is better suited for less extreme numbers than the other types of prediction algorithms [17].
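The idea of boosting decision trees can be illustrated with a deliberately simplified sketch. This is not the XGBoost library or its algorithm in full: it is plain gradient boosting for squared error, using one-split trees (stumps) on a single feature, with hypothetical data and function names. Each round fits a stump to the current residuals and adds a damped version of it to the model.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit stumps to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    """Sum the base value and the damped contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: wind speed (m/s) -> generated power (kW).
xs = [3.0, 5.0, 7.0, 9.0, 11.0]
ys = [50.0, 200.0, 500.0, 900.0, 1300.0]

model = fit_boosted(xs, ys)
baseline_mse = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mean_squared_error(ys, [predict_boosted(model, x) for x in xs])
```

Each stump is a weak predictor on its own, but because every round targets what the previous rounds got wrong, the combined training error falls well below that of the mean-only baseline. Libraries such as XGBoost add deeper trees, regularization, and many optimizations on top of this basic scheme.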
A wind power forecasting model built with these three machine learning regression algorithms helps manage the complexity of wind fluctuation. The model predicts how much wind power can be generated according to the different factors shown in Figure 11. Wind, despite its superiority among renewable energy sources in steadiness, still fluctuates in the amount of power it generates due to those factors. The forecasting model will allow people to rely more on wind energy than on the conventional energies that contribute greatly to global warming. The process of teaching an algorithm to predict consists largely of two stages: training and testing. The given dataset is divided into a 'train dataset' and a 'test dataset' for these two stages. The 'train dataset' is "used to fit the machine learning models" and the 'test dataset' is "used to evaluate" how the machine learning model performs in reality [18]. K-neighbors regression scored 86.9% (3 s.f.) in training and 57.7% (3 s.f.) in testing for accuracy, while linear regression scored 44.1% (3 s.f.) in training and 43.8% (3 s.f.) in testing. XGB regression, on the other hand, excelled with 99.8% (3 s.f.) in training and 95.3% (3 s.f.) in testing. Of the three machine learning regression algorithms, XGB regression thus had the highest accuracy for predicting wind energy generation on both the training and testing datasets.
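The split-then-score procedure can be sketched as follows. Two assumptions are made here: the 25% test fraction is arbitrary, and the score is taken to be the coefficient of determination (R²), a common default for regression models; the text reports "accuracy" without naming the metric. The helper names and the toy rows are hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def r2_score(actual, predicted):
    """Coefficient of determination: 1 is a perfect fit, 0 matches the mean."""
    mean = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Toy rows of (feature, target); stand-ins for the wind dataset.
rows = [(float(x), 2.0 * x + 1.0) for x in range(20)]
train, test = train_test_split(rows)

# Score a hypothetical model that has learned the true relation y = 2x + 1.
test_actual = [y for _, y in test]
test_predicted = [2.0 * x + 1.0 for x, _ in test]
test_score = r2_score(test_actual, test_predicted)
print(len(train), len(test), test_score)
```

Scoring on rows the model never saw during fitting is what exposes overfitting: a large gap between training and testing scores, as reported above for K-neighbors regression, suggests the model has memorized the training data more than it has learned the underlying relationship.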

Conclusion
Global warming is a fast-approaching problem. Data from across the globe show that global warming is real and threatening. Because of this, it is crucial to create and initiate possible solutions, or at least to slow the progress of global warming. The most probable solution is to use renewable energy to replace fossil fuels. From the collected data, it seems that this will be the most helpful. Additionally, it is possible to estimate how effective wind energy will be against global warming through coding and data analysis. For future work, we plan to implement a prediction model that can predict the total power from various renewable resources.