Machine Learning and Its Importance for Sustainable Energy Systems
It is often said that artificial intelligence (AI) is the future, but in truth, AI is no longer the future: it is the present. And no, the rise of AI doesn’t mean that robots will take over the world and end humanity. Rather, AI already surrounds us, from Google’s search algorithms to Spotify’s recommendation system, in ways that would have been hard to imagine 20 years ago.
Human intelligence is beyond incredible, and although people have created crises like climate change, poverty, and wars, they are also responsible for phenomenal technological innovations. The ability to do things like watch videos, turn on a light, send an email, and look up the definition of a word online can all be credited to human intelligence. Humans have the cognitive abilities to learn, understand, and apply logic, and if we could give these traits to something “artificial” like a machine, then think of where we could be in another 20 years.
Demystifying Machine Learning
Humans are often afraid of what they don’t understand, and there can be some mystery and uncertainty around new tech like AI. But if we take the time to understand this tech, then it becomes clearer and clearer that it would be difficult for a robot programmed to pour coffee to suddenly become evil and create a deadly weapon. If a machine is given data or programmed to do something specific, it’s pretty much impossible for it to take over the world (unless that’s what it’s told to do…but that’s a fairly difficult task to code).
One of the most popular AI techniques is known as machine learning (ML). ML is just a subset of AI that solves tasks by learning from data and making predictions. A machine can therefore learn without being programmed with explicit instructions. The machine is also likely not what you are thinking of — rather, it is a program, or an algorithm, with the purpose of identifying patterns and relationships in the data that it has been given.
ML allows software applications to become more accurate at predicting outcomes by using historical data as input. The machine imitates the way that humans learn, improving in performance and accuracy over time. A common example is image-to-text recognition. Say we present a machine with images of handwritten letters from a to z and ask it to tell us which letter is being shown. The machine will likely guess a random letter at first, like ‘c’, but then we give it the right answer (say, ‘n’), and the next time it is presented with the letter ‘n’, it will say ‘n’ instead of ‘c’. It goes on like this until the model has “learned” the alphabet.
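A toy sketch of that correct-and-repeat loop in Python (the “feature” strings and letters here are made up for illustration; a real model would generalize from pixel data rather than memorize answers):

```python
import random

class ToyLetterLearner:
    """A deliberately simple 'machine' that guesses, gets corrected, and remembers."""
    def __init__(self):
        self.memory = {}  # maps a (hypothetical) feature description -> correct letter

    def predict(self, features):
        # If we've been corrected on this pattern before, recall the answer;
        # otherwise guess a random letter, like the 'c' in the example above.
        if features in self.memory:
            return self.memory[features]
        return random.choice("abcdefghijklmnopqrstuvwxyz")

    def learn(self, features, correct_letter):
        # The "we give it the right answer" step.
        self.memory[features] = correct_letter

learner = ToyLetterLearner()
learner.learn("two strokes, diagonal bridge", "n")
print(learner.predict("two strokes, diagonal bridge"))  # prints "n"
```

Real ML models replace the lookup table with parameters that are nudged toward the right answer, which is what lets them handle letters they have never seen before.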
Importance of Machine Learning
Machine learning is important because it is the core sub-field of AI. It enables computers to get into a self-learning mode without explicit programming and gives companies views of trends in customer behaviour and operational patterns. Being able to both collect customer data and associate it with behaviours over time gives algorithms the ability to help enterprises adjust product development and marketing strategies to customer demand. For example, if a company can see 60% of its consumers looking at the same boots over a period of two weeks, maybe it’s a sign to increase supply of those boots.
ML is at the core of many large corporations’ operations, from Google to Uber to IBM. Facebook’s recommendation engine? That’s ML. Recommender systems are ML models that help users discover new products and services. That’s also Netflix for you. And Amazon, and Instagram. ML is used to personalize how each member’s feed is delivered. If someone looks at a facial care item on Amazon, similar facial care items will start popping up on their feed. What’s happening is that an engine is trying to reinforce patterns in the consumer’s online behaviour. If that member doesn’t look at skin care for the next few weeks but becomes really engrossed with miniature Barbie shoes, the feed will start serving up Barbie shoes. Now, and even more so in the years to come, ML is a necessary competitive edge.
Machine Learning vs Deep Learning
Artificial intelligence in general can get confusing because it’s a huge umbrella topic encompassing a lot. That’s why, when I decided I wanted to get more into AI, I had to define what this actually meant for me. It’s impossible to just “learn AI”. You have to be specific about it and understand the type of model you’re creating. Sometimes machine learning and deep learning can get confused with one another — so what really is the difference?
As we’ve covered, ML is a branch of AI built on the foundation of systems learning from data, identifying patterns, and making decisions with little human intervention. Deep learning, or DL, on the other hand, is a sub-field of ML models that leverage multi-layered neural networks (NNs). It’s important to note that neural networks and artificial neural networks (ANNs) are pretty much used interchangeably — the more “correct” term is an artificial neural network, because a neural network is what human brains contain, and ANNs try to artificially reproduce this in ML. ANNs are, as the name implies, inspired by the biology of human brains. However, where a human brain lets any neuron connect to any other neuron, ANNs organize their neurons into distinct layers.
Breaking Down ANNs
ANNs are made of nodes (also called “artificial neurons”) organized into an input layer, one or more hidden layers, and an output layer. Each node connects to other nodes, and each layer contains one or more neurons. Put simply, nodes are organized into layers to form a network. An ANN with two or three layers is called a basic neural network, and an ANN with more than three layers is called a deep neural network. With more hidden layers come more computational and problem-solving abilities — and of course, more complexity.
The input layer takes in parameters, which can be loaded from an external source such as a web service or a CSV file, and the output layer produces an outcome. The hidden layers, of which there are one or more, are sandwiched between these two. These hidden layers are what make a neural network a neural network! They allow you to model complex data, and they are called “hidden” because their node values are not observed in the training data; the only values we know are the inputs and outputs. In a hidden layer, artificial neurons take in weighted inputs and biases and produce an output through an activation function. That may sound confusing, which is why the next section covers a bit of neural network lingo.
ANN Lingo
Weights control the strength of connection between two neurons. Inputs are multiplied by weights, which determines how much influence the input has on the output. Weights reflect how important an input is and are useful for conveying the significance of features in predicting output values. Features with weights closer to 0 have less importance in the prediction process, and features with weights that have a larger value have more importance. For example, if you are trying to predict the size of house someone will choose, your input features may include location, income, age, and number of residents. You may weigh the income higher than age because you think it is more important.
Biases, conversely, come from an additional input unit whose value is always 1 (a constant). The bias unit allows a neuron to activate even when every input is zero. For example, if you wanted your ANN to return 3 when the input is 0, you can add a bias of 3. The bias shifts the result of the activation function towards the positive or negative side, much like the y-intercept in a line equation.
An activation function transforms a neuron’s weighted sum before it is passed on to the next layer. Activation functions add non-linearity to the output, meaning ANNs can solve non-linear problems (which is what lets them scale to complex tasks). Without an activation function, an ANN is just a linear regression model, which does not have the same capabilities.
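Putting these three terms together: a single artificial neuron is just a weighted sum of its inputs, plus a bias, passed through an activation function. A minimal sketch in Python, with input values and weights invented purely for illustration (echoing the house-size example, where income is weighted more heavily than age):

```python
import math

def sigmoid(z):
    # A common activation function: squashes the weighted sum into (0, 1),
    # adding the non-linearity discussed above.
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of inputs, plus the bias, passed through the activation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical inputs (income, age), with income weighted higher than age.
output = neuron(inputs=[0.8, 0.3], weights=[0.9, 0.1], bias=0.5)
print(round(output, 3))  # prints 0.777
```

A full network is just many of these neurons wired layer to layer, with the weights and biases adjusted during training instead of being hand-picked.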
How DL and ML Algorithmically Differ
DL and ML differ in how their algorithms learn. DL is sometimes called “scalable machine learning” because it automates the majority of the feature extraction, minimizing the need for manual human intervention and maximizing the size of data sets that can be used. ML is much more dependent on human intervention and structured data. Remember, DL and ML are not two separate “types” of AI: DL is, as we mentioned, “deep machine learning” so it’s kind of like more efficient and complex ML.
An ML model works by being fed data and learning from it, so that with time, the model becomes better trained because it is continually learning. Because ML models are adaptive, they continually evolve and identify patterns in the data; the data itself is essentially the only input they need. On the other hand, an ANN is more complicated because there are several layers of nodes (like we just talked about), where each node 1) classifies the information from the previous layer and 2) passes the results to the nodes in the next layer.
Lastly, ML and DL models are categorized differently. Where ANNs can be classified into feed-forward, recurrent, modular, and convolutional NNs, ML models are categorized into three main groups. We’re about to get to this!
Types of Machine Learning
There are three types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. We’ll talk a bit about each!
Supervised Learning
In supervised learning, an ML algorithm is trained on labelled data. The algorithm is given a training dataset, which is a smaller part of a bigger dataset, made up of inputs paired with correct outputs; this enables the model to learn over time and adjust until its error has been minimized. After this happens, the machine is given new, unseen data, and it applies what it learned from the labelled training data to produce the desired output. A supervised ML algorithm can continue to improve after deployment as it trains itself on new data.
For example, say you want to teach a model to identify the name of a flower from a picture. The first step is training the machine with pictures of a ton of different flowers. The model will have to recognize different attributes, like: if the flower has pink hues and an ice-cream-cone-shaped central structure, then it is labelled “LOTUS”. After training, the model will be shown a new, separate flower and asked to identify it using the information from the data it has been taught. It will classify the flower by its shape and color, affirm the flower’s name as LOTUS, and put it in the lotus flower category.
Supervised learning can be separated into two categories of algorithms: classification and regression. Classification assigns test data into categories, like “pink”, “soft”, “hard”, and “smooth”. Regression is used for understanding the relationship between independent and dependent variables. It is used to predict a continuous outcome, or projection, based on predictor variables.
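To make classification concrete, here is a minimal k-nearest-neighbours classifier (one of the supervised algorithms listed below) in plain Python; the flower measurements and labels are made up for illustration, and a real application would use a library like scikit-learn:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    # train: list of (features, label) pairs. Classify the query point by
    # majority vote among its k nearest labelled neighbours (Euclidean distance).
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labelled data: (petal length, petal width) -> species.
train = [((1.4, 0.2), "setosa"), ((1.3, 0.2), "setosa"),
         ((4.7, 1.4), "versicolor"), ((4.5, 1.5), "versicolor")]

print(knn_classify(train, (1.5, 0.3)))  # prints "setosa"
```

Notice the supervised ingredients: every training example carries a correct answer, and the prediction for new data comes directly from those labelled examples.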
The different types of supervised learning are:
- Regression
- Logistic Regression
- Classification
- Support Vector Machine
- Naive Bayes Classifiers
- Decision Trees
- K-NN
- Random Forest Model
- Neural Networks
Supervised learning can be very powerful as it allows data collection and output with the help of experience. It can solve a variety of computation problems and is especially helpful for binary classification (dividing data into two categories), multi-class classification (a classification task with more than two classes), regression modelling (for the prediction of continuous values), and ensembling (combining multiple hypotheses to form a better hypothesis).
Challenges with supervised learning include the time and expertise needed to train supervised learning models. There is also an increased chance of human error, and the model cannot cluster data on its own. The difference between classification and clustering is that classification uses pre-assigned classes, and clustering identifies similarities between objects and then groups them according to the characteristics they have in common.
Unsupervised Learning
Unsupervised learning is a type of ML where the data provided to the algorithm is not classified or labelled. This means that the algorithm must identify patterns and information in the training data set on its own; it both analyzes and clusters these unlabelled data sets.
As an example, let’s say we give a machine images of T-shirts and dresses. The machine doesn’t know the difference between these, so instead it can categorize them into two groups based on the similarities and differences that it detects. There is no training data for this, so the model works on its own to find patterns.
Unsupervised learning requires far less human intervention than supervised learning, and is optimal for clustering (splitting a dataset into groups based on detected similarities/patterns), association mining (finding patterns and relationships between variables in big databases), anomaly detection (identifying data points that deviate from a data set’s normal behaviour), and dimensionality reduction (reducing the number of input variables in a data set). In situations where it doesn’t make sense or is impossible for a human to propose trends in a data set, unsupervised learning can provide unique insights.
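To see clustering in action, here is a minimal k-means sketch in plain Python on one-dimensional made-up data. The algorithm is told only how many groups to find, never what those groups mean, which is exactly the unlabelled setting described above:

```python
def kmeans_1d(points, k=2, iters=10):
    # Naive initialization: take the first k points as starting centroids.
    centroids = points[:k]
    clusters = []
    for _ in range(iters):
        # Assignment step: put each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

# Hypothetical unlabelled measurements with two natural groups.
print(kmeans_1d([10, 11, 12, 48, 50, 52]))  # prints [[10, 11, 12], [48, 50, 52]]
```

No labels were ever supplied; the two groups emerge purely from the structure of the data, which is the core idea behind the algorithms listed next.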
The different types of unsupervised learning are:
- K-means clustering
- Hierarchical clustering
- Anomaly detection
- Neural Networks (note that these can be supervised or unsupervised)
- Principal Component Analysis
- Independent Component Analysis
- Apriori algorithm
Challenges with unsupervised learning include longer training times, computational complexity, and less accurate results. Also, even though the algorithm splits the data, it won’t necessarily tell you how it did so, or what the similarities are in the clusters it has created.
Reinforcement Learning
Thirdly there is reinforcement learning (RL), which is an ML method based on rewards (when it does a desired behaviour) and punishments (when it does an undesired behaviour). An RL agent can perceive and interpret its environment and learn through the process of trial and error. You can think of it like a game with rewards and penalties for its actions. The agent wants to maximize the total reward, so when it does an undesirable action it will learn from its mistakes. This shares a similarity with unsupervised ML in that the model has to figure out how to perform a task to maximize the given reward; it is given no hints, suggestions, or training data. In short, RL is about learning optimal behaviour in an environment to obtain the maximum reward.
Two important RL learning models are the Markov Decision Process and Q-learning. RL is often used in areas such as robotics, teaching bots to play video games, and helping companies plan their allocation of finite resources. It’s valuable because it helps you find which situation needs an action and which action yields the highest reward. The main challenge is preparing the actual simulation environment; the more complex the model, the more realistic the simulator has to be. Transferring the model from a training environment to a real-world environment is a difficult task, and altering the NN that controls the agent can also be tricky because the only way to communicate with the network is through rewards and penalties. Thus, in some scenarios, new knowledge = erased old knowledge, which is obviously not great!
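As a rough sketch of Q-learning, here is a tiny agent learning to walk right along a five-cell corridor. The environment, the reward of +1 at the last cell, and all hyperparameter values are invented for illustration:

```python
import random

random.seed(0)

# A 5-cell corridor: start at cell 0, actions are step left (-1) or right (+1),
# and the only reward is +1 for reaching cell 4.
n_states, actions = 5, [-1, +1]
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

for _ in range(200):  # episodes of trial and error
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        a = (random.choice(actions) if random.random() < epsilon
             else max(actions, key=lambda act: Q[(s, act)]))
        s_next = min(max(s + a, 0), n_states - 1)  # walls clamp the position
        r = 1.0 if s_next == 4 else 0.0            # reward only at the goal
        # Q-learning update: nudge Q toward reward + discounted best future value.
        best_next = max(Q[(s_next, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy steps right from every non-goal cell.
print([max(actions, key=lambda act: Q[(s, act)]) for s in range(4)])
```

The agent is never told “go right”; it discovers that policy purely from the reward signal, which is the trial-and-error learning the paragraph above describes.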
Applications in Energy
As I mentioned, AI is an umbrella topic — but after all the different types of ML discussed, we now know that ML too is a pretty huge subject! That’s why when looking into leveraging ML, it’s often a good idea to use it for a specific industry, problem, or purpose. Something I’m super interested in is energy, and as it turns out, ML could be a huge game-changer for determining energy demands, predicting anomalies, and optimizing prices. We’ll go over each in brief.
Predicting Energy Demands
One use of ML algorithms is predicting energy demand on a particular day. We have a lot of new energy sources out there now, but the rise in global population is creating skyrocketing demand for energy. This creates a real need for efficient storage to fully utilize the energy that we do have, because if electricity isn’t stored, it has to be used the moment it’s generated.
Let’s take the example of solar energy. Solar production peaks at mid-day, which causes demand for other energy sources to drop off. This drop in demand is sometimes called the “duck curve”, because as solar capacity increases, the curve increasingly resembles a duck’s belly. Solar floods the market when the sun shines but drops off just as electricity demand peaks in the evening. The more solar energy is exported to the grid, which generally happens across the middle of the day when the sun is shining, the deeper the curve becomes.
By tracking daily energy consumption changes for individual customers over a prolonged period of time, ML models can generate accurate energy demand forecasts. This is super valuable for manufacturing companies, which can use it to optimize their operations, price their products correctly, and understand where their operations need to expand. Poor forecasting can lead to lost money, time, and customers.
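As the simplest possible stand-in for such a forecast, here is a least-squares trend fit on a made-up week of daily consumption readings; real demand models are far more sophisticated, but the idea of extrapolating from historical data is the same:

```python
def fit_line(ys):
    # Ordinary least squares for y = m*x + b, with x = 0..n-1 (days).
    n = len(ys)
    xs = range(n)
    x_mean, y_mean = (n - 1) / 2, sum(ys) / n
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    return m, y_mean - m * x_mean

# Hypothetical week of daily consumption readings (kWh).
usage = [20, 22, 21, 24, 25, 27, 26]
m, b = fit_line(usage)
print(round(m * 7 + b, 1))  # forecast for day 7: prints 28.1
```

A production forecaster would also account for weather, seasonality, and per-customer behaviour, but even this toy trend line shows how historical data becomes a prediction.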
Predicting Equipment Failures and Anomalies
It can be very difficult to detect malfunctions in energy systems, and when a system does fail, many negative effects can occur, like huge financial losses and even environmental damage from fires or explosions. ML can play an important role in making sense of big amounts of energy data to create predictions about outages, anomalies, and equipment failures. By monitoring and analyzing energy consumption, an algorithm can categorize anomalies and detect problems before their consequences come to fruition, which can help companies run more efficiently. One example is in fusion energy, where ML can help detect anomalies in the plasma.
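A crude sketch of the idea: flag any reading that sits far from the mean in standard-deviation terms. The sensor readings and threshold below are invented, and real systems use far richer models, but the principle of learning “normal” from data and flagging deviations is the same:

```python
import statistics

def anomalies(readings, threshold=2.5):
    # Flag readings whose z-score exceeds the threshold. (2.5 rather than the
    # textbook 3.0, because a single extreme outlier also inflates the standard
    # deviation and can mask itself at stricter thresholds.)
    mean = statistics.fmean(readings)
    sd = statistics.stdev(readings)
    return [r for r in readings if abs(r - mean) / sd > threshold]

# Hypothetical hourly sensor readings; 120 is a simulated equipment fault.
readings = [50, 51, 49, 50, 52, 48, 51, 50, 120]
print(anomalies(readings))  # prints [120]
```

Catching the 120 before it becomes a failed transformer (or worse) is exactly the early-warning value described above.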
Price Optimization
Lastly, neural networks can be leveraged for price optimization models to predict the demand for energy consumption and make pricing recommendations. These models can analyze large amounts of data that humans otherwise could not, make non-linear correlations between supply and demand, and overall help energy companies reach their objectives. ML price optimization techniques can help retailers predict the best prices for services and products to be able to sell in a given time period.
TL;DR
All in all, machine learning has some pretty incredible applications but to be able to properly utilize it, we need to understand its types, algorithmic challenges, and when to use labelled and unlabelled data. Here’s an overview of what we learned:
- ML is a subset of AI that solves tasks by learning from data and making predictions without needing to be programmed with explicit instructions
- An ML model improves in accuracy over time, similar to how humans learn
- Deep learning is a sub-field of ML models that contains multi-layered neural networks, or artificial neural networks (ANNs)
- ANNs are made of nodes organized into an input layer, at least one hidden layer, and an output layer
- ANNs are like “scalable ML models” as they are more powerful but also more complex because they have many layers
- There are three types of ML: supervised learning (which uses labelled data), unsupervised learning (which uses unlabelled data), and reinforcement learning (which is a system based on rewards and punishments)
- ML can be very handy with energy systems, namely for determining energy demands, predicting anomalies, and optimizing prices
Thank you so much for reading this! I’m a 15-year-old passionate about sustainability, and am the author of “Chronicles of Illusions: The Blue Wild”. If you want to see more of my work, connect with me on LinkedIn, Twitter, or subscribe to my monthly newsletter!