Last Updated: 2/6/2023
Regression
What is Regression?
Regression is a way of predicting what might happen in the future. Let's say you want to know how much you will grow in the next year. You might look at how tall you are now and how tall you were last year and use that information to guess how tall you will be next year. That's a straightforward example of regression. In more complicated cases, we might use a bunch of different information to try and make a prediction. For example, we can use information about the weather or about how much you eat and exercise to predict how much you will grow. Regression is a way of using all this information to make the best possible prediction.
"Regression is one of the main types of supervised learning in machine learning. In regression, the training set contains observations (also called features) and their associated continuous target values. The process of regression has two phases: The first phase is exploring the relationships between the observations and the targets. This is the training phase. The second phase is using the patterns from the first phase to generate the target for a future observation. This is the prediction phase." — Python Machine Learning By Example, Yuxi Hayden Liu
Examples
- Regression can be used to predict how much a student's test scores will improve over the course of a school year, based on their previous test scores and how much they study.
- We can predict how much a company's sales will increase over the next quarter, based on their previous sales and other factors like the state of the economy.
- In healthcare, regression can estimate how much a patient's blood pressure will decrease after starting a new medication, based on their previous blood pressure readings and other factors like their age and weight. Such an estimate could help the patient decide to improve their eating habits or exercise more before problems become serious.
- One of the most popular regression problems is predicting the price of a house, based on its size, location, and other factors that might affect its value.
- You can also predict how much rainfall a particular region will receive in the next year, based on past rainfall patterns and other factors like the El Niño weather pattern. If you include the date or time as another feature, the problem becomes forecasting, which we will study as a separate use case.
These are just a few examples, but regression can be used to make predictions in many different fields and situations. We invite you to tell us which other problems can be solved with regression and to share them on our social networks. #MLGRegressionInRealLife
An example of regression in a non-programming context could be trying to estimate the weight of an object based on its volume. In this case, the object's weight would be the dependent variable you are trying to predict, and the object's volume would be the independent variable. By measuring the object's volume and using a continuous estimation model, you could estimate the object's weight with a certain level of accuracy. As humans, we calculate regressions all the time, for example, estimating how long it will take to get to a place based on distance or how long it has taken us on other occasions. We also approximate the age of people by looking at their faces, tone of voice, or how they dress.
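The weight-from-volume example can be sketched in a few lines of Python. This is a minimal illustration, assuming a straight-line relationship and using made-up measurements; the function names `fit_line` and `predict_weight` are hypothetical and chosen only for this example.

```python
# Hypothetical measurements of objects made of the same material.
volumes_cm3 = [10.0, 20.0, 30.0, 40.0]   # independent variable
weights_g = [27.0, 54.0, 81.0, 108.0]    # dependent variable


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(model, volume):
    """Estimate the weight of an object from its volume."""
    slope, intercept = model
    return slope * volume + intercept


model = fit_line(volumes_cm3, weights_g)
estimated_weight = predict_weight(model, 50.0)
print("slope (g per cm^3):", model[0])
print("estimated weight for 50 cm^3:", estimated_weight)
```

Here the slope plays the role of the material's density. Real measurements would be noisy, so the estimate would carry some error.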
"Another typical task is to predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is called regression. To train the system, you need to give it many examples of cars, including both their predictors and their labels (i.e., their prices)." — Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Aurélien Géron
Regression predicts a numeric value (a quantitative variable) from a fixed set of input variables; that is, it finds a relationship between one dependent variable and one or more independent variables. This relationship could be linear or nonlinear (for example polynomial, logarithmic, or exponential). Regression algorithms learn the input-output relationship from a collection of labeled examples. Because regression is a supervised problem, most algorithms need the expected labels during training. Once trained, the algorithm can take an unlabeled sample as input and predict its label.
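To see why the shape of the relationship matters, the sketch below fits both a straight line and a quadratic curve to the same synthetic data and compares their errors. It assumes NumPy is installed; the data and the helper `sum_squared_error` are invented for illustration.

```python
import numpy as np

# Synthetic data with a curved (quadratic) relationship: y = 2x^2 + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x**2 + 1.0

# Fit a degree-1 (linear) and a degree-2 (polynomial) model.
linear_coeffs = np.polyfit(x, y, deg=1)
quadratic_coeffs = np.polyfit(x, y, deg=2)


def sum_squared_error(coeffs):
    """Sum of squared differences between the data and the fitted curve."""
    residuals = y - np.polyval(coeffs, x)
    return float(np.sum(residuals**2))


linear_sse = sum_squared_error(linear_coeffs)
quadratic_sse = sum_squared_error(quadratic_coeffs)
print("linear fit error:   ", linear_sse)
print("quadratic fit error:", quadratic_sse)
```

On this data the quadratic model should fit far better than the straight line, because it matches the form of the underlying relationship.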
Remember:
- Regression problems predict continuous values like the price of houses or cars.
- A regression algorithm takes labeled examples and finds the relationship between the features (independent variables) and the label (dependent variable).
- Once the algorithm has learned the patterns between features and labels, it can predict the value of a new house or car, provided it resembles the training data (for example, the same region or country).
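The points above can be sketched with a small house-price example that uses more than one feature. This is only an illustration: the houses, prices, and feature choice (area and number of bedrooms) are hypothetical, and the model is a plain linear least-squares fit with NumPy's `np.linalg.lstsq`, assuming NumPy is available.

```python
import numpy as np

# Hypothetical training set: [area in m^2, number of bedrooms] per house.
features = np.array(
    [
        [50.0, 1.0],
        [80.0, 2.0],
        [120.0, 3.0],
        [100.0, 2.0],
        [60.0, 2.0],
    ]
)
# Hypothetical labels: the known sale price of each house.
prices = np.array([75000.0, 110000.0, 155000.0, 130000.0, 90000.0])

# Training: append a column of ones so the model can learn an intercept,
# then solve the least-squares problem for the weights.
design_matrix = np.column_stack([features, np.ones(len(features))])
weights, *_ = np.linalg.lstsq(design_matrix, prices, rcond=None)

# Prediction: estimate the price of an unlabeled house (90 m^2, 3 bedrooms).
new_house = np.array([90.0, 3.0, 1.0])
predicted_price = float(new_house @ weights)
print("learned weights (area, bedrooms, intercept):", weights)
print("predicted price:", predicted_price)
```

The learned weights describe how much each feature contributes to the price. The prediction is only as trustworthy as the similarity between the new house and the training examples.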
"Regression is a problem of predicting a real-valued label (often called a target) given an unlabeled example. Estimating house price valuation based on house features, such as area, the number of bedrooms, locations and so on is a famous example of regression." — The Hundred-Page Machine Learning Book, Andriy Burkov.
Regression in Real Life
- [Kaggle] Linear Regression
- [Kaggle] Vehicle Dataset
- [Kaggle] House Prices
- [Kaggle] Fish Market
- [Kaggle] Human Resources
- [Kaggle] Bike Rents for the Day
If you want to learn more about continuous estimation (regression), check out the following resources:
Textbooks
Videos