Implementing Simple Linear Regression without any Python Machine Learning libraries

Today we are going to implement the most popular and most straightforward regression technique, simple linear regression, purely in Python. By purely in Python, I mean without using any machine learning libraries.

When I say simple linear regression, what comes to your mind? Let me guess 😛

  • Simple linear regression is simple to implement.
  • Understanding simple linear regression is easier than understanding linear regression.
  • In terms of time complexity, simple linear regression takes less time to process.

That is probably the analysis you did when I said simple linear regression. Those assumptions may be technically reasonable, but there is a particular reason for calling it simple linear regression. First, let’s understand why it has that name. Then we can start my favorite part: coding simple linear regression in Python.

What is simple linear regression?

In the linear regression analysis article, we mainly concentrated on explaining the linear regression concepts. We used the equation below as the general linear regression equation.

\hat{y} = {w}_{0} + {w}_{1} * {x}

The above equation closely resembles the equation of a straight line.

y = m * x + c

Where m is the slope of the straight line, and c is the constant value. If we compare the two equations, we can see how close they are: they differ only in notation, and everything else is the same.

In linear regression, the m ({w}_{1}) value is known as the coefficient and the c ({w}_{0}) value is called the intercept. In the above equation, we have only one dependent variable and one independent variable. That’s why we have only one coefficient.

  • Dependent variable –> y or \hat{y}
  • Independent variable –> x or  {x}

If we have k independent variables, we get k coefficient values. When more than one independent variable is used to predict the dependent value, this article calls it a linear regression problem (elsewhere it is often called multiple linear regression). When only one independent variable is used to predict the dependent value, it is a simple linear regression problem.

Let me give a few more examples to show the difference between linear regression and simple linear regression problems.

Simple linear regression examples

  • Using the number of rooms to predict the house price.
    • The number of rooms is the independent variable and the price is the dependent variable.
  • Using the number of hours a student studied to predict the marks percentage the student will get.
    • The number of hours is the independent variable and the marks percentage is the dependent variable.
  • Using the time of day to predict the temperature outside your room.
    • Time is the independent variable and temperature is the dependent variable.

Linear regression examples

  • Using features like the number of rooms, the age of the house in years, and the garden space to predict the house price.
    • The number of rooms, the age in years, and the garden area are the independent variables, and the house price is the dependent variable.
  • Using the number of hours a student spent on English, Mathematics, and Physics to predict the marks percentage the student will get.
    • The hours spent on English, Mathematics, and Physics are the independent variables, and the student’s marks percentage is the dependent variable.
  • Using the time and climate details to predict the temperature outside your room.
    • Time and the climate details are the independent variables, and the temperature is the dependent variable.

With the above explanation, I hope I addressed the difference between simple linear regression and linear regression.

In short:

Simple Linear Regression: Having one independent variable to predict the dependent variable.

Linear Regression: Having more than one independent variable to predict the dependent variable.
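
The difference can also be sketched in code. The two function names below (predict_simple and predict_linear) and the numbers passed to them are hypothetical, chosen only to illustrate one coefficient versus k coefficients.

```python
def predict_simple(w0, w1, x):
    """Simple linear regression: one independent variable, one coefficient."""
    return w0 + w1 * x


def predict_linear(w0, coefficients, features):
    """Linear regression with k independent variables and k coefficients."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per independent variable")
    return w0 + sum(w * x for w, x in zip(coefficients, features))


# Hypothetical numbers, used only to show the two call shapes.
print(predict_simple(10.0, 2.0, 3.0))                # one feature
print(predict_linear(10.0, [2.0, 0.5], [3.0, 4.0]))  # two features
```

With a single feature, predict_linear reduces to predict_simple, which is exactly the sense in which simple linear regression is the one-variable case.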

Now let’s build simple linear regression in Python without using any machine learning libraries.

To implement simple linear regression, we need to know the formulas below.

  • A formula for calculating the mean value.
  • A formula for calculating the variance value.
  • A formula for calculating the covariance between two series of readings (say, X and Y).
  • Formulas for calculating the {w}_{0} and {w}_{1} values.

Formula for calculating mean value

\textrm{mean(x)} = \frac{(x_1)+ (x_2)+(x_3) ... + (x_n)} {n}

Formula for calculating the variance value

\sigma^2 = \frac{\displaystyle\sum_{i=1}^{n}(x_i - mean(x))^2} {n-1}

Formula for calculating covariance between two series of readings

cov_{x,y}=\frac{\displaystyle\sum_{i=1}^{n}(x_{i}-mean(x))(y_{i}-mean(y))}{n-1}

Formula for calculating the {w}_{0} and {w}_{1} values

{w}_1 = \frac{covariance(x,y)} {variance(x)}

{w}_0 = mean(y) - (w_1 * mean(x))
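
As a quick hand check of these formulas, take three made-up readings, x = (1, 2, 3) and y = (3, 5, 7). These numbers are only an illustration and are not taken from the house price dataset.

```latex
mean(x) = \frac{1 + 2 + 3}{3} = 2, \qquad mean(y) = \frac{3 + 5 + 7}{3} = 5

variance(x) = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3-1} = \frac{2}{2} = 1

covariance(x,y) = \frac{(1-2)(3-5) + (2-2)(5-5) + (3-2)(7-5)}{3-1} = \frac{4}{2} = 2

{w}_1 = \frac{covariance(x,y)}{variance(x)} = \frac{2}{1} = 2

{w}_0 = mean(y) - ({w}_1 * mean(x)) = 5 - (2 * 2) = 1
```

The fitted line is \hat{y} = 1 + 2 * x, which reproduces all three y readings exactly because the made-up points lie on a straight line.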

We are going to use all the formulas listed above to implement simple linear regression purely in Python without any machine learning libraries.

To implement simple linear regression in Python, we will first implement each of the above formulas as a function. Then we will use those functions to build the simple linear regression model.

After that, we will use a Python tabular analysis package to implement the same simple linear regression model in a few lines of code. We can treat that as a check on the previous implementation.

Let’s start building the required functions in this order.

  • Mean Function.
  • Variance Function.
  • Covariance Function.
  • Functions to calculate the {w}_{0} and {w}_{1} values.

Function to Calculate the mean value

  • The cal_mean function calculates the mean value of the given readings.
  • We calculate the sum of the readings and store it in readings_total.
  • We find number_of_readings by using the len function.
  • We use the readings_total and number_of_readings values to calculate the mean.
  • Finally, we return the calculated mean value.
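
A minimal sketch of cal_mean that follows these steps could look like this. The parameter name readings and the empty-input check are assumptions; the other names come from the description above.

```python
def cal_mean(readings):
    """Return the arithmetic mean of a sequence of numeric readings."""
    readings_total = sum(readings)
    number_of_readings = len(readings)
    if number_of_readings == 0:
        # Assumed guard: the mean of no readings is undefined.
        raise ValueError("cannot take the mean of zero readings")
    return readings_total / float(number_of_readings)


print(cal_mean([1, 2, 3, 4]))  # mean of four made-up readings
```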

Function to Calculate the Variance Value

  • The cal_variance function calculates the variance of the given readings.
  • Using the already implemented cal_mean function, we calculate the mean value.
  • Then we calculate the difference between each reading and the mean value, square that difference, and store the squared values in mean_difference_squared_readings.
  • Finally, we sum the mean_difference_squared_readings and return that sum divided by the number of readings minus 1.
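
One way to write cal_variance along these lines is sketched below. The short cal_mean helper is repeated so the snippet runs on its own, and the check for fewer than two readings is an assumption added for safety.

```python
def cal_mean(readings):
    """Mean helper, repeated here so the snippet runs on its own."""
    return sum(readings) / float(len(readings))


def cal_variance(readings):
    """Return the sample variance (n - 1 in the denominator) of the readings."""
    if len(readings) < 2:
        # Assumed guard: with one reading the denominator n - 1 would be zero.
        raise ValueError("sample variance needs at least two readings")
    readings_mean = cal_mean(readings)
    # Squared distance of every reading from the mean.
    mean_difference_squared_readings = [
        (reading - readings_mean) ** 2 for reading in readings
    ]
    variance_sum = sum(mean_difference_squared_readings)
    return variance_sum / float(len(readings) - 1)


print(cal_variance([2, 4, 6]))  # variance of three made-up readings
```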

Function to Calculate the Covariance Value

  • The cal_covariance function calculates the covariance between two series of readings, say readings_1 and readings_2.
  • We use the already implemented cal_mean function to calculate the means of readings_1 and readings_2.
  • Then, for each pair of readings, we multiply the difference between the readings_1 value and its mean by the difference between the readings_2 value and its mean, and sum these products.
  • Finally, we return that sum divided by the number of readings minus 1 (readings_size – 1).
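
A sketch of cal_covariance under the same assumptions follows. The length checks are additions of mine, and cal_mean is repeated so the block is self-contained.

```python
def cal_mean(readings):
    """Mean helper, repeated here so the snippet runs on its own."""
    return sum(readings) / float(len(readings))


def cal_covariance(readings_1, readings_2):
    """Return the sample covariance between two equally long series."""
    readings_size = len(readings_1)
    if readings_size != len(readings_2):
        raise ValueError("both series must have the same number of readings")
    if readings_size < 2:
        raise ValueError("sample covariance needs at least two readings")
    readings_1_mean = cal_mean(readings_1)
    readings_2_mean = cal_mean(readings_2)
    covariance = 0.0
    # Sum the products of the paired deviations from each series' mean.
    for reading_1, reading_2 in zip(readings_1, readings_2):
        covariance += (reading_1 - readings_1_mean) * (reading_2 - readings_2_mean)
    return covariance / float(readings_size - 1)


print(cal_covariance([1, 2, 3], [3, 5, 7]))  # two made-up series
```

The covariance of a series with itself equals its sample variance, which makes a handy sanity check for this function.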

With the above functions, we are ready to calculate the simple linear regression coefficients, the {w}_{0} and {w}_{1} values. Once we have implemented that, we can use those values to perform the prediction.

Functions to calculate the {w}_{0} and {w}_{1} values

  • From the above formulas for calculating {w}_{0} and {w}_{1}, we create the cal_simple_linear_regression_coefficients function.
  • To calculate the {w}_{1} value, we divide the covariance of x_readings and y_readings by the variance of x_readings.
  • Using {w}_{1}, we calculate the {w}_{0} value.
  • Finally, we return the {w}_{0} and {w}_{1} values.
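
Putting the pieces together, cal_simple_linear_regression_coefficients might be sketched as follows. The compact helpers are repeated so the block runs on its own, and the zero-variance check is an assumption.

```python
def cal_mean(readings):
    return sum(readings) / float(len(readings))


def cal_variance(readings):
    readings_mean = cal_mean(readings)
    squared = [(reading - readings_mean) ** 2 for reading in readings]
    return sum(squared) / float(len(readings) - 1)


def cal_covariance(readings_1, readings_2):
    mean_1, mean_2 = cal_mean(readings_1), cal_mean(readings_2)
    total = sum((a - mean_1) * (b - mean_2) for a, b in zip(readings_1, readings_2))
    return total / float(len(readings_1) - 1)


def cal_simple_linear_regression_coefficients(x_readings, y_readings):
    """Return (w0, w1), the intercept and the coefficient of the fitted line."""
    x_variance = cal_variance(x_readings)
    if x_variance == 0:
        # Assumed guard: identical x readings leave the slope undefined.
        raise ValueError("x_readings are all identical, so w1 is undefined")
    # w1 = covariance(x, y) / variance(x)
    w1 = cal_covariance(x_readings, y_readings) / x_variance
    # w0 = mean(y) - w1 * mean(x)
    w0 = cal_mean(y_readings) - (w1 * cal_mean(x_readings))
    return w0, w1


# Made-up readings that lie exactly on the line y = 1 + 2 * x.
w0, w1 = cal_simple_linear_regression_coefficients([1, 2, 3], [3, 5, 7])
print(w0, w1)
```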

Now let’s use all the functions implemented above to predict the house price using the simple linear regression technique.

Predicting House Price With Simple Linear Regression In Python

We are using the same house price dataset as in the linear regression implementation in Python article.

Let’s first load the dataset and see which features it contains. To load the dataset, we are going to use pandas.

  • We set input_path to the location of the dataset.
  • Using input_path, we load the data into a pandas data frame.
  • Next, we call the simple_linear_regression function with the loaded data frame.
  • For now, the simple_linear_regression function only gets the header names and prints them.
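
A rough sketch of this loading step is shown below. It assumes pandas is installed. The file name house_prices.csv and the three data rows are made-up placeholders written to a temporary folder so the snippet can run anywhere; only the column names square_feet and price come from this article.

```python
import os
import tempfile

import pandas as pd


def simple_linear_regression(dataset):
    """For now, only report the header names of the loaded data frame."""
    dataset_headers = list(dataset.columns)
    print("Dataset Headers ::", dataset_headers)
    return dataset_headers


# Placeholder rows standing in for the real dataset (values are made up).
csv_text = "square_feet,price\n150,6450\n200,7450\n250,8450\n"

with tempfile.TemporaryDirectory() as folder:
    input_path = os.path.join(folder, "house_prices.csv")  # assumed file name
    with open(input_path, "w") as handle:
        handle.write(csv_text)
    # Load the CSV file into a pandas data frame.
    house_price_dataset = pd.read_csv(input_path)

headers = simple_linear_regression(house_price_dataset)
```

With the real dataset, input_path would simply point at the downloaded CSV file instead of a temporary one.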

If pandas is set up on our system, we can expect the output below.

Script Output

From the script output, we know that we have one independent variable (square_feet) and one dependent variable (price). Our intention is to use the square_feet and price readings to calculate the simple linear regression coefficients. Then we are going to use the calculated coefficients to predict the house price.

Now let’s write a simple function to visualize how the price of the house varies with square_feet. We are going to use a plotly scatter plot for the visualization.

Now let’s call the scatter_graph function with the square_feet readings as the x parameter and the price readings as the y parameter.

Now let’s use the house price dataset to model the simple linear regression.

  • In the simple_linear_regression function, we use the already implemented cal_mean function to calculate the mean of square_feet and price.
  • Next, we use the already implemented cal_variance function to calculate the variance of square_feet and price.
  • After that, we calculate the {w}_{0} and {w}_{1} values.
  • We use the {w}_{0} and {w}_{1} values to perform the prediction, which is nothing but predicting the house price for a given square_feet value.
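
The modeling steps above can be sketched end to end as follows. To stay free of dependencies, this version takes plain lists rather than the pandas data frame used in the article, the predict_price helper is a hypothetical name, and the readings are made up so that they lie exactly on a straight line.

```python
def cal_mean(readings):
    return sum(readings) / float(len(readings))


def cal_variance(readings):
    readings_mean = cal_mean(readings)
    squared = [(reading - readings_mean) ** 2 for reading in readings]
    return sum(squared) / float(len(readings) - 1)


def cal_covariance(readings_1, readings_2):
    mean_1, mean_2 = cal_mean(readings_1), cal_mean(readings_2)
    total = sum((a - mean_1) * (b - mean_2) for a, b in zip(readings_1, readings_2))
    return total / float(len(readings_1) - 1)


def simple_linear_regression(square_feet, price):
    """Fit price = w0 + w1 * square_feet and return (w0, w1)."""
    square_feet_mean = cal_mean(square_feet)
    price_mean = cal_mean(price)
    square_feet_variance = cal_variance(square_feet)
    price_variance = cal_variance(price)
    print("means     ::", square_feet_mean, price_mean)
    print("variances ::", square_feet_variance, price_variance)
    if square_feet_variance == 0:
        # Assumed guard: identical square_feet readings give no slope.
        raise ValueError("square_feet readings are all identical")
    w1 = cal_covariance(square_feet, price) / square_feet_variance
    w0 = price_mean - (w1 * square_feet_mean)
    return w0, w1


def predict_price(square_feet_value, w0, w1):
    """Hypothetical helper: apply the fitted line to one square_feet value."""
    return w0 + w1 * square_feet_value


# Made-up readings, not the article's dataset.
square_feet = [150, 200, 250, 300, 350]
price = [6450, 7450, 8450, 9450, 10450]

w0, w1 = simple_linear_regression(square_feet, price)
predicted = predict_price(400, w0, w1)
print("w0, w1    ::", w0, w1)
print("prediction for 400 square feet ::", predicted)
```

Because the made-up points are perfectly linear, the fitted line passes through every one of them; real house price data will leave some error around the line.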

Check out the complete code below.

Give me five 🙂 With this, we have implemented simple linear regression without any machine learning libraries.

You can fork the complete code from our GitHub: simple linear regression code

I hope you like this post. If you have any questions, then feel free to comment below.  If you want me to write on one particular topic, then do tell it to me in the comments below.
