Wednesday, January 20, 2021

Machine learning

 

Machine learning (ML) is a type of artificial intelligence (AI) that allows software applications to become more accurate at predicting outcomes without being explicitly programmed to do so. 

 

Machine learning algorithms use historical data as input to predict new output values.

 




Applications of Machine Learning:

Dynamic pricing
Product recommendations
Traffic prediction
Self-driving cars
Email spam and malware filtering

 

 

 

Regression models

What is regression?

Regression is a technique for finding the best relationship between the input variables (the predictors) and the output variable (the response or target variable).

The best relationship is the one that gives the smallest difference between the predicted and the actual values.

Regression analysis is an important tool for modelling and analyzing data. Here, we fit a curve or line to the data points in such a way that the distances of the data points from the curve or line are minimized.

It also indicates how strongly each of several independent variables affects a dependent variable.
 
 

Linear regression


Linear regression is a supervised learning algorithm used mostly in predictive analysis. It fits the best straight line between the input and output variables in order to model the data.

This best-fitting straight line, also known as the regression line, minimizes the sum of the squared errors of prediction.

A characteristic of linear regression is that the output variable should be continuous.

Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using this best-fit straight line.
 

 

 

 

 

 

Best fit - line or curve:

1. The line or curve covers the maximum number of points.

2. The distance to the remaining points is minimized (the error, measured as the sum of squared errors, SSE).
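
To make the second criterion concrete, here is a minimal sketch that compares the SSE of two candidate lines on a handful of made-up points. The data values and the helper name sse are assumptions for illustration; they are not part of the dataset used later in this post.

```python
import numpy as np

# Made-up data points for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def sse(m, c):
    """Sum of squared errors between the actual y and the line m*x + c."""
    predicted = m * x + c
    return float(np.sum((y - predicted) ** 2))

sse_a = sse(2.0, 0.0)  # candidate line A: y = 2x
sse_b = sse(1.0, 2.0)  # candidate line B: y = x + 2

print('SSE of line A:', sse_a)
print('SSE of line B:', sse_b)
```

Line A has the smaller SSE, so by this criterion it is the better fit of the two.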

 

 

 

 

 

 

 

The linear regression model is written as:

y = mx + c + e

where y is the output variable, x is the input variable, m is the slope of the line, c is the intercept, and e represents the error in the model. With several input variables, each one gets its own slope.

 

 


 Find Slope & Intercept:
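
For a single input variable, the least-squares slope and intercept have a well-known closed form: m is the sum of (x - mean of x) * (y - mean of y) divided by the sum of (x - mean of x) squared, and c is (mean of y) - m * (mean of x). The sketch below applies these formulas to made-up numbers (the data and variable names are assumptions for illustration) and cross-checks the result against numpy.polyfit.

```python
import numpy as np

# Made-up points that lie exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

x_mean = x.mean()
y_mean = y.mean()

# Closed-form least-squares estimates for one input variable
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
c = y_mean - m * x_mean

# Cross-check with NumPy's degree-1 polynomial fit (returns slope, intercept)
m_check, c_check = np.polyfit(x, y, 1)

print('slope:', m, 'intercept:', c)
```

Because the points sit exactly on a line, both approaches recover the same slope and intercept; with noisy data they would still agree with each other, but e would no longer be zero.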

 

 

 

 Linear Regression Python code:



#Importing Needed packages

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

from sklearn import linear_model

 

 

#Reading the data in

path='C:/TrainingDocs/MachineLearningwithPython/homeprices1.csv'

df = pd.read_csv(path)

df

 

# summarize the data

df.describe()

 

#plot graph for datapoints

%matplotlib inline

plt.xlabel('area (sq ft)')

plt.ylabel('price (US$)')

plt.scatter(df.Area,df.Price,color='red',marker='+')

 

 

#Using sklearn package to model data

#Fit the model on the whole dataset: Area is the input, Price is the output

reg=linear_model.LinearRegression()

reg.fit(df[['Area']],df.Price)

 

 

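#Predict price for area (3300 sq feet)

#Pass a DataFrame with the same column name used in fit() to avoid a feature-name warning

reg.predict(pd.DataFrame({'Area': [3300]}))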

 

#Plot the fitted regression line on top of the scatter plot

plt.xlabel('Area',fontsize=20)

plt.ylabel('Price',fontsize=20)

plt.scatter(df.Area,df.Price,color='red',marker='+')

plt.plot(df.Area,reg.predict(df[['Area']]),color='blue')

 

 

print('Coefficients: ', reg.coef_)

print('Intercept: ', reg.intercept_)

 

#Check Value of Coefficients

reg.coef_

 

#Check Value of intercept

reg.intercept_

 

#Validate the linear equation by hand

#y = m*x + c, using the coefficient and intercept printed above

134.07534247*3300+176232.87671232875

 

#Predict price based on given area

path1='C:/TrainingDocs/MachineLearningwithPython/areas.csv'

d= pd.read_csv(path1)

d.head(3)

 

p=reg.predict(d)

d['prices'] = p

path2='C:/TrainingDocs/MachineLearningwithPython/prediction.csv'

d.to_csv(path2,index=False)
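
The CSV files above live on the author's machine, so here is a self-contained sketch of the same fit, predict and validate steps using a small in-memory table. The five rows are made-up values chosen for illustration and are not taken from homeprices1.csv.

```python
import pandas as pd
from sklearn import linear_model

# Made-up rows for illustration; not taken from homeprices1.csv
df = pd.DataFrame({
    'Area':  [1000, 1500, 2000, 2500, 3000],
    'Price': [200000, 270000, 330000, 410000, 480000],
})

reg = linear_model.LinearRegression()
reg.fit(df[['Area']], df['Price'])

# Predict through the model for a 2200 sq ft home
predicted = reg.predict(pd.DataFrame({'Area': [2200]}))[0]

# Same prediction by hand: y = m*x + c
manual = reg.coef_[0] * 2200 + reg.intercept_

print('model :', predicted)
print('manual:', manual)
```

The two printed numbers should match, which is exactly what the hand validation in the post checks.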

 

 


Multiple linear regression:

 

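Multiple linear regression uses more than one input variable. Here the goal is to predict the price of a 40-year-old home with an area of 3000 sq ft and 3 bedrooms.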
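
With three input variables, the single-variable model y = mx + c + e extends to one slope per variable. Written out for this example (the subscripted symbols are introduced here for illustration):

```latex
y = m_1 x_1 + m_2 x_2 + m_3 x_3 + c + e
```

where x_1 is the area, x_2 the number of bedrooms, x_3 the age of the home, and m_1, m_2, m_3 are the corresponding coefficients.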

 

 Multiple linear regression Python Code:



#Importing Needed packages

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

from sklearn import linear_model

 

 

 

#Reading the data in

path='C:/TrainingDocs/MachineLearningwithPython/homeprices1.csv'

df = pd.read_csv(path)

df

 

#Clean the data: compute the median number of bedrooms, rounded down

import math

median_bedrooms=math.floor(df.Bedrooms.median())

median_bedrooms

 

#Replace missing (NaN) values in the Bedrooms column with the median

df.Bedrooms=df.Bedrooms.fillna(median_bedrooms)

df
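
To see what the cleaning step does, here is a minimal sketch on a made-up Bedrooms column with one missing value (the numbers and the name toy are assumptions for illustration).

```python
import math

import numpy as np
import pandas as pd

# Made-up column with one missing entry
toy = pd.DataFrame({'Bedrooms': [3.0, 4.0, np.nan, 3.0, 5.0]})

# median() skips NaN by default: the median of [3, 3, 4, 5] is 3.5, floor gives 3
median_bedrooms = math.floor(toy['Bedrooms'].median())

# Fill the gap with the rounded-down median
toy['Bedrooms'] = toy['Bedrooms'].fillna(median_bedrooms)

print(toy)
```

Rounding down keeps the filled value a whole number of bedrooms, which is presumably why the post uses math.floor.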

 

 

#Train the model on Area, Bedrooms and Age

reg=linear_model.LinearRegression()

reg.fit(df[['Area','Bedrooms','Age']],df.Price)

 

 

 

print('Coefficients: ', reg.coef_)

print('Intercept: ', reg.intercept_)

 

 

 

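#Predict for 3000 sq ft, 3 bedrooms, 40 years old

#Column names and order must match the ones used in fit()

reg.predict(pd.DataFrame({'Area': [3000], 'Bedrooms': [3], 'Age': [40]}))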

 

 

#Validate the multiple regression equation by hand

#y = m1*x1 + m2*x2 + m3*x3 + c, using the coefficients and intercept printed above

3000*137.25+3*-26025+40*-6825+383724.9999999998
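
The same hand calculation can be written as a dot product, which stays readable as the number of input variables grows. This sketch hard-codes the coefficients and intercept shown in the validation line above (with the intercept rounded to 383725) rather than reading them from a fitted model.

```python
import numpy as np

# Values copied from the validation line; order is Area, Bedrooms, Age
coefficients = np.array([137.25, -26025.0, -6825.0])
intercept = 383725.0  # rounded from 383724.9999999998

# 3000 sq ft, 3 bedrooms, 40 years old
features = np.array([3000, 3, 40])

# y = m1*x1 + m2*x2 + m3*x3 + c
price = np.dot(coefficients, features) + intercept

print(price)
```

With a fitted model, reg.coef_ and reg.intercept_ would take the place of the hard-coded numbers.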

 

 
