Data Prediction using Logistic Regression | Code is written in Python.

What is Logistic Regression?

Logistic regression models the probabilities for classification problems with two possible outcomes.
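To make "models the probabilities" concrete: logistic regression passes a weighted sum of the inputs through the sigmoid function, which squeezes any number into the range 0 to 1. The sketch below uses only the standard library. The weights, bias and helper names are made up for illustration and are not values learned from the dataset.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_probability(features, weights, bias):
    """Probability of class 1: sigmoid of the weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 decision using a cut-off."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0

# Hypothetical parameters, chosen only to show the mechanics
demo_weights = [1.0, 1.0]
demo_bias = -1.0

print(predict_probability([2.0, 0.0], demo_weights, demo_bias))
print(predict_label([2.0, 0.0], demo_weights, demo_bias))
print(predict_label([0.0, 0.0], demo_weights, demo_bias))
```

A probability of 0.5 or more is read as class 1 here, which is the usual default cut-off for a two-class problem.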

In this tutorial, we make predictions with logistic regression. The dataset contains three columns: Age, EstimatedSalary and Purchased. We train a model on this dataset and then use it for prediction: the model takes a person's Age and EstimatedSalary as input and predicts whether that person will make a purchase. 1 means yes and 0 means no.

Step 1: Importing the libraries

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

Step 2: Importing the dataset

dataset = pd.read_csv('Social_Network_Ads.csv')

X = dataset.iloc[:, :-1].values

y = dataset.iloc[:, -1].values

dataset.head() # The dataset contains 300 rows. Here we are printing only the first five.

Step 3: Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
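To see what the split does, here is a small sketch on a made-up array of 8 rows. The names X_demo and y_demo and their values are illustrative only. With test_size = 0.25, a quarter of the rows go to the test set, and random_state = 0 makes the shuffle repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 8 rows, 2 feature columns, and one 0/1 label per row
X_demo = np.arange(16).reshape(8, 2)
y_demo = np.array([0, 1, 0, 1, 0, 1, 0, 1])

X_demo_train, X_demo_test, y_demo_train, y_demo_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

# 25% of 8 rows is 2 rows for testing, leaving 6 for training
print(X_demo_train.shape, X_demo_test.shape)  # (6, 2) (2, 2)
```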

Step 4: Feature Scaling

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()

X_train = sc.fit_transform(X_train)

X_test = sc.transform(X_test)
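The scaling step can be reproduced by hand to show why fit_transform is used on the training set and only transform on the test set: the mean and standard deviation come from the training data alone and are then reused for the test data. This is a sketch with made-up numbers; train_demo and test_demo are hypothetical names.

```python
import numpy as np

# Made-up [Age, EstimatedSalary] rows
train_demo = np.array([[20.0, 20000.0],
                       [30.0, 40000.0],
                       [40.0, 60000.0],
                       [50.0, 80000.0]])
test_demo = np.array([[35.0, 50000.0]])

# Statistics are computed from the training rows only
train_mean = train_demo.mean(axis=0)
train_std = train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# Both sets are scaled with the training statistics
train_scaled = (train_demo - train_mean) / train_std
test_scaled = (test_demo - train_mean) / train_std

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
```

After scaling, each training column has mean 0 and standard deviation 1, so Age and EstimatedSalary end up on comparable scales even though salary values are thousands of times larger.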

Step 5: Training the Logistic Regression model on the Training set

from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression(random_state = 0)

classifier.fit(X_train, y_train)

# Predicting a new result


Output: 0
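Predicting a new result means scaling the new row with the already fitted scaler and then calling predict. Below is a minimal, self-contained sketch of that step. The tiny training set and the new person's values are invented for illustration and are not taken from Social_Network_Ads.csv, and the names X_demo, sc_demo and clf_demo are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Made-up training data: [Age, EstimatedSalary] and whether the person purchased
X_demo = np.array([[22, 20000], [25, 30000], [30, 40000], [35, 60000],
                   [45, 90000], [50, 110000], [55, 130000], [60, 150000]],
                  dtype=float)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit the scaler and the classifier on the training data
sc_demo = StandardScaler()
clf_demo = LogisticRegression(random_state=0)
clf_demo.fit(sc_demo.fit_transform(X_demo), y_demo)

# A new person (hypothetical values); note the double brackets: one row, two columns
new_person = [[40, 70000]]
new_scaled = sc_demo.transform(new_person)  # reuse the fitted scaler, do not refit

predicted_class = clf_demo.predict(new_scaled)        # array with a single 0 or 1
predicted_proba = clf_demo.predict_proba(new_scaled)  # [[P(class 0), P(class 1)]]
print(predicted_class, predicted_proba)
```

The important detail is that the new row goes through the same scaler as the training data. Passing raw values to a model trained on scaled features would give meaningless predictions.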

# Predicting the Test set results

y_pred = classifier.predict(X_test)

print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

Step 6: Making the Confusion Matrix

from sklearn.metrics import confusion_matrix, accuracy_score

cm = confusion_matrix(y_test, y_pred)

print(cm)

print(accuracy_score(y_test, y_pred))
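To show how the confusion matrix and the accuracy relate, the sketch below builds both by hand from two short made-up label lists, with no libraries. Rows are the true class and columns are the predicted class, which is the same layout scikit-learn's confusion_matrix uses. The names y_true_demo and y_pred_demo are hypothetical.

```python
# Made-up true labels and predictions for 8 people
y_true_demo = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred_demo = [0, 1, 1, 1, 0, 0, 0, 1]

# cm_demo[true][predicted]: rows are true classes, columns are predicted classes
cm_demo = [[0, 0], [0, 0]]
for true_label, predicted_label in zip(y_true_demo, y_pred_demo):
    cm_demo[true_label][predicted_label] += 1

# Correct predictions sit on the diagonal: true negatives and true positives
correct = cm_demo[0][0] + cm_demo[1][1]
accuracy_demo = correct / len(y_true_demo)

print(cm_demo)        # [[3, 1], [1, 3]]
print(accuracy_demo)  # 0.75
```

Accuracy is simply the diagonal of the matrix divided by the total number of predictions. The off-diagonal cells show which kind of mistake the model makes.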

