What is Logistic Regression?
Logistic regression models the probability of each outcome in a classification problem with two possible outcomes (binary classification).
In this tutorial, we use logistic regression to make predictions. The dataset contains three columns: Age, EstimatedSalary and Purchased. We train the model on this dataset and then use it to predict whether a person will make a purchase, given their age and estimated salary. A prediction of 1 means yes and 0 means no.
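To make the idea of "modelling probabilities" concrete, here is a minimal sketch of how a logistic model could turn two input features into a probability and then into a 0/1 label. The weights, bias and function names are hypothetical values chosen for illustration; they are not learned from the tutorial's dataset.

```python
import math

def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Probability of class 1 for one sample: sigmoid of a weighted sum."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical weights for two already-scaled features (age, estimated salary).
example_weights = [2.0, 1.0]
example_bias = -1.0

p = predict_probability([0.5, 0.0], example_weights, example_bias)
label = 1 if p >= 0.5 else 0  # threshold the probability at 0.5
print(p, label)
```

The 0.5 threshold is the usual default for turning a probability into a class label; a different cut-off can be used when one kind of error is more costly than the other.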
Step 1: Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Step 2: Importing the dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
dataset.head() # head() shows only the first five rows; the dataset has three columns (Age, EstimatedSalary, Purchased).
Step 3: Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
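To see what test_size = 0.25 means in practice, the following is a rough pure-Python sketch of a shuffled hold-out split. It only mirrors the idea; it does not reproduce scikit-learn's exact shuffling, and the helper name and toy rows are made up for illustration.

```python
import math
import random

def simple_train_test_split(rows, test_size=0.25, seed=0):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # a fixed seed makes the split repeatable
    n_test = math.ceil(test_size * len(rows))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

# Hypothetical (age, estimated salary, purchased) rows, invented for this sketch.
toy_rows = [(22, 30000, 0), (25, 41000, 0), (31, 52000, 0), (38, 61000, 0),
            (42, 75000, 1), (47, 90000, 1), (51, 110000, 1), (58, 130000, 1)]

train_rows, test_rows = simple_train_test_split(toy_rows, test_size=0.25, seed=0)
print(len(train_rows), len(test_rows))
```

With eight rows and test_size = 0.25, two rows are held out for testing and six remain for training. Fixing the seed plays the same role as random_state: the same split is produced on every run.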
Step 4: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
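The reason we call fit_transform on the training set but only transform on the test set is that the scaling statistics should come from the training data alone. Below is a small sketch of that idea for a single feature, assuming standardization means subtracting the mean and dividing by the population standard deviation. The helper names and the ages are hypothetical.

```python
import statistics

def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    return statistics.fmean(column), statistics.pstdev(column)

def apply_scaler(column, mean, std):
    """Standardize values using statistics learned from the training set."""
    return [(value - mean) / std for value in column]

# Hypothetical ages, already split into training and test parts.
train_ages = [20.0, 30.0, 40.0, 50.0]
test_ages = [35.0, 60.0]

mean, std = fit_scaler(train_ages)                  # statistics from training data only
train_scaled = apply_scaler(train_ages, mean, std)
test_scaled = apply_scaler(test_ages, mean, std)    # reuse them, do not refit on test data
print(train_scaled, test_scaled)
```

After scaling, the training values have mean 0 and standard deviation 1. The test values generally do not, which is expected, because they are measured against the training set's statistics.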
Step 5: Training the Logistic Regression model on the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train) # fit the model on the scaled training data
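# Predicting a new result (illustrative input: age 30, estimated salary 87000)
print(classifier.predict(sc.transform([[30, 87000]])))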
# Predicting the Test set results
y_pred = classifier.predict(X_test)
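For readers curious about what "training" involves, here is a minimal sketch that fits a logistic model with plain batch gradient descent on the log-loss. This is not how scikit-learn's LogisticRegression is implemented (it uses more sophisticated solvers and applies regularization by default); the function names, learning rate, epoch count and toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, target in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = p - target  # gradient of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += error * features[j]
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

def predict(X, weights, bias):
    """Label a sample 1 when its predicted probability is at least 0.5."""
    return [1 if sigmoid(bias + sum(w * x for w, x in zip(weights, f))) >= 0.5 else 0
            for f in X]

# Hypothetical, already-scaled features with labels (0 = no purchase, 1 = purchase).
toy_X = [[-1.5, -1.0], [-1.0, -0.5], [-0.5, -1.2], [0.5, 1.0], [1.0, 0.8], [1.5, 1.2]]
toy_y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(toy_X, toy_y)
print(predict(toy_X, weights, bias))
```

On this small, cleanly separable toy set the fitted model labels every training sample correctly; real data such as the ads dataset will usually leave some misclassified points.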
Step 6: Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
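cm = confusion_matrix(y_test, y_pred)
print(cm) # rows are actual classes, columns are predicted classes
print(accuracy_score(y_test, y_pred)) # fraction of correct test predictions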
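To clarify what confusion_matrix and accuracy_score report, the sketch below computes the same quantities by hand for binary labels. The layout follows scikit-learn's convention of rows for actual classes and columns for predicted classes; the helper names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1  # row = actual class, column = predicted class
    return matrix

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(y_true, y_pred) if a == p)
    return correct / len(y_true)

# Hypothetical actual and predicted labels.
toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 1, 0, 1, 0, 1, 1, 0]

toy_cm = confusion_counts(toy_true, toy_pred)
print(toy_cm, accuracy(toy_true, toy_pred))
```

The diagonal of the matrix holds the correct predictions (true negatives and true positives), so accuracy is the sum of the diagonal divided by the total number of samples.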