# Introduction to Machine Learning: Logistic Regression

Unlike linear regression, which predicts a value along a continuous range, logistic regression is used for classification. The secret sauce of logistic regression is an “activation function” that turns a score computed from the independent variable(s) into a probability between 0 and 1. The model then predicts 0 if that probability is below a threshold (0.5 by default) and 1 if it is above. It can be used for a variety of binary classification problems, such as predicting whether or not a patient has cancer or whether or not an email is spam.
The sigmoid activation function is $\sigma(x) = \frac{1}{1+e^{-x}}$. It traces an S-shaped curve that approaches 0 for large negative $x$, passes through 0.5 at $x = 0$, and approaches 1 for large positive $x$.
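
To make the thresholding step concrete, here is a minimal sketch of the sigmoid and a 0.5 cutoff in NumPy. The helper name `sigmoid` and the sample scores are illustrative assumptions, not part of sklearn.

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array) to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative scores: one negative, zero, one positive.
scores = np.array([-3.0, 0.0, 3.0])
probs = sigmoid(scores)

# Anything strictly above the 0.5 threshold becomes class 1.
labels = (probs > 0.5).astype(int)

print(probs)
print(labels)  # [0 0 1]
```

Note that a score of exactly 0 gives a probability of exactly 0.5, which is not above the threshold, so it lands in class 0.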

In this module, we’ll go over a simple logistic regression model, and do multinomial logistic regression and a train/test split with the Iris dataset from sklearn.
We’ll start by importing the libraries we need. These are the same libraries we installed in the last article, Introduction to Machine Learning: Linear Regression.

import numpy as np
import math
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
import random

## Logistic Regression on a Randomized Dataset

Let’s create a training data set. We’ll base it on the sigmoid curve described above: with a threshold of 0.5, all the negative numbers (and 0) are labeled 0, and the positive numbers are labeled 1.

x = np.array([-3, -2, -1, 0, 1, 2, 3]).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1])

Now, all we have to do is create our Logistic Regression model from the sklearn library and run a test to see how it does.

# fit the model
model = LogisticRegression().fit(x, y)

# test data
new_x = np.array([-13, -0.5, 1, 0.3, -5, 11, 12]).reshape(-1, 1)
new_y = np.array([0, 0, 1, 1, 0, 1, 1])

# predict
model.predict(new_x)

# expected output
array([0, 0, 1, 0, 0, 1, 1])

# score the model
model.score(new_x, new_y)

# expected output (this is equal to 6/7 points classified correctly)
0.8571428571428571 

Our model does not do that well on this dataset, but the model isn’t to blame; we only gave it 7 points to work with. It classified 6/7 correctly, and we can see that there’s a margin of error around 0 because 0.3 is misclassified. Below, we’ll plot the expected points in blue, the predicted points in red, and the standard sigmoid function.

plt.scatter(new_x, new_y, color="blue", alpha=0.5)
plt.scatter(new_x, model.predict(new_x), color="red", alpha=0.5)
logx = np.linspace(-13, 12, 100)
logy = 1/(1+math.e**-logx)
plt.plot(logx, logy)

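The expected output is a plot of the S-shaped sigmoid curve, with the blue expected points and the red predicted points sitting at y = 0 and y = 1. The two sets of points overlap everywhere except at x = 0.3, the one misclassified point.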
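
The curve plotted above is the standard sigmoid, while the fitted model learns its own weight and intercept. As a rough sketch of what the model is doing, the block below rebuilds its class 1 probabilities from `coef_` and `intercept_` and compares them with `predict_proba`. It assumes the default `LogisticRegression()` settings used above; the variable names `w`, `b`, and `boundary` are just for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same training and test points as in the example above.
x = np.array([-3, -2, -1, 0, 1, 2, 3]).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1])
new_x = np.array([-13, -0.5, 1, 0.3, -5, 11, 12]).reshape(-1, 1)

model = LogisticRegression().fit(x, y)

# Learned weight and intercept of the score w * x + b.
w = model.coef_[0, 0]
b = model.intercept_[0]

# Probability of class 1: the sigmoid applied to the learned score.
manual_proba = 1.0 / (1.0 + np.exp(-(w * new_x.ravel() + b)))
sklearn_proba = model.predict_proba(new_x)[:, 1]

# The score is 0 (probability 0.5) where x = -b / w.
boundary = -b / w

for value, p in zip(new_x.ravel(), sklearn_proba):
    print(f"x = {value:6.1f}  P(class 1) = {p:.3f}")
print("decision boundary at x =", boundary)
```

Points close to the printed boundary get probabilities close to 0.5, which is why a point like 0.3 is easy to misclassify with so little training data.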

## Logistic Regression on the Iris Dataset

In our next example we’ll import data from sklearn’s Iris dataset to do logistic regression on. After we load up the dataset we’ll want to examine what the data looks like before we do regression on it. Let’s load the y values, and the first and last 10 X values.

from sklearn.datasets import load_iris
X, y = load_iris(return_X_y = True)
print(y)
print(X[0:10])
print("...")
print(X[-10:])

# expected output
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
[[5.1 3.5 1.4 0.2]
[4.9 3.  1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5.  3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5.  3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]]
...
[[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3.  5.2 2.3]
[6.3 2.5 5.  1.9]
[6.5 3.  5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3.  5.1 1.8]]
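
The printed arrays carry no labels. If you want to know what the four columns and three classes are, one option is to call `load_iris()` without `return_X_y`, which returns an object that also exposes the feature and class names. This is a small sketch for inspection only and is not needed for the model below.

```python
import numpy as np
from sklearn.datasets import load_iris

# Without return_X_y, load_iris() returns a Bunch with extra metadata.
iris = load_iris()
X, y = iris.data, iris.target

print(iris.feature_names)        # names of the 4 columns of X
print(list(iris.target_names))   # names of the 3 classes in y
print(X.shape)                   # (number of samples, number of features)
print(np.bincount(y))            # how many samples each class has
```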

We see that our y values are discrete, so we can perform logistic regression on them. We can also see that we have 3 classes instead of the usual 2 that logistic regression is meant for. Luckily, sklearn’s LogisticRegression class offers an option to do multiclass logistic regression. Along with our train/test split, we will also standardize X with sklearn’s StandardScaler to help the model converge. After standardization, each feature has a mean of 0 and a standard deviation of 1.
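
As a quick sanity check on what standardization does, the sketch below compares `StandardScaler` with the by-hand formula (subtract each column's mean, divide by its standard deviation). The variable name `manual` is only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize with sklearn.
xscaled = StandardScaler().fit_transform(X)

# Same thing by hand: per-column mean and (population) standard deviation.
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(xscaled, manual))
print(xscaled.mean(axis=0).round(6))  # per-feature means after scaling
print(xscaled.std(axis=0).round(6))   # per-feature standard deviations after scaling
```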

# imports for preprocessing
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

# create normalized x values
xscaler = preprocessing.StandardScaler().fit(X)
xscaled = xscaler.transform(X)

# create training/test datasets
x_train, x_test, y_train, y_test = train_test_split(xscaled, y, test_size=0.2, random_state=1)

# create model
model = LogisticRegression(multi_class="multinomial", random_state=1).fit(x_train, y_train)

# test model
model.score(x_test, y_test)

# expected output
0.9666666666666667

# let's take a look at the predictions
print(y_test)
print(model.predict(x_test))

# expected output
[0 1 1 0 2 1 2 0 0 2 1 0 2 1 1 0 1 1 0 0 1 1 1 0 2 1 0 0 1 2]
[0 1 1 0 2 1 2 0 0 2 1 0 2 1 1 0 1 1 0 0 1 1 2 0 2 1 0 0 1 2]

Nice, our model did a good job. In 30 predictions, it only missed 1. Unfortunately, because our X variable has 4 dimensions, we won’t be able to graph this the way we did above.
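
Since we can’t plot four dimensions, a confusion matrix is one way to see which classes get mixed up. The sketch below is a variation on the code above, not a replacement for it: it wraps the scaler and the model in a pipeline so the scaler is fit on the training split only, and it leaves out the `multi_class` argument on the assumption that the default solver already fits a multinomial model for three classes (recent sklearn releases deprecate that argument, as far as I know). Because of these changes, the score may differ slightly from the one above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Split first, so the test rows never influence the scaler.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# The pipeline fits the scaler on x_train, then fits the model on the scaled data.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(random_state=1))
pipeline.fit(x_train, y_train)

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_test, pipeline.predict(x_test), labels=[0, 1, 2])
print(cm)
print(pipeline.score(x_test, y_test))
```

Off-diagonal entries in the printed matrix are the misclassified points, so you can read off which pair of classes the model confused.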

Yujian Tang

I started my professional software career interning for IBM in high school after winning ACSL two years in a row. I got into AI/ML in college where I published a first author paper to IEEE Big Data. After college I worked on the AutoML infrastructure at Amazon before leaving to work in startups. I believe I create the highest quality software content so that’s what I’m doing now. Drop a comment to let me know!
