2 Ways to Implement Multinomial Logistic Regression In Python
Logistic regression is one of the most popular supervised classification algorithms, and it is mostly used for solving binary classification problems. A common myth is that logistic regression is useful only for binary classification problems.
That is not true. The logistic regression algorithm can also be used to solve multi-classification problems. So in this article, you are going to implement a logistic regression model in Python for a multi-classification problem, in 2 different ways. In machine learning terms: implementing a multinomial logistic regression model in Python.
Table of contents:
- The difference between binary classification and multi-classification
- Binary classification problems and explanation
- Multi-classification problems and explanation
- Introduction to Multinomial Logistic regression
- Glass Dataset description
- Multinomial Logistic regression implementation in Python
- Conclusion
The difference between binary classification and multi-classification
The name itself signals the key difference between binary and multi-classification. The examples below will give you a clear understanding of these two kinds of classification. Let’s first look at some binary classification problem examples; later we will look at multi-classification problems.
Binary Classification:
- Given the subject and the email text, predicting whether the email is spam or not.
- Predicting a sunny or rainy day, using the weather information.
- Based on the bank customer’s history, predicting whether to grant the loan or not.
Multi-Classification:
- Given the dimensional information of an object, identifying the shape of the object.
- Identifying the different kinds of vehicles.
- Based on the color intensities, predicting the color type.
I hope the above examples have given you a clear understanding of these two kinds of classification problems. In case you missed it, below is a detailed explanation of each.
Binary Classification Explanation:
In a binary classification task, the idea is to use the training dataset to come up with a classification algorithm, and in a later phase use the trained classifier to predict the target for given features. The possible outcome for the target is one of two different target classes.
If you look at the binary classification problem examples above, in all of them the prediction target has only 2 possible outcomes. For email spam prediction, the 2 possible outcomes for the target are spam and not spam.
On a final note, binary classification is the task of predicting the target class from two possible outcomes.
Multi-classification Explanation:
In a multi-classification problem, the idea is to use the training dataset to come up with a classification algorithm, and later use the trained classifier to predict the target out of more than 2 possible outcomes.
If you look at the multi-classification problem examples above, in all of them the prediction target has more than 2 possible outcomes. For identifying the shape of an object, the target could be a triangle, rectangle, square, or any other shape, and likewise for the other examples.
On a final note, multi-classification is the task of predicting the target class from more than two possible outcomes.
I hope you now have a clear idea about binary and multi-classification. Now let’s move on to multinomial logistic regression.
Introduction to Multinomial Logistic regression
Multinomial logistic regression is the generalization of the logistic regression algorithm. When the logistic regression algorithm is used for a multi-classification task, it is called multinomial logistic regression.
The difference between the ordinary logistic regression algorithm and multinomial logistic regression is not only about using them for different tasks, binary classification versus multi-classification. At a deeper level, it’s all about using different functions.
In logistic regression, the black-box function that takes the input features and calculates the probabilities of the two possible outcomes is the sigmoid function. The target class with the higher probability is then the final predicted class from the logistic regression classifier.
When it comes to multinomial logistic regression, the function is the softmax function. I am not going into much detail about the properties of the sigmoid and softmax functions or how the multinomial logistic regression algorithm works, as we have already discussed these topics in detail in earlier articles.
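To make the distinction concrete, below is a minimal NumPy sketch of the two functions. These are illustrative implementations only, not scikit-learn’s internals.

```python
import numpy as np

def sigmoid(z):
    # Squashes a real-valued score into the (0, 1) range: binary logistic regression
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Turns a vector of scores into probabilities that sum to 1: multinomial logistic regression
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

print(sigmoid(0.5))                         # a single probability, about 0.62
print(softmax(np.array([2.0, 1.0, 0.5])))   # one probability per class, summing to 1.0
```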
Before you drive further, I recommend you spend some time understanding the concepts mentioned above: the sigmoid and softmax functions, and how the logistic regression model works.
I hope you are clear on the above-mentioned concepts. Now let’s start the most interesting part: building the multinomial logistic regression model.
You are going to build the multinomial logistic regression model in 2 different ways (a quick preview of the difference follows this list).
- Using the same scikit-learn binary logistic regression classifier.
- Tuning the scikit-learn logistic regression classifier to model the multinomial logistic regression.
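As a preview, the difference between the two approaches comes down to how the scikit-learn LogisticRegression classifier is instantiated. This is only a sketch; the full training code comes later in the article. Note that in the scikit-learn versions this article targets, the default classifier handles multi-class targets in a one-vs-rest fashion.

```python
from sklearn.linear_model import LogisticRegression

# Approach 1: the default classifier (one-vs-rest for multi-class targets)
lr = LogisticRegression()

# Approach 2: explicitly multinomial, with a solver that supports the multinomial loss
mul_lr = LogisticRegression(multi_class='multinomial', solver='newton-cg')
```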
Glass Identification Dataset Description
The classification model we are going to build using the multinomial logistic regression algorithm is for glass identification. The identification task is quite interesting: using different glass mixture features, we are going to create a classification model to predict what kind of glass it could be.
We will look into what those glass types are in the coming paragraphs. Before that, let’s quickly look at the key facts about the glass identification dataset.
| Title | Glass Identification Dataset |
| --- | --- |
| Associated Task | Classification |
| Number of Observations | 214 |
| Number of Features | 10 |
| Missing Values | No |
| Target | Glass Type |
Features and Target Information
From the above table, you know that we have 10 features and 1 target in the glass identification dataset. Let’s look into the details of the features and the target.
Features:
- Id number: 1 to 214
- RI: refractive index
- Na: Sodium (unit measurement: weight percent in the corresponding oxide, as attributes 4-10)
- Mg: Magnesium
- Al: Aluminum
- Si: Silicon
- K: Potassium
- Ca: Calcium
- Ba: Barium
- Fe: Iron
Target: Type of glass
The glass identification dataset has 7 different glass types as the target. These glass types differ by usage.
- building_windows_float_processed
- building_windows_non_float_processed
- vehicle_windows_float_processed
- vehicle_windows_non_float_processed (none in this database)
- containers
- tableware
- headlamps
Multinomial Logistic regression implementation in Python
Below is the workflow we will follow to build the multinomial logistic regression model.
- Required python packages
- Load the input dataset
- Visualizing the dataset
- Split the dataset into training and test dataset
- Building the logistic regression for multi-classification
- Implementing the multinomial logistic regression
- Comparing the accuracies
Let’s begin by importing the required Python packages.
Required Python Packages
Below are the general Python machine learning libraries. If you haven’t set up your Python machine learning environment yet, the Python machine learning setup article will help you install most of these libraries.
```python
# Required Python Packages
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn import metrics
from sklearn.model_selection import train_test_split
import plotly.graph_objs as go
import plotly.plotly as py

py.sign_in('Your_plotly_username', 'API_key')
```
- Pandas: pandas is for data analysis; in our case, tabular data analysis.
- Numpy: NumPy is for performing the numerical calculations.
- Sklearn: scikit-learn is the Python machine learning algorithm toolkit.
- linear_model: for building the logistic regression model.
- metrics: for calculating the accuracies of the trained logistic regression model.
- train_test_split: as the name suggests, it’s used for splitting the dataset into training and test datasets.
- Plotly: plotly is for visualizing the data.
Now let’s load the dataset into a pandas dataframe.
Dataset Path
You can download the dataset from the UCI Machine Learning Repository, or you can clone the complete code from the dataaspirant GitHub account.
```python
# Dataset Path
DATASET_PATH = "../Inputs/glass.txt"
```
Loading dataset
```python
def main():
    # Glass dataset headers
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    # Loading the Glass dataset into a pandas dataframe
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    print("Number of observations ::", len(glass_data.index))
    print("Number of columns ::", len(glass_data.columns))
    print("Headers ::", glass_data.columns.values)


if __name__ == "__main__":
    main()
```
- The downloaded dataset does not have a header row, so we created glass_data_headers.
- We load the dataset into a pandas dataframe by passing the dataset location and the headers.
- Next, we print the loaded dataframe’s number of observations, number of columns, and header names.
Script Output
```
Number of observations :: 214
Number of columns :: 11
Headers :: ['Id' 'RI' 'Na' 'Mg' 'Al' 'Si' 'K' 'Ca' 'Ba' 'Fe' 'glass-type']
```
Before we implement the multinomial logistic regression in 2 different ways, let’s understand the dataset.
To understand the behavior of each feature with respect to the target (glass type), we are going to create a density graph. The density graph visualizes the relationship between a single feature and all the target types.
Not sure what I mean by a density graph? Just wait a moment; in the next section we are going to visualize an example density graph, and then you will know exactly what I mean.
Now let’s create a function that builds the density graph and stores it on our local system.
```python
def scatter_with_color_dimension_graph(feature, target, layout_labels):
    """
    Scatter with color dimension graph to visualize the density of the
    given feature with the target
    :param feature:
    :param target:
    :param layout_labels:
    :return:
    """
    trace1 = go.Scatter(
        y=feature,
        mode='markers',
        marker=dict(
            size=16,
            color=target,
            colorscale='Viridis',
            showscale=True
        )
    )
    layout = go.Layout(
        title=layout_labels[2],
        xaxis=dict(title=layout_labels[0]),
        yaxis=dict(title=layout_labels[1]))
    data = [trace1]
    fig = go.Figure(data=data, layout=layout)
    # plot_url = py.plot(fig)
    py.image.save_as(fig, filename=layout_labels[1] + '_Density.png')
```
- The function scatter_with_color_dimension_graph takes the feature, target, and layout_labels as inputs and creates the density graph I am talking about.
- It then saves the created density graph on our local system.
- The above code is just a template for plotly graphs; all we need to do is replace the template inputs with our own input parameters.
Now let’s call the above function with a dummy feature and target.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    print("glass_data_RI ::", list(glass_data["RI"][:10]))
    print("glass_data_target ::", np.array([1, 1, 1, 2, 2, 3, 4, 5, 6, 7]))


if __name__ == "__main__":
    main()
```
Script Output:
```
glass_data_RI :: [1.52101, 1.51761, 1.51618, 1.51766, 1.51742, 1.51596, 1.51743, 1.51756, 1.51918, 1.51755]
glass_data_target :: [1 1 1 2 2 3 4 5 6 7]
```
The above are the dummy feature and target.
- glass_data_RI: the feature; its values are refractive indexes, taken from the first 10 observations of the glass identification dataset.
- glass_data_target: the target; its values are the different glass types. In fact, it covers all 7 glass types.
Now let’s use the above dummy data for visualization.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    print("glass_data_RI ::", list(glass_data["RI"][:10]))
    print("glass_data_target ::", np.array([1, 1, 1, 2, 2, 3, 4, 5, 6, 7]))
    # Graph Labels
    graph_labels = ["Number of Observations", "RI & Glass Type", "Sample RI - Glass Type Density Graph"]
    scatter_with_color_dimension_graph(list(glass_data["RI"][:10]),
                                       np.array([1, 1, 1, 2, 2, 3, 4, 5, 6, 7]), graph_labels)


if __name__ == "__main__":
    main()
```
We are calling scatter_with_color_dimension_graph with the dummy feature and target. Below is the density graph for the dummy feature and target.
- The above graph helps visualize the relationship between the feature and the target (the 7 glass types).
- The yellow circles are for glass type 7.
- The right sidebar helps identify each circle’s target glass type by its color, and the left-side values are the corresponding feature values.
- If we plot more observations, we can visualize for which feature values the target will be glass type 7, and likewise for every other target glass type.
Now let’s create a function that creates and saves the above kind of density graph for every feature.
```python
def create_density_graph(dataset, features_header, target_header):
    """
    Create density graph for each feature with the target
    :param dataset:
    :param features_header:
    :param target_header:
    :return:
    """
    for feature_header in features_header:
        print("Creating density graph for feature :: {}".format(feature_header))
        layout_headers = ["Number of Observation",
                          feature_header + " & " + target_header,
                          feature_header + " & " + target_header + " Density Graph"]
        scatter_with_color_dimension_graph(dataset[feature_header], dataset[target_header], layout_headers)
```
- The function create_density_graph takes the dataset, features_header, and target_header as input parameters.
- Inside the function, we loop over each feature_header in features_header and call the function scatter_with_color_dimension_graph.
Now let’s call the above function inside the main function.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    create_density_graph(glass_data, glass_data_headers[1:-1], glass_data_headers[-1])


if __name__ == "__main__":
    main()
```
The above code saves the graphs listed below; each graph shows the relationship between one feature and the target.
- Density graph of RI and glass type
- Density graph of Na and glass type
- Density graph of Mg and glass type
- Density graph of Al and glass type
- Density graph of Si and glass type
- Density graph of K and glass type
- Density graph of Ca and glass type
- Density graph of Ba and glass type
- Density graph of Fe and glass type
Please spend some time understanding each graph to learn which features have a good relationship with the target; we can then use those features to build the multinomial logistic regression model.
To build the multinomial logistic regression model here, I am using all the features in the glass identification dataset. You could instead use only the features you think are most suitable from the above graphs, as in the sketch below.
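For instance, if the density graphs suggest that only a few features separate the glass types well, you could model on just those columns. The subset below is purely hypothetical; your picks from the graphs may differ.

```python
# Hypothetical example: keep only the features that looked most informative in the graphs
selected_features = ["RI", "Na", "Mg", "Al"]
features = glass_data[selected_features]
target = glass_data["glass-type"]
```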
Training the multinomial logistic regression model requires the features and the corresponding targets. For this, we are going to split the dataset into four datasets:
- train_x
- test_x
- train_y
- test_y
We are going to use train_x and train_y to build the multinomial logistic regression model, and test_x and test_y to calculate the accuracy of our trained multinomial logistic regression model.
Now let’s split the loaded glass dataset into four different datasets.
Split the dataset into training and test dataset
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    train_x, test_x, train_y, test_y = train_test_split(glass_data[glass_data_headers[:-1]],
                                                        glass_data[glass_data_headers[-1]],
                                                        train_size=0.7)


if __name__ == "__main__":
    main()
```
- We are using the scikit-learn train_test_split method to split the glass dataset.
- As we pass 0.7 as the train_size value, the train_test_split method randomly splits the glass dataset into 70% for training and the remaining 30% for testing (an optional, reproducible variant is sketched below).
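Because the split is random, each run can give slightly different accuracies. If you want reproducible results with the class proportions preserved across the split, you can optionally pass the standard train_test_split parameters random_state and stratify:

```python
train_x, test_x, train_y, test_y = train_test_split(
    glass_data[glass_data_headers[:-1]],
    glass_data[glass_data_headers[-1]],
    train_size=0.7,
    random_state=42,                               # fix the shuffle so results repeat
    stratify=glass_data[glass_data_headers[-1]])   # preserve the glass-type proportions
```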
Building the logistic regression for multi-classification
In the first approach, we are going to use the scikit-learn logistic regression classifier to build the multi-classification classifier.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    train_x, test_x, train_y, test_y = train_test_split(glass_data[glass_data_headers[:-1]],
                                                        glass_data[glass_data_headers[-1]],
                                                        train_size=0.7)

    # Train multi-class logistic regression model
    lr = linear_model.LogisticRegression()
    lr.fit(train_x, train_y)


if __name__ == "__main__":
    main()
```
- We use the LogisticRegression function from scikit-learn’s linear_model module to create the logistic regression model instance.
- Next, we call the fit method with train_x and train_y to fit the logistic regression model on the glass identification training dataset (a quick prediction check follows below).
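Once fitted, the classifier can predict glass types for unseen rows. A quick sanity check inside main(), assuming the lr, test_x, and test_y variables from the snippet above, might look like this:

```python
# Predict the glass type for the first five test observations
print(lr.predict(test_x[:5]))
# Compare against the actual glass types
print(list(test_y[:5]))
```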
Implementing the multinomial logistic regression
In the second approach, we pass the multinomial parameter before we fit the model with train_x and train_y.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    train_x, test_x, train_y, test_y = train_test_split(glass_data[glass_data_headers[:-1]],
                                                        glass_data[glass_data_headers[-1]],
                                                        train_size=0.7)

    # Train multinomial logistic regression model
    mul_lr = linear_model.LogisticRegression(multi_class='multinomial',
                                             solver='newton-cg').fit(train_x, train_y)


if __name__ == "__main__":
    main()
```
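Since the multinomial model pushes the class scores through the softmax function, predict_proba returns one probability per glass type present in the training data, and each row sums to 1. A short optional check, assuming the mul_lr and test_x variables from the snippet above:

```python
# Class probabilities for the first test observation, one column per glass type
print(mul_lr.classes_)                    # the glass-type labels, in column order
print(mul_lr.predict_proba(test_x[:1]))   # each row of probabilities sums to 1.0
```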
Comparing the accuracies
Now let’s compare the train and test accuracies of both models.
```python
def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)
    train_x, test_x, train_y, test_y = train_test_split(glass_data[glass_data_headers[:-1]],
                                                        glass_data[glass_data_headers[-1]],
                                                        train_size=0.7)

    # Train multi-classification model with logistic regression
    lr = linear_model.LogisticRegression()
    lr.fit(train_x, train_y)

    # Train multinomial logistic regression model
    mul_lr = linear_model.LogisticRegression(multi_class='multinomial',
                                             solver='newton-cg').fit(train_x, train_y)

    print("Logistic regression Train Accuracy ::", metrics.accuracy_score(train_y, lr.predict(train_x)))
    print("Logistic regression Test Accuracy ::", metrics.accuracy_score(test_y, lr.predict(test_x)))
    print("Multinomial Logistic regression Train Accuracy ::", metrics.accuracy_score(train_y, mul_lr.predict(train_x)))
    print("Multinomial Logistic regression Test Accuracy ::", metrics.accuracy_score(test_y, mul_lr.predict(test_x)))


if __name__ == "__main__":
    main()
```
- To calculate the accuracy of the trained multinomial logistic regression models, we use scikit-learn’s metrics module.
- We call the metrics module’s accuracy_score function with the actual targets and the predicted targets (an optional per-class check is sketched below).
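Accuracy alone can hide which glass types the model confuses with one another. As an optional check, the confusion_matrix function, also in scikit-learn’s metrics module, shows the per-class breakdown:

```python
# Rows are actual glass types, columns are predicted glass types
print(metrics.confusion_matrix(test_y, mul_lr.predict(test_x)))
```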
Multinomial logistic regression model Accuracy
```
Logistic regression Train Accuracy :: 0.885906040268
Logistic regression Test Accuracy :: 0.830769230769
Multinomial Logistic regression Train Accuracy :: 1.0
Multinomial Logistic regression Test Accuracy :: 1.0
```
From the results, we can say that the plain scikit-learn logistic regression gets lower accuracy than the multinomial logistic regression model. Now take the code and play around with it. Keep in mind that the Id column is included as a feature here; see the comments below for how that affects these accuracies.
Multinomial logistic regression Complete Code
```python
#!/usr/bin/env python
# multinomial_logistic_regression.py
# Author : Saimadhu Polamuri
# Date: 05-May-2017
# About: Multinomial logistic regression model implementation

# Required Python Packages
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn import metrics
from sklearn.model_selection import train_test_split
import plotly.graph_objs as go
import plotly.plotly as py

py.sign_in('Your_plotly_username', 'API_key')

# Dataset Path
DATASET_PATH = "../Inputs/glass.txt"


def scatter_with_color_dimension_graph(feature, target, layout_labels):
    """
    Scatter with color dimension graph to visualize the density of the
    given feature with the target
    :param feature:
    :param target:
    :param layout_labels:
    :return:
    """
    trace1 = go.Scatter(
        y=feature,
        mode='markers',
        marker=dict(
            size=16,
            color=target,
            colorscale='Viridis',
            showscale=True
        )
    )
    layout = go.Layout(
        title=layout_labels[2],
        xaxis=dict(title=layout_labels[0]),
        yaxis=dict(title=layout_labels[1]))
    data = [trace1]
    fig = go.Figure(data=data, layout=layout)
    # plot_url = py.plot(fig)
    py.image.save_as(fig, filename=layout_labels[1] + '_Density.png')


def create_density_graph(dataset, features_header, target_header):
    """
    Create density graph for each feature with the target
    :param dataset:
    :param features_header:
    :param target_header:
    :return:
    """
    for feature_header in features_header:
        print("Creating density graph for feature :: {}".format(feature_header))
        layout_headers = ["Number of Observation",
                          feature_header + " & " + target_header,
                          feature_header + " & " + target_header + " Density Graph"]
        scatter_with_color_dimension_graph(dataset[feature_header], dataset[target_header], layout_headers)


def main():
    glass_data_headers = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "glass-type"]
    glass_data = pd.read_csv(DATASET_PATH, names=glass_data_headers)

    print("Number of observations ::", len(glass_data.index))
    print("Number of columns ::", len(glass_data.columns))
    print("Headers ::", glass_data.columns.values)
    print("Target ::", glass_data[glass_data_headers[-1]])

    print("glass_data_RI ::", list(glass_data["RI"][:10]))
    print("glass_data_target ::", np.array([1, 1, 1, 2, 2, 3, 4, 5, 6, 7]))
    graph_labels = ["Number of Observations", "RI & Glass Type", "Sample RI - Glass Type Density Graph"]
    # scatter_with_color_dimension_graph(list(glass_data["RI"][:10]),
    #                                    np.array([1, 1, 1, 2, 2, 3, 4, 5, 6, 7]), graph_labels)
    # print("glass_data_headers[:-1] ::", glass_data_headers[:-1])
    # print("glass_data_headers[-1] ::", glass_data_headers[-1])
    # create_density_graph(glass_data, glass_data_headers[1:-1], glass_data_headers[-1])

    # Train, test data split
    train_x, test_x, train_y, test_y = train_test_split(glass_data[glass_data_headers[:-1]],
                                                        glass_data[glass_data_headers[-1]],
                                                        train_size=0.7)

    # Train multi-classification model with logistic regression
    lr = linear_model.LogisticRegression()
    lr.fit(train_x, train_y)

    # Train multinomial logistic regression model
    mul_lr = linear_model.LogisticRegression(multi_class='multinomial',
                                             solver='newton-cg').fit(train_x, train_y)

    print("Logistic regression Train Accuracy ::", metrics.accuracy_score(train_y, lr.predict(train_x)))
    print("Logistic regression Test Accuracy ::", metrics.accuracy_score(test_y, lr.predict(test_x)))
    print("Multinomial Logistic regression Train Accuracy ::", metrics.accuracy_score(train_y, mul_lr.predict(train_x)))
    print("Multinomial Logistic regression Test Accuracy ::", metrics.accuracy_score(test_y, mul_lr.predict(test_x)))


if __name__ == "__main__":
    main()
```
You can fork the complete code at the dataaspirant GitHub account.
Conclusion
In this article, you learned about:
- The key differences between binary and multi-class classification.
- An introduction to the concept of multinomial logistic regression.
- Finally, two different ways to build a multinomial logistic regression model in Python with scikit-learn.
I hope you like this post. If you have any questions, feel free to comment below. If you want me to write on a particular topic, tell me in the comments below.
Comments
Thanks for the article. One thing: train_test_split now lives in the sklearn.model_selection module rather than sklearn.cross_validation.
Hi Adan,
Thanks for the correction; in the updated sklearn versions, the train_test_split method moved to model_selection.
Happy learning.
Hello, it was a great article, and I have worked through it. But I wonder why you used “Id” as a feature. When I removed the “Id” feature from my X_train and X_test, the accuracy for the training set dropped to 66% and for the test set to 50%. I think “Id” is creating a bias here.
Hi Mitu,
Great observation. We can try out different features. It’s not good practice to use hand-picked features in most cases; the best practice is to perform feature engineering to come up with the best features for the model and use those features in the model.