Support vector machine (Svm classifier) implementation in python with Scikit-learn

Iris Classification with Svm Classifier

Svm classifier implementation in python with scikit-learn

The support vector machine (SVM) classifier is one of the most popular machine learning classification algorithms. It is often used for multi-classification problems. If you are not familiar with multi-classification, here are some examples.

Multi-Classification Problem Examples:

  • Given fruit features like color, size, taste, weight, and shape, predict the fruit type.
  • By analyzing the skin, predict the type of skin disease.
  • Given Google news articles, predict the topic of the article. This could be sport, movie, tech news, etc.

In short: a multi-classification problem is one with more than 2 target classes to predict.

In the first example, predicting the fruit type, the target class can take many values: apple, mango, orange, banana, etc. The same holds for the other two examples. For the news articles, the target class consists of different topics such as sport, movie, tech news, etc.

In this article, we are going to implement the SVM classifier with different kernels. We have already explained the key aspects of the support vector machine algorithm, and implemented the SVM classifier in the R programming language, in our earlier posts. If you are reading this post for the first time, it’s recommended to check out the previous post on SVM concepts.

To implement the SVM classifier in Python, we are going to use one of the most popular classification datasets, the Iris dataset. Let’s quickly look at the features and the target variable of this famous dataset.

Iris Dataset description

Irises dataset for classification

This famous classification dataset was first used in Fisher’s classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems. The Iris dataset has 4 features of the iris flower and one target class.

The 4 features are

  • SepalLengthCm
  • SepalWidthCm
  • PetalLengthCm
  • PetalWidthCm

The target class

The flower species type is the target class, and it has 3 types

  • Setosa
  • Versicolor
  • Virginica

The idea of implementing svm classifier in Python is to use the iris features to train an svm classifier and use the trained svm model to predict the Iris species type. To begin with let’s try to load the Iris dataset. We are going to use the iris data from Scikit-Learn package.

Analyzing Iris dataset

To run the scripts below on your machine, you need to install the required packages. It’s best to go through the python machine learning packages installation (or machine learning packages setup) post before running them.

Importing Iris dataset from Scikit-Learn

Let’s first import the required python packages
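The import statements are not shown in the text. A minimal set that would cover the rest of this walkthrough, assuming NumPy, Matplotlib, and scikit-learn are installed, might look like this:

```python
# Assumed imports for the examples in this article (not taken from the original post)
import numpy as np                # mesh grids for the decision-boundary plots
import matplotlib.pyplot as plt   # scatter and contour plots
from sklearn import datasets      # bundled copy of the Iris dataset
from sklearn import svm           # SVC and LinearSVC classifiers
```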

Now let’s import the iris dataset
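As a sketch of this step, the bundled dataset can be loaded with `datasets.load_iris()`. The variable name `iris_dataset` follows the text; the two `print` calls are only there to show what kind of object comes back.

```python
from sklearn import datasets

# load_iris() returns a dictionary-like Bunch object
iris_dataset = datasets.load_iris()

print(type(iris_dataset))
print(iris_dataset.keys())
```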

Using the DESCR key of iris_dataset, we can get a description of the dataset
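A minimal sketch of reading the description, assuming the dataset was loaded into `iris_dataset` as above. The description is a plain string, so it can be printed directly.

```python
from sklearn import datasets

iris_dataset = datasets.load_iris()

# DESCR holds a free-text description of the dataset
description = iris_dataset["DESCR"]
print(description)
```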

Now let’s get the iris features and the target classes
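One way to inspect the features and the target classes is through the `feature_names`, `target_names`, and `data` attributes of the loaded object. Printing only the first five rows is a choice made here to keep the output short.

```python
from sklearn import datasets

iris_dataset = datasets.load_iris()

# Names of the 4 feature columns and of the 3 species
print("Feature names:", iris_dataset.feature_names)
print("Target names:", iris_dataset.target_names)

# The feature matrix has one row per flower and one column per feature
print("Feature matrix shape:", iris_dataset.data.shape)
print("First 5 rows:")
print(iris_dataset.data[:5])
```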

As we said, there are 4 features: the first 2 are sepal length and sepal width, and the next 2 are petal length and petal width. Now let’s check the target data
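A short sketch for checking the target data. The `np.bincount` call is an addition made here to count how many samples belong to each class.

```python
import numpy as np
from sklearn import datasets

iris_dataset = datasets.load_iris()

# The target is an integer array with one class index per flower
print(iris_dataset.target)

# Number of samples per class index (0, 1, 2)
class_counts = np.bincount(iris_dataset.target)
print(class_counts)
```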

Visualizing the Iris dataset

Let’s take the individual features, sepal length and width and petal length and width, and visualize the corresponding target classes with different colors.

Visualizing the relationship between sepal and target classes

To visualize the Sepal length, width and corresponding target classes we can create a function with the name visuvalize_sepal_data. At the beginning, we load the iris dataset into the iris variable. Next, we store the first 2 features of the iris dataset, sepal length and sepal width, in the variable X. Then we store the corresponding target values in the variable y.
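A sketch of such a function, following the description above. Returning the Matplotlib axes and calling `plt.show()` outside the function are choices made here so the plot can be inspected or reused; they are not necessarily how the original script was written.

```python
import matplotlib.pyplot as plt
from sklearn import datasets


def visuvalize_sepal_data():
    iris = datasets.load_iris()
    X = iris.data[:, :2]  # we only take the first two features (sepal length, sepal width)
    y = iris.target       # class index (0, 1, 2) used to color the points

    fig, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
    ax.set_xlabel("Sepal length")
    ax.set_ylabel("Sepal width")
    ax.set_title("Sepal Width & Length")
    return ax


ax = visuvalize_sepal_data()
plt.show()
```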

As we have seen, the target variable contains the values 0, 1, and 2; each value represents an iris flower species type. Then we plot the points on the XY plane, with Sepal Length values on the X-axis and Sepal Width values on the Y-axis. If you followed the instructions in Installing Python machine learning packages and run the above code, you will get the below image.

Iris Sepal length & width Vs iris Species type

Let’s create a similar graph for Petal length and width

Visualizing the relationship between Petal and target classes
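The petal version can be sketched the same way. The function name `visuvalize_petal_data` is hypothetical and simply mirrors the sepal function; the only real change is selecting the last two feature columns.

```python
import matplotlib.pyplot as plt
from sklearn import datasets


def visuvalize_petal_data():  # hypothetical name, mirroring the sepal function
    iris = datasets.load_iris()
    X = iris.data[:, 2:]  # last two features: petal length, petal width
    y = iris.target

    fig, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
    ax.set_xlabel("Petal length")
    ax.set_ylabel("Petal width")
    ax.set_title("Petal Width & Length")
    return ax


ax = visuvalize_petal_data()
plt.show()
```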

If we run the above code, we will get the below graph.

Iris Petal length & width Vs Species Type

We have now visualized the behavior of the target class (iris species type) with respect to Sepal length and width as well as Petal length and width. Now let’s model SVM classifiers with different kernels, first using only the Sepal features (Length and Width) and then only the Petal features (Length and Width).

Modeling Different Kernel Svm classifier using Iris Sepal features

To model SVM classifiers with different kernels using the iris Sepal features, we first load the iris dataset into the iris variable as before. Next, we load the sepal length and width values into the X variable and store the target values in the y variable. Once the data is ready, we just call the classifiers from the scikit-learn svm module with different kernels.
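The steps above can be sketched as follows. Only the name `lin_svc` appears in the text; the other variable names and the hyperparameter values (`C=1.0`, `gamma=0.7`, `degree=3`, `max_iter=10000`) are assumptions chosen for illustration, not values confirmed by the original post.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, :2]  # sepal length and sepal width
y = iris.target

C = 1.0  # regularization strength (assumed value)

# Four classifiers, each fitted on the same two sepal features
svc = svm.SVC(kernel="linear", C=C).fit(X, y)
lin_svc = svm.LinearSVC(C=C, max_iter=10000).fit(X, y)
rbf_svc = svm.SVC(kernel="rbf", gamma=0.7, C=C).fit(X, y)
poly_svc = svm.SVC(kernel="poly", degree=3, C=C).fit(X, y)

# Training accuracy of each model, as a quick sanity check
for name, clf in [("linear SVC", svc), ("LinearSVC", lin_svc),
                  ("RBF SVC", rbf_svc), ("poly SVC", poly_svc)]:
    print(name, clf.score(X, y))
```

Note that `SVC(kernel="linear")` and `LinearSVC` both learn linear boundaries but use different solvers and multi-class strategies, so their results can differ slightly.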

Now let’s visualize each kernel’s SVM classifier to understand how well the classifier fits the training features.

Visualizing the modeled svm classifiers with Iris Sepal features
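One common way to draw such a picture is to predict the class for every point on a fine grid and color the regions accordingly. The sketch below does that for the sepal features; the grid step `h`, the subplot layout, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, :2]  # sepal length and sepal width
y = iris.target

C = 1.0  # assumed regularization strength
models = {
    "SVC with linear kernel": svm.SVC(kernel="linear", C=C),
    "LinearSVC (linear kernel)": svm.LinearSVC(C=C, max_iter=10000),
    "SVC with RBF kernel": svm.SVC(kernel="rbf", gamma=0.7, C=C),
    "SVC with polynomial (degree 3) kernel": svm.SVC(kernel="poly", degree=3, C=C),
}

# Build a grid that covers the data with a margin of 1 unit on each side
h = 0.02  # grid step size
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
for ax, (title, clf) in zip(axes.ravel(), models.items()):
    clf.fit(X, y)
    # Predict a class for every grid point, then reshape back to the grid
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm, edgecolors="k")
    ax.set_xlabel("Sepal length")
    ax.set_ylabel("Sepal width")
    ax.set_title(title)

fig.tight_layout()
plt.show()
```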

If we run the above code, we will get the below graph, from which we can see how well the different kernel SVM classifiers are modeled.

Svm Classifier with Iris Sepal features

From the above graphs, you can see how the different kernels shape the same SVM classifier. Now let’s model the SVM classifier with the Petal features, using the same kernels we used with the Sepal features.

Modeling Different Kernel Svm classifier using Iris Petal features
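A sketch of the petal version, under the same assumptions as the sepal models (illustrative variable names and hyperparameters). The only change is the column slice used to build X.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, 2:]  # petal length and petal width
y = iris.target

C = 1.0  # regularization strength (assumed value)

svc = svm.SVC(kernel="linear", C=C).fit(X, y)
lin_svc = svm.LinearSVC(C=C, max_iter=10000).fit(X, y)
rbf_svc = svm.SVC(kernel="rbf", gamma=0.7, C=C).fit(X, y)
poly_svc = svm.SVC(kernel="poly", degree=3, C=C).fit(X, y)

# Training accuracy of each model, as a quick sanity check
for name, clf in [("linear SVC", svc), ("LinearSVC", lin_svc),
                  ("RBF SVC", rbf_svc), ("poly SVC", poly_svc)]:
    print(name, clf.score(X, y))
```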

The above code is very similar to the code for the previously modeled SVM classifiers. The only difference is that we load the Petal features into the X variable. The remaining code is just a copy-paste from the previous code.

Now let’s visualize each kernel’s SVM classifier to understand how well the classifier fits the Petal features.

Visualizing the modeled svm classifiers with Iris Petal features
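For the petal features, the same grid-based plot can be wrapped in a small helper. The function `plot_svm_kernels` is a hypothetical name introduced here, and the hyperparameters are again illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm


def plot_svm_kernels(X, y, xlabel, ylabel, C=1.0, h=0.02):
    """Fit four SVM variants on two features and plot their decision regions."""
    models = {
        "SVC with linear kernel": svm.SVC(kernel="linear", C=C),
        "LinearSVC (linear kernel)": svm.LinearSVC(C=C, max_iter=10000),
        "SVC with RBF kernel": svm.SVC(kernel="rbf", gamma=0.7, C=C),
        "SVC with polynomial (degree 3) kernel": svm.SVC(kernel="poly", degree=3, C=C),
    }
    # Grid covering the data with a margin of 1 unit on each side
    xx, yy = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, h),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, h),
    )
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, (title, clf) in zip(axes.ravel(), models.items()):
        clf.fit(X, y)
        # Predicted class for every grid point, reshaped back to the grid
        Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
        ax.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
        ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm, edgecolors="k")
        ax.set_xlabel(xlabel)
        ax.set_ylabel(ylabel)
        ax.set_title(title)
    fig.tight_layout()
    return fig


iris = datasets.load_iris()
fig = plot_svm_kernels(iris.data[:, 2:], iris.target, "Petal length", "Petal width")
plt.show()
```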

If we run the above code, we will get the below graph, from which we can see how well the different kernel SVM classifiers are modeled.

Svm Classifier with Iris Petal features

This is how the modeled SVM classifiers look when we use only the petal width and length. With this, we have almost come to the end. Before ending the post, let’s quickly look at how to use the modeled SVM classifier to predict iris flower categories.

Predicting iris flower category

To identify the iris flower type using the modeled SVM classifier, we need to call the predict function on the fitted model. For example, to predict the iris flower category using the lin_svc model, we call lin_svc.predict() with the features. In our case, these features are the sepal length and width or the petal length and width, matching whatever features the model was trained on. If you are not clear on how to use the predict function, check the knn classifier with scikit-learn post.
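A minimal sketch of a prediction, assuming `lin_svc` was trained on the two petal features. The measurements in `new_flowers` are made-up values for illustration, and the hyperparameters are assumptions as before.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, 2:]  # petal length and petal width
y = iris.target

lin_svc = svm.LinearSVC(C=1.0, max_iter=10000).fit(X, y)

# Hypothetical flowers: [petal length, petal width] in cm
new_flowers = [[1.4, 0.2], [5.5, 2.0]]

# predict() expects a 2D array-like: one row per flower, one column per feature
predicted = lin_svc.predict(new_flowers)

# Map the predicted class indices back to species names
for features, class_index in zip(new_flowers, predicted):
    print(features, "->", iris.target_names[class_index])
```

The input must have the same number and order of features as the training data, so a model trained on sepal features needs sepal measurements instead.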

Conclusion

In this article, we learned how to model the support vector machine classifier using different kernels with the Python scikit-learn package. In the process, we learned how to visualize the data points and how to visualize the modeled SVM classifiers to understand how well the fitted models fit the training dataset.

I hope you like this post. If you have any questions, then feel free to comment below.  If you want me to write on one particular topic, then do tell it to me in the comments below.

4 Responses to “Support vector machine (Svm classifier) implementation in python with Scikit-learn”

  • i tried to use this code but i got an error message “syntax error: invalid syntax”

    def visuvalize_sepal_data():
    iris = datasets.load_iris()
    X = iris.data[:, :2] # we only take the first two features.
    y = iris.target
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
    plt.xlabel(‘Sepal length’)
    plt.ylabel(‘Sepal width’)
    plt.title(‘Sepal Width & Length’)
    plt.show()

    visuvalize_sepal_data()

    • Hi Abdul,

      When you copied the code from the article and pasted it into your system, the code indentation changed and the comment in the code was uncommented, which leads to the syntax error. Please use the below code.


      import matplotlib.pyplot as plt
      from sklearn import datasets

      def visuvalize_sepal_data():
          iris = datasets.load_iris()
          X = iris.data[:, :2]  # we only take the first two features.
          y = iris.target
          plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
          plt.xlabel('Sepal length')
          plt.ylabel('Sepal width')
          plt.title('Sepal Width & Length')
          plt.show()

      visuvalize_sepal_data()

      When you copy the above code, please check the indentation. Let me know if you still face the same issue.
