Blog Posts:

1. Step-by-step Kaggle competition tutorial:

In this article we are going to see how to go through a Kaggle competition step by step. The contest explored here is the San Francisco Crime Classification contest. The goal is to classify a crime occurrence knowing the time and place it happened.

Read Complete Post: datanice Blog

2. Introduction to machine learning:

Machine learning is an attempt to make things more intelligent. Most of us have come across terms like “Artificial Neural Networks”, which attempt to replicate the working of the human brain. Even something like this is not necessarily complex. At its heart, it is just multiplication and differentiation. Yes, maths is at it again, but it is no different from what you learned at school. (This is coming from a guy who is petrified of maths.)

Read Complete Post:

3. Baidu research chief Andrew Ng fixed on self-taught computers, self-driving cars:

Artificial-intelligence whiz Andrew Ng hangs his hat these days at a nondescript building in Sunnyvale that serves as the Silicon Valley outpost of the Chinese search giant Baidu.

Read Complete Post: Seattle Times

4. Misleading modelling: overfitting, cross-validation, and the bias-variance trade-off:

In this post you will get to grips with what is perhaps the most essential concept in machine learning: the bias-variance trade-off. The main idea here is that you want to create models that are as good at prediction as possible but that are still applicable to new data (i.e. they are generalizable). The danger is that you can easily create models that overfit to the local noise in your specific dataset, which isn’t too helpful and leads to poor generalizability since the noise is random and therefore different in each dataset. Essentially, you want to create models that capture only the useful components of a dataset. On the other hand, models that generalize very well but are too inflexible to generate good predictions are the other extreme you want to avoid (this is called underfitting).
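The trade-off described above can be seen in a few lines of Python. The following is only a minimal sketch and is not taken from the linked post: the synthetic sine data, the noise level, and the choice of a k-nearest-neighbours regressor are all assumptions made for illustration. With k = 1 the model memorises the training set (overfitting); with k equal to the size of the training set it always predicts the overall mean (underfitting).

```python
import math
import random


def make_data(n, rng):
    """Draw n noisy samples of sin(2*pi*x) with x in [0, 1). Synthetic data."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(60, rng)
test_x, test_y = make_data(60, rng)  # fresh data the model has never seen

# k=1 is the most flexible model, k=60 the least flexible one.
results = {}
for k in (1, 5, 60):
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_error, test_error)
    print(f"k={k:2d}  train MSE={train_error:.3f}  test MSE={test_error:.3f}")
```

For k = 1 the training error is zero because every point is its own nearest neighbour, yet the error on fresh data is not, which is the gap that cross-validation is meant to expose.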

Read Complete Post: Cambridge coding

5. Association rules and the Apriori algorithm:

When we go grocery shopping, we often have a standard list of things to buy. Each shopper has a distinctive list, depending on one’s needs and preferences. A housewife might buy healthy ingredients for a family dinner, while a bachelor might buy beer and chips. Understanding these buying patterns can help to increase sales in several ways.
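As a rough illustration of how such buying patterns are usually quantified, the sketch below computes the two standard measures behind association rules, support and confidence. The shopping baskets are made up for this example, and the code shows only the measures, not the Apriori algorithm itself or anything from the linked post.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"vegetables", "chicken"},
    {"beer", "vegetables"},
    {"chips", "salsa"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(itemset <= basket for basket in transactions)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing antecedent, the share that also contain consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print("support(beer)        =", support({"beer"}, transactions))
print("support(beer, chips) =", support({"beer", "chips"}, transactions))
print("confidence(beer -> chips) =",
      confidence({"beer"}, {"chips"}, transactions))
```

Here beer appears in three of the five baskets and beer with chips in two, so the rule "beer implies chips" holds in two out of three beer baskets.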

Read Complete Post: Annalyzin

6. Churn prediction with PySpark using MLlib and ML packages:

Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a subscription to a service. Though originally used within the telecommunications industry, it has become common practice across banks, ISPs, insurance firms, and other verticals.

The prediction process is heavily data driven and often utilizes advanced machine learning techniques. In this post, we’ll take a look at what types of customer data are typically used, do some preliminary analysis of the data, and generate churn prediction models – all with PySpark and its machine learning frameworks. We’ll also discuss the differences between two Apache Spark version 1.6.0 frameworks, MLlib and ML.

Read Complete Post: mapr

7. Data scientist keeps ranking at the top of every best jobs list:

Data scientist is at, or near, the top of just about every “best jobs” survey, report, or study released in the past few years. Harvard Business Review named it the sexiest job of the 21st century. And with a median base salary of $96,000, data scientist and some engineering specialties are in a very small group of high-paying jobs that don’t require a medical or law degree. However, you’ll likely need more than a bachelor’s degree, as you’ll find out later in this article.

Read Complete Post: goodcall

8. Understand your machine learning data with descriptive statistics in Python:

Looking at the raw data can reveal insights that you cannot get any other way. It can also plant seeds that may later grow into ideas on how to better preprocess and handle the data for machine learning tasks.
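A first look at a numeric column might be sketched as follows. This uses only Python's standard `statistics` module, and the `ages` values are invented for the example; the linked post may well use a different library and dataset.

```python
import statistics

# Hypothetical values for a single numeric column.
ages = [22, 25, 31, 35, 35, 41, 58]

# A handful of descriptive statistics for a quick first impression.
summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation
    "min": min(ages),
    "max": max(ages),
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

A mean that sits close to the median, as it does here, hints that the column is not heavily skewed, which is the kind of observation that can guide later preprocessing.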

Read Complete Post: machinelearningmastery

9. Time series interventions and contribution:

This article illustrates the principles of an analysis of President George W. Bush’s job approval from January 2001 through September 2004, with disposable income excluded from the statistical model. To see a version complete with code and its description, visit

Presidents with a job approval rating of less than 50 percent are unlikely to be re-elected. During June, Bush’s job approval rating averaged 47 percent in five major polls.

Read Complete Post: gladwinanalytics

10. Are deep neural networks creative? Deep learning art:

Are deep neural networks creative? It seems like a reasonable question. Google’s “Inceptionism” technique transforms images, iteratively modifying them to enhance the activation of specific neurons in a deep net. The images appear trippy, transforming rocks into buildings or leaves into insects. Another neural generative model, introduced by Leon Gatys of the University of Tübingen in Germany, can extract the style from one image (say a painting by Van Gogh), and apply it to the content of another image (say a photograph).

Read Complete Post: kdnuggets

Video Courses:

  1. Introduction to Python for data science
  2. Data exploration with Kaggle scripts

That’s all for the April 2016 newsletter. Please leave your suggestions on the newsletter in the comment section. To get all dataaspirant newsletters, you can visit the monthly newsletter page.
