14 Most popular August 2016 data science articles to read


Hi, data science lovers. Many of you are probably waiting for the best data science articles for August 2016, covering categories such as data mining, machine learning, big data, and deep learning.

Best Data science Articles:

In this section, we present the most popular interviews with data scientists who have done great work in Kaggle competitions, along with solutions to Kaggle problems and the most popular data science articles to read.

Kaggle Problem solutions:

[1] XGBoost vs betting markets:

This notebook, presented by Anthony Goldbloom, is a learning guide to predicting horse races more accurately than the betting markets. It covers feature extraction, popularly known as feature engineering, and presents the basic intuition behind XGBoost, one of the most widely used machine learning models on Kaggle. Read the complete post: XGBoost Betting markets.

Kaggle winners Interviews:

[1] Kaggle to Google DeepMind:

Kaggle to Google DeepMind is an interview with Sander Dieleman, who won the gold medal in the Galaxy Zoo competition and, with his team, took first place in Kaggle’s first Data Science Bowl competition.

As a research scientist at Google DeepMind, Sander applies the practical experience he acquired training convolutional neural networks on Kaggle. His work at DeepMind has included training policy networks as part of the AlphaGo project.

His advice to aspiring data scientists is to apply what you’ve learned in books and courses to build intuition about different approaches to solving real-world problems. Read the complete interview: Kaggle to Google DeepMind.

[2] Avito Duplicate Ads Detection, 1st Place Team Winners’ Interview:

The Avito Duplicate Ads Detection competition is a feature engineer’s dream. In this challenge, Kagglers had to accurately detect duplicitous duplicate ads in a dataset that included 10 million images and Russian language text.

In this first place team winners’ interview, Stanislav Semenov and Dmitrii Tsybulevskii describe how their single XGBoost model scored among the top three and how their ensemble snagged them first place. Read the complete interview: Avito Duplicate Ads Detection, 1st Place Team Winners’ Interview.

[3] Avito Duplicate Ads Detection, 2nd Place Team Winners’ Interview:

TheQuants team, made up of Kagglers Mikel, Peter, Marios, & Sonny, came in second place by generating features independently and combining their work into a powerful solution.

In this interview, they describe the many features they used (including text and images, location, price, JSON attributes, and clustered rows) as well as those that ended up in the “feature graveyard.” In the end, a total of 587 features were inputs to 14 models which were ensembled through the weighted rank average of random forest and XGBoost models. Read the complete Interview Avito Duplicate ads detection 2nd place team winners Interview.
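The interview summary does not give the team's code, so the following is only a minimal sketch of what a weighted rank average can look like. The helper names, scores, and weights are hypothetical, chosen for illustration, and are not TheQuants' actual values. Each model's scores are converted to ranks (ties share their average rank), the ranks are scaled by the number of rows, and the scaled ranks are combined using per-model weights.

```python
def average_ranks(scores):
    """Return 1-based ranks of scores; tied values share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start..end, 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def weighted_rank_average(predictions, weights):
    """Blend several models' scores by averaging their scaled ranks."""
    n = len(predictions[0])
    total = sum(weights)
    blended = [0.0] * n
    for scores, weight in zip(predictions, weights):
        for i, rank in enumerate(average_ranks(scores)):
            blended[i] += weight * (rank / n) / total
    return blended


# Hypothetical scores from two models for the same four ad pairs.
rf_scores = [0.10, 0.80, 0.40, 0.95]
xgb_scores = [0.20, 0.70, 0.90, 0.60]

# Assumed weights: the second model counts three times as much as the first.
blend = weighted_rank_average([rf_scores, xgb_scores], weights=[1.0, 3.0])
print(blend)
```

Averaging ranks rather than raw scores makes the blend insensitive to each model's score scale, which is one reason this approach is often used when the evaluation metric depends only on the ordering of predictions.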

Data Science Articles links:

[1] Building powerful image classification models using Keras.

[2] 50 external machine learning data science resources and articles.

[3] Must-read books about Apache Spark and Scala.

[4] Machine learning with python.

[5] Beginner’s Guide To Understanding Convolutional Neural Networks.

[6] 40 Techniques used by data scientists

[7] SparkR learning guide for beginners.

[8] Deep-learning artificial brain cybersecurity.

[9] Data science interviews.

[10] SparkSQL and Tableau.


[1] How Convolutional Neural Networks work


That’s all for the August 2016 newsletter. Please leave your suggestions about the newsletter in the comment section. To get all dataaspirant newsletters, visit the monthly newsletter page.
