Ensembles (3): Gradient Boosting  Gradient boosting ensemble technique for regression.  20Oct2015
Logistic Regression 6.1  Classification  [ Machine Learning ] By Andrew Ng  Complete Playlist: https://www.youtube.com/watch?v=1ZzckGxT76g&list=PLLH73N9cB21V_O2JqILVX557BST2cqJw4 For any query you can comment it! We try ...  13Aug2015
  
lda
Topic modeling and LDA.mpeg  Topic modeling, LDA.  21Oct2014
Latent Dirichlet allocation  Wikipedia, the free encyclopedia  Latent Dirichlet allocation: In natural language processing, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics. LDA is an example of a topic model and was first presented as a graphical model for topic discovery by David Blei, Andrew Ng, and Michael Jordan in 2003.[1]  21Oct2014
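The generative story in the description above can be sketched in a few lines. This is only an illustration of how a document is sampled under the model, not an inference procedure; the topic names, word probabilities, and the `alpha` values are made-up assumptions.

```python
import random

# Hypothetical topics: each maps words to probabilities (illustrative values only).
TOPICS = [
    {"gene": 0.5, "dna": 0.3, "cell": 0.2},      # a "biology" topic
    {"stock": 0.6, "market": 0.3, "bank": 0.1},  # a "finance" topic
]

def sample_dirichlet(alpha, rng):
    """Draw topic proportions from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, n_words, rng):
    """Sample one document: pick a topic for each word, then a word from that topic."""
    theta = sample_dirichlet(alpha, rng)  # the document's mixture over topics
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[z].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], n_words=8, rng=rng)
print(theta, words)
```

Fitting LDA to real text runs this process in reverse: it infers the topics and the per-document mixtures from the observed words.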

Alpine Blog   Multinomial Logistic Regression With Apache Spark  14Oct2014 
How Random Forest algorithm works  In this video I explain very briefly how the Random Forest algorithm works, with a simple example composed of 4 decision trees.  08Sep2014
Lecture 23 Decision trees and neural nets  CS188 Artificial Intelligence, UC Berkeley, Fall 2013. Lecture 23: Decision trees and neural nets. Instructor: Prof. Pieter Abbeel.  07Sep2014
Matrix decomposition  Wikipedia, the free encyclopedia   Matrix decomposition : In the mathematical discipline of linear algebra, a matrix decomposition or matrix factorization is a factorization of a matrix into a product of matrices. There are many different matrix decompositions; each finds use among a particular class of problems.  03Sep2014 
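As one concrete instance of "a factorization of a matrix into a product of matrices", here is a minimal sketch of LU decomposition (Doolittle form, without pivoting). It assumes a square matrix whose pivots are all non-zero; the function names and the example matrix are illustrative.

```python
def lu_decompose(a):
    """Factor square matrix `a` into unit-lower-triangular L and upper-triangular U."""
    n = len(a)
    lower = [[float(i == j) for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Fill row i of U.
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        # Fill column i of L (assumes the pivot upper[i][i] is non-zero).
        for j in range(i + 1, n):
            lower[j][i] = (a[j][i] - sum(lower[j][k] * upper[k][i] for k in range(i))) / upper[i][i]
    return lower, upper

def matmul(x, y):
    """Plain triple-loop matrix product, used here to check that L * U == A."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

A = [[4.0, 3.0], [6.0, 3.0]]
L, U = lu_decompose(A)
print(L, U, matmul(L, U))
```

Other decompositions (QR, Cholesky, SVD and so on) follow the same idea with different constraints on the factors.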
 Support vector machine  Wikipedia, the free encyclopedia   Support vector machine: In machine learning, support vector machines (SVMs, also support-vector networks[1]) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.  03Sep2014 
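The "which side of the gap" rule can be sketched for a linear SVM that has already been trained. The weights `W` and bias `B` below are hypothetical values chosen for illustration; no training is performed here.

```python
import math

# Hypothetical parameters of an already-trained linear SVM (assumed, not from any source).
W = (1.0, 1.0)
B = -3.0

def decision_value(x, w=W, b=B):
    """Signed score w.x + b; its sign says which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w=W, b=B):
    """Assign the category +1 or -1 from the sign of the decision value."""
    return 1 if decision_value(x, w, b) >= 0 else -1

def margin_width(w=W):
    """Width of the gap between the two margin hyperplanes: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(predict((0.0, 0.0)), predict((3.0, 3.0)), margin_width())
```

Training an SVM amounts to choosing `W` and `B` so that this margin is as wide as possible while the training examples stay on their correct sides.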
 A Gentle Introduction To Machine Learning; SciPy 2013 Presentation Authors: Kastner, Kyle, Southwest Research Institute Track: Machine Learning This talk will be an introduction to the root concepts of machine learning, star... 03Sep2014 
Frequent Pattern Mining  Apriori Algorithm  Here's a step-by-step tutorial on how to run the Apriori algorithm to get the frequent item sets. Recorded this when I took Data Mining course in Northeastern Un...  22Aug2014
Market Basket Analysis with Mahout  DATASCIENCE HACKS  Also known as Affinity Analysis/Frequent Pattern Mining: Finding patterns in huge amounts of customer transactional data is called market basket analysis. This is useful where a store's transactional data is readily available. Using market basket analysis, one can find purchasing patterns. Market basket analysis is also called association rule mining (actually it's the other way around) or affinity?  22Aug2014
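A minimal sketch of how Apriori finds frequent item sets in transaction data follows. The baskets and the `min_support` threshold are made-up examples, and this is a plain in-memory version, not the Mahout implementation the post refers to.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset appearing in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: combine frequent sets into candidates of size k.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
    return result

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
itemsets = frequent_itemsets(baskets, min_support=3)
print(itemsets)
```

With a threshold of 3, the only frequent pair in these baskets is bread and milk; association rules would then be derived from such sets.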
 
  
 
08Apr2014 
06Apr2014 
06Apr2014  
 Slope One  Wikipedia, the free encyclopedia   Slope One: Slope One is a family of algorithms used for collaborative filtering, introduced in a 2005 paper by Daniel Lemire and Anna Maclachlan.[1] Arguably, it is the simplest form of non-trivial item-based collaborative filtering based on ratings. Their simplicity makes it especially easy to implement them efficiently while their accuracy is often on par with more complicated and computationally expensive algorithms.[1][2] They have also been used as building blocks to improve other algorithms.[3][4][5][6][7][8][9] They are part of major open-source libraries such as Apache Mahout and Easyrec.  07Mar2014 
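To show how simple the scheme is, here is a minimal sketch of weighted Slope One prediction. The user names and ratings are an illustrative toy data set, and the function name is an assumption rather than any library's API.

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` from average rating differences between items."""
    numerator = 0.0
    denominator = 0
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences (target - item) over users who rated both items.
        diffs = [r[target] - r[item] for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        deviation = sum(diffs) / len(diffs)
        # Weight each item's estimate by the number of co-rating users.
        numerator += (deviation + user_rating) * len(diffs)
        denominator += len(diffs)
    return numerator / denominator if denominator else None

# Toy ratings: user -> {item: rating}.
ratings = {
    "John": {"A": 5, "B": 3, "C": 2},
    "Mark": {"A": 3, "B": 4},
    "Lucy": {"B": 2, "C": 5},
}
prediction = slope_one_predict(ratings, "Lucy", "A")
print(prediction)  # 13/3, about 4.33
```

Here item A is rated 0.5 higher than B on average (two users) and 3 higher than C (one user), so Lucy's estimate is ((0.5 + 2) * 2 + (3 + 5) * 1) / 3.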
 
07Mar2014 
07Mar2014
Kick Start Hadoop: Mahout Recommendations with Data Sets containing Alpha Numeric Item Ids  Kick Start Hadoop: This blog is intended to give budding MapReduce developers a start in developing Hadoop-based applications. It covers some development tips and tricks on Hadoop MapReduce programming, tools that use MapReduce under the hood, and some practical applications of Hadoop using these tools. Most of the code samples provided here are tested in a Hadoop environment, but do post to me if you find any that do not work.  07Mar2014

  
  
 CaltechX: CS1156x: Learning From Data  edX   Learning From Data: This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to automatically learn how to perform a desired task based on information extracted from the data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures follow each other in a story-like fashion:  15Sep2013 
