Ensembles (3): Gradient Boosting
Gradient boosting ensemble technique for regression
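As a rough illustration of gradient boosting for regression, the sketch below fits one-split "stumps" to the current residuals under squared-error loss and adds each one with a small learning rate. The function names, the single 1-D feature, and the default settings are assumptions made for this example and are not taken from the linked video.

```python
def fit_stump(xs, residuals):
    """Find the threshold on a 1-D feature that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return t, lm, rm


def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)
```

Calling gradient_boost(xs, ys) on a small list of inputs and targets returns a model tuple that predict(model, x) can evaluate. Library implementations use deeper trees and other loss functions.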
6.1 - Classification - [ Machine Learning ] By Andrew Ng
Complete Playlist: https://www.youtube.com/watch?v=1ZzckGxT76g&list=PLLH73N9cB21V_O2JqILVX557BST2cqJw4 For any queries, leave a comment! We try ...
Topic modeling and LDA.mpeg
topic modeling, LDA.
|Latent Dirichlet allocation - Wikipedia, the free encyclopedia|
Latent Dirichlet allocation : In natural language processing, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics. LDA is an example of a topic model and was first presented as a graphical model for topic discovery by David Blei, Andrew Ng, and Michael Jordan in 2003.
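The generative story described above (each document is a mixture of topics, and each word is attributed to one topic) can be sketched roughly as follows. This only samples documents from given topics; it does not do the inference that learns topics from data. The function names, the symmetric prior alpha, and the toy topics in the usage note are assumptions for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Sample one document.

    topics: list of dicts mapping word -> probability (one dict per topic).
    Returns the document's topic mixture and a list of (topic_index, word).
    """
    # Per-document topic proportions.
    theta = sample_dirichlet([alpha] * len(topics), rng)
    doc = []
    for _ in range(n_words):
        # Pick a topic for this word, then pick the word from that topic.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        words = list(topics[z])
        w = rng.choices(words, weights=[topics[z][v] for v in words])[0]
        doc.append((z, w))
    return theta, doc
```

For example, with two hypothetical topics such as {"goal": 0.5, "team": 0.5} and {"vote": 0.6, "party": 0.4}, generate_document(topics, 0.5, 20, random.Random(0)) returns a topic mixture and 20 (topic, word) pairs.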
Multinomial Logistic Regression With Apache Spark
How Random Forest algorithm works
In this video I explain very briefly how the Random Forest algorithm works, using a simple example composed of 4 decision trees.
Lecture 23 Decision trees and neural nets
CS188 Artificial Intelligence UC Berkeley, Fall 2013 Lecture 23 Decision trees and neural nets Instructor: Prof. Pieter Abbeel.
|Matrix decomposition - Wikipedia, the free encyclopedia|
Matrix decomposition : In the mathematical discipline of linear algebra, a matrix decomposition or matrix factorization is a factorization of a matrix into a product of matrices. There are many different matrix decompositions; each finds use among a particular class of problems.
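As one concrete instance of factoring a matrix into a product of matrices, here is a minimal sketch of LU decomposition (Doolittle form, no row pivoting), which writes a square matrix A as L times U. The function names are chosen for this example; production code would normally use a library routine with pivoting.

```python
def lu_decompose(a):
    """Factor a square matrix into unit-lower-triangular L and upper-triangular U."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(v) for v in row] for row in a]
    for k in range(n):
        if upper[k][k] == 0:
            raise ValueError("zero pivot; this sketch does no row pivoting")
        for i in range(k + 1, n):
            # Multiplier that eliminates the entry below the pivot.
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


def matmul(x, y):
    """Plain matrix product, used here to check that L times U rebuilds A."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]
```

For A = [[4, 3], [6, 3]] this gives L = [[1, 0], [1.5, 1]] and U = [[4, 3], [0, -1.5]], and matmul(L, U) reproduces A.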
|Support vector machine - Wikipedia, the free encyclopedia|
Support vector machine : In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
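To make the "widest gap" idea concrete, the sketch below trains a linear classifier by sub-gradient descent on the regularized hinge loss, which is one simple way to approximate a linear SVM. It is not how dedicated SVM solvers work internally, and the function names, learning rate, regularization strength, and epoch count are assumptions chosen for this toy example.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Sub-gradient descent on hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: move toward the example.
                w = [wi + learning_rate * (y * xi - reg * wi)
                     for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Correct with enough margin: only apply the shrinkage term.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b


def classify(model, x):
    """Predict a category from which side of the separating line x falls on."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

On two well-separated clusters, for example points near (0, 0) labelled -1 and points near (3, 3) labelled +1, classify(model, x) assigns a new point to the cluster on its side of the learned boundary.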
A Gentle Introduction To Machine Learning; SciPy 2013 Presentation
Authors: Kastner, Kyle, Southwest Research Institute Track: Machine Learning This talk will be an introduction to the root concepts of machine learning, star...
Frequent Pattern Mining - Apriori Algorithm
Here's a step-by-step tutorial on how to run the Apriori algorithm to get the frequent itemsets. Recorded this when I took a Data Mining course at Northeastern Un...
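A minimal sketch of Apriori-style frequent itemset mining, for readers who want to see the idea in code: count candidate itemsets level by level, and only build larger candidates from itemsets that are already frequent. The function name, the use of an absolute count for min_support, and the shopping-basket data in the usage note are assumptions for this example, not taken from the tutorial.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge pairs of frequent itemsets into size-k candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent
```

With five hypothetical baskets such as {bread, milk}, {bread, diapers, beer}, {milk, diapers, beer}, {bread, milk, diapers}, {bread, milk, beer} and min_support=3, the only frequent pair is {bread, milk}.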
|Market Basket Analysis with Mahout | DATASCIENCE HACKS|
Also known as Affinity Analysis/Frequent Pattern Mining: Finding patterns in huge amounts of customer transactional data is called market basket analysis. This is useful where a store's transactional data is readily available. Using market basket analysis, one can find purchasing patterns. Market basket analysis is also called association rule mining (actually it's the other way around) or affinity...
|Slope One - Wikipedia, the free encyclopedia|
Slope One : Slope One is a family of algorithms used for collaborative filtering, introduced in a 2005 paper by Daniel Lemire and Anna Maclachlan. Arguably, it is the simplest form of non-trivial item-based collaborative filtering based on ratings. Their simplicity makes it especially easy to implement them efficiently while their accuracy is often on par with more complicated and computationally expensive algorithms. They have also been used as building blocks to improve other algorithms. They are part of major open-source libraries such as Apache Mahout and Easyrec.
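The simplicity mentioned above can be seen in a short sketch of the weighted Slope One predictor: for each item the user has rated, take the average rating difference between the target item and that item across other users, add it to the user's rating, and weight by how many users rated both. The function name and the nested-dict layout of ratings are assumptions for this example.

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target`.

    ratings: {user_name: {item_name: rating}}.
    Returns None when no co-rated items are available.
    """
    numerator, denominator = 0.0, 0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences from everyone who rated both items.
        diffs = [r[target] - r[item] for r in ratings.values()
                 if target in r and item in r]
        if diffs:
            avg_diff = sum(diffs) / len(diffs)
            numerator += (rating + avg_diff) * len(diffs)
            denominator += len(diffs)
    return numerator / denominator if denominator else None
```

As a toy usage example with made-up ratings {"john": {"A": 5, "B": 3, "C": 2}, "mark": {"A": 3, "B": 4}, "lucy": {"B": 2, "C": 5}}, calling slope_one_predict(ratings, "lucy", "A") combines the B-based estimate 2.5 (weight 2) with the C-based estimate 8 (weight 1) to give 13/3, about 4.33.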
|Kick Start Hadoop: Mahout Recommendations with Data Sets containing Alpha Numeric Item Ids|
Kick Start Hadoop : This blog is intended to give budding MapReduce developers a start in developing Hadoop-based applications. It covers development tips and tricks for Hadoop MapReduce programming, tools that use MapReduce under the hood, and some practical applications of Hadoop using these tools. Most of the code samples provided here have been tested in a Hadoop environment, but do let me know if you find any that do not work.
|CaltechX: CS1156x: Learning From Data | edX|
Learning From Data : This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to automatically learn how to perform a desired task based on information extracted from the data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures follow each other in a story-like fashion: