Algorithm Series: Classification
The Jupyter notebook and dataset for this topic are available at ( https://github.com/kirankbee/ads-Python/blob/master/guassian-normal.ipynb ). I recommend reading the explanation below; I am sure it is worth your time.
- Bayesian classifiers are statistical classifiers that predict the probability that a given tuple belongs to a particular class. The Bayesian classifier is a supervised learning algorithm, with its more recent forms tracing back to work on pattern recognition (Duda & Hart, 1973). The method stores a probabilistic summary for each class; this summary contains the conditional probability of each attribute value given the class. In most machine learning projects the goal is to find patterns that relate algorithms and domain characteristics to behavior, and there are several ways to study this, such as systematic experimentation and theoretical analysis. Bayesian classifiers have been studied with many of these approaches, in varying depth, and come in different types such as Gaussian, Multinomial and Bernoulli. This article covers the different types of naive Bayes classification.
- Bayesian classifiers are based on the conditional probability theorem known as Bayes' theorem. Let X be a data tuple, considered as the evidence and described by measurements made on a set of n attributes, and let H be the hypothesis that X belongs to a class C. We then want to determine P(H|X), the probability that tuple X belongs to class C given its observed attribute values. Writing A for the hypothesis H and B for the evidence X, Bayes' theorem states
  P(A|B) = P(B|A) P(A) / P(B)
  Now let us apply the naive assumption, which is independence among the features, to obtain the Naive Bayes theorem. For practical purposes, if the features are independent of each other given the class, we can split the evidence B into independent parts B1, ..., Bn, giving rise to
  P(A|B1, ..., Bn) = P(B1|A) P(B2|A) ... P(Bn|A) P(A) / P(B1, ..., Bn)
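As a minimal sketch of the naive factorization described above, the snippet below multiplies a class prior by one conditional probability per feature and then normalizes over the classes. The function name `naive_bayes_posterior`, the class labels and every probability value are hypothetical, chosen only for illustration; they do not come from the notebook.

```python
from math import prod

def naive_bayes_posterior(priors, likelihoods, evidence):
    """Return P(class | evidence) for every class, assuming the features
    are independent of each other given the class.

    priors:      {class: P(class)}
    likelihoods: {class: [{value: P(feature_i = value | class)}, ...]}
    evidence:    observed feature values, one per feature
    """
    # Numerator of Bayes' theorem: P(class) * product of P(feature_i | class)
    scores = {
        c: priors[c] * prod(likelihoods[c][i][v] for i, v in enumerate(evidence))
        for c in priors
    }
    # Denominator: probability of the evidence, summed over all classes
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Hypothetical toy numbers: two classes, two yes/no features
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [{"yes": 0.8, "no": 0.2}, {"yes": 0.3, "no": 0.7}],
    "ham": [{"yes": 0.1, "no": 0.9}, {"yes": 0.5, "no": 0.5}],
}
posterior = naive_bayes_posterior(priors, likelihoods, ["yes", "no"])
print(posterior)
```

With these made-up numbers the unnormalized scores are 0.4 × 0.8 × 0.7 = 0.224 for "spam" and 0.6 × 0.1 × 0.5 = 0.03 for "ham", so "spam" receives the larger posterior (about 0.88).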
The term “B” in the above equation, the evidence, is what gives us the different Naive Bayes classifiers: the probability distribution assumed for the features determines the variant, as explained below.
- Gaussian Naive Bayes classifier – The Gaussian (also known as normal or Laplace-Gauss) distribution is a type of continuous probability distribution for a real-valued random variable. To understand the Gaussian distribution we need to look more closely at its probability density function (PDF), given below, assuming the data is normally distributed, that is, it follows a bell curve. For a more detailed treatment of the PDF, see the Wikipedia article ( https://en.wikipedia.org/wiki/Normal_distribution ), as discussing probability distributions is out of the scope of this session.
  f(x) = 1 / (σ √(2π)) · exp( −(x − μ)² / (2σ²) )
  where μ is the mean and σ is the standard deviation of the distribution.
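To make the Gaussian variant concrete, here is a minimal from-scratch sketch, assuming each feature is normally distributed within each class. It is not the code from the notebook: the helper names (`gaussian_pdf`, `gaussian_log_pdf`, `fit`, `predict`) and the toy data are hypothetical, and the small constant added to the standard deviation is an assumption made only to avoid division by zero.

```python
import math
from collections import defaultdict

def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gaussian_log_pdf(x, mean, std):
    """Logarithm of the density, computed directly to avoid underflow."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def fit(rows, labels):
    """Per class, store the prior and the (mean, std) of every feature."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, math.sqrt(var) + 1e-9))  # epsilon is an assumption
        model[label] = (len(members) / len(rows), stats)
    return model

def predict(model, row):
    """Pick the class maximizing log P(class) + sum of log P(x_i | class)."""
    def log_score(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            gaussian_log_pdf(x, m, s) for x, (m, s) in zip(row, stats)
        )
    return max(model, key=log_score)

# Hypothetical toy data: two numeric features, two classes
rows = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.0, 4.1), (3.2, 3.9), (2.8, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit(rows, labels)
print(predict(model, (1.1, 2.0)), predict(model, (3.1, 4.0)))
```

Working with log probabilities instead of multiplying raw densities is a common design choice here, since the product of many small densities can underflow to zero.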
By Kiran Balijepalli