Tuesday, August 14, 2018

Difference between Classification and Clustering in Machine Learning



| | Classification | Clustering |
|---|---|---|
| What is it? | Given a set of historical data labeled with class names and a set of new data, classification is the process of assigning each new sample a class name learned from the historical data. | Given a set of samples, clustering is the process of grouping the samples based on their similarities and the patterns in the data. |
| Type of input data | A data set with labels (class names) | Only the data set; no labeling is required |
| Input | A set of samples (data); a set of classes | A set of samples |
| Output | The class name of each new sample, based on the classes of the existing samples | A grouping of the samples based on patterns in the data; each sample is assigned a group (cluster) label |
| Number of classes | Known | Usually unknown in advance (it becomes known once the clustering process is over); some algorithms, such as k-means, need the number of groups chosen up front |
| Learning type | Supervised learning | Unsupervised learning |
| Algorithms | Decision trees, Bayesian classifier, SVM | K-means, expectation maximization |
| Dependencies | Depends on labeled training data | No labeled training data or prior knowledge of the classes is required |
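To make the comparison concrete, here is a minimal sketch in plain Python. The classifier is a simple nearest-centroid rule, used only as a stand-in (it is not one of the algorithms listed above), and the clustering part is a basic k-means. The toy data, the function names, and the choice to seed k-means with the first k samples are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# --- Classification (supervised): labels are part of the input ---
def fit_classifier(samples, labels):
    """Learn one centroid per class name from labeled historical data."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return {label: centroid(pts) for label, pts in by_class.items()}

def classify(model, new_sample):
    """Assign the class name whose centroid is nearest to new_sample."""
    names = list(model)
    return names[nearest(new_sample, [model[n] for n in names])]

# --- Clustering (unsupervised): only the samples are given ---
def k_means(samples, k, iterations=10):
    """Basic k-means; the first k samples seed the centers (a simplification)."""
    centers = list(samples[:k])
    assignment = []
    for _ in range(iterations):
        # Assign every sample to its nearest center, then recompute centers.
        assignment = [nearest(s, centers) for s in samples]
        for j in range(k):
            members = [s for s, a in zip(samples, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = centroid(members)
    return assignment

# Hypothetical toy data: two well-separated groups of 2-D points.
old_data = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.8, 8.2)]
old_labels = ["small", "small", "large", "large"]

model = fit_classifier(old_data, old_labels)
print(classify(model, (0.9, 1.1)))  # a class name taken from the old labels
print(k_means(old_data, k=2))       # group ids only, with no class names
```

Note the difference in what each call needs and returns: `fit_classifier` cannot run without `old_labels` and answers with one of those class names, while `k_means` sees only the samples and returns anonymous group ids whose meaning has to be interpreted afterwards.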




