| Classification | Clustering |
What is it? | Given a set of historical data labeled with class names and a set of new data, classification is the process of assigning each new data point a class name learned from the historical data. | Given a set of samples, clustering is the process of grouping the data based on the similarities and patterns in the data. |
Type of input data | Set of data with label/class name | Only the data set. No labeling is required. |
Input | | A set of samples |
Output | Class name of each new sample, based on the classes of the existing samples | Grouping of the samples based on patterns in the data. Each sample is assigned a group/label/class name. The number of classes can be known once clustering is done. |
Number of classes | The number of classes is known | The number of classes is unknown (it can be known after the clustering process is over) |
Learning type | Supervised learning | Unsupervised learning |
Algorithms | Decision trees, Bayesian classifier, SVM | K-means, Expectation maximization |
Dependencies | Depends on training data | No training data is required. No prior knowledge is required. |
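
To make the contrast in the table concrete, here is a small sketch in plain Python. It is not taken from the original answer: the helper names (`fit_nearest_centroid`, `classify`, `k_means`) and the toy 2-D points are made up for illustration, and a nearest-centroid rule stands in for the classifiers listed above because it fits in a few lines. The classifier needs labeled samples to learn from; the clustering function only gets the raw samples and hands back group indices.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def fit_nearest_centroid(samples, labels):
    """Supervised: learn one centroid per known class from labeled data."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    return {y: centroid(pts) for y, pts in by_class.items()}


def classify(model, x):
    """Give a new sample the class name of the closest learned centroid."""
    return min(model, key=lambda y: dist(model[y], x))


def k_means(samples, k, iterations=20):
    """Unsupervised: group unlabeled samples into k clusters (Lloyd's algorithm)."""
    centers = list(samples[:k])  # naive initialization: the first k samples
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: dist(centers[i], x))
            groups[nearest].append(x)
        # Keep the previous center if a cluster ends up empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    # Return the cluster index of each sample, in input order.
    return [min(range(k), key=lambda i: dist(centers[i], x)) for x in samples]


# Classification: historical data comes with class names.
labeled = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
names = ["small", "small", "large", "large"]
model = fit_nearest_centroid(labeled, names)
print(classify(model, (0.9, 1.1)))  # small

# Clustering: only the samples, no labels.
unlabeled = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8)]
print(k_means(unlabeled, k=2))  # [0, 1, 0, 1]
```

Note that the cluster indices `0` and `1` carry no meaning of their own; they only say which samples ended up together. Also, this k-means sketch takes `k` as a parameter chosen by the caller, so the number of groups is a choice made for the run rather than something read from labels.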
[src]: https://stackoverflow.com/questions/5064928/difference-between-classification-and-clustering-in-data-mining