Data Mining Cluster Analysis: Basic Concepts and Algorithms
Lecture Notes for Chapter 7
Introduction to Data Mining, 2nd Edition
by
Tan, Steinbach, Karpatne, Kumar
02/14/2018 Introduction to Data Mining, 2nd Edition
What is Cluster Analysis?
Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
Inter-cluster distances are maximized
Intra-cluster distances are minimized
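The two criteria above can be made concrete with a small sketch. It assumes Euclidean distance and uses mean pairwise distance as the summary, which is only one of several possible choices; the toy points and the function names are hypothetical and not taken from the slides.

```python
from itertools import combinations
from math import dist


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [dist(p, q) for pts in clusters for p, q in combinations(pts, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [dist(p, q)
             for a, b in combinations(clusters, 2)
             for p in a
             for q in b]
    return sum(pairs) / len(pairs)


# Hypothetical toy data: two tight groups that are far apart.
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]

intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print(f"mean intra-cluster distance: {intra:.3f}")
print(f"mean inter-cluster distance: {inter:.3f}")
```

For a grouping that matches the definition, the intra-cluster value should come out much smaller than the inter-cluster value.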
Applications of Cluster Analysis
Understanding
Group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations
Summarization
Reduce the size of large data sets
Figure: Clustering precipitation in Australia
What is not Cluster Analysis?
Simple segmentation
Dividing students into different registration groups alphabetically, by last name
Results of a query
Groupings are a result of an external specification
Clustering is a grouping of objects based on the data
Supervised classification
Objects already have class label information
Association Analysis
Local vs. global connections
Notion of a Cluster can be Ambiguous
How many clusters?
Figure panels: the same points grouped as Two Clusters, Four Clusters, and Six Clusters
Types of Clusterings
A clustering is a set of clusters
Important distinction between hierarchical and partitional sets of clusters
Partitional Clustering
A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset
Hierarchical clustering
A set of nested clusters organized as a hierarchical tree
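The distinction above can be sketched as two checks on sets of object names. This is a minimal illustration, not a clustering algorithm: the object names `p1` to `p4` and the helper functions `is_partition` and `is_nested` are hypothetical.

```python
def is_partition(clusters, objects):
    """True if every object appears in exactly one cluster."""
    seen = [obj for cluster in clusters for obj in cluster]
    return len(seen) == len(set(seen)) and set(seen) == set(objects)


def is_nested(clusters):
    """True if any two clusters are either disjoint or one contains the other."""
    sets = [set(c) for c in clusters]
    return all(a.isdisjoint(b) or a <= b or b <= a
               for i, a in enumerate(sets)
               for b in sets[i + 1:])


objects = ["p1", "p2", "p3", "p4"]

# Partitional: non-overlapping subsets that cover all objects.
partitional = [{"p1", "p2"}, {"p3", "p4"}]

# Hierarchical: clusters nested inside larger clusters, like a tree.
hierarchical = [
    {"p1"}, {"p2"}, {"p3"}, {"p4"},
    {"p1", "p2"}, {"p3", "p4"},
    {"p1", "p2", "p3", "p4"},
]

# Neither: two clusters share p2 without one containing the other.
overlapping = [{"p1", "p2"}, {"p2", "p3"}, {"p4"}]

print(is_partition(partitional, objects), is_nested(hierarchical))
print(is_partition(overlapping, objects), is_nested(overlapping))
```

Cutting the hierarchical tree at one level, for example keeping only the two middle clusters, yields a partitional clustering.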
Partitional Clustering
Figure panels: Original Points; A Partitional Clustering
Hierarchical Clustering
Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Hierarchical Clustering; Non-traditional Dendrogram
Other Distinctions Between Sets of Clusters
Exclusive versus non-exclusive
In non-exclusive clusterings, points may belong to multiple clusters.
Can represent multiple classes or ‘border’ points
Fuzzy versus non-fuzzy
In fuzzy clustering, a point belongs to every cluster with some weight between 0 and 1
Weights must sum to 1
Probabilistic clustering has similar characteristics
Partial versus complete
In some cases, we only want to cluster some of the data
Heterogeneous versus homogeneous
Clusters of widely different sizes, shapes, and densities
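The fuzzy case above, where each point carries a weight between 0 and 1 for every cluster and the weights sum to 1, can be illustrated with a short sketch. The weighting rule used here (normalized inverse squared Euclidean distance to each cluster center) is an assumption chosen for illustration, not a method stated in the slides, and the centers and points are hypothetical.

```python
from math import dist


def fuzzy_weights(point, centers):
    """Membership weights for one point: each in [0, 1], summing to 1."""
    squared = [dist(point, c) ** 2 for c in centers]
    # A point that sits exactly on a center belongs fully to that center.
    if 0.0 in squared:
        hits = [1.0 if d == 0.0 else 0.0 for d in squared]
        return [h / sum(hits) for h in hits]
    # Assumed rule: closer centers get proportionally larger weights.
    inverse = [1.0 / d for d in squared]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical cluster centers and query points.
centers = [(0.0, 0.0), (4.0, 0.0)]
for point in [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]:
    weights = fuzzy_weights(point, centers)
    print(point, [round(w, 3) for w in weights])
```

A non-fuzzy (exclusive) assignment is the special case in which one weight is 1 and all others are 0.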
Types of Clusters
Well-separated clusters