Answer all questions in the documents with no plagiarism.

Data Mining Cluster Analysis: Basic Concepts and Algorithms

Lecture Notes for Chapter 7

Introduction to Data Mining, 2nd Edition

by Tan, Steinbach, Karpatne, Kumar

02/14/2018 Introduction to Data Mining, 2nd Edition

What is Cluster Analysis?

Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups

Inter-cluster distances are maximized

Intra-cluster distances are minimized
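As a rough illustration of these two goals, the sketch below compares a sensible grouping of six points with a poor grouping of the same points. The toy coordinates and the function names are assumptions made for this example; they do not come from the lecture, and averaging pairwise Euclidean distances is only one of several ways to measure cluster quality.

```python
from itertools import combinations, product
from math import dist


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p, q in product(a, b)
    ]
    return sum(pairs) / len(pairs)


# Hypothetical 2-D toy data: two tight groups that lie far apart.
good = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
# The same six points, split so that each cluster mixes both groups.
bad = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

for name, clustering in (("good", good), ("bad", bad)):
    print(
        name,
        round(mean_intra_cluster_distance(clustering), 2),
        round(mean_inter_cluster_distance(clustering), 2),
    )
```

The good grouping should show the smaller intra-cluster average and the larger inter-cluster average, which is the pattern the two bullets describe.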

Applications of Cluster Analysis

Understanding

Group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations

Summarization

Reduce the size of large data sets

Clustering precipitation in Australia

What is not Cluster Analysis?

Simple segmentation

Dividing students into different registration groups alphabetically, by last name

Results of a query

Groupings are a result of an external specification

Clustering is a grouping of objects based on the data

Supervised classification

Class label information is available, whereas clustering works without labels

Association Analysis

Finds local connections among items, whereas clustering groups objects by global similarity
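The difference between an external specification and a grouping based on the data can be sketched in a few lines. Everything here is hypothetical: the student records, the `score` attribute, the boundary letter, and the "widest gap" rule, which stands in for a real clustering algorithm only to show that the split is derived from the values themselves.

```python
def alphabetical_groups(students, boundary="M"):
    """Simple segmentation: the split is fixed in advance by an external rule."""
    first = [s for s in students if s["last_name"][0].upper() < boundary]
    second = [s for s in students if s["last_name"][0].upper() >= boundary]
    return first, second


def largest_gap_groups(students, key):
    """Data-driven grouping: split the sorted values at their widest gap."""
    ordered = sorted(students, key=lambda s: s[key])
    gaps = [ordered[i + 1][key] - ordered[i][key] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Hypothetical records; the scores form two obvious groups.
students = [
    {"last_name": "Adams", "score": 91},
    {"last_name": "Baker", "score": 42},
    {"last_name": "Nguyen", "score": 88},
    {"last_name": "Young", "score": 45},
]

by_name = alphabetical_groups(students)
by_data = largest_gap_groups(students, "score")
print([[s["last_name"] for s in group] for group in by_name])
print([[s["last_name"] for s in group] for group in by_data])
```

The alphabetical split would stay the same whatever the scores were, while the gap-based split changes as soon as the data change.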

Notion of a Cluster can be Ambiguous

How many clusters?

Figure panels: Two Clusters; Four Clusters; Six Clusters

Types of Clusterings

A clustering is a set of clusters

Important distinction between hierarchical and partitional sets of clusters

Partitional Clustering

A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset

Hierarchical clustering

A set of nested clusters organized as a hierarchical tree
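The two definitions above can be made concrete with plain Python data structures. This is a minimal sketch under stated assumptions: the object names `p1` to `p4`, the nested-tuple encoding of the tree, and the helper names are all invented for illustration.

```python
def is_partition(clusters, objects):
    """True if every object appears in exactly one cluster."""
    members = [x for c in clusters for x in c]
    return len(members) == len(set(members)) and set(members) == set(objects)


def leaves(tree):
    """Collect the objects under a node of a nested-tuple hierarchy."""
    if not isinstance(tree, tuple):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def nested_clusters(tree):
    """List every cluster (set of leaves) in the hierarchy, root first."""
    if not isinstance(tree, tuple):
        return []
    result = [leaves(tree)]
    for child in tree:
        result.extend(nested_clusters(child))
    return result


objects = ["p1", "p2", "p3", "p4"]

# Partitional: non-overlapping subsets that together cover all objects.
partitional = [{"p1", "p2"}, {"p3"}, {"p4"}]
# Not a partition: p2 sits in two subsets.
overlapping = [{"p1", "p2"}, {"p2", "p3"}, {"p4"}]

# Hierarchical: each inner tuple is a cluster nested inside its parent.
hierarchy = ((("p1", "p2"), "p3"), "p4")

print(is_partition(partitional, objects))
print(is_partition(overlapping, objects))
print(nested_clusters(hierarchy))
```

Each cluster listed for the hierarchy is a subset of the one before it, which is what "nested" means here.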

Partitional Clustering

Figure panels: Original Points; A Partitional Clustering

Hierarchical Clustering

Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Hierarchical Clustering; Non-traditional Dendrogram

Other Distinctions Between Sets of Clusters

Exclusive versus non-exclusive

In non-exclusive clusterings, points may belong to multiple clusters.

Can represent multiple classes or ‘border’ points

Fuzzy versus non-fuzzy

In fuzzy clustering, a point belongs to every cluster with some weight between 0 and 1

For each point, the weights over all clusters must sum to 1

Probabilistic clustering has similar characteristics

Partial versus complete

In some cases, we only want to cluster some of the data

Heterogeneous versus homogeneous

In a heterogeneous clustering, the clusters have widely different sizes, shapes, and densities
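The fuzzy case in the list above can be sketched as follows. The weighting rule used here, normalized inverse distance to each cluster center, is an assumption chosen for brevity and is not necessarily the scheme the lecture has in mind; the centers, the points, and the `eps` guard against division by zero are likewise hypothetical.

```python
from math import dist


def fuzzy_weights(point, centers, eps=1e-12):
    """Membership weight of `point` in each cluster, via normalized inverse distance."""
    inverse = [1.0 / (dist(point, c) + eps) for c in centers]
    total = sum(inverse)
    return [w / total for w in inverse]


# Hypothetical cluster centers on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]

border = fuzzy_weights((2.0, 0.0), centers)      # equidistant from both centers
near_first = fuzzy_weights((1.0, 0.0), centers)  # closer to the first center

print(border, sum(border))
print(near_first, sum(near_first))
```

Every point receives a weight between 0 and 1 for every cluster, and the weights for a point sum to 1. A 'border' point gets nearly equal weights, whereas a non-fuzzy clustering would have to assign it to a single cluster.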

Types of Clusters

Well-separated clusters
