Task-Relevant Data

Types of Knowledge to Be Mined

Background Knowledge

Pattern Interestingness Measurements

Visualization of Discovered Patterns

Data Mining Operations Outline

A Data Mining Query Language (DMQL)

Syntax for DMQL

Syntax for Task-relevant Data Specification

Syntax for Specifying the Kind of Knowledge to be Mined

Syntax for Concept Hierarchy Specification

Syntax for Interestingness Measure Specification

Syntax for Pattern Presentation and Visualization Specification

Putting It All Together: the Full Specification of a DMQL Query

Task-Relevant Data

Database or data warehouse name

Database tables or data warehouse cubes

Condition for data selection

Relevant attributes or dimensions

Data grouping criteria

Types of Knowledge to Be Mined

Characterization

Discrimination

Association

Classification/prediction

Clustering

Outlier analysis

Other data mining tasks

Background Knowledge

Concept hierarchies
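
A concept hierarchy maps low-level values to more general concepts, which lets mining run at different levels of abstraction. The following is a minimal Python sketch of that idea, not DMQL syntax. The location hierarchy, the sales counts, and the names PARENT, generalize, and roll_up are all hypothetical and chosen only for illustration.

```python
# Hypothetical location hierarchy: each value maps to its parent concept.
PARENT = {
    "Vancouver": "British Columbia",
    "Victoria": "British Columbia",
    "Toronto": "Ontario",
    "British Columbia": "Canada",
    "Ontario": "Canada",
}


def generalize(value, levels=1):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in PARENT:
            break
        value = PARENT[value]
    return value


def roll_up(counts, levels=1):
    """Aggregate counts after generalizing each key by `levels` steps."""
    totals = {}
    for value, count in counts.items():
        key = generalize(value, levels)
        totals[key] = totals.get(key, 0) + count
    return totals


# Illustrative (made-up) sales counts at the city level.
city_sales = {"Vancouver": 120, "Victoria": 30, "Toronto": 200}
province_sales = roll_up(city_sales, levels=1)
country_sales = roll_up(city_sales, levels=2)
print(province_sales)
print(country_sales)
```

Storing only child-to-parent links keeps generalization a simple lookup. A drill-down operation would need the inverse mapping, which this sketch leaves out.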

Pattern Interestingness Measurements

Simplicity

Certainty

Utility

Novelty
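
For association rules, one common choice is to quantify certainty by confidence, utility by support, and simplicity by rule length. The Python sketch below computes all three for a single rule over a toy transaction set. The transactions and function names are hypothetical. Novelty is not covered, since it depends on comparing a rule against other discovered patterns.

```python
# Toy transactions (made-up data for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


antecedent, consequent = {"bread"}, {"milk"}

# Utility: how often the whole rule applies.
rule_support = support(antecedent | consequent, transactions)
# Certainty: how often the consequent holds when the antecedent does.
rule_confidence = confidence(antecedent, consequent, transactions)
# Simplicity: fewer items in the rule means a simpler pattern.
rule_length = len(antecedent) + len(consequent)

print(rule_support, rule_confidence, rule_length)
```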

Visualization of Discovered Patterns

Different backgrounds or purposes may require different forms of representation

Concept hierarchies are also important

Different kinds of knowledge require different representations

Data Mining Operations Outline

What is the motivation for an ad hoc mining process?

What defines a data mining task?

Can we define an ad-hoc mining language?

A Data Mining Query Language (DMQL)

Motivation

Design

Syntax for DMQL

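Syntax for specification of task-relevant data, the kind of knowledge to be mined, concept hierarchies, interestingness measures, and pattern presentation and visualization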

Putting it all together: a DMQL query

Syntax for Task-relevant Data Specification

use database database_name,

or use data warehouse data_warehouse_name

from relation(s)/cube(s) [where condition]

in relevance to att_or_dim_list

order by order_list

group by grouping_list

having condition
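
The clauses above can be assembled mechanically. The Python sketch below renders a task-relevant data specification from its parts. The TaskRelevantData class, its field names, and the database, relation, and attribute names in the example are all hypothetical. The sketch also assumes that only the use, from, and in relevance to clauses are required, which the syntax above does not state.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskRelevantData:
    """Hypothetical container for a task-relevant data specification."""
    source: str                   # database or data warehouse name
    relations: List[str]          # database tables or data warehouse cubes
    attributes: List[str]         # relevant attributes or dimensions
    is_warehouse: bool = False
    where: Optional[str] = None   # condition for data selection
    order_by: List[str] = field(default_factory=list)
    group_by: List[str] = field(default_factory=list)
    having: Optional[str] = None

    def render(self) -> str:
        """Emit the clauses in the order given by the syntax."""
        keyword = "use data warehouse" if self.is_warehouse else "use database"
        from_clause = "from " + ", ".join(self.relations)
        if self.where:
            from_clause += " where " + self.where
        lines = [
            f"{keyword} {self.source}",
            from_clause,
            "in relevance to " + ", ".join(self.attributes),
        ]
        # Assumption: the remaining clauses are optional and skipped if empty.
        if self.order_by:
            lines.append("order by " + ", ".join(self.order_by))
        if self.group_by:
            lines.append("group by " + ", ".join(self.group_by))
        if self.having:
            lines.append("having " + self.having)
        return "\n".join(lines)


# Made-up example values.
spec = TaskRelevantData(
    source="sales_db",
    relations=["customer", "purchases"],
    attributes=["age", "income", "item_type"],
    where="purchases.amount >= 100",
    group_by=["item_type"],
)
query_text = spec.render()
print(query_text)
```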

Syntax for Specifying the Kind of Knowledge to be Mined

Characterization

mine characteristics [as pattern_name] analyze measure(s)

Discrimination

Syntax for Specifying the Kind of Knowledge to be Mined

Association

mine associations [as pattern_name]

Syntax for Concept Hierarchy Specification

To specify what concept hierarchies to use

We use different syntax to define different types of hierarchies

Syntax for Interestingness Measure Specification

Interestingness measures and thresholds can be specified by the user with the statement:

Example:
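
In effect, a threshold attaches a minimum acceptable value to a measure, and patterns that fall below it are discarded. The following is a minimal Python sketch of that effect, not DMQL syntax. The rule records, measure names, and threshold values are hypothetical.

```python
# Hypothetical discovered rules with precomputed measures.
rules = [
    {"rule": "bread => milk", "support": 0.50, "confidence": 0.67},
    {"rule": "milk => butter", "support": 0.25, "confidence": 0.33},
    {"rule": "butter => bread", "support": 0.03, "confidence": 0.90},
]

# Illustrative user-chosen thresholds, one minimum per measure.
thresholds = {"support": 0.05, "confidence": 0.60}


def is_interesting(rule, thresholds):
    """Keep a rule only if every measure meets its minimum threshold."""
    return all(rule[name] >= minimum for name, minimum in thresholds.items())


interesting = [r["rule"] for r in rules if is_interesting(r, thresholds)]
print(interesting)
```

Requiring every threshold to hold is one possible design choice. A system could instead rank patterns by a combined score.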

Syntax for Pattern Presentation and Visualization Specification

DMQL provides syntax that lets users specify how discovered patterns are displayed, in one or more forms.

To facilitate interactive viewing at different concept levels, the following syntax is defined:

Putting It All Together: the Full Specification of a DMQL Query