FROM DATA WAREHOUSING TO DATA MINING

Data Warehouse Usage Three kinds of data warehouse applications Information processing supports querying, basic statistical analysis, and reporting usingcrosstabs, tables, charts and graphs Analytical processing multidimensional analysis of data warehouse data supports basic OLAP operations, slice-dice, drilling, pivoting Data mining knowledge discovery from hidden patterns supports associations, constructing analytical models,performing classification and prediction, and presenting themining results using visualization tools. Differences among the three tasks54From On-Line Analytical Processingto On Line Analytical Mining (OLAM) Why online analytical mining? High quality of data in data warehouses DW contains integrated, consistent, cleaned data Available information processing structure surrounding datawarehouses ODBC, OLEDB, Web accessing, service facilities, reporting andOLAP tools OLAP-based exploratory data analysis mining with drilling, dicing, pivoting, etc. On-line selection of data mining functions integration and swapping of multiple mining functions,algorithms, and tasks. Architecture of OLAM

Discovery-Driven Exploration of Data Cubes

- Hypothesis-driven: exploration by user, huge search space
- Discovery-driven (Sarawagi, et al. '98)
  - Effective navigation of large OLAP data cubes
  - Pre-compute measures indicating exceptions, to guide the user in the data analysis, at all levels of aggregation
  - Exception: a cell value significantly different from the value anticipated, based on a statistical model
  - Visual cues such as background color are used to reflect the degree of exception of each cell

Kinds of Exceptions and their Computation

- Parameters
  - SelfExp: surprise of a cell relative to other cells at the same level of aggregation
  - InExp: surprise beneath the cell
  - PathExp: surprise beneath the cell for each drill-down path
- Computation of exception indicators (model fitting and computing SelfExp, InExp, and PathExp values) can be overlapped with cube construction
- Exceptions themselves can be stored, indexed, and retrieved like precomputed aggregates

Examples: Discovery-Driven Data Cubes

Complex Aggregation at Multiple Granularities: Multi-Feature Cubes

- Multi-feature cubes (Ross, et al. 1998): compute complex queries involving multiple dependent aggregates at multiple granularities
- Ex.
Grouping by all subsets of item, region, and month, find the maximum price in 1997 for each group, and the total sales among all maximum-price tuples:

    select item, region, month, max(price), sum(R.sales)
    from purchases
    where year = 1997
    cube by item, region, month: R
    such that R.price = max(price)

- Continuing the last example, among the max-price tuples, find the min and max shelf life, and find the fraction of the total sales due to tuples that have min shelf life within the set of all max-price tuples

Cube-Gradient (Cubegrade)

- Analysis of changes of sophisticated measures in multi-dimensional spaces
  - Query: changes of average house price in Vancouver in '00 compared against '99
  - Answer: Apts in West went down 20%, houses in Metrotown went up 10%
- Cubegrade problem by Imielinski et al.
  - Changes in dimensions → changes in measures
  - Drill-down, roll-up, and mutation

From Cubegrade to Multi-dimensional Constrained Gradients in Data Cubes

- Significantly more expressive than association rules
  - Capture trends in user-specified measures
- Serious challenges
  - Many trivial cells in a cube → "significance constraint" to prune trivial cells
  - Enumerate pairs of cells → "probe constraint" to select a subset of cells to examine
  - Only interesting changes wanted → "gradient constraint" to capture significant changes

MD Constrained Gradient Mining

- Significance constraint Csig: (cnt ≥ 100)
- Probe constraint Cprb: (city = "Van", cust_grp = "busi", prod_grp = "*")
- Gradient constraint Cgrad(cg, cp): (avg_price(cg) / avg_price(cp) ≥ 1.3)
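
As a worked illustration of the three constraints, the sketch below evaluates them on a few hand-made cells. The thresholds (100 and 1.3) and the probe pattern come from the constraints above; the `Cell` class, the helper names, the product groups "PC" and "TV", and all counts and prices are hypothetical values chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """An aggregated cube cell; "*" marks a rolled-up dimension."""
    city: str
    cust_grp: str
    prod_grp: str
    cnt: int          # number of base tuples aggregated into the cell
    avg_price: float  # the measure whose change is of interest


def c_sig(cell, min_cnt=100):
    """Significance constraint: the cell must cover enough tuples."""
    return cell.cnt >= min_cnt


def c_prb(cell):
    """Probe constraint: the cell must match the user's probe pattern."""
    return (cell.city, cell.cust_grp, cell.prod_grp) == ("Van", "busi", "*")


def c_grad(cg, cp, min_ratio=1.3):
    """Gradient constraint: the measure of cg relative to probe cell cp."""
    return cg.avg_price / cp.avg_price >= min_ratio


# Hypothetical cells (values invented for this example).
probe = Cell("Van", "busi", "*", cnt=400, avg_price=1000.0)
candidate = Cell("Van", "busi", "PC", cnt=150, avg_price=1400.0)
too_small = Cell("Van", "busi", "TV", cnt=20, avg_price=2000.0)

is_probe = c_sig(probe) and c_prb(probe)
# 1400 / 1000 = 1.4, which meets the 1.3 threshold.
is_gradient = c_sig(candidate) and c_grad(candidate, probe)
# too_small has a large price ratio (2.0) but fails Csig, so it is pruned.
is_pruned = not c_sig(too_small)
```

Note that `too_small` would satisfy the gradient constraint on its own; it is the significance constraint that removes it, which is the pruning role that constraint is given above.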

A LiveSet-Driven Algorithm

- Compute probe cells using Csig and Cprb
  - The set of probe cells P is often very small
- Use probe P and constraints to find gradients
  - Pushing selection deeply
  - Set-oriented processing for probe cells
  - Iceberg growing from low to high dimensionalities
  - Dynamic pruning of probe cells during growth
  - Incorporating efficient iceberg cubing method
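
The two phases listed above (probe cells first, then gradient search against that small set) can be sketched naively as follows. This is not the LiveSet-driven algorithm itself: it materializes every cell by brute force and omits live sets, iceberg growing, and dynamic pruning. It also simplifies "related cells" to those differing from the probe cell in exactly one dimension (a one-step roll-up, drill-down, or mutation). The fact table, the lowered count threshold `MIN_CNT = 2`, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact table: (city, cust_grp, prod_grp, price)
ROWS = [
    ("Van", "busi", "PC", 1000.0),
    ("Van", "busi", "PC", 1200.0),
    ("Van", "busi", "TV", 2000.0),
    ("Van", "busi", "TV", 2200.0),
    ("Van", "home", "PC", 900.0),
    ("Tor", "busi", "PC", 1500.0),
]

MIN_CNT = 2      # stands in for cnt >= 100 on this tiny table (assumption)
MIN_RATIO = 1.3  # gradient threshold


def build_cube(rows):
    """Brute force: add each row to all 2^d cells that cover it."""
    acc = defaultdict(lambda: [0, 0.0])  # cell -> [count, price sum]
    for *dims, price in rows:
        for mask in product((False, True), repeat=len(dims)):
            cell = tuple("*" if m else v for m, v in zip(mask, dims))
            acc[cell][0] += 1
            acc[cell][1] += price
    return {cell: (cnt, total / cnt) for cell, (cnt, total) in acc.items()}


def is_probe(cell):
    """Probe constraint Cprb as a pattern on the cell's dimensions."""
    return cell == ("Van", "busi", "*")


def related(cg, cp):
    """Simplification: cg differs from cp in exactly one dimension."""
    return sum(a != b for a, b in zip(cg, cp)) == 1


def constrained_gradients(rows):
    cube = build_cube(rows)
    # Csig: keep only cells with enough supporting tuples.
    significant = {c: agg for c, agg in cube.items() if agg[0] >= MIN_CNT}
    # Phase 1: probe cells satisfy both Csig and Cprb.
    probes = [c for c in significant if is_probe(c)]
    # Phase 2: test significant related cells against each probe (Cgrad).
    result = []
    for cp in probes:
        avg_p = significant[cp][1]
        for cg, (_, avg_g) in significant.items():
            if related(cg, cp) and avg_g / avg_p >= MIN_RATIO:
                result.append((cg, cp))
    return result


pairs = constrained_gradients(ROWS)
# pairs == [(("Van", "busi", "TV"), ("Van", "busi", "*"))]
# The probe cell averages 1600 and the TV cell 2100, a ratio of 1.3125.
```

On this table only one probe cell exists, so the second phase scans the significant cells once, which is the motivation the slide gives for computing the probe set first.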
