Discovering Significant Patterns | Semantic Scholar (2024)

Topics

Pattern Discovery Technique, Spurious Patterns

223 Citations

Layered critical values: a powerful direct-adjustment approach to discovering significant patterns

Geoffrey I. Webb

Computer Science, Mathematics

Machine Learning

2008

The assignment of different critical values to different areas of the search space as an approach to alleviating this problem is investigated, using a variant of a technique originally developed for other purposes.

60
Highly Influenced

A tutorial on statistically sound pattern discovery

W. Hämäläinen, Geoffrey I. Webb

Computer Science, Mathematics

Data Mining and Knowledge Discovery

2018

This tutorial introduces the key statistical and data mining theory and techniques that underpin statistically sound pattern discovery research or practice, clarifies alternative interpretations of statistical dependence, and introduces appropriate tests for evaluating the statistical significance of patterns in different situations.

38
Highly Influenced

Finding the real patterns

Geoffrey I. Webb

Computer Science, Mathematics

PAKDD

2007

The problem of false discoveries is discussed, and techniques for avoiding them are presented.

Permutation Strategies for Mining Significant Sequential Patterns

Andrea Tonon, Fabio Vandin

Computer Science, Mathematics

2019 IEEE International Conference on Data Mining…

2019

The results of the experimental evaluation show that PROMISE is an efficient method that allows the discovery of statistically significant sequential patterns from transactional datasets while properly controlling for false discoveries.

Significant Pattern Mining on Continuous Variables

M. Sugiyama, K. Borgwardt

Computer Science, Mathematics

ArXiv

2017

This work solves the open problem of significant pattern mining on continuous variables by using Spearman's rank correlation coefficient to represent the frequency of a pattern and detects true patterns with higher precision and recall than competing methods that require a prior binarization of the data.

3
Highly Influenced

48 References

Discovering Predictive Association Rules

N. Megiddo, R. Srikant

Computer Science, Mathematics

KDD

1998

Empirical evaluation shows that on typical datasets the fraction of rules that may be false discoveries is very small, and a novel approach is presented for estimating the number of "false discoveries" at any cutoff level.

121
Highly Influential

Discovering significant rules

Geoffrey I. Webb

Computer Science

KDD '06

2006

Generic techniques are presented that allow definitions of true and false discoveries to be specified in terms of arbitrary statistical hypothesis tests and that provide strict control over the experimentwise risk of false discoveries.

105
Highly Influential

Pruning and summarizing the discovered associations

B. Liu, W. Hsu, Y. Ma

Computer Science

KDD '99

1999

The technique first prunes the discovered associations to remove the insignificant ones, and then finds a special subset of the unpruned associations, called the direction setting rules, to form a summary of the discovered associations.

Finding the Most Interesting Patterns in a Database Quickly by Using Sequential Sampling

T. Scheffer, S. Wrobel

Computer Science

J. Mach. Learn. Res.

2002

A sampling algorithm is presented that solves this problem by issuing a small number of database queries while guaranteeing precise bounds on the confidence and quality of solutions, and it is proved that there is no sampling algorithm for a popular class of utility functions that cannot be estimated with bounded error.

A Statistical Theory for Quantitative Association Rules

Y. Aumann, Yehuda Lindell

Computer Science, Mathematics

KDD '99

1999

A new definition of quantitative association rules based on statistical inference theory is introduced, reflecting the intuition that the goal of association rules is to find extraordinary and therefore interesting phenomena in databases.

Efficient mining of emerging patterns: discovering trends and differences

Guozhu Dong, Jinyan Li

Computer Science

KDD '99

1999

It is believed that emerging patterns (EPs) with low to medium support, such as 1%-20%, can give useful new insights and guidance to experts, even in “well understood” applications.

1,160

Frequent subgraph discovery

Michihiro Kuramochi, G. Karypis

Computer Science, Chemistry

Proceedings 2001 IEEE International Conference on…

2001

The empirical results show that the algorithm scales linearly with the number of input transactions and can discover frequent subgraphs from a set of graph transactions reasonably fast, even though it has to deal with computationally hard problems, such as canonical labeling of graphs and subgraph isomorphism, that do not arise in traditional frequent itemset discovery.

1,239

Finding association rules that trade support optimally against confidence

T. Scheffer

Computer Science

Intell. Data Anal.

2005

This work presents a fast algorithm that finds the n best rules which maximize the resulting criterion, and dynamically prunes redundant rules and parts of the hypothesis space that cannot contain better solutions than the best ones found so far.

Beyond market baskets: generalizing association rules to correlations

Sergey Brin, R. Motwani, Craig Silverstein

Computer Science, Business

SIGMOD '97

1997

This work develops the notion of mining rules that identify correlations (generalizing associations), and proposes measuring significance of associations via the chi-squared test for correlation from classical statistics, enabling the mining problem to reduce to the search for a border between correlated and uncorrelated itemsets in the lattice.

1,617

Constraint-Based Rule Mining in Large, Dense Databases

R. Bayardo, R. Agrawal, D. Gunopulos

Computer Science

Proceedings 15th International Conference on Data…

1999

A new algorithm is described that directly exploits all user-specified constraints, including minimum support, minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications.

665
Highly Influential
