
Thesis

English

ID: <10670/1.ojv2u0>

Capturing the temporal constraints of gradual patterns

Abstract

Gradual pattern mining extracts attribute correlations in the form of gradual rules such as “the more X, the more Y”. Such correlations help identify and isolate relationships among attributes that may not be obvious from a quick scan of a data set. For instance, a researcher may apply gradual pattern mining to determine which attributes of a data set exhibit unfamiliar correlations, in order to isolate them for deeper exploration or analysis. Assume the researcher has a data set with the following attributes: age, salary, number of children, and education level. An extracted gradual pattern may take the form “the lower the education level, the higher the salary”. Since this relationship is uncommon, the researcher may want to focus on this phenomenon in order to understand it.

As with many gradual pattern mining approaches, a key challenge is dealing with huge data sets, because of the problem of combinatorial explosion. This problem is largely caused by the process used to generate candidate gradual item sets. One way to improve this process is to optimize it with a heuristic approach. In this work, we propose an ant colony optimization technique, which uses a popular probabilistic approach that mimics the behavior of biological ants as they search for the shortest path to food in order to solve combinatorial problems. We apply the ant colony optimization technique to generate gradual item set candidates that have a high probability of being valid. Coupled with the anti-monotonicity property, this results in a highly efficient ant-based gradual pattern mining technique.

In our second contribution, we extend an existing gradual pattern mining technique to extract gradual patterns together with an approximated temporal lag between the affected gradual item sets. Such a pattern is referred to as a fuzzy-temporal gradual pattern, and it may take the form “the more X, the more Y, almost 3 months later”. Adding the temporal dimension makes the combinatorial explosion even worse, because of the added task of searching for the most relevant time gap.

In our third contribution, we propose a data crossing model that allows most gradual pattern mining algorithm implementations to be integrated into a Cloud platform. This contribution is motivated by the proliferation of IoT applications in almost every area of society, which brings large-scale time-series data from different sources. It may be interesting for a researcher to cross different time-series data sets with the aim of extracting temporal gradual patterns from the mapped attributes. For instance, a ‘humidity’ data set may be temporally crossed with an unrelated data set that records the ‘population of flies’, and a pattern may take the form “the higher the humidity, the higher the number of flies, almost 2 hours later”. The study also emphasizes integrating gradual pattern mining techniques into a Cloud platform because this makes them accessible on a subscription basis. This spares users installation and configuration hassles and therefore allows them to spend more time on the phenomena they are studying.
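
To make the notion of a gradual pattern with a time lag more concrete, here is a minimal Python sketch. It assumes one common pairwise definition of support (the fraction of row pairs that can be ordered so that every gradual item holds strictly), which may differ from the definition used in the thesis. The function names `gradual_support` and `lagged_rows` and the sample values are hypothetical. The fixed integer lag is a simplification: it does not reproduce the fuzzy approximation of the time gap or the ant colony candidate generation described above.

```python
from itertools import combinations


def gradual_support(rows, directions):
    """Fraction of row pairs that respect every gradual item.

    `directions` maps a column index to +1 ("the more") or -1 ("the less").
    This pairwise definition is an assumption made for illustration.
    """
    pairs = list(combinations(range(len(rows)), 2))
    if not pairs:
        return 0.0
    concordant = 0
    for i, j in pairs:
        # Signed differences between the two rows, flipped for "the less" items.
        diffs = [d * (rows[j][c] - rows[i][c]) for c, d in directions.items()]
        # The pair counts if one ordering of it makes every item strictly hold.
        if all(v > 0 for v in diffs) or all(v < 0 for v in diffs):
            concordant += 1
    return concordant / len(pairs)


def lagged_rows(x, y, lag):
    """Pair x[t] with y[t + lag], so that y is read `lag` time steps later."""
    return list(zip(x, y[lag:])) if lag > 0 else list(zip(x, y))


# Hypothetical hourly readings: the fly count follows humidity two steps later.
humidity = [30, 45, 40, 60, 55, 70]
flies = [8, 3, 6, 9, 8, 12]

more_more = {0: +1, 1: +1}  # "the higher the humidity, the higher the number of flies"
print(gradual_support(lagged_rows(humidity, flies, 0), more_more))
print(gradual_support(lagged_rows(humidity, flies, 2), more_more))  # 1.0
```

On this toy data the pattern is only partly supported when both series are read at the same time step, and fully supported when the fly counts are read two steps later. Searching over candidate lags in this way is the extra task that adds to the combinatorial cost mentioned in the abstract.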
