Deakin University
File(s) not publicly available

Validating functional redundancy with mixed generative adversarial networks

journal contribution
posted on 2023-03-02, 05:21 authored by TT Nguyen, TT Huynh, MT Pham, TD Hoang, Thanh Thi Nguyen, QVH Nguyen
Data redundancy has long been one of the most important problems in data-intensive applications such as data mining and machine learning. Removing it brings benefits such as efficient data updates, effective data storage, and error-free query processing. Although the problem has been studied for four decades, existing work mostly focuses on syntactic formulations such as normal forms and functional dependencies, which lead to intractable discovery problems. In this work, we propose a new concept, functional redundancy, that overcomes the limitations of functional dependencies, especially on continuous data. We design and develop efficient algorithms based on generative adversarial networks that validate any functional redundancy without depending heavily on the number of attributes and tuples, as validation of functional dependencies does. The core idea is to use the imputation power of generative adversarial networks to model any semantic dependencies between attributes. Extensive experiments on real-world and synthetic datasets show that our approach outperforms representative baselines, is applicable to first-order and high-order dependencies, and is extensible to different types of data.
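The contrast between a syntactic dependency check and an imputation-based check can be illustrated with a small sketch. This is not the authors' GAN-based algorithm: a plain least-squares line stands in for the generative imputer, and every name here (`holds_exact_fd`, `relative_imputation_error`, the columns `x`, `y`, `z`) is hypothetical. It only shows that on continuous data with all-distinct values an exact functional dependency holds vacuously, while an imputation error can still separate a redundant attribute from an independent one.

```python
import random

def holds_exact_fd(rows, lhs, rhs):
    """Classical check: lhs -> rhs holds iff equal lhs values never map to different rhs values."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

def relative_imputation_error(rows, src, dst):
    """Impute dst from src with a least-squares line (a stand-in for a learned imputer).

    Returns the held-out absolute error divided by the error of always
    predicting the training mean: near 0 suggests dst is redundant given src,
    near 1 suggests src carries no usable information about dst.
    """
    train, test = rows[0::2], rows[1::2]
    mean_src = sum(r[src] for r in train) / len(train)
    mean_dst = sum(r[dst] for r in train) / len(train)
    var = sum((r[src] - mean_src) ** 2 for r in train)
    cov = sum((r[src] - mean_src) * (r[dst] - mean_dst) for r in train)
    slope = cov / var
    intercept = mean_dst - slope * mean_src
    model_err = sum(abs(r[dst] - (slope * r[src] + intercept)) for r in test)
    mean_err = sum(abs(r[dst] - mean_dst) for r in test)
    return model_err / mean_err

# Synthetic table (assumed for illustration): y is a noisy function of x, z is unrelated to x.
rng = random.Random(0)
rows = [
    {"x": i / 200, "y": 2 * (i / 200) + 1 + rng.gauss(0, 0.01), "z": rng.random()}
    for i in range(200)
]

# Every x value is distinct, so both exact dependencies hold trivially.
fd_xy = holds_exact_fd(rows, "x", "y")
fd_xz = holds_exact_fd(rows, "x", "z")

# The imputation error distinguishes the two cases.
err_y = relative_imputation_error(rows, "x", "y")
err_z = relative_imputation_error(rows, "x", "z")
print(fd_xy, fd_xz, err_y < err_z)
```

In this toy setting the exact check cannot tell `y` from `z`, whereas the imputation error is small for `y` and close to the mean-prediction baseline for `z`. A learned imputer could play the same role for nonlinear or higher-order dependencies that a straight line cannot capture.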

Journal

Knowledge-Based Systems

Volume

264

Article number

110342

Pagination

110342-110342

ISSN

0950-7051

Language

en

Publication classification

C1 Refereed article in a scholarly journal

Publisher

Elsevier BV
