File(s) under permanent embargo
GANBLR++: Incorporating Capacity to Generate Numeric Attributes and Leveraging Unrestricted Bayesian Networks
conference contribution
posted on 2022-10-19, 05:08 authored by Y Zhang, Nayyar Zaidi, J Zhou, Gang Li
Generative Adversarial Network (GAN) models have led to a major breakthrough in data generation of various sorts. Over the years, several applications of GAN-based learning to tabular data generation have appeared as well. Very recently, GANBLR, a GAN-based approach that uses Bayesian Networks (BN) as both generator and discriminator, has been shown to achieve state-of-the-art (SOTA) results for tabular data generation. Despite its impressive performance, GANBLR has an inherent weakness: it can only generate data with categorical attributes. Additionally, the model has been trained and tested only with a restricted Bayesian Network. In this work, we propose an extension of the GANBLR framework, GANBLR++, that can generate numeric attributes by leveraging a Dirichlet Mixture Model. We also leverage unrestricted BNs in the GANBLR framework, and discuss how their use can lead to better-quality data as well as a more interpretable model. We evaluate the effectiveness of GANBLR++ on a wide range of datasets, demonstrating that it generates data of better quality than existing SOTA models for tabular (numeric and categorical) data generation such as CTGAN, MedGAN and TableGAN.
Pagination
298 - 306
ISBN-13
9781611977172
Title of proceedings
Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022