File(s) under permanent embargo
Naive-bayes inspired effective pre-conditioner for speeding-up logistic regression
conference contribution
posted on 2014-01-01, 00:00 authored by Nayyar Zaidi, Mark J Carman, Jesus Cerquides, Geoffrey I Webb
We propose an alternative parameterization of logistic regression (LR) for the multi-class setting with categorical data. LR optimizes the conditional log-likelihood over the training data and relies on an iterative optimization procedure to maximize this objective. The optimization procedure may be sensitive to the scale of the features, so an effective pre-conditioning method is recommended. Many problems in machine learning involve arbitrary scales or categorical data, where simple standardization of features is not applicable. The problem can be alleviated by using optimization routines that are invariant to scale, such as (second-order) Newton methods. However, computing and inverting the Hessian is costly and not feasible for big data. Thus one must often rely on first-order methods such as gradient descent (GD) and stochastic gradient descent (SGD), or on approximate second-order methods such as quasi-Newton (QN) routines, none of which are invariant to scale. This paper proposes a simple yet effective pre-conditioner for speeding up LR, based on naive Bayes conditional probability estimates. The idea is to scale each attribute by the log of the conditional probability of that attribute given the class. This formulation substantially speeds up LR's convergence. It also provides a weighted naive Bayes formulation, which yields an effective framework for hybrid generative-discriminative classification.
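The scaling idea in the abstract can be illustrated with a small NumPy sketch. This is only one plausible reading of the abstract, not the authors' implementation: it assumes integer-coded categorical attributes, Laplace-smoothed naive Bayes estimates, and plain batch gradient ascent on the mean conditional log-likelihood. All function and parameter names (`nb_log_probs`, `posterior`, `fit`, `alpha`, `lr`) are hypothetical.

```python
import numpy as np

def nb_log_probs(X, y, n_values, n_classes, alpha=1.0):
    """Laplace-smoothed naive Bayes estimates: log P(y) and log P(x_i = v | y)."""
    n, d = X.shape
    class_counts = np.bincount(y, minlength=n_classes).astype(float)
    log_prior = np.log((class_counts + alpha) / (n + alpha * n_classes))
    log_cond = np.empty((d, n_values, n_classes))
    for i in range(d):
        counts = np.zeros((n_values, n_classes))
        np.add.at(counts, (X[:, i], y), 1.0)  # co-occurrence counts of value and class
        log_cond[i] = np.log((counts + alpha) / (class_counts + alpha * n_values))
    return log_prior, log_cond

def posterior(X, log_prior, log_cond, w_prior, w_cond):
    """Softmax over class scores; each indicator feature is scaled by its NB log-probability."""
    attr = np.arange(X.shape[1])
    # Both indexed arrays have shape (n, d, n_classes).
    scaled = w_cond[attr, X] * log_cond[attr, X]
    scores = w_prior * log_prior + scaled.sum(axis=1)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

def cll(X, y, log_prior, log_cond, w_prior, w_cond):
    """Mean conditional log-likelihood of the true labels."""
    p = posterior(X, log_prior, log_cond, w_prior, w_cond)
    return np.log(p[np.arange(len(y)), y]).mean()

def fit(X, y, n_values, n_classes, lr=0.01, steps=100):
    """Gradient ascent on the mean CLL, starting from unit weights (plain naive Bayes)."""
    n, d = X.shape
    log_prior, log_cond = nb_log_probs(X, y, n_values, n_classes)
    w_prior = np.ones(n_classes)
    w_cond = np.ones((d, n_values, n_classes))
    attr = np.broadcast_to(np.arange(d), X.shape)
    onehot = np.eye(n_classes)[y]
    for _ in range(steps):
        err = onehot - posterior(X, log_prior, log_cond, w_prior, w_cond)
        g_prior = err.mean(axis=0) * log_prior
        g_cond = np.zeros_like(w_cond)
        # Each example contributes only to the weights of the values it contains.
        np.add.at(g_cond, (attr, X), err[:, None, :] * log_cond[attr, X] / n)
        w_prior += lr * g_prior
        w_cond += lr * g_cond
    return log_prior, log_cond, w_prior, w_cond
```

With every weight fixed at 1, `posterior` returns exactly the naive Bayes posterior, so the optimizer starts from the generative solution and only has to learn corrections to it. That is the sense in which this kind of parameterization can be read as a weighted naive Bayes model. The step size and iteration count here are arbitrary illustrative defaults.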
History
Event
Data Mining. International Conference (14th : 2014 : Shenzhen, China)
Pagination
1097 - 1102
Publisher
IEEE
Location
Shenzhen, China
Place of publication
Piscataway, N.J.
Start date
2014-12-14
End date
2014-12-17
ISSN
1550-4786
eISSN
2374-8486
ISBN-13
9781479943029
Language
eng
Publication classification
E1.1 Full written paper - refereed
Editor/Contributor(s)
Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Huang, Xindong Wu
Title of proceedings
ICDM 2014 : Proceedings of the 14th IEEE International Conference on Data Mining