File(s) under permanent embargo
Prediction of age, sentiment, and connectivity from social media text
conference contribution
posted on 2011-01-01, 00:00 authored by Thin Nguyen, Quoc-Dinh Phung, B Adams, Svetha Venkatesh

Social media corpora, including the textual output of blogs, forums, and messaging applications, provide fertile ground for linguistic analysis: material diverse in topic and style, and at Web scale. We investigate manifest properties of textual messages, including latent topics, psycholinguistic features, and author mood, of a large corpus of blog posts, to analyze the impact of age, emotion, and social connectivity. These properties are found to differ significantly across the examined cohorts, which suggests discriminative features for a number of useful classification tasks. We build binary classifiers for old versus young bloggers, social versus solo bloggers, and happy versus sad posts with high performance. Analysis of discriminative features shows that age turns upon choice of topic, whereas sentiment orientation is evidenced by linguistic style. Good prediction is achieved for social connectivity using topic and linguistic features, leaving tagged mood a modest role in all classifications.
History
Event: Web Information System Engineering. Conference (12th : 2011 : Sydney, New South Wales)
Source: Web Information Systems Engineering, WISE 2011 : 12th International Conference, Sydney, Australia, October 13-14 2011 : proceedings
Series: Lecture notes in computer science ; 6997
Pagination: 227 - 240
Publisher: Springer-Verlag
Location: Sydney, New South Wales
Place of publication: Berlin, Germany
Start date: 2011-10-13
End date: 2011-10-14
ISSN: 0302-9743
eISSN: 1611-3349
ISBN-13: 9783642244346
ISBN-10: 3642244343
Language: eng
Publication classification: E1.1 Full written paper - refereed; E Conference publication
Copyright notice: 2011, Springer-Verlag Berlin Heidelberg
Extent: 35
Editor/Contributor(s): A Bouguettaya, M Hauswirth, L Liu
Title of proceedings: WISE 2011 : Web Information Systems Engineering : 12th International Conference, Sydney, Australia, October 13-14 2011 : proceedings