File(s) under permanent embargo
Estimating support scores of autism communities in large-scale web information systems
conference contribution
posted on 2017-01-01, 00:00 authored by Thin Nguyen, H Nguyen, Svetha Venkatesh, Quoc-Dinh Phung
Individuals with Autism Spectrum Disorder (ASD) have been shown to prefer communication at a socio-spatial distance. So, while rarely found in the real world, autism communities are popular in Web-based forums, which are convenient for people with ASD to seek and share health-related information. Reddit is one such avenue for people of common interest to connect, forming communities of specific interest, namely subreddits. This work aims to estimate support scores provided by a popular subreddit interested in ASD, www.reddit.com/r/aspergers. The scores were measured in both the quantities and qualities of the conversations in the forum, including conversational involvement, emotional support, and informational support. The support scores of the subreddit Aspergers were compared with those of an average subreddit derived from the entire Reddit, represented by two big corpora of approximately 200 million Reddit posts and 1.66 billion Reddit comments. The ASD subreddit was found to be a supportive community, having far higher support scores than did the average subreddit. Apache Spark, an advanced cluster computing framework, is employed to speed up processing of the large corpora. Scalable machine learning techniques implemented in Spark help discriminate the content made in Aspergers versus other subreddits and automatically discover linguistic predictors of ASD within minutes, providing timely reports.
Event
Web Information Systems Engineering. International Conference (18th : 2017 : Puschino, Russia)
Volume
10569
Series
Lecture Notes in Computer Science
Pagination
347 - 355
Publisher
Springer
Location
Puschino, Russia
Place of publication
Berlin, Germany
Start date
2017-10-07
End date
2017-10-11
ISSN
0302-9743
eISSN
1611-3349
ISBN-13
9783319687827
Language
eng
Publication classification
E Conference publication; E1 Full written paper - refereed
Copyright notice
2017, Springer
Editor/Contributor(s)
A Bouguettaya, Y Gao, A Klimenko, L Chen, X Zhang, F Dzerzhinskiy, W Jia, S Klimenko, Q Li
Title of proceedings
WISE 2017 : Proceedings of the 18th International Conference on Web Information Systems Engineering 2017
Keywords
big data; apache spark; large-scale distributed computing; support scores; autism communities; Science & Technology; Technology; Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Computer Science; Distributed Computing