Twitter hashtags provide a high-level summary of tweets,
and clustering hashtags has many applications. Existing text-based methods (which rely on the explicit words in tweets) are greatly affected by the sparsity of short tweet texts and the low co-occurrence rates of hashtags in
tweets. Meanwhile, semantically related hashtags that use different text expressions may show similar temporal patterns (i.e., how the frequency of
hashtag usage changes over time), which can help capture events,
opinions and synonyms. In this paper, we propose a novel method, clustering
hashtags by their temporal patterns (CHTP), as a complement
to text-based methods. In CHTP, hashtags are represented as hashtag
time series that show their temporal patterns, so hashtag clusters can
be discovered by clustering hashtag time series. Density-based clustering
algorithms are suitable for discovering naturally shaped hashtag clusters, but
because they use a single distance threshold to define density, they are not
fine-grained enough to differentiate clusters of various density levels. Therefore, we develop
a new parameter-free Density-Sensitive Clustering (DSC) algorithm to
discover clusters of different density levels and use it in CHTP to group
hashtags by temporal patterns. DSC recursively partitions the dataset
from coarse-grained to fine-grained (using adaptive distance thresholds)
to discover hashtag clusters of different density levels. Experiments conducted on Twitter datasets show that the DSC algorithm finds hashtag
clusters of different densities more effectively than counterpart methods,
and that CHTP (using DSC) can discover meaningful hashtag clusters, 36%
of which cannot be found by text-based approaches.
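
As a minimal illustration of the hashtag time series representation described above, the following Python sketch bins hashtag occurrences into fixed-width time intervals and compares the normalized series. The function names, the 60-second bin width, the sum-to-one normalization, the Euclidean distance and the sample data are all assumptions made for this illustration; they are not taken from CHTP or DSC, and the clustering step itself is not shown.

```python
from math import sqrt


def hashtag_time_series(events, num_bins, bin_seconds, start=0):
    """Count hashtag occurrences per time bin.

    events: iterable of (hashtag, timestamp_in_seconds) pairs (hypothetical input format).
    Returns a dict mapping each hashtag to a list of num_bins counts.
    """
    series = {}
    for tag, ts in events:
        idx = int((ts - start) // bin_seconds)
        if 0 <= idx < num_bins:  # ignore events outside the observed window
            series.setdefault(tag, [0] * num_bins)[idx] += 1
    return series


def normalize(counts):
    """Scale a series to sum to 1 so that popular and rare hashtags are comparable."""
    total = sum(counts)
    return [c / total for c in counts] if total else list(counts)


def euclidean(a, b):
    """Euclidean distance between two equal-length series (an assumed distance)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up sample data: two differently worded hashtags with the same
# temporal shape, and one unrelated hashtag that peaks later.
events = [
    ("#worldcup", 10), ("#worldcup", 70), ("#worldcup", 80),
    ("#wc", 5), ("#wc", 20), ("#wc", 61), ("#wc", 65), ("#wc", 90), ("#wc", 100),
    ("#monday", 130), ("#monday", 140), ("#monday", 150),
]

series = hashtag_time_series(events, num_bins=3, bin_seconds=60)
normalized = {tag: normalize(counts) for tag, counts in series.items()}

d_related = euclidean(normalized["#worldcup"], normalized["#wc"])
d_unrelated = euclidean(normalized["#worldcup"], normalized["#monday"])
print(d_related, d_unrelated)
```

In this toy data, "#worldcup" and "#wc" share no text but have the same normalized temporal shape, so their distance is far smaller than the distance from either to "#monday". This is the kind of signal a clustering algorithm operating on hashtag time series can exploit.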