Web page clustering : a hyperlink-based similarity and matrix-based hierarchical algorithms

Hou, Jingyu, Zhang, Yanchun and Cao, Jinli 2003, Web page clustering : a hyperlink-based similarity and matrix-based hierarchical algorithms, in APWeb 2003 : web technologies and applications : 5th Asia-Pacific Web Conference proceedings, Springer, New York N.Y., pp. 201-212.

Title Web page clustering : a hyperlink-based similarity and matrix-based hierarchical algorithms
Author(s) Hou, Jingyu
Zhang, Yanchun
Cao, Jinli
Conference name Asia-Pacific Web Conference (5th : 2003 : Xi'an, Shaanxi Sheng, China)
Conference location Xi'an, Shaanxi Sheng, China
Conference dates 23-25 April 2003
Title of proceedings APWeb 2003 : web technologies and applications : 5th Asia-Pacific Web Conference proceedings
Editor(s) Zhou, Xiaofang
Zhang, Yanchun
Orlowska, Maria
Publication date 2003
Series Lecture notes in computer science ; 2642
Start page 201
End page 212
Publisher Springer
Place of publication New York N.Y.
Summary This paper proposes a hyperlink-based web page similarity measurement and two matrix-based hierarchical web page clustering algorithms. The similarity measurement incorporates hyperlink transitivity and page importance within the web page space under consideration. One clustering algorithm takes cluster overlapping into account; the other does not. Neither algorithm requires predefined similarity thresholds for clustering, and both are independent of the page order. Initial evaluations show that the proposed algorithms are effective at improving clustering.
Notes The original publication can be found at www.springerlink.com
ISBN 3540023542
9783540023548
ISSN 0302-9743
1611-3349
Language eng
Field of Research 080505 Web Technologies (excl Web Search)
HERDC Research category E1 Full written paper - refereed
Copyright notice ©2003, Springer-Verlag Berlin Heidelberg
Persistent URL http://hdl.handle.net/10536/DRO/DU:30005059

Document type: Conference Paper
Collection: School of Information Technology
 
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 2 times in Scopus
Access Statistics: 448 Abstract Views, 40 File Downloads
Created: Mon, 07 Jul 2008, 09:44:54 EST

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.