Exploiting Interclass Rules for Focused Crawling
By: Altingovde, I. S.; Ulusoy, O.
Type: Article from Bulletin/Magazine
In collection: IEEE Intelligent Systems vol. 19 no. 6 (2004), pages 66-73.
Topics: focus; interclass rules; focused crawling
Availability
Central Library (Semanggi)
Call number: II60.7
Non-reserve copies: 1 (available for loan: 0)
Reserve copies: none
Article content
Crawling the Web quickly and entirely is an expensive, unrealistic goal because of the required hardware and network resources. We started with a focused-crawling approach designed by Soumen Chakrabarti, Martin van den Berg, and Byron Dom, and we implemented the underlying philosophy of their approach to derive our baseline crawler. This crawler employs a canonical topic taxonomy to train a naive-Bayesian classifier, which then helps determine the relevancy of crawled pages. The crawler also relies on the assumption of topical locality to decide which URLs to visit next. Building on this crawler, we developed a rule-based crawler, which uses simple rules derived from interclass (topic) linkage patterns to decide its next move. This rule-based crawler also enhances the baseline crawler by supporting tunneling. A focused crawler gathers relevant Web pages on a particular topic. This rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage.
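As a rough illustration of the idea in the abstract, the sketch below estimates interclass rules as the probability that a page of one topic links to a page of another, then scores outgoing links for a priority-queue frontier. It is not the authors' implementation: the function names (`learn_rules`, `link_score`, `enqueue`), the two-hop scoring formula used to mimic tunneling, and the sample topics and URLs are all assumptions made for this example, and page topics are taken as given rather than produced by a trained naive-Bayesian classifier.

```python
import heapq
from collections import defaultdict


def learn_rules(link_pairs):
    """Estimate P(child topic | parent topic) from (parent, child) topic pairs.

    Hypothetical helper: the pairs stand in for labelled training links.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for parent, child in link_pairs:
        counts[parent][child] += 1
    rules = {}
    for parent, children in counts.items():
        total = sum(children.values())
        rules[parent] = {child: n / total for child, n in children.items()}
    return rules


def link_score(rules, page_topic, target):
    """Score links found on a page of `page_topic` when crawling for `target`.

    Assumed heuristic: the direct rule probability plus the probability of
    reaching `target` through one intermediate non-target topic. The second
    term lets an off-topic page still earn a score (a simple form of tunneling).
    """
    first_hop = rules.get(page_topic, {})
    score = first_hop.get(target, 0.0)
    for mid_topic, p_mid in first_hop.items():
        if mid_topic != target:
            score += p_mid * rules.get(mid_topic, {}).get(target, 0.0)
    return score


def enqueue(frontier, url, score):
    """Push a URL so that the highest score is popped first (heapq is a min-heap)."""
    heapq.heappush(frontier, (-score, url))


# Made-up training links between topics, for illustration only.
training_links = (
    [("sports", "sports")] * 3
    + [("sports", "news"), ("news", "sports"), ("news", "news")]
    + [("cooking", "news")] * 2
)
rules = learn_rules(training_links)

frontier = []
enqueue(frontier, "http://example.org/from-sports-page", link_score(rules, "sports", "sports"))
enqueue(frontier, "http://example.org/from-cooking-page", link_score(rules, "cooking", "sports"))
best_score, best_url = heapq.heappop(frontier)
print(best_url, -best_score)
```

In this toy data, links found on a cooking page still receive a nonzero score because cooking pages link to news pages, which in turn often link to sports pages. A crawler that scored links only by the relevance of the page they appear on would rank them at zero.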