Anda belum login :: 27 Nov 2024 10:10 WIB
Detail
ArtikelSelecting A Restoration Technique to Minimize OCR Error  
Oleh: Cannon, M. ; Fugate, M. ; Hush, D. R. ; Scovel, C.
Jenis: Article from Journal - ilmiah internasional
Dalam koleksi: IEEE Transactions on Neural Networks vol. 14 no. 3 (May 2003), page 478-490.
Topik: image restoration; restoration technique; minimize; OCR error
Ketersediaan
  • Perpustakaan Pusat (Semanggi)
    • Nomor Panggil: II36.7
    • Non-tandon: 1 (dapat dipinjam: 0)
    • Tandon: tidak ada
    Lihat Detail Induk
Isi artikelThis paper introduces a learning problem related to the task of converting printed documents to ASCII text files. The goal of the learning procedure is to produce a function that maps documents to restoration techniques in such a way that on average the restored documents have minimum optical character recognition error. We derive a general form for the optimal function and use it to motivate the development of a nonparametric method based on nearest neighbors. We also develop a direct method of solution based on empirical error minimization for which we prove a finite sample bound on estimation error that is independent of distribution. We show that this empirical error minimization problem is an extension of the empirical optimization problem for traditional M - class classification with general loss function and prove computational hardness for this problem. We then derive a simple iterative algorithm called generalized multiclass ratchet (GMR) and prove that it produces an optimal function asymptotically (with probability 1). To obtain the GMR algorithm we introduce a new data map that extends Kesler's construction for the multiclass problem and then apply an algorithm called Ratchet to this mapped data, where Ratchet is a modification of the Pocket algorithm . Finally, we apply these methods to a collection of documents and report on the experimental results.
Opini AndaKlik untuk menuliskan opini Anda tentang koleksi ini!

Kembali
design
 
Process time: 0.015625 second(s)