Details
Book: Application of Deep Learning in Speech-to-Text Systems for Touchscreen Keyboards
Bibliography
Author: Lenson, Abraham Keane Sahasrara ; Airlangga, Gregorius (Advisor)
Topic: speech recognition; machine learning; audio; OpenAI whisper
Language: English (EN)
Publisher: Program Studi Sistem Informasi Fakultas Teknik Unika Atma Jaya
Place of Publication: Jakarta
Year of Publication: 2024
Type: Theses - Undergraduate Thesis
Abstract
Speech recognition has become a major area of interest in modern machine learning because of its wide range of potential applications. However, there is a noticeable gap in the literature on comparative analyses of speech recognition models. To address this gap, our study compares several leading speech recognition systems on performance and accuracy: Mozilla DeepSpeech, PocketSphinx, Coqui STT, Vosk, and OpenAI Whisper. We also present a more limited analysis of the speech-to-text feature of Google's Gboard. Each model is evaluated under several metrics to assess its effectiveness in real-world scenarios. To translate the research into a tangible outcome, we developed a proof-of-concept model that demonstrates the practical application of our findings and is intended to serve as a benchmark for the industry. Beyond the theoretical analysis, we built an Android application that converts speech to text using the most promising model identified in the study and that caters to a wide range of users and use cases. To assess the application's reliability and robustness, we conducted comprehensive tests covering both standard performance metrics and real-world usability scenarios, so that the application can withstand varied and dynamic user environments. In conclusion, this work both fills a gap in comparative studies of speech recognition systems and connects theoretical research with practical application; the Android application built on this research demonstrates the applicability and relevance of our findings.