A HYBRID APPROACH COMBINING TRIE-BASED CACHING AND LEVENSHTEIN DISTANCE ALGORITHMS TO OPTIMIZE DOCUMENT SEARCH ENGINE PERFORMANCE AT PT CHEMCO HARAPAN NUSANTARA

Authors

  • Agung Brotokuncoro, President University
  • Wiranto Herry Utomo, President University

Keywords:

Trie-Based Caching, Levenshtein Distance, Search Engine Performance, Misspelled Queries, Auto Suggestion

Abstract

Rapid data growth at PT Chemco Harapan Nusantara makes the efficiency and accuracy of its document search engine increasingly important. The current search engine has difficulty predicting what users are looking for and takes a long time to find documents. This study addresses these problems with a hybrid approach that combines Trie-based caching with the Levenshtein Distance algorithm. Trie-based caching increases search speed by pre-indexing document keywords and supporting auto-suggestion, while the Levenshtein Distance algorithm improves accuracy when handling misspelled queries or partial matches. Implementing both algorithms improved search performance by 43.23%, reducing processing time to 44.88 ms compared with 78.92 ms for the previous search engine, which did not use caching. Precision increased from 50.00% to 97.50%, Recall from 41.75% to 94.00%, and F1 Score from 45% to 95%. These values indicate that the system is effective at finding relevant documents while reducing irrelevant results. The combination of Trie-based caching and Levenshtein Distance therefore increases search speed and also returns more accurate results. The study thus provides a solution for improving the performance of the document search engine at PT Chemco Harapan Nusantara, supporting the company's operational efficiency as its data grows in volume and complexity.
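
The hybrid idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (KeywordTrie, levenshtein, search), the sample keywords, and the max_distance threshold of 2 are all assumptions chosen for illustration. The sketch tries a Trie prefix lookup first and falls back to Levenshtein Distance only when the prefix lookup finds nothing.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_keyword = False  # True if a full keyword ends at this node


class KeywordTrie:
    """Hypothetical in-memory keyword index used for auto-suggestion."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, keyword: str) -> None:
        node = self.root
        for ch in keyword:
            node = node.children.setdefault(ch, TrieNode())
        node.is_keyword = True

    def suggest(self, prefix: str) -> List[str]:
        # Walk down to the node that represents the prefix.
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        # Collect every keyword stored below that node.
        results: List[str] = []
        stack = [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.is_keyword:
                results.append(word)
            for ch, child in current.children.items():
                stack.append((child, word + ch))
        return sorted(results)


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute), computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def search(trie: KeywordTrie, keywords: List[str], query: str,
           max_distance: int = 2) -> List[str]:
    # Fast path: exact prefix match through the trie.
    hits = trie.suggest(query)
    if hits:
        return hits
    # Fallback: tolerate misspellings up to max_distance edits (assumed threshold).
    scored = sorted((levenshtein(query, k), k) for k in keywords)
    return [k for distance, k in scored if distance <= max_distance]


# Illustrative keywords only; not taken from the study's document set.
keywords = ["invoice", "inventory", "inspection", "purchase"]
trie = KeywordTrie()
for keyword in keywords:
    trie.insert(keyword)

print(search(trie, keywords, "inv"))      # prefix match via the trie
print(search(trie, keywords, "invocie"))  # misspelling handled by the fallback
```

In this sketch the fallback scans every keyword, which is acceptable for a small vocabulary; a production system would likely bound the candidate set before computing distances.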

Published

2024-12-30