Title |
Species annotation using a k-mer based KNN model
|
Authors |
Srushti Sangar1, Prathamesh Kolage1 & Pritee Chunarkar-Patil1,*
|
Affiliation |
1Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India; *Corresponding author
|
|
Srushti Sangar - E-mail: srushti.sangar-rgitbt@bvp.edu.in; Phone: +91 9325122998
Prathamesh Kolage - E-mail: prathamesh.kolage-rgitbt@bvp.edu.in; Phone: +91 9370215432
Pritee Chunarkar-Patil - E-mail: preeti.chunarkar@bharatividyapeeth.edu; Phone: +91 9730038142 |
Article Type |
Research Article
|
Date |
Received September 1, 2024; Revised September 30, 2024; Accepted September 30, 2024; Published September 30, 2024
|
Abstract |
Bacterial identification is a critical process in microbiology, clinical diagnostics, environmental monitoring, and food safety. Machine learning holds great promise for improving bacterial identification by increasing accuracy, speed, and scalability. However, challenges such as data dependency, model interpretability, and computational demands must be addressed to fully realize its potential. The k-mer based bacterial identification algorithm described here is an attempt to address these issues. Sequence matching is performed using the k-nearest neighbors (KNN) technique. The workflow includes feature extraction, dataset preparation, classifier training, and label prediction based on the similarity of k-mer frequency distributions. The algorithm's performance was assessed using metrics such as F1 score and precision, and it achieved an accuracy of 93%.
|
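The workflow summarized in the abstract (k-mer feature extraction followed by KNN label prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_profile` and `knn_predict`, the choice of k = 3 for k-mers, three neighbors, Euclidean distance, and the toy sequences and labels are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def kmer_profile(seq, k=3):
    """Return the normalized frequency of every possible DNA k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Only k-mers made of A, C, G, T contribute to the total.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def knn_predict(query, training, n_neighbors=3, k=3):
    """Predict a label by majority vote among the nearest k-mer profiles."""
    q = kmer_profile(query, k)
    # Euclidean distance between frequency vectors (an assumed metric).
    dists = sorted(
        (math.dist(q, kmer_profile(seq, k)), label) for seq, label in training
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: AT-rich vs. GC-rich sequences.
training = [
    ("ATATATATTATATAAT", "species_A"),
    ("TATATAATATTATATA", "species_A"),
    ("AATTATATATAATATT", "species_A"),
    ("GCGCGCCGCGGCGCGC", "species_B"),
    ("CGCGGCGCCGCGCGGC", "species_B"),
    ("GGCGCGCGCCGCGCGC", "species_B"),
]

print(knn_predict("TATATTATATAATATA", training))
print(knn_predict("CGCGCGGCGCCGCGCG", training))
```

In practice, sequences would be read from FASTA files and the classifier evaluated on held-out data with metrics such as precision and F1 score; the toy data above only demonstrates the mechanics.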
Keywords |
k-mer, bacterial identification, sequence comparison, KNN classification & Biopython
|
Citation |
Sangar et al. Bioinformation 20(9): 986-989 (2024)
|
Edited by |
P Kangueane
|
ISSN |
0973-2063
|
Publisher |
|
License |
This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.
|
|