

Title

Species annotation using a k-mer based KNN model

 

Authors

Srushti Sangar1, Prathamesh Kolage1 & Pritee Chunarkar-Patil1,*

 

Affiliation

1Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India; *Corresponding author

 

Email

Srushti Sangar - E-mail: srushti.sangar-rgitbt@bvp.edu.in; Phone: +91 9325122998

Prathamesh Kolage - E-mail: prathamesh.kolage-rgitbt@bvp.edu.in; Phone: +91 9370215432

Pritee Chunarkar-Patil - E-mail: preeti.chunarkar@bharatividyapeeth.edu; Phone: +91 9730038142

 

Article Type

Research Article

 

Date

Received September 1, 2024; Revised September 30, 2024; Accepted September 30, 2024; Published September 30, 2024

 

Abstract

Bacterial identification is a critical process in microbiology, clinical diagnostics, environmental monitoring, and food safety. Machine learning holds great promise for improving bacterial identification by increasing accuracy, speed, and scalability. However, challenges such as data dependency, model interpretability, and computational demands must be addressed to fully realize its potential. The k-mer based bacterial identification algorithm presented here is an attempt to address these issues. Sequence matching is performed using the k-nearest neighbors (KNN) technique. The workflow includes feature extraction, dataset preparation, classifier training, and label prediction based on the similarity of k-mer frequency distributions. The algorithm's performance was assessed using metrics such as F1 score and precision, and it achieved an accuracy of 93%.
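
The workflow summarized in the abstract (k-mer feature extraction, a labelled dataset, and label prediction from k-mer frequency similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_profile` and `knn_predict`, the choice of k = 3, Euclidean distance, majority voting, and the toy sequences and labels are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Return normalized frequencies of all 4**k DNA k-mers in seq.

    k-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def knn_predict(query, training, k=3, n_neighbors=3):
    """Label a query by majority vote among its nearest training profiles.

    `training` is a list of (sequence, label) pairs. Distance is Euclidean
    distance between k-mer frequency vectors (an illustrative assumption).
    """
    q = kmer_profile(query, k)
    ranked = sorted(
        training,
        key=lambda item: math.dist(q, kmer_profile(item[0], k)),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy, made-up sequences and labels (hypothetical, not real genomes).
training = [
    ("ATATATTTAATATAAATTTATA", "species_A"),
    ("TTTATAATATTAAATATATTTA", "species_A"),
    ("AATATTTATATAATTTAAATAT", "species_A"),
    ("GCGCCGGCGGCCGCGGGCCGCG", "species_B"),
    ("CCGGGCGCGCCGGCGCGGCCGC", "species_B"),
    ("GGCCGCGGCGCGCCCGGGCGCC", "species_B"),
]

print(knn_predict("ATTTATATAAATATTATATAAT", training))  # prints species_A
```

In this sketch each sequence is reduced to a vector of 64 trinucleotide frequencies, so sequences of different lengths become directly comparable. A real pipeline would typically precompute the training profiles once rather than recomputing them for every query.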

 

Keywords

k-mer, bacterial identification, sequence comparison, KNN classification & Biopython

 

Citation

Sangar et al. Bioinformation 20(9): 986-989 (2024)

 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.