Using traditional machine learning and deep learning methods for on- and off-target prediction in CRISPR/Cas9: a review
Version 3 2024-06-19, 18:40
Version 2 2024-06-05, 12:06
Version 1 2023-06-14, 06:28
journal contribution
posted on 2024-06-19, 18:40, authored by Zeinab Sherkatghanad, Moloud Abdar, Jeremy Charlier, Vladimir Makarenkov
Abstract
CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a popular and effective two-component technology used for targeted genetic manipulation. It is currently the most versatile and accurate method of gene and genome editing and has a wide variety of practical applications. For example, in biomedicine, it has been used in research related to cancer, virus infections, pathogen detection, and genetic diseases. Current CRISPR/Cas9 research relies on data-driven models for on- and off-target prediction because cleavage may occur at non-target sequence locations. Conventional machine learning and deep learning methods are now routinely applied to accurately predict the on-target knockout efficacy and off-target profile of given single-guide RNAs (sgRNAs). In this paper, we present an overview and a comparative analysis of traditional machine learning and deep learning models used in CRISPR/Cas9. We highlight the key research challenges and directions associated with target activity prediction. We discuss recent advances in the sgRNA–DNA sequence encoding used in state-of-the-art on- and off-target prediction models. Furthermore, we present the most popular deep learning neural network architectures used in CRISPR/Cas9 prediction models. Finally, we summarize the existing challenges and discuss possible future investigations in the field of on- and off-target prediction. Our paper provides valuable support for academic and industrial researchers interested in applying machine learning methods to CRISPR/Cas9 genome editing.