Eyeing the Generation of AI in Bioinformatics

by: Jan Esther Dequito (Informosomes)
    Artificial intelligence (AI) is a machine’s ability to comprehend inputs, learn patterns from experience, and execute tasks comparable to human capabilities. AI once occupied niches in specific disciplines and industries, but it now reaches every sector of society and daily life. Its popularity surged when generative models became free, accessible, and user-friendly to people across the globe, most visibly with the breakthrough of large language models in recent years. Their ability to hold conversations and adapt to prompts has made them widely usable. The development and use of artificial intelligence are not entirely new, but as its technological prowess grows, so do the ethical considerations and issues surrounding it.
    Bioinformatics was born from the use of computational techniques to analyze complex biological data and produce meaningful interpretations. The ability to read the blueprint of life down to individual nucleotides, and to study proteins from their amino acid sequences, paved the way for the “omics” revolution. Coupled with advances in modern molecular biology techniques, ever larger and more complex datasets are being generated, and they demand careful curation and processing.
    Technology undeniably tends toward improvement and greater efficiency. Perhaps it is only natural, then, that artificial intelligence would find its way into data analysis in the biological sciences, as if it were predestined.
    One of the most pressing challenges in bioinformatics today is inferring causal interactions from voluminous omics data (Dibaeinia et al., 2025). Recent advances demonstrate how AI can help. AI algorithms have accelerated genomic data analyses, such as variant calling and pinpointing the genes underlying diseases, revolutionizing clinical genetics and diagnostics.
    For example, Google’s open-source DeepVariant examines the output of next-generation sequencing (NGS) by converting candidate variants into stacked images. These image representations are then passed to a convolutional neural network (CNN) that distinguishes true genetic variants from sequencing errors.
A screengrab from the slides of Maria Nattestad / DeepVariant / Google Health
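    To make the idea of “stacked images” more concrete, here is a minimal sketch of how aligned reads around a candidate variant might be turned into image-like channels. This is an illustration under simplifying assumptions, not DeepVariant’s actual code: the function name `encode_pileup`, the four channels, and the intensity values are all hypothetical, and the real tool uses a richer encoding.

```python
# Illustrative only: a simplified pileup encoding, NOT DeepVariant's real code.
# Channel choices and intensity values below are assumptions for this sketch.
BASE_INTENSITY = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def encode_pileup(reference, reads):
    """Return stacked channels indexed as [channel][read][position].

    reads: list of (sequence, base_qualities, is_reverse_strand) tuples,
    each already aligned to `reference` and spanning the same window.
    """
    base_ch, qual_ch, strand_ch, mismatch_ch = [], [], [], []
    for seq, quals, is_reverse in reads:
        if len(seq) != len(reference) or len(quals) != len(reference):
            raise ValueError("each read must span the whole window")
        # Channel 0: which base was read at each position.
        base_ch.append([BASE_INTENSITY.get(b, 0.0) for b in seq])
        # Channel 1: base quality, capped at 40 and scaled to [0, 1].
        qual_ch.append([min(q, 40) / 40 for q in quals])
        # Channel 2: strand of the read (1.0 = reverse, 0.0 = forward).
        strand_ch.append([1.0 if is_reverse else 0.0] * len(seq))
        # Channel 3: positions where the read disagrees with the reference.
        mismatch_ch.append(
            [1.0 if b != r else 0.0 for b, r in zip(seq, reference)]
        )
    return [base_ch, qual_ch, strand_ch, mismatch_ch]


reference = "ACGTA"
reads = [
    ("ACGTA", [40, 40, 40, 40, 40], False),
    ("ACTTA", [40, 40, 20, 40, 40], True),  # mismatch at position 2, lower quality
]
image = encode_pileup(reference, reads)
print(len(image), len(image[0]), len(image[0][0]))  # channels, reads, positions
```

    In this toy window, the second read differs from the reference at one position and has a lower quality score there. A CNN trained on many labeled examples could learn to weigh such signals when deciding whether a candidate is a true variant or a sequencing error.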
    AI models have also been used in diagnosis, especially for rare and previously undiagnosed diseases. AI-MARRVEL (AIM), developed by researchers from Texas Children's Hospital and Baylor College of Medicine, takes into account sequenced samples, recorded symptoms, and a pool of candidate genes to determine the causative gene(s) responsible for rare disorders.
    Deep learning models have also been shown to predict enzyme function from protein sequences alone. In 2023, a research team from South Korea and the US introduced DeepECTransformer, an AI system that pairs deep learning with homology analysis of proteins and draws on information about active sites and cofactor binding sites to elucidate metabolic processes. This facilitates research in functional genomics, especially on enzymes with obscure biochemistry.
    Meanwhile, AlphaFold’s breakthrough in predicting protein structures and generating accurate structural models based solely on amino acid sequence earned its developers a share of the 2024 Nobel Prize in Chemistry, underscoring AI’s immense power.
    AI’s influence also extends to microbial genetics and metagenomics. Metagenomics involves sequencing genetic material extracted directly from mixed microbial communities, which produces an enormous volume of data. AI-driven approaches make handling such large-scale input more efficient and comprehensive, offering insights that range from taxonomic diversity to metabolic pathways and ecological functions.
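    As a rough illustration of how raw reads can become numeric input for a model, the sketch below computes a k-mer composition profile for a DNA sequence. This is a generic feature-extraction technique chosen here as an assumption, not the method of any tool named in this article, and the function name `kmer_profile` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=2):
    """Return normalized frequencies of every possible k-mer over A, C, G, T.

    K-mers containing other symbols (such as N) are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Fixed ordering of all 4**k possible k-mers, so profiles are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


profile = kmer_profile("ACGTACGT", k=2)
print(len(profile))  # one frequency per possible 2-mer
```

    Because every sequence maps to a vector of the same length, profiles like this can be fed to a classifier or clustering method regardless of read length.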
    At this point, it is clear that AI excels at sorting through colossal amounts of data and generating insights from it. With its continuous development, the possibilities for what it can do and how it can be used seem endless. Still, we must also recognize that AI as a tool has its downsides and inherent limitations.
    As mentioned earlier, the use of AI at times struggles to balance moral and ethical considerations. The main issues concern data privacy, bias, and model interpretability (Khan et al., 2023).
    On data security and confidentiality, major concerns arise from handling and storing patients’ sensitive personal records, especially health and genetic information. There is also the lingering possibility of a data breach by hackers. Given these risks, anonymity must be strictly maintained, informed consent must be obtained, and data ownership policies must be rigorously reviewed. Currently, there are no universal guidelines governing the extent of AI’s use in healthcare, highlighting the urgency for regulatory frameworks (Khan et al., 2023).
    Another fundamental limitation is that many AI algorithms are so-called “black boxes,” in which software developers and scientists struggle to trace the reasoning behind each output the machine generates (Sidharthan, 2025). Biases may also be introduced, whether knowingly or accidentally. Some AI algorithms may be trained on limited or biased datasets, which can perpetuate inaccuracies and lead to poor reproducibility. There is also a lack of clarity about who is accountable when the machine makes errors or reaches flawed conclusions.
    To mitigate these challenges of AI in bioinformatics, a proactive approach and continuous measures are essential. Stringent, thorough quality control of data is needed to minimize noise and bias in datasets, improving the reliability of AI systems. Employing diverse, representative datasets also improves model robustness and generalizability. Transparency can be promoted through explainable AI (XAI) approaches, which help demystify the black box effect. Moreover, collaboration among developers, data scientists, physicians, and ethicists is crucial for expanding the scope and responsible application of AI in the life and health sciences. Discussions on AI’s ethical use, the creation of clear guidelines, and efforts to raise awareness are all vital steps toward ensuring the responsible use of AI systems in the near future.
    The capabilities of AI are truly promising and well suited to the intricacy and scale of biological data generated by evolving molecular technologies. Its bioinformatics applications span genome sequencing, protein structure prediction, clinical science, ecology, and more. On the other side of this great potential, however, are ethical, privacy, and technical challenges that must be met with careful attention and urgency. Addressing them requires rigorous quality control and monitoring, transparency in methodologies, robust bioethical frameworks, and collaboration. Through these, the scientific community can harness AI as a transformative tool for advancing the life sciences in a responsible manner.

SOURCES
AI MARRVEL: A new AI tool to diagnose genetic disorders. (2024, April 26). Texas Children’s. https://www.texaschildrens.org/content/research/ai-marrvel-new-ai-tool-diagnose-genetic-disorders
Bisoi, A. V., & Ramsundar, B. (2024). A modular open source framework for genomic variant calling. ArXiv, abs/2411.11513. https://doi.org/10.48550/arXiv.2411.11513
Dibaeinia, P., Ojha, A., & Sinha, S. (2025). Interpretable AI for inference of causal molecular relationships from omics data. Science Advances, 11(7), eadk0837. https://doi.org/10.1126/sciadv.adk0837
Ferrara, E. (2023). Fairness and bias in artificial intelligence: A brief survey of sources, impacts, and mitigation strategies. Sci, 6(1), 3. https://doi.org/10.3390/sci6010003
Kamalov, F., Santandreu Calonge, D., & Gurrib, I. (2023). New era of artificial intelligence in education: Towards a sustainable multifaceted revolution. Sustainability, 15(16), 12451. https://doi.org/10.3390/su151612451
Khan, B., Fatima, H., Qureshi, A., Kumar, S., Hanan, A., Hussain, J., & Abdullah, S. (2023). Drawbacks of artificial intelligence and their potential solutions in the healthcare sector. Biomedical Materials & Devices, 1–8. Advance online publication. https://doi.org/10.1007/s44174-023-00063-2
Craig, M. (2023, November 27). AI reveals enzyme functions in E. coli proteins. AZoRobotics. https://www.azorobotics.com/News.aspx?newsID=14496
Sidharthan, C. (2025, April 14). The real limitations of AI in life sciences. AZoLifeSciences. https://www.azolifesciences.com/article/The-Real-Limitations-of-AI-in-Life-Sciences.aspx

This article was originally published in the GENEWS May 2025 Issue.

