Leveraging LASSO-based methodologies for enhanced SNP analysis in plant genomes.
Summary: Genome-wide association studies (GWAS) have been widely used to reveal associations between genetic variations and phenotypes in a population of individuals. However, they have been criticized for missing important genetic markers, usually because the data may not fit the statistical models well. In this study, we address the challenge of identifying significant single nucleotide polymorphisms (SNPs) in GWAS by harnessing two regression models, BIGLASSO and AUTALASSO, both of which are variants of the least absolute shrinkage and selection operator (LASSO). Our research contributes to the field of genomics through a detailed comparative analysis of Arabidopsis thaliana, revealing how each method specializes in uncovering SNPs for different trait types. Our findings indicate that BIGLASSO shows stronger alignment with GWAS results and particularly excels in the analysis of binary traits, even when these are derived from categorical phenotypes. AUTALASSO could be effective for quantitative traits and could complement GWAS. We demonstrate that these LASSO-based methods can significantly enhance the identification of genetic markers, offering a potent complement to traditional GWAS approaches. Our findings not only bridge the gap between statistical and machine learning methodologies in genetic studies but also provide a practical framework for researchers seeking to validate reported SNPs or explore new genomic regions for trait association. This work is a step toward the integration of advanced computational techniques in genomics, paving the way for more precise and comprehensive genetic analyses.
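As an illustration of the general idea behind LASSO-based SNP selection, the following is a minimal sketch, not the authors' implementation and not BIGLASSO or AUTALASSO themselves. It assumes a plain LASSO with a fixed penalty, solved by cyclic coordinate descent on simulated genotypes coded 0/1/2. All function names, the causal SNP indices (3 and 17), the effect sizes, and the penalty value are hypothetical choices made for the example.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: closed-form solution of the 1-D LASSO problem."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p centred genotype values each; y is a centred phenotype.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic SNP carries no information
            # Covariance of SNP j with the residual that excludes SNP j's own effect
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate_and_select(seed=0, n=200, p=30, lam=0.2):
    """Simulate a quantitative trait driven by two hypothetical causal SNPs, then fit."""
    rng = random.Random(seed)
    causal = {3: 1.0, 17: -0.8}  # assumed SNP index -> effect size
    X = [[rng.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
    y = [sum(eff * row[j] for j, eff in causal.items()) + rng.gauss(0.0, 0.5)
         for row in X]
    # Centre genotypes and phenotype so that no intercept term is needed
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    beta = lasso_coordinate_descent(Xc, yc, lam)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    return selected, beta
```

Calling `simulate_and_select()` returns the indices of SNPs with non-zero coefficients together with the coefficient vector. The L1 penalty drives most coefficients exactly to zero, which is what makes LASSO usable as a marker-selection step alongside single-marker GWAS tests. In practice the penalty would be tuned, for example by cross-validation, rather than fixed as it is here.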
Availability and Implementation: Key results from the paper are available at https://github.com/DongdongHou006/LASSO-SNP. The program was implemented in Python and R, and was tested using resources of the Digital Research Alliance of Canada.
(© The Author(s) 2025. Published by Oxford University Press.)
No competing interest is declared.