Result: Removing Bottlenecks in Research Workflows: Improving SERS Sampling and Computational Zeolite Experiments Through the Application of Machine Learning and the Construction of Custom Data Pipelines
public
English
qt7531788r
https://escholarship.org/uc/item/7531788r
https://escholarship.org/
1298733519
From OAIster®, provided by the OCLC Cooperative.
More Information
Advances in machine learning and growing computational power are enabling large-scale data analysis experiments. Before such experiments can run, raw data must be cleaned and transformed into a form suitable for analysis. Automatic data pipelines make large-scale experiments practical by automating this cleaning and transformation. In this thesis, automatic data pipelines are developed for two separate projects: “Machine Learning Assisted Sampling of SERS Substrates Improves Data Collection Efficiency” and “The Multiscale Atomic Zeolite Simulation Environment (MAZE): A Python Package for Improved Zeolite Structural Manipulations”. The two projects are related in that both automate key sections of the experimental data analysis, laying the groundwork for future autonomous experiments.

Machine Learning Assisted Sampling of SERS Substrates Improves Data Collection Efficiency: Surface-enhanced Raman scattering (SERS) is a powerful technique for sensitive, label-free analysis of chemical and biological samples. While much recent work has established sophisticated automation routines via machine learning (ML) and related artificial intelligence (AI) methods, these efforts have largely focused on downstream processing (e.g., classification tasks) of previously collected data. Fully automated analysis pipelines are desirable, but current progress is limited by cumbersome and manually intensive sample preparation and data collection steps. Specifically, a typical lab-scale SERS experiment requires the user to evaluate the quality and reliability of the measurement (i.e., the spectra) as the data are being collected. This need for expert user intuition is a major bottleneck that limits the applicability of SERS-based diagnostics in point-of-care clinical settings, where trained spectroscopists are likely unavailable. While application-agnostic numerical approaches (e.g., signal-to-noise thresholding) are useful