Applying text mining to identify relevant literature in food science: Cold denaturation as a case study.
In a research environment characterized by the five V's of big data (volume, velocity, variety, value, and veracity), there is a growing need for tools that quickly screen large numbers of publications for relevant work, and the data-rich food industry is no exception. Here, a combination of latent Dirichlet allocation and food keyword searches was employed to analyze and filter a dataset of 6102 publications about cold denaturation. Using the Python toolkit developed in this work, the approach yielded 22 topics that provide background and insight into the direction of research in this field, and it identified the publications in this dataset that are most pertinent to the food industry with precision and recall of 0.419 and 0.949, respectively. Precision is the fraction of papers in the filtered dataset that are relevant, and recall is the fraction of relevant papers that the screening method retained, so a high recall means few relevant papers were missed. Lastly, gaps in the literature are identified from keyword trends to improve the knowledge base of cold denaturation as it relates to the food industry. This approach is generalizable to any similarly organized dataset, and the code is available upon request.
Practical Application: A common problem in research is that when you are an expert in one field, learning about another field is difficult, because you may lack the vocabulary and background needed to read cutting-edge literature from a new discipline. The Python toolkit developed in this research can be applied by any researcher who is new to a field to identify the key literature, the topics they should familiarize themselves with, and the current trends in the field. Using this structure, researchers can greatly speed up how they identify new areas to research and find new projects. [ABSTRACT FROM AUTHOR]
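The keyword screening step and the two reported metrics can be illustrated with a minimal sketch. This is not the authors' toolkit, which the abstract says is available only on request. The keyword list, the four paper titles, and the manual relevance labels below are all hypothetical, and the latent Dirichlet allocation step is omitted.

```python
import re

# Hypothetical keyword list; the abstract does not state the study's actual food keywords.
FOOD_KEYWORDS = {"food", "whey", "dairy", "milk"}

# Hypothetical titles standing in for the 6102-publication dataset.
papers = {
    "p1": "Cold denaturation of whey proteins in frozen dairy desserts",
    "p2": "Thermodynamics of cold denaturation in yeast frataxin",
    "p3": "Food packaging films: polymer assays near cold denaturation temperatures",
    "p4": "Pressure and cold unfolding of myoglobin in meat systems",
}

# Hypothetical manual labels: the papers a domain expert judged food-relevant.
relevant = {"p1", "p4"}


def is_food_relevant(text, keywords=FOOD_KEYWORDS):
    """Return True if any keyword appears as a whole word in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return not words.isdisjoint(keywords)


def precision_recall(predicted, truth):
    """Precision = TP / |predicted|; recall = TP / |truth|."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Keep only the papers that match at least one food keyword.
screened = {pid for pid, title in papers.items() if is_food_relevant(title)}
precision, recall = precision_recall(screened, relevant)
print(sorted(screened), precision, recall)
```

In this toy data the filter keeps one irrelevant paper (p3 matches "food" but concerns packaging polymers) and misses one relevant paper (p4, because "meat" is not in the keyword list), so both error types are visible. The study's reported values of 0.419 and 0.949 correspond to a filter that retains nearly all relevant papers at the cost of passing many irrelevant ones.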
Copyright of Journal of Food Science (John Wiley & Sons, Inc.) is the property of John Wiley & Sons, Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)