Title:
Lamzsen/BCclassifier: v1.0.1
Authors:
Publisher Information:
Zenodo
Publication Year:
2025
Collection:
Zenodo
Document Type:
Electronic resource (software)
Language:
unknown
DOI:
10.5281/zenodo.15203949
Rights:
Accession Number:
edsbas.4AF8E839
Database:
BASE

Further Information

v1.0.0 - Breast Cancer Diagnosis and Classification Analysis

This is the first stable release (v1.0.0) of our Breast Cancer Diagnosis and Classification Analysis software, as described in our manuscript submitted to Nature Biomedical Engineering. It leverages circular RNA and tumor marker data, along with machine learning and deep learning models, to classify breast cancer (BC) patients and healthy controls (HC).

Highlights in v1.0.0

Core Features
Data Preprocessing: Cleans Excel inputs, handles formulas, and fills missing values.
Feature Engineering: Integrates circular RNAs (e.g., hsa_circ_0044235) and tumor markers (e.g., CEA, CA125, CA153).
Clustering: Applies PCA and K-means to visually separate BC and HC samples.
Classification Models: Support Vector Machine (SVM), Logistic Regression, Gradient Boosting, Random Forest, Neural Network (Keras), Ensemble VotingClassifier.
Evaluation Metrics: Accuracy, AUC-ROC, confusion matrix, precision, recall.
Visualization: Generates PCA plots, ROC curves, heatmaps, and boxplots.

Output Files
CSV Files: pca_clustering_results.csv, roc_curves_data.csv, model_performance_summary.csv, etc.
Images: pca_bc_hc_cluster.png, roc_curves_improved.png, feature_importance.png, etc.

Installation
Python Version: 3.5 or later
Dependencies:
    pip install pandas numpy matplotlib seaborn scikit-learn keras tensorflow
Note: This release is compatible with older versions of Keras and Seaborn.

Usage

1. Prepare Data
Place the following Excel files in the project root directory:
ROC曲线数据.xlsx (ROC curve data): Contains biomarker data for BC and HC samples.
    Required columns: hsa_circ_0044235 current/μΑ, hsa_circ_0000250 current/μΑ.
    Optional columns: CEA, CA125, CA153.
    Missing tumor marker values are filled with values from the normal range (e.g., CEA = 2.5 ng/mL).
BC病理分期.xlsx (BC pathological staging): Contains BC pathology staging data.
    Required column: 病理分期 (pathological stage; e.g., T1N0M0).

2. Run the Analysis
Execute the following command in your terminal:
    python main.py

Citation
If you use this software, please cite:
[Your Name]. (2025). MySoftware: Breast Cancer Diagnosis ...
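
The workflow summarized above (filling missing tumor markers, PCA with K-means, and an ensemble classifier scored by accuracy and AUC-ROC) could look roughly like the following sketch. It is not the repository's main.py. The data is synthetic, the function names are hypothetical, and only the CEA default of 2.5 ng/mL comes from the description; the CA125 and CA153 defaults, the choice of three of the listed models, soft voting, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Column names as given in the description of ROC曲线数据.xlsx.
CIRC_COLS = ["hsa_circ_0044235 current/μΑ", "hsa_circ_0000250 current/μΑ"]
# Only the CEA default (2.5 ng/mL) is stated in the description; the
# CA125 and CA153 values are placeholders chosen for this sketch.
MARKER_DEFAULTS = {"CEA": 2.5, "CA125": 15.0, "CA153": 10.0}


def make_synthetic_data(n_per_class=60, seed=0):
    """Generate made-up BC/HC rows; these are not real measurements."""
    rng = np.random.default_rng(seed)
    frames = []
    for label, shift in (("HC", 0.0), ("BC", 2.0)):
        frame = pd.DataFrame({
            CIRC_COLS[0]: rng.normal(5.0 + shift, 1.0, n_per_class),
            CIRC_COLS[1]: rng.normal(4.0 + shift, 1.0, n_per_class),
            "CEA": rng.normal(2.5 + shift, 0.8, n_per_class),
            "CA125": rng.normal(15.0 + 5 * shift, 4.0, n_per_class),
            "CA153": rng.normal(10.0 + 5 * shift, 3.0, n_per_class),
        })
        frame["label"] = label
        frames.append(frame)
    data = pd.concat(frames, ignore_index=True)
    # Blank out some CEA values to mimic incomplete records.
    mask = rng.random(len(data)) < 0.2
    data.loc[mask, "CEA"] = np.nan
    return data


def fill_tumor_markers(data):
    """Fill missing tumor markers with fixed per-column defaults."""
    return data.fillna(value=MARKER_DEFAULTS)


def run_sketch(seed=0):
    data = fill_tumor_markers(make_synthetic_data(seed=seed))
    features = CIRC_COLS + list(MARKER_DEFAULTS)
    X = data[features].to_numpy()
    y = (data["label"] == "BC").astype(int).to_numpy()

    # Standardize, project onto two principal components, then cluster.
    X_scaled = StandardScaler().fit_transform(X)
    components = PCA(n_components=2).fit_transform(X_scaled)
    clusters = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(components)

    # Soft-voting ensemble over three of the model families listed above.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    ensemble = VotingClassifier(
        estimators=[
            ("svm", make_pipeline(StandardScaler(),
                                  SVC(probability=True, random_state=seed))),
            ("lr", make_pipeline(StandardScaler(),
                                 LogisticRegression(max_iter=1000))),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ],
        voting="soft",
    )
    ensemble.fit(X_train, y_train)
    proba = ensemble.predict_proba(X_test)[:, 1]
    return {
        "components": components,
        "clusters": clusters,
        "accuracy": accuracy_score(y_test, ensemble.predict(X_test)),
        "auc": roc_auc_score(y_test, proba),
    }


if __name__ == "__main__":
    results = run_sketch()
    print("accuracy:", round(results["accuracy"], 3))
    print("AUC-ROC:", round(results["auc"], 3))
```

To try the same steps on the real inputs, make_synthetic_data could be swapped for a pandas.read_excel call on ROC曲线数据.xlsx, assuming the sheet carries the columns named above and a label column distinguishing BC from HC.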