cellcanvas / optimize-xgboost / 0.0.5

Optimize XGBoost with Optuna on Zarr Data

A solution that uses Optuna to optimize an XGBoost model on data read from a Zarr zip store and performs 10-fold cross-validation.
Tags
imaging, cryoet, Python, napari, cellcanvas
Solution written by
Kyle Harrington
License of solution
MIT
Source Code

Arguments

--input_zarr_path
Path to the input Zarr zip store containing the features and labels. (default value: PARAMETER_VALUE)
--output_model_path
Path for the output joblib file containing the trained XGBoost model. (default value: PARAMETER_VALUE)
--best_params_path
Path for the output file containing the best parameters from Optuna. (default value: PARAMETER_VALUE)
--n_splits
Number of splits for cross-validation. (default value: PARAMETER_VALUE)
--subset_size
Total number of points in the balanced subset. (default value: PARAMETER_VALUE)
--seed
Random seed for reproducibility. (default value: PARAMETER_VALUE)
--num_trials
Number of Optuna trials to run. (default value: PARAMETER_VALUE)
--objective_function
Objective function to optimize. Options are: accuracy, f1, precision, recall. (default value: PARAMETER_VALUE)
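
To make the interplay of --subset_size and --seed more concrete, the sketch below shows one way a balanced subset could be drawn. It assumes that "balanced" means the total is split evenly across classes, which the page does not confirm. The function name `balanced_subset_indices` is hypothetical and only the Python standard library is used.

```python
import random
from collections import defaultdict


def balanced_subset_indices(labels, subset_size, seed):
    """Return sorted indices holding an equal number of points per class.

    Assumption: subset_size is the total across all classes, so each class
    contributes subset_size // n_classes points (fewer if the class is small).
    """
    rng = random.Random(seed)  # a local generator keeps the draw reproducible

    # Group point indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    if not by_class:
        return []

    per_class = subset_size // len(by_class)
    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        # Cap at the class size so a rare class cannot make sample() fail.
        chosen.extend(rng.sample(members, min(per_class, len(members))))
    return sorted(chosen)
```

With the same seed the function returns the same indices on every call, which is the reproducibility that --seed is documented to provide.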

Usage instructions

Please follow this link for details on how to install and run this solution.