Pipelines

The analysis pipelines in this repository live in the analysis/ directory:

GALAH Preprocessing (galah_filter)

  • Applies quality, abundance, and orbital cuts to GALAH DR3 stars. Cross-matches with Gaia EDR3 for distances and orbital parameters, selecting metal-poor, high-eccentricity stars suitable for halo analysis.
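As a rough illustration of the kind of selection described above, the sketch below combines quality, abundance, and orbital cuts into one boolean mask. It is not the code in galah_filter: the column names (flag_sp, snr, fe_h, ecc) and every threshold are assumptions chosen for the example.

```python
import numpy as np

def select_halo_candidates(cat, feh_max=-0.8, ecc_min=0.85, snr_min=20.0):
    """Return a boolean mask of stars passing illustrative cuts.

    `cat` is a dict of equal-length NumPy arrays. Column names and
    default thresholds are hypothetical, not taken from galah_filter.
    """
    # Quality: clean stellar-parameter flag and adequate signal-to-noise.
    quality = (cat["flag_sp"] == 0) & (cat["snr"] > snr_min)
    # Abundance: finite and metal-poor [Fe/H].
    abundance = np.isfinite(cat["fe_h"]) & (cat["fe_h"] < feh_max)
    # Orbit: finite and highly eccentric.
    orbital = np.isfinite(cat["ecc"]) & (cat["ecc"] > ecc_min)
    return quality & abundance & orbital

# Toy catalogue with four stars (made-up values).
catalogue = {
    "flag_sp": np.array([0, 0, 1, 0]),
    "snr": np.array([35.0, 12.0, 50.0, 40.0]),
    "fe_h": np.array([-1.2, -1.5, -1.1, -0.3]),
    "ecc": np.array([0.92, 0.95, 0.90, 0.88]),
}
mask = select_halo_candidates(catalogue)
print(mask)
```

Only the first toy star passes all three groups of cuts; the others fail on SNR, the quality flag, and metallicity respectively.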

APOGEE Filtering (apogee_filter)

  • Filters APOGEE DR17 stars based on log g, SNR, abundance flags, and derived kinematics. Optionally queries Gaia DR3 for precise photogeometric distances to apply distance uncertainty cuts.
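One way a distance uncertainty cut could be expressed is as a limit on the fractional half-width of the distance credible interval. The sketch below assumes three arrays (median, lower, and upper distance estimates) and a 20% limit; both the function name and the threshold are hypothetical and may differ from what apogee_filter does.

```python
import numpy as np

def distance_uncertainty_mask(r_med, r_lo, r_hi, max_frac_err=0.2):
    """Keep stars whose fractional distance uncertainty is below a limit.

    The uncertainty is approximated as half the width of the
    [r_lo, r_hi] interval divided by the median distance.
    All names and the default limit are assumptions for illustration.
    """
    r_med = np.asarray(r_med, dtype=float)
    half_width = 0.5 * (np.asarray(r_hi, dtype=float) - np.asarray(r_lo, dtype=float))
    with np.errstate(divide="ignore", invalid="ignore"):
        frac_err = half_width / r_med
        # Missing or non-positive distances are rejected.
        keep = np.isfinite(frac_err) & (r_med > 0) & (frac_err < max_frac_err)
    return keep

# Toy distances in parsecs (made-up values); the third star has no distance.
r_med = np.array([1000.0, 2000.0, np.nan])
r_lo = np.array([950.0, 1200.0, 500.0])
r_hi = np.array([1050.0, 2800.0, 900.0])
keep = distance_uncertainty_mask(r_med, r_lo, r_hi)
print(keep)
```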

Extreme Deconvolution (XDPipeline)

  • Performs uncertainty-aware Gaussian Mixture Model clustering through Extreme Deconvolution. Includes model fitting (run_XD), component selection (compare_XD), probabilistic assignment (assigment_XD), summary table generation (table_results_XD), and visualisation (plot_XD). Designed for reproducibility: an analysis can be saved to disk and previous results re-imported.
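To make "uncertainty-aware" and "probabilistic assignment" concrete, the sketch below computes membership probabilities for an already-fitted mixture. In Extreme Deconvolution each star's measurement covariance is added to the component covariance before the Gaussian is evaluated. This is a minimal NumPy illustration of that step only, not the XDPipeline implementation; the function names and toy parameters are hypothetical.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate normal evaluated at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def xd_responsibilities(X, S, weights, means, covs):
    """Membership probabilities with per-star error covariances S[i].

    For star i and component j the effective covariance is
    covs[j] + S[i], i.e. the intrinsic Gaussian convolved with the
    star's measurement uncertainty.
    """
    n, k = X.shape[0], len(weights)
    log_r = np.empty((n, k))
    for i in range(n):
        for j in range(k):
            log_r[i, j] = np.log(weights[j]) + log_gaussian(
                X[i], means[j], covs[j] + S[i]
            )
    # Normalise in log space for numerical stability.
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)

# Toy two-component mixture in 2D (made-up parameters).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.array([np.eye(2), np.eye(2)])
X = np.array([[0.1, -0.2], [4.8, 5.1], [2.5, 2.5]])
S = np.array([0.1 * np.eye(2)] * 3)  # per-star error covariances

resp = xd_responsibilities(X, S, weights, means, covs)
print(resp.round(3))
```

Each row of `resp` sums to one. The third toy star sits midway between the two components, so it is split evenly between them.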

Reduced Dimensionality GMM (ReducedGMMPipeline)

  • A second clustering pipeline that mirrors the high-dimensional XD analysis but operates in a UMAP-reduced space. It fits Gaussian Mixture Models directly to the lower-dimensional projection and can map the resulting assignments back to the original stellar features.
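Mapping assignments back to the original features can be as simple as reusing the labels found in the reduced space to index the untransformed data. The sketch below assumes the labels come from a GMM fitted on a UMAP embedding (not shown) and summarises each cluster in the original feature space; the function name and the toy values are hypothetical and not taken from ReducedGMMPipeline.

```python
import numpy as np

def summarise_clusters(features, labels):
    """Per-cluster count, mean, and standard deviation of the original features.

    `features` has shape (n_stars, n_features) in the ORIGINAL space;
    `labels` has shape (n_stars,) and is assumed to come from a model
    fitted in the reduced space.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    summary = {}
    for label in np.unique(labels):
        members = features[labels == label]
        summary[int(label)] = {
            "count": int(members.shape[0]),
            "mean": members.mean(axis=0),
            "std": members.std(axis=0),
        }
    return summary

# Toy data: two original features per star, e.g. [Fe/H] and eccentricity.
features = np.array([[-1.0, 0.90], [-1.2, 0.95], [-0.4, 0.30], [-0.6, 0.20]])
labels = np.array([0, 0, 1, 1])  # stand-in for labels from the reduced space

summary = summarise_clusters(features, labels)
for label, stats in summary.items():
    print(label, stats["count"], stats["mean"])
```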

Dimensionality + Clustering Initial Visualisation (investigate_umap)

  • Explores and visualises how well stellar populations separate in low-dimensional UMAP space and tests the behaviour of unsupervised methods (GMM or HDBSCAN) before applying full clustering pipelines.