Speaker: Nikolas Barbarousis (Undergraduate Student)
Date: 2nd of December
Location: Orphanoudakis meeting room
Paper Abstract: AstroCLIP is a single, versatile model designed to embed both galaxy images and spectra into a shared, physically meaningful latent space. These embeddings enable a range of downstream tasks without any model fine-tuning, including accurate in-modality and cross-modality semantic similarity search, photometric redshift estimation, galaxy property estimation from both images and spectra, and morphology classification. AstroCLIP is built in two key steps. First, galaxy images and spectra are embedded separately by pretraining transformer-based image and spectrum encoders in a self-supervised setting. Second, the two encoders are aligned using a contrastive loss. This approach is applied to spectra from the Dark Energy Spectroscopic Instrument (DESI) and images from the corresponding Legacy Imaging Survey. AstroCLIP demonstrates strong performance across all downstream tasks, often surpassing supervised baselines. For photometric redshift estimation, it matches a ResNet18 trained specifically for that task. For estimating stellar mass, age, metallicity, and specific star formation rate, AstroCLIP outperforms the supervised baseline by 19 percent in terms of R-squared. Compared to a state-of-the-art self-supervised single-modal model for galaxy images, AstroCLIP improves photometric redshift estimation and physical property prediction by roughly a factor of two in terms of R-squared, while achieving similar performance on morphology classification. AstroCLIP is the first cross-modal self-supervised model for galaxies and introduces the first self-supervised transformer-based architectures for galaxy images and spectra.
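
The second step described in the abstract, aligning the two pretrained encoders with a contrastive loss, follows the CLIP recipe: matching image/spectrum pairs of the same galaxy are pulled together in the shared latent space while mismatched pairs are pushed apart. The sketch below is illustrative only; the function name, the NumPy implementation, and the temperature value are assumptions, not taken from the paper.

```python
import numpy as np

def clip_contrastive_loss(img_emb, spec_emb, temperature=0.07):
    """Symmetric InfoNCE (CLIP-style) loss over a batch of paired embeddings.

    img_emb, spec_emb: (batch, dim) arrays from the two pretrained encoders;
    row i of each array comes from the same galaxy (the positive pair).
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    spec = spec_emb / np.linalg.norm(spec_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix; the diagonal holds the positive pairs
    logits = img @ spec.T / temperature

    def cross_entropy_on_diagonal(l):
        # softmax cross-entropy where the correct "class" for row i is column i
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # average the image->spectrum and spectrum->image directions
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))
```

Minimizing this loss is what makes cross-modality similarity search possible: after alignment, a galaxy image can be used to retrieve the nearest spectra in the shared space, and vice versa.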