TY - JOUR
T1 - RADiff: Controllable diffusion models for radio astronomical maps generation
AU - Sortino, Renato
AU - Cecconello, Thomas
AU - De Marco, Andrea
AU - Fiameni, Giuseppe
AU - Pilzer, Andrea
AU - Magro, Daniel
AU - Hopkins, Andrew M.
AU - Riggi, Simone
AU - Sciacca, Eva
AU - Ingallinera, Adriano
AU - Bordiu, Cristobal
AU - Bufano, Filomena
AU - Spampinato, Concetto
PY - 2024
Y1 - 2024
AB - With the Square Kilometre Array (SKA) nearing completion, demand is growing for accurate and reliable automated methods to extract valuable information from the vast amount of data it will produce. Automated source finding is a particularly important task in this context, as it enables the detection and classification of astronomical objects. Deep-learning-based object detection and semantic segmentation models have proven suitable for this purpose. However, training such deep networks requires a large volume of labeled data, which is not trivial to obtain in radio astronomy. Because the data must be labeled manually by experts, the process does not scale to large datasets, which limits the use of deep networks for many tasks. In this work, we propose RADiff, a generative approach based on conditional diffusion models trained on an annotated radio dataset. RADiff generates synthetic images containing radio sources of different morphologies, which can augment existing datasets and mitigate the problems caused by class imbalance. We also show that it is possible to generate fully synthetic image-annotation pairs to automatically augment any annotated dataset. We evaluate the effectiveness of this approach by training a semantic segmentation model on a real dataset augmented in two ways: 1) with synthetic images generated from real masks; and 2) with images generated from synthetic semantic masks. Finally, we show how the model can be applied to populate background noise maps to simulate radio maps for data challenges.
KW - Data-augmentation
KW - diffusion-models
KW - generative models
KW - radio-astronomy
KW - semantic-image-synthesis
UR - http://www.scopus.com/inward/record.url?scp=85200248470&partnerID=8YFLogxK
U2 - 10.1109/TAI.2024.3436538
DO - 10.1109/TAI.2024.3436538
M3 - Article
AN - SCOPUS:85200248470
SN - 2691-4581
VL - 5
SP - 6524
EP - 6535
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 12
ER -