Risks and solutions for StyleGAN2-ADA Dual Conditioned on ISIC 2020 and Fitzpatrick17k combined? #8847
Replies: 1 comment
I haven't trained StyleGAN2-ADA on this exact ISIC-2020 plus Fitzpatrick17k mix. But I've spent a lot of time on skin-lesion fairness benchmarking with HAM10000 and DDI, and most of the risk here comes from what these two datasets are, not from the architecture. A few things I'd check before committing.

1. The biggest source of variation will be the imaging modality, not lesion type or skin tone. ISIC-2020 is dermoscopic. The optics are fairly standardized and the lesion fills the frame. Fitzpatrick17k is clinical photos pulled from dermatology atlases, so you get variable lighting, framing, and zoom. Merge them as-is and the easiest signal for the discriminator is "dermoscopic vs clinical." You'll probably see the generator split into two modes or collapse toward one. ADA helps when data is limited. It does nothing for a real domain gap.

2. You don't actually have both labels on any single image. Conditioning on lesion type and skin tone together needs both labels per image. ISIC-2020 has malignancy labels but no reliable Fitzpatrick type. Fitzpatrick17k has skin type but a different diagnosis taxonomy and no dermoscopic malignancy label. So you'd have to impute one side. One option is estimating skin tone on ISIC from peri-lesional skin using ITA (Individual Typology Angle). Another is mapping both taxonomies onto a shared coarse label. Both add noise, and the Fitzpatrick17k skin-type labels are already known to be noisy. That's a validity risk worth planning for up front.

3. A practical note on the conditioning. StyleGAN2-ADA's built-in

4. If the real goal is fairness augmentation, meaning you want synthetic darker-skin lesions to rebalance the data, be careful with evaluation. Downstream classifiers can learn the generator's fingerprint. Measuring fairness gains on GAN-augmented data can also turn circular. I'd hold out splits that are disjoint by modality and by tone, report per-subgroup FID instead of one global number, and run a domain-classifier probe on the generated images to see how separable they still are.

Happy to go deeper on the ITA tone estimation or the disjoint-split setup if it helps.
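For the ITA estimate mentioned in point 2, here is a minimal sketch of the calculation, assuming you already have the mean sRGB colour of a lesion-free skin patch. The function names, the D65 white point, and the example colours are my own choices for illustration. The category thresholds are the commonly cited ITA bins, so check them against whichever reference you follow, and note that ITA bins are only a rough proxy for Fitzpatrick type.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB colour to CIELAB (D65 white point assumed)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def ita_degrees(r, g, b):
    """ITA = arctan((L* - 50) / b*) in degrees.

    atan2 is used to avoid dividing by zero; it matches the usual
    formula whenever b* > 0, which is the normal case for skin.
    """
    L, _, b_star = srgb_to_lab(r, g, b)
    return math.degrees(math.atan2(L - 50.0, b_star))

def ita_category(ita):
    """Commonly cited ITA bins (assumption: verify against your reference)."""
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"

# Hypothetical mean colours of two peri-lesional patches
light_patch = ita_degrees(240, 200, 180)
dark_patch = ita_degrees(90, 60, 45)
print(ita_category(light_patch), ita_category(dark_patch))
```

On dermoscopic images the illumination and any colour calibration shift L* and b*, so ITA computed this way is noisy. I'd treat it as a coarse bin and not a per-image ground truth.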
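On the multi-label question, one workaround is to flatten the (lesion type, skin tone) pair into a single class index so that a single-label conditional GAN can consume it. The sketch below is an illustration under my own assumptions: the coarse lesion classes and tone bins are hypothetical, and the `dataset.json` layout with a `labels` list of `[filename, class_index]` pairs is my understanding of what the stylegan2-ada-pytorch dataset loader expects, so verify it against the repo before relying on it.

```python
import json
from itertools import product

# Hypothetical shared coarse taxonomy and grouped Fitzpatrick bins
LESION_CLASSES = ["benign", "malignant"]
TONE_BINS = ["I-II", "III-IV", "V-VI"]

# Every (lesion, tone) pair gets one flat class index
JOINT_CLASSES = list(product(LESION_CLASSES, TONE_BINS))
JOINT_INDEX = {pair: i for i, pair in enumerate(JOINT_CLASSES)}

def joint_label(lesion, tone):
    """Map a (lesion, tone) pair to a single integer class."""
    return JOINT_INDEX[(lesion, tone)]

def build_dataset_json(records):
    """Serialize records into an assumed {"labels": [[file, class], ...]} layout."""
    labels = [[r["file"], joint_label(r["lesion"], r["tone"])] for r in records]
    return json.dumps({"labels": labels}, indent=2)

# Hypothetical records; in practice one of the two fields is imputed
records = [
    {"file": "isic_0001.png", "lesion": "malignant", "tone": "III-IV"},
    {"file": "fitz_0001.png", "lesion": "benign", "tone": "V-VI"},
]
print(build_dataset_json(records))
```

The trade-off is that the class count is the product of the two label sets, so sparse cells (for example malignant lesions on the darkest tone bin) get very few real images. It's worth counting images per joint class before training.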
Has anyone trained StyleGAN2-ADA on a combined ISIC-2020 and Fitzpatrick17k dataset? I'm wondering how the domain shift between the two was handled, and whether anyone has extended the conditioning to multiple simultaneous labels (lesion type + skin tone). Looking for experienced opinions before committing to this architecture.