A Systematic Comparison of Robustness in Bayesian Deep Learning on Diabetic Retinopathy Diagnosis Tasks