Evaluation metric code

Aramis_Vesal · May 13, 2019, 11:38am

Hi,

I would like to suggest that the organizers share the Python script for Dice metric evaluation, as past challenges such as the Segmentation Decathlon and ACDC did. This would reduce evaluation errors on both sides, for participants and organizers alike.

Regards

neheller · May 14, 2019, 2:17pm

Thanks for the suggestion. We were planning to release that with the test data but we can certainly do it sooner. I’ll try to get around to it this week.

Aramis_Vesal · May 26, 2019, 10:35am

Hi @neheller,

Would you please let us know when you will release the evaluation metric code?

neheller · May 26, 2019, 4:49pm

Will do, apologies for the delay.

neheller · May 28, 2019, 7:01pm

You can find the evaluate function in starter_code/evaluation.py. The meat of it is below:

# Compute tumor+kidney Dice
tk_pd = np.greater(predictions, 0)
tk_gt = np.greater(gt, 0)
tk_dice = 2*np.logical_and(tk_pd, tk_gt).sum()/(
    tk_pd.sum() + tk_gt.sum()
)
# Compute tumor Dice
tu_pd = np.greater(predictions, 1)
tu_gt = np.greater(gt, 1)
tu_dice = 2*np.logical_and(tu_pd, tu_gt).sum()/(
    tu_pd.sum() + tu_gt.sum()
)

Please let me know if you have any questions or concerns.
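For anyone who wants to try the two computations above outside the starter code, here is a minimal self-contained sketch. The function name `dice_scores` and the toy arrays are hypothetical and are not taken from starter_code/evaluation.py. The label convention (0 = background, 1 = kidney, 2 = tumor) is inferred from the thresholds in the snippet above.

```python
import numpy as np


def dice_scores(predictions, gt):
    """Return (tumor+kidney Dice, tumor Dice) for two integer label arrays.

    Hypothetical helper for illustration; assumes labels 0 = background,
    1 = kidney, 2 = tumor, as the thresholds in the snippet suggest.
    """
    # Tumor+kidney: every voxel with a label greater than 0
    tk_pd = np.greater(predictions, 0)
    tk_gt = np.greater(gt, 0)
    tk_dice = 2 * np.logical_and(tk_pd, tk_gt).sum() / (tk_pd.sum() + tk_gt.sum())

    # Tumor only: every voxel with a label greater than 1
    tu_pd = np.greater(predictions, 1)
    tu_gt = np.greater(gt, 1)
    tu_dice = 2 * np.logical_and(tu_pd, tu_gt).sum() / (tu_pd.sum() + tu_gt.sum())

    return tk_dice, tu_dice


# Toy 3x3 "volume", made up for this example
gt = np.array([[0, 1, 1],
               [0, 2, 2],
               [0, 0, 0]])
predictions = np.array([[0, 1, 1],
                        [0, 1, 2],
                        [0, 0, 1]])

tk_dice, tu_dice = dice_scores(predictions, gt)
print(tk_dice)  # 2*4 / (5 + 4) = 8/9
print(tu_dice)  # 2*1 / (1 + 2) = 2/3
```

Note that if a class is absent from both the prediction and the ground truth, the denominator is zero. This sketch does not handle that case.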

Lihao_LIU · July 12, 2019, 8:10am

Hi, I think there is a slight problem with the evaluation metric.

It should be 2.0 * np.logical… instead of 2 * np.logical…

The Dice score should be a float rather than an integer.

neheller · July 12, 2019, 11:23am

Thanks for pointing this out. I should have mentioned that this is meant to be run in Python 3, where the / operator always performs true division and returns a float. I haven’t tested it in Python 2, but you may very well be correct that it causes a problem there. Luckily, the evaluation platform on grand-challenge.org runs Python 3.6.
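To illustrate the difference being discussed, here is a small sketch run under Python 3. The voxel counts are made up. Floor division (`//`) is used here as a stand-in for how `/` treats integer operands in Python 2, and I have not run this under Python 2 itself.

```python
import numpy as np

# Made-up integer voxel counts, as .sum() on a boolean mask would return
intersection = np.int64(3)
total = np.int64(8)

# Python 3: `/` is true division, so the result is a float
true_div = 2 * intersection / total

# Floor division, standing in for Python 2's integer `/`
floor_div = 2 * intersection // total

print(true_div)   # 0.75
print(floor_div)  # 0
```

Under Python 2, writing 2.0 * ... as suggested above, or adding `from __future__ import division` at the top of the file, should give the float result.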