Tyrrell4innovation

My name is Yiyun Gu, and I am a fourth-year student studying mathematics and statistics at the University of Toronto. After taking several statistics and machine learning courses, I became interested in applying these methods in practice. Medical imaging is a popular field in which machine learning has had a great impact, so I contacted Dr. Pascal Tyrrell, and he agreed to supervise me.

Last September, my initial research direction was Bayesian optimization of the hyperparameters of Convolutional Neural Networks (CNNs), based on information from previous models and their distributions. In addition to giving me guidance himself, Dr. Pascal Tyrrell introduced me to one of his graduate students who was also interested in this field. We had weekly meetings to discuss how to make the idea implementable. I read many papers and learned about Gaussian processes, acquisition functions and surrogate functions. However, a major challenge was how to update the hyperparameters of the prior distribution based on information from the CNN models, and I became anxious about my progress. Dr. Pascal Tyrrell encouraged me to shift direction slightly, because he cared about what a student learned from the project and how the student felt about it.

Since November, out of interest in Bayesian concepts, I have been working on a project comparing frequentist CNNs and Bayesian CNNs for projects with sample size restrictions. Because there may not be sufficient data in medical imaging, I wanted to determine whether Bayesian CNNs would benefit from prior information on small datasets and outperform frequentist CNNs. Bayesian CNNs update distributions over the weights and biases, while frequentist CNNs use point estimates. Publicly available code for Bayesian CNNs was limited, so I adapted what I could find in order to run experiments with training sample sizes ranging from 500 to 50,000. I applied customized architectures and AlexNet to the MNIST and CIFAR-10 datasets. I found that Bayesian CNNs did not perform as well as I had expected: frequentist CNNs achieved higher accuracy and took less time to train. However, Bayesian CNNs have an interesting feature: they incorporate a measure of uncertainty. Because Bayesian CNNs hold distributions over their weights, they can also produce distributions over their outputs. Bayesian CNNs can therefore indicate how confident they are in a decision.
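
To illustrate that last point, here is a minimal sketch in Python with NumPy. It is not the code I used in the project: instead of a CNN it uses a single linear layer with a softmax output, and the weight means and standard deviations (`w_mean`, `w_std`), the input `x` and the function name `predictive_distribution` are all made up for the example. The idea is the same, though: sample the weights many times from their distribution, run the input through each sampled model, and look at how much the predicted class probabilities vary.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predictive_distribution(x, w_mean, w_std, n_samples=200, seed=0):
    """Monte Carlo estimate of the output distribution for one input.

    Assumes (for illustration only) an independent Gaussian distribution
    over each weight of a single linear layer.
    """
    rng = np.random.default_rng(seed)
    # Draw n_samples weight matrices, shape (n_samples, n_features, n_classes).
    w = rng.normal(w_mean, w_std, size=(n_samples,) + w_mean.shape)
    # One forward pass per sampled weight matrix.
    probs = softmax(np.einsum("f,sfc->sc", x, w))
    mean_probs = probs.mean(axis=0)   # averaged prediction
    spread = probs.std(axis=0)        # how much the samples disagree
    # Entropy of the averaged prediction: higher means less confident.
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12))
    return mean_probs, spread, entropy


# Hypothetical input with 3 features and a 2-class model.
x = np.array([1.0, -0.5, 2.0])
w_mean = np.array([[2.0, -1.0], [0.5, 0.3], [1.0, -1.0]])

# Narrow weight distributions versus wide ones.
confident = predictive_distribution(x, w_mean, np.full_like(w_mean, 0.05))
uncertain = predictive_distribution(x, w_mean, np.full_like(w_mean, 3.0))

print("narrow weights:", confident[0], "entropy", confident[2])
print("wide weights:  ", uncertain[0], "entropy", uncertain[2])
```

With narrow weight distributions the sampled models agree and the entropy of the averaged prediction is low; with wide distributions they disagree and the entropy is higher. A frequentist model with point-estimate weights gives only a single forward pass, so it cannot provide this kind of spread.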

I hope to apply more Bayesian CNN architectures to more datasets in medical imaging projects, because architectures and datasets have a great influence on performance. I would also like to try more prior distributions and learn how to determine which ones are most appropriate.

I had a great research experience on this project, thanks to Dr. Pascal Tyrrell’s guidance and the help of the graduate students. It was my first time writing a scientific report, and Dr. Pascal Tyrrell gave me ongoing instruction on how to write it and offered great advice. I really appreciated the guidance and enjoyed this unique research experience in the final year of my undergraduate studies. I look forward to contributing to medical imaging research and to more opportunities to apply machine learning methods!

Yiyun Gu