Hi everyone! I’m Amar Dholakia, a fourth-year student and recent graduate who majored in Neuroscience and Statistics. I’m starting a Master’s in Biostatistics at UofT in the fall of 2020. I’ve had the pleasure of being a part of Dr. Tyrrell’s lab for almost two years now and would like to take the opportunity to reflect on my time here.
I started in Fall 2018 as a work-study student, tasked with managing the Department of Medical Imaging’s database. A highlight was discussing and learning about my peers’ work, which sparked my initial interest in the field of artificial intelligence and data science.
The following fall, I began a fourth-year project in statistics, STA498Y, under the supervision of Dr. Tyrrell. My project investigated whether clustering image features is a viable way to assess how dataset heterogeneity affects deep convolutional network accuracy. Specifically, I compared the behaviour of six clustering algorithms to see if the choice of algorithm affected the ability to capture heterogeneity.
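To give a flavour of what comparing clustering algorithms can involve, here is a minimal sketch. It is not the project's actual code. It assumes each algorithm has already assigned a cluster label to every image, and it uses the Rand index (chosen here purely for illustration) to measure how often two algorithms agree on whether a pair of images belongs together. The function name and the label lists are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two samples
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agreements = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical cluster labels for six images from two different algorithms.
partition_based = [0, 0, 0, 1, 1, 1]
density_based = [1, 1, 0, 0, 0, 0]

score = rand_index(partition_based, density_based)
print(f"Rand index between the two clusterings: {score:.3f}")
```

A score of 1.0 means the two algorithms group the images identically (cluster names do not matter, only who is grouped with whom), while lower scores mean they disagree on more pairs.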
My project started out with reaching out to my labmate and good friend Mauro Mendez, who had recently undertaken a project very similar to mine. He sent me his paper, which I read, and re-read, and re-re-read… It took me about four months to even begin to grasp what Mauro had explored, and how I could use what he had learned to develop my project. But the months of struggle were definitely worth the “a-ha!” moment.
First, I replicated Mauro’s results using Fuzzy K as the clustering method to make sure I was on the right track. Reading, coding, and testing the very first time was a nightmare – I had some Python experience but had never applied it before. It took a lot of back and forth with Mauro and Dr. Tyrrell, and a lot of learning, understanding, and re-learning what I THOUGHT I understood, to get me on the right track. By the start of the Winter term, I had finally conjured preliminary results – banging my head on the wall was slowly becoming worth it.
Once I had the code basics down, getting the rest of the results was relatively smooth sailing. I computed and plotted how model accuracy changed with sample size, and how heterogeneity in model accuracy changed with sample size, as captured by different clustering methods. My results for one model were great from the get-go – I was set! I thought to challenge myself by generalizing to a second model – and that was far from easy. But by taking on that extra challenge, I felt I learned more about my project and, importantly, how to scientifically justify my results. The results didn’t match up, and I had to support my rationale with evidence from the literature. If I couldn’t find an explanation, I may have done something incorrectly. And lo and behold, my ‘inexplicable’ results were in fact due to human error – something I very painstakingly troubleshot, but now I understand my results much better and can justify them.
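As a rough illustration of the kind of calculation described above, the sketch below computes overall accuracy and a heterogeneity measure at two sample sizes. It rests on assumptions that do not come from the project itself: heterogeneity is approximated by the standard deviation of per-cluster accuracies, and the sample sizes, cluster labels, and correctness flags are made-up values. The function names are hypothetical too.

```python
from collections import defaultdict
from statistics import pstdev


def accuracy_by_cluster(cluster_labels, correct):
    """Return {cluster: accuracy} from per-image cluster labels and 0/1 correctness flags."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for label, is_correct in zip(cluster_labels, correct):
        totals[label] += 1
        hits[label] += int(is_correct)
    return {label: hits[label] / totals[label] for label in totals}


def accuracy_spread(cluster_labels, correct):
    """Population standard deviation of per-cluster accuracies.

    This is an assumed proxy for heterogeneity, used only for illustration.
    """
    per_cluster = accuracy_by_cluster(cluster_labels, correct)
    return pstdev(list(per_cluster.values()))


# Hypothetical results at two training-set sizes: the cluster label of each
# test image, and whether the CNN classified that image correctly (1) or not (0).
results = {
    100: ([0, 0, 0, 0, 1, 1, 1, 1], [1, 1, 1, 1, 1, 0, 0, 0]),
    1000: ([0, 0, 0, 0, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 0]),
}

for sample_size, (labels, correct) in sorted(results.items()):
    overall = sum(correct) / len(correct)
    spread = accuracy_spread(labels, correct)
    print(f"n={sample_size}: accuracy={overall:.3f}, spread across clusters={spread:.3f}")
```

Repeating this for each clustering method and plotting the spread against sample size would give one curve per method, which is one way to see whether the methods tell the same story.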
Ultimately, we showed that regardless of clustering technique or CNN model, clustering could effectively detect how heterogeneity affected CNN accuracy. To me, this was an interesting result, as I had expected vastly different behaviour between partition-based and density-based clustering. Nonetheless, it was welcome, as it suggested that any clustering method could be used to assess how heterogeneity affects CNN accuracy.
I struggled most with truly appreciating what my research aimed to solve. I attribute this partially to not being as proactive with my readings and questions to Dr. Tyrrell to really verify my understanding. And to be honest, exploring this project is still a work-in-progress – something I will continue learning about this summer!
My advice to any future students – read, read, read! Diving into a specific academic niche is truly a wonderful experience. The learning curve was steep and initially involved a lot of trying, failing, fixing, and then trying again. But this experience only reinforced my notions of “success through failure” and “growth through struggle”. It may be challenging at first, but with some perseverance and support from a wonderful PI – like Dr. Tyrrell – you’ll be able to accomplish so much more than you originally imagined.