Today’s MiWORD of the day is … YOLO!

YOLO? You Only Live Once! Go and have some adventures before life slips away in ordinary days, as in The Motto by Drake.

Well, maybe we should leave the lecture hall of PCS100 (Popular Culture Study) and go back to the classroom of computer science and statistics. In the world of algorithms, YOLO stands for You Only Look Once. As the name suggests, it is confident in its own efficiency: one look is enough. But what is this powerful algorithm, and how does it work?

YOLO is an object detection algorithm based on bounding box regression. It recognizes the classes of objects in an image and bounds those objects with predicted boxes, so classification and localization are completed at the same time. Compared with earlier region-based algorithms like R-CNN, YOLO is more efficient because it is region-free: it needs no separate region proposal step.

Earlier object detection methods typically slide a window across the whole image and check whether each window contains an object. Region-based algorithms like R-CNN use region proposals to reduce the number of windows to check. YOLO is different: it makes predictions on the entire image in a single pass. In a fishing analogy, R-CNN first divides the water into regions and picks the ones where fish might be, while YOLO casts one net and catches all the fish at once. YOLO divides the image into a grid, and each grid cell is responsible for detecting an object whose center falls inside that cell by predicting bounding boxes with confidence scores. When several overlapping boxes claim the same object, non-maximum suppression keeps only the box with the highest confidence. The remaining boxes, together with their confidence and class predictions, give the final classification and localization of each object in the image.
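To make the grid and suppression steps concrete, here is a minimal sketch in plain Python. It is not the code from my project or from any YOLO release: the function names (responsible_cell, iou, non_max_suppression), the corner-based box format, and the 0.5 overlap threshold are all assumptions chosen for illustration, and it handles a single class only.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def responsible_cell(cx: float, cy: float, img_w: float, img_h: float, s: int) -> Tuple[int, int]:
    """Return the (row, col) of the s-by-s grid cell that contains the object center."""
    col = min(int(cx / img_w * s), s - 1)  # clamp so a center on the right edge stays in the grid
    row = min(int(cy / img_h * s), s - 1)
    return row, col


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Example values below are made up for illustration.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))       # [0, 2]: the second box duplicates the first
print(responsible_cell(208, 104, 416, 416, 13)) # (3, 6)
```

In this example the first two boxes overlap heavily, so only the more confident one survives, while the third box covers a different object and is kept. A real detector would also run this per class and usually discards low-confidence boxes before suppression.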

As region-free algorithms have developed, several versions of YOLO have been released. One practical and advanced version is YOLOv3, which is also the version I used in my project. It is widely applied in many fields, including the popular area of autonomous driving and … also medical imaging analysis! YOLOv3 is popular because it is efficient and simple to use, which can save any potential user a lot of time.

Now we can go to the fun part! Using YOLO in a sentence by the end of the day (I mixed the serious and the not-so-serious):

Manager: “Where is Kolbe? He was supposed to finish detecting all the tumors in these CT images tonight! Has he already gone through all the thousands of images in the past hour?”

Yvonne: “Well, he was pretty stressed about his workload and asked me if there was any quick method that could help. I said YOLO.”

Manager: “That sounds good. The current version has good performance in many fields, and I bet it could help. Wait, but where did he go? He should be training models right now.”

Yvonne: “No idea. He just got excited and shouted YOLO, turned off the computer and left quickly without a word. I think he was humming something like Tik Tok while on the phone with his friends.”

Manager: “Okay, I can probably guess what happened. I need to have a talk with him tomorrow…”

See you in the blogosphere! 

Jihong Huang

Jihong Huang’s ROP399 Journey

Hi, my name is Jihong Huang and I have finished my third year in computer science and statistics at the University of Toronto. This summer, I had the great chance to work on my ROP399 project under the guidance of Dr. Pascal Tyrrell. During the pandemic, everything was a bit different from usual, including this program. Still, I would like to share my experience and lessons from this summer with you!

After three years at the university and so many different courses in statistics and computer science, I thought I was fully prepared to try a research project with the knowledge I had learnt in lectures. However, it turned out that I was completely wrong! Everything was different from lectures, where professors teach step by step with detailed notes. I needed to create my own proposal and design the experiments independently, like a scholar instead of a student. Even with Dr. Tyrrell’s help, I struggled to figure out my schedule for the project. The experience was quite unlike the time I had spent on course assignments.

After all the setup, I began to handle the coding part of my project. I picked YOLOv3 as my application of bounding box regression. YOLOv3 is one of the most popular bounding box regression algorithms and it already performs very well in many fields. At the same time, its structures and mechanisms are longer and more complicated than any code I had ever studied. It looks like just a combination of classification and localization, where each individual algorithm is easy to understand, but the combination is much more advanced than my lecture notes! It took me weeks to roughly figure out its mechanism. Then I devoted myself to debugging the code. That was difficult, as I was not familiar with most of the packages used. Some issues were caused by mismatched package versions, while others came from subtle mistakes in the code. Tuning the hyperparameters was also frustrating, as I usually could not find the optimal values for them. Thanks to the great help from Mauro, I finally got my code to run successfully on the server.

By the end of my project, I had gained a lot of advanced knowledge about bounding box regression and many related packages, which I would probably never have touched before graduation if I had not taken this project. However, my most precious lessons are not about any specific coding ability. The most important lesson is what scientific research is and how it should be done. I learnt that it is very important to make a clear and specific proposal at the beginning, as it provides the guidelines for all the later experiments and coding. Otherwise, it would be easy to go off track and lose sight of the initial goal when thousands of lines of code become overwhelming. Also, there will always be failures in scientific research. I spent more than half of my time making and fixing mistakes during the project, which frustrated me a lot along the way. Even my final conclusion suggested that the selected algorithm did not perform well. But all of this is common in scientific research. Failures are meaningful because we learn from them, and we can make further progress based on them. Thanks to Dr. Tyrrell and all the other lab members, who helped me out of my frustration during the project and offered me valuable advice.

After this three-month project, I have learnt a lot from my first attempt at scientific research, including coding skills and the scientific spirit. This experience gave me important guidance on my future direction of study, and I think all the time and effort were worthwhile.

– Jihong Huang