5 Tips for Students of PhD in AI to succeed

Nearly a year has passed since I completed my four-year Ph.D. journey, a period dedicated to exploring the intricate convergence of medical imaging and artificial intelligence. In that time, the landscape of AI has transformed dramatically. This field evolves at a breathtaking pace, with a pivotal moment being the 2012 publication of AlexNet, a milestone that significantly accelerated AI’s industrialization.

I first delved into AI and machine learning within the medical domain in 2017. My initial endeavor involved developing innovative ensembling techniques for feature selection methods. This work culminated in a publication in the Information Sciences journal. Despite facing numerous challenging rebuttals, the paper was ultimately published in 2019.

#1 Most crucial skill of PhD in AI: Mathematics

PhD in AI studying math

The availability of frameworks indeed lowered the bar of mathematical knowledge you need to know to work with AI. If you want to perform classification tasks on natural images, it is easy. Get a pre-trained network from torch/torchvision, manipulate the network if needed, change the number of classes in the classification head, choose a loss function, and train. This would be doable nowadays with ChatGPT without any skill. But what about research?

The review process of new research is flawed, and we all know it. Reviewers are not paid anything to do the hard work, and the publishing companies keep all the profit. Review statistics of NeurIPS (one of the most prestigious conferences on ML/AI) can be found in the ArXiv paper.

See the table for yourself — two committees reviewed the same 882 duplicated papers. So, for instance, the first committee selected 25 papers for the spotlight. The second committee rejected 13 of them! That is more than 50% disagreement. The first committee selected 177 papers for the poster version. The second one left 94 of them. Seventy-four were selected for poster, 7 for spotlight, and 2 for oral!

Original committee / Shadow commiteeOralSpotlightPosterRejectWithdrawn
Source: Comparison of review decisions for two committees for the same papers.

Researchers, especially in AI, try to include math details irrelevant to the paper to overwhelm the reviewers. A question on academia stack exchange with 50 upvotes asks whether reviewers skim over the equation… So, to keep track of the newest research, you must understand the math instantly and read it without any hesitation. Basic notation for gradients and mathematical functions (cross-entropy, min-max loss in GAN loss, energy-based models, backprop., l1/l2 losses) must not be overwhelmed in even simpler papers.

If I had been given the chance to do all of it again, I would have started the PhD with sufficient mathematics knowledge not to be intimidated by the new work and mathematical notations.

#2 Good practice for PhD in AI: Follow all the best conferences as soon as they are published

As I’ve observed, the pace of AI research is incredibly rapid, almost beyond belief. The volume of papers published annually is staggering. It’s impractical to keep up with all of them, yet staying motivated and generating new ideas requires keeping abreast of the foremost conferences in the field. Key conferences in AI include AAAI, NeurIPS, ICLR, ICML, while in the computer science-medical domain, notable ones are MICCAI, ISBI.

However, it’s not just about the conferences. Many researchers extend their conference presentations with additional proofs, experiments, or datasets and later publish them in reputable journals. It’s important to be discerning about the sources of your information; avoid journals from publishers with dubious reputations, such as (see here, here, here), etc. For high-quality research in computer science, look to journals like IEEE Transactions on Neural Networks, IEEE Transactions on Medical Imaging, IEEE Transactions on Pattern Recognition and Machine Intelligence, etc. Check out my previous post for an example of what can be published at MICCAI.

#3 Tips for sanity of PhD in AI: Collaborate!

two phd in AI behing a computer

I am from Slovakia. That country is small, 5.5 million people. When I started, there was not any GPU equipment whatsoever to begin with. Google Colab was yet to be created, and there were not any alternatives. Before the advent of Google Colab, and with no alternatives available, this lack of resources led to a significant gap in AI awareness at our university. Consequently, I found myself building many projects from the ground up. Unbeknownst to me then, being a Ph.D. student was vital for the senior academic staff.

Today, prestigious conferences like MICCAI and NeurIPS facilitate mentor/mentee programs, making it easier than ever to connect with experienced professionals worldwide. While reaching out directly to renowned figures like Prof. Le Cun or Prof. Bengio might be ambitious, there are numerous opportunities for collaboration that can greatly benefit your research. My early days were marked by ample time but a dearth of ideas. Now, as ideas abound, finding the time for pure research has become a new challenge.

#4 Playing long game for PhD in AI: Stick to one programming framework and plan the projects with simplicity in mind

Throughout my research, I frequently encountered reviewers requesting additional experiments. In instances where I provided two experiments, they asked for three; when I had one, they wanted two. And sometimes, even with three experiments, there was a call for a different deep learning architecture.

What does this teach us? The necessity of conducting complex experiments becomes inevitable, often revisiting previous projects after some time. It’s essential to maintain thorough documentation and adhere to common patterns in your work. Avoid getting sidetracked by the latest and most hyped frameworks; they often come and go, with the exception of enduring ones like Torch and TensorFlow (though even TensorFlow’s longevity can be debated). I recall frameworks like Theano and Caffe, which were once popular.

My advice is to understand the entire process of training deep learning networks before relying on high-level frameworks like fast.ai or Keras. Ensure to archive your code and strive for clean, readable coding practices.

This approach is key in disseminating your research effectively. After all, what’s the purpose of publishing papers if they don’t gain traction and influence the field?

#5 Divide and Conquer as a good PhD in AI

Research in AI is undeniably challenging, as I often find myself repeating. Consider the implementation of a paper like “Latent Diffusion” – it demands extensive knowledge, and yet, the details provided are rarely sufficient for direct replication. Take, for instance, the initial step involving a Vector Quantized Variational Autoencoder with Discriminator. What was groundbreaking in 2014 with a simple VAE is now a basic prerequisite.

Achieving a correct implementation is just the beginning. It then needs to be adapted for your specific data, hardware, and software versions, and success is not guaranteed. How do you troubleshoot a deep learning model? Even when code is available on platforms like GitHub, as with the Latent Diffusion paper, it’s often complex and non-intuitive.

When I began a new project, aiming to reimplement a paper, I initially tried to tackle everything simultaneously. My advice now is to never be daunted by the mathematics or coding. Break down the problem: start with creating the autoencoder, then integrate the VAE, follow with the vector quantization layer, and finally, add the discriminator, evaluating each step.

Consider this example: encoding and decoding images (compressing high-dimensional images into a latent space and back). With VQVAE, the results were decent, but the addition of a discriminator in VQVAE+Disc significantly enhanced reconstruction accuracy. This step-by-step approach revealed that the discriminator vastly improved quality – a critical detail not mentioned in the original paper. Without this incremental method, I might have missed this insight, leading to a subpar performance in the latent diffusion process.

Left – Ground truth Data. Right – Reconstructed data with VQVAE
Comparison of reconstruction of two images.
Left – Ground truth Data. Right – Reconstructed data with VQVAE + Discriminator

As I reflect on these lessons from my PhD journey in AI, I’m constantly reminded of the field’s ever-evolving nature and the unending demand for learning. Each challenge faced and obstacle overcome is a step towards deeper understanding and innovation.

To those embarking on or navigating your own AI research journey, remember that your unique experiences will contribute significantly to this dynamic field. I encourage you to share your insights, questions, and stories in the comments below or reach out via your preferred contact method. Let’s build a community where we can learn from each other, celebrate our achievements, and collaboratively overcome challenges.

Stay curious, stay connected, and let’s collectively push the boundaries of AI in healthcare and beyond.