Skanda Vaidyanath

Second Year Master’s Student of Computer Science (AI Track)

Stanford University


I am a second year master’s student of computer science (AI track) at Stanford University.

My research interests lie primarily in reinforcement learning (RL) and control. Through my research, I hope to better understand how machines learn and to help them learn tasks the way humans do. My long-term goal is to develop machines that approach learning tasks as a human would and that generalize to a variety of settings.

I recently completed an internship at DeepMind, where I worked with Xinghua Lou, Miguel Lazaro-Gredilla, Dileep George and others on planning and reinforcement learning!

During my undergraduate studies, I was advised by Prof. N. L. Bhanu Murthy. In the past, I’ve also had the good fortune of working with Prof. Kallirroi Georgila, Prof. David Traum, Prof. Andrew Yates, Dr. Paramita Mirza and Prof. Sriram Rajamani.

If you are interested in my work or would like to chat about technical interests we might share, feel free to get in touch!


Interests

  • Reinforcement Learning
  • Sequential Decision Making


Education

  • MS in Computer Science, AI track, 2023

    Stanford University

  • BE (Hons) in Computer Science with a Minor in Data Science, 2020

    BITS Pilani, Hyderabad Campus

Recent Experience


Research Engineer Intern

Google DeepMind

Jun 2022 – Sep 2022 Mountain View, USA
Worked on planning and reinforcement learning.

Research Intern

Microsoft Research

Dec 2020 – Jul 2021 Bangalore, India
Used program synthesis techniques to generate code from multi-modal user input using large language models like GPT-3. Developed a Jupyter notebook extension that generates code for the Pandas library in Python from user commands and I/O examples.

Research Intern

Max Planck Institute for Informatics

Aug 2019 – May 2020 Saarbrücken, Germany
Worked on deep reinforcement learning applications in NLP and IR to build a conversational recommender system and to improve document retrieval performance as a part of my undergraduate thesis.

Research Intern

USC Institute for Creative Technologies

May 2019 – Jul 2019 Los Angeles, USA
Developed reinforcement learning algorithms for human-swarm interactions at the Natural Language Dialogue Group, as part of the IUSSTF-Viterbi summer internship programme.

Blog Posts

An Overview of Bandits

In this post, we will take a look at the bandit problem and discuss some solution strategies. This is a fairly introductory overview so …

Inverse Reinforcement Learning

This is a review of the paper Algorithms for Inverse Reinforcement Learning. I recommend some reinforcement learning (RL) basics before …

Modeling RL Problems

Disclaimer: The content for this article does not come from any textbook or other reliable sources. They are observations made purely …

Bridging the Gaps With Reinforcement Learning

In this post, I will be talking about a unique way to use reinforcement learning (RL) in deep learning applications. I definitely …