David Andrews is a third-year computer science major studying AI agents and reasoning with Dr. Chao Zhang.

David smiles at the camera.

How long have you been an undergraduate researcher at Georgia Tech?

I’ve been doing undergraduate research at Georgia Tech for about a year now; I joined Professor Zhang’s lab at the beginning of the Fall 2024 semester.

How did you get involved with undergraduate research?

Ever since high school, I’ve been fascinated with AI research. I would always read the latest AI papers when they came out so I could closely follow the field. I even took the research track at my high school, consisting of AP Seminar and AP Research, where I conducted my own research on the latest AI systems of the time, like GPT-3 and GPT-4. More than anything, I was curious to explore more of the field and to contribute back to the community with my own work. When I entered Georgia Tech, I was excited for all the research opportunities and the great people I could work with. In my first year, I worked on some great AI projects with Aran Komatsuzaki, a fifth-year Ph.D. student at the time, who happened to be my linear algebra TA! After that year, I found Professor Zhang, whose work on post-training for LLM agents especially intrigued me. After emailing him and having a meeting, he invited me to join his lab.

What are you working on?

Currently, I’m working on expanding the capabilities of open-source large language models (LLMs) to match the leading closed-source LLMs in reasoning and agentic capability. By agentic capability, we mean the ability of LLMs to effectively use tools, such as a browser, code interpreter, or terminal, to solve complex multi-step problems in software development, web search, and math. We approach this problem through reinforcement learning (RL), a training technique that reinforces model behavior that leads to correctly solved problems, allowing the model to discover and refine its problem-solving skills without supervision. A key feature of our work is that we support multi-turn environments, where the AI uses a series of tool calls and environment interactions to solve a problem, rather than thinking once and then answering. This behavior more closely mimics how humans iteratively interact with their computers and the world, enabling LLMs to perform more complex tasks than previous systems.
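To make the multi-turn idea concrete, here is a toy sketch of one agent episode: the policy repeatedly picks a tool, observes the result, and only receives a reward at the end, which an RL update would then use to reinforce the whole trajectory. All names here (the tool registry, the hand-written policy) are illustrative assumptions, not the lab's actual code.

```python
# A single "tool": a safe arithmetic evaluator standing in for a code interpreter.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def run_episode(policy, task, max_turns=5):
    """Roll out one multi-turn episode; return (trajectory, reward)."""
    trajectory = []
    state = task["question"]
    for _ in range(max_turns):
        tool, arg = policy(state)        # e.g. ("calculator", "2+3") or ("answer", "5")
        trajectory.append((state, (tool, arg)))
        if tool == "answer":
            # Reward arrives only at the end of the episode.
            return trajectory, (1.0 if arg == task["solution"] else 0.0)
        observation = TOOLS[tool](arg)   # tool call returns an observation
        state = f"{state}\n{tool}({arg}) -> {observation}"
    return trajectory, 0.0               # ran out of turns: no reward

def toy_policy(state):
    # Hand-written stand-in for the LLM: call the calculator once,
    # then answer with whatever the tool returned.
    if "->" in state:
        return ("answer", state.rsplit("-> ", 1)[1])
    return ("calculator", "2+3")
```

In real training, `toy_policy` would be the LLM itself, and the returned trajectory and reward would feed a policy-gradient update that makes rewarded tool-use patterns more likely.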

To facilitate this learning, we are developing large-scale training infrastructure. This infrastructure assigns each agent its own personal computer, or sandbox, equipped with a browser and other important tools. Additionally, our RL training code is fully asynchronous, maximizing training throughput and minimizing idle resources, an especially important consideration when agent tasks run far longer than the single-turn generations of earlier LLM training. The problems we train on are diverse, covering mathematics, coding, software development, web search, and puzzles. Our hope is that training an agent with a fixed set of general tools on a diverse set of problems teaches the model to use those tools to solve new problems without specialized training.
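The asynchronous design can be illustrated with a small sketch: episodes of varying length run concurrently, and the trainer consumes each finished trajectory as soon as it arrives rather than waiting for the slowest episode in a batch. The names and structure below are illustrative, not the actual training stack.

```python
import asyncio

async def rollout(task_id, n_turns):
    # Stand-in for a long-running agent episode in its own sandbox.
    await asyncio.sleep(n_turns * 0.01)   # simulate tool calls of varying cost
    return {"task": task_id, "reward": 1.0}

async def train():
    queue = asyncio.Queue()
    lengths = [3, 1, 4, 2]                # episodes of different lengths

    async def worker(task_id, n_turns):
        await queue.put(await rollout(task_id, n_turns))

    # Launch all episodes concurrently; because their lengths differ,
    # completion order differs from launch order.
    for task_id, n_turns in enumerate(lengths):
        asyncio.create_task(worker(task_id, n_turns))

    processed = []
    for _ in range(len(lengths)):
        traj = await queue.get()          # consume results as they arrive
        processed.append(traj["task"])    # (a real trainer would run a policy update here)
    return processed

order = asyncio.run(train())
```

Because the trainer never blocks on the slowest rollout, GPUs stay busy generating and updating rather than sitting idle, which matters most when a single agent task can take minutes of browser or terminal interaction.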


David works on his laptop.


What is your favorite thing about research/researching?

My favorite thing about research is that I have the freedom to deeply explore my own ideas and directions with the support of my professor, who lets me direct my own projects. Research is an outlet for creativity and originality for me, and it gives me deep fulfillment when I’m able to create something of my own. Research involves working on problems that haven’t been solved, or even asked, before, so I feel deep satisfaction working at the forefront of the field and especially contributing back to it. Fundamentally, my joy in research comes from curiosity: being able to pursue things that simply interest me and that I feel are important to the field and to human knowledge.

What are your future plans and how has research influenced them?

I plan to pursue either a Ph.D. in AI or work as a researcher in an industry lab, such as OpenAI. These two options allow me to continue doing the research I enjoy, but now at an even more advanced level. I would primarily focus on the research area of LLMs, agents, and reasoning. My research at Georgia Tech with Professor Zhang has greatly influenced my plans, as I’ve come to realize that AI research is my true passion. With AI becoming increasingly important each day, I’m excited for the new progress that will be made in the next few years, and I hope to contribute to the field in meaningful ways.

What advice do you have for students who want to be undergraduate researchers?

My advice is that finding a good PI to work with is very important. Find someone whose recent work excites you and whose research interests align with yours. Also, make sure to maintain good communication with the PI and other colleagues, through weekly meetings or updates to share progress, blockers, and plans.

Regarding the research itself, keep reading papers, create testable hypotheses, and document everything you do. This approach can really help you later when you are writing the paper and presenting the research. And, some final words of wisdom from my professor: don’t rush the research process; the most important thing is to produce high-quality research, not to meet deadlines.