I am a PhD student at ETH Zurich and an Associated Researcher at the ETH AI Center, supervised by Prof. Francesco Corman and Prof. Andreas Krause. My research interests are sequential decision-making and data-driven algorithms. I am particularly passionate about reinforcement learning, multi-agent learning, safe learning, and their applications in transportation systems.
I recently finished a research internship at Google hosted by Eric Malmi and Aliaksei Severyn.
Before starting my PhD, I worked as a Quantitative Researcher at Morgan Stanley and as a Data Science team lead at a startup, Cantab Predictive Intelligence.
During my internship hosted by Eric Malmi and Aliaksei Severyn, I played a significant role in a research project that enhanced language models (LMs) with search-based planning techniques to improve multi-step reasoning in board games such as Chess, Chess960, Connect Four, and Hex. My contributions were integral to the project’s development and success, culminating in a publication and the integration of the model into Gemini as a Chess Gem.
My primary contribution focused on external Monte Carlo Tree Search (MCTS), where I addressed the challenge of balancing exploration and exploitation in settings with low simulation counts. I designed and implemented dynamic virtual counts, a novel mechanism that adjusts virtual count weights closer to leaf nodes, overcoming the limitations of fixed virtual counts and virtual losses. This significantly improved search efficiency and enabled the agent to reach Grandmaster-level performance in Chess while searching a number of moves per decision comparable to that of human players.
Additionally, I collaborated on:
This internship experience highlighted my ability to tackle complex technical challenges and contribute to impactful research, demonstrating the potential of combining LMs with search-based methods for broader inference and training applications.
Led a team of five data scientists.
Behavioral Credit Scoring:
ML-Driven Marketing Campaign:
Personalized Newsletter and E-Commerce Recommender Systems:
Delivery Delay Estimation:
Systemic Risk Model Execution Efficiency:
Treasury Department Cash Traceability:
E-Trading Execution Limits Calibration:
Listed derivatives liquidity:
Challenge: White to move and win. Find the winning line.
I would be happy to hear your solution if you can solve it, even with the assistance of an engine! I am also open to discussing why many modern engines fail to solve it.