Adrian de Wynter




I am a principal applied scientist at Microsoft and a researcher (PGR) at the University of York. I work on projects related to natural language understanding and generation, and on fundamental problems in deep learning, such as reasoning and the formal modelling of dialogue in systems like LLMs.

My primary interest is computation, and specifically, the study of reasoning as it relates to humans and machines. My approach is mainly intuitionistic in nature, contrasting with some other formalisms used in this field. In English: algorithms have provable guarantees of complexity and convergence via construction, and this proof must be closely related to a computable (e.g., realistic, decidable, production) scenario. This gives meaningful answers about complex problems, while also circumventing mathematical results that are rarely seen in practice. For example, we recently used category theory to prove that some prompting strategies are objectively better than others, and that they produce outcomes that users prefer.

I'm a strong proponent of training small and efficient models, as opposed to overspecified networks--which I call Jurassic networks. This matters! The power required to train these models translates into tons of carbon emitted into the atmosphere, and it's devastating to the environment. Although I showed that finding a globally optimal solution to this problem is generally undecidable, I have also proved that it is possible to find approximation algorithms that give near-optimal solutions in polynomial time--going as far as applying these results to BERT and reaching a (then) state-of-the-art on model compression. This last contribution was later adapted for quantum circuit optimization in a rather fantastic work by folks at ORNL.

My other research interests include recreational mathematics (especially games), preserving endangered languages, and applications of LLMs to create inclusive environments for groups traditionally excluded in ML (e.g., neurodiverse individuals such as myself, non-English speakers, etcetera).

Last updated: Dec '24.

I've found it useful to have a series of "posts" on the work I do, to make it more accessible and share my passion for mathematics, especially since I don't have any social media (does LinkedIn count?)
I'm absolutely terrible at updating this site (record: 2 years), so bear with me.

Links to code, resources, TL;DR of the paper, and videos of the model playing the game.
A brief note about my paper "Turing Completeness and Sid Meier's Civilization". We talk about how to execute arbitrary algorithms inside Civ, and what that means for this and other 4X games.
A post on how hard neural architecture search (NAS) and machine learning can be, from a computational perspective. It also discusses the workarounds and applications of this result, with a particular emphasis on why some NAS approaches do not do better than random search. This is a summary of my poorly-titled, ever-misinterpreted paper "On The Bounds of Function Approximations."
A post on the algorithms used to obtain Bort, an optimally compressed version of the BERT language model. This can be viewed as a summary of my papers "Optimal Subarchitecture Extraction for BERT", "An Algorithm for Learning Smaller Representations of Models With Scarce Data", and "An Approximation Algorithm for Optimal Subarchitecture Extraction", albeit less concise than these titles, if you can believe it.

Following Larry Wasserman's essay, I invite comments on the papers below. Feel free to email me.
For a longer, complete list of works see here.
For how to handle my last name's weird spelling rules, see here.

If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World
Adrian de Wynter
Preprint
Will GPT-4 Run DOOM?
Adrian de Wynter
IEEE Transactions on Games (2024)
LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models
Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan and Furu Wei
COLM 2024
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Jing Yao, Si-Qing Chen, Michael Wooldridge, Furu Wei
Preprint
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?
Adrian de Wynter, Ishaan Watts, et al.
AAAI 2025
On Meta-Prompting
Adrian de Wynter, Xun Wang, Qilong Gu, and Si-Qing Chen
Preprint (2023)
An Evaluation of LLM Outputs: Discourse and Memorization
Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, and Si-Qing Chen
The Natural Language Processing Journal
"I'd Like to Have an Argument, Please": Argumentative Reasoning in Large Language Models
Adrian de Wynter and Tangming Yuan
COMMA 2024
On the Opportunities and Dangers of LLM-Based Evaluation
Chris Quirk and Adrian de Wynter
Invited talk at the 2023 MLADS Conference
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, and Sunayana Sitaram
EACL 2024
Turing Completeness and Sid Meier's Civilization
Adrian de Wynter
IEEE Transactions on Games
An algorithm for learning representations of models with scarce data
Adrian de Wynter
Information Geometry (2024)

Some media coverage of the work I do, in case my posts remain as confusing as the original papers.

Some of the coverage of the work I did with DOOM and GPT-4. You can also read about it here (Tom's Hardware), here (PC Mag), and here (The Register).
Another post edited by Larry Hardesty. This one talks about Bort.
This is an interview I, along with other researchers, gave for InfoQ around AutoML. It's so interesting to see people of such different backgrounds arriving at the same conclusions :)

Contact: first-initial-full-last-name-including-tussenvoegsel (at) microsoft.com

Factoid: my ORCID (326797241) is a prime number; it is expressible as the sum of two squares (1715^2 + 17996^2); and it is the hypotenuse of an integer right triangle, that is, the square root of the sum of two squares (61726280^2 + 320914791^2). Yay.
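
For the curious, the factoid can be checked in a few lines of Python. This is only a sketch: the names `is_prime`, `a`, `b`, `leg1` and `leg2` are made up for illustration, and the primality check is plain trial division, which is slow in general but fine for a nine-digit number. The two legs are not a coincidence: whenever n = a^2 + b^2, the identity (a^2 + b^2)^2 = (b^2 - a^2)^2 + (2ab)^2 gives n as a hypotenuse.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division up to sqrt(n); illustrative helper, not optimized."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


orcid = 326797241
a, b = 1715, 17996

# Legs from the identity (a^2 + b^2)^2 = (2ab)^2 + (b^2 - a^2)^2.
leg1 = 2 * a * b
leg2 = b * b - a * a

print("sum of two squares:", a * a + b * b == orcid)
print("legs:", (leg1, leg2))
print("hypotenuse:", leg1 ** 2 + leg2 ** 2 == orcid ** 2)
print("prime:", is_prime(orcid))
```

Python integers have arbitrary precision, so squaring the nine-digit legs does not overflow.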