I am a principal applied scientist at Microsoft and a researcher (PGR) at the University of York. I work on projects addressing natural language understanding/generation and fundamental problems in deep learning, such as reasoning and formal modelling of LLMs.
My primary research interest lies in computation, and specifically in the development of algorithms and meta-algorithms for machine learning. My approach is mainly intuitionistic in nature, in contrast with some other formalisms used in this field: algorithms should carry provable guarantees of complexity and convergence by construction, and the proof must be closely related to a computable (i.e., realistic, decidable, production) scenario. This has the advantage of providing feasible, meaningful statements about complex problems, while circumventing mathematical results that are rarely, if ever, seen in practice. For example, we used category theory to prove that some prompting strategies are objectively better than others, and that they produce more suitable outcomes as defined by the users (p < 0.1).
I'm a strong proponent of training small and efficient models, as opposed to overspecified networks--which I call Jurassic networks--via the development of algorithms with provable optimality guarantees. Here "efficient" means "only as big as needed for the task". This is important because the power required to train these huge models translates directly into tons of carbon emitted into the atmosphere, which is devastating to the environment.
Although I showed that finding a globally optimal solution to this problem is undecidable in its general form, I have also proved that, for several interesting cases, it is possible to find approximation algorithms that give near-optimal solutions in polynomial time--going as far as to apply these results to the well-known BERT language model and reach a new state of the art in model compression. This last contribution was later adapted for quantum circuit optimization in a rather fantastic work by folks at ORNL.
Other research interests of mine relate to preserving endangered languages, as well as applications of LLMs to foster a more inclusive environment for traditionally excluded groups in ML research and application (e.g., neurodiverse individuals such as myself, non-English speakers, etcetera).
Last updated: Mar '24.
I've found it useful to keep a series of "posts" on the work I do, to make it more accessible and to share my passion for mathematics--especially since I don't have any social media (does LinkedIn count?). I'm absolutely terrible at updating this site (record: two years), so bear with me.
Will GPT-4 Run DOOM?: Links to code, resources, TL;DR of the paper, and videos of the model playing the game.
Turing Completeness and Sid Meier's Civilization: A brief note about my paper "Turing Completeness and Sid Meier's Civilization". We talk about how to execute arbitrary algorithms inside Civ, and what that means for this and other 4X games.
Some Computational Aspects of NAS: A post on how hard neural architecture search (NAS) and machine learning can be, from a computational perspective. It also discusses the workarounds and applications of this result, with a particular emphasis on why some NAS approaches do not do better than random search. This is a summary of my poorly-titled, ever-misinterpreted paper "On The Bounds of Function Approximations."
Bort: Algorithms and Applications: A post on the algorithms used to obtain Bort, an optimally compressed version of the BERT language model. This can be viewed as a summary of my papers "Optimal Subarchitecture Extraction for BERT", "An Algorithm for Learning Smaller Representations of Models With Scarce Data", and "An Approximation Algorithm for Optimal Subarchitecture Extraction", albeit less concise than these titles, if you can believe it.
Following Larry Wasserman's essay, I invite comments on the papers below. Feel free to email me.
For a longer list of publications see here. For how to handle my last name's weird spelling rules, see here.
2024
Will GPT-4 Run DOOM?
[pdf]
[BibTex]
[Code]
[Post]
Adrian de Wynter. Preprint.
2023
On Meta-Prompting
[pdf]
[BibTex]
[Code]
Adrian de Wynter, Xun Wang, Qilong Gu, and Si-Qing Chen. Preprint.
A User-Centered Evaluation of Spanish Text Simplification
[pdf]
[BibTex]
[Data]
Adrian de Wynter, Anthony Hevia, and Si-Qing Chen. Preprint.
An Evaluation of LLM Outputs: Discourse and Memorization
[pdf]
[BibTex]
Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, and Si-Qing Chen. The Natural Language Processing Journal.
On the Opportunities and Dangers of LLM-Based Evaluation
Chris Quirk and Adrian de Wynter. Invited talk at the 2023 MLADS Conference.
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
[pdf]
[BibTex]
Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, and Sunayana Sitaram. Accepted at EACL 2024.
"I Wish To Have An Argument!": Argumentative Reasoning in Large Language Models
[pdf]
[BibTex]
[Code]
Adrian de Wynter and Tommy Yuan. Preprint.
The Curse of the Biased Researcher
Adrian de Wynter. Invited talk at the 2023 MLADS Conference.
2022
Turing Completeness and Sid Meier's Civilization
[pdf]
[BibTex]
[The Turing Machine in Action]
Adrian de Wynter. IEEE Transactions on Games.
2020
Optimal Subarchitecture Extraction for BERT
[pdf]
[BibTex]
[Code]
Adrian de Wynter and Daniel J. Perry. Preprint.
An Algorithm for Learning Smaller Representations of Models With Scarce Data
[pdf]
[BibTex]
Adrian de Wynter. Preprint.
2019
On the Bounds of Function Approximations
[pdf]
[BibTex]
Adrian de Wynter. ICANN 2019 (oral presentation).
Some coverage of the work I do, in case my posts remain as confusing as the original papers.
A version of the BERT language model that’s 20 times as fast - A post edited by Larry Hardesty. This one talks about Bort.
State of the Art in Automated Machine Learning - This is an interview I gave, along with other researchers, for InfoQ around AutoML. It's so interesting to see people of such different backgrounds arriving at the same conclusions :)
Alexa Research Paper Shows Genetic Algorithms Offer Best Solution for Neural Network Optimization - This post sums up my work around NAS/ASP/FA very nicely.
Amazon researchers say evolutionary approach improves the selection of AI models - From Venturebeat.
How to Construct the Optimal Neural Architecture for Your Machine Learning Task - A post edited by the awesome Larry Hardesty.
Contact: first-initial-full-last-name-including-tussenvoegsel (at) microsoft.com