All Works
2025
Does using LLMs in daily life help or hinder learning a second language?
Proceedings of the Annual Meeting of the Cognitive Science Society (2025)
A Multilingual, Culture-First Approach to Addressing Misgendering in LLM Applications
EMNLP 2025 Main
If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World
ACL 2025 Main (ann. 2024)
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
ACL 2025 Main (ann. 2024)
CELA Open Data Award
For the MCFM corpus, a dataset designed to mitigate misgendering by LLMs across 42 languages and dialects.
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?
(Ann. 2024); AAAI 2025
2024
LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models
COLM 2024
"I'd Like to Have an Argument, Please": Argumentative Reasoning in Large Language Models
(Ann. 2023); COMMA 2024
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
(Ann. 2023); EACL 2024
An Algorithm for Learning Smaller Representations of Models With Scarce Data
(Ann. 2020); Information Geometry (2024)
2023
An Evaluation of LLM Outputs: Discourse and Memorization
The Natural Language Processing Journal
On the Opportunities and Dangers of LLM-Based Evaluation
Invited talk at the 2023 MLADS Conference
The Curse of the Biased Researcher: Common Pitfalls in LLM-based Evaluation
Invited talk at the 2023 MLADS Conference
CELA Open Data Award
For the CLANDESTINO corpus, a dataset for localized Spanish toxic-language detection.
Older
Turing Completeness and Sid Meier's Civilization
IEEE Transactions on Games (2022)
Bort: Algorithms and Applications
Invited talk at the 2021 Alexa Prize Summit
Mischief: A Simple Black-Box Attack Against Transformer Architectures
Preprint (2020)
Harder Performance Measures for Language Models
Invited talk at the 2020 Alexa Prize Summit