Can ChatGPT and Other AI Models Replace Human Bidders in Auctions?
We put it to the test in our latest research project
What if large language models could take the place of human participants in economic experiments? Could they save researchers time, money, and logistical headaches—without sacrificing data quality?
In our latest paper, “On the Usefulness of Using Current LLMs for Experimental Auction Valuation,” we explored exactly that question.
Together with co-authors Jay Corrigan and Carola Grebitus, we tested whether today’s top AI models—ChatGPT, Claude, and Gemini—could generate human-like behavior in experimental auctions, a cornerstone method in applied economics for assessing consumer demand.
The question: Can LLMs bid like humans in these auctions?
The answer: It depends.
🧪 What We Did
We ran LLMs through a battery of tests using two data sets from actual auction experiments with humans—one involving labeled coffee in the U.S., another involving apples with varying transport distances in Germany. We tested three conditions:
No reference data – We gave the LLM only the experimental instructions and each participant’s demographic details.
10% reference bids – In addition, we gave the LLM 10% of the real human bids for the products.
20% reference bids – Instead of 10% of the bids, we gave the LLM 20% (a minimal sketch of this prompting setup appears after this list).
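For the curious, here is a rough sketch of how a prompt for one of these conditions might be assembled. The function name, the participant fields, and the random sampling of reference bids are my illustration, not the paper’s exact implementation:

```python
# Minimal sketch of folding reference bids into an LLM prompt.
# All names and the sampling scheme are illustrative, not the paper's code.
import random

def build_prompt(instructions, participant, reference_bids=None, share=0.0):
    """Assemble an auction prompt for one simulated participant."""
    parts = [
        instructions,
        f"You are a {participant['age']}-year-old {participant['gender']} "
        f"participant with household income {participant['income']}.",
    ]
    if reference_bids and share > 0:
        # Conditions 2 and 3: reveal 10% or 20% of the real human bids.
        k = max(1, int(share * len(reference_bids)))
        sample = random.sample(reference_bids, k)
        parts.append("For context, other participants bid: "
                     + ", ".join(f"${b:.2f}" for b in sample))
    parts.append("State your maximum willingness to pay, in dollars, "
                 "as a single number.")
    return "\n".join(parts)
```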
We even asked whether LLMs given reference bids for one product (apples) could bid accurately on a different product (wine) using only that earlier data.
🔍 What We Found
Without any real human bids, AI struggled. All three models produced bids significantly different from actual human responses—often dramatically so.
But with just 10–20% of reference bids, the story changed. ChatGPT, Claude, and Gemini started producing much closer approximations of human bidding behavior.
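As a rough illustration of what “significantly different” means here, one could compare the two bid distributions with a nonparametric test. The choice of the Mann-Whitney U test and all the numbers below are my assumptions for this sketch, not the paper’s exact analysis:

```python
# Illustrative distribution comparison; the bids are placeholder numbers,
# and the choice of test is an assumption, not the paper's method.
from scipy.stats import mannwhitneyu

human_bids = [3.50, 4.25, 2.80, 5.00, 3.10]  # placeholder data
llm_bids = [3.75, 4.00, 3.20, 4.80, 3.40]    # placeholder data

stat, p = mannwhitneyu(human_bids, llm_bids, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.3f}")
# A small p-value would indicate the LLM's bids are distributed
# differently from the human bids.
```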
For a visual comparison of the bid distributions, see the charts in our paper.
Results varied across models and contexts. Claude often overbid, Gemini was inconsistent, and ChatGPT came closest to matching humans overall.
Cross-product transfer was tricky. When we asked LLMs to bid on wine based on bids participants placed for apples, the absolute bid levels were off—but interestingly, the percent premiums across product labels were often very close to what humans would have generated.
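To make that last point concrete, here is a toy percent-premium calculation. Every number below is hypothetical, chosen only to show how bid levels can differ while relative premiums match:

```python
# Toy percent-premium calculation; every number here is hypothetical.
def percent_premium(labeled_bid, baseline_bid):
    """Premium for a labeled product over the baseline, in percent."""
    return 100 * (labeled_bid - baseline_bid) / baseline_bid

# Humans: $5.40 for the labeled product vs. $4.50 baseline -> 20% premium.
# LLM: $3.60 vs. $3.00 -> also a 20% premium, despite lower bid levels.
print(percent_premium(5.40, 4.50), percent_premium(3.60, 3.00))
```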
What Does This Mean?
Can LLMs replace human subjects in experimental economics? Not yet. But they’re getting surprisingly close—especially when provided with a little human context.
That opens up possibilities:
Cost savings: Auction experiments often cost tens of thousands of dollars to run. AI could help scale up samples or test initial designs more affordably.
Ethics and access: AI can stand in for humans in studies where running a real auction might be unethical or impractical.
Instruction testing & teaching: Want to test your experimental design before fielding it? Or generate data for a classroom project? LLMs might help.
Of course, the tech is still evolving—and caution is warranted. But our findings suggest a future where LLMs serve as powerful complements to human participants in economic research.
You can read the full paper here.
This was a fun project, and thanks to my amazing colleagues Jay and Carola for being such a great team to work with!
Fittingly, this post was written with some assistance from ChatGPT! :)