I think "correct the errors in this ChatGPT essay" is a short-term viable homework exercise, but those errors might be gone in GPT-5 so I don't think it's long-term viable. Soon the LLM will just produce perfect essays at college level and there won't be hallucinations for the student to correct.
However, the "simulate the historical environment" task is great and I think it has long-term potential. I think it can be taken further; rather than "spot the errors that ChatGPT made", you could flip the script and make it "survive 20 turns of conversation without making a historical error", so you'd need to know things like local traditions, perhaps the geography of the ancient settlement you're studying, contemporaneous history like "who is the emperor and what's the sentiment towards him" and so on.
I'm also envisioning that, since text-based exercises are extremely easy to game (just pipe your text prompt into ChatGPT), and since ChatGPT is soon going to be strictly superior to a high-school level student, we could get around this by having the homework as an in-person verbal role-play or Q&A session, like a viva voce; essentially you have a verbal discussion with ChatGPT and you need to really know your material as it can dig into any part of the curriculum. Then ChatGPT can summarize each student's interaction, and the teacher doesn't have to sit through each individual one start-to-finish (1:1 exams are too time-consuming to be viable).
This round-trip through verbal interaction would potentially make the task more interesting (lots of people simply hate writing essays), shift the focus away from tasks that will become obsolete (writing essays) toward ones that will be more relevant (human synthesis of ideas, and interpersonal interaction), and help mitigate LLM-assisted cheating by constructing an assignment that LLMs can't trivially solve.
Yes, exactly. This is where I've been heading with my planning for assignments. For instance, when confronting Ea-nāṣir about his poor quality copper, I'd want my students to actually show some knowledge of the geography and political dynamics of ancient Mesopotamia.
The "Fall of the Ming Dynasty" simulator I link to at the bottom of the post is probably the most well-developed example of this that I've come up with so far. In that one, I added a "political intrigue minigame" in which ChatGPT is supposed to assess the human player's ability to deploy rhetoric appropriate for a minor courtier in 1640s China (from the prompt: "success depends on your luck score + rhetorical skill, tested via a series of open-ended prompts that HistoryLens will assess and grade; only the highest scoring responses will allow you to succeed in the minigame.")
Here is the full prompt for that one if people want to try it: https://chat.openai.com/share/86815f4e-674c-4410-893c-4ae3f1...
I was thinking of “king hearing petitions” as another potentially interesting scenario; it could go either into minutiae that require cultural knowledge, or strategic stuff like the game Crusader Kings where you need to understand the geopolitical allegiances of the time, the geography, and the national economy.
More generally, I have been wondering if games like “start a company in a simulated sandbox world” could actually teach transferable Econ/Business/startup skills. There is a lot of territory to explore here.
How much was the average ancient Mesopotamian aware of those things?
Why leave hallucinations to chance? ;) The prompt could tell ChatGPT to randomly insert several authoritative sounding but verifiably false facts, to give the students debunking challenges! That solves the problem of GPT-5 being too smart to hallucinate, while still leaving open the possibility of talking rats.
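A rough sketch of what such a prompt could look like when generated from a script. The function name `debunking_prompt`, its parameters, and the wording of the instructions are all made up for illustration; nothing here comes from an actual assignment, and the phrasing would likely need tuning against a real model.

```python
def debunking_prompt(topic: str, n_false: int = 3) -> str:
    """Build a prompt asking the model to plant a fixed number of false facts.

    Hypothetical helper: the wording is an assumption, not a tested prompt.
    """
    if n_false < 1:
        raise ValueError("n_false must be at least 1")
    return (
        f"Write a short college-level essay about {topic}.\n"
        f"Deliberately include exactly {n_false} statements that sound "
        "authoritative but are verifiably false.\n"
        "Do not mark or hint at which statements are false; "
        "students will be asked to find and debunk them."
    )


if __name__ == "__main__":
    print(debunking_prompt("trade in ancient Mesopotamia", n_false=4))
```

Fixing the count in the prompt gives students a known number of errors to hunt for, instead of relying on whatever the model happens to get wrong.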
What you're envisioning reminds me of Timothy Leary's Mind Mirror, published by Electronic Arts in 1985 for the Apple ][ and other home computers:
https://scalar.usc.edu/works/timothy-leary-software/index
https://www.rockpapershotgun.com/diy-transcendence-with-timo...
>Players answer questions that, when churned by Mind Mirror’s cryptic algorithms, can allegedly help them reveal intriguing new aspects of their psyche. Gameplay predominantly revolves around defining, comparing and then role-playing through different personalities in various text-based life simulations.
https://www.myabandonware.com/game/timothy-leary-s-mind-mirr...
https://store.steampowered.com/app/1603300/Timothy_Learys_Mi...
I extracted all the text from the Apple ][ Mind Mirror floppy disk image:
https://donhopkins.com/home/mind-mirror.txt
Hello, I'm Timothy Leary.
Welcome to MIND MIRROR.
MIND MIRROR (c) copyright 1985, 1986, Futique, Inc.
Published by Electronic Arts
MIND MIRROR
Design and script by Timothy Leary.
MIND MIRROR
Program and Design by Peter Van den Beemt and Bob Dietz.
MIND MIRROR reflects and qualifies your thoughts.
OPTION 1
MIND TOOLS
Enhance Insight, Mental Fitness, Learning Skills and Performance.
OPTION 2
MIND PLAY
SIGNIFICANT PURSUITS.
Sophisticated Head Games.
MODE 1
MIND MIRROR
Learn how to Micro-Scope and Map your thoughts.
MODE 2
LIFE SIMULATION
Test your empathy in amusing Role-Play Odysseys.
SELECT LEVEL
Beginner
Intermediate
Master
Consultant
Choose AUTO-PLAY
or INTER-PLAY.
Mirror your own thoughts.
Compare them with others.
RETURN begins game.
SPACE BAR clears text.
[...]
"Mirrors should reflect a little before throwing back images." -Jean Cocteau
Also, here are the scales represented as JSON: https://donhopkins.com/home/mind-mirror.json
Just for laughs, here's ChatGPT's summary of that file, and its answers to questions about Timothy Leary -- I sure hope it's not hallucinating:
https://chat.openai.com/share/044c41a3-fbc5-49cd-a3d1-c42f07...
What's interesting is that game was based on Timothy Leary’s PhD dissertation “The Social Dimensions of Personality: Group Process and Structure”, which he ultimately used to break out of jail.
https://archive.org/details/leary/leary.300dpi/mode/2up
Before he got into LSD, he designed the Leary Interpersonal Behavior Circle personality assessment, which laid the foundations for understanding human personality and interpersonal behaviors.
https://en.wikipedia.org/wiki/Interpersonal_circumplex
http://paei.wikidot.com/leary-timothy-interpersonal-circle-m...
In early 1970, Leary was sentenced to prison for possession of marijuana. As part of the intake process, he was given a psychological assessment designed to gauge the risk of escape or violent behaviors in inmates. This test was known as the "Group Psychological Assessment Test." Leary was familiar with the test, having designed it or at least aspects of it. Understanding the criteria being measured, Leary answered in such a way that he was categorized as someone who posed a very low risk of escape or violence.
As a result, he was assigned to a minimum-security prison. With the lower level of security and his connections, Leary managed to escape prison in September 1970. His escape involved various affiliations, including with the Weather Underground, a radical left-wing organization. After his escape, Leary fled the country and spent time in various locations, including Algeria and Switzerland, before eventually being recaptured in 1973.
Basically, reiterate the original instructions each time, describe the last 2 moves in detail, and provide a brief summary of all the previous moves. You can have much longer games this way - maybe this deserves to be a Python script.
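Here is a minimal sketch of what that script might look like. Everything in it is an assumption made for illustration: the names `summarize` and `build_turn_prompt`, the section labels, and especially the summary step, which just truncates each older move. In practice you would probably ask the model itself to write the running summary.

```python
from typing import List


def summarize(move: str, limit: int = 60) -> str:
    """Crude placeholder summary: collapse whitespace and truncate.

    Assumption: a real script would likely have the LLM summarize instead.
    """
    text = " ".join(move.split())
    if len(text) <= limit:
        return text
    return text[: limit - 3].rstrip() + "..."


def build_turn_prompt(instructions: str, moves: List[str],
                      detail_window: int = 2) -> str:
    """Assemble the prompt for the next turn of a long-running game.

    The original instructions are repeated every turn, the last
    `detail_window` moves are included in full, and all earlier
    moves are reduced to one-line summaries.
    """
    split = max(len(moves) - detail_window, 0)
    older, recent = moves[:split], moves[split:]

    parts = ["INSTRUCTIONS:", instructions, ""]
    if older:
        parts.append("SUMMARY OF EARLIER MOVES:")
        parts += [f"{i}. {summarize(m)}" for i, m in enumerate(older, start=1)]
        parts.append("")
    if recent:
        parts.append("MOST RECENT MOVES (in full):")
        parts += [f"{i}. {m}" for i, m in enumerate(recent, start=split + 1)]
        parts.append("")
    parts.append("Continue the game from here.")
    return "\n".join(parts)


if __name__ == "__main__":
    history = [
        "Arrived at the capital and sought lodging near the palace gates.",
        "Presented a petition to a junior official.",
        "Was invited to an audience with a senior minister.",
    ]
    print(build_turn_prompt("You are a historical simulator.", history))
```

Each call produces a fresh, self-contained prompt, so the conversation sent to the model stays roughly the same size no matter how many turns have been played.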
I'm sure there are ways around this if you use the API and connect it to a MySQL database to allow users to "save" their spot... I'm not technical so my understanding of what's involved is hazy, but curious if people have ideas of how to do this simply. But for my current use case, I'm working with dozens/hundreds of college students so I need to make sure the whole thing is free. I've applied for a grant that could fund use of the API though, fingers crossed.
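For what it's worth, the "save your spot" part can be quite small. The sketch below swaps MySQL for SQLite, which ships with Python and needs no server; the table layout and the function names `open_store`, `save_game`, and `load_game` are invented for illustration. It only stores and retrieves a student's list of moves and does not call any LLM API.

```python
import json
import sqlite3
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the save-game database.

    Assumption: one row per student, with the move history stored as JSON.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saves ("
        "student_id TEXT PRIMARY KEY, moves_json TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_game(conn: sqlite3.Connection, student_id: str,
              moves: List[str]) -> None:
    """Overwrite the saved history for one student."""
    conn.execute(
        "INSERT OR REPLACE INTO saves (student_id, moves_json) VALUES (?, ?)",
        (student_id, json.dumps(moves)),
    )
    conn.commit()


def load_game(conn: sqlite3.Connection, student_id: str) -> List[str]:
    """Return the saved history, or an empty list for a new student."""
    row = conn.execute(
        "SELECT moves_json FROM saves WHERE student_id = ?", (student_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


if __name__ == "__main__":
    conn = open_store()  # pass a file path instead to persist across runs
    save_game(conn, "student-42", ["Arrived in Ur", "Met the copper merchant"])
    print(load_game(conn, "student-42"))
```

The loaded history can then be used to rebuild the prompt for the next session, so a student can pick up where they left off.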
I haven’t used these but saw a post on them:
https://cobusgreyling.medium.com/flowise-for-langchain-b7c40...
In this case, there is some interesting structural psychological stuff, which would be the hard part to get the LLM to stick to rigorously, but the rest of the application could very much be reimplemented with an LLM.
"LLM as a mind mirror" is definitely a use-case that we'll see more of, IMO.