zlacker

This is awesome. I've been speculating along similar lines, and it's great to see this fleshed out.

I think "correct the errors in this ChatGPT essay" is a short-term viable homework exercise, but those errors might be gone in GPT-5 so I don't think it's long-term viable. Soon the LLM will just produce perfect essays at college level and there won't be hallucinations for the student to correct.

However, the "simulate the historical environment" task is great and I think it has long-term potential. I think it can be taken further; rather than "spot the errors that ChatGPT made", you could flip the script and make it "survive 20 turns of conversation without making a historical error", so you'd need to know things like local traditions, perhaps the geography of the ancient settlement you're studying, contemporaneous history like "who is the emperor and what's the sentiment towards him" and so on.

I'm also envisioning that, since text-based exercises are extremely easy to game (just pipe your text prompt into ChatGPT), and since ChatGPT is soon going to be strictly superior to a high-school level student, we could get around this by having the homework as an in-person verbal role-play or Q&A session, like a viva voce; essentially you have a verbal discussion with ChatGPT and you need to really know your material as it can dig into any part of the curriculum. Then ChatGPT can summarize each student's interaction, and the teacher doesn't have to sit through each individual one start-to-finish (1:1 exams are too time-consuming to be viable).

This round-trip through verbal interaction would potentially make the task more interesting (lots of people simply hate writing essays), shift the focus away from tasks that will become obsolete (writing essays) in favor of ones that will be more relevant (human synthesis of ideas, and interpersonal interaction), and help to mitigate the issue of LLM-assisted cheating by constructing an assignment that LLMs can't trivially solve.

replies(3): >>benbre+li >>Obscur+EF >>DonHop+201

>>thepti+(OP)
"I think it can be taken further; rather than "spot the errors that ChatGPT made", you could flip the script and make it "survive 20 turns of conversation without making a historical error", so you'd need to know things like local traditions, perhaps the geography of the ancient settlement you're studying, contemporaneous history like "who is the emperor and what's the sentiment towards him" and so on."

Yes, exactly. This is where I've been heading with my planning for assignments. For instance, when confronting Ea-nāṣir about his poor quality copper, I'd want my students to actually show some knowledge of the geography and political dynamics of ancient Mesopotamia.

The "Fall of the Ming Dynasty" simulator I link to at the bottom of the post is probably the most well-developed example of this that I've come up with so far. In that one, I added a "political intrigue minigame" in which ChatGPT is supposed to assess the human player's ability to deploy rhetoric appropriate for a minor courtier in 1640s China (from the prompt: "success depends on your luck score + rhetorical skill, tested via a series of open-ended prompts that HistoryLens will assess and grade; only the highest scoring responses will allow you to succeed in the minigame.")

Here is the full prompt for that one if people want to try it: https://chat.openai.com/share/86815f4e-674c-4410-893c-4ae3f1...

replies(3): >>thepti+qp >>Tao330+as >>rastap+m32

>>benbre+li
That’s great, courtroom drama sounds like an excellent angle.

I was thinking of “king hearing petitions” as another potentially interesting scenario; it could go either into minutiae that require cultural knowledge, or strategic stuff like the game Crusader Kings where you need to understand the geopolitical allegiances of the time, the geography, and the national economy.

More generally I have been wondering if games like “start a company in a simulated sandbox world” could actually teach transferrable Econ/Business/startup skills. There is a lot of territory to explore here.

>>benbre+li
> knowledge of the geography and political dynamics of ancient Mesopotamia

How much was the average ancient Mesopotamian aware of those things?

replies(1): >>benbre+St

>>Tao330+as
Likely very little. But a merchant capable of writing (or paying a scribe to write) a formal cuneiform complaint about bad copper, then having it delivered, would know quite a bit more. Great question IMO - thinking critically about exactly these kinds of questions is one of the goals of the assignment.

>>thepti+(OP)
This kind of thing would be a great way to integrate ChatGPT into the education system and help with media literacy as well. Find the mistakes and interrogate them to learn about critical thinking and how much more difficult it is to defend against misinformation than to simply disseminate it.

>>thepti+(OP)
Great idea!

Why leave hallucinations to chance? ;) The prompt could tell ChatGPT to randomly insert several authoritative sounding but verifiably false facts, to give the students debunking challenges! That solves the problem of GPT-5 being too smart to hallucinate, while still leaving open the possibility of talking rats.
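A minimal sketch of what such a prompt could look like, in Python. The function names, the wording of the prompt, and the "ANSWER KEY:" marker are all made up for illustration, and there is no guarantee a model will plant exactly the requested number of errors, so the answer key is there for the teacher to check against.

```python
def planted_error_prompt(topic: str, n_errors: int = 3) -> str:
    """Build a prompt asking the model to hide false facts in an essay.

    The wording is an illustrative assumption, not a tested prompt.
    """
    if n_errors < 1:
        raise ValueError("n_errors must be at least 1")
    return (
        f"Write a short essay about {topic}. "
        f"Deliberately include exactly {n_errors} authoritative-sounding "
        "but verifiably false statements, spread through the essay and "
        "not marked in any way. After the essay, on a separate line "
        "starting with 'ANSWER KEY:', list each false statement and its "
        "correction."
    )


def split_answer_key(response: str) -> tuple[str, str]:
    """Separate the essay (for students) from the answer key (for the teacher)."""
    essay, _, key = response.partition("ANSWER KEY:")
    return essay.strip(), key.strip()
```

The student would only see the first element returned by `split_answer_key`; the second stays with whoever grades the debunking.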

What you're envisioning reminds me of Timothy Leary's Mind Mirror, published by Electronic Arts in 1985 for the Apple ][ and other home computers:

>>32578683

https://scalar.usc.edu/works/timothy-leary-software/index

https://www.rockpapershotgun.com/diy-transcendence-with-timo...

>Players answer questions that, when churned by Mind Mirror’s cryptic algorithms, can allegedly help them reveal intriguing new aspects of their psyche. Gameplay predominantly revolves around defining, comparing and then role-playing through different personalities in various text-based life simulations.

https://www.myabandonware.com/game/timothy-leary-s-mind-mirr...

https://store.steampowered.com/app/1603300/Timothy_Learys_Mi...

I extracted all the text from the Apple ][ Mind Mirror floppy disk image:

https://donhopkins.com/home/mind-mirror.txt

  Hello, I'm Timothy Leary.
  Welcome to MIND MIRROR.

  MIND MIRROR (c) copyright 1985, 1986, Futique, Inc.
  Published by Electronic Arts

  MIND MIRROR
    Design and script by Timothy Leary.

  MIND MIRROR
    Program and Design by Peter Van den Beemt and Bob Dietz.

  MIND MIRROR reflects and qualifies your thoughts.

  OPTION 1
    MIND TOOLS
    Enhance Insight, Mental Fitness, Learning Skills and Performance.

  OPTION 2
    MIND PLAY
    SIGNIFICANT PURSUITS.
    Sophisticated Head Games.

  MODE 1 
    MIND MIRROR
    Learn how to Micro-Scope and Map your thoughts.

  MODE 2
    LIFE SIMULATION
    Test your empathy in amusing Role-Play Odysseys.

  SELECT LEVEL
    Beginner
    Intermediate
    Master
    Consultant

  Choose AUTO-PLAY
  or INTER-PLAY.

  Mirror your own thoughts. 
  Compare them with others.

  RETURN begins game.
  SPACE BAR clears text.

  [...]

  "Mirrors should reflect a little before throwing back images." -Jean Cocteau

Also, here are the scales represented as JSON:

https://donhopkins.com/home/mind-mirror.json

Just for laughs, here's ChatGPT's summary of that file, and its answers to questions about Timothy Leary -- I sure hope it's not hallucinating:

https://chat.openai.com/share/044c41a3-fbc5-49cd-a3d1-c42f07...

What's interesting is that game was based on Timothy Leary’s PhD dissertation “The Social Dimensions of Personality: Group Process and Structure”, which he ultimately used to break out of jail.

https://archive.org/details/leary/leary.300dpi/mode/2up

Before he got into LSD, he designed the Leary Interpersonal Behavior Circle personality assessment, which laid the foundations for understanding human personality and interpersonal behaviors.

https://en.wikipedia.org/wiki/Interpersonal_circumplex

http://paei.wikidot.com/leary-timothy-interpersonal-circle-m...

In 1970, Leary was imprisoned for possession of marijuana. As part of the intake process, he was given a psychological assessment designed to gauge the risk of escape or violent behaviors in inmates. This test was known as the "Group Psychological Assessment Test." Leary was familiar with the test – having designed it or at least aspects of it. Understanding the criteria being measured, Leary answered in such a way that he was categorized as someone who posed a very low risk of escape or violence.

As a result, he was assigned to a minimum-security prison. With the lower level of security and his connections, Leary managed to escape prison in September 1970. His escape involved various affiliations, including with the Weather Underground, a radical left-wing organization. After his escape, Leary fled the country and spent time in various locations, including Algeria and Switzerland, before eventually being recaptured in 1973.

replies(2): >>mmasu+W32 >>thepti+687

>>benbre+li
What are your thoughts on this update? https://chat.openai.com/share/6178453a-4a8a-430a-ac13-156b54...

Basically, reiterate the original instructions each time, describe the last 2 moves in detail, and provide a brief summary of all the previous moves. You can have much longer games this way - maybe this deserves to be a Python script.
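As a rough idea of what that script might look like, here is a sketch of the prompt-building step only: original instructions every turn, a summary of older moves, and the last two moves in full. The function names and section labels are invented for illustration, and `naive_summary` is just a placeholder that truncates each move; in practice you would probably ask the model itself to write the summary.

```python
from typing import Callable, List


def naive_summary(moves: List[str], max_chars: int = 60) -> str:
    """Placeholder summarizer: keep the first max_chars of each move."""
    return "\n".join(f"{i}. {m[:max_chars]}" for i, m in enumerate(moves, 1))


def build_turn_prompt(
    instructions: str,
    moves: List[str],
    player_input: str,
    detailed: int = 2,
    summarize: Callable[[List[str]], str] = naive_summary,
) -> str:
    """Assemble the text sent to the model for the next turn."""
    # Everything before the split point is summarized; the rest is kept in full.
    split = max(len(moves) - detailed, 0)
    older, recent = moves[:split], moves[split:]

    parts = ["INSTRUCTIONS:\n" + instructions]
    if older:
        parts.append("SUMMARY OF EARLIER MOVES:\n" + summarize(older))
    if recent:
        parts.append("MOST RECENT MOVES (in full):\n" + "\n".join(recent))
    parts.append("PLAYER:\n" + player_input)
    return "\n\n".join(parts)
```

Because the prompt is rebuilt from scratch each turn, its size grows with the summary rather than with the full transcript, which is what lets the game run longer than the context window would otherwise allow.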

replies(1): >>benbre+hk2

>>DonHop+201
That Cocteau quote is one of my favourites, and I had not heard it in a long time :-) Thanks for posting this.

>>rastap+m32
I like it. Tried something similar early on, but decided it wasn't worth it because the ChatGPT context window is only enough to run for 10-20 turns anyway. But with Claude's 100k token context window, this should work great. Thank you.

I'm sure there are ways around this if you use the API and connect it to a MySQL database to allow users to "save" their spot... I'm not technical so my understanding of what's involved is hazy, but curious if people have ideas of how to do this simply. But for my current use case, I'm working with dozens/hundreds of college students so I need to make sure the whole thing is free. I've applied for a grant that could fund use of the API though, fingers crossed.
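For what it's worth, the "save your spot" part can be quite small. Below is a sketch that stores each student's conversation as JSON in a database table. It uses SQLite from Python's standard library as a stand-in for MySQL (no server to run), and the table name, function names, and the list-of-`{"role", "content"}` message shape are assumptions for illustration. It does not call any LLM API; it only handles saving and loading.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the save-game database. ':memory:' is for testing only."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saves ("
        "user_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
    )
    return conn


def save_game(conn: sqlite3.Connection, user_id: str, messages: list) -> None:
    """Overwrite the saved conversation for this user."""
    conn.execute(
        "INSERT OR REPLACE INTO saves (user_id, messages) VALUES (?, ?)",
        (user_id, json.dumps(messages)),
    )
    conn.commit()


def load_game(conn: sqlite3.Connection, user_id: str) -> list:
    """Return the saved conversation, or an empty list for a new player."""
    row = conn.execute(
        "SELECT messages FROM saves WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

A student coming back later would have `load_game` called with their ID, and the returned messages would be sent along with their next turn; passing a file path to `open_store` instead of `":memory:"` keeps saves across restarts.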

replies(1): >>thepti+rv2

>>benbre+hk2
Perhaps a no-code UI might let you wire up the conversational memory being suggested here.

I haven’t used these but saw a post on them:

https://cobusgreyling.medium.com/flowise-for-langchain-b7c40...

https://cobusgreyling.medium.com/langflow-46563a8af323

>>DonHop+201
Love this. I think there is a lot of fun to be had going back through all the old text-based experiments and seeing if they can be rebooted in an LLM.

In this case, there is some interesting structural psychological stuff, which would be the hard part to get the LLM to stick to rigorously, but the rest of the application could very much be reimplemented with an LLM.

"LLM as a mind mirror" is definitely a use-case that we'll see more of, IMO.