Are they even trying to be good at that? Serious question; using an LLM as a logic processor is as wasteful and as well-suited as using the Great Pyramid of Giza as an AirBnB.
I've not tried this, but I suspect the best way is to ask the LLM to write a Coq script for the scenario, instead of trying to get it to solve the logic directly.
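A minimal sketch of what that might look like, written in Lean 4 rather than Coq purely for illustration. The scenario and every name in it (`Rains`, `Wet`, `ground_is_wet`) are made up for this example, not taken from any real LLM output:

```lean
-- Hypothetical toy scenario: "If it rains, the ground is wet. It rains."
-- The LLM's job would be to write this formalization;
-- the proof checker's job is to confirm the conclusion follows.
theorem ground_is_wet (Rains Wet : Prop)
    (h_rule : Rains → Wet)  -- assumed rule: rain implies wet ground
    (h_fact : Rains)        -- assumed fact: it is raining
    : Wet :=
  h_rule h_fact             -- modus ponens: apply the rule to the fact
```

The appeal is that the checker, not the model, decides whether the conclusion follows: a wrong proof simply fails to type-check. Whether the model formalized the scenario faithfully is still something a human has to eyeball.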
Using the Great Pyramid of Giza as an AirBnB, were you allowed to do it, would be an extremely profitable venture. The Taj Mahal too, and yes, I know it's a mausoleum.
1 star: No WiFi, no windows, no hot water
1 star: dusty
1 star: aliens didn't abduct me :(
5 stars: lots of storage room for my luggage
4 stars: service good, but had weird dream about a furry weighing my soul against a feather
1 star: aliens did abduct me :(
2 stars: nice views, but smells of camel