Pushing ChatGPT's Structured Data Support to Its Limits

>>goranm+(OP)
Both of ChatGPT's is_palindrome functions have terrible performance. The algorithmic efficiency doesn't matter because the cost of iterating through each character in pure Python dwarfs everything. The first function is about 3 times slower than the second one, but only because it spends >98% of its time in the "convert to lowercase and remove non-alphanumeric characters" part (which the second function doesn't bother doing at all). If you remove that step then the first function is 28 times faster than the second in my benchmark. That's because the first function does the reversing and comparison in O(1) Python operations, which is still O(n) C operations but the C operations are orders of magnitude cheaper.

An optimal version would combine the second function's algorithmic improvement with the first function's 'leave it to C' approach:

    def is_palindrome(s):
        half_length = (len(s) + 1) // 2
        return s[:half_length] == s[:-half_length-1:-1]

This is a bit under twice as fast as ChatGPT's first function with the cleaning removed. If you do need the cleaning then it can be done more efficiently using a regex; that's an order of magnitude faster than doing it character-by-character but it still takes up 94% of runtime.

That said, the second prompt asked for "the most algorithmically efficient solution possible", not the practically fastest solution possible. Arguably ChatGPT gave the correct answer, especially since . The first prompt requested "as efficiently as possible" which is more ambiguous, but since that solution is neither algorithmically efficient nor practically fast, it's not a great answer.

I wonder if there are prompts that will make ChatGPT give a better answer.

--------

Benchmark is here: https://gist.github.com/comex/81ff10bf095db2d86a52a148c8b11d...

This is all using CPython. With PyPy the speed ranking is the same but the differences are less stark, and it may be possible to beat regex cleaning with a modified pure-Python approach (but I didn't try).

>>comex+Pz
Yes, I was going for algorithmic complexity instead of real-world speed since algorithmic complexity is better to demonstrate the contrast of prompt engineering.

I just ran some tests to engineer the prompt for CPU utilization: even GPT-4 does the standard Pythonic approach but does recognize "This solution is very efficient because it uses Python's built-in string slicing, which is implemented in C and is therefore very fast."

zlacker