zlacker

[return to "Pushing ChatGPT's Structured Data Support to Its Limits"]
1. comex+Pz[view] [source] 2023-12-27 18:26:59
>>goranm+(OP)
Both of ChatGPT's is_palindrome functions have terrible performance. The algorithmic efficiency doesn't matter because the cost of iterating through each character in pure Python dwarfs everything. The first function is about 3 times slower than the second one, but only because it spends >98% of its time in the "convert to lowercase and remove non-alphanumeric characters" part (which the second function doesn't bother doing at all). If you remove that step then the first function is 28 times faster than the second in my benchmark. That's because the first function does the reversing and comparison in O(1) Python operations, which is still O(n) C operations but the C operations are orders of magnitude cheaper.
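
For concreteness, the two functions are roughly of this shape (paraphrased from the description here, not the article's exact code):

    # Roughly the first function: clean character by character in
    # Python, then reverse and compare in single C-level operations.
    def is_palindrome_v1(s):
        cleaned = "".join(c.lower() for c in s if c.isalnum())
        return cleaned == cleaned[::-1]

    # Roughly the second function: no cleaning, just a pure-Python
    # loop comparing mirrored characters.
    def is_palindrome_v2(s):
        for i in range(len(s) // 2):
            if s[i] != s[len(s) - 1 - i]:
                return False
        return True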

An optimal version would combine the second function's algorithmic improvement with the first function's 'leave it to C' approach:

    def is_palindrome(s):
        # Compare the first half to the reversed second half; both
        # slices are built and compared in C, not in a Python loop.
        half_length = (len(s) + 1) // 2
        return s[:half_length] == s[:-half_length-1:-1]

This is a bit under twice as fast as ChatGPT's first function with the cleaning removed. If you do need the cleaning, it can be done more efficiently using a regex; that's an order of magnitude faster than doing it character by character, but it still takes up 94% of the runtime.
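
A minimal sketch of the regex approach (the pattern and helper name are mine; ASCII-only for simplicity, using just the standard re module):

    import re

    # Lowercase first, then drop anything that isn't a-z or 0-9.
    # Both str.lower and re.sub run in C, so the per-character work
    # stays out of the Python interpreter loop.
    _NON_ALNUM = re.compile(r"[^a-z0-9]")

    def clean(s):
        return _NON_ALNUM.sub("", s.lower())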

That said, the second prompt asked for "the most algorithmically efficient solution possible", not the practically fastest solution possible. Arguably ChatGPT gave the correct answer to that prompt. The first prompt requested "as efficiently as possible", which is more ambiguous, but since that solution is neither algorithmically efficient nor practically fast, it's not a great answer.

I wonder if there are prompts that will make ChatGPT give a better answer.

--------

Benchmark is here: https://gist.github.com/comex/81ff10bf095db2d86a52a148c8b11d...

This is all using CPython. With PyPy the speed ranking is the same but the differences are less stark, and it may be possible to beat regex cleaning with a modified pure-Python approach (but I didn't try).

2. kevinc+SP[view] [source] 2023-12-27 19:56:47
>>comex+Pz
All of these have the same worst-case algorithmic efficiency, O(n). The difference is the best-case efficiency. The "optimized" version in the article is O(1) in the best case; your solution is still O(n) even in the best case, since both slices are built in full before the comparison happens.

The optimal solution will depend on the data. If most strings aren't palindromes, then optimizing the best case is likely the better approach. (Example: you are adding an easter egg that triggers on "random" user input.) If palindromes (or near-palindromes) are common, then your solution will be faster, since its per-character cost (the slope) is lower.
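
A quick sketch of that data-dependence (the function and variable names are mine; exact numbers will vary with hardware and input):

    import random
    import string
    import timeit

    def is_palindrome_slices(s):
        # Always builds both half-slices, so O(n) even on a mismatch.
        half = (len(s) + 1) // 2
        return s[:half] == s[:-half - 1:-1]

    def is_palindrome_early_exit(s):
        # Returns at the first mismatch, so O(1) in the best case.
        i, j = 0, len(s) - 1
        while i < j:
            if s[i] != s[j]:
                return False
            i += 1
            j -= 1
        return True

    random_s = "".join(random.choices(string.ascii_lowercase, k=10_000))
    palindrome_s = random_s + random_s[::-1]

    for label, s in [("random", random_s), ("palindrome", palindrome_s)]:
        for fn in (is_palindrome_slices, is_palindrome_early_exit):
            t = timeit.timeit(lambda: fn(s), number=1_000)
            print(label, fn.__name__, round(t, 4))

On mostly random input the early-exit version wins because it usually rejects at the first character; on palindromic input both scan everything, and the slice version's lower per-character cost takes over.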
