
John Mitchell

AI usefulness is a Bathtub Curve

Interestingly, AI has a "bathtub curve" of usefulness.

At the low level, Copilot is good at turning a few words of English into 2-4 lines of code: a super auto-complete. At the high level, Phind / ChatGPT can describe concepts and break them down one level. This works well and is great for learning and iterating at the human level.
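A hypothetical illustration of that sweet spot (the comment prompt and the function below are mine, not real Copilot output): you type the comment, and a few lines like these are what you'd hope to get back.

```python
import os

# return only the .log files in a directory, sorted by modification time
def recent_logs(directory):
    logs = [f for f in os.scandir(directory) if f.name.endswith(".log")]
    return sorted(logs, key=lambda f: f.stat().st_mtime)
```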

In the middle, AI doesn't work as well. Often when AI writes multipart code, each piece will be okay, but the pieces don't connect. I've seen instances where it obviously takes a function from project A, a function from project B, then handwaves a connection between them. If A and B have different assumptions, this doesn't work. For Dev code, this is annoying but relatively easy to validate. For DevOps (resource code), it can be really bad.
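A contrived Python sketch of that failure mode (all names and behavior are hypothetical, not actual AI output): each function is reasonable on its own, but the glue ignores that one returns a mapping while the other expects a list of dicts.

```python
import json

def load_users(path):
    """Project-A style: returns an {id: name} mapping."""
    with open(path) as f:
        return json.load(f)   # e.g. {"1": "ada", "2": "grace"}

def active_names(users):
    """Project-B style: expects a list of {"name": ..., "active": ...} dicts."""
    return [u["name"] for u in users if u["active"]]

# The handwaved connection: iterating a dict yields its *keys* (strings here),
# so u["active"] fails with a TypeError -- the two assumptions never matched.
names = active_names(load_users("users.json"))
```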

I've used the Chain of Thought pattern with some luck. You give the AI high-level requirements and ask it to write a series of steps, without code. Then you can iterate on the steps: adding constraints, reordering them, etc. Lastly you say "okay, give me code for step 3". This sort of works. If the AI forgets, you can give it the steps again as context.
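A minimal sketch of that flow, assuming the OpenAI Python SDK (the model name, prompts, and backup task are placeholders; any chat-style LLM API works the same way):

```python
# Chain of Thought pattern: plan first, iterate on the plan, then ask for code.
from openai import OpenAI

client = OpenAI()

def ask(messages):
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

# 1. High-level requirements, explicitly *no code* yet.
history = [{"role": "user", "content":
    "Plan a script that backs up a Postgres DB to S3. "
    "List numbered steps only -- no code."}]
steps = ask(history)
history.append({"role": "assistant", "content": steps})

# 2. Iterate on the plan: add constraints, reorder steps, etc.
history.append({"role": "user", "content":
    "Add a constraint: credentials come from environment variables. "
    "Reprint the full numbered list."})
steps = ask(history)
history.append({"role": "assistant", "content": steps})

# 3. Only now ask for code, one step at a time.
history.append({"role": "user", "content": "Okay, give me code for step 3."})
print(ask(history))
```

Keeping the full `history` list is what lets you paste the steps back in as context when the model loses the thread.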

Top comments (1)

John Mitchell

restated:

AI has a "bathtub curve" of value. At the low level, it's a super-autocomplete, able to write 1-3 lines of code that work well enough. At the high level, it's great for explaining high-level concepts that are relevant to the task at hand.

In the middle... AI doesn't work very well.

If an AI writes a multi-step plan where the pieces have to fit together, I've found it goes off the rails. Parts 1 and 3 of a 4-part plan are fine. So is part 2. However, they don't fit together! AI has no concept of "these four parts have to be closely connected, building a whole". It just builds from A to B in four steps... taking two different paths and stitching the pieces together poorly.