Just a shower thought I had when thinking about claims like “80% of all code will be written by AI”…

  • MangoCats@feddit.it · edit-2 · 2 hours ago

    LLMs are by design probabilistic.

    And that’s a strength as compared with the machines that have attempted 100% determinism since the days of Lady Ada and Charles Babbage. It also makes them a different beast, one which must be handled differently than a rigid machine.

    You’re supposed to be able to give the same model the same prompt twice and get two different answers

    Like creative writers. The Late Show monologue wouldn’t be very good if the writers used the exact same formula every night.
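    As a toy sketch of why the same prompt can yield different answers: each next token is sampled from a probability distribution, and a "temperature" knob controls how flat that distribution is. Everything below (the three-word vocabulary, the probabilities) is made up purely for illustration; no real model works off a lookup table like this.

    ```python
    import random

    def sample_next(token_probs, temperature=1.0, rng=random):
        # Raising probabilities to 1/temperature mimics temperature scaling:
        # high temperature flattens the distribution (more surprising picks),
        # low temperature sharpens it toward the most likely token.
        tokens = list(token_probs)
        weights = [p ** (1.0 / temperature) for p in token_probs.values()]
        return rng.choices(tokens, weights=weights, k=1)[0]

    probs = {"deterministic": 0.1, "probabilistic": 0.7, "random": 0.2}
    print(sample_next(probs))        # usually "probabilistic", but not always
    print(sample_next(probs, 0.01))  # near-greedy: almost always the top token
    ```

    Run it twice with the default temperature and you can get two different outputs from the identical "prompt", which is exactly the behavior being described.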

    You’re inherently leaving much more to chance by using LLMs to generate code

    Not if you give proper (complete and testable) requirements. I’d argue that LLMs are no more “unpredictable” than a pool of randomly selected human programmers.

    This is where the power of diversity / randomness comes into play: with proper (complete and testable) requirements, the randomized agents, be they LLMs or consultants for hire, will iterate until they meet the requirements or give up / run out of time or resources.
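    The iterate-until-the-requirements-pass loop described above can be sketched in a few lines. Everything here is a toy stand-in: `generate_attempt` plays the role of any randomized agent (an LLM call, a contractor's submission), and "complete and testable requirements" reduce to an executable check.

    ```python
    import random

    def meets_requirements(candidate):
        # Stand-in for complete, testable requirements: an executable check.
        return candidate == sorted(candidate)

    def generate_attempt(rng):
        # Toy stand-in for a probabilistic generator: a random permutation.
        xs = [3, 1, 2]
        rng.shuffle(xs)
        return xs

    def iterate_until_pass(max_tries=100, seed=None):
        rng = random.Random(seed)
        for attempt in range(1, max_tries + 1):
            candidate = generate_attempt(rng)
            if meets_requirements(candidate):
                return candidate, attempt
        return None, max_tries  # gave up / ran out of time or resources

    result, tries = iterate_until_pass(seed=0)
    print(result, tries)
    ```

    The point is that the randomness of the individual attempt stops mattering once the loop only terminates on a verified pass or an explicit budget limit.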

    and creating more work for yourself as you have to review LLM code

    That’s a matter of practices - and code review is a good practice. But in a world where LLMs are writing the code, only LLM code reviewers will be capable of keeping up with the flood of code the LLMs produce: https://youtu.be/pzkwn3hu1Cc?t=60

    which is generally lower quality than human-written code.

    That has been changing, rather quickly, over the past year. The size of problems that LLMs can solve as well as humans has been increasing steadily for many months.

    Now, having said all that: I’m paid to produce code, so I do review everything the LLMs make. Nobody is asking me (yet) to deliver anything at super-human speed, so I’m not asking LLMs to drive the overall development process at super-human speed either.

    I’ve been doing this for about 6 months, and the quality of what they produce, and the complexity of things they are able to successfully produce, has been steadily increasing throughout that time. Six months ago, I couldn’t ask an LLM to make more than a simple sub-module of something I was working on and get a reasonable result. Today, for most things I’m tasked with, I can have the LLM develop a set of requirements for the whole problem statement, implement to those requirements, develop meaningful tests (six months ago the LLM-generated tests were garbage; lately they’re on par with, or better than, what my human test department colleagues make), and do self-reviews and refinements to the point where the code meets our standards better than code written by our human programmers.

    One of the most productive prompts you can give an LLM is: “Review these requirements; identify gaps, ambiguities, conflicts or any other problems that may hinder implementation. Report all findings and suggest potential corrections.” You won’t get the same result every time, and repeating that prompt in a fresh context window on the “corrected” requirements often leads to additional refinements, but eventually you do end up with a good set of self-consistent and complete requirements.

    The one thing you still have to do is read those (extensive) requirements yourself and ensure they correctly reflect what you intend, because any “hallucinations” that creep into the requirements will be implemented in the code and the tests and sail right on through to the finished product.
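    That review-and-refine loop can be sketched as below. `ask` and `apply_fix` are hypothetical stand-ins for a model API call and a human-reviewed merge of the suggested corrections; they are stubbed here so the control flow is runnable without any real model.

    ```python
    REVIEW_PROMPT = (
        "Review these requirements; identify gaps, ambiguities, conflicts or "
        "any other problems that may hinder implementation. Report all "
        "findings and suggest potential corrections.\n\n{reqs}"
    )

    def refine_requirements(reqs, ask, apply_fix, max_rounds=5):
        """Repeat the review prompt (fresh context each round) until the
        model reports no remaining findings or the round budget runs out."""
        for round_no in range(max_rounds):
            findings = ask(REVIEW_PROMPT.format(reqs=reqs))
            if not findings:
                return reqs, round_no   # converged: no problems reported
            reqs = apply_fix(reqs, findings)
        return reqs, max_rounds         # out of budget; review manually

    # Toy stand-ins: one finding on the first pass, none afterwards.
    issues = ["req 3 conflicts with req 7"]
    fake_ask = lambda prompt: issues.pop() if issues else ""
    fake_fix = lambda reqs, finding: reqs + " (clarified: " + finding + ")"
    final, rounds = refine_requirements("draft requirements", fake_ask, fake_fix)
    print(final, rounds)
    ```

    Note the loop only bounds how many review rounds run; it cannot detect a hallucinated requirement that the model considers self-consistent, which is why the final read-through stays with a human.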