OpenAI still leads in agentic terminal coding, but by a smaller margin.

Claude can plan the work and then run hundreds of parallel subagents in a single session (and with Opus 4.8, the agents can run for even longer).

That’s one way to turn profitable before the IPO, I guess. Goodbye tokens.

  • Echo Dot@feddit.uk · 2 upvotes · 2 days ago

    Yeah, it’s interesting as long as you can completely disregard all of the negative impacts, but if you disregard all of the negative impacts, I would argue you’re not assessing the technology in a fair manner.

    The Turing test was also designed back in the day when a computer was just a big box in a room. An AI passing the Turing test is just something to throw at the media, it’s not a meaningful experiment. The Apple 2 was able to pass the Turing test.

    • unpossum@sh.itjust.works (OP) · 2 upvotes · 2 days ago

      I’m sorry, but I don’t agree with your first point at all. Things can have negative sides and still be interesting.

      The Turing test, as I interpret it at least, is more of a philosophical than a technical thing, trying to provide a way to evaluate the thinking ability of someone or -thing without being able to look at its innards. I’ve always found it fascinating, but I can understand if people disagree (just don’t drag the Chinese room into it). However, if you don’t think a conversation with Claude is more interesting than a faux psychiatrist session with ELIZA, I don’t know where we could go from there 🤷

      • Echo Dot@feddit.uk · 1 upvote · 1 day ago

        Yeah, it was interesting 4 years ago when it was brand new. But I’m bored of that now and I want to do something useful.

        It’s a product that’s been around for half a decade and its own creators cannot tell me why I should use it. How does that not set off alarm bells in your head?

        • unpossum@sh.itjust.works (OP) · 1 upvote · 14 hours ago

          I feel like it gets more intrinsically interesting the better it gets, even when the initial shock has faded a bit, but tastes vary of course.

          The LLM creators won’t shut up about what we can use it for and why. Some of those use cases actually work fairly well, like coding, so that part doesn’t really trigger any alarms.

          What I don’t see is how they intend to make actual money when open weight models catch up in the next few months, but if we can lose the frontier labs and keep the current abilities available, that’s fine by me (apart from the whole “possible collapse of Western economy” thing, that is).