20 Comments

  • bernstein - Wednesday, December 6, 2023 - link

    So what's the point of all those TOPS on an end-user PC? Outside of Apple, there really is no software that does any LLM stuff on device; it's all done in the cloud…
  • meacupla - Thursday, December 7, 2023 - link

    Microsoft wants to push Copilot, and is asking for CPUs/APUs to have something like 8000 Copilot AI points. And those 8000 points (I don't remember the exact number or units) can't quite be achieved with a 12 CU RDNA3 GPU alone.
  • mode_13h - Thursday, December 7, 2023 - link

    In its current form, Ryzen AI isn't faster than the GPU. If you look at the slide entitled AMD Ryzen AI Roadmap, it shows that Hawk Point's Ryzen AI will only provide 16 of the SoC's 39 total TOPS.

    Ryzen AI isn't about adding horsepower, but rather about extending battery life when using AI.
  • nandnandnand - Saturday, December 9, 2023 - link

    It looks like Strix Point XDNA will be faster and more efficient than the iGPU, which is interesting.
  • mode_13h - Monday, December 11, 2023 - link

    Yeah, as long as the AI engine's clocks stay in the sweet spot and the GPU doesn't get a lot more powerful.

    45 TOPS is a big number, though. To put it in perspective, a 75 W Nvidia GTX 1050 Ti could only manage about 8 TOPS, and that was even using its dp4a (8-bit dot-product + accumulate) instruction.

    ...however, AI is bandwidth-hungry and the GTX 1050 Ti had over 100 GB/s of memory bandwidth, so I'm not sure how much it makes sense to scale up compute without increasing bandwidth to match. A little on-chip SRAM only gets you so far...
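
    For anyone wondering where that ~8 TOPS figure comes from, here's a back-of-envelope sketch (assuming the commonly published 1050 Ti specs: 768 CUDA cores, ~1.39 GHz boost, and dp4a retiring 8 int8 ops per core per clock):

    // Rough int8 TOPS estimate for a GPU relying on dp4a
    // (4 multiplies + 4 accumulating adds per instruction, per CUDA core, per clock).
    // The specs below are the usual published GTX 1050 Ti numbers, treated as assumptions.
    const cudaCores = 768;
    const boostClockGhz = 1.39;
    const opsPerCorePerClock = 8; // dp4a: 4 MUL + 4 ADD
    const tops = cudaCores * boostClockGhz * opsPerCorePerClock / 1000;
    console.log(`~${tops.toFixed(1)} int8 TOPS`); // ~8.5 TOPS, in line with the ~8 TOPS quoted above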
  • nandnandnand - Monday, December 11, 2023 - link

    Hawk Point will be an interesting test case. The silicon should not have changed, so they can only get 40-60% more performance by increasing the clock speeds. I think it was no more than 1.25 GHz before, so they may have ramped it up to exactly 2 GHz (60% higher). Source: https://chipsandcheese.com/2023/09/16/hot-chips-20...
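
    A quick sketch of that clock-scaling arithmetic (assuming Phoenix's NPU was the widely quoted 10 TOPS, a figure not stated in this thread):

    // If the silicon is unchanged, NPU throughput should scale roughly linearly with clock.
    // Phoenix's 10 TOPS is an assumption from public specs; the ~1.25 GHz clock is the
    // upper bound mentioned above, and 16 TOPS is Hawk Point's share of the 39 total TOPS.
    const phoenixTops = 10;
    const phoenixClockGhz = 1.25;
    const hawkPointTops = 16;
    const scale = hawkPointTops / phoenixTops;       // 1.6x
    const impliedClockGhz = phoenixClockGhz * scale; // 2.0 GHz
    console.log(`${((scale - 1) * 100).toFixed(0)}% uplift -> ~${impliedClockGhz.toFixed(2)} GHz implied NPU clock`);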

    XDNA 2.0 means business and must use improved or larger silicon. Also, if Strix Halo rumors are accurate, Strix Point and Halo will have the same 45-50 TOPS performance, but Halo gets a doubled memory bus width.
  • mode_13h - Tuesday, December 12, 2023 - link

    I hadn't heard about that memory bus. So, I guess AMD is joining the on-package LPDDR5X crowd?
  • nandnandnand - Monday, December 25, 2023 - link

    Very likely, but I'd expect to see it used with the new CAMM standard too.

    Strix Halo is for laptops first, mini PCs second. We'll see how upgradeable the memory is.
  • Threska - Thursday, December 7, 2023 - link

    Well, AI is far more than LLMs. It's currently being done with compute shaders.
  • mode_13h - Thursday, December 7, 2023 - link

    > outside of apple there really is no software that does any LLM stuff on device

    LLMs only gained such public attention about a year ago, whereas Ryzen AI launched in Phoenix back in Q2. It obviously wasn't designed for LLMs, especially given that it's essentially IP Xilinx already had from before its acquisition by AMD.

    What AMD said about Ryzen AI at the time of launch is that it was designed for realtime AI workloads, like audio & video processing for video conferencing. Intel actually had a more extensive set of example uses, in their presentations pitching the new VPU in Meteor Lake.
  • name99 - Thursday, December 7, 2023 - link

    And everyone doing it in the cloud would prefer to move as much inference as practical to the device so that they don't have to pay for it...

    Same reason streamers would prefer you to have H.265 or AV1 decode available even though H.264 works fine: it reduces their costs.

    Everything takes time! But you get to ubiquitous inference HW in five years by starting TODAY, not five years from now.
  • mode_13h - Thursday, December 7, 2023 - link

    > everyone doing it in the cloud would prefer to move as much inference as practical to the device so that they don't have to pay for it...

    If you're OpenAI, these models are your crown jewels. Keeping them locked up in the cloud is a way they can keep them secure. You charge a service fee for people to use them, and the cloud costs are passed on to the customer.

    As a practical matter, these models are friggin' huge. They'd take forever to download, would chew up the storage on client-side devices, and you wouldn't be able to hold very many of them.
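
    To put rough numbers on "huge", here's a sketch assuming a GPT-3-class model of 175B parameters stored as fp16 and a 100 Mbps link (none of these figures come from the thread):

    // Back-of-envelope size and download time for a GPT-3-class model.
    // Parameter count, fp16 weights, and link speed are all assumptions.
    const params = 175e9;          // ~175 billion parameters
    const bytesPerParam = 2;       // fp16
    const sizeGb = params * bytesPerParam / 1e9;   // ~350 GB of weights alone
    const linkMbps = 100;          // assumed home broadband
    const downloadHours = (sizeGb * 8e9) / (linkMbps * 1e6) / 3600;
    console.log(`~${sizeGb} GB, roughly ${downloadHours.toFixed(0)} hours at ${linkMbps} Mbps`);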
  • nandnandnand - Saturday, December 9, 2023 - link

    Smaller LLMs, Stable Diffusion, etc. can definitely fit on the SSDs in consumer devices and use relatively normal amounts of RAM. Only a minority of users will actively seek to do so, while others will probably end up using some of these basic LLMs by default on even their smartphones.
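
    For scale, a sketch of a typical local-LLM footprint (the 7B parameter count and ~4.5 bits per weight effective quantization are illustrative assumptions):

    // Rough on-disk / in-RAM footprint of a small local LLM at 4-bit quantization.
    const params = 7e9;
    const bitsPerWeight = 4.5;     // 4-bit weights plus scale/zero-point overhead
    const sizeGb = params * bitsPerWeight / 8 / 1e9;
    console.log(`~${sizeGb.toFixed(1)} GB`); // ~3.9 GB: fits easily on a consumer SSD and in modest RAM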

    Hopefully we see a doubling of storage and RAM in the near term, with 8 TB SSDs becoming more common and 32Gb memory chips enabling 64 GB UDIMMs, larger LPDDR packages, etc.
  • flyingpants265 - Sunday, December 10, 2023 - link

    Sure they would.
  • hyno111 - Friday, December 8, 2023 - link

    Actually, there is a very active community around running LLMs locally. Meta released the source of their LLaMA models in February, and a lot of follow-ups have appeared using that proven architecture.
    The most active personal use cases are storytelling/roleplaying and assistants, and businesses are interested in using their own databases to augment LLM capabilities.
  • mode_13h - Thursday, December 7, 2023 - link

    Clicked the Ryzen AI SDK link in the article. System Requirements: Win 11

    No thank you. Yes, I'm aware of the GitHub ticket requesting Linux support, which they recently reopened. I'm not buying a Ryzen AI-equipped laptop anytime soon, so I can afford to wait.
  • lmcd - Thursday, December 7, 2023 - link

    Honestly hilarious given that AMD's compute platform is barely supported on Windows. When will AMD support its entire feature set on a single OS platform?
  • PeachNCream - Friday, December 8, 2023 - link

    You, as well as anyone else who relies heavily on Linux, know that support for new hardware tends to lag on our preferred OS platform. And when we finally do get support, it's unoptimized and buggy, and vendor interest is apathetic at best. Get ready for a few years' wait before Ryzen AI reaches parity.
  • mode_13h - Saturday, December 9, 2023 - link

    ROCm started out on Linux. So, that didn't lag Windows, but it still had a host of other issues too numerous to get into here.

    Intel has done a great job of supporting compute on Linux, using a fully open-source stack. Again, I can't say anything definitive about Windows, but I get the impression their Linux GPU compute support was above and beyond what they were doing on Windows.

    As for Nvidia, CUDA always treated Linux as first-class, as far as I'm aware.

    Ryzen AI is a little bit special-case, for AMD. It's only something they're putting in their APUs, which sets it apart from their GPU-compute efforts. Most of their APU customers are actually running Windows, so I think it makes sense for them to prioritize Windows for Ryzen AI. However, if they want to use it for tapping into the embedded market, later on, they really shouldn't disregard supporting it on Linux.
  • nandnandnand - Saturday, December 9, 2023 - link

    Technically, XDNA should make it into Phoenix desktop APUs in January, but as for the actual desktop CPUs, it's anybody's guess. No plans have been leaked.
