xenn0010's comments

Found this while fighting with ROCm last week. A team called Aule Technologies built a FlashAttention implementation for AMD hardware that actually runs. Anyone who's tried attention kernels on MI200/MI300 knows the ecosystem is a graveyard of abandoned CUDA ports; this one works out of the box.

Repo: https://github.com/AuleTechnologies/Aule-Attention (53 stars, still pretty new). Looked at the code: it's clean HIP, not some janky translation layer. Benchmarks look legit, though I haven't verified them independently. Not affiliated, just surprised something in the ROCm attention space isn't broken. Figured others here running AMD might find it useful.
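If you want to sanity-check correctness yourself before trusting the benchmarks, a quick parity test against PyTorch's reference kernel is cheap to write. To be clear, the aule_attention import and flash_attention() call below are my guess at what the repo might expose, not its confirmed API; F.scaled_dot_product_attention is the real PyTorch reference, and on ROCm builds of torch the "cuda" device maps to HIP.

    # Hypothetical parity check. aule_attention / flash_attention() are
    # assumptions (left commented out); only the PyTorch reference runs as-is.
    import torch
    import torch.nn.functional as F
    # from aule_attention import flash_attention  # assumed entry point, not verified

    device = "cuda" if torch.cuda.is_available() else "cpu"  # "cuda" == HIP on ROCm torch
    dtype = torch.float16 if device == "cuda" else torch.float32

    B, H, S, D = 2, 8, 1024, 64  # batch, heads, sequence length, head dim
    q = torch.randn(B, H, S, D, device=device, dtype=dtype)
    k = torch.randn_like(q)
    v = torch.randn_like(q)

    ref = F.scaled_dot_product_attention(q, k, v)  # reference output
    # out = flash_attention(q, k, v)                # kernel under test (hypothetical API)
    # print((out - ref).abs().max())                # expect small fp16 error, roughly < 1e-2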

Says it also works through Vulkan, covering Intel as well as AMD consumer cards. Sounds pretty great if it delivers on that.

