
Another alternative to consider is https://www.getdaft.io/ . AFAIU it is a more direct competitor to Spark (distributed mode).


Miles Cole here: I’d love to see Daft on Ray become more widely used. You get the same DataFrame API and can run it in either single-machine or multi-machine mode. The only thing I don’t love about it today is that their marketing is a bit misleading: Daft is distributed via Ray; Daft itself is not distributed.


Hey, I'm one of the developers of Daft :)

Thanks for the feedback on marketing! Daft is indeed distributed using Ray, but making that work requires Daft itself to be architected very carefully for distributed computing (e.g. using map/reduce paradigms).
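
To make the map/reduce point concrete, here is a rough sketch in plain Python. It is not Daft's or Ray's actual code; `map_partition`, `merge_counts`, and `grouped_count` are hypothetical names made up for illustration. The idea is that each partition is processed independently (map) and the partial results are merged with an associative function (reduce), so the same logic can run on one machine or be handed to a distributed scheduler.

```python
from collections import Counter
from functools import reduce


def map_partition(rows):
    """Map step: count keys within a single partition, independently of the others."""
    return Counter(row["key"] for row in rows)


def merge_counts(left, right):
    """Reduce step: merge two partial counts. Associative, so merge order is irrelevant."""
    return left + right


def grouped_count(partitions, run_map=map):
    """Grouped count over partitioned data.

    `run_map` is the pluggable executor: the builtin `map` runs locally,
    while any function with the same shape (for example a thread pool's
    `map`) could fan the partitions out to other workers.
    """
    partials = run_map(map_partition, partitions)
    return dict(reduce(merge_counts, partials, Counter()))


# Three partitions standing in for chunks of a larger dataset.
partitions = [
    [{"key": "a"}, {"key": "b"}],
    [{"key": "a"}, {"key": "a"}],
    [{"key": "c"}],
]

result = grouped_count(partitions)
print(result)
```

Swapping `run_map` for something like `concurrent.futures.ThreadPoolExecutor().map` leaves `grouped_count` unchanged, which is roughly the division of labor described here: the engine is structured around partitions, and a separate layer decides where each partition runs.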

Ray fulfills an almost Kubernetes-like role for us in terms of orchestration/scheduling (admittedly it does quite a bit more as well, especially in the area of data movement). But yes, the technologies are very complementary!



