
I thought about this some more and did some research, and found an indexing approach that uses HNSW, serialized to Parquet and queried from the browser:

https://github.com/jasonjmcghee/portable-hnsw

This opens up efficient query patterns for larger datasets in RAG projects where you may not have the resources to run an expensive vector database.
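
To make the idea concrete, here is a rough sketch of what querying a graph index stored as flat rows could look like. This is not the code or file layout from the linked repo: the row layout (`id`, `vector`, `neighbors`), the `ROWS` data, and the `fetch_row` helper are all made up for illustration, and the search is a plain single-layer greedy walk rather than full hierarchical HNSW.

```python
import math

# Hypothetical layout: one row per node, as it might be stored in a columnar file.
# Columns: node id, embedding vector, ids of the node's graph neighbors.
ROWS = [
    {"id": 0, "vector": [0.0, 0.0], "neighbors": [1, 2]},
    {"id": 1, "vector": [1.0, 0.0], "neighbors": [0, 3]},
    {"id": 2, "vector": [0.0, 1.0], "neighbors": [0, 3]},
    {"id": 3, "vector": [1.0, 1.0], "neighbors": [1, 2, 4]},
    {"id": 4, "vector": [2.0, 2.0], "neighbors": [3]},
]


def fetch_row(node_id):
    """Stand-in for reading a single row from a static file (assumed helper)."""
    return ROWS[node_id]


def greedy_search(query, entry_id=0):
    """Single-layer greedy walk: keep moving to the closest neighbor.

    Returns the id of the node where the walk stops and the path taken.
    Only the rows along the path (and their neighbors) are ever fetched.
    """
    current = fetch_row(entry_id)
    best = math.dist(query, current["vector"])
    path = [current["id"]]
    while True:
        candidates = [fetch_row(n) for n in current["neighbors"]]
        closest = min(candidates, key=lambda r: math.dist(query, r["vector"]))
        d = math.dist(query, closest["vector"])
        if d >= best:
            # No neighbor is closer than the current node: local minimum.
            return current["id"], path
        current, best = closest, d
        path.append(current["id"])


if __name__ == "__main__":
    node_id, path = greedy_search([1.9, 2.1])
    print(node_id, path)
```

The point of the sketch is that a query only touches a handful of rows, which is why a static file plus selective reads can stand in for a running database, at least in principle.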



Hey, that's my little research project. lmk if you're interested in chatting about this stuff.

As others have mentioned in other threads, Parquet isn't a great tool for the job here, but you could theoretically build a different file format that is better suited to the problem of representing a vector database as static file(s).



