How is it misleading when the whole point is that Redis can only be single threaded†? That's why Dragonfly (claims to) scale better. If anything, it's the Redis rebuttal that comes across as misleading; the posted announcement is very up front that Dragonfly's value proposition is that you get vertical scaling for free without the additional ops overhead of a Redis cluster, which is very much not free in terms of maintenance and opportunity cost.
†: Redis 6 added threads, but AFAIK this is only for handling connection I/O. Actual database access is still single threaded. The only way I'm aware of to scale Redis is via clustering.
It's misleading because the comparison would be redis cluster vs dragonfly. There's no speed-up if the Redis user isn't fully saturating a single core. The real question is why is it only 25x faster on a 64-vCPU machine? Why isn't it 64x? Does this mean it's 60% slower when the request volume is below the needs of a single-threaded redis?
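For what it's worth, the arithmetic behind that question can be sketched as below. It takes the 25x and 64-vCPU figures from the comment at face value and assumes that "ideal" means one single-threaded Redis worth of throughput per core, which is an assumption and not a measurement.

```python
def per_core_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores

# Figures quoted in the comment above; ideal linear scaling is an assumption.
efficiency = per_core_efficiency(speedup=25, cores=64)
shortfall = 1 - efficiency

print(f"per-core efficiency: {efficiency:.0%}, shortfall vs linear: {shortfall:.0%}")
```

Note that this only describes throughput per core at saturation. It says nothing by itself about how either system behaves when the load is below what a single thread can handle.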
> Dragonfly's value proposition
Dragonfly has zero value proposition other than a ticking-time-bomb of pricing fuckery when they're forced to yield a return on that $21M investment.
It compares a process with a listening port with another process with a listening port. To give another example: nobody compares minio with a bunch of disks to which you can write separately, and probably more efficiently.
Hmm, Redis Labs is setting up a cluster of 40 Redis processes on the same instance. It would be extremely difficult for anyone else to do that with Redis OSS.
"For the last 15 years, Redis has been the primary technology for developers looking to provide a real-time experience to their users. Over this period, the amount of data the average application uses has increased dramatically, as has the available hardware to serve that data. Readily available cloud instances today have at least 10X the CPUs and 100X more memory than their equivalent counterparts had 15 years ago. However, the single-threaded design of Redis has not evolved to meet modern data demands nor to take full advantage of modern hardware."
That's not what they are saying is wrong with Redis. Is Redis really 'antique tech'? Arguably, concurrent processing with a scale-up-only approach is a poor fit for "modern hardware".
So yes, you are correct: Redis from github requires knowledge and (your) code to make n instances work together (whether on the same node or not). But to claim that this is the case for "anyone else [but Redis Labs]" is questionable.
From a certain architectural camp, the pin-to-core, process-in-parallel approach is optimal for [scaling on] "modern hardware". Salvatore can correct me on this, but I don't recall that being a consideration in the early days; it just turned out to be a good choice. Some of the Redis APIs, however, require dataset ensemble participation (any kind of total-order semantics over the partitioned set), which is what is "difficult" to do effectively.
So basically any startup that can do that should theoretically be able to squeeze more performance from their SaaS infrastructure than by running a Dragonfly-type architecture. A bonus, as pointed out by Redis Labs, is that lots of parallel k/v processes can bust out of the max-jumbo-box should you ever need that to happen (for 'reliability', for example).
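As a rough illustration of the "(your) code" needed to make n instances work together, here is a minimal client-side routing sketch. It borrows Redis Cluster's key-to-slot mapping (CRC16 of the key modulo 16384); the instance addresses and the even split of slot ranges are hypothetical, and hash tags (`{...}`) are deliberately ignored.

```python
import binascii

NUM_SLOTS = 16384  # slot count used by Redis Cluster

def key_slot(key: bytes) -> int:
    # CRC16 (XMODEM variant) modulo the slot count.
    # Simplification: hash tags ({...}) are not handled here.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS

def instance_for(key: bytes, instances: list[str]) -> str:
    # Hypothetical layout: contiguous slot ranges split evenly across instances.
    return instances[key_slot(key) * len(instances) // NUM_SLOTS]

# Hypothetical addresses for four processes on one box.
instances = [f"127.0.0.1:{7000 + i}" for i in range(4)]

for k in (b"user:1", b"user:2", b"cart:42"):
    print(k.decode(), "->", instance_for(k, instances))
```

Routing single-key commands like this is the easy part. Anything that spans keys living on different instances needs coordination on top, which is where it gets hard.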
I think using word "misleading" is also "misleading".
Dragonfly hides complexity. Docker hid the complexity of managing cgroups and deploying applications. S3 hid the complexity of writing to separate disks. But you do not call S3 or minio misleading because they store stuff similarly to how a disk stores files. Dragonfly hides the complexity of managing a bunch of processes on the same instance, and the outcome of this is a cheaper production stack. What do you think has higher effective memory capacity on c6gn.16xlarge: a single process using all the memory or 40 processes which you need to provision independently?
It's misleading because, practically speaking, the type of people who are after the performance you advertise are running clusters to begin with. So what you are selling is just a simplified stack that lets you not have to manage one more "system". That's fair, but you could mention that? Or at least acknowledge that if you repeat these tests with Redis cluster the results will be wildly different and you won't have those crazy-looking charts.
For example, it's like me claiming that my new Python web framework is X times faster than Flask because it comes bundled with uWSGI. Yes, technically mine is faster, but it's not a fair comparison.
It is source available. Generally can't use it to create a competing product... but also means you cannot combine it with any of the popular open source licenses.
I think maybe I'm just small-potatoes, but the only limitations or constraints I've ever run into with Redis are:
1. memory utilization
2. deployment/orchestration
3. bugs I created for myself related to using caches
What are the use cases that max out Redis speed/throughput?
Dragonfly is better with
1. Memory utilization (read the announcement, they mention it)
2. Deployment/orchestration - your initial threshold that forces you to scale horizontally just went up by an order of magnitude. In fact, for many use-cases you will never need to scale horizontally with Dragonfly.
3. Dragonfly also provides a better experience when working with the system. Just a week ago one of the community contributors submitted a PR that introduces automatic recognition of the hot keys: https://github.com/dragonflydb/dragonfly/pull/951 (this feature is not ready for production use yet but we will get there).
It also has built-in OpenMetrics support, built-in CPU-profiler support, fully asynchronous I/O that allows answering INFO commands even under load, etc.
It really depends on the type of application, but a common one is getting a large spike in traffic beyond the norm (front page on HN, flash sale, etc.). I do think #1 and #2 that you mentioned are more common constraints, and both are also addressed in Dragonfly (much more efficient memory utilization, and the ability to scale vertically, which negates the need for complex orchestration).
KeyDB implements multiple threading with spin-locks that protect a global shared data structure.
Dragonfly is built upon a shared-nothing architecture where each thread manages its own slice of data, hence no need for classical locks and no contention under high load. It still provides atomicity guarantees but allows multiple transactions to progress independently as long as they do not need exclusive access to the same keys. So basically different approaches to the same promise: scale. Also different trade-offs. The shared-nothing approach has less contention and a more flexible transaction framework but exhibits a slightly higher 50th percentile latency (on the order of 30usec).
KeyDB is a fork of Redis, whereas Dragonfly introduces a brand-new architecture, crafted from the ground up utilizing a shared-nothing, multi-threaded design. It implements both the Redis and memcached APIs.
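To make the shared-nothing idea concrete, here is a toy sketch: each shard thread owns its own dict, and requests reach it only through that shard's inbox, so the data itself needs no locks. This is not Dragonfly's implementation; the `Shard` and `Store` names, the CRC32 routing, and the single-key-only commands are all assumptions made for illustration, and multi-key transactions are left out entirely.

```python
import queue
import threading
import zlib
from typing import Optional

class Shard(threading.Thread):
    """Owns one slice of the keyspace; only this thread touches self.data."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self.data = {}
        self.inbox = queue.Queue()

    def run(self) -> None:
        while True:
            op, key, value, reply = self.inbox.get()
            if op == "set":
                self.data[key] = value
                reply.put("OK")
            else:  # treat anything else as "get"
                reply.put(self.data.get(key))

class Store:
    """Routes each single-key command to the shard that owns the key."""

    def __init__(self, num_shards: int = 4) -> None:
        self.shards = [Shard() for _ in range(num_shards)]
        for shard in self.shards:
            shard.start()

    def _shard_for(self, key: str) -> Shard:
        # Hypothetical routing: CRC32 of the key modulo the shard count.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def execute(self, op: str, key: str, value: Optional[str] = None):
        reply = queue.Queue(maxsize=1)
        self._shard_for(key).inbox.put((op, key, value, reply))
        return reply.get()  # block until the owning shard answers

store = Store()
store.execute("set", "a", "1")
print(store.execute("get", "a"))
```

The message hop between the caller and the owning thread is one plausible source of the small extra median latency mentioned above, though the sketch does not measure anything.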
Setting aside what seems to be an amazing new product, the blog post showing the results also downplays what (to me at least) seems to be a 25% or more increase in latency.
I wouldn't take their latency numbers too seriously since their measurement isn't relative to the throughput. It's not always obvious, but the latency of high-performance systems is tightly coupled with throughput. That latency is the sum of the server-side execution time (for Redis that is a couple of microseconds), the network hop (probably a couple of hundred microseconds), and the congestion on the server (this is the time it takes the server to actually get around to processing your request, since it probably needs to handle other requests first). The congestion is directly tied to the throughput.
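One textbook way to see that coupling is the M/M/1 queue, where the mean time spent in the server is the service time divided by (1 - utilization). The comment does not name a model, so M/M/1 is an assumption here, and the 2 µs service time and 200 µs network hop are illustrative values loosely matching the "couple of microseconds" and "couple of hundred microseconds" above.

```python
def mean_latency_us(qps: float, service_us: float = 2.0, network_us: float = 200.0) -> float:
    """Mean request latency under an M/M/1 model (illustrative numbers only)."""
    capacity_qps = 1_000_000 / service_us  # max throughput of one serving thread
    if qps >= capacity_qps:
        raise ValueError("offered load exceeds capacity; the queue grows without bound")
    utilization = qps / capacity_qps
    time_in_server_us = service_us / (1 - utilization)  # queueing wait + execution
    return network_us + time_in_server_us

for qps in (50_000, 250_000, 450_000, 495_000):
    print(f"{qps:>7} qps -> {mean_latency_us(qps):.1f} us")
```

At low load the network hop dominates and the server looks almost free; close to capacity the congestion term takes over. That is why a latency figure quoted without the throughput it was measured at is hard to interpret.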
The most useful measurement I've seen is to pick a latency target and evaluate how many QPS you can send to the server while still meeting that target. That gives a fairly simple dimension to compare on.
From the web page: Dragonfly is an in-memory data store built for modern application workloads. It is fully compatible with the Redis and Memcached APIs, and requires no code changes to adopt.
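That procedure can be sketched in a few lines. The `(qps, p99_latency_ms)` samples below are made-up placeholders standing in for your own load-test results, not real benchmark data, and using p99 as the latency metric is an assumption.

```python
def max_qps_within_target(samples, target_ms):
    """Return the highest tested QPS whose p99 latency meets the target, or None.

    samples: iterable of (qps, p99_latency_ms) pairs from load tests.
    """
    passing = [qps for qps, p99_ms in samples if p99_ms <= target_ms]
    return max(passing) if passing else None

# Hypothetical load-test results, not real benchmark data.
samples = [(100_000, 0.4), (200_000, 0.6), (400_000, 0.9), (800_000, 2.5)]

print(max_qps_within_target(samples, target_ms=1.0))
```

Running the same sweep against two systems with the same target gives one number per system, which is easier to compare than two separate latency and throughput charts.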
When I saw the title, I thought it was a post about DragonFly BSD. But apparently it is about a closed source in-memory database meant as a competitor to Redis.
Honestly, I don't think giving 21m to DragonflyBSD would be a good idea, not sure how the project could adapt back to a shoestring budget after the money runs out.
I would not be against giving them a few 100Ks so that they could get one or two full time developers however.
(joke (but also food for thought): alternatively, just "safely" invest the 21M. At a 3% interest rate, that's 630k per year, enough for ~2 to 10 developers depending on the geo. An OSS project can do a lot of things with this kind of budget).
I seriously thought a bunch of BSD experts got together and raised funds for making 2023 the "Year of the BSD".
On a different note, I wish new projects respected the history of other projects when choosing a name. I am sure they could have found a better name for this DB.
I made an honest mistake. While I saw DragonflyBSD, I didn't realize its prominence within the BSD community. I assumed that since our project only focuses on Linux, having a similar name in a different "namespace" wasn't a significant issue.
TBH, I love BSL licenses. You can use the software as you want for free, except as a competitor, and there is a high chance of a sustainable business model. (What benefit do you expect if the company goes bankrupt?) Unpopular opinion or not, you have free and easy access to the code, so it is open source in the literal sense of the words. Just not in the sense that you can steal their business like AWS does. Feel free to start a similar project, invest your time and money, and make it available under whatever license you want.
It is open for me to use as I please. And I don't want to destroy their business. I can understand your ideological drive, but in reality it doesn't matter until you behave unethically and steal their intellectual property. Do you want that?
If you are sure that what you please is and always will be in line with what their company pleases, then I suppose so. Ideology has nothing to do with it. Nor does "unethically and steal their intellectual property" have anything to do with it: they can choose whatever license they want for their intellectual property and I can ethically choose not to use it. Unless your idea of "ethics" is to force me to use their intellectual property and agree to their license?
I mean I can go around calling C a functional programming language because it functions and I can build functioning programs with it... but that's not what the word means in anyone's discourse besides yours. I mean I support your freedom of speech, but also mine in saying your usage is disingenuous.
What's not open about it? I see the source, I see instructions to build from source, I see a license that seems to say I can copy and make derivative works out of it. Can you try to make a more substantive comment?
It's a weird license that seems to say "you can use this as long as you don't compete with us", with an automatic switchover to Apache 2.0 in 5 years. Definitely better than closed source, but probably not Open Source by the OSI definition.
Restricting Usage or Distribution makes it de-facto not OSS.
It's actually a slightly different form of an old debate. I'm thinking in particular about the Crockford license (the MIT-like one with "The Software shall be used for Good, not Evil." bit).
It was determined to be non-free quite a while back due to such restrictions.
That being said, it is hard to be a commercially successful software vendor with an OSS model (RethinkDB comes to mind).
I do understand why BSL exists, but it feels to me like an unsatisfactory compromise.
The license comes with restrictions on what you can use your derivative works for, e.g. not creating an in-memory datastore service. It's essentially Apache 2 with an "Also, AWS can't just steal it and sell it as a service when it gets huge" clause.
Is it open-source? Well, depending on your definition probably not. Is it a fair license? Yeah I'd think so.
From my point of view, for example, GPL is not "open" at all. And yet it is on the list. In my opinion, BSL is even more "open" than GPL. Feel free to have a different view.
DragonflyDB cofounder here. I am not shy about our choice of license.
Like with software design, everything is about trade-offs. Folks here voiced the reasons why we chose BSL. I am sure you are perfectly aware of all this.
I do not know you personally, but I noticed that you posted the link to the announcement. I am guessing you are passionate about technology and innovation. Dragonfly is much more than the licensing choice we made. I wish HN discussions here were about how fibers work in Dragonfly, how SSD tiering is going to be implemented, how we provide atomicity for Lua scripts while running many of them in parallel, etc.
Btw, Dragonfly relies on an io-engine called helio (roughly equivalent to tokio) that has been developed by me and open sourced under Apache 2.0.
While technical discussion about those details would be interesting, HN is also a strong entrepreneurial community and your license choice has a big impact on whether or not a business would choose to depend on your product. I would not choose to build on a BSL licensed foundation for my product because of real business concerns with vendor lock-in. I would also not choose to contribute to a BSL licensed code base because the CLA means I am not free to use the code base on equal grounds with other contributors (DragonflyDB Ltd in this case).
Because of these two things, I didn't spend a lot of time looking into the technical details and therefore can't really say much on them. By all means it sounds like a very interesting piece of technology- but the license makes it useless to me.
Please do not express your opinion as the opinion of the majority. Instead of discussing license choices, create your project and open source it as you like. Do not tell others how to build their business, especially on HN.
I never expressed my opinion as that of the majority. I expressed my opinion as my opinion, and I can express whatever opinion I want to and discuss whatever I want to on Hacker News, so long as Y Combinator (who owns Hacker News) is okay with it.
Are you a representative of Y Combinator/Hacker News? If you are, I will be glad to comply with your request.
The reply wasn't directed at Dragonfly, but was simply an answer to the parent. I think you got a lot of stick on HN. [1] In case you are wondering, I am much closer to the layman's definition of Open Source, so I am actually extremely supportive of BSL. A perfect balance of business needs and Open Source. But HN has gotten a lot more ideological than it used to be. So please keep up the good work.
[1] Part of the reason why I submitted Dragonfly: interesting technology should get more coverage on HN, and not be ignored simply because of some overzealous ideological reason.
(Originally posted 3 months ago on https://news.ycombinator.com/item?id=34231033)