> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.
I don't really agree. It's probably the most agreed upon benefit, but there are others. Fundamentally, microservices allow you to isolate state. I can take two services and split them up - and now I've split their state spaces. I can put a queue service between them and now I've sliced out the state of their communication.
This slicing up of state has a ton of benefits if done right.
1. Isolation of state is basically the key to having scalable concurrency. Watch the WhatsApp talk on Erlang, where they say "Isolation, Isolation, Isolation".
2. Isolation of services is great for security. You can split up permissions across your services, limit their access in a more granular way, etc.
Those are pretty nice wins. They're achievable through discipline in a monolith (heavy use of module boundaries, heavy use of immutability - something most mainstream languages don't encourage) but a network boundary really forces these things and makes unintentional stateful coupling a lot more painful.
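To make the isolation point concrete, here's a minimal sketch (not tied to any particular framework) of two "services" in a single process that share no mutable state and communicate only via a queue - the same shape a network boundary forces on you:

```python
import queue
import threading

# Two "services" in one process: each owns its state privately and
# communicates only via messages on a queue - no shared mutable state.
def counter_service(inbox: queue.Queue, outbox: queue.Queue) -> None:
    count = 0  # private state; nothing outside this function can mutate it
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        count += msg
        outbox.put(count)

inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=counter_service, args=(inbox, outbox))
worker.start()

inbox.put(5)
inbox.put(3)
observed = [outbox.get(), outbox.get()]  # replies arrive in causal order
inbox.put("stop")
worker.join()
print(observed)  # [5, 8]
```

Nothing here enforces the discipline except convention; a network boundary makes the same structure physically unavoidable.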
> It is easier to start with a monolith, find the right design and then split on the boundaries than it is to make the correct microservices to begin with.
I disagree. Starting with a bad set of microservices can be fixed by merging. Merging two codebases is trivial compared to splitting. Again, if I have isolation between the services, even if the slicing up was done badly, even if they're coupled, I can just remove the layer between them.
Splitting has to start from a place of coupling and then try to decouple - this is especially hard with languages that encourage encapsulated mutable state (most of them).
> I disagree. Starting with a bad set of microservices can be fixed by merging. Merging two codebases is trivial compared to splitting. Again, if I have isolation between the services, even if the slicing up was done badly, even if they're coupled, I can just remove the layer between them.
There is still coupling in microservices, it has just shifted to messaging, networking, and queuing. If you get any of those parts wrong, you have a worse mess to untangle with less mature debugging/logging tooling than a monolith enjoys, all the while likely dealing with eventual consistency (depending on the design). I'm not saying don't start with a microservice, but it likely wouldn't be the very first tool I would reach for when starting out if a monolith would do the job effectively. Most things will never be hyperscale and won't benefit from the increased concurrency. You can go a very long way with a "majestic monolith" and a bit of care.
> There is still coupling in microservices, it has just shifted to messaging, networking, and queuing.
Sure, in the sense that your service is "coupled" to a queue and if you don't abstract that away it's hard to change that queue implementation. But in the sense of two services you wrote being coupled, they aren't, in terms of shared state. That gets pulled out. There is no way for one service to mutate the memory of another - it has to send a message to it.
That can be TCP or it can be over some queue or stream or whatever.
> If you get any of those parts wrong, you have a worse mess to untangle with less mature debugging/logging tooling than a monolith enjoys
This is the case with any concurrent system. The fact that so many languages lack concurrency primitives is probably why people don't run into this more often. If you use concurrency primitives in your language, you already have this.
> all the while likely dealing with eventual consistency (depending on the design)
There's nothing eventually consistent about this system. It fundamentally has causal consistency (since messages from a service must come after messages to that service that triggered them), and it's perfectly capable of leveraging transactions.
> I'm not saying don't start with a microservice, but it likely wouldn't be the very first tool I would reach for when starting out if a monolith would do the job effectively.
To each their own. I much prefer it. It's far simpler to maintain "good" design since the network boundary creates a hard line in the sand that you physically can not violate.
They are coupled by the queue itself (you accounted for your queue going down and out of order/delayed messages right?), the network (i.e. what happens if some microservices go offline?), and most importantly the event message abstraction. Nothing is for free, and the event message abstraction/format is the new shared state in microservices. It's easy to get the event messaging abstraction wrong in green field projects, since you likely don't understand the domain as well as you would like. If that goes wrong, it can be very painful to fix after the fact. Again, not slamming microservices, but we should go in with eyes wide open about the well-known benefits vs. the tradeoffs they offer. I refer to the high quality (and partially free!) course [1] taught by Udi Dahan from Particular that reviews many of the tradeoffs with distributed system design.
> The fact that so many languages lack concurrency primitives is probably why people don't run into this more often. If you use concurrency primitives in your language, you already have this.
The difference is that with a monolith, the entire application state is in one place, but with microservices its state is distributed. This makes logging and debugging more difficult along several dimensions. Finally, there are decades worth of tooling development at your disposal to debug and monitor your monolith (even concurrency issues). The tooling around debugging and troubleshooting microservices pales in comparison.
How is that any different to calling a function on a class? That's technically not class A modifying class B's memory either. B modifies its own memory in response to a message (function parameters) from A. The message going over a network doesn't make that fundamentally different.
There are many solutions, certainly. A network is one option, which I personally prefer, but as I said elsewhere it's a "choose the right tool for the job" kind of situation.
I disagree. If you are able to merge them, you already spent the work to split them in the first place, so starting with microservices is more work overall. It goes back to agile: the easy solution is to have a monolith and figure out later how it can be well split.
> Fundamentally, microservices allow you to isolate state. I can take two services and split them up - and now I've split their state spaces. I can put a queue service between them and now I've sliced out the state of their communication.
Dr. Alan Kay would like a word. This is literally the premise behind OO:
> "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things."—Dr. Alan Kay
Outside of "extreme late-binding," which is a fascinating topic in its own right, isolating state is exactly the point of OOP. If we need microservices to accomplish isolation of state, that suggests we got OOP wrong, very wrong.
I'm extremely aware of Alan Kay's statement, as well as the foundations of the actor model. The reality is that today that is not what OOP has become, and Alan Kay would agree.
> that suggests we got OOP wrong, very wrong.
Alan Kay very clearly states that people "got it wrong" and that OOP was supposed to be about messaging. ie: He intended for it to be one thing, but it isn't that thing.
> I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.
The big idea is "messaging"
> The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.
[..]
I agree that these are Kay's thoughts and also agree with his take on what it should be. But I think the reality is more complicated than it simply being something that evolved away from his grand dream. It's more that there was a soup of ideas floating around during that time that came together as OOP and the combination that became dominant was something else. For instance Simula was already using inheritance prior to Kay's message passing proposal.
I can agree with all of that, I wasn't trying to imply otherwise, more just explaining that stating that OOP means what Kay wanted it to mean is ignoring history, consensus, and Kay himself.
I’m very aware that we evolved things Kay may not have intended, and that doesn’t make it wrong.
He may have coined the term, but he doesn’t own it, nor should we feel beholden to his vision dating back to 1972.
What I said was if the primary reason for microservices is hiding of state, then we got OO wrong, because OO, even the much maligned J2EE style of OO, can do that for us if we want hiding of state and message passing.
Another possibility is that microservices do much more for us than hiding of state and limiting communication to message-passing.
At my 9-5, we use Elixir to write our services and have a few Actor-based Scala services too, so my feeling is that we actually are doing OO fine, and that there’s something else that makes microservices compelling at scale.
> If we need microservices to accomplish isolation of state, that suggests we got OOP wrong, very wrong.
That’s the last sentence, which summarizes my point.
As for Dr. Kay, his exact words were saying that to him at the time OO was certain concepts and nothing more.
I have never interpreted that to mean that languages or systems that do more than hiding state and message passing are wrong, just that if we say something like “OO requires inheritance,” he would disagree with our definition of OO.
After all… Smalltalk itself has a lot more than hiding of state and message passing. Would anyone claim that Dr. Alan Kay would say Dr. Alan Kay was doing OO wrong?
I think there are good reasons to design microservice architectures, but if the argument is “Let’s break up our monolith so we can hide state,” I’d say that we can go ahead and just use our existing OO tools to achieve that.
It does, but that's because there are different flavors of OOP. Alan Kay's original take on OO was closer to the actor model than what grew into the mainstream spin on OOP with inheritance and the rest.
If you take 10 steps back and squint, microservices & the actor model start to look pretty similar.
Arguably, we did. In large codebases worked on by multiple teams, it's not unusual to see teams drilling holes in OO walls because they "just want to get their work done" and view the abstractions as barriers. Take that together with the unfortunate fact that the majority of engineers suck at decomposing things into objects, and the result is people preferring to move stuff out of process to keep the code simple and make the encapsulation more effective. I don't really think that's a good thing, but that's been my observation.
Erlang/Elixir feels like it strikes a middle ground, where every process behaves like one of Kay's objects, including the emphasis on message-passing rather than methods that behave like procedure calls.
The Erlang programming language is AFAIK the only programming language used in production that does what Alan Kay is talking about. So it is probably the only Object Oriented programming language used in production today.
That seems to be equating it with message passing, but messaging is already the first item mentioned by Alan (messaging, local retention ...) so I thought it might mean something else.
Network boundaries cause far more problems than they solve, and you've just shifted the complexity to now securing the network, usually with even more additional services, proxies, services meshes, firewalls, etc.
> Network boundaries cause far more problems than they solve,
They cause exactly 0 extra problems. A call from function A to function B can fail due to B having a bug. A call from service A to service B can fail due to B having a bug or a network failure. Either way, failure is possible and has to be handled - the network only makes that more obvious.
Further, a call between functions can cause mutated shared state - not the case across a boundary, they physically do not share mutable state.
> and you've just shifted the complexity to now securing the network
Not really. Fundamentally you have split your service capabilities up - now you can apply least privilege as you desire.
A failure is obvious all by itself. Network boundaries just turn it into a much bigger failure. And network failures are far more common and harder to test, handle and recover from.
> "Further, a call between functions can cause mutated shared state - not the case across a boundary, they physically do not share mutable state."
This is false as state is not tied to your process nor does it require a network leap to add isolation.
> "now you can apply least privilege as you desire."
How exactly? It's not magic, you still have to apply them, and now it requires more strategies and effort to accomplish.
I'm ignoring the first two points since I'm tired of explaining these things to people - you can read the papers and watch the talks I've linked.
> How exactly? It's not magic, you still have to apply them, and now it requires more strategies and effort to accomplish.
Yes, we have tons of tooling for process isolation. Splitting a service into two services means you can isolate two processes instead of one, which means you break up the capabilities unique to each.
I used the word "apply" so I don't know why you're saying "you still have to apply them"... it's literally what I just said.
The only potential benefit there was for security, specifically because of your security-based product and the async nature of its processing. And even that was just relying on the ephemeral nature of lambdas instead of other security constructs or simply resetting instances of a monolith to accomplish exactly the same thing.
Nothing in that article explained a clear need or benefit of microservices.
Nothing in the article has to do with the product or the fact that it's security related, other than to provide a motivating use case.
> The only potential benefit there was for security
And performance.
> even that was just relying on the ephemeral nature of lambdas
I think you've failed to understand the article, which may be my fault, I haven't read it in a long time. The key is isolation. Ephemerality gives you a sort of temporal isolation. Splitting your messaging from your data storage gives you a capability based isolation. And so on.
It also means we can scale to the limits of S3/SQS - each service is itself stateless, the majority of state is managed in SQS, which could be quite loose about its consistency since every service is idempotent - arguably a form of temporal isolation.
What I've described in this article is effectively the actor model. I feel like I don't have to really justify the benefits of the actor model with regards to scale?
What part of microservices (split functionality with completely separate runtime artifact deployed to separate servers) is needed for actors? You can have actors in a monolith.
With a monolith you can put everything inside a database transaction and have an entire request's worth of logic succeed or fail together. That's a lot easier to manage than having parts of the logic spread over multiple systems succeed and other parts fail.
>It is pointed out that faults in production software are often soft (transient) and that a transaction mechanism combined with persistent process-pairs provides fault-tolerant execution -- the key to software fault-tolerance.
So distributed transactions with two-phase commit, XA and all that. Guess what, microservices hipsters tried that and it turned out too slow and cumbersome, so they invented "sagas", which are still immensely more complex than a single transaction in a single database.
You seem confused. If you want a transaction use a transaction. If you don't want a transaction don't use a transaction. If you need a transaction across services, that sounds like you've run across a microservice antipattern. Monolith/Microservice changes nothing here - you can have the same issue in a monolith where two different functions are managing transactions and now you want a single transaction.
Well you can't use database transactions across multiple connections, so presumably this would involve you implementing your own transaction and rollback system. That's a lot more complexity than using a system that just works out of the box.
I don't really understand. If I have a database, and a service is talking to it, it can open a transaction. If I then want to talk to other services, and rollback that transaction based on what happens with those, I can do that.
Microservices changes nothing about this. If you want to remove transactions by splitting up your logic such that it operates in terms of sequences or something, you can do that, but that's just a choice like any other.
Well... don't do that? This is where microservices comes in. In a SOA architecture nothing really tells you when it's a good idea to split things up. Microservices is a methodology to help you avoid this exact situation.
You'd have the same problem in a monolith if you have two different modules working on the same db.
Assume you have two tables A and B on the same DB. They are sort of seen as unrelated. Suddenly a feature request requires that A and B are mutated together consistently.
If it is in one service you just use a common DB transaction and get it done.
If it is in one microservice for A and one microservice for B then you have to somehow implement this transaction yourself. This is possible but more work.
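The single-service case can be sketched with sqlite3 (tables `a` and `b` here are hypothetical stand-ins for the A and B above): both writes commit or roll back together, with no coordination code at all.

```python
import sqlite3

# One service, one DB: writes to tables A and B commit or roll back
# together inside a single transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE b (id INTEGER, val TEXT)")

with conn:  # transaction: commits on success
    conn.execute("INSERT INTO a VALUES (1, 'x')")
    conn.execute("INSERT INTO b VALUES (1, 'y')")

try:
    with conn:  # transaction: rolls back if anything inside raises
        conn.execute("INSERT INTO a VALUES (2, 'x2')")
        raise RuntimeError("simulated failure before B was updated")
except RuntimeError:
    pass

rows_a = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
rows_b = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]
print(rows_a, rows_b)  # 1 1 - the failed write to A was rolled back
```

Split A and B across two services with their own databases and this atomicity disappears; you then have to rebuild it yourself (sagas, compensating actions, etc.).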
OK imagine that you have two different modules that manage transactions to a database. Now suddenly you need there to be consistent mutations between those functions.
Do you see my point? Microservices do nothing here - you have run into antipatterns that are universal, and that microservice architecture addresses explicitly as an antipattern.
I do not see your point. Sometimes consistent mutations between modules are wanted. Monoliths let you do it. Perhaps you discover your module boundaries were wrong, so you create a supermodule to encapsulate both and coordinate the joint transaction, then split up a different way later, and so on.
Module boundaries are refactorable.
Importantly, what if the workarounds you end up with to achieve the same things with microservices cause a ball of mud of services, reimplementing transaction logic that belongs in your DB in your homegrown network protocols?
You seem to say that some things are not possible with microservices and therefore this leads to cleaner code. My retort is that the kind of things one sometimes comes up with as workarounds to make things still work for microservices are so complex that the cure is worse than the disease you wanted to cure in the first place.
Why are microservices not refactorable? This is the same exact issue in both cases. You designed something for a use case, the use case changed, now your old design isn't working. So maybe you merge those two services, or merge those two modules, or whatever else you want to do.
> reimplementing transaction logic that belong in your DB in your homegrown network protocols
Don't do that? I mean, again, this issue of "I wrote something the wrong way and now I have to fix that" is not any better or worse in microservices.
> My retort is that the kind of things one sometimes comes up with as workarounds to make things still work for microservices
That doesn't sound like microservices. In fact, even the idea of having a database shared across services doesn't sound like microservices - it's an explicit antipattern. So it sounds like a bad SOA design. The point of microservices is to take SOA and add patterns and guidance to avoid the issues you're talking about.
Microservices get owned by different teams, teams get cemented, politics get in the way of refactoring. Game over.
Sure, if you are a single team working on 10 microservices you can probably refactor with abandon without spending 70% of your working days in meetings talking about migrations and trying to sync strategies...
You may have had the experience that microservices are as easy to refactor as monoliths; in my experience it is orders of magnitude harder...
I think there is a bit of a "No true scotsman" fallacy at play here. You see something you do not like then it is "not microservices done properly".
How about you state all the things you don't like about monoliths, then I say "that is not monoliths done properly" or "don't do that" for each one?
I think both monoliths and microservices can lead to good code or balls of mud depending on the organization and developers involved.
The real question isn't whether "microservices done right" is better. The question is: does a decision to do microservices reduce the chances of a ball of mud, when that decision is then implemented by imperfect developers in an imperfect organization?
PS I always meant that each microservice had their own DB above, we agree on that and I never dreamt otherwise.
What I was getting at is that when you go distributed, sometimes quite complex patterns must be applied to compensate.
You may say the architecture is then "better", but on what metric? It is certainly more work up front -- so you start out in the negative, and for the system to come out ahead, the system and organization need to reach a certain scale where the savings exceed the hours you invested.
In many scenarios the cost in developer-months needed up front is just as important as other factors in evaluating the best architecture. E.g. a scrappy startup simply should not do it IMO. Corporations... perhaps, but I have seen it go badly. (I guess it was just not "done right" then? See above.)
PS I think microservices excel in making people FEEL productive (doing work that is not directly benefiting the company).
I have personal experience with the same product built twice, once as a monolith by a small team that worked really well and once as lots of services.
The feature set and development speed are about the same, but the many-services version requires 10x as many people.
However by splitting into many services everyone feels productive doing auxiliary and incidental work. Only those of us who worked on the first system are able to see that the total output of the company is the same but 10x as expensive.
> Microservices get owned by different teams, teams get cemented, politics get in the way of refactoring. Game over.
I don't understand how microservices make this worse in any way. Modules get owned by different teams all the time.
> You may have had the experience that microservices are as easy to refactor as monoliths; in my experience it is orders of magnitude harder...
Yes, I have said before that I believe merging is fundamentally simpler than splitting. If we're just talking about merging a module vs a service, I don't believe either is harder than the other - I mean... nothing about microservices prevents you from using modules, and indeed I would highly recommend it.
> I think there is a bit of a "No true scotsman" fallacy at play here. You see something you do not like then it is "not microservices done properly".
For sure, and that's a failing of microservices. People think microservices means "SOA", or "write a lot of services". If you want to criticize SOA or whatever, sure, the argument of "don't do that" goes away.
> How about state all the things you don't like about monoliths , then I say "that is not monoliths done properly", "don't do that" for each one?
I probably could state a bunch of things that are pretty fundamental, but I don't think it's important - I don't know that I've actually said anywhere that microservices are better than monoliths, what I've instead said are the benefits of microservices that I see, which others have taken to mean that I somehow think monoliths or modules are bad.
> You may say the architecture is then "better"
I honestly don't think I've said that anywhere, or even made a judgment anywhere.
I think I can summarize, again, what I've said.
1. Network boundaries provide a physical layer that enforces isolation of state and the use of message passing
2. Isolation of state makes scaling a system easier
3. Isolation of capabilities makes securing a system easier
4. SOA inherently leverages the network boundary
5. Microservice Architecture is similar to SOA but with a bunch of patterns, guidance, and concepts that you leverage in your design
What I've received in response is a hodgepodge of:
1. "Modules can isolate state" - only true in some languages, and even then there's no physical barrier enforcing it, you're relying on developers to maintain that isolation.
2. "But what if you do anti-patterns that microservices tell you not to" - ok, that's why microservice architecture has books and documentation about what not to do. If you do those things, I'm not going to blame you, it's a failing of all methodologies when users have a hard time understanding them.
But so far the anti-patterns mentioned aren't really compelling or specific to microservices. You wrote code to satisfy a domain, the domain changed, now you need to change that code so that it satisfies the new domain. That happens all the time, merging services isn't any harder than merging modules.
3. General misunderstandings about state, security, etc.
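On the first point, in most mainstream languages module and class privacy is a convention, not a physical barrier. A Python sketch (the `Account` class is hypothetical, just for illustration):

```python
# "Private" state in most mainstream languages is a convention, not a
# physical barrier - nothing stops a determined caller.
class Account:
    def __init__(self):
        self.__balance = 0  # name-mangled, "private" by convention only

acct = Account()
# A caller can still reach behind the API via the mangled name:
acct._Account__balance = 1_000_000
print(acct._Account__balance)  # 1000000
```

Across a network boundary there is no equivalent of the mangled-name backdoor: the only way in is the message protocol the service exposes.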
> What I was getting at is that when you go distributed,
I'm not really convinced that "distributed" is the right word here. People talk about distributed systems being complex, and I think they're confused - what's complex is consensus, but splitting one service from another service shouldn't impact consensus, and the fact that they're now located on two different assets does not necessarily make things more complex.
Those services may be more complex, if your application was quite trivial - a totally stateless system with no external connections, for example. I see no reason to rewrite 'grep' as a microservice, and I would never recommend that.
Those services may now be more error-prone because you have things like dns, tcp, etc involved. If you don't want to make that tradeoff, that's OK, you could be right in that case. Again, no need to make all software be microservices.
(Going to respond to your other message here)
> PS I think microservices excel in making people FEEL productive (doing work that is not directly benefiting the company).
Maybe, I don't really know. It isn't my experience, but that's just me. Most developers seem to be pretty bad at their jobs so I imagine that all sorts of issues can be experienced. Certainly the idea of rewriting a monolith as a microservice seems like a red flag unless there were very specific needs.
At some point, you can't gracefully handle bugs in other people's code. If a function you call causes a SEGFAULT, in the vast majority of software, you're not expected to handle that. That's an invariant error, and you probably want some way to detect that it happened so you can fix it, but it's not reasonable to ask every caller of every function to handle it (in the same way we don't consider "the earth blew up" to be a reasonable thing to protect against, even though it is technically possible). There's simply not enough time and money to protect against every possible edge case in most software (NASA projects aside).
The argument here is that network issues are exceedingly common in microservice environments and so aren't actually an edge failure case, so you actually have to worry about them way more than you would worry about a function in a different module causing a SEGFAULT.
The point is not to handle individual bugs, it is to handle all failures. This is the difference between a "defensive programming" approach and the "let it crash"/ "zen of erlang" approach. Actors are designed such that they have failure isolation, which means they can react to errors in other actors without worrying about their own state. They then have two options based on one of two bug classes - transient and persistent.
Persistent errors are propagated to the supervisor. Transient errors are either retried or propagated.
It doesn't matter if it's a network error, a disk error, a timeout, a crash, a cosmic radiation bit flip - your approach is always one of those two. So adding more failure cases doesn't "matter" in terms of your error handling, although you may want to adopt helpful patterns in the nuances of "retry".
The frequency of errors will obviously increase over a network (arguably very, very little), but the pattern is fundamental to resiliency.
If your network is truly so unreliable that you can not pay that cost, don't do it. I don't think most people are developing on networks that fail for long periods of time frequently.
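The transient-vs-persistent handling described above can be sketched in a few lines (the `supervise` helper and `PersistentError` class are hypothetical names, not from any particular actor library):

```python
import time

class PersistentError(Exception):
    """A persistent/invariant error - retrying will not help."""

def supervise(operation, retries=3, delay=0.01):
    # Transient errors: retry a few times, then propagate.
    # Persistent errors: propagate to the supervisor immediately.
    for attempt in range(retries):
        try:
            return operation()
        except PersistentError:
            raise
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# A flaky operation that fails twice, then succeeds. Whether the failure
# was a network error, a disk error, or a timeout is irrelevant to the
# handling strategy - it is retried or propagated either way.
attempts = {"count": 0}
def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = supervise(flaky)
print(result, attempts["count"])  # ok 3
```

Adding the network as a failure source changes the frequency of retries, not the shape of the handler.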
I'm not sure what you're talking about. What automatic handling of network exceptions? What safe failures? BEAM has lots of great features, no question, but they have very little to do with the implementation of actors - BEAM primarily provides names and linking as useful primitives.
Maybe? I can't compare 50% to 100 billion. If your computer crashed every 100 billion instructions that would be a problem. If it crashed every other instruction, that would be very slightly more (or the same amount) of a problem.
The point is that if you have a function call, you have the opportunity for a bug/ failure. Networks don't change that - you have the opportunity for a bug/ failure. The major difference is that services have stronger failure isolation.
I didn't say it removes state. I said it split the state up and isolated it. That's critically important - you physically can not mutate state across a network, you have to pass messages from one system to the other over a boundary, either via some protocol like TCP or via intermediary systems like message brokers.
That timestamp is rough, I just found a related section of the talk.
> And the network-defined state is a hell of a lot harder to trace and debug.
There's no such thing as network-defined state. I assume you're saying that it's harder to debug bugs that span systems, which is true, but not interesting since that's fundamental to concurrent systems and not to microservices.
I think you have a very narrow idea about what "mutating state" really means. You seem to talk about DMA access only. But you can manipulate the state of an application by writing to a shared data store, by calling an API, and countless other ways. It is really more of a concept for us humans to define where an application begins and ends.
Let's take an example. If we have two services that want to keep the full name of a logged-in user for some reason, that piece of state can be said to be shared between the applications. Should one service want to change that piece of data (perhaps we had it wrong and the user wanted to set it right), the service must now mutate the shared state. It does not matter whether it is done by evicting a shared cache or if we write the updated data to the service directly; we still speak of a shared state that is updated.
Now we can stipulate that the more of these things we have, the more coupled two pieces of software are, which generally makes reasoning about the system harder. It is not as black and white as one type of coupling being acceptable and another not, but some types are easier to reason about than others. Joe really thought hard about these things and it really shows in the software he wrote.
We all share state in that we all exist within the same universe. But the universe has laws of causality, and Joe advocated that software should always maintain causal consistency.
A database is not needed for your example. You could replace it with an actor holding onto its own memory. But all mutations to that actor, which the other actors hold references to via their mailbox, are causally consistent and observable.
That is the premise of the talk I linked elsewhere.
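The actor idea above can be sketched in Python with a thread and a queue standing in for an Erlang-style process and mailbox (all names here are made up for illustration). Because every read and write goes through the one mailbox, observers see mutations in the single order the actor processed them:

```python
import queue
import threading


class NameActor:
    """Toy actor: private state, mutated only via mailbox messages."""

    def __init__(self, name):
        self._name = name                # touched only by the actor thread
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Process one message at a time - this serialization is what
        # gives observers a causally consistent view of the state.
        while True:
            msg, arg, reply = self._mailbox.get()
            if msg == "set":
                self._name = arg
                reply.put(None)
            elif msg == "get":
                reply.put(self._name)

    def _call(self, msg, arg=None):
        reply = queue.Queue()
        self._mailbox.put((msg, arg, reply))
        return reply.get()               # block for the actor's reply

    def set_name(self, name):
        return self._call("set", name)

    def get_name(self):
        return self._call("get")


actor = NameActor("Jon Doe")
actor.set_name("Jane Doe")   # the mutation is serialized via the mailbox
print(actor.get_name())      # Jane Doe
```

No database is involved; the mailbox alone orders the updates.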
> Fundamentally, microservices allow you to isolate state.
I think it's not really state isolation: your state is now spread across multiple separate services and a queue, which is objectively more complicated. To me, it's more the extreme version of things like dunder methods in Python or opaque structs in C: it prevents a specific type of programmer behavior. But honestly, it feels easier to solve this in code review.
Like, I agree it's bad to reach behind the public API of something, but microservices aren't immune to this. I've never worked on a microservice architecture that didn't have weird APIs added just to support specific use cases, or a bunch of WONTFIX bugs kept because other services depended on the buggy behavior. That's not fundamentally different from "this super important program calls .__use_me_and_get_fired__": you have an external program dictating the behavior and architecture of your own.
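To make the dunder comparison concrete: Python's name mangling (attributes named `__x` inside a class) hides an attribute, but a determined caller can still reach behind it. The boundary is advisory, enforced by code review rather than the runtime, which is the parallel being drawn to leaky service APIs. (`Service` and its attribute names are invented for this sketch.)

```python
class Service:
    def __init__(self):
        # Name mangling rewrites this to _Service__internal.
        self.__internal = "do not touch"

    def public_api(self):
        return f"value: {self.__internal}"


s = Service()
print(s.public_api())        # value: do not touch

# Reaching behind the public API still works - nothing stops the
# "super important program" from depending on this:
print(s._Service__internal)  # do not touch
```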
And you get multiple other layers of complexity here: networks, distributed transactions, separate dependency graphs, securing inter-server communications, authentication/authorization.
I don't think you're entirely wrong: there's a lot of history of looking at state as a series of immutable updates (Git, Redux), and I think it is harder to "cheat" this way with microservices. I just think it's far from a clear win.
> Fundamentally, microservices allow you to isolate state.
You can have logical dependencies between those "isolated" states anyway, so I don't really see it as a benefit compared to, say, private fields in Java OOP.
Re immutability: I would say a well-written backend in any language would (probably?) throw away its entire state between requests. It's possible to introduce state, sure, but why and how does that happen? For very many backends the only natural thing is to code them stateless: keep all the state in the database, and each new request starts in a fresh world.
I see two common sources of state in any backend (monolithic or not):
1) Caching, whether of resources, flags, or whitelists.
2) Connection pools.
If there are ever any issues with those, they can be segmented inside a monolith for a fraction of the cost of going to microservices, either along the same boundaries you would have used to split into microservices, or along other ones, like one set of caches/connection pools per endpoint handler.
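The segmentation idea can be sketched as giving each endpoint its own cache instance inside one process, along the same boundaries a microservice split would have used, but without the network hop. All names here (`Cache`, `get_user`, `get_order`) are invented for the sketch:

```python
class Cache:
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


# One cache per boundary, rather than one shared cache for everything.
user_cache = Cache()
order_cache = Cache()


def get_user(user_id):
    if (u := user_cache.get(user_id)) is None:
        u = {"id": user_id}           # pretend this is a DB fetch
        user_cache.put(user_id, u)
    return u


def get_order(order_id):
    if (o := order_cache.get(order_id)) is None:
        o = {"id": order_id}          # pretend this is a DB fetch
        order_cache.put(order_id, o)
    return o


# Flushing (or corrupting) one segment can't touch the other boundary.
get_user(1)
get_order(9)
order_cache._data.clear()
print(user_cache.get(1))  # {'id': 1} - unaffected
```

The same trick applies to connection pools: one pool per segment gives the isolation without a service split.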
So I agree with the OP that the social aspect and the development process are the only real rationale for microservices.
Otherwise, just scale the monolith horizontally to the same number of instances and you have strictly more ways to partition state; microservices only give you one way to partition state that may not even be the best one.