Codeberg is moving and what this means to you (blog.codeberg.org)
119 points by pferde on Nov 14, 2022 | 51 comments


> We have been looking for this moment for more than a year, since we installed our own hardware in Berlin. We are running Woodpecker CI, Codeberg Pages, Weblate Translate and more on it. But the heart of Codeberg – Gitea – is still on a rented cloud instance.

Huh, I haven't really heard of anyone running Woodpecker CI out in the wild personally, so this is a nice thing to hear about.

From what I can tell, it was forked from Drone CI due to licensing concerns, and their repository appears to have been a bit more active than Drone's:

https://github.com/woodpecker-ci/woodpecker/pulse/monthly

https://github.com/harness/drone/pulse/monthly

I wonder whether it's going to be a Gogs vs Gitea situation, since I'm still running Drone myself (it is simple, stable, and suits my needs at this time).
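For anyone who hasn't seen it, a Woodpecker pipeline lives in a YAML file in the repository and stays deliberately close to Drone's format, which is what makes switching between the two feasible. A minimal sketch (the image and commands are illustrative, and the exact syntax may differ between Woodpecker versions):

```yaml
# .woodpecker.yml — minimal pipeline sketch. The format is close to
# Drone's .drone.yml, which eases migration in either direction.
pipeline:
  test:
    image: golang:1.19
    commands:
      - go vet ./...
      - go test ./...
```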


Woodpecker is a great tool. Paired with Gitea, it probably has the smallest footprint of the available options.


I just installed it on my homelab k3s instance, with Gitea also providing OAuth2. It went surprisingly smoothly (apart from the time I spent tracking down a typo in the Helm chart...)
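For anyone curious, wiring Woodpecker to Gitea as its OAuth2 provider mostly comes down to a few environment variables on the Woodpecker server. A hedged docker-compose-style sketch (hostnames and credentials are placeholders; the client ID/secret come from an OAuth2 application created in Gitea's settings):

```yaml
# Illustrative environment for the Woodpecker server container,
# pointing it at a Gitea instance as the OAuth2 provider.
services:
  woodpecker-server:
    image: woodpeckerci/woodpecker-server
    environment:
      - WOODPECKER_HOST=https://ci.example.com
      - WOODPECKER_GITEA=true
      - WOODPECKER_GITEA_URL=https://git.example.com
      - WOODPECKER_GITEA_CLIENT=<oauth2-client-id>
      - WOODPECKER_GITEA_SECRET=<oauth2-client-secret>
```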


Our very small company runs Woodpecker CI + Gitea, and they work very well together.


> don't rename your user, projects, organizations or transfer repositories.

for the transition period.


> But the heart of Codeberg – Gitea – is still on a rented cloud instance

> We have a remote Ceph filesystem mounted on the old cloud VPS

They had one (1) VPS running a service that they charge for with 38k users? That seems not great for reliability and redundancy.


> That seems not great for reliability and redundancy.

A single server, especially when virtualized, offers great reliability. Add regular backups in the form of disk snapshots, and you'll have a "thing" which can restart in 10 seconds and roll back in half a minute.

Yes, you can do failover pretty simply, but a single server really goes a long way.

We redundantly install critical servers, but the backup doesn't kick in unless the other one halts and catches fire (sometimes literally).


If you want to go the enterprise route, you can mount the VM on an on-prem block store with a secondary ~ 10-20ms away, and set it to async or sync replication mode. If the primary DC burns down (hey, it happens!), you just fail over and lose ten or zero seconds of transactions, respectively. Downtime is roughly time to detect the outage + time it takes to reboot the primary.

You can also do this in the cloud, of course, but I've found that grounding the implementation in a physical example clarifies exactly what I'm talking about.

Some people claim this isn't a "high availability" solution, since you need to take the service down while you reboot for OS patches, etc., and during those windows, all requests fail. However, that's often fine. In particular, these sorts of setups can easily hit five nines, which is as good as an AWS region. However, in normal operation, they don't spuriously generate error codes like S3, etc. do.

If you count your SLA as "service went 1 minute without generating a spurious error for any client", then this will have many more 9's than a microservice architecture. If you count your SLA as "at least one request returned 200 in a 1 minute window", then the two approaches will probably be comparable, at least until you start running the microservices in multiple AWS regions (or whatever) and eliminate outages due to human operator error.
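To make the "nines" arithmetic concrete, here's a small Python sketch of the yearly downtime budget at each availability level; five nines leaves roughly five minutes a year, which a fast-restarting single VM can plausibly stay within:

```python
# Rough availability math: yearly downtime budget for a given number
# of "nines". Five nines allows ~5.3 minutes of downtime per year.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_budget_minutes(nines: int) -> float:
    """Yearly downtime budget (minutes) at 99.9...% availability."""
    availability = 1 - 10 ** -nines
    return (1 - availability) * MINUTES_PER_YEAR

for n in range(2, 6):
    print(f"{n} nines: {downtime_budget_minutes(n):8.2f} min/year")
```

Detection time plus a reboot fits inside that budget a handful of times a year; it's the slow, undetected degradations that eat it.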


What about simple stuff, like updating the kernel for the latest security patch, which needs a quick restart?


None of the pets we run complain when the service is unavailable for a minute or so, which is what restarting a system takes in the worst-case scenario.

Even then, we don't blindly restart our cattle servers. We drain and restart them, which means the load has already shifted to other servers in the fleet.


You missed the context of the comment. It was about the single-server case, not 2+ servers, let alone a fleet :)


I meant that, most of the time, we can install a single server that serves a whole fleet, and that single server doesn't get restarted unless we update it, because they're reliable.


It turns out you can get surprisingly far with a single server. You'd be amazed.


Indeed, people vastly underestimate how far you can go with a single server. Even DigitalOcean offers a 48 vCPU / 96 GB instance for ~1000 USD that could host most of the businesses seen on HN. Obviously, if you're doing storage-intensive stuff you wouldn't use the built-in storage, if you're doing really CPU-intensive stuff you might outsource that to a different instance, and so on.

But people tend to want to distribute their app with 5k monthly users across 10 really weak instances instead of just 2 medium ones or 1 big one. They think this makes it more reliable, but they fail to account for the added complexity of distributed architectures, and then deployments and changes start failing instead.

Another thing people underestimate is the performance of dedicated instances vs "virtual" instances. Many times I've seen people shocked at how fast applications run on dedicated instances, and how far you can scale with them, since you won't have to upgrade as often as you'd think. Cloud is mainly meant (for me at least) for software you need to scale really fast up AND down, not just up as you gain users. But most businesses I've worked for, worked with, or started myself never really had that need.


The reality of it is that a lot of people vastly overestimate the activity that whatever they build will actually see. You can easily host a lot of stuff with just a "basic" SQL database and a single frontend, as long as you have proper rate limiting on your APIs and make sure your frontend is sane. NGINX's original design goal was to handle 10,000 concurrent connections, and nowadays it can handle up to ~400,000 without needing reconfiguration. Even a basic cache layer in nginx can often completely remove the need to call the backend, especially for open APIs. Postgres can easily handle dozens of connections at the same time with the right config flags.
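As a sketch of that nginx cache layer (paths, zone names, and the upstream address are illustrative, not from any particular deployment), even a validity of a few seconds can absorb most of a burst against an open API:

```nginx
# Micro-cache sketch: cache successful responses briefly so most
# requests never reach the backend.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m
                 max_size=1g inactive=10m;

server {
    listen 80;

    location /api/ {
        proxy_cache api_cache;
        proxy_cache_valid 200 10s;          # even 10s absorbs bursts
        proxy_cache_use_stale error timeout updating;
        proxy_pass http://127.0.0.1:8080;   # your backend
    }
}
```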

To give a bit of an estimate of how far basic infrastructure can carry you: the entire RetroArch project[1], which I'd say has about the same userbase as any small tech company, is hosted entirely, buildbot (i.e. CI) included, on a single $40 DigitalOcean instance.

Most software is surprisingly efficient/resilient if you just configure it properly. The only place where I've seen horizontal scaling as an actual necessity, rather than just a matter of properly configuring the tools you're given, is Rails. There's no real efficiency fix for Rails tooling from what I can tell; it's just slow as molasses, and you're going to foot an ever-expanding server bill as your userbase grows.

[1]: No endorsement.


Well, you probably want at least 2 for redundancy, but an active/passive setup on 2 nodes can cover a great many use cases if you're not running some memory-hogging Rails blob that needs hundreds of MB just to render a site.


In reality, the complexity of deploys and introducing new changes is far more likely to take down your service than anything else, unless you're dealing with very large scale. And for those problems, it doesn't matter whether you have 1 or 10 instances; shit goes down anyway as you push your change out to all of them.

Services falling over for no reason is really uncommon in my experience; it's usually because of A) introduced changes, B) performance bottlenecks that are really bugs in the code, or C) downtime of third parties that someone failed to account for, which brought down "our" application too.


The complexity of rsyncing your app onto 2 nodes instead of one isn't a problem, though.

The DB is the hard part (not with automation, but that's more to learn), as is storing files so all nodes can see them (if the app creates files during its normal work), but in most cases delegating that part to the cloudy cloud isn't a problem.

> In reality, complexity of deploys and introducing new changes are way more likely to shut down your service than anything else, unless you're dealing with very large scale. And for those problems, it doesn't matter if you have 1 or 10 instances, shit would go down anyways as you push out your change to all of them.

Right, but updating the OS is one of those changes, and you can't do that hitlessly if there is only one node in the system. And you do update your OS (or whatever FROM you got your container), right?

Sure, most downtime is more "developer fucked up" than "something happened to the hardware". But those fuckups are usually smaller in scale than "the server needs to be rebuilt from scratch/backup/CM manifest" and "just" need a revert.

> And for those problems, it doesn't matter if you have 1 or 10 instances, shit would go down anyways as you push out your change to all of them.

If you have tens of instances, you can be a responsible developer and do some kind of staged deploy to cut those fuckups by an order of magnitude or two.
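A staged deploy of that sort doesn't need much machinery. A hypothetical Python sketch, where `deploy_to` and `healthy` stand in for whatever your real tooling does (rsync, image pull, HTTP health probe):

```python
# Canary-style rollout sketch: deploy to a small batch first,
# health-check it, and only then continue to the rest of the fleet.

def staged_deploy(hosts, deploy_to, healthy, first_batch=1):
    """Deploy to `first_batch` canary hosts, verify, then roll out."""
    canaries, rest = hosts[:first_batch], hosts[first_batch:]
    for host in canaries:
        deploy_to(host)
        if not healthy(host):
            raise RuntimeError(f"canary {host} unhealthy, aborting rollout")
    for host in rest:
        deploy_to(host)

deployed = []
staged_deploy(
    ["web1", "web2", "web3"],
    deploy_to=deployed.append,
    healthy=lambda host: True,   # pretend the canary passed its probe
)
print(deployed)  # → ['web1', 'web2', 'web3']
```

A bad push then only takes out the canary batch instead of the whole fleet.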


In practice, for most projects, being inaccessible for a couple of hours is not the end of the world. And you loose less money than by falling into the DevOps/SRE hole, where you pay for everything 5 times, plus the people to take care of it all, only for it to go down anyway because of a long-tail event.


Most projects don't make money... obviously, someone's personal blog incurs near-zero monetary loss if it is down.

But honestly, if you don't want to do ops, outsource it instead of half-assing it; there are plenty of options.

> And you loose

*lose


My point is not performance; I'd agree you can get far. My point is that when selling a service, it seems somewhat irresponsible to run it on such a clear single point of failure.


Does running on a single instance actually make things less reliable, though? In my experience, things go down because humans deploy changes, not because one instance was lost to hardware failure or connectivity issues. I've outlined more of this thought here too: https://news.ycombinator.com/item?id=33593361


Running on one instance ties you to the reliability of the hardware.

Running on multiple instances ties you to the reliability of the hardware AND the reliability of whatever software you use for clustering, which is also heavily dependent on your skill with that software.

So it will entirely depend on what you run.

If you, say, run a cache server (Varnish or whatever else) behind some LB instead of a single Varnish instance behind the same LB, the skill required to set it up is zero and the extra software is zero, so you end up with more reliability. We've had exactly zero problems with that kind of setup for a decade+, and it never caused any failure.

On the other hand, if you use something like Pacemaker (our usual config was a pair of DRBD + DB nodes managed by Pacemaker), it is an absolute nightmare to "get right": every mistake you make configuring it makes it less reliable, and there are plenty of edge cases to think of and handle, some of which might not even exist when you start configuring it.

For example, we had a hiccup at one point where the initial config had too short a timeout for shutting down the database, so Pacemaker assumed it had hung and killed it.

Other times, the DB occasionally failed to start because Pacemaker was (back in the SysV days, no systemd) running /etc/init.d/db start and status in quick succession, and Java hadn't managed to write the PID yet, which made status return "db is down", confusing the resource manager.

We now have a bunch of setups like that also running rock solid, but it took a lot of time and iteration to get right. No extra cost to set up now, really, because we have it in CM, but for a while it definitely had lower actual uptime than "just a node with a DB".

And there are Elasticsearch clusters that are simply more stable with 3 nodes than with one, just because an index can occasionally shit itself on a crash...


It's more about what happens once there is an issue. I agree that if something works and remains unchanged, it will probably stay reliable; the issue is that once you introduce a change, even nowadays, you can break something that isn't easy to undo (or troubleshoot without taking services down).

For low-traffic stuff it might seem OK, but no redundancy for a service that users pay for seems irresponsible. However, Gitea might just be for them to track internal stuff, so maybe a little less so.


> It's more about what happens once there is an issue

But again, when I've seen distributed infrastructure, it's very uncommon for an issue to hit one instance while the others are spared from the very same issue.

When you introduce a change, you introduce it to all running instances, either all at once or over a period of time. You might be in luck if you're doing blue/green deployments, but you can achieve mostly the same thing with a staging + production environment.


Hardware failures will hit one twin and not the other. Same for some kinds of high load triggers. Major changes like OS upgrades can be applied to one twin and then the other.


How often do you hit actual hardware issues on dedicated instances for small to mid-size applications? Usually you see those issues the first time you use the machine, or it gets replaced data-center-wide and you get forewarning so you can move stuff over first, with minimal downtime.

Same with OS upgrades: I can't remember the last time an OS upgrade completely borked anything I have deployed. At worst, the kernel didn't boot, but recovery from a backup was fast enough.


They're using Ceph, so storage is (probably) still replicated; it's just compute that's a single point of failure. It's possible that they've got automation in place so spinning up a new instance (on a VPS, so there isn't any physical work involved) is quick enough to be acceptable.


To be fair, that data point doesn’t tell you their backup, security, or alerting protocols.


I think they are complaining about the lack of redundancy, not computing power.


That's what cloud providers want you to believe. :)

I have had both VPSes and colocated servers chugging along for over 5 years without any interruption. In the worst-case scenario, where I lost one of them, given proper backups, it would take me less than a day to restore.

Few of the deployments I've seen on cloud providers, Kubernetes, etc., could go 5 years without downtime.

Sure, there are other costs, like scaling, finding people who can work on them, or when the dreaded time for an update comes... But still, a single VPS can get you very far at a very small cost.


> that they charge for

I think it is a free service. They are a non-profit organisation.


Ah yes, fair enough. I had spotted they were non-profit, but not that they were actually free on top of that.


They don’t charge. You can donate or become a paying member of the association. Or not; it's up to each user to decide. (I have been a paying member for quite some time.)


A VPS is probably running on replicated network drives, so no single hardware failure would bring it down. It would probably restart on a different host in seconds.

I am not sure that adding logical replication on top would improve reliability or redundancy that much.


Did one VPS have enough filesystem capacity, or were they using something else to supplement it?


I can run demanding AAA games on my PC.

There is no reason software can't run on a single PC.

Time to study "software performance and optimizations".


Off topic: what’s the difference between Codeberg and GitHub/GitLab/Bitbucket?

What does “free” mean here?

If GitHub is not an option, host your own Gitea. How does switching domains help?


Codeberg is a hosted gitea for open source only. The advantages are:

- Hosted in Europe and held to EU privacy laws

- Does not require consenting to Copilot training on your code

- Maintained by a non-profit so immune from the "We need to make money so we're going proprietary" trend

- Faster than Gitlab.com

The disadvantages are:

- Open source only. This isn't for your private proprietary projects. They do allow "some exceptions", but that's for things like the Terraform config for your project website when the rest of the project is on Codeberg anyway, rather than open core or "I have my open and closed projects in one place".


> Maintained by a non-profit so immune from the "We need to make money so we're going proprietary" trend

In principle you are right, but there are regulatory-capture scenarios that can undermine that, so it is still important to keep paying attention. Speaking as someone who is still baffled by how Mozilla fired I-don't-know-how-many people while still paying the top board members millions a year.

Non-profits are still better than any other currently existing legal construction I can think of, though.


Are you by any chance confusing the Mozilla Corporation with the Mozilla Foundation? The former is the for-profit organization that pays most of the developers of the browser and adjacent projects. The foundation is largely meaningless at this point (in my opinion) and serves little purpose beyond providing a good PR talking point.


If I am, that is only by their own design of splitting these things; don't blame me for that.


> - Does not require consenting to copilot training on your code

What's to stop Microsoft from training it on code outside of Github?


> Hosted in Europe and held to EU privacy laws

Not only hosted in the EU, but also by an EU legal entity. This is an important distinction under the GDPR.


Unlike GitHub, they're entirely open source, and they're registered as an e.V. (eingetragener Verein), the German version of a non-profit volunteer association, which generally cannot engage in commercial activities.


Not everyone can afford to self-host a secure Gitea. Codeberg is a non-profit (German e.V.) that serves as a libre alternative to GitHub/GitLab/Bitbucket, with a special focus on privacy and data protection. A close example would be the OSM Foundation.


> Not everyone can afford to self host a secure Gitea.

..what??


A VPS to do that costs $120/yr. While that's nothing for a professionally employed software developer in the West, some of us contributed to open source back when we were broke high schoolers and would like the current generation to be able to as well, while others are in less fortunate economic situations.

Also, figure maybe 4 hours/month of amortised maintenance. Some people just aren't willing to spend that time, and while I self-host a lot, I can't blame them.


> A VPS to do that is $120/yr

Minor nitpick: I've been running Gitea on a $5/mo VPS for a few dozen personal projects, and it's incredibly performant. I was also a starving student once, and I appreciated the people who supported open source before me. Now I support dozens (with time AND money) as well!

Also, maybe aliquot's question was more about the "secure" part?


I’m running a Gitea instance (along with other stuff) on a $3.25 per month ($39 per year) VPS with zero maintenance, thanks to Debian security auto-updates and Certbot.

Alternatively, you could use a RasPi, or whatever hardware you have at home, with a free DynDNS service.
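For reference, the "zero maintenance" setup mentioned above likely boils down to Debian's unattended-upgrades package; a minimal sketch of the usual knobs (the exact file layout may vary by release):

```
# /etc/apt/apt.conf.d/20auto-upgrades
# Enables nightly package-list refresh and automatic installation of
# security updates (requires the unattended-upgrades package).
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```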


It is based in Germany, thus entirely immune to DMCA takedowns.

Note: Codeberg was founded by a developer who received a DMCA takedown on GitHub.



