To those using Elixir, how are you deploying it? Would love your insight especia...

jhgg · on July 26, 2018

At Discord, we use distillery to package, and salt to configure our Elixir machines. We also bake golden images using basically the same thing.

Our CI builds our elixir artifacts, updates them to GCS, then our minions pull the artifacts, untar them and move them into the right place. Aside from deploying go binaries, Elixir binaries have been the next easiest thing.

We have one (very low throughput) service that runs Elixir inside of a docker container, however, the rest just run BEAM on the VM directly, with no containerization. BEAM at higher load does not enjoy sharing cores with anyone. Most of our Elixir services sit at 80-100% of CPU on all cores. BEAM likes to spin for work, and is CPU topology aware. Two BEAM scheduler threads running on the same core is a recipe for disaster. Since we're fully utilizing our VMs, and shipping the tars is simple enough, Docker gave us too much operational overhead for it to be worth it.

We've also had our fair share of troubles running the BEAM VM on GCP at scale. I could go into this in more detail if anyone's curious.

iamd3vil · on July 26, 2018

> I could go into this in more detail if anyone's curious.

Please do!

jhgg · on July 26, 2018

We've experienced a plethora of platform issues that were exacerbated by how BEAM consumes resources on the system. Here's a few that come to mind:

- CPU utilization can differ wildly on Haswell & Skylake. On Skylake processors our CPU utilization jump by 20-30% due to Skylake using more cycles to spin. Luckily, all of that CPU time was spent in spinning, and our actual "scheduler utilization" metric remained roughly the same (actually, on Skylake it was lower!).

- Default allocator settings can call malloc/mmap a lot, and is sensitive to latency on those calls. Under host memory bandwidth pressure, BEAM can grind to a halt. Tuning BEAM's allocator settings is imperative to avoid this. Namely, MHlmbcs, MHsbct, MHsmbcs and MMscs. This was especially noticeable after meltdown was patched.

- Live host migrations and BEAM sometimes are not friends. Two weeks ago, we discovered a defect in GCP live migrations that would cause a 80-90% performance degradation on one of our services during source-brownout migration phase.

GCP Support/Engineering has been excellent in helping us with these issues and taking our reports seriously.

ngrilly · on July 26, 2018

> Live host migrations and BEAM sometimes are not friends. Two weeks ago, we discovered a defect in GCP live migrations that would cause a 80-90% performance degradation on one of our services during source-brownout migration phase.

I thought that GCP live migrations were completely transparent for the kernel and the processes running in the VM. I'd be happy to read a bit more about the defect that made BEAM unhappy.

AboutTheWhisles · on July 26, 2018

> Default allocator settings can call malloc/mmap a lot, and is sensitive to latency on those calls. Under host memory bandwidth pressure, BEAM can grind to a halt. Tuning BEAM's allocator settings is imperative to avoid this. Namely, MHlmbcs, MHsbct, MHsmbcs and MMscs. This was especially noticeable after meltdown was patched.

Excessive allocations and memory bandwidth are two very different things. Often then don't overlap, because to max out memory bandwidth you have to write a fairly optimized program.

Also, are the allocations because of BEAM or is it because what you are running allocates a lot of memory?

jhgg · on July 26, 2018

BEAM's default allocator settings will allocate often. It's just how the VM works. The erlang allocation framework: http://erlang.org/doc/man/erts_alloc.html is a complicated beast.

We were able to simulate this failure condition synthetically by inducing memory bandwidth pressure on the guest VM.

We noticed that during certain periods of time, not caused by any workload we ran, the time spent doing malloc/mmap would 10-20x, but the # of calls would not.

bostonvaulter2 · on July 27, 2018

> We noticed that during certain periods of time, not caused by any workload we ran, the time spent doing malloc/mmap would 10-20x, but the # of calls would not.

I'm curious what tools you used to discover this

jhgg · on July 27, 2018

https://github.com/iovisor/bcc

passer-by-123 · on July 26, 2018

Thanks! I would be great if you wrote an article on these with more details, as it would be really helpful to both Elixir and Erlang communities.

_asummers · on July 25, 2018

Distillery --> Kubernetes. Use libcluster library to cluster nodes with k8s DNS. Pretty plug and play for that one.

samhamilton · on July 26, 2018

Yep same here, using Google Cloud Build to generate the docker

_asummers · on July 26, 2018

Same! Jenkins master until we get the new thing working that Google released yesterday, then we can drop Jenkins. How do you cache your Dialyzer PLTs? We do triplet of elixir version, erlang version, checksum of lock file and tag a container layer with that tag and also make one called "latest" that is constantly overridden. If there's a new version of one of your dependencies, or you remove one or something, we just grab the latest layer because it's probably close enough.

minhajuddin · on July 26, 2018

We use docker with AWS ECS, it has been a great combination. My colleague has written a few blog posts on how we do it: https://engineering.tripping.com/jenkins-elixir-and-ecs-cicd...

jswny · on July 27, 2018

I am a student without lots of disposable income, and I primarily deploy smaller Phoenix apps which don't get much traffic. I build releases locally with Distillery, and then deploy them in Docker containers on a $15/mo Digital Ocean server.

Docker allows me to run various Phoenix/Elixir apps with various versions of the language and framework since I don't have time to update all of them. My method also allows me to compile .tar.gz releases on my local machine since the lower-tier servers that I can afford don't always have enough memory to compile there.

Here's a tutorial I wrote on how I deploy my Phoenix apps if anyone is interested: https://gist.github.com/jswny/83e03537830b0d997924e8f1965d88....

dnautics · on July 25, 2018

For development purposes, I'm deploying it on an on-prem device running ubuntu, inside a singularity container (a little bit laxer on the security front than docker, for dev probably ok, but when we look to productize it we might think about other strategies).

In particular since the final product is on-prem, the highly available, let-it-fail philosophy is appealing to reduce the number of issue tickets we're likely to have to deal with.

Don't take my strategy as gospel - I am on an open-ended, highly divergent branch of our company's product line and have been given a lot of product leeway by management that is.... "open-minded".

elcritch · on July 26, 2018

Just a thought, you might consider a nerves deployment... ahem I’m just curious how an IoT deploy system would work for on prem servers, well other than running on RPi’s that is :-)

bostonvaulter2 · on July 27, 2018

If you're interested in this, then you should definitely follow this project:

https://github.com/cogini/nerves_system_ec2

I'm excited to play with it in the future.

dnautics · on July 26, 2018

nerves is designed to write to an flash disk. My system has 2TB of SSD, dual socket intel, and 8 GPUs!

elcritch · on July 26, 2018

That’s part of the fun! You can write to normal disk too, then mount the rest for data. So it’d leave 1.999500 TB free. ;) Ok maybe I just like extremes..

sb8244 · on July 25, 2018

I use distillery to build releases, docker to build that + run it, and k8s on a private EC2 based setup for pod deployment.

It works pretty well, and I get to ship artifacts rather than source code.

freedomben · on July 26, 2018

Heroku primarily for me. When I need to scale I'll start using distillery + docker + kubernetes, but for now the Heroku experience has been great.

There's a PaaS called Gigalixir that has a lot of promise, but you'll be limited to one app at a time, and it must be re-deployed at least every 30 days or it gets shut down.

jesses · on July 26, 2018

Founder of gigalixir here. You're right, but I just wanted to mention that those limitations are for the free tier. Once you upgrade to the standard tier, both those limitations go away.

freedomben · on July 27, 2018

Yes, thank you, that's important info. And the pricing seems pretty reasonable as well.

udfalkso · on July 26, 2018

I’m using gigalixir to host a few projects and have been happy with it so far. I believe it uses GCP by default under the covers

jesses · on July 26, 2018

Founder of gigalixir here. We use GCP under the covers by default, but you can also choose to use AWS if you like. You can also choose which region you want to run in, but we only support us-east1 and us-west2, and on GCP: us-central1 and europe-west1.

monting · on July 26, 2018

I've been using nanobox.io for a couple of new Elixir projects, am very satisfied so far. Using nanobox to deploy to Digital Ocean, but I believe they also support GCP.

Also been using Heroku for some Elixir apps in production.

elbow-jason · on July 26, 2018

I also use nanobox.io. Nanobox treats Elixir as a first class citizen. Nanobox basically allows me to not worry about a TON of devops issues.

iamd3vil · on July 26, 2018

We package it into a docker image using distillery and then deploy to Kubernetes using Peerage with KubeDNS for auto clustering. It was tricky to figure out everything at the start but after figuring out once, it's pretty easy and works really well. Only issue we had at the start is figuring out sys.config but we use Environment Variables for almost everything and set `REPLACE_OS_VARS=true` while building the Docker image, which solved most of our issues.

davydog187 · on July 25, 2018

Currently Heroku, but interested in deploying into GCP with Kubernetes and Distillery

brentjanderson · on July 26, 2018

Docker containers running on Linode with a bit of Ansible for deployment automation.

jesses · on July 26, 2018

I use gigalixir.com, but I'm also the founder so I'm biased =)

sergiotapia · on July 26, 2018

Heroku or Aptible if you need HIPPA compliance.

Also a simple Dockerfile to push to AWS