Has anyone here built a Beowulf Cluster?

plenipotentprotogod@lemmy.world · 1 year ago

Has anyone here built a Beowulf Cluster?

knfrmity@lemmygrad.ml · 1 year ago

I tried migrating my personal services to Docker Swarm a while back. I have a Raspberry Pi as a 24/7 machine, but some services could use a bit more power, so I thought I’d try Swarm. The idea was that additional machines, which are only on some of the time, could pick up some of the load.

Two weeks later I gave up and rolled everything back to running specific services or instances on specific machines. Making sure the right data is available on all machines all the time, plus the networking between dependencies and in some cases specifying which service should prefer which machine was far too complex and messy.

That said, if you want to learn Docker Swarm or Kubernetes and distributed filesystems, I can’t think of a better way.

stevecrox@kbin.run · 1 year ago

Docker swarm was an idea worse than kubernetes, that came out after kubernetes, that isn’t really supported by anyone.

Kubernetes has the concept of a storage layer: you create a volume and can then mount that volume into the container. The volume is then accessible to the container regardless of where it is running.

There is also a difference between a volume for a deployment and a statefulset, since one is supposed to hold the application state and one is supposed to be transient.
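To make the volume-and-mount idea concrete, here is a minimal sketch that builds a PersistentVolumeClaim and a Pod that mounts it, written as Python dicts and dumped to JSON (which `kubectl apply -f` accepts as well as YAML). The names `app-data` and `demo-app`, the `nginx` image, the 1Gi size, and the `/data` mount path are all placeholders I made up for illustration. Whether the data actually follows the pod to another node depends on the storage backend behind the claim, so treat this as a sketch and not a recommended setup.

```python
import json

# Hypothetical claim: asks the cluster's storage layer for 1Gi of space.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# Hypothetical pod: refers to the claim by name and mounts it at /data.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx",
                "volumeMounts": [{"name": "data", "mountPath": "/data"}],
            }
        ],
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "app-data"}}
        ],
    },
}


def render(manifests):
    """Wrap the manifests in a v1 List so they fit in a single JSON file."""
    wrapper = {"apiVersion": "v1", "kind": "List", "items": manifests}
    return json.dumps(wrapper, indent=2)


manifest_json = render([pvc, pod])
print(manifest_json)
```

The pod never mentions a node or a disk path; it only names the claim, and that indirection is what lets the scheduler place the container wherever it likes.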

carl_dungeon@lemmy.world · 1 year ago

Docker or Kubernetes work well on a cluster. Before containers this was a lot more work to set up, but these days you just need to image them all, put them on the network, and then use some kind of container orchestration to send them containers/pods.

Ramin Honary@lemmy.ml · 1 year ago

Someone with more expertise can correct me if I am wrong, but the last I heard, cluster computing had been made obsolete by modern IaaS and cloud computing technology.

For example, the Xen project provides Unikernels as part of their Xen Cloud product. The unikernel is (as I understand it) basically a tiny guest operating system that statically links to a programming language runtime or virtual machine. So the Xen guest boots up a single executable program composed of the programming language runtime environment (like the Java virtual machine) statically linked to the unikernel, and then runs whatever high-level programming language that the virtual machine supports, like Java, C#, Python, Erlang, what have you.

The reason for this is if you skip running Linux altogether, even a tiny Linux build like Alpine, and just boot directly into the virtual machine process, this tends to be a lot more memory efficient, and so you can fit more processes into the memory of a single physical compute node. Microsoft Azure does something similar (I think).

To use it, basically you write a program or service in a programming language that runs on a VM and build it to run on a Xen unikernel. When you run the server, Xen allocates the computing resources for it and launches the executable program directly on the VM without an operating system, so the VM is, in effect, the operating system.

mesa@lemmy.world · 1 year ago

I did a while ago just to see what it was all about. Then got rid of the setup a week later. It’s a cool project but I needed the other boxes. If you need a huge amount of parallel operations (and want to self host) it’s a decent option.

makeasnek@lemmy.ml · edit-2 1 year ago

Look into BOINC. It’s free, open-source software for distributed computing (“map-reduce”-type problems). It runs on all platforms and handles computation at the petaflop scale. The Large Hadron Collider (CERN) uses it to distribute computational work to volunteers. It’s also a way you can contribute your computer’s spare capacity to cancer research. !boinc@sopuli.xyz
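For anyone unsure what a “map-reduce”-type problem looks like, here is a toy sketch of the pattern: split a job into independent work units, process each one separately, then combine the partial results. This is not BOINC's API. The function names are made up, the job (counting primes) is an arbitrary example, and a local thread pool stands in for the volunteer machines.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial-division primality check; slow on purpose, it is the 'work'."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def make_work_units(start, stop, unit_size):
    """Split the half-open range [start, stop) into independent chunks."""
    return [(lo, min(lo + unit_size, stop)) for lo in range(start, stop, unit_size)]


def process_unit(unit):
    """The 'map' step: count primes in one chunk, needing no other chunk."""
    lo, hi = unit
    return sum(1 for n in range(lo, hi) if is_prime(n))


def count_primes(start, stop, unit_size, workers=4):
    """Hand units to workers, then 'reduce' by summing the partial counts."""
    units = make_work_units(start, stop, unit_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_unit, units))
    return sum(partials)


if __name__ == "__main__":
    print(count_primes(0, 100_000, unit_size=5_000))
```

The property that matters is that no work unit needs to talk to any other, which is why this kind of job tolerates slow, unreliable links between machines.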

𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social · 1 year ago

There are several options, although some may be defunct.

Last time I looked into this, openMosix was the most interesting, affordable, general-purpose option. It turned several computers into one big virtual computer. I ran a very small, 3-node cluster for a time. The upside was that you could run almost anything on it - unlike most HPC solutions, it didn’t require bespoke languages, libraries, or targeted solutions. The downside was performance; it turns out that to really take advantage of HPC, you really need to program for it. OpenMosix looks defunct now.

OpenPMIx looks to have taken up the torch from OpenMosix. It looks active; I have no specific knowledge about it.

tldp.org has some good required reading before you invest in this, in particular discussing the elephant in the room, networking latency. The short version is that, no matter how slow your computers, the bottleneck will still be the network. Unless you’re willing to invest a lot into fiber and expensive, fast switches, it’s probably not worth it.
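A back-of-the-envelope model shows why latency dominates. This is a deliberately crude sketch: it assumes the compute divides perfectly across nodes and that each node pays the full latency for every message, one after another. The function names and all the numbers in the example are invented for illustration, not measurements of any real network.

```python
def cluster_runtime(compute_seconds, nodes, messages_per_node, latency_seconds):
    """Estimated wall time: evenly split compute plus serialized message latency."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if nodes == 1:
        # A single machine does all the work and sends nothing over the network.
        return compute_seconds
    return compute_seconds / nodes + messages_per_node * latency_seconds


def speedup(compute_seconds, nodes, messages_per_node, latency_seconds):
    """How many times faster the cluster is than one node under this model."""
    return compute_seconds / cluster_runtime(
        compute_seconds, nodes, messages_per_node, latency_seconds
    )


if __name__ == "__main__":
    # Assumed job: 100 s of compute, 4 nodes, 100,000 small messages per node.
    slow_link = speedup(100.0, 4, 100_000, 0.0005)    # assumed 0.5 ms per message
    fast_link = speedup(100.0, 4, 100_000, 0.000005)  # assumed 5 microseconds per message
    print(f"slow link: {slow_link:.2f}x, fast link: {fast_link:.2f}x")
```

With those assumed figures the four-node cluster is only about 1.3x faster than one machine on the slow link, versus about 3.9x on the fast one. Buying faster CPUs only shrinks the compute term, which is already the smaller part of the slow-link total.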

Slurm crosses the line into modern cluster job management, like you might find in a cloud provider like AWS, which is the direction the non-supercomputer industry took when commodity MPI turned out to be not feasible. Warewolf is another version, sort of one foot in distributed container management and lightweight MPI. Both are pretty involved, more Beowulf than OpenMosix.

tldr, it’s probably not worth it if you’re looking for a cheap Beowulf cluster, because such a thing doesn’t exist in any practical sense. Cost, and physics, get in the way. If you want to set up a data center, or some job farm like AWS or GCS, that’s another matter. But it’s a far cry from MPI.

Beowulf Clusters Make Supercomputing Accessible | NASA Spinoff