csabakecskemeti 
posted an update 9 days ago
Just sharing a result of a homelab infrastructure experiment:

I've managed to set up a distributed inference infra at home using a DGX Spark (128GB unified LPDDR5x) and a Linux workstation with an RTX 6000 Pro (96GB GDDR7), connected via 100Gbps RoCEv2. The model I've used (https://lnkd.in/gx6J7YuB) is about 140GB, so it could not fit on either GPU alone. Full setup and tutorial coming soon on devquasar.com
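A quick back-of-the-envelope check of the numbers above, as a minimal sketch: it only uses the memory sizes and model size quoted in the post. The device names and the proportional split are assumptions for illustration, not the actual configuration used in this setup, and the check ignores KV cache and runtime overhead.

```python
# Rough memory-fit arithmetic using the figures from the post.
# The proportional split is a hypothetical illustration, not the real config.
MODEL_GB = 140
DEVICES = {"dgx_spark": 128, "rtx_6000_pro": 96}  # GB of memory per machine


def fits_on_single_device(model_gb, devices):
    """Return, per device, whether the whole model fits in its memory."""
    return {name: model_gb <= cap for name, cap in devices.items()}


def proportional_split(model_gb, devices):
    """Split the model across devices in proportion to their memory."""
    total = sum(devices.values())
    return {name: model_gb * cap / total for name, cap in devices.items()}


fits = fits_on_single_device(MODEL_GB, DEVICES)
split = proportional_split(MODEL_GB, DEVICES)

print(fits)   # neither machine can hold the model alone
print(split)  # share of the model (in GB) each machine would carry
```

With a combined 224GB, the two machines together leave headroom beyond the 140GB of weights, which is why spreading the model across both makes it runnable.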



Screen recording:
https://lnkd.in/gKM9H5GJ

Did I understand correctly that it runs on separate machines?


Yes, 2 machines: a DGX Spark and a Linux workstation, connected by a dedicated 100Gbps RoCE network.

Will check it out. Surely.