Tell us more about your current setup and why you want to migrate from it. What problem do you want to solve with the upgrade / migration?
Also describe your workload in a bit more detail: do you want to run all of this in parallel? Are you the only user running these apps?
Do you know how to set up any of your use-case applications on a server OS, e.g. Linux? Are they even able to run in some sort of cluster and/or mGPU mode? What OS do you want them to run on? What would your expectations / definition of a cluster be? What benefits do you want to gain from it?
I would definitely recommend that you don’t buy anything until you can at least answer the questions above.
I want to migrate because my COMSOL motor flux simulation keeps crashing before it completes. A full run takes about 2 days, but it crashes after a few hours (I let it run on two separate instances for 3 days to check whether it was just an unresponsive UI). I had the same issue with Nastran on my laptop, and once I changed over to this desktop the issue disappeared. I also want to speed it up to a few hours at most.
It’s single user, just for me. I’ve been working on an iron core linear motor for my machining center. I don’t want to start building it or open source the design before I optimize it via simulation.
I want to run Siemens NX and COMSOL on either Windows Server 2019 or Windows 11 Pro for Workstations. My current license is locked to a single node for both, but I can get the HPC add-on, which would enable running the simulation on a cluster.
The LLM can run on whatever OS is best for it. I want it to take scanned documents, run OCR, then use that data to ideally fill in financial statements, tax returns, etc. Right now I do this manually with Grok: I feed it a single PDF at a time, then I get answers on what I need to write in which field. I would like to streamline this and avoid completely exposing my business data to a company that has no business knowing it.
I was leaning toward building a cluster mostly because of cooling efficiency, the low power draw of the components (4545P + Pro 2000 Blackwell), and the ability to add nodes if I need them. That is also what I would expect: an almost linear increase in performance the more nodes I plug in, provided I use NICs with sufficient bandwidth. If the dual SP3 7742 is the baseline, I was considering getting 2 nodes of the 4545P initially and seeing how things go from there; I expected a 1.8x performance gain from adding the second node. It’s also more elegant.
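To put a number on the "almost linear" expectation, here is a rough sketch using Amdahl's law. It is not a COMSOL benchmark: the function names are made up for illustration, and the model ignores interconnect latency and licensing limits entirely, so treat the result as an optimistic upper bound.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Ideal speedup on `nodes` nodes when only `parallel_fraction` of the runtime scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def required_parallel_fraction(target_speedup: float, nodes: int) -> float:
    """Parallel fraction needed to reach `target_speedup` on `nodes` nodes."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to solve for the fraction")
    # Solve 1 / ((1 - p) + p / n) = s for p.
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / nodes)


if __name__ == "__main__":
    # The 1.8x-on-2-nodes expectation from the post above.
    p = required_parallel_fraction(1.8, 2)
    print(f"parallel fraction needed for 1.8x on 2 nodes: {p:.3f}")
    for n in (2, 4, 8):
        print(f"{n} nodes -> {amdahl_speedup(p, n):.2f}x")
```

Under this model, 1.8x from two nodes means roughly 89% of the runtime has to scale perfectly. With that same fraction, 4 nodes would give about 3.0x and 8 nodes about 4.5x, so the gain flattens well before "linear".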
My mains can only support up to 3000W for the system/cluster as a whole.
This might be true for COMSOL Motor Flux simulations, which I haven’t personally used, but I don’t think the same applies to Siemens NX. Do you also run simulations in NX, and if so, can the workload be distributed?
Could you describe your workflows and software setup in a bit more detail?
Does COMSOL Motor Flux benefit from GPU computing?
What do you primarily use Siemens NX for: just CAD, or do you run simulations there as well? If so, do those simulations utilize or benefit from GPUs?
If COMSOL Motor Flux is the only software that gains from distributed computing, my next question would be: will you continue using it for future projects, or is this iron-core linear motor a one-off?
Regarding LLMs: I’m no expert, but Ollama works pretty much out of the box with NVIDIA GPUs on Windows and Linux.
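For a feel of what the local workflow could look like, here is a minimal sketch that sends already-OCR'd text to a locally running Ollama instance over its HTTP API (`/api/generate` on the default port 11434). The model name `llama3.1`, the prompt wording, and the helper names `build_request` and `ask_ollama` are assumptions for illustration; use whatever model you have pulled, and do the OCR step separately.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(ocr_text: str, model: str = "llama3.1") -> dict:
    """Build the JSON payload for a single, non-streaming completion."""
    prompt = (
        "Extract the document date, counterparty and total amount from the "
        "text below. Answer as JSON.\n\n" + ocr_text
    )
    # "stream": False asks Ollama for one complete JSON response.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(ocr_text: str, model: str = "llama3.1") -> str:
    """Send the payload to the local Ollama server and return the generated text."""
    payload = json.dumps(build_request(ocr_text, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]
```

With the Ollama server running and the model pulled, `print(ask_ollama(text_from_ocr))` would return the model's answer for one document. Filling actual tax forms still needs a human to check every field.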
Have you heard of paperless-ngx? There are AI plugins like paperless-gpt or paperless-ai that can help with tasks like naming, tagging, and organizing documents. It won’t do your taxes for you, but it can give you a solid basis to gather everything you need for the relevant year.
I did simplify the mesh so it fits into the 96 GB. At this point I assume errors in RAM piled up (due to overheating?) and that’s what crashes the system.
If COMSOL doesn’t benefit from clusters, then the case is pretty much solved. From what I’ve read, LLMs prefer multiple GPUs on a single system over clusters. If clusters aren’t a solution to either, then I have to stick to a single motherboard.
Maybe I should go with a 9115 or a Pro 9955WX for the PCIe 5.0 lanes instead of the 7742.
Ahh, that is another possibility. The denser RAM we run nowadays is getting challenging to cool when it is actually stressed.
COMSOL is getting a general-purpose (not just for acoustics) CUDA solver in the next version, so that might sway your decision too. It remains to be seen whether it will actually be faster than the CPU solver, however.
Only simple structural, which my current PC does just fine. I used it extensively when I was comparing different structures in terms of stiffness for the machining center. It is my go-to CAD though, mainly due to the convergent body feature.
I have a WIP radial direct-drive motor for the rotary table so I can also do turning on the machine.
Yesterday I found this peculiar board: GENOA2D24G-2L+ / TURIN2D24G-2L+
Once I saw it I knew this was “it” and had to get it. I’ve never seen a more beautiful motherboard! I can fit so many Pro 2000/4000 Blackwells in it, all while maintaining proper cooling.
It’s mostly about the memory: with the additional 12 slots from the second CPU, I’m more likely to hit the required amount without needing higher-capacity sticks, whose prices can get quite insane the higher you go. And since all server motherboards are expensive anyway, I might as well get the most out of it.
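The slot arithmetic can be sketched like this. The list of DIMM capacities and the 768 GB target are assumptions for illustration only (nobody in the thread has stated the required amount), and `smallest_dimm_for` is a made-up helper name.

```python
from typing import Optional

# Assumed common DDR5 RDIMM capacities in GB; check what is actually on sale.
STANDARD_DIMM_SIZES_GB = (16, 32, 48, 64, 96, 128, 256)


def smallest_dimm_for(target_gb: float, slots: int) -> Optional[int]:
    """Smallest standard DIMM size that reaches `target_gb` with every slot filled."""
    if slots < 1:
        raise ValueError("slots must be at least 1")
    per_slot_gb = target_gb / slots
    for size_gb in STANDARD_DIMM_SIZES_GB:
        if size_gb >= per_slot_gb:
            return size_gb
    return None  # target is out of reach with the sizes listed above


if __name__ == "__main__":
    target = 768  # hypothetical target capacity in GB
    for slots in (12, 24):
        print(f"{slots} slots -> {smallest_dimm_for(target, slots)} GB per DIMM")
```

For the hypothetical 768 GB target, 12 slots would need 64 GB sticks while 24 slots get there with 32 GB sticks, which is the whole appeal of the second socket here.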
The AMD Epyc ES/QS chips have lower clock speeds. Ones I have bought in the past also had non-functional memory channels so your mileage may vary.
Also, don’t buy a low-end EPYC with high-power RAM. It won’t get you anything. The 9115s are bottlenecked to about 200GB/s due to the low connection count between CCD and IOD.
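A back-of-the-envelope sketch of why that matters: compare the theoretical DRAM peak with the roughly 200GB/s fabric limit quoted above. The 12 channels at DDR5-6000 are my assumption for a fully populated socket, not a measured figure, and the function names are invented for illustration.

```python
def dram_peak_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (64-bit data bus per channel by default)."""
    # MT/s * bytes per transfer = MB/s per channel; divide by 1000 for GB/s.
    return channels * mega_transfers_per_s * bus_bytes / 1000


def usable_gbs(dram_peak: float, fabric_cap: float) -> float:
    """Bandwidth the cores can actually see: the lower of DRAM peak and fabric limit."""
    return min(dram_peak, fabric_cap)


if __name__ == "__main__":
    peak = dram_peak_gbs(channels=12, mega_transfers_per_s=6000)  # assumed config
    cap = 200.0  # approximate CCD-to-IOD limit quoted in the post above
    usable = usable_gbs(peak, cap)
    print(f"DRAM peak: {peak:.0f} GB/s, usable: {usable:.0f} GB/s "
          f"({usable / peak:.0%} of peak)")
```

With those assumed numbers the DRAM could deliver 576 GB/s on paper, but the cores would only ever see about a third of it, so paying extra for faster memory on that chip buys little.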