Is a high-CPU server worth it? #4751
I have a CP-SAT problem which is taking hours to solve. It has lots of variables and is a multi-stage solve. I'm using a Google Cloud Compute Engine VM with 24 vCPUs and 186 GB of memory. It is reaching 100% CPU and hardly using any RAM. The search log lists the following subsolvers:

15 full problem subsolvers: [core, default_lp, lb_tree_search, max_lp_sym, no_lp, objective_lb_search, objective_shaving_max_lp, objective_shaving_no_lp, probing, probing_max_lp, probing_no_lp, pseudo_costs, quick_restart, quick_restart_no_lp, reduced_costs]
9 first solution subsolvers: [fj(3), fj_lin, fs_random, fs_random_no_lp(2), fs_random_quick_restart, fs_random_quick_restart_no_lp]
13 interleaved subsolvers: [feasibility_pump, graph_arc_lns, graph_cst_lns, graph_dec_lns, graph_var_lns, lb_relax_lns, ls(2), ls_lin, rins/rens, rnd_cst_lns, rnd_var_lns, variables_shaving]
3 helper subsolvers: [neighborhood_helper, synchronization_agent, update_gap_integral]

On Google Cloud, I can get a Compute Engine VM with a ton of vCPUs, for instance 144 vCPUs and 288 GB of memory. Is this worth it?

1. On a machine with 144 vCPUs, would CP-SAT try to assign 144 workers to the problem? (I'm guessing the number of first solution subsolvers would be increased, per https://d-krupke.github.io/cpsat-primer/05_parameters.html#parallelization)
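For reference, a minimal sketch (assuming the Python CP-SAT API; `build_model()` is a hypothetical stand-in for however the multi-stage model is built) of where the worker count is controlled and how to get the subsolver summary above printed:

```python
from ortools.sat.python import cp_model

model = build_model()  # hypothetical: however the multi-stage model is constructed

solver = cp_model.CpSolver()
# num_workers = 0 (the default) lets CP-SAT use all available cores;
# set it explicitly to cap the number of parallel subsolvers.
solver.parameters.num_workers = 24
# Prints the search log, including the "full problem / first solution /
# interleaved / helper" subsolver breakdown quoted above.
solver.parameters.log_search_progress = True

status = solver.Solve(model)
print(solver.StatusName(status), solver.ObjectiveValue())
```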
Replies: 2 comments 2 replies
See https://github.com/google/or-tools/blob/stable/ortools/sat/docs/troubleshooting.md#improving-performance-with-multiple-workers

For large models, I use 64 workers on a 128-core CPU with hyper-threading. This way, all 64 workers run at full speed. Using all cores on a large multi-core machine with hyper-threading will likely hit a memory bandwidth limit.
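A small sketch of that advice (assuming the Python API, and assuming hyper-threading exposes two hardware threads per physical core) that caps the worker count at the number of physical cores rather than vCPUs:

```python
import os

from ortools.sat.python import cp_model

solver = cp_model.CpSolver()

# os.cpu_count() reports logical CPUs (vCPUs); assuming 2 hardware threads
# per physical core, run one worker per physical core so every worker gets
# a full core instead of competing for memory bandwidth.
logical_cpus = os.cpu_count() or 1
solver.parameters.num_workers = max(1, logical_cpus // 2)
```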
You didn't specify how much memory your CP-SAT problem uses. Your question is not easy to answer because it depends on the CPU architecture. In general you want all parallel solver workers to run at equal performance, and low memory latency helps with the random access to different parts of memory. More memory channels are always good for a memory-bound CP-SAT problem; cache size and the way cache is shared between cores differ a lot between CPUs. In my experience it makes sense to use 16 workers; more workers are worthwhile mainly when you are interested in lower bounds or in proving optimality of a solution. A CPU with more than 32 cores is nearly always limited in single-core performance and often in memory bandwidth per core, and since CP-SAT needs both single-core and multi-core performance, I doubt that more cores and search workers can compensate for that.
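A rough illustration of that rule of thumb (assuming the Python API; `need_proof` and `make_solver` are hypothetical names, not something from the thread):

```python
from ortools.sat.python import cp_model

def make_solver(need_proof: bool, available_cores: int) -> cp_model.CpSolver:
    """Pick a worker count following the rule of thumb above (hypothetical helper)."""
    solver = cp_model.CpSolver()
    if need_proof:
        # Pushing the lower bound / proving optimality benefits from more workers.
        solver.parameters.num_workers = min(available_cores, 32)
    else:
        # For simply finding good solutions, around 16 workers is usually enough.
        solver.parameters.num_workers = min(available_cores, 16)
    return solver
```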