Is a high-CPU server worth it? #4751
I have a CP-SAT problem which is taking hours to solve. It has lots of variables and is a multi-stage solve. I'm using a Google Cloud Compute Engine VM with 24 vCPUs and 186 GB of memory. It is reaching 100% CPU and hardly using any RAM. The search log lists the following subsolvers:

15 full problem subsolvers: [core, default_lp, lb_tree_search, max_lp_sym, no_lp, objective_lb_search, objective_shaving_max_lp, objective_shaving_no_lp, probing, probing_max_lp, probing_no_lp, pseudo_costs, quick_restart, quick_restart_no_lp, reduced_costs]
9 first solution subsolvers: [fj(3), fj_lin, fs_random, fs_random_no_lp(2), fs_random_quick_restart, fs_random_quick_restart_no_lp]
13 interleaved subsolvers: [feasibility_pump, graph_arc_lns, graph_cst_lns, graph_dec_lns, graph_var_lns, lb_relax_lns, ls(2), ls_lin, rins/rens, rnd_cst_lns, rnd_var_lns, variables_shaving]
3 helper subsolvers: [neighborhood_helper, synchronization_agent, update_gap_integral]

On Google Cloud, I can get a Compute Engine VM with a ton of vCPUs, for instance 144 vCPUs and 288 GB of memory. Is this worth it?

1. On a machine with 144 vCPUs, would CP-SAT try to assign 144 workers to the problem? (I'm guessing the number of first solution subsolvers would be increased, per https://d-krupke.github.io/cpsat-primer/05_parameters.html#parallelization)
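For reference, a minimal sketch (assuming the Python CP-SAT API; `build_model()` is a hypothetical stand-in for however the multi-stage model is built) of where the worker count is controlled and how to get the subsolver summary above printed:

```python
from ortools.sat.python import cp_model

model = build_model()  # hypothetical: however the multi-stage model is constructed

solver = cp_model.CpSolver()
# num_workers = 0 (the default) lets CP-SAT use all available cores;
# set it explicitly to cap the number of parallel subsolvers.
solver.parameters.num_workers = 24
# Prints the search log, including the "full problem / first solution /
# interleaved / helper" subsolver breakdown quoted above.
solver.parameters.log_search_progress = True

status = solver.Solve(model)
print(solver.StatusName(status), solver.ObjectiveValue())
```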
Replies: 2 comments 2 replies
See https://github.com/google/or-tools/blob/stable/ortools/sat/docs/troubleshooting.md#improving-performance-with-multiple-workers

For large models, I use 64 workers on a 128-core CPU with hyper-threading. This way, all 64 workers run at full speed. Using all cores on a large multi-core machine with hyper-threading will likely hit a memory bandwidth limit.
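A small sketch of that advice (assuming the Python API, and assuming hyper-threading exposes two hardware threads per physical core) that caps the worker count at the number of physical cores rather than vCPUs:

```python
import os

from ortools.sat.python import cp_model

solver = cp_model.CpSolver()

# os.cpu_count() reports logical CPUs (vCPUs); assuming 2 hardware threads
# per physical core, run one worker per physical core so every worker gets
# a full core instead of competing for memory bandwidth.
logical_cpus = os.cpu_count() or 1
solver.parameters.num_workers = max(1, logical_cpus // 2)
```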
You didn't specify how much memory your CP-SAT problem uses. Your question is not easy to answer because it depends on the CPU architecture. In general you want all parallel solver workers to run at equal performance, and low memory latency helps with the random access to different parts of memory. More memory channels are always good for a memory-bound CP-SAT problem; cache size and the way cache is shared between cores differ a lot between CPUs. In my experience it makes sense to use 16 workers; more workers are worthwhile mainly when you are interested in lower bounds or in proving optimality of a solution. A CPU with more than 32 cores is nearly always limited in single-core performance and often in memory bandwidth per core, and since CP-SAT needs both single-core and multi-core performance, I doubt that more cores and search workers can compensate for that.
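A rough illustration of that rule of thumb (assuming the Python API; `need_proof` and `make_solver` are hypothetical names, not something from the thread):

```python
from ortools.sat.python import cp_model

def make_solver(need_proof: bool, available_cores: int) -> cp_model.CpSolver:
    """Pick a worker count following the rule of thumb above (hypothetical helper)."""
    solver = cp_model.CpSolver()
    if need_proof:
        # Pushing the lower bound / proving optimality benefits from more workers.
        solver.parameters.num_workers = min(available_cores, 32)
    else:
        # For simply finding good solutions, around 16 workers is usually enough.
        solver.parameters.num_workers = min(available_cores, 16)
    return solver
```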