Local vs. server processing time (CPU only)


Local vs. server processing time (CPU only) – a server stack is the collection of software that forms the operational infrastructure of a machine, and this post covers one problem with such a stack: the same CPU-bound job running far slower on a shared server than on a local workstation. Below are the question and some tips for managing your Linux server, under the topics linux, cpu-usage, and time.

I’m seeing some HUGE differences in running time when comparing local vs. server processes.

Our laboratory has a dedicated server running Ubuntu 14 LTS as the OS and PBS for job scheduling. In total, there are 96 cores split across two queues. We run different experiments as CPU-bound Python routines, without any I/O or network requests.

My routines are written in Python, and on my local machine a run takes about 10 to 11 hours. When I run the same routine on our server, it takes more than 25 hours to do the same work.

Monitoring the server with htop, every core sits at about 100% utilization. I have already tried reducing the load average per core (to about 0.8 per core), but it made no significant difference in processing time.
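A quick way to quantify how loaded a node is, is to divide the 1-minute load average by the number of logical CPUs. A minimal Python sketch (the 0.8 threshold mirrors the per-core target mentioned above and is just an illustrative number):

    import os

    # 1-, 5- and 15-minute load averages as reported by the kernel
    load1, load5, load15 = os.getloadavg()
    ncpus = os.cpu_count()  # logical CPUs visible on this node

    per_core = load1 / ncpus
    print(f"load average per logical CPU: {per_core:.2f}")

    # Rough rule of thumb: once the per-core load climbs past the target,
    # CPU-bound jobs start competing for the same cores.
    if per_core > 0.8:
        print("node is at or above the 0.8-per-core target")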

Could this difference between local and server be related to the CPUs’ capabilities? Can it really double the processing time?

Server CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz

Local CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

That Broadwell server CPU should have several advantages over the Haswell desktop CPU (more cores, more cache), even though its 2.2 GHz base clock is well below the i7’s 3.4 GHz. A slowdown of more than 2x therefore suggests there is overhead in how jobs are scheduled on the server.

The E5-2650 v4 is a 12-core part for 2-socket servers. One such server should only be given about 24 threads of a CPU-intensive workload; limit your Linux load average to around 24.
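If the Python routines fan out with multiprocessing, one way to respect that limit is to size the worker pool by physical cores instead of logical CPUs. A minimal sketch, where work() is a stand-in for one unit of the experiment; psutil is not in the standard library, so the fallback counts hyperthreads as well:

    import os
    from multiprocessing import Pool

    try:
        import psutil
        # Physical cores only (e.g. 24 on a dual-socket E5-2650 v4 node),
        # ignoring hyperthreads.
        n_workers = psutil.cpu_count(logical=False)
    except ImportError:
        # Fallback: logical CPUs, which overcounts on hyperthreaded nodes.
        n_workers = os.cpu_count()

    def work(item):
        # Placeholder for one CPU-bound unit of the experiment.
        return sum(i * i for i in range(item))

    if __name__ == "__main__":
        with Pool(processes=n_workers) as pool:
            results = pool.map(work, range(1000))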

96 cores in one system would require 4 nodes (8 sockets in total), linked by an interconnect and a message-passing layer such as MPI. If you only have one node, 96 threads is far too many. How many 24-core servers do you have?
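Before deciding how many threads one node should run, it is worth checking what that node actually contains. The socket and physical-core counts can be read from /proc/cpuinfo (lscpu reports the same information); a minimal Linux-only sketch:

    from collections import defaultdict

    # Count sockets and physical cores on this node from /proc/cpuinfo.
    cores_per_socket = defaultdict(set)
    physical_id = None
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("physical id"):
                physical_id = line.split(":")[1].strip()
            elif line.startswith("core id") and physical_id is not None:
                cores_per_socket[physical_id].add(line.split(":")[1].strip())

    sockets = len(cores_per_socket)
    cores = sum(len(ids) for ids in cores_per_socket.values())
    print(f"{sockets} socket(s), {cores} physical core(s) on this node")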

Profile what is on the CPU. On Linux, use perf top, and look into recording perf data for offline analysis. Find out which functions take most of the time, and whether they belong to your program or to kernel overhead.
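perf top looks at the whole system; for the Python routine itself, the standard-library cProfile gives a quick function-level breakdown first. A minimal sketch, where main() is a stand-in for the routine’s real entry point:

    import cProfile
    import pstats

    def main():
        # Stand-in for the actual experiment routine.
        return sum(i * i for i in range(10_000_000))

    if __name__ == "__main__":
        profiler = cProfile.Profile()
        profiler.enable()
        main()
        profiler.disable()

        # Show the 15 functions with the most cumulative time.
        pstats.Stats(profiler).sort_stats("cumulative").print_stats(15)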

Also, upgrade your OS. You are at the point where you would need to purchase extended Ubuntu security updates just to stay patched. Later releases have better performance-analysis tools, and probably better performance simply from a newer kernel.
