Problems with slow calls to pthread_condition_signal

pedrito
Posts: 4
Joined: 2011/05/10 10:47:03

Problems with slow calls to pthread_condition_signal

Postby pedrito » 2011/05/10 11:33:44

Hello,

I'm currently running some benchmarks of a shared-memory parallel code on a 16-Core system (4xQuad-Core AMD Opteron 8380) running CentOS 5 (Linux 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux).

My code uses pthread_cond_wait and pthread_cond_signal extensively to synchronize threads. The threads all fetch jobs from a pool. If no job is available, the thread waits with pthread_cond_wait. While the threads work on their jobs, they can release parts of them (i.e. their data). When they do so, they call pthread_cond_signal.

The job queue is protected by a pthread_mutex which is released while waiting for a job. When a new job is signaled, I do not attempt to get the mutex before calling pthread_cond_signal. Although this may be a terrible idea in many cases, it is perfectly ok in this case as there is a fall-back mechanism for threads that start waiting on jobs and just missed a signal.

I have run this code on an Intel Core i5 with Ubuntu 11.04 (Linux 2.6.38-8-generic-pae #42-Ubuntu SMP Mon Apr 11 05:17:09 UTC 2011 i686 i686 i386 GNU/Linux) and on a 16-Core (4xQuad-Core AMD Opteron 8356) system running Ubuntu 10.?? (Linux 2.6.32-30-generic #59-Ubuntu SMP Tue Mar 1 21:30:46 UTC 2011 x86_64 GNU/Linux), and it scales well with the number of processors on both. On the 16-Core system running CentOS, however, the scaling breaks down dramatically from two threads onward.

Upon closer inspection using Intel's VTune Amplifier XE 2011, I noticed that in the single-threaded case (i.e. running the code with the job queue but with only a single thread) the calls to pthread_cond_signal cost almost nothing (0.32s out of 125s), whereas with two threads they take a whopping 13.78s (out of now 72s). On the other two systems, the cost of pthread_cond_signal grew linearly with the number of threads, as would be expected.

Replacing pthread_cond_signal by pthread_cond_broadcast produces the same result.

Is there anything I should know about CentOS's implementation of pthread_cond_signal? Or could this potentially be a bug somewhere? Please do let me know if you need any more information.

Cheers,
Pedro

pedrito
Posts: 4
Joined: 2011/05/10 10:47:03

Re: Problems with slow calls to pthread_condition_signal

Postby pedrito » 2011/05/10 19:33:33

Perhaps a short update:

I noticed that the default compiler on the CentOS system is gcc 4.1.2 and that it uses a perhaps somewhat outdated libpthread-2.5.

The 16-core Ubuntu system on which the code runs ok has libpthread-2.11 and the Core i5 system has libpthread-2.13.

I am currently trying to see if I can get a newer gcc or libpthread going to see if this is merely a gcc/libpthread problem.

Cheers, Pedro

TrevorH
Forum Moderator
Posts: 20652
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Problems with slow calls to pthread_condition_signal

Postby TrevorH » 2011/05/10 19:47:05

You can yum install gcc44 on CentOS to get (I think) 4.4.5 as an alternative C compiler suite.

pschaff
Retired Moderator
Posts: 18276
Joined: 2006/12/13 20:15:34
Location: Tidewater, Virginia, North America

Problems with slow calls to pthread_condition_signal

Postby pschaff » 2011/05/10 23:37:47

Welcome to the CentOS fora. Reading FAQ & Readme First is recommended for new users.

pedrito wrote:
Hello,

I'm currently running some benchmarks of a shared-memory parallel code on a 16-Core system (4xQuad-Core AMD Opteron 8380) running CentOS 5 (Linux 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux).
...

That is a rather obsolete and consequently unsupported CentOS 5.5 kernel. You should immediately update to the current release 5.6. 5.5 has numerous known bugs and security issues that have been fixed in subsequent updates. See the CentOS 5.6 Release Notes Section 4 for details on the recommended update procedure. By not updating you are implicitly accepting that you will live with those bugs and security issues (and associated known exploits). Performance might also be adversely impacted.

pedrito
Posts: 4
Joined: 2011/05/10 10:47:03

Re: Problems with slow calls to pthread_condition_signal

Postby pedrito » 2011/05/11 09:55:11

Hi Pschaff,

Thanks for the information! The machine I'm working on is part of a larger high-performance cluster with hundreds of users, so updates are a bit of a difficult thing to get through...

Is there any way of checking what version of libpthread the most recent CentOS release uses?

Cheers, Pedro

pedrito
Posts: 4
Joined: 2011/05/10 10:47:03

Re: Problems with slow calls to pthread_condition_signal

Postby pedrito » 2011/05/11 09:58:50

TrevorH wrote:
You can yum install gcc44 on CentOS to get (I think) 4.4.5 as an alternative C compiler suite.


We have gcc-4.4.4 installed and I've just tried it, but as far as I could tell, it still links to the old pthread library, which seems to be part of glibc.

Cheers, Pedro

TrevorH
Forum Moderator
Posts: 20652
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Problems with slow calls to pthread_condition_signal

Postby TrevorH » 2011/05/11 11:16:39

If you have a large HPC cluster, do you also have a Red Hat subscription? If so, it might be worth reporting this upstream on the Red Hat Bugzilla and then raising it as a support call. This has the look of something that needs a code change, and the only way you'll get that is either to fix it yourself by backporting whatever was fixed between 2.5 and 2.11, or to get RH to do it. The latter is much better as you won't have to maintain your own glibc rpms!

pschaff
Retired Moderator
Posts: 18276
Joined: 2006/12/13 20:15:34
Location: Tidewater, Virginia, North America

Re: Problems with slow calls to pthread_condition_signal

Postby pschaff » 2011/05/11 15:36:37

pedrito wrote:
Is there any way of checking what version of libpthread the most recent CentOS release uses?

The runtime libraries are provided by glibc and the development libraries by glibc-devel. To see the currently available packages:

# yum provides \*libpthread\*
but it is still libpthread-2.5 in glibc-2.5-58.el5_6.3.

CentOS 6 will provide libpthread-2.12.
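
To confirm which glibc and threading library a binary actually picks up at run time (as opposed to what yum offers), a small check along these lines may help. It is only a sketch: the wrapper names `pthread_version` and `glibc_version` are made up for illustration, while `confstr` with `_CS_GNU_LIBPTHREAD_VERSION` and `gnu_get_libc_version` are GNU extensions and so assume a glibc-based system.

```c
#include <stddef.h>
#include <unistd.h>
#include <gnu/libc-version.h>

/* Copy the threading-library description reported by glibc into buf.
 * Returns 0 on success, -1 if the value is unavailable or does not fit. */
int pthread_version(char *buf, size_t len)
{
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    return (n > 0 && n <= len) ? 0 : -1;
}

/* Version string of the glibc the process is running against. */
const char *glibc_version(void)
{
    return gnu_get_libc_version();
}
```

Printing both strings from a binary built with gcc44 should show whether it still runs against the system's glibc 2.5.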