Apex, Oracle 11g & UltraSPARC T2

In my experience the early UltraSPARC T-series chips are not great when it comes to supporting Apex applications: single-thread performance is poor, and that shows up as significantly higher average page times than on more conventional CPUs.

However, I recently encountered an unusual problem after a site upgraded a database running on a T5240 from Oracle RDBMS v10.2.0.5 to v11.2.0.2.

Average page times had gone from 1s to around 4s since the upgrade, and users were complaining loudly about performance.

Statspack was telling us that there was an unusually high number of cursor: pin S wait on X waits, but why?

Top 5 Timed Events                                      Avg %Total
                                                       wait   Call
Event                              Waits    Time (s)   (ms)   Time
--------------------------- ------------ ----------- ------ ------
CPU time                                      24,208          85.6
cursor: pin S wait on X           24,177       1,361     56    4.8
db file sequential read          659,410         943      1    3.3
External Procedure call              206         861   4177    3.0
log file parallel write           93,792         251      3     .9

Well, apparently it’s all down to mutexes, and once again it was Tanel Poder who set me on the right path.

His post describes a problem he had encountered at 10g. The bit that set me thinking was:

When a latch getter can’t get the latch after spinning, it will go to sleep to release the CPU. Even if there are many latch getters in the CPU runqueue before the latch holder, they all spin quickly and end up sleeping again. But when a mutex getter doesn’t get the mutex after spinning, it will not go to sleep!!! It will yield() the CPU instead, which means that it will go to the end of runqueue and try to get back onto CPU as soon as possible. So, mutex getters in 10.2 are much less graceful, they can burn a lot of CPU when the mutex they want is held by someone else for long time.

On a T series there’s nearly always a CPU thread available (particularly if, like this customer, you don’t have that many users accessing the system). So even when a mutex getter yields, it will nearly always get a CPU thread back straight away, and will be on CPU pretty much constantly trying to get hold of the mutex. That explained the high CPU usage and the large number of waits, and presumably the performance problems experienced by the customer.
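To make that reasoning concrete, here is a toy model in Python. It is only a sketch of the argument above, not Oracle’s actual mutex code: the function name and every number in it (spin cost, time slice, how long the mutex is held) are assumptions picked for illustration. It counts how often a getter that always yields retries the mutex, first on a busy box where yielding really does put it behind other runnable processes, then on a box with idle CPU threads where it gets straight back on CPU.

```python
from typing import NamedTuple


class YieldResult(NamedTuple):
    attempts: int   # tries at the mutex, including the one that succeeds
    cpu_ms: float   # CPU time spent spinning on the mutex


def always_yield(hold_ms: float, spin_ms: float,
                 runnable_ahead: int, slice_ms: float) -> YieldResult:
    """Toy model (hypothetical, not Oracle internals): spin, then yield,
    until the holder releases the mutex at t = hold_ms."""
    now = 0.0
    attempts = 0
    cpu_ms = 0.0
    while True:
        attempts += 1
        now += spin_ms        # one spin on the mutex, spent on CPU
        cpu_ms += spin_ms
        if now >= hold_ms:    # holder has released, so this try succeeds
            return YieldResult(attempts, cpu_ms)
        # yield(): wait behind whatever else is runnable before retrying
        now += runnable_ahead * slice_ms


# Assumed figures: mutex held for 50 ms, 0.5 ms per spin, 10 ms time slice.
busy_box = always_yield(hold_ms=50, spin_ms=0.5, runnable_ahead=4, slice_ms=10)
idle_threads = always_yield(hold_ms=50, spin_ms=0.5, runnable_ahead=0, slice_ms=10)

print(busy_box)       # YieldResult(attempts=3, cpu_ms=1.5)
print(idle_threads)   # YieldResult(attempts=100, cpu_ms=50.0)
```

With nothing else in the run queue, the getter in this model spends the whole 50 ms on CPU, which is the pattern the Statspack report hinted at: lots of CPU time alongside the mutex waits.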

Tanel suggested this was fixed at 11g (mutex getters should sleep instead of just yielding), but it turns out that’s not quite the whole story: the default behaviour is still to always yield, but there are some hidden parameters that provide some control.

Bug 10411618 – Enhancement to add different “Mutex” wait schemes

In order to encourage the mutex getters to sleep a bit instead of just yielding all the time, I got the customer to add the following parameters:

alter system set "_mutex_wait_scheme" = 1 scope = spfile;
alter system set "_mutex_wait_time" = 8 scope = spfile;

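With these settings a mutex getter should yield the first time it fails to get the mutex, but sleep for 8ms after each subsequent failed attempt before trying again. Because the parameters are set with scope = spfile, they only take effect once the instance has been restarted.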
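As a rough illustration of the trade-off, the sketch below models that "yield once, then sleep" behaviour in Python and compares it with always yielding. It rests on my description above and on made-up numbers (a 50 ms hold, 0.5 ms per spin, an idle CPU thread always available); the function name is hypothetical and none of this is taken from Oracle’s implementation.

```python
def yield_then_sleep(hold_ms: float, spin_ms: float, sleep_ms: float):
    """Toy model (hypothetical, not Oracle internals).

    The getter yields after its first failed attempt and sleeps for
    sleep_ms after every later failed attempt. The holder releases the
    mutex at t = hold_ms. Returns (attempts, cpu_ms, elapsed_ms).
    """
    now = 0.0
    cpu_ms = 0.0
    attempts = 0
    while True:
        attempts += 1
        now += spin_ms        # spin on the mutex, on CPU
        cpu_ms += spin_ms
        if now >= hold_ms:    # mutex is free, so this attempt succeeds
            return attempts, cpu_ms, now
        if attempts > 1:
            now += sleep_ms   # off CPU until the sleep expires
        # first failure: yield only; assume an idle CPU thread, retry at once


# sleep_ms=0 behaves like always yielding on a box with idle CPU threads.
print(yield_then_sleep(hold_ms=50, spin_ms=0.5, sleep_ms=0))  # (100, 50.0, 50.0)
print(yield_then_sleep(hold_ms=50, spin_ms=0.5, sleep_ms=8))  # (8, 4.0, 52.0)
```

In this model the sleeping getter picks up the mutex about 2 ms later than the yielding one, but burns 4 ms of CPU instead of 50 ms, leaving the rest for sessions that can actually do some work.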

The customer confirmed that since applying the parameters performance has improved to around the same level as prior to the upgrade to 11g.