Publications‎ > ‎Articles‎ > ‎

Oracle 11g Performance Issue

A client recently downgraded from Oracle's 11.1.0.6 RDBMS to 10.2.0.4 because of performance issues. Their application inserted hundreds of millions of records extracted from a legacy mainframe system inside a PL/SQL cursor for-loop. The client took this approach since each record required a number of transformations and that logic could be easily applied within the loop. The two databases where we tested their code differed only by their version, yet the 11g instance took 33% longer to perform the same operations.

Both databases ran on the same 64-bit Linux (RHEL5) machine and had comparable memory resources allocated to them. (I'd be very interested to hear if you can reproduce the results below, especially if you're on a different operating system.) Yet 10g consistently outperformed 11g, at least for the operations in question.

The script I used to demonstrate the issue follows. It creates a very simple test case and activates Oracle's extended SQL trace, a diagnostic feature which records the database session's activity and how much time each action took.

Performance Test Script:

whenever sqlerror exit failure

-- get the version to identify trace files
col file_id for a13 new_value file_id

SELECT
'perf_test_' || SUBSTR(version, 1, 2) || 'g' file_id
FROM v$instance;

spool &file_id..log

-- create a test user (note that we designated a
-- default tablespace for this database when it
-- was created so that's not specified below)
CREATE USER perf_test IDENTIFIED BY perf_test;

GRANT
ALTER SESSION,
CREATE SESSION,
CREATE TABLE,
SELECT ANY DICTIONARY,
UNLIMITED TABLESPACE
TO perf_test;

connect perf_test/perf_test

-- create a source table
CREATE TABLE source AS
SELECT *
FROM dba_objects
WHERE ROWNUM <= 5000;

-- create a target table
CREATE TABLE target AS
SELECT *
FROM dba_objects
WHERE 1 = 2;

-- initiate trace and timing
ALTER SESSION SET statistics_level = ALL;
ALTER SESSION SET "_rowsource_execution_statistics" = TRUE;
ALTER SESSION SET tracefile_identifier = '&file_id';
ALTER SESSION SET events =
'10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

set timing on
set echo on

-- populate target from source
BEGIN
-- create 1000000 records in the target
-- from the 5000 already in the source
FOR i IN 1 .. 1000000 / 5000 LOOP
FOR rec IN (
SELECT * FROM source) LOOP

INSERT INTO target VALUES rec;
END LOOP;
END LOOP;
END;
/

set echo off
set timing off

connect /

DROP USER perf_test CASCADE;

exit

As shown, the script

  1. creates a user to run the tests,
  2. logs in,
  3. creates the source and target tables,
  4. enables an extended SQL trace (using the 10046 event),
  5. loops through the source, and
  6. inserts into the target.

Initially I gathered the insert timings without the 10046 trace. In that case, 10g ran the PL/SQL block above in 30 seconds; 11g took 40 seconds. After enabling the trace, 10g took 40 seconds and 11g took 70 seconds! Why the difference?

We might forgive some of the tracing overhead since the same insert is being executed a million times. This emits a million "EXEC #d+" records into the trace file, which grew to 70mb during the 10g test. The 11g test, however, created a trace file almost 5 times larger at 330mb. Again, why the difference?

Trace files of any size are difficult to interpret, so Oracle supplies a utility called TKPROF to provide a summary of the activity contained within them. Several clues leaped out when comparing the output. First review these snippets from the 10g TKPROF output:

10g TKPROF Output:

BEGIN
-- create 1000000 records in the target
-- from the 5000 already in the source
FOR i IN 1 .. 1000000 / 5000 LOOP
FOR rec IN (
SELECT * FROM source) LOOP
INSERT INTO target VALUES rec;
END LOOP;
END LOOP;
END;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 18.15 18.76 0 0 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 18.15 18.77 0 0 0 1

.
.
.

INSERT INTO TARGET
VALUES
(:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 ,:B9 ,:B10 ,:B11 ,:B12 ,:B13 )

call count cpu elapsed disk query current rows
------- ------- -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000000 18.18 16.99 0 3050 1020489 1000000
Fetch 0 0.00 0.00 0 0 0 0
------- ------- -------- ---------- ---------- ---------- ---------- ----------
total 1000001 18.18 16.99 0 3050 1020489 1000000

.
.
.

1 session in tracefile.
4 user SQL statements in trace file.
369 internal SQL statements in trace file.
373 SQL statements in trace file.
11 unique SQL statements in trace file.
1013912 lines in trace file.
38 elapsed seconds in trace file.

But the same sections from the 11g trace file indicates this:

11g TKPROF Output:

BEGIN
-- create 1000000 records in the target
-- from the 5000 already in the source
FOR i IN 1 .. 1000000 / 5000 LOOP
FOR rec IN (
SELECT * FROM source) LOOP
INSERT INTO target VALUES rec;
END LOOP;
END LOOP;
END;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 45.98 46.46 0 0 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 45.98 46.46 0 0 0 1

.
.
.

INSERT INTO TARGET
VALUES
(:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 ,:B9 ,:B10 ,:B11 ,:B12 ,:B13 ,:B14 ,
:B15 )

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 999996 18.53 21.39 0 3247 1021169 999996
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 999997 18.53 21.39 0 3247 1021169 999996

.
.
.

1 session in tracefile.
1000202 user SQL statements in trace file.
328 internal SQL statements in trace file.
1000530 SQL statements in trace file.
13 unique SQL statements in trace file.
5016224 lines in trace file.
70 elapsed seconds in trace file.

What?!?! Several oddities can be see when comparing this output. First of all, the same code was executed on both systems, yet the 11g trace recorded 1,000,000 more user SQL statements than 10g. Also, the insert statement in 11g is missing four executions, showing only 999,996 instead of 1,000,000. Curiously, the other four can be found elsewhere in the TKPROF output, as singleton queries that should have been aggregated together with the others. And if we dig into 11g's raw trace file, we find 1,000,000 statements like this:

PARSING IN CURSOR #3 len=109 dep=1 uid=105 oct=2 lid=105 tim=1222272480471544 hv=2621435192 ad='cb846118' sqlid='25ts65qf3zv9s'
INSERT INTO TARGET VALUES (:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 ,:B9 ,:B10 ,:B11 ,:B12 ,:B13 ,:B14 ,:B15 )
END OF STMT

The comparable statement in 10g appears only once. Note that the parse count for the insert statement in both traces is one. But it seems that each insert gets treated as a separate statement for the "user SQL statements" metric because of the repeated messages like the one above. What's more, twice as much time is spent on CPU for 11g during the execution of the PL/SQL block itself as compared with 10g. So while the actual parse count is sensible (as measured by "PARSE #d+" records in the trace), each insert appears as if it's still getting re-evaluated in some way by 11g.

I considered that this behavior might be an artifact of bind variable peeking or cursor sharing gone astray. To that end, I tried toggling each of the following session parameters:

  1. CURSOR_SHARING: I tried FORCE, EXACT, and SIMILAR, none of which had an effect
  2. SESSION_CACHED_CURSORS: I ran the test using values of 50 and 0, the latter doubled the runtime for both 10g and 11g
  3. _OPTIM_PEEK_USER_BINDS: I tried both TRUE and FALSE for this parameter; no effect
  4. _OPTIMIZER_ADAPTIVE_CURSOR_SHARING: Also ran with both TRUE and FALSE; also no effect
  5. OPTIMIZER_FEATURES_ENABLE: I even tried setting this one back to 10.2.0.4 for the 11g database; again, no effect

In the end, nothing I tried could dissuade 11g from emitting the "PARSING IN CURSOR #d+" message for each insert statement. I filed a service request with Oracle Support on this issue and they ultimately referred the issue to development as a bug. Note that Support couldn't reproduce the slowness I was seeing, but their trace files did reflect the parsing messages I observed. (I'll post a follow-up if they provide a solution.) Please post a comment if you have another suggestion on how to address this issue. While I can't pretend to know what's going on with the internals here, I can say that my client needed the best throughput they could get and the downgrade bought them an extra 33%.


David Lipowitz
Principal Consultant
Tuning Knife Consulting

September 26, 2008