Monthly Archives: September 2014

The cost of using an index

When trying to understand why the optimiser might choose not to use an index to read rows from a table, I find the following diagram helpful.

[Chart: cost of operation vs. selectivity of query – full table scan in red, index access in green]

The axis along the bottom represents the selectivity of the query (0% being no rows selected, 100% being all rows selected), and the axis up the side represents the cost of the chosen operation (more costly = more work to do).

When reading from a table without using an index, the cost of retrieving one row from the table is the same as the cost of retrieving all rows from the table (the red line on the chart). The database has to read the entire table in both cases.

When using an index to read from a table, the cost is proportional to the number of rows retrieved (the green line on the chart).

The point at which they cross over is the “break-even” point; if the selectivity of the query is above this point, expect the optimiser to ignore the index and scan the entire table.
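As a back-of-the-envelope illustration, the crossover point can be computed from a toy cost model. The 4x cost-per-row figure below is invented purely for illustration; real optimiser costing is far more involved:

```python
def break_even_selectivity(cost_per_indexed_row: float) -> float:
    """Selectivity at which index access costs as much as a full scan.

    Toy model: a full scan of N rows costs ~N, and fetching a fraction s
    of the rows via an index costs ~s * N * cost_per_indexed_row,
    so the two lines cross at s = 1 / cost_per_indexed_row.
    """
    return 1.0 / cost_per_indexed_row

# If fetching a row via the index costs ~4x a sequential row read,
# the break-even point sits at ~25% selectivity.
print(break_even_selectivity(4.0))  # 0.25
```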

There are a number of factors which determine the placement of these two lines, such as the size of the table, the size of the index, the way in which data is organised within the index, and so on. These are beyond the scope of this (brief) post.

Deadlocks due to unindexed foreign keys

Here is a step-by-step analysis of the deadlock which occurs due to unindexed foreign keys. This analysis was performed on Oracle XE 11.2.0.2 – results may vary on other versions.

A typical deadlock graph arising from an unindexed foreign key is as follows.

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0000508a-00000000        27     101    SX             28      12         SSX
TX-00090013-0000019b        28      12     X             27     101           S

This type of deadlock is characterised by at least one session holding a TM (table) lock, and another session waiting for a TM lock in SSX mode. To help make sense of this, I found it useful to first understand two underlying concepts: how to identify what a lock relates to, and how to trace locks.

Identifying the resources in a lock

For the purposes of this post, there are two lock types of interest: TM (table) locks and TX (transaction, or row, locks). Locks are reported in the data dictionary views DBA_LOCKS and V$LOCK, and also show up in trace files, as shown in the deadlock graph above.

In DBA_LOCKS and V$LOCK, the ID1 and ID2 columns indicate which object or transaction the lock relates to. These IDs also correspond to the hexadecimal numbers in the resource names shown in the deadlock graph above, e.g.

  • TM-0000501c-00000000 is a table lock with ID1 = 20508 (hex 501c), ID2 = 0
  • TX-00050005-00000196 is a transaction lock with ID1 = 327685 (hex 50005), ID2 = 406 (hex 196)

In the case of TM locks, ID1 corresponds to the OBJECT_ID of the table, which can be cross-referenced in DBA_OBJECTS, so in this case OBJECT_ID = 20508 corresponds to table T1.

In the case of TX locks, ID1 and ID2 identify the transaction that the lock relates to, although the relationship is a little more convoluted (and transient – it is possible to come back some time later to look up object IDs, but transaction IDs will be long gone). The top 16 bits of ID1 hold the USN and the bottom 16 bits hold the SLOT (XIDUSN and XIDSLOT in V$TRANSACTION respectively), and ID2 corresponds to the SQN (XIDSQN) of the transaction. So in the above example, the lock with ID1 = 327685 (hex 00050005) and ID2 = 406 (hex 00000196) corresponds to

  • XIDUSN = hex 0005, decimal 5
  • XIDSLOT = hex 0005, decimal 5
  • XIDSQN = hex 00000196, decimal 406
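These conversions are purely mechanical, and can be sanity-checked in a few lines of Python (the helper name here is mine, not anything Oracle provides):

```python
def decode_tx_id1(id1: int) -> tuple:
    """Split a TX lock's ID1 into (USN, SLOT); the undo segment number
    is in the top 16 bits of ID1 and the slot is in the bottom 16 bits."""
    return divmod(id1, 0x10000)

# TM lock: ID1 is simply the table's OBJECT_ID in decimal
print(int('0000501c', 16))        # 20508
# TX lock: ID1 packs USN and SLOT; ID2 is the SQN
print(decode_tx_id1(0x00050005))  # (5, 5)
print(int('00000196', 16))        # 406
```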

The transaction ID (XID) is a 16-digit hex number composed of these three values (USN-SLOT-SQN), but to complicate things, each component is stored in reverse byte order (little-endian, at least on little-endian platforms such as x86), e.g. :

XID = reverse USN || reverse SLOT || reverse SQN
XID = 0500 || 0500 || 96010000
XID = 0500050096010000
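On a little-endian platform, the same packing can be reproduced with Python's struct module – purely a cross-check of the layout described above, not an Oracle API:

```python
import struct

def xid(usn: int, slot: int, sqn: int) -> str:
    """Pack USN (16 bits), SLOT (16 bits) and SQN (32 bits) in
    little-endian byte order, mimicking how V$TRANSACTION.XID
    appears on a little-endian platform."""
    return struct.pack('<HHI', usn, slot, sqn).hex().upper()

print(xid(5, 5, 406))  # 0500050096010000
```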

You can easily see example values by starting a few transactions and then querying V$TRANSACTION :

SELECT xidusn
     , to_char(xidusn,'0XXX') hex_usn
     , xidslot
     , to_char(xidslot,'0XXX') hex_slot
     , xidsqn
     , to_char(xidsqn,'0XXXXXXX') hex_sqn
     , xid
FROM   v$transaction;

Tracing locking activity

This section describes how to trace locking activity. This can be quite informative, as locks can be created and modified quickly, and simply querying DBA_LOCKS may not give the whole picture.

First, create two tables with a simple parent-child relationship and insert some sample data. Note the use of ON DELETE CASCADE, which causes the child table to be locked while the dependent data is deleted.

CREATE TABLE t1
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
( id NUMBER
, data VARCHAR2(50)
, pid NUMBER
, CONSTRAINT pk_t2 PRIMARY KEY (id)
, CONSTRAINT fk_t2_t1 FOREIGN KEY (pid) REFERENCES t1 ON DELETE CASCADE
);

INSERT INTO t1 VALUES (1,'A');
INSERT INTO t1 VALUES (2,'A');
INSERT INTO t2 VALUES (10,'A',1);
INSERT INTO t2 VALUES (20,'A',2);
COMMIT;

Locking activity can be traced using the 10704 trace event as follows.

ALTER SESSION SET EVENTS = '10704 trace name context forever, level 5';

The tracing can be disabled as follows.

ALTER SESSION SET EVENTS = '10704 trace name context off';

Locks are written out to the session trace file, which can be identified with the following query.

SELECT value FROM v$diag_info WHERE name = 'Default Trace File';

Enable tracing, and run the following DELETE statement.

DELETE FROM t1 WHERE id = 1;

This will cause a cascade delete, deleting the corresponding row from T2. Because the foreign key on T2 is not indexed, we will see additional locking taking place.

The trace file contains quite a bit of information – the relevant operations (creation and conversion of locks) are shown below :

ksqgtl *** TM-00005067-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TM-00005069-00000000 mode=5 flags=0x401 timeout=21474836 ***
ksqcnv: TM-00005069,00000000 mode=3 timeout=21474836
ksqgtl *** TX-00060002-000001be mode=6 flags=0x401 timeout=0 ***
ksqcnv: TM-00005069,00000000 mode=5 timeout=21474836
ksqcnv: TM-00005069,00000000 mode=3 timeout=21474836

Line-by-line, this can be read as follows (the numeric modes in the trace translate as 2 = SS, 3 = SX, 4 = S, 5 = SSX, 6 = X; see the links at the end for more detailed notes on translating the mode).

  1. Table lock on parent table, mode SX
  2. Table lock on child table, mode SSX
  3. Conversion of child table lock to SX mode
  4. Transaction lock for the deletion of row data, mode X
  5. Conversion of child table lock to SSX mode
  6. Conversion of child table lock to SX mode

So this has shown that, in order to perform a delete using an ON DELETE CASCADE constraint, an SSX lock is temporarily required twice, and an SX lock will remain on the table until the transaction is committed.

If one session is already holding an SX lock on a table, then another session cannot obtain an SSX lock (the two are incompatible). The second session will be blocked and must wait for the first to commit or roll back (unless a timeout is specified).
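The standard table lock compatibility matrix makes this concrete. Here it is as a small Python sketch, using the mode names from this post:

```python
# Standard Oracle TM lock compatibility matrix (True = modes can coexist).
MODES = ['SS', 'SX', 'S', 'SSX', 'X']
MATRIX = [
    # SS    SX     S      SSX    X
    [True,  True,  True,  True,  False],  # held: SS
    [True,  True,  False, False, False],  # held: SX
    [True,  False, True,  False, False],  # held: S
    [True,  False, False, False, False],  # held: SSX
    [False, False, False, False, False],  # held: X
]

def compatible(held: str, requested: str) -> bool:
    """Can a lock in `requested` mode be granted while `held` is in place?"""
    return MATRIX[MODES.index(held)][MODES.index(requested)]

print(compatible('SX', 'SX'))   # True: two cascading deletes can coexist
print(compatible('SX', 'SSX'))  # False: the wait behind unindexed foreign keys
```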

Incidentally, the corresponding wait event is ‘enq: TM – contention’, which indicates that a session is waiting to acquire a TM lock held by another session. If this shows up in Statspack or AWR, it is a strong sign that there are unindexed foreign keys in the database.

Locking with indexed foreign keys

Let’s add an index to the above table and repeat the trace.

CREATE INDEX ix_t2 ON t2 (pid);

Now, delete a parent record with dependents.

DELETE FROM t1 WHERE id = 2;

The relevant lines from the trace file are as follows.

ksqgtl *** TM-00005067-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TM-00005069-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TX-00050014-0000019e mode=6 flags=0x401 timeout=0 ***

Line-by-line, this can be read as follows.

  1. Table lock on parent table, mode SX
  2. Table lock on child table, mode SX
  3. Transaction lock for the deletion of row data, mode X

Notice that the SSX lock is no longer required, and as a result, one session will not block another and the contention is eliminated.

As an aside, DML with indexed foreign keys is also far more efficient. Enable autotrace in SQL*Plus and compare the I/O required for the delete with and without an index on the foreign key. When unindexed, a full table scan of the child table is required to find the dependent rows, whereas with an index in place, an index range scan can be used instead, which typically involves far fewer I/Os.

Triggering a deadlock

Having now established how to identify and trace locks, the deadlock graph is a lot easier to understand. First, the steps to cause a deadlock to occur. We will create three tables – two tables to participate in a foreign key relationship, and a third table for a completely separate transaction.

DROP TABLE t1 CASCADE CONSTRAINTS PURGE;
DROP TABLE t2 CASCADE CONSTRAINTS PURGE;
DROP TABLE t3 CASCADE CONSTRAINTS PURGE;

CREATE TABLE t1
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
( id NUMBER
, data VARCHAR2(50)
, pid NUMBER
, CONSTRAINT pk_t2 PRIMARY KEY (id)
, CONSTRAINT fk_t2_t1 FOREIGN KEY (pid) REFERENCES t1 ON DELETE CASCADE
);

CREATE TABLE t3
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t3 PRIMARY KEY (id)
);

INSERT INTO t1 VALUES (1,'A');
INSERT INTO t1 VALUES (2,'A');
INSERT INTO t2 VALUES (10,'A',1);
INSERT INTO t2 VALUES (20,'A',2);
COMMIT;

Now, start two SQL*Plus sessions. In session 1, issue a delete against the parent table, and do not commit.

DELETE FROM t1 WHERE id = 1;

This will start a transaction, and also create table locks on T1 and T2.

In session 2, insert a new record into T3.

INSERT INTO t3 VALUES (1,'A');

This will start a separate transaction. We now have two sessions holding locks, ready to enter a deadlocked state.

In session 1, attempt to insert the same record into T3. This will be blocked, as it must wait on Session 2 to commit before it can determine whether or not the primary key has been violated.

INSERT INTO t3 VALUES (1,'A');

Finally, in session 2, attempt to delete a different record from T1. From a transactional point of view, this should not block (the two deletes touch different rows), but because the cascading delete needs an SSX lock on the child table, the operation will be blocked by session 1.

DELETE FROM t1 WHERE id = 2;

After a few seconds, a deadlock will be reported (probably in session 1). Check the trace file for details.

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0000508a-00000000        27     101    SX             28      12         SSX
TX-00090013-0000019b        28      12     X             27     101           S

So from this deadlock graph, we can see the following :

  • Session 1 (SID=101) is holding an SX lock on the child table T2 (object id 508A)
  • Session 2 (SID=12) is holding a transaction lock (TX-00090013-0000019b)
  • Session 1 (SID=101) is waiting on the same transaction (TX-00090013-0000019b)
  • Session 2 (SID=12) is waiting to take an SSX lock on the child table T2 (object id 508A)
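Decoding resource names like these is easy to script. A small helper (my own, for illustration – it simply applies the hex conversions described earlier):

```python
def parse_resource(name: str) -> tuple:
    """Split a deadlock-graph resource name like 'TM-0000508a-00000000'
    into (lock type, ID1, ID2), converting the IDs to decimal."""
    lock_type, id1, id2 = name.split('-')
    return lock_type, int(id1, 16), int(id2, 16)

print(parse_resource('TM-0000508a-00000000'))  # ('TM', 20618, 0)
print(parse_resource('TX-00090013-0000019b'))  # ('TX', 589843, 411)
```

For a TM resource, the decimal ID1 can be looked up directly against DBA_OBJECTS.OBJECT_ID; for a TX resource, ID1 splits into USN and SLOT as shown earlier.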

The trace file may also contain details of the SQL being executed.

Repeat the above exercise with an index on the foreign key, and no deadlock will occur.

Conclusion

In this post, we’ve seen

  • How to recognise unindexed foreign keys as the cause of a deadlock
  • How to identify locks and relate them to tables or transactions
  • How to trace locking activity in a session
  • How to trigger a deadlock involving an unindexed foreign key

Several variants of the deadlock graph can occur, depending on what the sessions are doing. In some cases, both locks are TM (when dealing with multiple foreign keys). In some cases, it is possible to trigger a deadlock involving three or more sessions. The common factor with unindexed foreign keys is a session waiting for a TM lock in SSX mode.

References

Capturing long-running SQL in Statspack

Oracle 11.2.0.1.0

I have been investigating “enq: TM – contention” wait events, which are almost certainly due to unindexed foreign keys. One such extreme example which showed up in a Statspack report is as follows :

                                                             Avg          %Total
                                          %Tim Total Wait   wait    Waits   Call
Event                               Waits  out   Time (s)   (ms)     /txn   Time
---------------------------- ------------ ---- ---------- ------ -------- ------
enq: TM - contention                    1    0      5,973 ######      0.0   58.6

This suggests a single SQL statement waited over an hour and a half for another session to commit. I thought it would be easy enough to find this in the “SQL ordered by Elapsed time” section of the report, but it is not there. I’ll explain why I think this is.
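Incidentally, a quick calculation shows why the Avg wait column in the report has overflowed into ‘######’:

```python
# One wait of 5,973 seconds: the average wait in milliseconds is
# a seven-digit number, wider than the report's Avg wait (ms) column.
total_wait_s = 5973
waits = 1
avg_wait_ms = total_wait_s * 1000 / waits
print(avg_wait_ms)          # 5973000.0
print(avg_wait_ms / 60000)  # 99.55 -- minutes spent in the single wait
```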

Studying the SQL script which generates the Statspack report (sprepins.sql), it can be seen that “SQL ordered by Elapsed time” is obtained from the STATS$TEMP_SQLSTATS table, which is in turn populated from STATS$SQL_SUMMARY. I checked STATS$SQL_SUMMARY, and there were no entries with a large enough ELAPSED_TIME to be the culprit.

So then I looked at the STATSPACK package, which populates STATS$SQL_SUMMARY. There are two insert statements which populate STATS$SQL_SUMMARY, both of which contain the following WHERE clause to limit which SQL statements are captured :

where (   buffer_gets   > l_buffer_gets_th
       or disk_reads    > l_disk_reads_th
       or parse_calls   > l_parse_calls_th
       or executions    > l_executions_th
       or sharable_mem  > l_sharable_mem_th
       or version_count > l_version_count_th
      )

The various thresholds are taken from the STATS$STATSPACK_PARAMETER table, and the default values are

  • buffer gets – 10,000
  • disk reads – 1,000
  • parse calls – 1,000
  • executions – 100
  • sharable memory – 1,048,576
  • version count – 20

What is missing from this set of criteria is execution time. In the case of TM contention, I have a SQL statement which can have an extremely long execution time, but because it does not pass any of the given thresholds, it will not be captured by Statspack. Obviously the majority of long-running SQL statements will exceed at least one of these thresholds, but in the case of lock waits, they may not. I would have thought that any SQL that runs for more than 10-20 seconds would be worth capturing in Statspack.
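To make the gap concrete, here is a Python paraphrase of the WHERE clause above using the default thresholds. This is illustrative only – Statspack itself evaluates the predicate in SQL – and the dictionary keys are mine:

```python
# Default capture thresholds from STATS$STATSPACK_PARAMETER.
DEFAULTS = dict(buffer_gets=10_000, disk_reads=1_000, parse_calls=1_000,
                executions=100, sharable_mem=1_048_576, version_count=20)

def is_captured(stats: dict, thresholds: dict = DEFAULTS) -> bool:
    """A statement is captured if it exceeds ANY threshold.
    Note that elapsed time is not part of the predicate."""
    return any(stats.get(k, 0) > v for k, v in thresholds.items())

# A statement that sat on 'enq: TM - contention' for ~100 minutes:
# it did almost no work, so it slips through every threshold.
blocked_stmt = dict(buffer_gets=3, disk_reads=0, parse_calls=1,
                    executions=1, sharable_mem=20_000, version_count=1)
print(is_captured(blocked_stmt))  # False -- despite 5,973 s of elapsed time
```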

It would be possible to lower one of the listed thresholds to capture more SQL (e.g. setting the executions threshold to 0), but this means that every executed statement is captured, which could lead to excessive data build-up in the Statspack schema.

I can’t see any way to configure Statspack to capture SQL based solely on its execution time. I guess it would be possible to modify the STATSPACK package, but that introduces risks to the overall stability of Statspack.