Deadlocks due to unindexed foreign keys

Here is a step-by-step analysis of the deadlock which occurs due to unindexed foreign keys. This analysis was performed on Oracle XE 11.2.0.2 – results may vary on other versions.

A typical deadlock graph arising from an unindexed foreign key is as follows.

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0000508a-00000000        27     101    SX             28      12         SSX
TX-00090013-0000019b        28      12     X             27     101           S

This type of deadlock is characterised by at least one session holding a table lock of type TM (table lock), and another session waiting for the TM lock in SSX mode. To help make sense of this, I found it useful to first understand two underlying concepts; how to identify what a lock relates to, and how to trace locks.

Identifying the resources in a lock

For the purposes of this post, there are two lock types of interest, TM (table) locks, and TX (transaction or row locks). Locks are reported in the data dictionary views DBA_LOCKS and V$LOCK, and also show up in trace files as shown in the deadlock graph above.

In DBA_LOCKS and V$LOCK, the ID1 and ID2 columns indicate which object or transaction the lock relates to. These IDs also correspond to the hexadecimal numbers in the resources show in the deadlock graph as above, e.g.

  • TM-0000501c-00000000 is a table lock with ID1 = 20508, ID2 = 0
  • TX-00050005-00000196 is a transaction lock with ID1 = 20485, ID2 = 406

In the case of TM locks, ID1 corresponds to the OBJECT_ID of the table, which can be cross-referenced in DBA_OBJECTS, so in this case OBJECT_ID = 20508 corresponds to table T1.

In the case of TX locks, ID1 and ID2 identify the transaction that the lock relates to, although the relationship is a little more convoluted (and transient – it is possible to go back sometime later to study object IDs, but transaction IDs will be long gone). ID1 expressed in hex corresponds to the USN and SLOT (XIDUSN and XIDSLOT in V$TRANSACTION respectively), and ID2 expressed in hex corresponds to the SQN (XIDSQN) of the transaction. So in the above example, the lock with ID1 = 20485 (hex 00050005) and ID2 = 406 (00000196) corresponds to

  • XIDUSN = hex 0005, decimal 3
  • XIDSLOT = hex 0005, decimal 5
  • XIDSQN = hex00000196, decimal 406

The transaction ID (XID) is a 16-digit hex number composed of these three values (USN-SLOT-SQN), but to complicate things, it is expressed in “reverse endianness”, e.g. :

XID = reverse USN || reverse SLOT || reverse SQN
XID = 0500 || 0500 || 69010000
XID = 0500050069010000

You can easily see example values by starting a few transactions and then querying V$TRANSACTION :

SELECT xidusn
     , to_char(xidusn,'0XXX') hex_usn
     , xidslot
     , to_char(xidslot,'0XXX') hex_slot
     , xidsqn
     , to_char(xidsqn,'0XXXXXXX') hex_sqn
     , xid
FROM   v$transaction;

Tracing locking activity

This section describes how to trace locking activity. This can be quite informative, as locks can be created and modified quickly, and simply querying DBA_LOCKS may not give the whole picture.

First, create two tables with a simple parent-child relationship and insert some sample data. Note the use of ON DELETE CASCADE, which cause the child table to be locked for the deletion of dependent data.

CREATE TABLE t1
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
( id NUMBER
, data VARCHAR2(50)
, pid NUMBER
, CONSTRAINT pk_t2 PRIMARY KEY (id)
, CONSTRAINT fk_t2_T1 FOREIGN KEY (pid) REFERENCES t1 ON DELETE CASCADE
);

INSERT INTO t1 VALUES (1,'A');
INSERT INTO t1 VALUES (2,'A');
INSERT INTO t2 VALUES (10,'A',1);
INSERT INTO t2 VALUES (20,'A',2);
COMMIT;

Locking activity can be traced using the 10704 trace event as follows.

ALTER SESSION SET EVENTS = '10704 trace name context forever, level 5';

The tracing can be disabled as follows.

ALTER SESSION SET EVENTS = '10704 trace name context off';

Locks are written out to the session trace file, which can be identified with the following query.

SELECT value FROM v$diag_info WHERE name = 'Default Trace File';

Enable tracing, and run the following DELETE statement.

DELETE FROM t1 WHERE id = 1;

This will cause a cascade delete, deleting the corresponding row from T2. Because the foreign key on T2 is not indexed, we will see additional locking taking place.

The trace file contains quite a bit of information – the relevant operations (creation and conversion of locks) are shown below :

ksqgtl *** TM-00005067-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TM-00005069-00000000 mode=5 flags=0x401 timeout=21474836 ***
ksqcnv: TM-00005069,00000000 mode=3 timeout=21474836
ksqgtl *** TX-00060002-000001be mode=6 flags=0x401 timeout=0 ***
ksqcnv: TM-00005069,00000000 mode=5 timeout=21474836
ksqcnv: TM-00005069,00000000 mode=3 timeout=21474836

Line-by-line, this can be read as follows. See the links at the end for notes on translating the mode.

  1. Table lock on parent table, mode SX
  2. Table lock on child table, mode SSX
  3. Conversion of child table lock to SX mode
  4. Transaction lock for the deletion of row data, mode X
  5. Conversion of child table lock to SSX mode
  6. Conversion of child table lock to SX mode

So this has shown that, in order to perform a delete using an ON DELETE CASCADE constraint, an SSX lock is temporarily required twice, and an SX lock will remain on the table until the transaction is committed.

If one session is already holding an SX lock on a table, then another session cannot obtain an SSX lock (the two are incompatible). The second session will be blocked and must wait for the first to commit or roll back (unless a timeout is specified).

Incidentally, the wait event corresponding to this is ‘enq: TM – contention’ – this indicates when a session is waiting on another session to get a TM lock. If this shows up in Statspack or AWR, it’s a sure sign that there are unindexed foreign keys in the database.

Locking with indexed foreign keys

Let’s add an index to the above table and repeat the trace.

CREATE INDEX ix_t2 ON t2 (pid);

Now, deleting a parent record with dependents

DELETE FROM t1 WHERE id = 2;

The relevant lines from the trace file are as follows.

ksqgtl *** TM-00005067-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TM-00005069-00000000 mode=3 flags=0x401 timeout=21474836 ***
ksqgtl *** TX-00050014-0000019e mode=6 flags=0x401 timeout=0 ***

Line-by-line, this can be read as follows.

  1. Table lock on parent table, mode SX
  2. Table lock on child table, mode SX
  3. Transaction lock for the deletion of row data, mode X

Notice that the SSX lock is no longer required, and as a result, one session will not block another and the contention is eliminated.

As an aside, doing DML with indexed foreign keys is far more efficient. Enable autotrace in SQL*Plus and compare the I/O required for deleting with and without indexes on foreign keys. When unindexed, a full table scan on the child table is required to find all dependants, whereas with an index in place, an index range scan can be used to find all dependants, which typically involves far fewer I/Os.

Triggering a deadlock

Having now established how to identify and trace locks, the deadlock graph is a lot easier to understand. First, the steps to cause a deadlock to occur. We will create three tables – two tables to participate in a foreign key relationship, and a third table for a completely separate transaction.

DROP TABLE t1 CASCADE CONSTRAINTS PURGE;
DROP TABLE t2 CASCADE CONSTRAINTS PURGE;
DROP TABLE t3 CASCADE CONSTRAINTS PURGE;

CREATE TABLE t1
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
( id NUMBER
, data VARCHAR2(50)
, pid NUMBER
, CONSTRAINT pk_t2 PRIMARY KEY (id)
, CONSTRAINT fk_t2_t1 FOREIGN KEY (pid) REFERENCES t1 ON DELETE CASCADE
);

CREATE TABLE t3
( id NUMBER
, data VARCHAR2(50)
, CONSTRAINT pk_t3 PRIMARY KEY (id)
);

INSERT INTO t1 VALUES (1,'A');
INSERT INTO t1 VALUES (2,'A');
INSERT INTO t2 VALUES (10,'A',1);
INSERT INTO t2 VALUES (20,'A',2);
COMMIT;

Now, start two SQL*Plus sessions. In session 1, issue a delete against the parent table, and do not commit.

DELETE FROM t1 WHERE id = 1;

This will start a transaction, and also create table locks on T1 and T2.

In session 2, insert a new record into T3.

INSERT INTO t3 VALUES (1,'A');

This will start separate transaction. We now have two sessions holding locks, ready to enter a deadlocked state.

In session 1, attempt to insert the same record into T3. This will be blocked, as it must wait on Session 2 to commit before it can determine whether or not the primary key has been violated.

INSERT INTO t3 VALUES (1,'A');

Finally, in session 2, attempt to delete a different record from T1. From a transactional point of view, this should not be blocked (there are no PK conflicts), but because of the need for an SSX lock on the child table, the operation will be blocked by session 2.

DELETE FROM t1 WHERE id = 2;

After a few seconds, a deadlock will be reported (probably in session 1). Check the trace file for details.

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0000508a-00000000        27     101    SX             28      12         SSX
TX-00090013-0000019b        28      12     X             27     101           S

So from this deadlock graph, we can see the following :

  • Session 1 (SID=101) is holding an SX lock on table T1 (object id 508A)
  • Session 2 (SID=12) is holding a transaction lock (TX-00090013-0000019b)
  • Session 1 (SID=101) is waiting for the same transaction (TX-00090013-0000019b)
  • Session 2 (SID=12) is waiting to take an SSX lock on table T1 (object id 508A)

The trace file may also contain details of the SQL being executed.

Repeat the above exercise with an index on the foreign key, and no deadlock will occur.

Conclusion

In this post, we’ve seen

  • How to recognise unindexed foreign keys as the cause of a deadlock
  • How to identify locks and relate them to tables or transactions
  • How to trace locking activity in a session
  • How to trigger a deadlock involving an unindexed foreign key

Several variants of the deadlock graph can occur, depending on what the sessions are doing. In some cases, both locks are TM (when dealing with multiple foreign keys). In some cases, it is possible to trigger a deadlock involving three or more sessions. The common factor with unindexed foreign keys is a session waiting for a TM lock in SSX mode.

References

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s