Securing Oracle scripts with a wallet

It’s that time of year when I try to clean up some of my bad habits from the year before. This time, it’s plain-text passwords in SQL scripts. Here’s a way to make them a little more secure. You can move all your passwords into a secure wallet file stored on your PC. This is not the most secure option available (and you should keep the wallet in a private location) but it is better than nothing. There is a more secure option which ties the wallet to your machine and username, but this complicates matters if you want to use the same wallet from different machines.

You will need to know the location of your SQLNET.ORA and TNSNAMES.ORA files which are used by SQL*Plus (usually defined by the TNS_ADMIN environment variable).

Create a folder to hold your wallet

mkdir D:\SQL\auth

Add the following into your SQLNET.ORA file to point to the Oracle Wallet

WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = D:\SQL\auth)
    )
  )

SQLNET.WALLET_OVERRIDE = TRUE

Create a wallet – you will be prompted for a password to secure the wallet

orapki wallet create -wallet D:\SQL\auth -auto_login

Add your database credentials to the wallet, using an alias from your TNSNAMES file. You will be prompted for your database password, which is stored securely in the wallet.

mkstore -wrl D:\SQL\auth -createCredential TNS_ALIAS USERNAME

Now, when using SQL*Plus, you can connect using the wallet without having a password in your scripts

sqlplus /@TNS_ALIAS
connect /@TNS_ALIAS

If you need to connect via a proxy

sqlplus [PROXY]/@TNS_ALIAS
connect [PROXY]/@TNS_ALIAS

To list the contents of the wallet

mkstore -wrl D:\SQL\auth -listCredential

To modify the contents of the wallet, e.g. if your password is changed

mkstore -wrl D:\SQL\auth -modifyCredential TNS_ALIAS USERNAME PASSWORD

To delete an entry from the wallet

mkstore -wrl D:\SQL\auth -deleteCredential TNS_ALIAS
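
Once the credentials are in place, scheduled jobs can run scripts with no password appearing in the script or on the command line. A minimal sketch, assuming a hypothetical alias TNS_ALIAS and script report.sql:

```bat
:: run_report.bat -- runs a SQL script using wallet authentication
:: assumes TNS_ADMIN points at the folder containing the SQLNET.ORA edited above
sqlplus -S /@TNS_ALIAS @report.sql
```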

More reading

https://docs.oracle.com/database/121/DBSEG/authentication.htm#DBSEG99815

https://blog.pythian.com/securing-oracle-monitoring-backup-scripts/

https://blog.pythian.com/oracle-secure-external-password-stores/

Long-running delete of synopses during table stats gathering

Oracle 11.2.0.4

I recently encountered one long-running session and several blocked sessions, all performing the same delete operation.

--SQL_ID dsf21kcbjqfyk
DELETE FROM SYS.WRI$_OPTSTAT_SYNOPSIS$ WHERE BO# = :B2 AND INTCOL# = :B1;

One session had been running for several hours, and the other sessions were all blocked by a row lock from the first session. Unfortunately, each of these sessions was trying to gather stats on new partitions of tables as part of various ETL processes, so being blocked for several hours was not ideal.

This SYS-owned table holds synopses of column statistics from partitioned tables, which are used to maintain incremental statistics on those tables. For more details on this feature, see the Oracle documentation on incremental statistics.
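
As a reminder, synopses are only maintained for tables where incremental statistics have been enabled, along these lines (MY_PART_TABLE is a hypothetical table name):

```sql
-- Enable incremental statistics on a partitioned table;
-- subsequent stats gathering will maintain per-partition synopses
EXEC DBMS_STATS.SET_TABLE_PREFS(user, 'MY_PART_TABLE', 'INCREMENTAL', 'TRUE')

-- Confirm the current preference
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', user, 'MY_PART_TABLE') FROM dual;
```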

The table is a range-hash partitioned table consisting of the following columns :

  • BO# NUMBER (partition key)
  • GROUP# NUMBER (subpartition key)
  • INTCOL# NUMBER
  • HASHVALUE NUMBER

In my case, the table contains around 11 billion rows across 1,700 partitions, and up to 32 subpartitions per partition. The total size of the table on disk is around 500GB, and the table has no indexes. The database itself contains around 1,800 partitioned tables, and over 1 million partitions and subpartitions.

The real-time SQL monitoring report from each session showed over 2,000 executions of the DELETE, suggesting it was being run from a PL/SQL FORALL statement. With each execution having to scan a whole partition, we were seeing over 1.5 billion buffer gets over a 4 hour period, which is around 100k buffer gets per second.

The report also showed that all of the sessions had the same values for the bind variables. This explains why all but one were blocked by a row lock held by the one “active” session.

On a hunch, I took these values of the bind variables and matched BO# to the OBJECT_ID in DBA_OBJECTS and INTCOL# to the INTERNAL_COLUMN_ID in DBA_TAB_COLS. Funnily enough, the object ID matched to an unrelated application table which, earlier in the day, had had a column dropped, and the column ID was one more than the remaining number of columns in the table. So it seems that every single stats gathering operation was trying to clean out the same “orphaned” synopses, and would not finish gathering stats until this operation was done.
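
For reference, the lookups were along these lines, with :b2 and :b1 standing in for the bind values captured from the monitoring report:

```sql
-- Which object does BO# refer to?
SELECT owner, object_name, object_type
FROM   dba_objects
WHERE  object_id = :b2;           -- BO# from the report

-- Which column does INTCOL# refer to (if it still exists)?
SELECT column_name
FROM   dba_tab_cols
WHERE  owner = :owner
AND    table_name = :table_name
AND    internal_column_id = :b1;  -- INTCOL# from the report
```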

A little peek inside the DBMS_STATS_INTERNAL package confirmed my suspicions – when gathering table stats, it fetches a list of all synopses where the table and column IDs are not in the data dictionary, and then runs a FORALL … DELETE.
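
In outline, the cleanup logic looks something like the following. This is a paraphrase from memory, not Oracle's actual code; in particular the use of WRI$_OPTSTAT_SYNOPSIS_HEAD$ and the join to COL$ are my guesses at how the orphan check is expressed:

```sql
DECLARE
  TYPE t_num_tab IS TABLE OF NUMBER;
  l_bo     t_num_tab;
  l_intcol t_num_tab;
BEGIN
  -- collect synopses whose table/column no longer exist in the dictionary
  SELECT DISTINCT h.bo#, h.intcol#
  BULK COLLECT INTO l_bo, l_intcol
  FROM   sys.wri$_optstat_synopsis_head$ h
  WHERE  NOT EXISTS (SELECT NULL
                     FROM   sys.col$ c
                     WHERE  c.obj# = h.bo#
                     AND    c.intcol# = h.intcol#);

  -- one execution of the DELETE per orphaned (table, column) pair --
  -- which is why the monitoring report showed thousands of executions
  FORALL i IN 1 .. l_bo.COUNT
    DELETE FROM sys.wri$_optstat_synopsis$
    WHERE  bo# = l_bo(i) AND intcol# = l_intcol(i);
END;
/
```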

Fortunately, Oracle Support was able to quickly supply a patch (19450139) which fixed the issue by bypassing the DELETE operation in DBMS_STATS_INTERNAL. I wonder now whether any other process is going to clean up the orphaned synopses, or whether they will simply remain in the table forever.

In conclusion then, I learnt a few things from this :

  1. The WRI$_OPTSTAT_SYNOPSIS$ table contains table/partition/column-level synopsis data. There is one partition per table, with the table’s partitions distributed randomly across 32 subpartitions, and each column identified by an unindexed numeric column. The volume of data in this table is therefore probably a function of the number of tables, partitions and columns.
  2. Without the patch in place, dropping columns on one partitioned table can impact the ability to gather stats on unrelated partitioned tables. Every time GATHER_TABLE_STATS was run on a partitioned table, it tried to delete the same rows, creating significant row-lock contention for other sessions.
  3. The execution counter in the top row of the real-time SQL monitoring report shows the number of times the operation has been called by the FORALL statement.
  4. Having to read through billions of rows is going to take time, even with fast I/O and a 50GB buffer cache.
  5. DBMS_STATS_INTERNAL is surprisingly readable, and it is interesting to see how an Oracle patch can consist of just changing a few lines in a package.

Joining to a pipelined table function and “left correlation”

Oracle 11.2.0.4

A pipelined table function may be called from regular SQL using the TABLE collection expression, e.g.

SELECT *
FROM   TABLE(my_pipelined_function('ABC','DEF'));

where ‘ABC’ and ‘DEF’ are the inputs to the function.

What if you want to call the function repeatedly for several sets of inputs, e.g. testing the function for a variety of values? If those inputs are stored in a table somewhere, it ought to be as easy as doing a cross join to the table.

SELECT *
FROM   TABLE(my_pipelined_function(a,b))
CROSS JOIN my_input_table;

However, this returns the error ORA-00904: invalid identifier. It’s as though the SQL engine is unable to resolve the relationship between the inputs to the function and the joining table.

Today I stumbled across the answer, buried deep in the Oracle 11g SQL Language Reference.

The collection_expression can reference columns of tables defined to its left in the FROM clause. This is called left correlation.

So it seems that by reversing the order of the join so that the joining table comes before the function (i.e. to the left of it in SQL parlance), the SQL engine is able to figure out the join.

SELECT *
FROM   my_input_table
CROSS JOIN TABLE(my_pipelined_function(a,b));

Who ever knew that join order mattered?

Worked example

-- Simple pipelined function to return a couple of records per input
SQL> CREATE OR REPLACE FUNCTION my_pipelined_function
  2  ( a VARCHAR2
  3  , b VARCHAR2
  4  )
  5  RETURN SYS.DBMS_DEBUG_VC2COLL PIPELINED
  6  IS
  7  BEGIN
  8    PIPE ROW(UPPER(a)||LOWER(b));
  9    PIPE ROW(LOWER(a)||UPPER(b));
 10    RETURN;
 11  END;
 12  /

-- Single call to the function
SQL> SELECT *
  2  FROM   TABLE(my_pipelined_function('ABC','DEF'));

COLUMN_VALUE
---------------
ABCdef
abcDEF

-- Join to an input rowset
SQL> WITH src AS (
  2    SELECT 'ABC' AS a, 'DEF' AS b FROM dual
  3    UNION ALL
  4    SELECT 'GHI' AS a, 'JKL' AS b FROM dual
  5    UNION ALL
  6    SELECT 'MNO' AS a, 'PQR' AS b FROM dual
  7  )
  8  SELECT *
  9  FROM   TABLE(my_pipelined_function(a,b))
 10  CROSS JOIN src;
CROSS JOIN src
             *
ERROR at line 10:
ORA-00904: "A": invalid identifier

-- Reverse the order of join
SQL> WITH src AS (
  2    SELECT 'ABC' AS a, 'DEF' AS b FROM dual
  3    UNION ALL
  4    SELECT 'GHI' AS a, 'JKL' AS b FROM dual
  5    UNION ALL
  6    SELECT 'MNO' AS a, 'PQR' AS b FROM dual
  7  )
  8  SELECT *
  9  FROM   src
 10  CROSS JOIN TABLE(my_pipelined_function(a,b));

A   B   COLUMN_VALUE
--- --- ---------------
ABC DEF ABCdef
ABC DEF abcDEF
GHI JKL GHIjkl
GHI JKL ghiJKL
MNO PQR MNOpqr
MNO PQR mnoPQR

References

http://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_10002.htm#sthref6754

Standard Edition is dead – long live Standard Edition 2

Update 8th July – the plot thickens

After posting this blog two days ago, it seems Oracle Corp. may be taking a little time out to “clarify” their position. The original MOS note referenced below is no longer available (despite the link still sitting in my favourites in MOS), and furthermore, the helpful and rather detailed blog post by Mike Dietrich (News on Oracle Database STANDARD EDITION 12.1.0.2) has been redacted to a “watch this space” note.

So I guess we’ll have to watch this space…

My original blog posting from 6th July remains below.

Disclaimer – these are my personal musings based on very limited information, which I’m sure will rapidly become out of date as more details emerge over the coming months.

After a year of uncertainty, Oracle has finally announced the future of Standard Edition (SE). Along with its smaller sibling Standard Edition 1 (SE1), it is being retired and replaced by Standard Edition 2 (SE2). Perhaps the most significant point of this change is that SE2 can only be run on one- or two-socket servers, which raises interesting questions for existing SE deployments on four-socket servers.

When 12.1.0.2 was released in July 2014, it was only available for Enterprise Edition (EE). No date was given for SE and SE1, and with patching for 12.1.0.1 scheduled to end in July 2015, there was an uncertain future for SE and SE1. With Extended Support fees waived on 11g for a year, it seemed like sticking with 11g was a better option for SE and SE1 than moving to 12c with no clear future. However, with 11g ending Premier Support in early 2015, things were beginning to look very bleak for Oracle’s most cost-effective database platforms.

After months of silence, a MOS note was quietly published in early July 2015 (2027072.1) with a rather surprising announcement. The key points I took from this brief note are :

  • 12.1.0.1 is the final release for SE and SE1
  • Beginning with 12.1.0.2, SE and SE1 will be superseded by SE2
  • SE2 can be run on a maximum of 2 sockets, and can optionally support a two-node RAC cluster
  • Customers running SE or SE1 will have to migrate their licenses to SE2 if they wish to upgrade to 12.1.0.2 (i.e. if they wish to remain supported beyond January 2016)
  • SE2 is scheduled for release in Q3 2015
  • Patching for 12.1.0.1 SE and SE1 will be extended until January 2016, beyond the originally published date of July 2015

So what does this mean for SE and SE1 users?

Firstly, there’s a small lifeline, in that 12.1.0.1 will be patched for another six months. However, if like me you’ve stuck with 11g and haven’t yet upgraded to 12c, is it really worth upgrading for a further six months, when 11g remains in (fees-waived) Extended Support for the same period?

The most interesting point is what happens to systems running SE on a 4-socket system, or a 2×2-socket RAC system. When the lights finally go out on SE and SE1 in January 2016, you will have to migrate to different hardware to keep within the 2-socket limit. Alternatively you may be faced with a large bill to upgrade to EE.

There’s been no mention of pricing yet. It’s been suggested that the cost of SE2 could fall somewhere between SE1 and SE. However, with the MOS note saying you have to migrate your licenses in order to upgrade, I wonder how many customers will end up paying Oracle for the privilege of migrating. Cue some lengthy discussions with your favourite account manager.

What will SE2 look like? Will it keep to the same limited set of features as SE/SE1, or will a few EE features or options be allowed to creep in? Parallel execution would be a nice treat. Will SE2 offer any improved performance monitoring capabilities, or will we still be forced to hunt around for the install script for Statspack?

Finally, will Oracle use this as an opportunity to retire the per-socket metric for licensing of SE/SE1, instead moving to per-core licensing, as is done with EE? With the advance of multi-core chips, it seems inevitable at some point. Again, this could adversely impact customers who have invested in high-end multi-core chips.

I’m no lawyer, but having recently been debating perpetual vs term licenses, it makes me wonder what “Perpetual” really means, when support for said license is no longer available. I’m sure it’s in the small print somewhere.

ALL or ANY in SQL

The other day I was studying a SQL statement that was performing poorly, when I noticed a peculiar syntax that I have not come across before :

SELECT COUNT(*)
FROM   customer
WHERE  annual_spend > ALL (
  SELECT spend_threshold
  FROM   promotion
  WHERE  promo_name = 'Summer 2015'
);

The only time I’d ever come across the keyword ALL in SQL was in UNION ALL. However, a quick check of the manual led me to the Group Comparison Conditions :

http://docs.oracle.com/database/121/SQLRF/conditions002.htm#sthref1005

There are three possible comparisons – ALL, ANY and SOME (although ANY and SOME are identical). They allow you to compare an expression with any or all members of a list or subquery, and are defined as follows :

  • ANY, SOME – Compares a value to each value in a list or returned by a query. Must be preceded by =, !=, >, <, >= or <=. Can be followed by any expression or subquery that returns one or more values.
  • ALL – Compares a value to every value in a list or returned by a query. Must be preceded by =, !=, >, <, >= or <=. Can be followed by any expression or subquery that returns one or more values.

So how might these be useful? I’ll look at three possible use cases, and for each, consider the alternatives and look at potential performance implications. For the examples given, I’m using the OE and HR sample schema supplied with Oracle 12.1.0.2.

Equality

Using the “equality test” (=) we are testing whether a value matches one or more values in the subquery or expression list.

First, using the ALL comparison in the OE schema :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id = ALL (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
(null)

This first query returns no results. Since the subquery returns several different product IDs, it is logically impossible for a single value to match all of them.

Next, using the ANY comparison :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id = ANY (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
         2235

This query returns the quantity sold of all products in category 11. However, the same result can be obtained using the more familiar IN-clause.

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id IN (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
         2235

Let’s look at the execution plans of the ANY and the IN queries. For the sample data in OE, both queries return the same execution plan :

Plan hash value: 2275080078
 
-------------------------------------------------------------------------------------------
| Id  | Operation           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                     |     1 |    15 |     8   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |                     |     1 |    15 |            |          |
|*  2 |   HASH JOIN         |                     |    58 |   870 |     8   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| PRODUCT_INFORMATION |    16 |   112 |     5   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| ORDER_ITEMS         |   665 |  5320 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("PRODUCT_ID"="PRODUCT_ID")
   3 - filter("CATEGORY_ID"=11)

So, it seems that with the equality condition, ALL and ANY offer no benefits over the traditional IN-clause.

Inequality

Using the “inequality test”, which can be expressed as !=, ^= or <>, we are testing whether a value does not match one or more values in the subquery or expression list.

First, using the ALL comparison :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id != ALL (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
        27852

This first query returns the quantity of all products not in category 11. The same could be achieved with the more familiar NOT IN clause :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id NOT IN (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
        27852

Next, using the ANY comparison :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id != ANY (
  SELECT product_id
  FROM   product_information
  WHERE  category_id = 11
);

SUM(QUANTITY)
-------------
        30087

This query returns the quantity of products which don’t match one of the products in category 11. However, this also happens to be the total quantity of all products, since products which don’t match one product in category 11 match another product :

SELECT SUM(quantity)
FROM   order_items;

SUM(QUANTITY)
-------------
        30087

So arguably, combining != with ANY is not particularly useful.

Returning to != ALL, how does the execution plan compare with NOT IN? For the sample data in OE, both queries return the same execution plan :

Plan hash value: 712943640
 
---------------------------------------------------------------------------------------------
| Id  | Operation             | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                     |     1 |    15 |     8   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |                     |     1 |    15 |            |          |
|*  2 |   HASH JOIN RIGHT ANTI|                     |   665 |  9975 |     8   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL  | PRODUCT_INFORMATION |    16 |   112 |     5   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | ORDER_ITEMS         |   665 |  5320 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("PRODUCT_ID"="PRODUCT_ID")
   3 - filter("CATEGORY_ID"=11)

So, it seems that with the inequality condition, ALL and ANY offer no benefits over the traditional NOT IN-clause.

Greater-than, less-than

Finally, consider the greater-than, less-than, greater-than-or-equals, and less-than-or-equals operators. Using greater-than as an example, we are testing whether a value exceeds one or more values in the subquery or expression list.

First, using the ALL comparison in the HR schema :

SELECT COUNT(*)
FROM   employees
WHERE  salary > ALL (
  SELECT salary
  FROM   employees
  WHERE  department_id = 50
);

  COUNT(*)
----------
        31

This first query returns employees with a salary greater than all the employees in department 50. However, the same can be achieved using the MAX aggregate function to obtain the highest salary in department 50 :

SELECT COUNT(*)
FROM   employees
WHERE  salary > (
  SELECT MAX(salary)
  FROM   employees
  WHERE  department_id = 50
);

  COUNT(*)
----------
        31

Let’s see how the execution plans compare for these two approaches.

First, using the ALL comparison :

Plan hash value: 1561682600
 
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |    11 |     8  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |    11 |            |          |
|   2 |   MERGE JOIN ANTI NA |           |     1 |    11 |     8  (25)| 00:00:01 |
|   3 |    SORT JOIN         |           |   107 |   428 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |   107 |   428 |     3   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE       |           |    45 |   315 |     4  (25)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL| EMPLOYEES |    45 |   315 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("SALARY"<="SALARY")
       filter("SALARY"<="SALARY")
   6 - filter("DEPARTMENT_ID"=50)

Next, using the MAX function :

Plan hash value: 1008057839
 
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL  | EMPLOYEES |     5 |    20 |     3   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE    |           |     1 |     7 |            |          |
|*  4 |     TABLE ACCESS FULL| EMPLOYEES |    45 |   315 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("SALARY"> (SELECT MAX("SALARY") FROM "EMPLOYEES" 
              "EMPLOYEES" WHERE "DEPARTMENT_ID"=50))
   4 - filter("DEPARTMENT_ID"=50)

The query using MAX appears to be more efficient. It computes the highest value from department 50, and then compares this with the employees table. In contrast, the query using ALL has to perform a comparison with every employee in department 50. This additional overhead could be significant for large datasets.

With the ANY comparison, we are looking for employees whose salary is greater than ANY employees in department 50.

SELECT COUNT(*)
FROM   employees
WHERE  salary > ANY (
  SELECT salary
  FROM   employees
  WHERE  department_id = 50
);

  COUNT(*)
----------
       106

This can similarly be optimised by using the MIN aggregate function.

SELECT COUNT(*)
FROM   employees
WHERE  salary > (
  SELECT MIN(salary)
  FROM   employees
  WHERE  department_id = 50
);

  COUNT(*)
----------
       106

A similar improvement can be seen in the execution plans.

Using the ANY comparison :

Plan hash value: 67141836
 
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |    11 |     8  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |    11 |            |          |
|   2 |   MERGE JOIN SEMI    |           |   105 |  1155 |     8  (25)| 00:00:01 |
|   3 |    SORT JOIN         |           |   107 |   428 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |   107 |   428 |     3   (0)| 00:00:01 |
|*  5 |    SORT UNIQUE       |           |    45 |   315 |     4  (25)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL| EMPLOYEES |    45 |   315 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access(INTERNAL_FUNCTION("SALARY")>INTERNAL_FUNCTION("SALARY"))
       filter(INTERNAL_FUNCTION("SALARY")>INTERNAL_FUNCTION("SALARY"))
   6 - filter("DEPARTMENT_ID"=50)

Using the MIN aggregate function :

Plan hash value: 1008057839
 
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL  | EMPLOYEES |     5 |    20 |     3   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE    |           |     1 |     7 |            |          |
|*  4 |     TABLE ACCESS FULL| EMPLOYEES |    45 |   315 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("SALARY"> (SELECT MIN("SALARY") FROM "EMPLOYEES" 
              "EMPLOYEES" WHERE "DEPARTMENT_ID"=50))
   4 - filter("DEPARTMENT_ID"=50)

Again, the use of an aggregate function appears to be more efficient by reducing the number of comparisons which have to be done.

Expression lists

The previous examples have focussed on comparisons with subqueries. The comparison conditions can also be used with expression lists, e.g.

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id = ANY (1820,1910,2255);

SUM(QUANTITY)
-------------
          103

However, this can also be replaced by the more familiar IN clause :

SELECT SUM(quantity)
FROM   order_items
WHERE  product_id IN (1820,1910,2255);

SUM(QUANTITY)
-------------
          103

The same applies to “!= ALL”, which can be replaced with NOT IN.

Again, there seems little benefit in using ALL or ANY compared with IN and NOT IN.

Conclusion

The group comparison conditions ALL or ANY can be used when comparing rows with the results of a subquery or expression list. However, the same can be achieved using more familiar constructs such as IN, NOT IN, MIN and MAX, often more efficiently.

The following table summarises the alternatives to ALL or ANY :

Condition      ALL                  ANY
Equality       Illogical            Use IN clause
Inequality     Use NOT IN clause    Illogical
Greater/Less   Use MIN/MAX          Use MIN/MAX

If you know of any examples where ALL or ANY are beneficial, please do share in the comments below!

EZCONNECT without a password

After encountering this little annoyance for the nth time, I thought I should write it up once and for all, so it is committed to my electronic memory.

The EZCONNECT naming method is a handy way of connecting to an Oracle database by its service name, which avoids the need for aliases in the TNSNAMES file, or lengthy SQL*Net connection strings :

SQL> connect user/password@hostname:port/service_name
Connected.

One would expect that, if you don’t want to specify the password on the command line, you could simply omit it and wait for the prompt. But…

SQL> connect user@hostname:port/service_name
ERROR:
ORA-12504: TNS:listener was not given the SERVICE_NAME in CONNECT_DATA

Without the password present, the command interpreter parses the string differently, and (according to Oracle Metalink note 267427.1) treats the service_name as the password and the hostname as the service_name. Basically it’s not handling the syntax very well.

The solution is to quote the part of the connection string after the @ symbol. The actual quoting required may vary by environment, but something along the following lines works in Windows.

SQL> connect user@'hostname:port/service_name'
Enter password:
Connected.

enq: TM – contention due to parallel DML and foreign keys

This is a write-up of an issue I recently posted to the OTN discussion forum (https://community.oracle.com/message/12798407). I thought the associated test case was useful in demonstrating the issue, so it is captured here for future reference. There were some useful replies to the OTN post, confirming my suspicions.

The test was performed using Oracle Enterprise Edition 11.2.0.3.0 on Linux.

Problem statement

A parallel delete blocks inserts into a dependent table, showing up as waits on ‘enq: TM – contention’. The scenario is as follows.

  • Session S1 is performing a parallel delete from table T1.
  • Session S2 is performing a single-row insert into table T2.
  • Table T2 has a foreign key referencing table T1. The foreign key column is indexed.
  • Neither table is partitioned.
  • The row being inserted into T2 does not depend on any of the rows being deleted from T1.
  • Session S2 is blocked by session S1 until the parallel delete is committed, with an ‘enq: TM – contention’ wait event.

Analysis

Due to the presence of a foreign key, it is necessary for S2 to ensure there is a valid parent record in T1 before permitting the insert into T2.

  • In the case of a serial delete from S1, the process locks the affected rows in T1. Consequently, providing S2 is inserting records which are not affected by the delete, it is not blocked.
  • In the case of a parallel delete from S1, the process locks the entire table with an exclusive (mode 6) lock. Consequently, S2 cannot insert any records into T2 and is blocked until S1 commits the transaction.

This is demonstrated in the following test case.

-- Set up tables and data

CREATE TABLE t1
( id NUMBER
, data VARCHAR2(4000)
, CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
( id NUMBER
, t1_id NUMBER
, CONSTRAINT pk_t2 PRIMARY KEY (id)
, CONSTRAINT fk_t2_t1 FOREIGN KEY (t1_id) REFERENCES t1
);

CREATE INDEX ix_t2_fk ON t2 (t1_id);

-- Insert 1,000 rows into T1
INSERT INTO t1
SELECT level
     , RPAD('X',4000,'X')
FROM   dual
CONNECT BY level <= 1000;

-- Insert 10 rows into T2
INSERT INTO t2
SELECT level
     , level
FROM   dual
CONNECT BY level <= 10;


EXEC dbms_stats.gather_table_stats(user,'T1')
EXEC dbms_stats.gather_table_stats(user,'T2')

COMMIT;

Firstly, insert into a related table whilst performing a serial delete.

-- Session 1 - serial delete
ALTER SESSION DISABLE PARALLEL DML;

DELETE FROM t1
WHERE id BETWEEN 901 AND 1000;
-- No commit

-- Session 2
INSERT INTO t2 VALUES (11,11);
-- No wait event

-- Check locks
SELECT session_id
     , lock_type
     , mode_held
     , mode_requested
     , lock_id1
     , lock_id2
     , blocking_others
     , object_type ||' ' || object_name AS locked_object
FROM   dba_locks
LEFT JOIN user_objects
ON     lock_id1 = object_id
WHERE  lock_type IN ('DML','Transaction')
AND    session_id IN (&1,&2.)
ORDER BY 1,2,5;

/*
1 DML Row-X (SX) None 75618 0 Not Blocking TABLE T1
1 DML Row-X (SX) None 75620 0 Not Blocking TABLE T2
1 Transaction Exclusive None 262158 669 Not Blocking
2 DML Row-X (SX) None 75618 0 Not Blocking TABLE T1
2 DML Row-X (SX) None 75620 0 Not Blocking TABLE T2
2 Transaction Exclusive None 327680 830 Not Blocking
*/

-- Session 2
COMMIT;

Session 2 is able to insert a record into the dependent table T2 without encountering contention from session 1. The serial delete in session 1 takes a row-exclusive (SX) lock on T1, which is compatible with the SX lock session 2 needs on T1 to validate the foreign key constraint.

Now see what happens when parallel DML is used to delete records in Session 1.

-- Session 1 - parallel delete
COMMIT;
ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ PARALLEL */ FROM t1
WHERE id BETWEEN 801 AND 900;
-- No commit

-- Session 2
INSERT INTO t2 VALUES (12,12);
-- Lock-wait
-- Check locks again
/*
1 DML Exclusive None 75618 0 Blocking TABLE T1 <-- this is the TM-X (mode 6) blocking lock
1 DML Row-X (SX) None 75620 0 Not Blocking TABLE T2
1 Transaction Exclusive None 458759 650 Not Blocking
2 DML None Row-X (SX) 75618 0 Not Blocking TABLE T1 <-- this is the blocked session
*/

-- Check wait events
SELECT event
FROM   v$session
WHERE  sid = &2.;
-- event = enq: TM - contention

Session 2 cannot insert into table T2 because it is blocked by session 1 deleting from table T1. The parallel delete has taken an exclusive (X) table lock on T1, so no other process can lock the table at all, not even with the row-exclusive (SX) lock required to validate the foreign key. The insert from session 2 is blocked until session 1 commits.

-- Session 1
COMMIT;

-- Session 2
-- Insert completes successfully

This test case has shown that parallel DML can create contention, not only on the table being written to, but also on other tables related to it by foreign keys. This should be borne in mind when deciding whether to use parallel DML to speed up data maintenance processes.
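
If you suspect this pattern on a live system, a query along the following lines (a sketch against the standard V$LOCK and V$SESSION views) will show exclusive TM locks and the sessions queuing behind them:

```sql
-- Sessions holding or requesting TM (DML) enqueues.
-- LMODE = 6 is the exclusive table lock taken by parallel DML.
SELECT s.sid
     , s.blocking_session
     , l.lmode                      -- mode held (6 = exclusive)
     , l.request                    -- mode requested (3 = row-X)
     , o.object_name
FROM   v$lock l
JOIN   v$session s ON s.sid = l.sid
LEFT   JOIN dba_objects o ON o.object_id = l.id1
WHERE  l.type = 'TM'
AND    (l.lmode = 6 OR l.request > 0);
```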