The cost of using an index

When trying to understand why the optimiser might choose not to use an index to read rows from a table, I find the following diagram helpful.

index cost

The axis along the bottom represents the selectivity of the query (0% being no rows selected, 100% being all rows selected), and the axis up the side represents the cost of the chosen operation (more costly = more work to do).

When reading from a table without using an index, the cost of retrieving one row from the table is the same as the cost of retrieving all rows from the table (the red line on the chart). The database has to read the entire table in both cases.

When using an index to read from a table, the cost is proportional to the number of rows retrieved (the green line on the chart).

The point at which they cross over is the “break-even” point; if the selectivity of the query is above this point, expect the optimiser to ignore the index and scan the entire table.

There are a number of factors which determine the placement of these two lines, such as the size of the table, the size of the index, the way in which data is organised within the index, and so on. These are beyond the scope of this (brief) post.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s