Thursday, January 14, 2010

Derived Table, Subquery, Correlated Subquery, etc.

I was recently attempting to explain to someone the difference between derived tables and subqueries. I didn't do a great job and came to the conclusion that I wasn't completely clear on the terminology. I don't like not knowing things so I did a little research and this is what I found . . .

A derived table is not what I thought it was. I always referred to a derived table as being a query in a FROM clause. For example:

    SELECT
        id AS EmployeeId,
        name AS EmployeeFullName,
        COUNT(*) AS PaychecksIssuedCount
    FROM
        (SELECT
             id,
             name
          FROM
              employee
          WHERE
               employee_type_cd = 101) emp
          INNER JOIN paycheck ps
              on emp.id = ps.emp_id

Apparently this is actually called a subquery and can be referred to as a subquery in a FROM clause. It's a bit of symantics but a derived table is actually the result of a table subquery.

For some reason people that use SQL Server (like me) tend to refer to this as a derived table. I guess I'll have to stop doing that now.

As for correlated subqueries, these can be found in the list of projected columns as well as the WHERE and HAVING clauses. For example:

SELECT
    id,
    name
FROM
    employee emp
WHERE
    years_employed =  
        (SELECT
            AVG(years_employed)
         FROM
            employee emp_avg_years
         WHERE
            emp_avg_years.business_unit = emp.business_unit)

As you can see, the subquery has a WHERE clause that refers (correlates) to the outer/parent query.

So there you have it. Stop saying derived tables! Most likely you're misusing it the way I have ever since I learned about the concept (of subqueries - not derived tables ;)). Or maybe you're not as anal as I am. However, if you've read this far I'd say you are.

Wednesday, January 6, 2010

Should Foreign Key Columns be Indexed?

People seem to be confused when it comes to indexing of foreign key columns. For one, I believe some people assume that when you create a foreign key constraint (in SQL Server) an index is also created on the column(s) that make up the key/constraint. This is not the case.

That leads us to the original question . . . Should we create indexes on our foreign keys??

I believe that in most cases the answer is yes.

There are a few things to consider here:
1) Constraint checks
2) Cascading Updates and Deletes
2) Joins

1) If you delete a row in a table whose primary key or unique constraint is referenced by one or more foreign keys, SQL Server will search the foreign key tables to determine if the row can be deleted (assuming you have no cascade behavior defined). An index on the foreign key column(s) will speed this search.

2) The same is also true for update and delete cascades. If either are defined, the child/foreign key rows must be identified and updated and/or deleted. Again, the index helps find these rows more efficiently.

3) More often than not foreign key columns will be joined to primary key columns in queries. Joins are always more efficient when an index exists on both sides of the join.

If you ask me, it's a no-brainer to index your foreign key columns. How often do you have a database with foreign key relationships that don't fall under one of the 3 above scenarios??