Oracle question. Optimizing this query?

I have a query that goes something like this:

select * from table1
where table1.id in(select distinct id from stuff)
union
select * from table2
where table2.id in(select distinct id from other_stuff);

This query ran like crap. To change it, all I did was add a "group by id" to the end of the the nested queries. The id field can contain duplicates. I don’t understand why the second query runs so much better. Group by and distinct do practically the same thing and give me the same answer. What is Oracle doing that makes the first query suck so much?

Below is the form that works great. Why?

select * from table1
where table1.id in(select distinct id from stuff group by id)
union
select * from table2
where table2.id in(select distinct id from other stuff group by id);

Check the query plan on both queries and compare them (or post them here). Maybe that will help explain the difference.

BTW, if the values returned from both branches of the queries are different, then you may also be able to speed up the better performing query by using a UNION ALL instead of a UNION.

UNION removes duplicates (requiring more processing) whereas UNION ALL does not.

2 Responses to “Oracle question. Optimizing this query?”

  1. No idea but I tried to replicate this on Postgres (which is similar to Oracle) and didn’t get a benefit of the GROUP BY, it’s the same speed either way.
    References :

  2. Check the query plan on both queries and compare them (or post them here). Maybe that will help explain the difference.

    BTW, if the values returned from both branches of the queries are different, then you may also be able to speed up the better performing query by using a UNION ALL instead of a UNION.

    UNION removes duplicates (requiring more processing) whereas UNION ALL does not.
    References :

Leave a Reply