Skip to main content

Subqueries

Learn how to use nested queries (subqueries) to solve complex data problems.


What is a Subquery?

A subquery is a query nested inside another query. It can return:

  • A single value (scalar)
  • A single column of values (list)
  • A full result set (table)

Subquery Types

TypeReturnsUsed In
ScalarSingle valueSELECT, WHERE
ListColumn of valuesIN, NOT IN
TableMultiple columns/rowsFROM clause
CorrelatedDepends on outer queryWHERE, SELECT

Scalar Subqueries

Returns a single value. Can be used anywhere a single value is expected.

Test Case: Compare to Average

-- Find invoices above average amount
SELECT
id,
tranid,
amount,
(SELECT AVG(amount) FROM transaction WHERE type = 'CustInvc') AS avg_amount
FROM transaction
WHERE type = 'CustInvc'
AND amount > (SELECT AVG(amount) FROM transaction WHERE type = 'CustInvc')
ORDER BY amount DESC
LIMIT 20

Sample Result:

idtranidamountavg_amount
501INV-50125,0001,960
502INV-50218,5001,960
503INV-50315,2001,960

Test Case: Latest Order Date per Customer

-- Customer with their latest order date
SELECT
c.id,
c.companyname,
(SELECT MAX(trandate)
FROM transaction t
WHERE t.entity = c.id AND t.type = 'SalesOrd'
) AS last_order_date
FROM customer c
WHERE c.isinactive = 'F'
ORDER BY c.companyname
LIMIT 20

Subqueries with IN

Returns a list of values for comparison.

Test Case: Customers with Orders

-- Find customers who have placed at least one order
SELECT id, companyname, email
FROM customer
WHERE id IN (
SELECT DISTINCT entity
FROM transaction
WHERE type = 'SalesOrd'
)
ORDER BY companyname

Test Case: Items Never Sold

-- Find items that have never been sold
SELECT id, itemid, displayname
FROM item
WHERE isinactive = 'F'
AND id NOT IN (
SELECT DISTINCT item
FROM transactionline tl
INNER JOIN transaction t ON tl.transaction = t.id
WHERE t.type = 'SalesOrd'
AND tl.item IS NOT NULL
)
ORDER BY itemid

Test Case: Multi-Level IN

-- Find customers in states where we have sales reps
SELECT id, companyname, state
FROM customer
WHERE state IN (
SELECT DISTINCT state
FROM customer c2
INNER JOIN employee e ON c2.salesrep = e.id
WHERE e.isinactive = 'F'
)
ORDER BY state, companyname

Subqueries with EXISTS

Tests whether a subquery returns any rows. Often more efficient than IN.

Test Case: Customers WITH Orders (EXISTS)

-- Customers who have placed orders (using EXISTS)
SELECT c.id, c.companyname
FROM customer c
WHERE EXISTS (
SELECT 1
FROM transaction t
WHERE t.entity = c.id
AND t.type = 'SalesOrd'
)
ORDER BY c.companyname

Test Case: Customers WITHOUT Orders (NOT EXISTS)

-- Customers who have never ordered (using NOT EXISTS)
SELECT c.id, c.companyname, c.datecreated
FROM customer c
WHERE NOT EXISTS (
SELECT 1
FROM transaction t
WHERE t.entity = c.id
AND t.type = 'SalesOrd'
)
ORDER BY c.datecreated DESC

EXISTS vs IN: When to Use

Use EXISTS WhenUse IN When
Subquery returns many rowsSubquery returns few rows
You only need to check existenceYou need the actual values
Subquery is correlatedSubquery is independent
Better for large datasetsSimpler syntax

Correlated Subqueries

A subquery that references columns from the outer query. Executes once per outer row.

Illustration

Test Case: Customer's Order Count

-- Each customer with their order count (correlated)
SELECT
c.id,
c.companyname,
(SELECT COUNT(*)
FROM transaction t
WHERE t.entity = c.id
AND t.type = 'SalesOrd'
) AS order_count
FROM customer c
WHERE c.isinactive = 'F'
ORDER BY order_count DESC
LIMIT 20

Test Case: Above Average for Their Category

-- Items priced above their category average
SELECT
i.id,
i.itemid,
i.displayname,
i.class AS category,
i.baseprice,
(SELECT AVG(baseprice)
FROM item i2
WHERE i2.class = i.class
AND i2.isinactive = 'F'
) AS category_avg
FROM item i
WHERE i.isinactive = 'F'
AND i.baseprice > (
SELECT AVG(baseprice)
FROM item i2
WHERE i2.class = i.class
AND i2.isinactive = 'F'
)
ORDER BY i.class, i.baseprice DESC

Test Case: Latest Order for Each Customer

-- Each customer's most recent order details
SELECT
c.companyname,
t.tranid,
t.trandate,
t.amount
FROM transaction t
INNER JOIN customer c ON t.entity = c.id
WHERE t.type = 'SalesOrd'
AND t.trandate = (
SELECT MAX(t2.trandate)
FROM transaction t2
WHERE t2.entity = t.entity
AND t2.type = 'SalesOrd'
)
ORDER BY t.trandate DESC
LIMIT 20

Subqueries in FROM (Derived Tables)

Use a subquery as a virtual table.

Test Case: Customer Summary Report

-- Join customer data with order aggregates
SELECT
c.companyname,
c.email,
order_stats.order_count,
order_stats.total_sales,
order_stats.last_order
FROM customer c
INNER JOIN (
SELECT
entity,
COUNT(*) AS order_count,
SUM(amount) AS total_sales,
MAX(trandate) AS last_order
FROM transaction
WHERE type = 'SalesOrd'
GROUP BY entity
) order_stats ON c.id = order_stats.entity
ORDER BY order_stats.total_sales DESC
LIMIT 20

Test Case: Ranked Results

-- Top 5 items per category (derived table approach)
SELECT *
FROM (
SELECT
i.class AS category,
i.itemid,
i.displayname,
SUM(tl.amount) AS total_sales,
ROW_NUMBER() OVER (PARTITION BY i.class ORDER BY SUM(tl.amount) DESC) AS rank
FROM item i
INNER JOIN transactionline tl ON i.id = tl.item
INNER JOIN transaction t ON tl.transaction = t.id
WHERE t.type = 'SalesOrd'
GROUP BY i.id, i.class, i.itemid, i.displayname
) ranked
WHERE rank <= 5
ORDER BY category, rank

Subqueries in SELECT

Return a calculated value for each row.

Test Case: Multiple Calculated Columns

-- Customer dashboard with multiple metrics
SELECT
c.id,
c.companyname,
c.balance,
(SELECT COUNT(*) FROM transaction t
WHERE t.entity = c.id AND t.type = 'SalesOrd') AS total_orders,
(SELECT COUNT(*) FROM transaction t
WHERE t.entity = c.id AND t.type = 'CustInvc'
AND t.status != 'C') AS open_invoices,
(SELECT MAX(trandate) FROM transaction t
WHERE t.entity = c.id AND t.type = 'SalesOrd') AS last_order
FROM customer c
WHERE c.isinactive = 'F'
ORDER BY c.balance DESC
LIMIT 20

Test Case: Percentage Calculations

-- Each customer's share of total sales
SELECT
c.companyname,
SUM(t.amount) AS customer_sales,
ROUND(
SUM(t.amount) * 100.0 /
(SELECT SUM(amount) FROM transaction WHERE type = 'SalesOrd'),
2
) AS pct_of_total
FROM transaction t
INNER JOIN customer c ON t.entity = c.id
WHERE t.type = 'SalesOrd'
GROUP BY c.id, c.companyname
ORDER BY customer_sales DESC
LIMIT 10

Practical Examples

Example 1: Find Duplicate Records

-- Find customers with duplicate names
SELECT id, companyname, email
FROM customer
WHERE companyname IN (
SELECT companyname
FROM customer
GROUP BY companyname
HAVING COUNT(*) > 1
)
ORDER BY companyname

Example 2: Gap Analysis

-- Find missing order numbers (gaps in sequence)
SELECT seq_num AS missing_number
FROM (
SELECT LEVEL AS seq_num
FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(CAST(SUBSTR(tranid, 4) AS INTEGER)) FROM transaction WHERE type = 'SalesOrd')
) all_numbers
WHERE seq_num NOT IN (
SELECT CAST(SUBSTR(tranid, 4) AS INTEGER)
FROM transaction
WHERE type = 'SalesOrd'
AND tranid LIKE 'SO-%'
)

Example 3: Compare Current vs Previous Period

-- Sales comparison: current month vs previous month
SELECT
c.companyname,
(SELECT NVL(SUM(amount), 0) FROM transaction
WHERE entity = c.id AND type = 'SalesOrd'
AND trandate >= TRUNC(SYSDATE, 'MM')) AS current_month,
(SELECT NVL(SUM(amount), 0) FROM transaction
WHERE entity = c.id AND type = 'SalesOrd'
AND trandate >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)
AND trandate < TRUNC(SYSDATE, 'MM')) AS previous_month
FROM customer c
WHERE c.isinactive = 'F'
AND EXISTS (
SELECT 1 FROM transaction t
WHERE t.entity = c.id
AND t.trandate >= ADD_MONTHS(SYSDATE, -2)
)
ORDER BY current_month DESC
LIMIT 20

Example 4: Customers Needing Follow-up

-- Customers with no orders in last 90 days but had orders before
SELECT c.id, c.companyname, c.email, c.salesrep
FROM customer c
WHERE c.isinactive = 'F'
AND EXISTS (
-- Had orders more than 90 days ago
SELECT 1 FROM transaction t
WHERE t.entity = c.id
AND t.type = 'SalesOrd'
AND t.trandate < SYSDATE - 90
)
AND NOT EXISTS (
-- No orders in last 90 days
SELECT 1 FROM transaction t
WHERE t.entity = c.id
AND t.type = 'SalesOrd'
AND t.trandate >= SYSDATE - 90
)
ORDER BY c.companyname

Performance Tips

Avoid Correlated When Possible

-- SLOWER: Correlated subquery (runs for each row)
SELECT c.companyname,
(SELECT SUM(amount) FROM transaction WHERE entity = c.id) AS total
FROM customer c

-- FASTER: JOIN with aggregation (runs once)
SELECT c.companyname, NVL(SUM(t.amount), 0) AS total
FROM customer c
LEFT JOIN transaction t ON c.id = t.entity
GROUP BY c.id, c.companyname

Use EXISTS Instead of COUNT

-- SLOWER: Counts all matching rows
SELECT * FROM customer c
WHERE (SELECT COUNT(*) FROM transaction t WHERE t.entity = c.id) > 0

-- FASTER: Stops at first match
SELECT * FROM customer c
WHERE EXISTS (SELECT 1 FROM transaction t WHERE t.entity = c.id)

Use CTE for Reused Subqueries

-- BAD: Same subquery twice
SELECT *
FROM customer
WHERE balance > (SELECT AVG(balance) FROM customer)
AND creditlimit < (SELECT AVG(balance) FROM customer) * 2

-- GOOD: Calculate once with CTE
WITH AvgBalance AS (
SELECT AVG(balance) AS avg_bal FROM customer
)
SELECT c.*
FROM customer c, AvgBalance ab
WHERE c.balance > ab.avg_bal
AND c.creditlimit < ab.avg_bal * 2

Quick Reference

Subquery Locations

-- In WHERE clause
SELECT * FROM table1
WHERE column IN (SELECT column FROM table2)

-- In FROM clause (derived table)
SELECT * FROM (SELECT ... FROM table) AS derived

-- In SELECT clause (scalar)
SELECT column, (SELECT ... FROM table2) AS calc FROM table1

-- In HAVING clause
SELECT category, SUM(amount)
FROM table1
GROUP BY category
HAVING SUM(amount) > (SELECT AVG(total) FROM summary)

Common Patterns

-- Find in list
WHERE id IN (SELECT id FROM other_table)

-- Find not in list
WHERE id NOT IN (SELECT id FROM other_table WHERE column IS NOT NULL)

-- Check existence
WHERE EXISTS (SELECT 1 FROM other_table WHERE condition)

-- Compare to aggregate
WHERE amount > (SELECT AVG(amount) FROM same_table)