The SQL Question That Filters 70% of Candidates

The Interview Scenario

You’re sitting in a technical interview, feeling confident. You’ve written hundreds of SQL queries, optimized databases, and can JOIN tables in your sleep. Then comes the question:

“Given a sales table with sale_date, product_id, and amount, write a query to find the second highest sale amount for each product. If there’s only one sale amount for a product, return NULL.”

Sounds simple, right? Yet 70% of candidates get this wrong. Let’s unpack why.

The Table Structure

CREATE TABLE sales (
    sale_date DATE,
    product_id INT,
    amount DECIMAL(10, 2)
);

Sample data:

product_id | amount
-----------|-------
1          | 100.00
1          | 200.00
1          | 200.00  -- Duplicate highest!
1          | 150.00
2          | 300.00
2          | 300.00  -- Only one distinct amount
3          | 250.00  -- Only one sale total

The Trap: Three Levels of Understanding

Level 1: The Novice Attempt (30% of candidates)

The junior developer reaches for what they know:

SELECT product_id, amount
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY product_id ORDER BY amount DESC
    ) as rn
    FROM sales
) t
WHERE rn = 2;

What’s wrong? This fails spectacularly with our sample data:

  • Product 1: Returns 150.00 ✓ (but got lucky with ordering)
  • Product 2: Returns NULL ✗ (should return NULL, but got lucky again)
  • But what if Product 1 had amounts 200, 200, 150? Row 2 might be 200!

Level 2: The Intermediate Attempt (40% of candidates)

The mid-level engineer recognizes the need for distinct handling:

SELECT product_id, amount
FROM (
    SELECT *, DENSE_RANK() OVER (
        PARTITION BY product_id ORDER BY amount DESC
    ) as drnk
    FROM sales
) t
WHERE drnk = 2;

Closer but still wrong: With amounts 200, 200, 150, DENSE_RANK gives 200→rank1, 150→rank2. That works! But wait… with amounts 200, 200, 200, DENSE_RANK gives all rank1, so no row with rank2. Returns nothing when it should return NULL. Plus, if we have 200, 200, 150, 150, we get TWO rows with rank2!

Level 3: The Expert Solution (30% of candidates)

The senior engineer thinks through all edge cases:

WITH distinct_amounts AS (
    SELECT DISTINCT product_id, amount
    FROM sales
),
ranked_amounts AS (
    SELECT 
        product_id,
        amount,
        DENSE_RANK() OVER (
            PARTITION BY product_id ORDER BY amount DESC
        ) as drnk
    FROM distinct_amounts
)
SELECT 
    product_id,
    MAX(CASE WHEN drnk = 2 THEN amount END) as second_highest_amount
FROM ranked_amounts
GROUP BY product_id
HAVING COUNT(*) > 1;

Or more elegantly:

SELECT 
    product_id,
    MAX(amount) as second_highest_amount
FROM (
    SELECT 
        product_id,
        amount,
        DENSE_RANK() OVER (
            PARTITION BY product_id ORDER BY amount DESC
        ) as drnk
    FROM (
        SELECT DISTINCT product_id, amount
        FROM sales
    ) distinct_sales
) ranked
WHERE drnk = 2
GROUP BY product_id;

The Four Critical Insights

1. Distinct Amounts First

The phrase “second-highest amount” implies we’re talking about distinct values. Amounts 200, 200, 150 have two distinct amounts: 200 and 150. The second highest is 150.

2. NULL Means NULL, Not Empty

The requirement “return NULL” means the result set should have a row for every product, with NULL in the amount column when there’s no second-highest. Most solutions that filter with WHERE drnk = 2 eliminate products.

3. DENSE_RANK vs RANK vs ROW_NUMBER

  • ROW_NUMBER(): Sequential numbers, ignores duplicates entirely
  • RANK(): Leaves gaps (200→1, 200→1, 150→3)
  • DENSE_RANK(): No gaps (200→1, 200→1, 150→2) ← What we want!

4. The GROUP BY Trick

Using MAX(CASE...) with GROUP BY product_id ensures we get one row per product, with NULL where no second-highest exists.

The Ultimate Solution

Here’s the most robust version that handles all edge cases:

WITH product_amounts AS (
    -- First get distinct amounts per product
    SELECT DISTINCT product_id, amount
    FROM sales
),
ranked AS (
    -- Then rank the distinct amounts
    SELECT 
        product_id,
        amount,
        DENSE_RANK() OVER (
            PARTITION BY product_id 
            ORDER BY amount DESC
        ) as amount_rank
    FROM product_amounts
)
SELECT 
    product_id,
    -- Use conditional aggregation to handle the NULL requirement
    MAX(CASE WHEN amount_rank = 2 THEN amount END) 
        as second_highest_amount
FROM ranked
GROUP BY product_id
-- Optional: only include products with at least 2 distinct amounts
-- HAVING MAX(amount_rank) >= 2
ORDER BY product_id;

What This Question Really Tests

  1. Precision with language: “Second highest amount” vs “second highest sale.”
  2. Understanding of window functions: Not just syntax, but behavior
  3. Edge case consideration: Ties, single values, NULL handling
  4. Query structure: CTEs for readability, proper grouping
  5. SQL philosophy: Thinking in sets, not procedurally

Practice Variations

Master this pattern with these variations:

  • Nth highest amount (parameterized)
  • Second-highest sale date
  • Products with increasing sales for three months in a row
  • Gap analysis between the highest and the second highest

The Takeaway

This question isn’t about trickery—it’s about demonstrating you understand SQL at a deep level. The 70% failure rate reveals how many developers use SQL without fully understanding its semantics.

Next time you write a window function, ask yourself:

  • What happens with duplicates?
  • What about NULL values?
  • Does every group get represented?
  • What’s the difference between filtering and transforming?

The best SQL developers don’t just write queries—they understand how data flows through them.

1 thought on “The SQL Question That Filters 70% of Candidates”

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top