Advanced SQL Joins: Anti Join and Semi Join

πŸ’‘ Concept Name

Anti Join returns rows from the left table that have no match in the right, while a Semi Join returns rows from the left where a match exists, but only from the left table.

πŸ“˜ Quick Intro

Unlike regular joins that combine data, Anti and Semi joins filter data. They’re often expressed using EXISTS, NOT EXISTS, or LEFT JOIN WHERE IS NULL patterns. These constructs are powerful for complex filtering without merging rows.

🧠 Analogy / Short Story

Imagine you’re sending invites for a party. A semi join is like checking who RSVPed (they exist on both lists). An anti join is like finding people who haven’t replied (only on the original list). You don’t care about the RSVP detailsβ€”only who’s matched or not.

πŸ”§ Technical Explanation

  • βœ… Semi Join syntax: WHERE EXISTS
  • 🚫 Anti Join syntax: WHERE NOT EXISTS or LEFT JOIN ... WHERE right.id IS NULL
  • 🧠 No duplicate rows from the right tableβ€”only filters from left table.
  • πŸ”„ Used in subqueries, filtering, and performance-tuned queries.
  • πŸ’‘ Semi joins are useful for existence checks; anti joins for exclusion.

🎯 Purpose & Use Case

  • βœ… List customers who made at least one order (semi join).
  • βœ… Find employees who haven't submitted timesheets (anti join).
  • βœ… Filter users that match criteria in a lookup table (semi).
  • βœ… Identify missing relationships across systems (anti).

πŸ’» Real Code Example

-- SEMI JOIN: Customers with orders
SELECT c.CustomerID, c.Name
FROM Customers c
WHERE EXISTS (
  SELECT 1 FROM Orders o WHERE o.CustomerID = c.CustomerID
);

-- ANTI JOIN: Customers with no orders
SELECT c.CustomerID, c.Name
FROM Customers c
WHERE NOT EXISTS (
  SELECT 1 FROM Orders o WHERE o.CustomerID = c.CustomerID
);

-- Alternate ANTI JOIN using LEFT JOIN
SELECT c.CustomerID, c.Name
FROM Customers c
LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL;

❓ Interview Q&A

Q1: What is a semi join in SQL?
A: A semi join returns rows from the left table where a match exists in the right table, but doesn't return columns from the right.

Q2: How is a semi join implemented?
A: Using WHERE EXISTS subqueries or IN clauses.

Q3: What is an anti join?
A: An anti join returns only rows from the left table where no match exists in the right table.

Q4: What are two ways to perform an anti join?
A: Using WHERE NOT EXISTS or LEFT JOIN ... WHERE IS NULL.

Q5: Is there a keyword called SEMI JOIN in SQL?

A: No, it’s a logical concept implemented using EXISTS or IN.

Q6: Can semi/anti joins return columns from the right table?
A: No, they only affect filtering, not selection.

Q7: Which is better: EXISTS or IN?
A: EXISTS is generally better for correlated subqueries; IN works well with small static lists.

Q8: How can anti joins help in data cleanup?
A: By identifying orphan records or unmatched rows.

Q9: Are anti joins performant?
A: Yes, especially with proper indexing and EXISTS usage.

Q10: When should you use semi joins?
A: When you only need to check if related data exists, not return it.

πŸ“ MCQs

Q1. Which clause is used to create a semi join?

  • JOIN
  • NOT EXISTS
  • EXISTS
  • OUTER APPLY

Q2. Which SQL construct represents an anti join?

  • EXISTS
  • JOIN
  • IN
  • NOT EXISTS

Q3. What does an anti join return?

  • All matched rows
  • Only right table rows
  • Left rows with no right match
  • Nothing

Q4. Which pattern can simulate anti join?

  • FULL JOIN
  • LEFT JOIN + IS NULL
  • CROSS APPLY
  • HAVING COUNT(*) > 0

Q5. Can semi joins return right table columns?

  • Yes
  • No
  • Only if joined
  • Only if grouped

Q6. Is SEMI JOIN a SQL keyword?

  • Yes
  • No
  • In Oracle only
  • In MySQL only

Q7. Which is generally faster for anti joins?

  • LEFT JOIN
  • FULL JOIN
  • NOT EXISTS
  • IN

Q8. How does EXISTS work?

  • Returns row count
  • Checks for column names
  • Checks if subquery returns rows
  • Returns boolean column

Q9. Which join duplicates left table rows when matched?

  • ANTI JOIN
  • LEFT JOIN
  • INNER JOIN
  • SEMI JOIN

Q10. Which join is ideal for exclusion logic?

  • SEMI JOIN
  • ANTI JOIN
  • CROSS JOIN
  • SELF JOIN

πŸ’‘ Bonus Insight

In analytics and ETL processes, semi and anti joins are essential for identifying deltas, filtering datasets efficiently, and avoiding costly full joins. Some databases like Oracle and Spark optimize them behind the scenes. Prefer EXISTS for correlated queries and avoid NOT IN with NULLs.

πŸ“„ PDF Download

Need a handy summary for your notes? Download this topic as a PDF!

πŸ’¬ Feedback
πŸš€ Start Learning
Share:

Tags: