Advanced SQL Joins: Anti Join and Semi Join
π‘ Concept Name
Anti Join returns rows from the left table that have no match in the right, while a Semi Join returns rows from the left where a match exists, but only from the left table.
π Quick Intro
Unlike regular joins that combine data, Anti and Semi joins filter data. Theyβre often expressed using EXISTS
, NOT EXISTS
, or LEFT JOIN WHERE IS NULL
patterns. These constructs are powerful for complex filtering without merging rows.
π§ Analogy / Short Story
Imagine youβre sending invites for a party. A semi join is like checking who RSVPed (they exist on both lists). An anti join is like finding people who havenβt replied (only on the original list). You donβt care about the RSVP detailsβonly whoβs matched or not.
π§ Technical Explanation
- β
Semi Join syntax:
WHERE EXISTS
- π« Anti Join syntax:
WHERE NOT EXISTS
orLEFT JOIN ... WHERE right.id IS NULL
- π§ No duplicate rows from the right tableβonly filters from left table.
- π Used in subqueries, filtering, and performance-tuned queries.
- π‘ Semi joins are useful for existence checks; anti joins for exclusion.
π― Purpose & Use Case
- β List customers who made at least one order (semi join).
- β Find employees who haven't submitted timesheets (anti join).
- β Filter users that match criteria in a lookup table (semi).
- β Identify missing relationships across systems (anti).
π» Real Code Example
-- SEMI JOIN: Customers with orders
SELECT c.CustomerID, c.Name
FROM Customers c
WHERE EXISTS (
SELECT 1 FROM Orders o WHERE o.CustomerID = c.CustomerID
);
-- ANTI JOIN: Customers with no orders
SELECT c.CustomerID, c.Name
FROM Customers c
WHERE NOT EXISTS (
SELECT 1 FROM Orders o WHERE o.CustomerID = c.CustomerID
);
-- Alternate ANTI JOIN using LEFT JOIN
SELECT c.CustomerID, c.Name
FROM Customers c
LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL;

β Interview Q&A
Q1: What is a semi join in SQL?
A: A semi join returns rows from the left table where a match exists in the right table, but doesn't return columns from the right.
Q2: How is a semi join implemented?
A: Using WHERE EXISTS
subqueries or IN
clauses.
Q3: What is an anti join?
A: An anti join returns only rows from the left table where no match exists in the right table.
Q4: What are two ways to perform an anti join?
A: Using WHERE NOT EXISTS
or LEFT JOIN ... WHERE IS NULL
.
Q5: Is there a keyword called SEMI JOIN in SQL?
A: No, itβs a logical concept implemented using EXISTS
or IN
.
Q6: Can semi/anti joins return columns from the right table?
A: No, they only affect filtering, not selection.
Q7: Which is better: EXISTS or IN?
A: EXISTS
is generally better for correlated subqueries; IN
works well with small static lists.
Q8: How can anti joins help in data cleanup?
A: By identifying orphan records or unmatched rows.
Q9: Are anti joins performant?
A: Yes, especially with proper indexing and EXISTS usage.
Q10: When should you use semi joins?
A: When you only need to check if related data exists, not return it.
π MCQs
Q1. Which clause is used to create a semi join?
- JOIN
- NOT EXISTS
- EXISTS
- OUTER APPLY
Q2. Which SQL construct represents an anti join?
- EXISTS
- JOIN
- IN
- NOT EXISTS
Q3. What does an anti join return?
- All matched rows
- Only right table rows
- Left rows with no right match
- Nothing
Q4. Which pattern can simulate anti join?
- FULL JOIN
- LEFT JOIN + IS NULL
- CROSS APPLY
- HAVING COUNT(*) > 0
Q5. Can semi joins return right table columns?
- Yes
- No
- Only if joined
- Only if grouped
Q6. Is SEMI JOIN a SQL keyword?
- Yes
- No
- In Oracle only
- In MySQL only
Q7. Which is generally faster for anti joins?
- LEFT JOIN
- FULL JOIN
- NOT EXISTS
- IN
Q8. How does EXISTS work?
- Returns row count
- Checks for column names
- Checks if subquery returns rows
- Returns boolean column
Q9. Which join duplicates left table rows when matched?
- ANTI JOIN
- LEFT JOIN
- INNER JOIN
- SEMI JOIN
Q10. Which join is ideal for exclusion logic?
- SEMI JOIN
- ANTI JOIN
- CROSS JOIN
- SELF JOIN
π‘ Bonus Insight
In analytics and ETL processes, semi and anti joins are essential for identifying deltas, filtering datasets efficiently, and avoiding costly full joins. Some databases like Oracle and Spark optimize them behind the scenes. Prefer EXISTS
for correlated queries and avoid NOT IN
with NULLs.
π PDF Download
Need a handy summary for your notes? Download this topic as a PDF!