Does MySQL EXISTS Really Check Every Row?

When writing queries, you might come across code like this.

SELECT u.id, u.name, u.email, u.phone, u.created_at
FROM user u
WHERE u.user_type = 2
  AND u.is_deleted = 0
  AND NOT EXISTS (
    SELECT 1 FROM task t
    WHERE t.assigned_user_id = u.id
  );

This query finds users who haven't been assigned any tasks. But looking at EXISTS, something feels off. Since it means "check if exists," wouldn't it scan the task table for every single row in the user table? If the tables are large, wouldn't this be extremely slow?

The short answer is no. EXISTS works smarter than you might think.

How EXISTS Actually Works

This is where many developers get confused. They assume EXISTS runs against every row in the user table. But in reality, the other conditions in the WHERE clause filter first.

Let's think about the execution order of the query above.

  1. Filter the user table with u.user_type = 2
  2. Further filter with u.is_deleted = 0
  3. Run the EXISTS subquery only on the remaining rows

If there are 1 million rows in the user table, but only 1,000 users have user_type = 2 and is_deleted = 0, EXISTS runs just 1,000 times. Not 1 million.

This is possible because the MySQL optimizer rearranges WHERE clause conditions in an efficient order and filters out rows as early as possible. With proper indexes on user_type and is_deleted, most rows can be quickly excluded.

The Power of Short-circuit Evaluation

Another advantage of EXISTS is short-circuit evaluation. When the subquery finds the first matching row, it immediately returns true and doesn't check the rest.

NOT EXISTS (
  SELECT 1 FROM task t
  WHERE t.assigned_user_id = u.id
)

This subquery only needs to check if there's at least one task with assigned_user_id = u.id. Zero tasks means false, one or more means true. That's why we use SELECT 1. We don't need actual data, just existence verification.

The difference becomes clear when compared to IN.

-- IN fetches all values and compares
WHERE u.id IN (SELECT assigned_user_id FROM task)

-- EXISTS only checks for existence
WHERE EXISTS (SELECT 1 FROM task WHERE task.assigned_user_id = u.id)

IN loads all subquery results into memory for comparison, but EXISTS doesn't need to. This is especially efficient for NOT EXISTS cases.

Indexes Make It Faster

EXISTS truly shines when combined with indexes. If there's an index on task.assigned_user_id, MySQL can traverse the B-Tree and check if a task exists for a given user in O(log n) time.

Without an index? That's when Full Table Scan kicks in. It has to iterate through every row in the task table checking if assigned_user_id = u.id. This can indeed be slow.

-- Create an index
CREATE INDEX idx_task_assigned_user ON task(assigned_user_id);

This single line can dramatically improve query performance.

How to Verify

If you're curious about how a query executes, use EXPLAIN.

EXPLAIN SELECT u.id, u.name, u.email, u.phone, u.created_at
FROM user u
WHERE u.user_type = 2
  AND u.is_deleted = 0
  AND NOT EXISTS (
    SELECT 1 FROM task t
    WHERE t.assigned_user_id = u.id
  );

Key points to check in EXPLAIN results:

  • type: index or ref means using an index. ALL means Full Table Scan
  • rows: Estimated number of rows to scan
  • Extra: Additional info like Using where, Using index

If the task table shows type ALL, consider adding an index.

Summary

EXISTS might look like it simply "checks for existence," but it actually operates in a well-optimized manner.

  • Other WHERE clause conditions filter first, and EXISTS only runs on remaining rows
  • Short-circuit evaluation stops at the first match
  • With indexes, B-Tree traversal finds results in O(log n)

If a query feels slow, check whether proper indexes exist before blaming EXISTS itself. EXPLAIN will quickly reveal where the bottleneck is.