Difference between Includes and Joins in Ruby on Rails

Welcome to the first post in our code optimization series. In this post, we'll demystify Includes and Joins in Ruby on Rails.

Although they might seem similar, they generate different queries and serve different purposes. By the end of this post, you'll have a clear understanding of when to use "Includes" and when to use "Joins" in your Ruby on Rails projects.

# Includes:

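Use includes when you need to load data from an associated table alongside your main data. It eager-loads the association up front, so reading the associated records later does not trigger one extra query per record (the N+1 problem).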

# Joins:

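Use joins when you want to treat data from the joined table as a condition without needing any of its specific attributes. It adds an INNER JOIN to the query but does not load the associated records, so reading an association afterwards is lazy: each access runs its own query.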


The example below shows a case where includes is the better choice over joins, along with benchmark results for both methods.

class User < ApplicationRecord
  has_many :posts, dependent: :destroy
end

class Post < ApplicationRecord
  belongs_to :user
end

Now suppose we want to display each post's title and its user's name.

With includes, which eager-loads the association, the queries are as follows:

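require "benchmark" # already loaded in a Rails console

puts Benchmark.measure {
  # includes eager-loads each post's user up front
  posts = Post.includes(:user)
  posts.each do |post|
    post.title
    post.user.name # read from the preloaded users, no extra query
  end
}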

Post Load (0.9ms)  SELECT  `posts`.* FROM `posts` LIMIT 11
User Load (1.3ms)  SELECT `users`.* FROM `users` WHERE `users`.`id` IN (1, 2, 3)

0.004173   0.000413   0.004012 (  0.007924)
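The four numbers printed by `Benchmark.measure` are user CPU time, system CPU time, their total, and elapsed real time in parentheses, all in seconds. The sketch below shows how to read them individually; the workload inside the block is a hypothetical stand-in for the query loop, not the measurement from this post.

```ruby
require "benchmark"

# Time a small stand-in workload (hypothetical; swap in the query loop).
tms = Benchmark.measure do
  10_000.times.map { |i| i * 2 }.sum
end

# Default formatting: user, system, total, then (real) seconds.
puts tms

# The same values are available as separate fields on Benchmark::Tms.
puts format(
  "user=%.6f system=%.6f total=%.6f real=%.6f",
  tms.utime, tms.stime, tms.total, tms.real
)
```

For database-bound code, the real (elapsed) column is usually the one to compare, since time spent waiting on the database is not counted as CPU time of the Ruby process.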

With joins, the associated users are loaded lazily, and the queries are as follows:

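puts Benchmark.measure {
  # joins only adds an INNER JOIN; user records are not loaded
  posts = Post.joins(:user)
  posts.each do |post|
    post.title
    post.user.name # triggers one User query per post (N+1)
  end
}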

Post Load (1.0ms)  SELECT `posts`.* FROM `posts` INNER JOIN `users` ON `users`.`id` = `posts`.`user_id`
User Load (0.2ms)  SELECT  `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
User Load (0.3ms)  SELECT  `users`.* FROM `users` WHERE `users`.`id` = 2 LIMIT 1
User Load (0.2ms)  SELECT  `users`.* FROM `users` WHERE `users`.`id` = 3 LIMIT 1

0.007273   0.000809   0.008082 (  0.012924)
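The difference in query counts in the logs above (2 queries versus 1 + one per post) can be modeled without a database. The following is a plain-Ruby sketch, not ActiveRecord itself: `FakeDb`, `lazy_rows`, and `eager_rows` are hypothetical names that only count how many "queries" each loading strategy would issue.

```ruby
# Hypothetical in-memory stand-in for a database that counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(users, posts)
    @users = users
    @posts = posts
    @query_count = 0
  end

  # Models: SELECT posts.* FROM posts
  def all_posts
    @query_count += 1
    @posts
  end

  # Models: SELECT users.* FROM users WHERE users.id = ? LIMIT 1
  def find_user(id)
    @query_count += 1
    @users.find { |user| user[:id] == id }
  end

  # Models: SELECT users.* FROM users WHERE users.id IN (...)
  def find_users(ids)
    @query_count += 1
    @users.select { |user| ids.include?(user[:id]) }
  end
end

# Lazy loading: one query for posts, then one per post for its user.
def lazy_rows(db)
  db.all_posts.map { |post| [post[:title], db.find_user(post[:user_id])[:name]] }
end

# Eager loading: one query for posts, one batched query for all users.
def eager_rows(db)
  posts = db.all_posts
  user_ids = posts.map { |post| post[:user_id] }.uniq
  users_by_id = db.find_users(user_ids).to_h { |user| [user[:id], user] }
  posts.map { |post| [post[:title], users_by_id[post[:user_id]][:name]] }
end

users = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }, { id: 3, name: "Cy" }]
posts = [
  { id: 1, title: "First", user_id: 1 },
  { id: 2, title: "Second", user_id: 2 },
  { id: 3, title: "Third", user_id: 3 }
]

lazy_db = FakeDb.new(users, posts)
eager_db = FakeDb.new(users, posts)
lazy_result = lazy_rows(lazy_db)
eager_result = eager_rows(eager_db)

puts "lazy queries:  #{lazy_db.query_count}"  # 1 + 3 posts = 4
puts "eager queries: #{eager_db.query_count}" # 2
```

Both strategies return the same rows; only the number of round trips differs, and the lazy count grows with the number of posts while the eager count stays at two.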

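In this benchmark, includes is the better choice: it finishes in about 0.008 s of real time with 2 queries, while joins takes about 0.013 s with 4 queries. But if you only need the posts belonging to user_id 1 and never read any user data, go for 'joins' instead of 'includes', since there is nothing to eager-load.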

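puts Benchmark.measure {
  # The join acts only as a filter; no User records are loaded
  posts = Post.joins(:user).where(user_id: 1)
  posts.each do |post|
    post.title # only post attributes are read, so no extra queries run
  end
}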
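To make "using the joined table as a condition" concrete, here is a plain-Ruby model of an INNER JOIN used purely as a filter. It is a sketch with arrays of hashes, not ActiveRecord; the `active` column and the `posts_joined_on_users` helper are hypothetical and not part of the models above.

```ruby
users = [
  { id: 1, name: "Ann", active: true },
  { id: 2, name: "Bob", active: false }
]
posts = [
  { id: 1, title: "First", user_id: 1 },
  { id: 2, title: "Second", user_id: 2 },
  { id: 3, title: "Third", user_id: 1 },
  { id: 4, title: "Orphan", user_id: 99 } # no matching user row
]

# Keeps only posts that have a matching user satisfying the block.
# Roughly what a call such as
#   Post.joins(:user).where(users: { active: true })
# expresses, assuming a hypothetical boolean `active` column on users.
def posts_joined_on_users(posts, users)
  users_by_id = users.to_h { |user| [user[:id], user] }
  posts.select do |post|
    user = users_by_id[post[:user_id]]
    user && yield(user)
  end
end

# Filter on a user attribute; the result contains post data only.
active_titles = posts_joined_on_users(posts, users) { |user| user[:active] }
                .map { |post| post[:title] }

# With no extra condition, the inner join still drops posts without a user.
joined_titles = posts_joined_on_users(posts, users) { true }
                .map { |post| post[:title] }

puts active_titles.inspect
puts joined_titles.inspect
```

The user rows are consulted for filtering but never returned, which mirrors why joins is a good fit when no user attributes are displayed.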

# Conclusion

Use includes when you need to fetch associated records along with the primary records to reduce the number of database queries and improve performance.

Use joins when you need to filter or sort the primary records by conditions on associated tables and don't need the associated records themselves loaded.