Difference between Includes and Joins in Ruby on Rails
Welcome to our very first blog post in the code optimization series. In this blog, we'll be demystifying the concepts of includes and joins in Ruby on Rails.
Although they might seem similar, they're quite different under the hood and have specific purposes. By the end of this blog, you'll have a clear understanding of when to use "Includes" and when to use "Joins" in your Ruby on Rails projects.
# Includes:
Use includes when you need the data from an associated table alongside your main records. It eager loads the association up front, in a small fixed number of queries, instead of querying again for each record.
# Joins:
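Use joins when you only need the associated table as a condition, for example to filter results, and don't need any of its attributes. joins adds an INNER JOIN to the query but does not load the associated records, so the association stays lazily loaded: accessing it later triggers a separate query for each record.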
We'll walk through an example where includes is the better choice over joins, along with benchmark results showing which method is faster. The example uses the following two models:
class User < ApplicationRecord
  has_many :posts, dependent: :destroy
end

class Post < ApplicationRecord
  belongs_to :user
end
Now suppose we want to display each post's title along with its user's name.
Using includes, which eager loads the association, the code and the resulting queries are as follows:
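require "benchmark"

puts Benchmark.measure {
  posts = Post.includes(:user) # eager loads the users in one extra query
  posts.each do |post|
    post.title
    post.user.name # read from the preloaded users, no extra query
  end
}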
Post Load (0.9ms) SELECT `posts`.* FROM `posts`
User Load (1.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` IN (1, 2, 3)
0.004173 0.000413 0.004012 ( 0.007924)
Using joins, which leaves the association lazily loaded, the code and the resulting queries are as follows:
puts Benchmark.measure {
  posts = Post.joins(:user) # INNER JOIN only, users are not loaded
  posts.each do |post|
    post.title
    post.user.name # triggers one User query per post (N+1)
  end
}
Post Load (1.0ms) SELECT `posts`.* FROM `posts` INNER JOIN `users` ON `users`.`id` = `posts`.`user_id`
User Load (0.2ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 2 LIMIT 1
User Load (0.2ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 3 LIMIT 1
0.007273 0.000809 0.008082 ( 0.012924)
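To make the difference in query counts concrete, here is a minimal pure-Ruby sketch that simulates the two loading strategies against an in-memory "database". It is not ActiveRecord: FakeDB, lazy_names, and eager_names are hypothetical names invented for this illustration, and each method call on FakeDB stands in for one SQL query.

```ruby
# Pure-Ruby simulation. FakeDB and its methods are hypothetical, not ActiveRecord.
class FakeDB
  attr_reader :query_count

  def initialize
    @users = { 1 => "Alice", 2 => "Bob", 3 => "Carol" }
    @posts = [
      { id: 1, title: "First",  user_id: 1 },
      { id: 2, title: "Second", user_id: 2 },
      { id: 3, title: "Third",  user_id: 3 }
    ]
    @query_count = 0
  end

  # One query: fetch every post.
  def all_posts
    @query_count += 1
    @posts.map(&:dup)
  end

  # One query: fetch a single user by id (what a lazily loaded association does).
  def find_user(id)
    @query_count += 1
    @users.fetch(id)
  end

  # One query: fetch many users at once (what eager loading does).
  def find_users(ids)
    @query_count += 1
    @users.slice(*ids)
  end
end

# Lazy: one query for the posts, then one more query per post for its user.
def lazy_names(db)
  db.all_posts.map { |post| db.find_user(post[:user_id]) }
end

# Eager: one query for the posts, one batched query for all of their users.
def eager_names(db)
  posts = db.all_posts
  users = db.find_users(posts.map { |post| post[:user_id] }.uniq)
  posts.map { |post| users.fetch(post[:user_id]) }
end

lazy_db = FakeDB.new
lazy_names(lazy_db)

eager_db = FakeDB.new
eager_names(eager_db)

puts "lazy:  #{lazy_db.query_count} queries" # 1 + 3 = 4
puts "eager: #{eager_db.query_count} queries" # 2
```

With three posts the gap is small, but the lazy count grows with the number of posts while the eager count stays at two.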
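Looking at the benchmarks, includes is preferred here: it issues two queries in total, while joins issues one query for the posts plus one more per post to load its user. But if you only need the posts belonging to the user with id 1 and never read any user data, go for joins instead of includes.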
puts Benchmark.measure {
  posts = Post.joins(:user).where(user_id: 1)
  posts.each do |post|
    post.title
  end
}
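The following pure-Ruby sketch illustrates what using a join purely as a filter means: rows from the joined table are consulted to decide which posts to keep, but only post rows come back. The USERS and POSTS arrays and the posts_joined_with_users helper are hypothetical stand-ins for the database tables and the SQL INNER JOIN, not part of Rails. In ActiveRecord, a condition on a joined column is typically written along the lines of Post.joins(:user).where(users: { name: "Alice" }).

```ruby
# Hypothetical in-memory tables standing in for `users` and `posts`.
USERS = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" }
].freeze

POSTS = [
  { id: 1, title: "First",  user_id: 1 },
  { id: 2, title: "Second", user_id: 2 },
  { id: 3, title: "Third",  user_id: 1 },
  { id: 4, title: "Orphan", user_id: 99 } # no matching user
].freeze

# Mimics: SELECT posts.* FROM posts
#         INNER JOIN users ON users.id = posts.user_id
#         WHERE <condition on users>
# User rows are only consulted for the condition; post rows are returned.
def posts_joined_with_users(posts, users)
  users_by_id = users.to_h { |user| [user[:id], user] }
  posts.select do |post|
    user = users_by_id[post[:user_id]]
    user && yield(user) # an inner join drops posts that have no matching user
  end
end

alice_posts = posts_joined_with_users(POSTS, USERS) { |user| user[:name] == "Alice" }
puts alice_posts.map { |post| post[:title] }.inspect # ["First", "Third"]
```

Note that the returned hashes carry no user attributes at all, which mirrors why reading post.user after a plain joins still costs an extra query.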
Conclusion
Use includes when you need to fetch associated records along with the primary records to reduce the number of database queries and improve performance.
Use joins when you need to query across multiple tables, for example to filter the primary records by conditions on associated records, and don't need the associated data itself to be loaded.