Today, something about
UPDATE 14.06.2018: Updated information about #includes and #preload methods.
The typical and old problem well known to all experienced coders. What is it about? Let's assume that we have Clients table. Each record in that table is represented by Client model in our simple app. We also have table Addresses and corresponding Address model. These entities are related - one client can have many addresses and address has exactly one owner (Client). Simple and typical one-to-many association.
The problem occurs when we try to fetch 1-to-many relationship for more than one parent object (in our case, it's Client).
(calling "to_a" forces our model to fetch data from database)
This code will trigger
I do not want to say that is something wrong with that approach. Not at all. It is called "lazy loading" and could be really useful, but it really depends
We will use #preload method offered by Rails' ActiveRecord.
What we achieved by that? Code above will trigger only two queries! One to fetch all the clients and one to fetch associated addresses for all clients at once. Then 1001, now 2. Sounds good.
No magic or complicated logic here :) Simple SELECT for clients and SELECT (using IN clause) for addresses.
#eager_load also preloads association but it does it in one query! Ok... do not get too excited :) Less not always means better
Any magic here? No, no really. It just creates SQL query using LEFT OUTER JOIN and this type of join is very important. LEFT OUTER JOIN guarantees that we will receive all clients - even if they do not have any associated addresses (in opposite to INNER JOIN - see below).
There is some magic behind #includes method but documentation is not clear about this. How many queries will be called depends on what additional instructions we use (where / order etc). If we add additional condition related to preloading associations, this method will behave like eager_load method. Otherwise, like preload method.
Code above will behave like we would use preload method (two queries will be triggered).
This code will trigger one query (using LEFT OUTER JOIN) so just like in eager_load method.
Not marked as a solution because #joins method does not preload assocations! It is just an interface to create SQL query using INNER JOIN. Because of how SQL's INNER JOIN works, the following code snippet will not return all of our clients
Why? It will skip records for which join condition is satisfied - in our case: it will skip clients without any addresses so we will receive
No preloading any associations right here. Calling
will trigger 1001 queries to our database.
Another important thing: Assume that 600 of our clients have associated at least one address. Will we receive the result with 600 client objects? Answer is: 600 is the minimal amount. Again because how SQL's INNER JOIN works. Such query will return record for all matching pairs so each client with each address separately. We can easily get rid of duplicates using
Let's compare the performance of each solution. In my database, I have 1000 clients and each client has 3 addresses associated. We will measure the
|includes / preload||0.142||2||128105 (1070)|
While comparing lazy-loading to includes, we observe 88% time reduction! We can also see benefits in the amount of allocated memory. Brilliant! Does it mean that we should always preload all associations?
No, absolutely not! We do not always need to preload related data. Sometimes lazy-loading is very useful and more efficient. It really depends on the situation.
I just want you to know that "N+1 queries" problem exists, how to deal with it and whether fighting with it can be beneficial to our app.