codedecoder

breaking into the unknown…



copy object in ruby

Whenever you pass an object to another method for manipulation, you should pass a copy rather than the original, so that the original object is not put at risk. See the example below.

We have the classes below in bank_account.rb

class BankAccount
  attr_accessor :name, :balance
  def initialize(name,balance)
    @name = name
    @balance = balance
  end
end

class Invoice
  def print_account_holder_name(account_holder)
    puts account_holder.name.upcase
    account_holder.balance = 1000000
  end
end

BankAccount is the class which holds the details of accounts. The Invoice class generates the account report. Let us demonstrate the risk involved on the console.

1.9.3-p194 :014 > load "/home/arun/Documents/bank_account.rb" # give path of your bank_account.rb
1.9.3-p194 :016 > a = BankAccount.new("arun", 10000)
=> #<BankAccount:0xc9553d8 @name="arun", @balance=10000>
1.9.3-p194 :017 > Invoice.new.print_account_holder_name(a)
ARUN
1.9.3-p194 :019 > a.balance
=> 1000000

So you can see that when the print_account_holder_name method of the Invoice class is passed an object of the BankAccount class, it accidentally or intentionally (in our case) increased the balance from 10000 to 1000000. This can be avoided by passing a copy of the object rather than the original object.

With the above background, we now understand the need to create a copy of an object before passing it to another method or manipulating it. Let us see how we can do that, i.e. copy an object.

In Ruby, everything is an object, even the basic data types. Consciously or unconsciously, you have probably relied on the assignment operator many times while programming and treated the result as a copy.

Try this in the console

1.9.3-p194 :020 > a = 1
=> 1
1.9.3-p194 :023 > b = a
=> 1
1.9.3-p194 :024 > b += 1
=> 2
1.9.3-p194 :025 > a
=> 1
1.9.3-p194 :026 > b
=> 2

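It looks as if b was created as a copy of a: when you add 1 to b, a does not change. Strictly speaking, though, assignment never copies anything. b = a makes both names refer to the same object, and b += 1 then points b at a new object (2) while a still refers to 1. This is safe for immutable values such as integers, but not for objects that can be modified in place, such as arrays or strings.

See the above again with an array value.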

1.9.3-p194 :027 > a = [1,2]
=> [1, 2]
1.9.3-p194 :028 > b = a
=> [1, 2]
1.9.3-p194 :029 > b << 3
=> [1, 2, 3]
1.9.3-p194 :030 > a
=> [1, 2, 3]

So you can see that creating a copy with the assignment operator = did not work: the array assigned to a also changed when b was modified. This is because b << 3 modifies the Array object in place instead of creating a new one. The assignment operator does not make a copy of the value; it simply copies the reference to the Array object. The a and b variables are now references to the same Array object, so any change made through either variable is seen through the other.

Here dup and clone come to our rescue.
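The difference between changing an object in place and rebinding a variable can be checked with a small script. This is only a sketch: the helper names mutate_through_alias and rebind_alias are made up for illustration and are not part of the original example.

```ruby
# Assignment copies the reference, so both names see in-place changes.
def mutate_through_alias(list)
  other = list          # same object, second name
  other << 3            # Array#<< changes the object itself
  list.equal?(other)    # true: there is still only one shared object
end

# += builds a new object and rebinds only the local name.
def rebind_alias(number)
  other = number
  other += 1            # other now points at a different Integer
  [number, other]
end

a = [1, 2]
puts mutate_through_alias(a)   # true
puts a.inspect                 # [1, 2, 3]
puts rebind_alias(1).inspect   # [1, 2]

s = String.new("abc")          # strings can be modified in place too
t = s
t << "d"
puts s                         # abcd
```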

1.9.3-p194 :031 > a = [1,2]
=> [1, 2]
1.9.3-p194 :032 > b = a.dup # a.clone also works
=> [1, 2]
1.9.3-p194 :033 > b << 3
=> [1, 2, 3]
1.9.3-p194 :034 > a
=> [1, 2]

So now you can see that with dup or clone, changes to the copy do not alter the original object a. Good… let us check the real scenario we created in bank_account.rb.

1.9.3-p194 :039 > load "/home/arun/Documents/bank_account.rb"
=> true
1.9.3-p194 :040 > a = BankAccount.new("arun", 10000)
=> #<BankAccount:0xc94549c @name="arun", @balance=10000>
1.9.3-p194 :041 > b = a.clone
=> #<BankAccount:0xc936ce4 @name="arun", @balance=10000>
1.9.3-p194 :042 > c = a.dup
=> #<BankAccount:0xc929fa8 @name="arun", @balance=10000>
1.9.3-p194 :043 > Invoice.new.print_account_holder_name(b)
ARUN
1.9.3-p194 :044 > Invoice.new.print_account_holder_name(c)
ARUN
1.9.3-p194 :046 > a.balance
=> 10000

O.K., so this time we passed the copy b (created with clone) and the copy c (created with dup) to the print_account_holder_name method of the Invoice class, and you can see that the balance of the original object a is safe.

Great… from now on we will create a copy of a sensitive object before handing it over to any other interface for manipulation. But what is this… both dup and clone seem to be doing the same thing: copying the object. Yes, superficially they are the same, but they do behave differently. The difference can be stated in a single line as below.

clone preserves the internal characteristics of the copied object (its frozen state and singleton methods), but dup does not.

Let us see the difference in the console.

=> dup does not maintain the frozen state of the object
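One caveat is worth checking for yourself: dup and clone make shallow copies, so the copy gets its own instance variables, but those variables still point at the same attribute objects as the original. The sketch below reuses the BankAccount class from above. The in-place upcase! call is an addition for illustration and does not appear in the Invoice class.

```ruby
class BankAccount
  attr_accessor :name, :balance

  def initialize(name, balance)
    @name = name
    @balance = balance
  end
end

original = BankAccount.new(String.new("arun"), 10_000)
copy = original.dup

copy.balance = 1_000_000   # reassignment touches only the copy
puts original.balance      # 10000

copy.name.upcase!          # in-place change to the String both objects share
puts original.name         # ARUN
```

So a copy protects the original against reassigned attributes, but not against methods that modify an attribute object in place.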

1.9.3-p194 :011 > a = BankAccount.new("arun", 10000)
=> #<BankAccount:0x9997470 @name="arun", @balance=10000>
1.9.3-p194 :012 > a.freeze
=> #<BankAccount:0x9997470 @name="arun", @balance=10000>
1.9.3-p194 :013 > a.frozen?
=> true
1.9.3-p194 :014 > b = a.dup
=> #<BankAccount:0x99a1254 @name="arun", @balance=10000>
1.9.3-p194 :015 > b.frozen?
=> false
1.9.3-p194 :016 > c = a.clone
=> #<BankAccount:0x99a5cf0 @name="arun", @balance=10000>
1.9.3-p194 :017 > c.frozen?
=> true

So you can see that object a is frozen. When it is copied with dup the copy is unfrozen, but the copy made with clone stays frozen.

 

=> dup does not copy the singleton methods of the object.

1.9.3-p194 :018 > a = BankAccount.new("arun", 10000)
=> #<BankAccount:0x99b1e38 @name="arun", @balance=10000>
1.9.3-p194 :019 > def a.iam_singleton
1.9.3-p194 :020?>   end

1.9.3-p194 :021 > a.methods
=> [:iam_singleton, :name, :name=, :balance, :balance=, :clone, :dup, :initialize_dup, :initialize_clone…]

1.9.3-p194 :023 > b = a.dup
=> #<BankAccount:0x99c3c3c @name="arun", @balance=10000>
1.9.3-p194 :024 > b.methods
=> [:name, :name=, :balance, :balance=, :clone, :dup, :initialize_dup, :initialize_clone…]

1.9.3-p194 :025 > c = a.clone
=> #<BankAccount:0x99d2110 @name="arun", @balance=10000>
1.9.3-p194 :026 > c.methods
=> [:iam_singleton, :name, :name=, :balance, :balance=, :clone, :dup, :initialize_dup, :initialize_clone…]

So we added the iam_singleton method to the object a. It is available on the copy c created with clone, but not on the copy b created with dup.

NOTE: You can't make a copy of every object. For example, a Fixnum can't be copied in the Ruby 1.9.3 used here. From Ruby 2.4 onward, 1.dup and 1.clone simply return 1 instead of raising.

1.9.3-p194 :033 > 1.dup
TypeError: can't dup Fixnum
1.9.3-p194 :033 > 1.clone
TypeError: can't clone Fixnum
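Both differences can also be checked in one short script instead of reading through the methods list. This is a sketch that uses respond_to? in place of the methods call shown above, with the same BankAccount class.

```ruby
class BankAccount
  attr_accessor :name, :balance

  def initialize(name, balance)
    @name = name
    @balance = balance
  end
end

account = BankAccount.new("arun", 10_000)

# Define the singleton method first; a frozen object cannot take new ones.
def account.iam_singleton
  "only on this object"
end
account.freeze

dup_copy = account.dup
clone_copy = account.clone

puts dup_copy.frozen?                        # false
puts clone_copy.frozen?                      # true
puts dup_copy.respond_to?(:iam_singleton)    # false
puts clone_copy.respond_to?(:iam_singleton)  # true
```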

 

NOTE: You can customize dup and clone by redefining the initialize_dup and initialize_clone methods in your class, thereby overriding the default behaviour. Both initialize_dup and initialize_clone call the initialize_copy method internally.

O.K., so we now know the difference between dup and clone. But which one should we use? Well, it is up to you. Personally, I prefer dup, because I know that I only have a copy of the object, so I can mess with it. I do not want any limitation on the copy I have. Limitation is always a pain, isn't it? Go with dup. 🙂
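As a minimal sketch of such a customization, the version of BankAccount below overrides initialize_copy so that every copy gets its own name string. By default a copy shares its attribute objects with the original. Copying the name this way is an illustrative choice, not something the original class does.

```ruby
class BankAccount
  attr_accessor :name, :balance

  def initialize(name, balance)
    @name = name
    @balance = balance
  end

  # Runs on the new object for both dup and clone; source is the original.
  def initialize_copy(source)
    super
    @name = source.name.dup   # give the copy its own String
  end
end

original = BankAccount.new(String.new("arun"), 10_000)
copy = original.dup
copy.name.upcase!             # modifies only the copy's String

puts original.name            # arun
puts copy.name                # ARUN
```

Overriding initialize_copy covers dup and clone at once. Override initialize_dup or initialize_clone instead only when the two should behave differently.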

 

Reference:

http://ruby-doc.org/core-2.1.2/Object.html#method-i-dup

http://ruby-doc.org/core-2.1.2/Object.html#method-i-clone

http://m.onkey.org/ruby-i-don-t-like-3-object-freeze

 



eager loading in rails

Recently I got an interview call from a reputed company and came to know that I do not have a very good understanding of data loading in Rails. So I tried to dig deeper into it. To start with, data can be loaded in two ways: lazy loading and eager loading. The first refers to loading associated data when the need arises, and the second refers to loading the associated data at the time of retrieving the parent object itself. Say you have the associations below.

class Physician < ActiveRecord::Base
  has_many :appointments
  has_many :patients, :through => :appointments
  attr_accessible :name, :specialization, :mobile_no
end
 
class Appointment < ActiveRecord::Base
  belongs_to :physician
  belongs_to :patient
  attr_accessible :reason, :physician_id, :patient_id
end
 
class Patient < ActiveRecord::Base
  has_many :appointments
  has_many :physicians, :through => :appointments
  attr_accessible :name, :mobile_no
end

The migrations that create the corresponding tables look as below:

class CreatePhysicians < ActiveRecord::Migration
  def change
    create_table :physicians do |t|
      t.string :name
      t.string :specialization
      t.integer :mobile_no
      t.timestamps
    end
  end
end

class CreateAppointments < ActiveRecord::Migration
  def change
    create_table :appointments do |t|
      t.string :reason
      t.references :physician
      t.references :patient
      t.timestamps
    end
  end
end

class CreatePatients < ActiveRecord::Migration
  def change
    create_table :patients do |t|
      t.string :name
      t.integer :mobile_no
      t.timestamps
    end
  end
end

Let us create some seed data to experiment with. Add the below to your seeds.rb file.

Physician.create(:name => "Arun", :specialization => "ENT", :mobile_no => "9569806453")
Physician.create(:name => "Anand", :specialization => "MS", :mobile_no => "9569807853")

Patient.create(:name => "Raman", :mobile_no => "9569807851")
Patient.create(:name => "Kapil", :mobile_no => "9569807852")
Patient.create(:name => "Gita", :mobile_no => "9569807853")
Patient.create(:name => "Santosh", :mobile_no => "9569807854")

Appointment.create(:reason => "Ear pain", :physician_id  => 1, :patient_id => 2)
Appointment.create(:reason => "Teeth pain", :physician_id  => 1, :patient_id => 3)
Appointment.create(:reason => "Teeth pain", :physician_id  => 1, :patient_id => 4)
Appointment.create(:reason => "Surgery", :physician_id  => 2, :patient_id => 4)
Appointment.create(:reason => "Surgery", :physician_id  => 2, :patient_id => 1)

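Run rake db:seed to populate the above data. Note that mobile_no is declared as an integer column, so the ten-digit numbers above do not fit and show up as 2147483647 in the console output below. A string column would be a better fit for phone numbers.

Now we have the associations ready along with some dummy data. But before proceeding further, try to understand why we need eager loading. Say the hospital needs to generate a monthly report listing the patients handled by each individual doctor. You may do something like this.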

1.9.3-p194 :052 > @doctors = Physician.all
Physician Load (0.2ms)  SELECT `physicians`.* FROM `physicians`
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :059 > @doctors.each do |doctor|
1.9.3-p194 :060 >     doctor.patients.each do |patient|
1.9.3-p194 :061 >       puts patient.name
1.9.3-p194 :062?>     end
1.9.3-p194 :063?>   end
Patient Load (0.2ms)  SELECT `patients`.* FROM `patients` INNER JOIN `appointments` ON `patients`.`id` = `appointments`.`patient_id` WHERE `appointments`.`physician_id` = 1
Kapil
Gita
Santosh
Patient Load (0.3ms)  SELECT `patients`.* FROM `patients` INNER JOIN `appointments` ON `patients`.`id` = `appointments`.`patient_id` WHERE `appointments`.`physician_id` = 2
Raman
Santosh

So you can see that for 2 doctors in the database, 3 queries hit the database: the first query retrieved all the doctors, and 2 more queries retrieved the details of their patients. In general this leads to n + 1 queries hitting the database for n doctors. Imagine the load on the database if there are thousands of doctors. This can be avoided by eager loading.

Now we are ready to get into the details of eager loading. The first thing to remember is that you can do eager loading in any of the following ways:
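The growth of the query count can be simulated without a database. The sketch below is plain Ruby, not Active Record: FakeDatabase, lazy_report and batched_report are hypothetical names, and every method call on FakeDatabase stands in for one SQL query. It is also simplified: the batched version uses one lookup for all physicians, whereas Rails needs two extra queries for a through association.

```ruby
# Hypothetical in-memory stand-in for the physicians and appointments tables.
class FakeDatabase
  attr_reader :query_count

  def initialize(physician_count)
    @physician_ids = (1..physician_count).to_a
    # Give every physician two appointments with made-up patient ids.
    @appointments = @physician_ids.flat_map do |id|
      [{ physician_id: id, patient_id: id * 2 - 1 },
       { physician_id: id, patient_id: id * 2 }]
    end
    @query_count = 0
  end

  # Simulates SELECT * FROM physicians.
  def physicians
    @query_count += 1
    @physician_ids
  end

  # Simulates one query that fetches patient ids for the given physicians.
  def patient_ids_for(physician_ids)
    @query_count += 1
    result = {}
    @appointments.each do |row|
      next unless physician_ids.include?(row[:physician_id])
      (result[row[:physician_id]] ||= []) << row[:patient_id]
    end
    result
  end
end

# Lazy loading: one query for the physicians, then one more per physician.
def lazy_report(db)
  db.physicians.map { |id| db.patient_ids_for([id]) }
end

# Batched loading: the number of queries does not depend on the physician count.
def batched_report(db)
  db.patient_ids_for(db.physicians)
end

lazy_db = FakeDatabase.new(100)
lazy_report(lazy_db)
puts lazy_db.query_count      # 101

batched_db = FakeDatabase.new(100)
batched_report(batched_db)
puts batched_db.query_count   # 2
```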

=> includes
=> preload
=> eager_load

O.K., so we have several ways to do eager loading. Let us see how they differ by trying them out on the console.

1.9.3-p194 :039 > @doctors = Physician.includes(:patients)
Physician Load (0.3ms)  SELECT `physicians`.* FROM `physicians`
Appointment Load (0.2ms)  SELECT `appointments`.* FROM `appointments` WHERE `appointments`.`physician_id` IN (1, 2)
Patient Load (0.3ms)  SELECT `patients`.* FROM `patients` WHERE `patients`.`id` IN (2, 3, 4, 1)
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :040 > @doctors.first.patients
=> [#<Patient id: 2, name: “Kapil”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 3, name: “Gita”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 4, name: “Santosh”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :041 > @doctors.first.appointments
=> [#<Appointment id: 1, reason: “Ear pain”, physician_id: 1, patient_id: 2, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 2, reason: “Teeth pain”, physician_id: 1, patient_id: 3, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 3, reason: “Teeth pain”, physician_id: 1, patient_id: 4, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

 

1.9.3-p194 :043 > @doctors = Physician.preload(:patients)
Physician Load (0.1ms)  SELECT `physicians`.* FROM `physicians`
Appointment Load (0.1ms)  SELECT `appointments`.* FROM `appointments` WHERE `appointments`.`physician_id` IN (1, 2)
Patient Load (0.1ms)  SELECT `patients`.* FROM `patients` WHERE `patients`.`id` IN (2, 3, 4, 1)
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :044 > @doctors.first.patients
=> [#<Patient id: 2, name: “Kapil”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 3, name: “Gita”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 4, name: “Santosh”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :045 > @doctors.first.appointments
=> [#<Appointment id: 1, reason: “Ear pain”, physician_id: 1, patient_id: 2, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 2, reason: “Teeth pain”, physician_id: 1, patient_id: 3, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 3, reason: “Teeth pain”, physician_id: 1, patient_id: 4, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

 

1.9.3-p194 :046 > @doctors = Physician.eager_load(:patients)
SQL (0.4ms)  SELECT `physicians`.`id` AS t0_r0, `physicians`.`name` AS t0_r1, `physicians`.`specialization` AS t0_r2, `physicians`.`mobile_no` AS t0_r3, `physicians`.`created_at` AS t0_r4, `physicians`.`updated_at` AS t0_r5, `patients`.`id` AS t1_r0, `patients`.`name` AS t1_r1, `patients`.`mobile_no` AS t1_r2, `patients`.`created_at` AS t1_r3, `patients`.`updated_at` AS t1_r4 FROM `physicians` LEFT OUTER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` LEFT OUTER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id`
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :047 > @doctors.first.patients
=> [#<Patient id: 2, name: “Kapil”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 3, name: “Gita”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 4, name: “Santosh”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :048 > @doctors.first.appointments
Appointment Load (0.3ms)  SELECT `appointments`.* FROM `appointments` WHERE `appointments`.`physician_id` = 1
=> [#<Appointment id: 1, reason: “Ear pain”, physician_id: 1, patient_id: 2, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 2, reason: “Teeth pain”, physician_id: 1, patient_id: 3, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 3, reason: “Teeth pain”, physician_id: 1, patient_id: 4, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

Strange: eager_load(:patients) did not load the appointments association that it joins through, so a separate query was fired for it. Let us try again by including it explicitly.

1.9.3-p194 :064 > @doctors = Physician.eager_load(:patients,:appointments)
SQL (0.4ms)  SELECT `physicians`.`id` AS t0_r0, `physicians`.`name` AS t0_r1, `physicians`.`specialization` AS t0_r2, `physicians`.`mobile_no` AS t0_r3, `physicians`.`created_at` AS t0_r4, `physicians`.`updated_at` AS t0_r5, `patients`.`id` AS t1_r0, `patients`.`name` AS t1_r1, `patients`.`mobile_no` AS t1_r2, `patients`.`created_at` AS t1_r3, `patients`.`updated_at` AS t1_r4, `appointments_physicians`.`id` AS t2_r0, `appointments_physicians`.`reason` AS t2_r1, `appointments_physicians`.`physician_id` AS t2_r2, `appointments_physicians`.`patient_id` AS t2_r3, `appointments_physicians`.`created_at` AS t2_r4, `appointments_physicians`.`updated_at` AS t2_r5 FROM `physicians` LEFT OUTER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` LEFT OUTER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id` LEFT OUTER JOIN `appointments` `appointments_physicians` ON `appointments_physicians`.`physician_id` = `physicians`.`id`
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :065 > @doctors.first.patients
=> [#<Patient id: 2, name: “Kapil”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 3, name: “Gita”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 4, name: “Santosh”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :066 > @doctors.first.appointments
=> [#<Appointment id: 1, reason: “Ear pain”, physician_id: 1, patient_id: 2, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 2, reason: “Teeth pain”, physician_id: 1, patient_id: 3, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Appointment id: 3, reason: “Teeth pain”, physician_id: 1, patient_id: 4, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

You can see that this time the appointments association also gets eager loaded and no separate query is fired for it.

O.K., so let us stop here and analyse the above results.

=> includes and preload made 3 queries to the database; eager_load made a single query
=> the 3 queries made by includes and preload are small queries targeted at the three tables physicians, appointments and patients; the single query made by eager_load involves a join of all three tables
=> the queries executed by includes and preload are exactly the same, but the eager_load query differs from both
=> eager_load(:patients) does not populate the appointments association it joins through; you have to specify it explicitly
=> the overall query time reported for includes is 0.8ms, for preload 0.3ms and for eager_load 0.4ms

In this run, the set of small queries issued by preload took no longer than the one big join, although timings this small from a single console run are not a reliable benchmark. The join also suffers from the cartesian product problem: it produces a number of duplicate records, which may not trouble the database much but hits Rails hard, as it has to deal with a larger number of small, short-lived objects.

Here Rails shows its smartness through includes. Internally, Rails eager loads data either through preload (i.e. a set of small queries) or through eager_load (i.e. a join). includes just delegates the eager loading to preload or eager_load depending on the nature of the query you are making. In simple words, if you are not sure which is more efficient for you, preload or eager_load, just use includes and it will decide which one fits the query and delegate the job to it.
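The duplication caused by the join can be estimated with plain Ruby. The sketch below counts the rows that the eager_load(:patients, :appointments) query shown earlier would return, where appointments is joined twice, and compares them with the rows fetched by three separate queries. The helper names and the BUSY_PHYSICIAN data set are made up for illustration.

```ruby
# Appointments from the seed data as [physician_id, patient_id] pairs.
APPOINTMENTS = [[1, 2], [1, 3], [1, 4], [2, 4], [2, 1]].freeze

# A hypothetical physician with 50 appointments, each for a different patient.
BUSY_PHYSICIAN = (1..50).map { |patient_id| [1, patient_id] }.freeze

# Rows returned when physicians are joined to appointments twice: every
# appointment of a physician is paired with every appointment of that physician.
def joined_row_count(appointments)
  appointments.group_by(&:first).values.map { |rows| rows.size * rows.size }.inject(0, :+)
end

# Rows returned by three separate queries: physicians, appointments, patients.
def separate_row_count(appointments)
  physicians = appointments.map(&:first).uniq.size
  patients = appointments.map(&:last).uniq.size
  physicians + appointments.size + patients
end

puts joined_row_count(APPOINTMENTS)        # 13
puts separate_row_count(APPOINTMENTS)      # 11
puts joined_row_count(BUSY_PHYSICIAN)      # 2500
puts separate_row_count(BUSY_PHYSICIAN)    # 101
```

With the small seed data the difference is negligible, but it grows quickly once a single parent record has many associated rows.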

Let us see it in action on console.

1.9.3-p194 :002 > @doctors = Physician.includes(:patients).where("patients.mobile_no != ?", "765778989")
SQL (0.4ms)  SELECT `physicians`.`id` AS t0_r0, `physicians`.`name` AS t0_r1, `physicians`.`specialization` AS t0_r2, `physicians`.`mobile_no` AS t0_r3, `physicians`.`created_at` AS t0_r4, `physicians`.`updated_at` AS t0_r5, `patients`.`id` AS t1_r0, `patients`.`name` AS t1_r1, `patients`.`mobile_no` AS t1_r2, `patients`.`created_at` AS t1_r3, `patients`.`updated_at` AS t1_r4 FROM `physicians` LEFT OUTER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` LEFT OUTER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id` WHERE (patients.mobile_no != ‘765778989’)
=> [#<Physician id: 1, name: “Arun”, specialization: “ENT”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

So you can see that this time includes fires a single query with a join. Here Rails saw that the patients to be retrieved have a condition imposed on them, so it delegated the job to eager_load internally rather than preload. Note that you can't eager load conditional data using preload. Try the above query with preload.

1.9.3-p194 :003 > @doctors = Physician.preload(:patients).where("patients.mobile_no != ?", "765778989")
Physician Load (0.3ms)  SELECT `physicians`.* FROM `physicians` WHERE (patients.mobile_no != '765778989')
Mysql2::Error: Unknown column 'patients.mobile_no' in 'where clause': SELECT `physicians`.* FROM `physicians`  WHERE (patients.mobile_no != '765778989')
ActiveRecord::StatementInvalid: Mysql2::Error: Unknown column 'patients.mobile_no' in 'where clause': SELECT `physicians`.* FROM `physicians`  WHERE (patients.mobile_no != '765778989')

So we get the error, as expected.

Now let us draw some conclusions at this point:
=> use includes if you want Rails to decide whether to load data in a single join query or a small set of queries
=> use eager_load if you want to make a single query to the db without leaving the decision to includes
=> use preload when you want to load data with a set of separate queries

The use of preload does not look convincing… right? Let us see a case where only preload fits the need. Say you want the physician whose patient is Raman, but at the same time you want to eager load all the patients of that physician.

The query to find the doctor whose patient is Raman is as below.
1.9.3-p194 :019 > @doctors=Physician.joins(:patients).where("patients.name = ?", "Raman")
Physician Load (0.1ms) SELECT `physicians`.* FROM `physicians` INNER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` INNER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id` WHERE (patients.name = 'Raman')
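=> [#<Physician id: 2, name: "Anand", specialization: "MS", mobile_no: 2147483647, created_at: "2014-07-21 11:51:18", updated_at: "2014-07-21 11:51:18">]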

Let us chain includes to the above query to eager load the patients' details.

1.9.3-p194 :017 > @doctors=Physician.joins(:patients).where("patients.name = ?", "Raman").includes(:patients)
SQL (0.4ms) SELECT `physicians`.`id` AS t0_r0, `physicians`.`name` AS t0_r1, `physicians`.`specialization` AS t0_r2, `physicians`.`mobile_no` AS t0_r3, `physicians`.`created_at` AS t0_r4, `physicians`.`updated_at` AS t0_r5, `patients`.`id` AS t1_r0, `patients`.`name` AS t1_r1, `patients`.`mobile_no` AS t1_r2, `patients`.`created_at` AS t1_r3, `patients`.`updated_at` AS t1_r4 FROM `physicians` INNER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` INNER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id` WHERE (patients.name = ‘Raman’)
 => [#<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :018 > @doctors.first.patients
=> [#<Patient id: 1, name: “Raman”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

So you can see that here eager loading returns only the patient Raman: includes, seeing the where clause in the query, delegates the job to eager_load, which builds a single join with the given condition and so returns only the single patient Raman.

Now fire the same query using preload.

1.9.3-p194 :015 > @doctors=Physician.joins(:patients).where("patients.name = ?", "Raman").preload(:patients)
Physician Load (0.2ms)  SELECT `physicians`.* FROM `physicians` INNER JOIN `appointments` ON `appointments`.`physician_id` = `physicians`.`id` INNER JOIN `patients` ON `patients`.`id` = `appointments`.`patient_id` WHERE (patients.name = ‘Raman’)
Appointment Load (0.1ms)  SELECT `appointments`.* FROM `appointments` WHERE `appointments`.`physician_id` IN (2)
Patient Load (0.1ms)  SELECT `patients`.* FROM `patients` WHERE `patients`.`id` IN (4, 1)
=> [#<Physician id: 2, name: “Anand”, specialization: “MS”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]
1.9.3-p194 :016 > @doctors.first.patients
=> [#<Patient id: 4, name: “Santosh”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>, #<Patient id: 1, name: “Raman”, mobile_no: 2147483647, created_at: “2014-07-21 11:51:18”, updated_at: “2014-07-21 11:51:18”>]

This time you get all the patients of the physician Anand.

So we are done with our research on data loading in Rails.

But wait… Rails is still evolving, and it keeps changing with each release. includes has also changed in Rails 4. Rails has now stopped being super smart: it will no longer automatically delegate the job to eager_load on seeing a string where clause in the query. Instead, you need to pass a reference to the table whose attributes are used in the where clause.

Example:

@doctors = Physician.includes(:patients).where("patients.mobile_no != ?", "765778989") # will throw error

@doctors = Physician.includes(:patients).where("patients.mobile_no != ?", "765778989").references(:patients) # will work

 

Reference :
http://guides.rubyonrails.org/active_record_querying.html
http://stackoverflow.com/questions/1208636/rails-include-vs-joins