Materialized View Automatic Refresh Problem and How to Solve It Using the Database Triggers in Ruby on Rails Based Application

Updated Aug 3, 2023 • 13 min read

The materialized view is a powerful database solution that allow us to access the view’s data faster by “caching” its response.

The materialized view query is executed once when the view is created, not when accessing the data as it is with regular database views. It is a great and worth using feature when we do not need a view to return the most recent data or we know that we will more often read the view’s data than we will modify them. There is only one small problem while using such view: we need to handle the refresh of its data manually.

The problem

In my last article, I have shown how to create and benefit from using database views in Ruby on Rails based internet application with PostgreSQL as database management system (DBMS). In that article, we have created a Users table with given definition:

create_table "users", force: :cascade do |t|
    t.boolean   "active",         default: false
    t.datetime  "created_at",     null: false
    t.datetime  "updated_at",     null: false
    t.string    "full_name"
end

In one of the article’s steps, we have also created (using Scenic gem) a simple materialized view called ActiveUsers defined as:

SELECT id, active, full_name FROM users WHERE active = TRUE;

We had also two ActiveRecord models, one associated with the Users table:

class User < ActiveRecord::Base
end

and one associated with the ActiveUsers view:

class ActiveUser < ActiveRecord::Base
  self.primary_key = :id

  def readonly?
    true
  end
end

Let’s call our materialized view in Rails console:

ActiveUser.all
ActiveUser Load (0.8ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 1, active: true, full_name: "Adam Jensen">]>

Now, let’s create a new active user:

User.create(full_name: 'James Bond', active: true)
(0.3ms)  BEGIN
SQL (3.1ms)  INSERT INTO "users" ("full_name", "active", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"  [["full_name", "James Bond"], ["active", "t"], ["created_at", "2019-03-10 14:11:35.449327"], ["updated_at", "2019-03-10 14:11:35.449327"]]
(1.0ms)  COMMIT

When checking the results of our materialized view now:

ActiveUser.all
ActiveUser Load (0.6ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 1, active: true, full_name: "Adam Jensen">]>

We can see that the new record is not included in the view’s response. This is how the materialized view works: the view’s query is executed once (while creating the view) and then its results are stored in a temporary table. The data will not change until we request a manual refresh. To do so, let’s call SQL query given below:

REFRESH MATERIALIZED VIEW active_users;

If we are using Scenic gem instead of executing SQL query, we can call a function:

Scenic.database.refresh_materialized_view(‘active_users’, concurrently: false, cascade: false)

Now, when calling our ActiveUser view:

ActiveUser.all
ActiveUser Load (1.4ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 3, active: true, full_name: "James Bond">, #<ActiveUser id: 1, active: true, full_name: "Adam Jensen">]>

We can see that the new active user record has been included in the result set. Obviously, the manual refresh is not the best option when using materialized views in the scope of internet applications with dynamically changing data. That is why we need to figure a proper way to automate this process.

The solutions

Basically in our example, we can approach this problem in three ways:

Schedule an automatic job that will refresh the view once in a given time period.
Schedule an automatic job that will refresh the view as a callback after user has been created/updated/deleted.
Define a database trigger that will be executed after each create/update/delete operation on users table.

The solution 1. is worth considering when we can operate while not having the most recent data. The solution 2. and 3. are best suited for cases when we need the data to be as accurate and recent as possible. It is also worth mentioning that the solutions 1. and 2. force us to use an additional solution to schedule background jobs while solution 3. is based on mechanism included in our DBMS.

In this article, we will take a closer look on how to solve the problem of automatic materialized view refresh in Ruby on Rails based application with PostgreSQL DBMS using the solution 3.

The database triggers

The database trigger is a piece of code that automatizes execution of a given database function. It can be defined on a table or a view. Most of DBMS allow to create a trigger that will be executed before or after an INSERT, UPDATE or DELETE statements. It is also possible to define a trigger that will be executed instead of a given database operation.

Additionally in PostgreSQL, we have 2 subtypes of database triggers:

Per-row - such triggers are invoked in scope of each separate row that is modified.
Per-statement - such triggers are invoked in a scope of a single statement regardless of the number of modified rows.

To define a trigger in PostgreSQL, first we need to define a database function that will be executed every time the trigger is fired. The function must not take any arguments (such functions get their input in a different way than using a regular function arguments) and must return the trigger data type. After that, we can define a trigger (as before/after/instead of INSERT and/or UPDATE and/or DELETE statement) on given table or view that will automatically call the previously created function. To achieve that in Ruby on Rails based application, we will use the F(x) gem.

The F(x) gem

F(x) is a gem that allows us to use database triggers and functions without being forced to change the Rails schema.rb file to SQL-format structure.sql. Similarly as Scenic gem, it introduces versioning of the triggers and functions code. We have also a separate migration file and separate file where we need to provide the SQL code of trigger or function.

Let’s generate a create function migration first:

rails generate fx:function refresh_active_users_view
    create  db/functions
    create  db/functions/refresh_active_users_view_v01.sql
    create  db/migrate/20190310210632_create_function_refresh_active_users_view.rb

As you can see, we got two new files. Let’s take a look a the migration file:

class CreateFunctionRefreshActiveUsersView < ActiveRecord::Migration
  def change
    create_function :refresh_active_users_view
  end
end

Here, we are calling the create_function function provided by F(x) gem. It basically creates a new database function and requires a function name as a parameter. We will not change anything in this file, let’s move onto the SQL code file. In this file, we need to provide a code of function that will be executed each time the trigger will be fired (so we need to call the refresh view PostgreSQL command):

CREATE OR REPLACE FUNCTION refresh_active_users_view() RETURNS trigger AS $function$
BEGIN
  REFRESH MATERIALIZED VIEW active_users;
  RETURN NULL;
END;
$function$ LANGUAGE plpgsql;

Pay attention that the defined function must not take any arguments and must return a trigger date type. We have also used the OR REPLACE keyword to replace function with the same name if such already exists currently in our database. Now, as we have defined the trigger function, we can generate a create trigger migration:

rails generate fx:trigger refresh_active_users_view table_name:users
    create  db/triggers
    create  db/triggers/refresh_active_users_view_v01.sql
    create  db/migrate/20190310211859_create_trigger_refresh_active_users_view.rb

Again, the result of the command execution are two file: one being migration and the other a file for SQL code. Let’s quickly look at our new migration file:

class CreateTriggerRefreshActiveUsersView < ActiveRecord::Migration
  def change
    create_trigger :refresh_active_users_view, on: :users
  end
end

We are calling there a create_trigger function provided by the gem. It creates trigger with given name (in our case refresh_active_users_view) on given table (in our case it is users table). Let’s go to the SQL code file, here we will define our actual database trigger. It needs to call function defined in previous step and needs to fire after each users table modification (INSERT, UPDATE, DELETE) to properly update the data returned by the materialized view.

CREATE TRIGGER refresh_active_users_view
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH STATEMENT
EXECUTE PROCEDURE refresh_active_users_view();

As you can see, we are creating a trigger that will be executed once for a statement. We are doing it this way to optimize the performance of queries that will modify multiple users at once (if we will define this trigger as FOR EACH ROW the trigger would be executed after each row update which is actually not needed and desired in our use case).

We have everything we need ready and in place. Now, we can migrate our database and check if all the things are working properly now:

User.create(active: true, full_name: 'Jan Kowalski')
(0.3ms)  BEGIN
SQL (17.0ms)  INSERT INTO "users" ("active", "full_name", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"  [["active", "t"], ["full_name", "Jan Kowalski"], ["created_at", "2019-03-10 21:26:04.165675"], ["updated_at", "2019-03-10 21:26:04.165675"]]
(1.4ms)  COMMIT
=> #<User id: 4, active: true, created_at: "2019-03-10 21:26:04", updated_at: "2019-03-10 21:26:04", full_name: "Jan Kowalski">

ActiveUser.all
ActiveUser Load (1.0ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 3, active: true, full_name: "James Bond">, #<ActiveUser id: 1, active: true, full_name: "Adam Jensen">, #<ActiveUser id: 4, active: true, full_name: "Jan Kowalski">]>

User.find_by(full_name: 'Jan Kowalski').update(full_name: 'Andrea Pirlo')
User Load (0.7ms)  SELECT "users".* FROM "users" WHERE "users"."full_name" = $1 LIMIT 1  [["full_name", "Jan Kowalski"]]
(0.3ms)  BEGIN
SQL (4.8ms)  UPDATE "users" SET "full_name" = $1, "updated_at" = $2 WHERE "users"."id" = $3  [["full_name", "Andrea Pirlo"], ["updated_at", "2019-03-10 21:43:25.185283"], ["id", 4]]
(1.5ms)  COMMIT
=> true

ActiveUser.all
ActiveUser Load (1.1ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 3, active: true, full_name: "James Bond">, #<ActiveUser id: 1, active: true, full_name: "Adam Jensen">, #<ActiveUser id: 4, active: true, full_name: "Andrea Pirlo">]>

User.find_by(full_name: 'Andrea Pirlo').destroy
User Load (0.6ms)  SELECT "users".* FROM "users" WHERE "users"."full_name" = $1 LIMIT 1  [["full_name", "Andrea Pirlo"]]
(0.3ms)  BEGIN
SQL (6.2ms)  DELETE FROM "users" WHERE "users"."id" = $1  [["id", 4]]
(1.1ms)  COMMIT
=> #<User id: 4, active: true, created_at: "2019-03-10 21:26:04", updated_at: "2019-03-10 21:43:25", full_name: "Andrea Pirlo">

ActiveUser.all
ActiveUser Load (1.4ms)  SELECT "active_users".* FROM "active_users"
=> #<ActiveRecord::Relation [#<ActiveUser id: 3, active: true, full_name: "James Bond">, #<ActiveUser id: 1, active: true, full_name: "Adam Jensen">]>

As you can see, the triggers are now taking care of automatic refresh of our materialized view when it is needed.

If at any time in the future, you will need to modify the function or trigger, you can run the same command used for generating it. It will produce a new migration file and a file for SQL code prepopulated with the code of previous version of function or trigger. After making the necessary modifications and migrating the database, the new version of the function or trigger will be introduced.

Also, if you will want to remove the trigger, you can generate a regular migration file and use drop_trigger and drop_function methods:

class DropTriggerAndFunction < ActiveRecord::Migration
  def change
    drop_trigger :refresh_active_users_view, on: :users, revert_to_version: 1
    drop_function :refresh_active_users_view, revert_to_version: 1
  end
end

What is important, you first need to drop the trigger and then you can drop the function connected to that trigger. The other important thing is that you need to provide the version number of trigger and function queries otherwise the migration will be irreversible.

Summary

The database triggers are a handy tool that can be very useful in situations when we need to define some kind of a database callback that needs to be executed after, before or even instead of some data modification operation. Triggers may be used to achieve the automation of the materialized view refresh process. With the help of F(x) gem, we can easily define and use database functions and triggers in our Ruby on Rails applications.