Postgres Arrays vs JSON Datatypes in Rails 5

Photo of Mateusz Wąsowski

Mateusz Wąsowski

Updated Feb 21, 2024 • 6 min read
sergi-kabrera-705414-unsplash-2

Postgres complex data types (hstore, json, jsonb, array) are very useful tools in database design.

Having multiple values under one attribute definitely has its use cases. There are plenty of articles that compare the NoSQL database performance to the PostgreSQL JSON datatypes. Their results show the maturity of tools provided by this database management system. So which data types are best suited for what?

When you already decided that normalization is not the way you want to go and that in your specific case data duplication is better than having yet another join table you will most likely face the options below.

Array

Its main disadvantage is that you are limited to one data type and the strict order of the array. It's pretty easy to set up and requires less storage in comparison to jsonb. It has multi dimensional support (requires same number of elements though) and indexing with GIN, which greatly speeds up query performance.

class CreateMovies < ActiveRecord::Migration[5.2]
  def change
    create_table :movies do |t|
      t.integer 'ratings', array: true, default: []
    end

    add_index :movies, :ratings, using: :gin
  end
end
Movie.create(ratings: [7, 7, '1', 'string', :random_symbol]).ratings #=> [7, 7, 1, 0, nil] Movie.create(ratings: [[7, 5], [7]]).ratings # invalid array Movie.create(ratings: [[7, 5], [7, nil]]).ratings # valid array Movie.create(ratings: [[7, 5], [7, nil], 4]).ratings # invalid array Movie.where("7 = ANY ratings") # movies having rating 7 Movie.where('ratings @> ARRAY[?]::integer[]', [7, 1]) # movies having rating 7 and 1

Use case: simple lists of information. Especially useful when duplications are needed, for example for saving multiple ratings of the same value.

Hstore

While not being a strictly JSON data type, it can be used as such - hstore operates on a string based key/value pair system. It doesn't allow for nesting and because of that has been mostly labeled as inferior to the jsonb data type, even though it also supports GIN and GIST indexing. Considering that inside Rails you also need to remember to enable the hstore extension its usability is questionable.

class CreateMovies < ActiveRecord::Migration[5.2]
  def change
    enable_extension 'hstore' unless extension_enabled?('hstore')
    create_table :movies do |t|
      t.hstore 'additional_data', default: {}
    end

    add_index :movies, :additional_data, using: :gin
  end
end

Movie.create(additional_data: { 'advisor' => 'Random',
                                'aspect_ratio' => '2.35 : 1',
                                'advisor' => 'John Doe' }).additional_data
#=> {'advisor'=>'John Doe', 'aspect_ratio'=>'2.35 : 1'}

Movie.where('additional_data ? :key', key: 'advisor')
# movies with advisor
Movie.where('additional_data -> :key LIKE :value',
            key: 'advisor', value: '%John%')
# movies with advisor named John

Use case: if you want to store simple key/value type information and expect to never need to nest data.

JSON

Acts more or less as expected from a JSON datatype - it allows nesting and using some basic data types. Compared to jsonb, it has no support for indexes but is also less strict (jsonb doesn't allow for the NULL character). It requires less storage compared to jsonb.

class CreateMovies < ActiveRecord::Migration[5.2]
  def change
    create_table :movies do |t|
      t.json 'payload'
    end
  end
end

payload = { "request" => { data: { "yearsPlayedInTV" => [2018, 2019] } } }
Movie.create(payload: payload).payload
#=> {"request"=>{"data"=>{"yearsPlayedInTV"=>[2018, 2019]}}}

Movie.where("payload #>> '{request, data, yearsPlayedInTV, 0}' = :year",
            year: "2018")
# movies where first year played in tv was 2018

Use case: log and payload storing, especially when no frequent access is needed.

JSONB

While very similar to the json data type, it provides additional operators for querying as well us support for indexing. Those two features make this data type preferable over the other ones. It is worth noting that since jsonb has to parse the JSON data into a binary format, it tends to be slower than the json data type when writing but faster when reading the data.

class CreateMovies < ActiveRecord::Migration[5.2]
  def change
    create_table :movies do |t|
      t.jsonb 'additional_data'
    end

    add_index :movies, :additional_data, using: :gin
    add_index :movies,
              "(additional_data->'screenwriters')",
              name: "index_movies_on_additional_data_screenwriters"
  end
end


additional_data = { "screenwriters" => ['John Doe', 1], other: nil }
Movie.create(additional_data: additional_data).additional_data
#=> {"screenwriters"=>["John Doe"], "other"=>nil}

Movie.where("additional_data -> 'screenwriters' ? :screenwriter",
            screenwriter: "John Doe")
# movies with screenwriters named John Doe

Use case: any case of storing JSON type of data

Worth reading

PostgreSQL documentation - arrays

PostgreSQL documentation - hstore

PostgreSQL documentation - json and jsonb

Rails Guides - Active Record and PostgreSQL

Photo by Sergi Kabrera on Unsplash

Photo of Mateusz Wąsowski

More posts by this author

Mateusz Wąsowski

Mateusz is a student of Computer Science at Adam Mickiewicz University. Having lived in Italy and...
Lost with AI?  Get the most important news weekly, straight to your inbox, curated by our CEO  Subscribe to AI'm Informed

We're Netguru

At Netguru we specialize in designing, building, shipping and scaling beautiful, usable products with blazing-fast efficiency.

Let's talk business