Today’s web and app users demand personalized experiences. They expect the apps, news sites, social networks, and online stores they engage with to remember who they are and what they’re interested in, and make relevant, individualized, and accurate recommendations for new content and new products based on their previous activities. Any app or website that fails to deliver on these demands will quickly see its users flocking out the digital door.
A product recommendation system is a software tool designed to generate and provide suggestions for items or content a specific user would like to purchase or engage with. Utilizing machine learning techniques and various data about both individual products and individual users, the system creates an advanced net of complex connections between those products and those people.
There are three basic types of connection a product recommendation system creates:
1. User-product relationships – based on users’ individual product preferences.
2. User-user relationships – based on similar people (e.g. people of a similar age, background, etc.) likely having similar product preferences.
3. Product-product relationships – based on similar or complementary products (e.g. printers and ink cartridges) that can be categorised into relevant groups.
Product recommendation systems compare and rank these connections, and recommend products or content accordingly.
As internet users, we all interact with product recommendation systems nearly every day – during Google searches, when using movie or music streaming services, when shopping online, when browsing social media, and when using things like dating apps.
As such, product recommendation systems are one of the most successful and widespread applications of machine learning in business. When set up and configured correctly, they can significantly boost sales, revenues, click-through rates, conversions, and other important metrics. This is because personalizing product or content recommendations to a particular user’s preferences has a positive effect on user experience. And this, in turn, translates into metrics that are harder to measure – customer satisfaction, loyalty, brand affinity, etc. – but are nonetheless of great importance to online businesses.
Recent research from Monetate reveals that product recommendations can lead to a 70% increase in purchase rates, both in the initial session and in return sessions, and 33% higher average order values. A further study from Salesforce found that shoppers who click on product recommendations have 4.5x higher basket rates, make 4.8x more product views per visit, and have a 5x higher per-visit spend.
In order to build a product recommendation system, the first thing that’s needed is data – data pertaining to the products on sale (their specific features, prices, etc.), as well as data about users/customers.
The more data collected, the better. Both demographic data (age, gender, location, etc.) and behavioural data are required in order to build a robust product recommendation system. Behavioural data is gathered either explicitly or implicitly. Explicit data is information users provide intentionally, such as by leaving a review or a rating on a product. Implicit data is information that is not provided intentionally by the user, but rather gathered from available data streams, such as search history, clicks, order history, and other activities.
Once the data has been collected and stored, it must then be filtered in order to extract the information needed to make relevant and personalized recommendations.
There are several types of product recommendation systems, each based on different machine learning algorithms which are used to conduct the data filtering process. The main categories are content-based filtering (CBF), collaborative filtering (CF), complementary filtering, and hybrid recommendation systems, which use a combination of CBF and CF.
Content-based filtering: CBF tracks a user’s actions, such as products bought or clicked on, web pages viewed, time spent browsing various product categories, etc. It then uses this information to create a customer profile. This profile is then compared to the product catalogue to make recommendations.
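The profile-then-compare idea can be illustrated with a minimal Python sketch. It assumes a tiny, hypothetical catalogue in which each product is described by feature tags, builds the customer profile by summing the features of products the user interacted with, and ranks the remaining products by cosine similarity. The catalogue, the feature names, and the function names are all invented for illustration; real systems typically use much richer features.

```python
from math import sqrt

# Hypothetical catalogue: each product is described by weighted feature tags.
CATALOGUE = {
    "trail_shoes": {"footwear": 1.0, "outdoor": 1.0},
    "hiking_boots": {"footwear": 1.0, "outdoor": 1.0},
    "rain_jacket": {"clothing": 1.0, "outdoor": 1.0},
    "silk_tie": {"clothing": 1.0, "formal": 1.0},
}


def build_profile(interacted):
    """Sum the feature vectors of every product the user interacted with."""
    profile = {}
    for product in interacted:
        for feature, weight in CATALOGUE[product].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_content_based(interacted, top_n=2):
    """Rank products the user has not seen by similarity to their profile."""
    profile = build_profile(interacted)
    candidates = [p for p in CATALOGUE if p not in interacted]
    ranked = sorted(candidates,
                    key=lambda p: cosine(profile, CATALOGUE[p]),
                    reverse=True)
    return ranked[:top_n]


print(recommend_content_based(["trail_shoes"]))
```

In this toy data, a user who viewed only the trail shoes is matched first to the hiking boots (identical features) and then to the rain jacket (one shared feature).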
Collaborative filtering: CF methods involve collecting and analysing information on users’ behaviours and preferences, and predicting what each user will like based on their similarity to other users. For example, on a music streaming site, if User A likes the bands Radiohead, R.E.M., and U2, and User B likes Radiohead, R.E.M., and Pearl Jam, then the CF algorithm will determine that the two users have similar tastes, and will recommend Pearl Jam to User A, and U2 to User B. Similarities between pairs of items (or bands, movies, TV shows or anything else) can be determined in the same way. In this example, since both users like Radiohead and R.E.M., that pair of bands would receive a positive similarity score. The algorithms most frequently used in CF are the k-nearest neighbours algorithm and latent factor models (LFM).
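The band example can be reproduced with a small user-based nearest-neighbour sketch in Python. It assumes Jaccard similarity over sets of liked bands, which is only one of several possible similarity measures, and adds a hypothetical third user (`user_c`) so that there is a dissimilar neighbour to ignore. The data layout and function names are illustrative assumptions.

```python
# Likes for Users A and B come from the example above; user_c is hypothetical.
LIKES = {
    "user_a": {"Radiohead", "R.E.M.", "U2"},
    "user_b": {"Radiohead", "R.E.M.", "Pearl Jam"},
    "user_c": {"Metallica", "Slayer"},
}


def jaccard(a, b):
    """Share of liked items two users have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_user_based(user, k=1):
    """Recommend items liked by the k most similar users but not by `user`."""
    others = [u for u in LIKES if u != user]
    neighbours = sorted(others,
                        key=lambda u: jaccard(LIKES[user], LIKES[u]),
                        reverse=True)[:k]
    scores = {}
    for neighbour in neighbours:
        similarity = jaccard(LIKES[user], LIKES[neighbour])
        if similarity == 0:
            continue  # no overlap in taste, so no evidence to recommend from
        for item in LIKES[neighbour] - LIKES[user]:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend_user_based("user_a"))  # Pearl Jam
print(recommend_user_based("user_b"))  # U2
```

Users A and B share two of four distinct bands, giving a Jaccard similarity of 0.5, so each receives the one band the other likes.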
Complementary filtering: Here, the system learns the probability of two or more products being bought together. For example, when a user buys a smartphone from an ecommerce store, it is more probable that the same user will buy a set of headphones on a return visit, rather than another smartphone. As such, the algorithms are based around recommending products that are complementary to other products – they are product-defined, as opposed to user-defined, as in CBF and CF. The Naïve Bayes algorithm is most commonly used in complementary filtering.
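As a rough illustration of learning how likely products are to be bought together, the Python sketch below estimates the conditional probability of each other product given one purchased product, using simple frequency counts over hypothetical customer purchase histories. This is a plain co-occurrence estimate rather than a Naïve Bayes classifier, and the histories and function name are assumptions made for the example.

```python
from collections import Counter

# Hypothetical purchase histories, one set of products per customer.
HISTORIES = [
    {"smartphone", "headphones"},
    {"smartphone", "headphones"},
    {"smartphone", "headphones", "phone_case"},
    {"smartphone"},
    {"printer", "ink_cartridge"},
]


def co_purchase_probabilities(histories, bought):
    """Estimate P(other | bought) as the share of customers who bought
    `bought` and also bought `other`."""
    with_item = [h for h in histories if bought in h]
    if not with_item:
        return {}
    counts = Counter(item for h in with_item for item in h if item != bought)
    return {item: count / len(with_item) for item, count in counts.items()}


probs = co_purchase_probabilities(HISTORIES, "smartphone")
for item in sorted(probs, key=probs.get, reverse=True):
    print(item, probs[item])
```

With this toy data, three of the four smartphone buyers also bought headphones (0.75) and one bought a phone case (0.25), so headphones would be the stronger complementary recommendation.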
Hybrid recommendation systems: Hybrid approaches essentially work by combining CBF and CF methods. This can be achieved in a number of ways – for example, by making content-based and collaborative-based predictions separately and then combining them, by adding collaborative-based capabilities to a content-based approach (and vice versa), or by purposefully unifying the two approaches into one model.
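The first of these approaches, making the two predictions separately and then combining them, can be sketched in Python as a weighted average. The sketch assumes both models already output scores on a comparable 0 to 1 scale; the `cf_weight` parameter, the function names, and the sample scores are hypothetical.

```python
def blend_scores(cbf_scores, cf_scores, cf_weight=0.5):
    """Weighted average of content-based and collaborative scores.

    Items missing from one model are treated as scoring 0.0 in that model.
    """
    if not 0.0 <= cf_weight <= 1.0:
        raise ValueError("cf_weight must be between 0 and 1")
    items = set(cbf_scores) | set(cf_scores)
    return {
        item: (1 - cf_weight) * cbf_scores.get(item, 0.0)
        + cf_weight * cf_scores.get(item, 0.0)
        for item in items
    }


def rank(scores):
    """Order items from highest to lowest score; ties broken by name."""
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical scores produced by two separately trained models.
cbf = {"hiking_boots": 0.8, "rain_jacket": 0.4}
cf = {"rain_jacket": 1.0, "tent": 0.5}

print(rank(blend_scores(cbf, cf)))
```

Here the rain jacket ranks first because both models support it, while the boots and the tent are each backed by only one model. Setting `cf_weight` closer to 0 would lean on content-based scores, which may help when little interaction data exists.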
Product recommendation systems must overcome certain challenges before they can be deployed effectively. Let’s consider what they are, and how they can be overcome.
There are two distinct categories of the cold start problem – product cold start, and user cold start. The user cold start problem pertains to the fact that when new users enter a website or app for the first time, the system has no information about them or their preferences, and so cannot make personalized recommendations. Similarly, new products have no reviews, likes, clicks, or other successes among users, so the system has no basis for recommending them.
One solution to the user cold start problem involves applying a popularity-based strategy. Trending products can be recommended to the new user in the early stages, and the selection can be narrowed down based on contextual information – their location, which site the visitor came from, device used, etc. Behavioural information will then “kick in” after a few clicks during that first visit, and start to build up from there.
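A popularity-based fallback of this kind might look like the following Python sketch. It assumes a hypothetical log of recent purchases tagged with the visitor's country, recommends the most purchased products for that country, and falls back to the global ranking when no contextual data is available. The event log and the function name are illustrative assumptions.

```python
from collections import Counter

# Hypothetical recent purchase events: (product, visitor_country).
EVENTS = [
    ("umbrella", "UK"),
    ("umbrella", "UK"),
    ("backpack", "UK"),
    ("sunscreen", "ES"),
    ("sunscreen", "ES"),
    ("sunscreen", "ES"),
]


def trending(events, country=None, top_n=2):
    """Most purchased products, narrowed to the visitor's country if known."""
    filtered = [product for product, c in events
                if country is None or c == country]
    if not filtered:
        # Nothing recorded for this context: fall back to global popularity.
        filtered = [product for product, _ in events]
    return [product for product, _ in Counter(filtered).most_common(top_n)]


print(trending(EVENTS))        # global trending products
print(trending(EVENTS, "UK"))  # narrowed by location
```

Once the new user has clicked on a few products, a system could switch from this fallback to behaviour-based recommendations.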
When it comes to the product cold start problem, content-based filtering is often the solution. The product recommendation system can use metadata about the new product when creating recommendations.
Data sparsity arises from the fact that users will typically rate only a limited number of the available items – especially when the catalogue is very large. This results in a sparse user-item rating matrix with insufficient data for identifying similar users or items. Combining collaborative filtering with Naïve Bayes is one proposed solution to this problem.
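To make the idea of a sparse user-item rating matrix concrete, the Python sketch below computes the share of user-item cells that have no rating. The ratings dictionary, the catalogue size, and the function name are hypothetical values chosen for the example.

```python
# Hypothetical explicit ratings: user -> {product: rating}.
RATINGS = {
    "u1": {"p1": 5, "p2": 3},
    "u2": {"p3": 4},
    "u3": {"p1": 2},
    "u4": {},  # a user who has not rated anything yet
}


def sparsity(ratings, n_items):
    """Fraction of cells in the user-item matrix that hold no rating."""
    n_users = len(ratings)
    if n_users == 0 or n_items == 0:
        raise ValueError("need at least one user and one item")
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1 - observed / (n_users * n_items)


# 4 users x 5 catalogue items = 20 cells, of which only 4 are filled.
print(round(sparsity(RATINGS, n_items=5), 2))
```

Even in this tiny example 80% of the matrix is empty, and users u2 and u3 have no rated product in common, so a similarity between them cannot be computed from ratings alone.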
Recommendation accuracy is measured by the product recommendation system’s ability to correctly predict the item preferences of each user. Hybrid recommendation systems that combine CF with CBF, for example through a Bayesian network model containing user nodes, item nodes, and feature nodes, can deliver better recommendation quality. Systems that make recommendations both by comparing the habits of similar users (CF) and by offering products that share characteristics with other products the user has rated highly (CBF) usually achieve the most accurate results.
One pressing issue of product recommendation systems today is the scalability of algorithms with large, real-world datasets. It’s possible that a recommendation algorithm will work well and produce accurate results with small datasets, yet may start producing inaccurate or inefficient results with large ones. In addition, some algorithms are computationally expensive to run – the larger the dataset, the longer it will take, and the more it will cost the business to analyse and make recommendations from it. Advanced, large-scale assessment methods are required to deal with both issues.
Another challenge of product recommendation systems is finding ways of increasing diversity without compromising the precision of the system. While collaborative filtering methods typically use nearest neighbour methods to identify items similar users like, the inverted neighbourhood model – k-furthest neighbours – seeks to identify less similar neighbourhoods for the purpose of creating more diverse recommendations. This is achieved by recommending items disliked by people least similar to the user.
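The inverted neighbourhood idea can be sketched in Python as follows. The sketch assumes explicit feedback encoded as +1 (liked) and -1 (disliked), measures similarity as the average agreement on items both users rated, and recommends unrated items that the least similar users disliked. The data, the encoding, and the function names are hypothetical, and published k-furthest neighbours methods may define dissimilarity differently.

```python
# Hypothetical explicit feedback: +1 = liked, -1 = disliked.
FEEDBACK = {
    "target": {"rock": 1, "indie": 1},
    "near": {"rock": 1, "indie": 1, "folk": 1, "opera": -1},
    "far": {"rock": -1, "indie": -1, "techno": 1, "jazz": -1},
}


def agreement(a, b):
    """Average agreement on co-rated items: +1.0 same taste, -1.0 opposite."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[item] * b[item] for item in shared) / len(shared)


def recommend_furthest(user, k=1):
    """Recommend unrated items disliked by the k least similar users."""
    others = [u for u in FEEDBACK if u != user]
    # Ascending sort puts the least similar users first.
    furthest = sorted(others,
                      key=lambda u: agreement(FEEDBACK[user], FEEDBACK[u]))[:k]
    picks = set()
    for neighbour in furthest:
        for item, rating in FEEDBACK[neighbour].items():
            if rating < 0 and item not in FEEDBACK[user]:
                picks.add(item)
    return sorted(picks)


print(recommend_furthest("target"))
```

In this toy data the user "far" disagrees with the target on every shared item, so the jazz album that "far" disliked becomes the diverse recommendation, while a nearest-neighbour approach would instead have suggested the folk album liked by "near".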
To make accurate product recommendations you will need a well-built product recommendation system. Knowing whether to use content-based filtering, collaborative filtering, or a hybrid will largely depend on your project, and it will be important to make the right choice, as the quality of your system’s recommendations will impact the success of your business and the satisfaction of your customers.
If you’re in the midst of planning a new project and want to know which direction you should be considering, get in touch with Netguru. We’ll be more than happy to chat through your requirements and advise you on the best path forward.