Netguru Presents: Slack Vision Bot. Playing With Google Vision API And Ruby


Szymon Baranowski

Updated May 27, 2024 • 6 min read

Google’s documentation says: “Google Cloud Vision API enables developers to understand the content of an image”.

Thank you so much; what would we do without you?! I thought that teaching Slack to interpret pictures might be a little more exciting, though. Here’s how I did that by developing my own bot.

Why I Built the Vision Slack Bot

The idea behind this little project was to use my previous experience with Google Vision API and, being completely new to creating bots for Slack, implement one as quickly and straightforwardly as possible.

Google Vision Slack Bot Step-by-Step Manual

Step 1: Spot the gem

The first step in developing the bot was to do a bit of research to find existing solutions. Being a Ruby developer, I started by looking for a gem, and it didn’t take me long to find the awesome Slack Ruby Bot gem, which gives you the interfaces needed to implement the basic features of a Slack bot (like commands).

Step 2: Gemfile stuff

The Gemfile is pretty straightforward. It contains the previously mentioned slack-ruby-bot, Sinatra (a web framework), Puma (a web server) and dotenv (which loads variables from a .env file into ENV). For development, I also used Foreman, which manages Procfile-based applications. To communicate with Google Cloud Services I chose the Google API Client, which provides the classes needed to build a valid request object.
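A minimal sketch of what such a Gemfile might look like, built only from the gems named above. Version constraints are left out on purpose, and placing Foreman in a development group is an assumption. Depending on the slack-ruby-bot version, an additional concurrency gem may also be needed, so check its README.

```ruby
# Gemfile (sketch; versions intentionally omitted)
source 'https://rubygems.org'

gem 'slack-ruby-bot'     # Slack bot framework
gem 'sinatra'            # web framework
gem 'puma'               # web server
gem 'dotenv'             # loads .env into ENV
gem 'google-api-client'  # Google Cloud request classes

group :development do
  gem 'foreman'          # runs Procfile-based apps locally
end
```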

Speaking of building a request, I reused the request-building services originally written for the Picguard gem. Check out the main service.

It calls the rest of them to build one after another, nesting the previous object in the next one, just like a Russian Matryoshka doll:

[Image: matryoshka.gif]
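The nesting idea can be sketched with plain hashes that mirror the JSON shape of a Vision API annotate request. This is not the Picguard code: the `Builders` module and its method names are hypothetical, and the real services build objects from the Google API Client classes rather than hashes.

```ruby
# Hypothetical builders; each one wraps the result of the previous one.
module Builders
  # Innermost doll: the image bytes, Base64-encoded without line breaks.
  def self.image(bytes)
    { content: [bytes].pack('m0') }
  end

  # One entry per requested detection type.
  def self.features(types, max_results: 5)
    types.map { |type| { type: type, maxResults: max_results } }
  end

  # Middle doll: a single annotate request holding the image and its features.
  def self.annotate_request(bytes, types)
    { image: image(bytes), features: features(types) }
  end

  # Outermost doll: the batch wrapper that is sent to the API.
  def self.batch(bytes, types)
    { requests: [annotate_request(bytes, types)] }
  end
end

payload = Builders.batch('raw image bytes', %w[LABEL_DETECTION FACE_DETECTION])
```

Each builder only knows about the layer directly inside it, which keeps the individual services short.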

You can check out the rest of the building services in the repository; they are nice and short, so I’ll spare myself further explanation.

I also reused the `Likelihood` service, which maps Google’s likelihood values to numbers so that they can be compared.
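A minimal sketch of such a mapping, assuming the likelihood names used by the Vision API (`UNKNOWN` through `VERY_LIKELY`). The class layout here is an illustration and may differ from the actual service.

```ruby
# Illustrative only: ranks likelihood names so they can be compared.
class Likelihood
  include Comparable

  # Ordered from least to most likely.
  LEVELS = %w[UNKNOWN VERY_UNLIKELY UNLIKELY POSSIBLE LIKELY VERY_LIKELY].freeze

  attr_reader :rank

  def initialize(name)
    @rank = LEVELS.index(name.to_s.upcase)
    raise ArgumentError, "unknown likelihood: #{name}" if @rank.nil?
  end

  # Comparable derives <, <=, ==, >= and > from this method.
  def <=>(other)
    rank <=> other.rank
  end
end

flagged = Likelihood.new('LIKELY') >= Likelihood.new('POSSIBLE')
```

With the numeric rank in place, a threshold check such as "at least POSSIBLE" becomes a single comparison.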

Step 3: Ignite the Analyzer

Now let’s talk about one of the core classes of the bot - the Analyzer. The Analyzer is a service that interprets the Google Vision API response and creates a user-friendly message that’s finally going to be displayed on Slack. It consists of 6 smaller chunks of code - one chunk for each of the features - label detection, face detection, inappropriate content detection, logo detection, landmark detection and optical character recognition.
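To illustrate the idea, here is a sketch of an analyzer that turns a response into a message, with one small method per feature. It assumes the response has already been parsed into a Hash with the field names of the Vision API JSON (`labelAnnotations`, `faceAnnotations` and so on). The method names and message wording are made up for this example and are not taken from the bot.

```ruby
# Sketch only: one private method per detection feature.
class Analyzer
  RANK = %w[UNKNOWN VERY_UNLIKELY UNLIKELY POSSIBLE LIKELY VERY_LIKELY].freeze

  def initialize(response)
    @response = response
  end

  # Joins the lines of every feature that produced a result.
  def message
    [labels, faces, safe_search, logos, landmarks, text].compact.join("\n")
  end

  private

  # Shared helper for features that are a plain list of descriptions.
  def listing(title, key)
    names = Array(@response[key]).map { |annotation| annotation['description'] }
    "#{title}: #{names.join(', ')}" unless names.empty?
  end

  def labels
    listing('Labels', 'labelAnnotations')
  end

  def logos
    listing('Logos', 'logoAnnotations')
  end

  def landmarks
    listing('Landmarks', 'landmarkAnnotations')
  end

  def faces
    count = Array(@response['faceAnnotations']).size
    "Faces detected: #{count}" if count.positive?
  end

  # Reports categories rated LIKELY or higher.
  def safe_search
    ratings = @response['safeSearchAnnotation'] || {}
    flagged = ratings.select { |_category, level| RANK.index(level).to_i >= RANK.index('LIKELY') }
    "Possibly inappropriate: #{flagged.keys.join(', ')}" unless flagged.empty?
  end

  # The first text annotation carries the full recognized text.
  def text
    full = Array(@response['textAnnotations']).first
    "Text: #{full['description'].strip}" if full
  end
end

response = {
  'labelAnnotations' => [{ 'description' => 'cat' }, { 'description' => 'pet' }],
  'safeSearchAnnotation' => { 'adult' => 'VERY_UNLIKELY', 'violence' => 'LIKELY' }
}
summary = Analyzer.new(response).message
```

Features that are missing from the response return `nil` and are dropped, so the message only mentions what was actually detected.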

So once I had implemented the services that let me connect to the Google API and process the response, all I needed were the classes describing the bot behaviour.

First comes a simple bot class, Bot, in the slack_vision directory/module.

The second class, Analyze, in the slack_vision/commands directory, represents the command: the trigger word that tells the bot what to do. In my case I only needed an analyze command that uses the previously implemented services and returns the result to the user. I also added an extra feature that handles exceptions when the API request limit is reached. It’s quite helpful considering that Google Vision API is not free and you might want to set a daily limit so your bill doesn’t get out of control.
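The limit handling can be sketched independently of the Slack gem. In this example `QuotaExceededError` is a hypothetical stand-in for whatever error the API client raises when the quota is exhausted, and `analysis_reply` is an illustrative helper, not a method from the bot. In the real command, the returned text would be what the bot posts back to the channel.

```ruby
# Hypothetical error standing in for the client's "limit reached" exception.
class QuotaExceededError < StandardError; end

# Runs the given analysis block and always returns text that is safe to post.
def analysis_reply(image_url)
  yield(image_url)
rescue QuotaExceededError
  'Daily Vision API limit reached. Please try again tomorrow.'
rescue StandardError => e
  "Could not analyze the image: #{e.message}"
end

ok      = analysis_reply('http://example.com/cat.jpg') { |url| "Analyzed #{url}" }
limited = analysis_reply('http://example.com/cat.jpg') { |_url| raise QuotaExceededError }
```

Rescuing the quota error separately lets the bot explain the situation to the user instead of failing silently or printing a stack trace.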

Step 4: Ready, setup, go!

At this point my bot’s logic is ready. It’s high time to finally take care of the setup. The main thing is the file in the root directory that ties together all of the files containing the logic.

Next comes a simple Sinatra web config, which I need in order to keep the bot awake on Heroku.

The final piece of code is the config.ru file, which runs both the bot and the web server.
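A sketch of what such a config.ru could look like, following the pattern commonly used with slack-ruby-bot. The file names `slack_vision` and `web`, and the `SlackVision::Web` class for the Sinatra app, are assumptions; only the `Bot` class in the `slack_vision` module comes from the description above.

```ruby
# config.ru (sketch; file and class names other than SlackVision::Bot are assumed)
$LOAD_PATH.unshift(File.dirname(__FILE__))

require 'slack_vision'
require 'web'

# Surface bot crashes instead of letting the thread die silently.
Thread.abort_on_exception = true

# The bot runs in a background thread so the web server can own the main one.
Thread.new do
  SlackVision::Bot.run
end

run SlackVision::Web
```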

And that’s it! You can now both run the bot locally and push it to Heroku. You can check the Vision Slack Bot README for detailed information about both.

My conclusions

Writing this bot was a great example of how complex features can be implemented in just a few hours using the right tools. The Google Cloud Vision API handles the heavy work - digital image processing - which I could easily call from the Ruby code thanks to the Google API Ruby Client gem. The communication with Slack was extremely easy thanks to the Slack Ruby Bot gem and its great documentation. The only core thing left for me as a developer was to implement parsing the response from the API and to return it to the user in a readable and informative way.

I enjoyed creating my first Slack Bot and hopefully this post will inspire you to implement one yourself. Don’t forget to share your ideas with me, and feel free to fork the repo so you can collaborate and make this bot even more awesome!
