Jun 03, 2019

Using Tesseract OCR in Elixir/Phoenix


Lately, I have been exploring the use of OCR in Expendere (my expense tracking application) and came across Tesseract OCR.

At the time of writing this blog post, there is no native binding for Tesseract OCR in Elixir. However, there are two Elixir wrappers available on GitHub.

Both wrappers use System.cmd/3 to invoke the tesseract command-line interface and return the output of the executed command.
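To make the idea concrete, here is a minimal sketch of how such a wrapper could be built on System.cmd/3. The module name OcrSketch and its read/2 function are hypothetical and are not the API of either wrapper. The sketch assumes the tesseract binary is on the PATH and relies on the CLI convention that passing stdout as the output base prints the recognized text to standard output.

```elixir
defmodule OcrSketch do
  # Hypothetical helper for illustration only; not taken from either wrapper.
  # `executable` is a parameter so the call can be pointed at another binary.
  def read(image_path, executable \\ "tesseract") do
    # `tesseract IMAGE stdout` writes the recognized text to standard output.
    # System.cmd/3 returns {collected_output, exit_status}.
    case System.cmd(executable, [image_path, "stdout"], stderr_to_stdout: true) do
      {output, 0} -> {:ok, String.trim(output)}
      {output, status} -> {:error, {status, output}}
    end
  end
end
```

Note that System.cmd/3 raises if the executable cannot be found, so a production wrapper would probably want to handle that case as well.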

Seeing that wrappers were already available, I quickly grabbed one and scaffolded a Phoenix application to test it out.


In this code example, I will be using the wrapper from tesseract-ocr-elixir.

In mix.exs, add tesseract-ocr-elixir as a dependency:

def deps do
  [{:tesseract_ocr, "~> 0.1.0"}]
end

In page_controller.ex:

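defmodule OcrWeb.PageController do
  use OcrWeb, :controller

  # Render the upload form
  def index(conn, _params) do
    render(conn, "index.html")
  end

  def create(conn, %{"upload" => %Plug.Upload{} = upload}) do
    # Assumes the wrapper's TesseractOcr.read/1, which takes the image path
    result = TesseractOcr.read(upload.path)
    render(conn, "show.html", result: result)
  end
end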
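The controller above hands any uploaded file straight to the OCR step. As a hedged sketch, a small check on the file name could reject obvious non-images first. The UploadCheck module and its list of extensions are assumptions made for illustration; which formats tesseract actually accepts depends on how it was built.

```elixir
defmodule UploadCheck do
  # Assumed list of image extensions; adjust to what your tesseract build supports.
  @allowed ~w(.png .jpg .jpeg .tif .tiff)

  # Returns true when the file name ends in one of the allowed extensions,
  # compared case-insensitively.
  def image?(filename) do
    ext = filename |> Path.extname() |> String.downcase()
    ext in @allowed
  end
end
```

In the create action, this could be called with upload.filename before running OCR, rendering an error message when it returns false.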

In router.ex, add this under scope "/":

get "/", PageController, :index
post "/upload", PageController, :create

In templates/page/index.html.eex:

<%= form_for @conn, Routes.page_path(@conn, :create),
                    [multipart: true], fn f -> %>
    <%= file_input f, :upload, class: "form-control" %>
    <%= submit "Upload", class: "btn btn-primary" %>
<% end %>

In templates/page/show.html.eex:


<%= @result %>



Voilà, a simple OCR application is done. The demo application is available on GitHub.