Image To Text Converter (OCR) In Django Using Pytesseract Package [Django 4] With Ajax

March 26, 2023



Hey everyone, in this tutorial we will build an image-to-text converter using Django and the pytesseract (OCR) package, with Ajax handling the upload. It is an interesting project, since converting an image to text is a problem many of us run into. So we will create our own OCR tool using Django. Let's get started.

First, create a virtual environment.

If virtualenv is not installed on your system, run:

pip install virtualenv

Then create the virtual environment:

virtualenv env

Here, env is the name of the virtual environment.

Now activate the virtual environment with the command for your platform.

For Windows:

.\env\Scripts\activate

For Linux:

source env/bin/activate
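
To confirm that the activation worked, a quick check from Python can help. The sketch below is not part of the original project, and the function name in_virtualenv is made up for illustration. It relies on the fact that inside a virtual environment sys.prefix differs from sys.base_prefix (older virtualenv releases set sys.real_prefix instead).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases expose sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # In a venv or a recent virtualenv, sys.prefix points at the environment
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python command you will use for the project; if it reports False, the activation step did not take effect in that shell.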

Let's install the required packages.

Install Django:

pip install django

 

Install Pillow, which is used for opening the image file:

pip install Pillow

 

Next, install pytesseract and the Tesseract engine it wraps.

For Windows:

pip install pytesseract

 

Now install Tesseract itself using the Windows installer ( https://github.com/UB-Mannheim/tesseract/wiki )

After installing, remember the path where you installed it. The default is usually C:\Users\USER\AppData\Local\Tesseract-OCR

For Linux:

pip install pytesseract

Then install the Tesseract engine:

sudo apt install tesseract-ocr
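
Since the location of the Tesseract binary differs between the two platforms, it can be handy to look it up instead of hard-coding it. The following is a minimal sketch using only the standard library; find_tesseract and WINDOWS_DEFAULT are hypothetical names, and the Windows path is simply the default mentioned above, so adjust it to your own install location.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed default install location from the Windows installer; adjust as needed.
WINDOWS_DEFAULT = r"C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe"


def find_tesseract(candidates: Iterable[str] = (WINDOWS_DEFAULT,)) -> Optional[str]:
    """Return a path to the tesseract executable, or None if none is found."""
    # First look on PATH (the usual case on Linux after apt install).
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit candidate paths.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_tesseract() or "tesseract not found")
```

If this prints a path, that value is what pytesseract.pytesseract.tesseract_cmd needs to point at when the binary is not on PATH.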

The next step is to create a Django project.

To create it, run the following command:

django-admin startproject djangoocr

Now move into the djangoocr project directory (the one containing manage.py) and create a Django app named main:

django-admin startapp main

Now open the settings.py file inside the djangoocr directory and add main to INSTALLED_APPS:

INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "django.contrib.sites",  # optional; nothing in this tutorial uses it
    "main",
]

Then create a templates directory in the project root and add it to TEMPLATES in your settings.py file:

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [os.path.join(BASE_DIR, "templates")],
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",

            ],
        },
    },
]

Note: import the os module at the top of your settings.py file:

import os
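
As an aside, the settings.py generated by Django 4 defines BASE_DIR as a pathlib.Path, so the template directory can also be written as BASE_DIR / "templates" without importing os. The short sketch below only illustrates that the two spellings name the same directory; the project location used here is a made-up placeholder.

```python
import os
from pathlib import Path

# Hypothetical project location, used only for illustration.
BASE_DIR = Path("/srv/djangoocr")

# The spelling used in the tutorial's TEMPLATES setting.
joined = os.path.join(BASE_DIR, "templates")

# The pathlib spelling, which needs no "import os" in settings.py.
with_pathlib = BASE_DIR / "templates"

# Both spellings name the same directory.
print(joined == str(with_pathlib))
```

Either form works in the "DIRS" list, so which one to use is a matter of taste.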

Now include the main app's URLs in the urls.py file inside the djangoocr directory:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("main.urls")),
]

Our next step is to create a urls.py file inside the main app. It needs to route a URL named imagetoText to the view, because the template refers to that name.
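
The original post does not show the contents of this file, so the following is only a sketch of what main/urls.py could look like. The route name imagetoText is taken from the {% url 'imagetoText' %} tag in the template and the view name from views.py; serving the view at the site root ("") is an assumption.

```python
# main/urls.py (sketch; the empty path is an assumption)
from django.urls import path

from . import views

urlpatterns = [
    # The name must match the {% url 'imagetoText' %} tag used in the template.
    path("", views.imagetoText, name="imagetoText"),
]
```

Any other path would work as well, as long as the name stays imagetoText.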

Now let's create our view in the views.py file of the main app:

 

from django.shortcuts import render
from django.http import JsonResponse
from PIL import Image
import pytesseract

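# Windows only: point pytesseract at the installed binary. Remove this line on Linux.
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'


def imagetoText(request):
    if request.method == 'POST':
        # Return a JSON error instead of raising when no file was uploaded
        image_file = request.FILES.get('image')
        if image_file is None:
            return JsonResponse({'status': 'error', 'text': 'No image uploaded'}, status=400)
        # Pillow opens the uploaded file; pytesseract runs OCR on it
        image = Image.open(image_file)
        text = pytesseract.image_to_string(image)
        return JsonResponse({'status': 'success', 'text': text})
    # GET request: show the upload form
    return render(request, 'imagetotext.html')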
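
The view above hands whatever file it receives to Pillow. One possible hardening step, sketched here under assumptions, is to reject unexpected uploads before running OCR. The helper name upload_error, the size limit, and the list of accepted types are all illustrative choices, not part of the original tutorial.

```python
from typing import Optional

# Illustrative limits; these values are assumptions, not from the tutorial.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # 5 MB
ALLOWED_CONTENT_TYPES = {
    "image/png",
    "image/jpeg",
    "image/gif",
    "image/bmp",
    "image/tiff",
}


def upload_error(content_type: Optional[str], size: int) -> Optional[str]:
    """Return an error message for a rejected upload, or None if it looks acceptable."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        return "Unsupported file type"
    if size > MAX_UPLOAD_BYTES:
        return "File is too large"
    return None


if __name__ == "__main__":
    print(upload_error("image/png", 1024))
    print(upload_error("application/pdf", 1024))
```

In the view, such a helper could be called with the content_type and size attributes of the file taken from request.FILES, returning a JsonResponse with status 400 when a message comes back. Keep in mind that content_type is supplied by the browser, so this is a convenience check rather than a security guarantee.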

 

Now, inside your templates folder, create an HTML file named imagetotext.html:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>Image to Text Converter</title>
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-GLhlTQ8iRABdZLl6O3oVMWSktQOp6b7In1Zl3/Jr59b6EGGoI1aFkw7cmDA6j6gD"
      crossorigin="anonymous"
    />
    <script
      src="https://code.jquery.com/jquery-3.6.3.min.js"
      integrity="sha256-pvPw+upLPUjgMXY0G+8O0xUf+/Im1MZjXxxgOcBQBXU="
      crossorigin="anonymous"
    ></script>
  </head>
  <body>
    <div class="container">
      <div class="row">
        <div class="col-md-12 mt-5">
          <h1>Image to Text</h1>
          <p>Convert your image to text</p>
          <form
            action="{% url 'imagetoText' %}"
            method="post"
            id="imagetotextForm"
            enctype="multipart/form-data"
          >
            {% csrf_token %}
            <div class="form-group">
              <label for="image">Image</label>
              <input
                type="file"
                class="form-control"
                name="image"
                id="image"
                placeholder="Image"
                accept="image/*"
              />
            </div>
            <button type="submit" class="btn btn-primary btn-submit mt-4">
              Submit
            </button>
          </form>
        </div>
      </div>

      <div class="row mt-4 mb-5">
        <div class="header">
          <h1 class="fw-bold text-center">Result</h1>
        </div>

        <textarea
          disabled
          style="height: 300px; margin-bottom: 100px; resize: none"
          id="result"
          class="col-md-12 mt-2 bg-light p-5 result"
        ></textarea>
      </div>
    </div>

    <script>
      $(document).ready(function () {
        $("#imagetotextForm").submit(function (e) {
          e.preventDefault();
          var formData = new FormData(this);
          $(".btn-submit").html(
            '<span class="spinner-border spinner-border-sm" role="status" aria-hidden="true"></span> Loading...'
          );
          // disable submit button
          $(".btn-submit").attr("disabled", true);
          $.ajax({
            url: "{% url 'imagetoText' %}",
            type: "POST",
            data: formData,
            cache: false,
            contentType: false,
            processData: false,
            success: function (data) {
              // use .val() so the OCR text is inserted into the textarea as plain text
              $(".result").val(data.text);
              // remove spinner from submit button
              $(".btn-submit").html("Submit");
              // enable submit button
              $(".btn-submit").attr("disabled", false);
            },
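            error: function () {
              // tell the user that the conversion failed
              $(".result").val("Could not convert the image. Please try again.");
              // remove spinner from submit button
              $(".btn-submit").html("Submit");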
              // enable submit button
              $(".btn-submit").attr("disabled", false);
            },
          });
        });
      });
    </script>

    <script
      src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/js/bootstrap.bundle.min.js"
      integrity="sha384-w76AqPfDkMBDXo30jS1Sgez6pr3x5MlQ1ZAGC+nuZB+EYdgRZgiwxhTBTkF7CXvN"
      crossorigin="anonymous"
    ></script>
  </body>
</html>

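Start the development server with python manage.py runserver and open the site in your browser. You will see an interface with an image upload form and a Result box that shows the extracted text.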

 

Appreciate you stopping by my post! 😊

Author's profile

Aditya Pandey

Python Developer

Hi, I am Aditya. I am a full stack web developer from Nepal.
