
Image to Text Converter (OCR) in Django Using the Pytesseract Package [Django 4] With Ajax
March 26, 2023
Hey everyone, in this tutorial we will build an Image to Text Converter using Django and the Pytesseract (OCR) package, with Ajax. This is an interesting project, since many of us have run into trouble when converting an image to text. So we will create our own OCR software using Django. Let's start the tutorial.
First, create a virtual environment.
If virtualenv is not installed on your system, run
pip install virtualenv
Then create a virtual environment by running
virtualenv env
Here env is the name of the virtual environment.
Now activate the virtual environment with the following command
For Windows:
.\env\Scripts\activate
For Linux:
source env/bin/activate
Let's install the required packages
Install Django
pip install django
Install Pillow, which is used for opening the image file
pip install Pillow
For Windows:
pip install pytesseract
Now install Tesseract itself using the installer ( https://github.com/UB-Mannheim/tesseract/wiki )
After installing it, remember the path where you installed it. The default is usually C:\Users\USER\AppData\Local\Tesseract-OCR
For Linux:
pip install pytesseract
Then,
sudo apt install tesseract-ocr
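Before moving on, it can help to confirm that the Tesseract binary is reachable. The snippet below is a minimal sketch using only the Python standard library; the helper names find_tesseract and tesseract_version are made up for this illustration and are not part of pytesseract.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path of the tesseract executable found on PATH, or None."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract builds print the version to stderr instead of stdout
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH")
    else:
        print(path, "->", tesseract_version(path))
```

On Windows the installer may not add Tesseract to PATH, in which case this check reports nothing even though Tesseract is installed. That is the situation the tesseract_cmd line in the view is meant to handle.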
The next step is to create a Django project.
To create a Django project, run the following command
django-admin startproject djangoocr
Now move into the project directory and create a Django app named main
cd djangoocr
django-admin startapp main
Now configure the settings.py file inside the djangoocr directory and add main to INSTALLED_APPS
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "django.contrib.sites",
    "main",
]
Then create a templates directory in the project root and add it to TEMPLATES in your settings.py file
TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [os.path.join(BASE_DIR, "templates")],
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]
Note: Import the os module at the top of your settings.py file
import os
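As a side note, settings files generated by Django 3.1 and later define BASE_DIR as a pathlib.Path, so the templates path can also be written with the / operator. The sketch below only illustrates that the two spellings point at the same directory; the /srv/djangoocr location is a made-up value for this example.

```python
import os
from pathlib import Path

# Hypothetical project location, used only for this illustration;
# a generated settings.py derives BASE_DIR from __file__ instead.
BASE_DIR = Path("/srv/djangoocr")

# Style used in this tutorial (needs `import os`)
templates_with_os = os.path.join(BASE_DIR, "templates")

# pathlib style (no extra import needed in a generated settings.py)
templates_with_pathlib = BASE_DIR / "templates"

# Both spellings refer to the same directory
print(Path(templates_with_os) == templates_with_pathlib)
```

Either form works in "DIRS"; the os.path.join version is the one this tutorial uses.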
Now include the main app's URLs in the urls.py file inside the djangoocr directory
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("main.urls")),
]
Our next step is to create a urls.py file inside the main app.
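The contents of this file are not shown in the tutorial. Since the template resolves {% url 'imagetoText' %} and the view function is called imagetoText, a minimal sketch of main/urls.py could look like the following. Serving the view at the empty path is an assumption; any path works as long as the name matches.

```python
# main/urls.py
from django.urls import path

from . import views

urlpatterns = [
    # The name "imagetoText" is what {% url 'imagetoText' %} looks up in the template
    path("", views.imagetoText, name="imagetoText"),
]
```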
Now let's create our view in main/views.py
from django.shortcuts import render
from django.http import JsonResponse
from PIL import Image
import pytesseract

# Point pytesseract at the Tesseract executable.
# This line is only needed on Windows; leave it out on Linux.
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'


def imagetoText(request):
    if request.method == 'POST':
        # Read the uploaded file from the "image" form field
        image = request.FILES['image']
        image = Image.open(image)
        # Run OCR and send the recognized text back as JSON
        text = pytesseract.image_to_string(image)
        return JsonResponse({'status': 'success', 'text': text})
    # For any other request method, render the upload form
    return render(request, 'imagetotext.html')
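The view assumes that every POST request carries a file in the image field; request.FILES['image'] raises an error when it is missing. A pre-check along the following lines could guard against that. This is only a sketch: the function name validate_upload, the extension list and the 5 MB limit are assumptions made for illustration, not part of the original project.

```python
import os

# Assumed limits for illustration; adjust them for your own deployment
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".gif", ".tif", ".tiff", ".webp"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # 5 MB


def validate_upload(filename, size):
    """Return an error message for a bad upload, or None if it looks acceptable."""
    if not filename:
        return "No image was uploaded."
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"Unsupported file type: {extension or 'none'}"
    if size > MAX_UPLOAD_BYTES:
        return "The image is larger than 5 MB."
    return None


print(validate_upload("scan.PNG", 120_000))   # → None
print(validate_upload("notes.pdf", 120_000))  # → Unsupported file type: .pdf
```

Inside the view, one way to use it would be to fetch the file with upload = request.FILES.get('image'), call validate_upload(upload.name, upload.size) when upload is not None, and return a JsonResponse with status=400 whenever a message comes back.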
Now, inside your templates folder, create an HTML file named imagetotext.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>Image to Text Converter</title>
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-GLhlTQ8iRABdZLl6O3oVMWSktQOp6b7In1Zl3/Jr59b6EGGoI1aFkw7cmDA6j6gD"
      crossorigin="anonymous"
    />
    <script
      src="https://code.jquery.com/jquery-3.6.3.min.js"
      integrity="sha256-pvPw+upLPUjgMXY0G+8O0xUf+/Im1MZjXxxgOcBQBXU="
      crossorigin="anonymous"
    ></script>
  </head>
  <body>
    <div class="container">
      <div class="row">
        <div class="col-md-12 mt-5">
          <h1>Image to Text</h1>
          <p>Convert your image to text</p>
          <form
            action="{% url 'imagetoText' %}"
            method="post"
            id="imagetotextForm"
            enctype="multipart/form-data"
          >
            {% csrf_token %}
            <div class="form-group">
              <label for="image">Image</label>
              <input
                type="file"
                class="form-control"
                name="image"
                id="image"
                placeholder="Image"
                accept="image/*"
              />
            </div>
            <button type="submit" class="btn btn-primary btn-submit mt-4">
              Submit
            </button>
          </form>
        </div>
      </div>
      <div class="row mt-4 mb-5">
        <div class="header">
          <h1 class="fw-bold text-center">Result</h1>
        </div>
        <textarea
          disabled
          style="height: 300px; margin-bottom: 100px; resize: none"
          id="result"
          class="col-md-12 mt-2 bg-light p-5 result"
        ></textarea>
      </div>
    </div>
    <script>
      $(document).ready(function () {
        $("#imagetotextForm").submit(function (e) {
          // stop the normal form submission so the page does not reload
          e.preventDefault();
          var formData = new FormData(this);
          $(".btn-submit").html(
            '<span class="spinner-border spinner-border-sm" role="status" aria-hidden="true"></span> Loading...'
          );
          // disable submit button
          $(".btn-submit").attr("disabled", true);
          $.ajax({
            url: "{% url 'imagetoText' %}",
            type: "POST",
            data: formData,
            cache: false,
            contentType: false,
            processData: false,
            success: function (data) {
              var tex = data.text;
              // set the textarea value so the OCR output is shown as plain text, not parsed as HTML
              $(".result").val(tex);
              // remove spinner from submit button
              $(".btn-submit").html("Submit");
              // enable submit button
              $(".btn-submit").attr("disabled", false);
            },
            error: function (data) {
              // remove spinner from submit button
              $(".btn-submit").html("Submit");
              // enable submit button
              $(".btn-submit").attr("disabled", false);
            },
          });
        });
      });
    </script>
    <script
      src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/js/bootstrap.bundle.min.js"
      integrity="sha384-w76AqPfDkMBDXo30jS1Sgez6pr3x5MlQ1ZAGC+nuZB+EYdgRZgiwxhTBTkF7CXvN"
      crossorigin="anonymous"
    ></script>
  </body>
</html>
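Start the development server with python manage.py runserver and open http://127.0.0.1:8000/ in your browser. You will see an interface like the one below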
[Screenshot: django pytesseract ocr]
Thanks for reading.