Image to Text with Python: A Beginner’s Guide to OCR

Have you ever wanted to extract text from an image — a screenshot, scanned document, or photo of a sign — and convert it into editable, searchable text? That’s where OCR (Optical Character Recognition) comes in.
In this article, you’ll learn how to build a Python script that takes an image and extracts readable text from it using OCR. Whether you're learning Python or looking to automate data capture, this guide is a good place to start.
What You'll Learn
What OCR is and how it works.
How to use Python with pytesseract and Pillow for OCR.
How to convert an image into plain text.
Common challenges and how to improve accuracy.
What is OCR?
OCR (Optical Character Recognition) is the process of converting images of typed, handwritten, or printed text into machine-readable text. It's useful for:
Digitizing printed documents.
Reading text from screenshots or camera images.
Automating data entry from forms or receipts.
Python makes this easy with libraries like pytesseract.
What You Need
We'll use the following Python libraries:
pytesseract: a Python wrapper for the Tesseract OCR engine.
Pillow: handles image loading and processing.
Install Requirements
Before you begin, install the libraries:
pip install pytesseract pillow
You also need to install the Tesseract OCR engine itself, because pytesseract is only a wrapper around it. After installing, make sure the tesseract executable is on your system PATH.
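If you're not sure whether Tesseract is reachable, a quick check along these lines can help. This is a minimal sketch: find_tesseract and point_pytesseract_at are hypothetical helper names, and the Windows path in the comment is only a typical install location, so adjust it to your machine.

```python
import shutil


def find_tesseract():
    """Return the full path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def point_pytesseract_at(executable_path):
    """Tell pytesseract where the Tesseract executable lives.

    Only needed when Tesseract is installed but not on PATH.
    """
    import pytesseract  # imported here so the PATH check works without it

    pytesseract.pytesseract.tesseract_cmd = executable_path


if __name__ == "__main__":
    path = find_tesseract()
    if path:
        print(f"Tesseract found at: {path}")
    else:
        print("Tesseract is not on PATH.")
        # Example only; adjust to wherever Tesseract is installed:
        # point_pytesseract_at(r"C:\Program Files\Tesseract-OCR\tesseract.exe")
```

Setting tesseract_cmd is an alternative to editing PATH, which can be handy when you cannot change system settings.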
How the Script Works — Step-by-Step
Step 1: Import Necessary Modules
from PIL import Image
import pytesseract
import os
PIL.Image handles loading and processing the image.
pytesseract performs the OCR to extract text.
os helps with file path handling.
Step 2: Load the Image
image_path = r"C:\Users\YourName\Pictures\sample_image.png"
image = Image.open(image_path)
Replace the path with the location of your image. The image is opened using Pillow’s Image.open() method.
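A wrong path or an unsupported file is the most common reason this step fails. The sketch below wraps the loading step so the error message says what went wrong. The load_image helper is a hypothetical name for illustration and not part of Pillow.

```python
import os
from PIL import Image, UnidentifiedImageError


def load_image(image_path):
    """Open an image for OCR, raising a readable error if that isn't possible."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(f"No image found at: {image_path}")
    try:
        image = Image.open(image_path)
        image.load()  # read the pixel data now, so problems surface here
    except UnidentifiedImageError as exc:
        raise ValueError(f"Not a readable image file: {image_path}") from exc
    return image
```

You could then write image = load_image(image_path) in place of the direct Image.open() call.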
Step 3: Extract Text from the Image
extracted_text = pytesseract.image_to_string(image)
pytesseract.image_to_string() passes the image to Tesseract and returns the recognized text as a single string, keeping the spacing and line breaks that Tesseract detected.
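image_to_string() also accepts optional lang and config arguments that are passed through to Tesseract. The sketch below assumes English text laid out as a single block (page segmentation mode 6), which may not match your image. extract_text and tidy_text are hypothetical helpers written for this example.

```python
def tidy_text(raw_text):
    """Strip trailing spaces and drop blank lines at the start and end."""
    lines = [line.rstrip() for line in raw_text.splitlines()]
    return "\n".join(lines).strip("\n")


def extract_text(image, lang="eng", page_seg_mode=6):
    """Run OCR on a Pillow image and return tidied text.

    lang is a Tesseract language code; page_seg_mode is passed to
    Tesseract as --psm (6 assumes a single uniform block of text).
    """
    import pytesseract  # imported lazily so tidy_text works on its own

    raw_text = pytesseract.image_to_string(
        image, lang=lang, config=f"--psm {page_seg_mode}"
    )
    return tidy_text(raw_text)
```

Other languages only work if the matching Tesseract language data is installed on your system.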
Step 4: Save the Extracted Text (Optional)
output_path = os.path.splitext(image_path)[0] + "_output.txt"
with open(output_path, 'w', encoding='utf-8') as f:
    f.write(extracted_text)
The extracted text is saved to a .txt file next to the original image.
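To see how the output filename is derived, here is the same os.path.splitext() logic on its own. The build_output_path name and the sample paths are made up for illustration.

```python
import os


def build_output_path(image_path, suffix="_output.txt"):
    """Return the text-file path that sits next to the given image."""
    root, _extension = os.path.splitext(image_path)
    return root + suffix


print(build_output_path("scans/receipt.png"))      # scans/receipt_output.txt
print(build_output_path("scans/page.final.jpeg"))  # scans/page.final_output.txt
```

Only the last extension is removed, so dots earlier in the filename are kept.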
Step 5: Print the Output
print("✅ Text extracted from image:")
print(extracted_text)
This prints the text in the console for immediate viewing.
Full Script
from PIL import Image
import pytesseract
import os
# === Step 1: Provide the path to the image file ===
image_path = r"C:\Users\YourName\Pictures\sample_image.png" # Replace with your actual image path
# === Step 2: Open the image using Pillow ===
image = Image.open(image_path)
# === Step 3: Use pytesseract to do OCR ===
extracted_text = pytesseract.image_to_string(image)
# === Step 4: Save the extracted text to a .txt file ===
output_path = os.path.splitext(image_path)[0] + "_output.txt"
with open(output_path, 'w', encoding='utf-8') as f:
    f.write(extracted_text)
# === Step 5: Display the result ===
print("✅ Text extracted from image:")
print(extracted_text)
Tips for Better Accuracy
Use high-quality, high-resolution images.
Avoid blurry or skewed pictures.
Convert images to black and white for better contrast.
Use image = image.convert("L") to convert to grayscale if needed.
Preprocess images (e.g., resize, sharpen) using OpenCV for complex tasks.
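Several of these tips can be tried with Pillow alone before reaching for OpenCV. The following is a rough sketch and not a tuned pipeline: preprocess_for_ocr is a hypothetical helper, and the default scale and threshold values are guesses that you would need to adjust for your own images.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image, scale=2, threshold=150):
    """Return a grayscale, enlarged, black-and-white copy of a Pillow image."""
    # Grayscale first, then stretch the contrast across the full range.
    gray = ImageOps.autocontrast(image.convert("L"))
    # Enlarge small text; LANCZOS keeps edges reasonably sharp.
    width, height = gray.size
    enlarged = gray.resize((width * scale, height * scale), Image.LANCZOS)
    # Map every pixel to pure black or pure white.
    return enlarged.point(lambda pixel: 255 if pixel > threshold else 0)
```

Pass the result to pytesseract.image_to_string() in place of the original image and compare the output, since preprocessing can make results worse as well as better.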
Use Cases
Scanning business cards into contact lists.
Digitizing printed books.
Extracting information from forms, IDs, or receipts.
Reading license plates, signs, or handwritten notes (Tesseract is tuned for printed text, so expect weaker results on handwriting).
Summary
OCR is a powerful tool that unlocks text trapped in images. With Python, pytesseract, and a few lines of code, you can extract and manipulate that text with ease. This project is a great way to dip your toes into real-world automation and image processing.
Happy scripting!




