Converting Large Text Files to Speech with Python

Introduction

Text-to-Speech (TTS) technology allows a program to read text aloud using a synthetic voice. It's widely used in accessibility applications, educational tools, and content automation. In this article, we’ll walk through how to use Python to convert large .txt files into spoken audio using the pyttsx3 library.

Unlike cloud-based TTS libraries that require an internet connection, pyttsx3 works offline by driving the speech engine already present on your system (SAPI5 on Windows, NSSpeechSynthesizer on macOS, and eSpeak on Linux). This makes it a practical choice for converting large files without network latency or API rate limits.

What You’ll Learn

  • How to install and use the pyttsx3 library

  • How to process large text files in manageable chunks

  • How to speak or save long texts as audio

  • How to avoid common performance issues

Installing the Required Library

We will be using the pyttsx3 library, which provides an interface to platform-specific TTS engines.

Install it using pip:

pip install pyttsx3

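This library allows Python to access your system’s speech engine without needing any external services. On Linux, the engine itself (eSpeak or eSpeak NG) may need to be installed separately through your package manager.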

Why Process Large Text Files in Chunks?

When working with large text files, such as eBooks, articles, or transcripts, passing the entire content to the engine at once can use a lot of memory and may leave the TTS engine unresponsive. The program also stays blocked until the whole text has been spoken, with no progress feedback along the way. To handle this efficiently, we divide the file into smaller chunks and process them sequentially.

Step-by-Step Script to Speak Large Files

The following script reads a large text file and speaks it aloud in smaller segments.

import pyttsx3

# Step 1: Initialize the TTS engine
engine = pyttsx3.init()
engine.setProperty('rate', 150)  # Set speech rate (words per minute)

# Step 2: Function to read large text in chunks
def read_in_chunks(file_path, chunk_size=3000):
    with open(file_path, 'r', encoding='utf-8') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Step 3: Speak the file content chunk by chunk
file_path = "large_text.txt"  # Path to your large text file

for i, chunk in enumerate(read_in_chunks(file_path)):
    print(f"Speaking chunk {i + 1}...")
    engine.say(chunk)
    engine.runAndWait()

print("Finished speaking all text.")

Explanation of the Code

  1. Initialization: We initialize the TTS engine using pyttsx3.init() and configure the speech rate using engine.setProperty('rate', 150). The rate is measured in words per minute, and pyttsx3 defaults to 200, so 150 gives a slower pace that can sound more natural for long reads.

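  2. Chunk Generator: The function read_in_chunks() reads the file in segments of a specified number of characters (3000 by default). This keeps memory use bounded and the voice engine responsive. Note that a fixed character count can cut a word or sentence in two, which may produce a short audible break at the chunk boundary.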

  3. Processing Loop: For each chunk, we instruct the engine to speak the content using engine.say() and wait for it to complete using engine.runAndWait().

  4. Status Output: We print the current chunk number to indicate progress through the text.
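If mid-sentence breaks between chunks are noticeable, one option is to end each chunk at sentence-ending punctuation instead of at a fixed character count. The following is a minimal sketch of that idea, not part of pyttsx3: the function name read_sentence_chunks and its enders parameter are hypothetical, and the boundary rule is a simple heuristic that also treats the period in an abbreviation as a sentence end.

```python
import io


def read_sentence_chunks(stream, chunk_size=3000, enders=".!?\n"):
    """Yield pieces of roughly chunk_size characters that end at a sentence boundary."""
    carry = ""  # text left over from the previous read that ended mid-sentence
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        text = carry + block
        # Position of the last sentence-ending character in the combined text.
        cut = max(text.rfind(ch) for ch in enders)
        if cut == -1:
            # No boundary found yet: keep accumulating instead of splitting.
            carry = text
            continue
        yield text[:cut + 1]
        carry = text[cut + 1:]
    if carry.strip():
        yield carry  # whatever remains at the end of the input


# Small in-memory demonstration; a real run would pass an open text file.
sample = io.StringIO("First sentence. Second one is longer! Third?")
chunks = list(read_sentence_chunks(sample, chunk_size=20))
print(chunks)  # ['First sentence.', ' Second one is longer!', ' Third?']
```

To use it with the script above, open the file with open(file_path, 'r', encoding='utf-8') and pass the file object to read_sentence_chunks() in place of calling read_in_chunks(file_path). Chunks can exceed chunk_size when a sentence is long, which is the trade-off for never splitting one.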

Saving the Spoken Text as an Audio File

You may want to save the spoken version as an audio file, especially for offline playback or sharing. pyttsx3 provides the save_to_file() method for this.

Here’s how to save the entire text as a WAV file:

import pyttsx3

engine = pyttsx3.init()
engine.setProperty('rate', 150)

# Step 1: Read the full file content
with open("large_text.txt", "r", encoding='utf-8') as file:
    full_text = file.read()

# Step 2: Queue the text for synthesis; the file is written when runAndWait() runs
engine.save_to_file(full_text, "output_audio.wav")
engine.runAndWait()

print("Audio saved as 'output_audio.wav'")

Notes

  • The save_to_file() function queues the text, and the spoken output is written to the file once runAndWait() is called.

  • pyttsx3 does not encode MP3 itself. The audio format is decided by the platform’s speech engine (typically WAV on Windows and Linux, while macOS may produce AIFF), whatever extension you give the file. Prefer .wav as shown, and convert with a separate tool if you need MP3.
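Because the file extension alone does not guarantee the audio format, it can help to inspect the saved file. The sketch below checks for the RIFF/WAVE header that WAV files start with. The helper name looks_like_wav is hypothetical, and the demonstration writes its own tiny silent WAV with the standard library wave module so that it runs without a speech engine.

```python
import os
import tempfile
import wave


def looks_like_wav(path):
    """Return True if the file begins with a RIFF/WAVE header."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


# Demonstration: write 100 frames of 16-bit mono silence, then check it.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.wav")
with wave.open(demo_path, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(22050)
    w.writeframes(b"\x00\x00" * 100)

print(looks_like_wav(demo_path))  # True
```

Calling looks_like_wav() on the file produced by save_to_file() shows whether your platform actually wrote WAV data, regardless of the name you chose.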

Handling Extremely Large Files

If your text file is extremely large (for example, an entire book or a multi-thousand-line transcript), you might run into memory limitations when reading the whole file at once. In such cases, use chunked processing and save each chunk as a separate audio file:

import pyttsx3

engine = pyttsx3.init()
engine.setProperty('rate', 150)

def read_in_chunks(file_path, chunk_size=3000):
    with open(file_path, 'r', encoding='utf-8') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

file_path = "large_text.txt"

for i, chunk in enumerate(read_in_chunks(file_path)):
    # Zero-padded numbers keep the files in order when sorted by name;
    # .wav is used because pyttsx3 does not encode MP3 itself
    filename = f"audio_chunk_{i + 1:04d}.wav"
    print(f"Saving {filename}...")
    engine.save_to_file(chunk, filename)
    engine.runAndWait()

print("All audio chunks saved.")

This approach allows you to split a long book into manageable audio clips, which can then be played separately.
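If you would rather end up with a single file, the clips can be joined afterwards. The sketch below uses the standard library wave module and assumes the chunks are genuine WAV files with the same channel count, sample width, and sample rate. The function name merge_wav_files and the demo file names are hypothetical, and the demo generates silent clips so that it runs without a speech engine.

```python
import os
import tempfile
import wave


def merge_wav_files(input_paths, output_path):
    """Concatenate WAV files that share channels, sample width, and sample rate."""
    if not input_paths:
        raise ValueError("no input files given")
    params = None
    with wave.open(output_path, "wb") as out:
        for path in input_paths:
            with wave.open(path, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if params is None:
                    # The first file decides the output format.
                    params = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))


def write_silence(path, n_frames):
    """Write n_frames of 16-bit mono silence (demo helper only)."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(22050)
        w.writeframes(b"\x00\x00" * n_frames)


demo_dir = tempfile.mkdtemp()
part_1 = os.path.join(demo_dir, "part_0001.wav")
part_2 = os.path.join(demo_dir, "part_0002.wav")
merged = os.path.join(demo_dir, "merged.wav")
write_silence(part_1, 100)
write_silence(part_2, 50)

merge_wav_files([part_1, part_2], merged)
with wave.open(merged, "rb") as result:
    print(result.getnframes())  # 150
```

Pass the chunk files in sorted order so the audio plays back in sequence. On a platform where the engine writes a different format such as AIFF, the wave module will not read the files and another tool is needed.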

Customizing Voice and Speed

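You can also customize the voice and speech speed. Which voices are available, and in what order, depends on your operating system and the voices installed on it.

import pyttsx3

engine = pyttsx3.init()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)  # Often 0 is male and 1 is female on Windows; needs at least two voices
engine.setProperty('rate', 130)  # Slow down the voice

To see the voices installed on your system, loop over voices and print each voice.id and voice.name. Printing the list directly with print(voices) may show only object addresses.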
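Since the position of a voice in the list is not guaranteed, selecting by name is usually more robust than selecting by index. The sketch below relies only on the id and name attributes of pyttsx3 voice objects. The function name pick_voice is hypothetical, and the stand-in voices with their ids are invented placeholders so that the example runs without a speech engine.

```python
from types import SimpleNamespace


def pick_voice(voices, keyword, default_index=0):
    """Return the id of the first voice whose name contains keyword (case-insensitive)."""
    if not voices:
        raise ValueError("no voices installed")
    for voice in voices:
        if keyword.lower() in (voice.name or "").lower():
            return voice.id
    # No match: fall back to a known-good position in the list.
    return voices[default_index].id


# Placeholder objects standing in for engine.getProperty('voices').
fake_voices = [
    SimpleNamespace(id="voice.david", name="Microsoft David Desktop"),
    SimpleNamespace(id="voice.zira", name="Microsoft Zira Desktop"),
]

print(pick_voice(fake_voices, "zira"))  # voice.zira
print(pick_voice(fake_voices, "anna"))  # voice.david
```

In a real script you would call engine.setProperty('voice', pick_voice(engine.getProperty('voices'), "zira")), replacing the keyword with part of a voice name that exists on your system.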

Conclusion

Using Python and the pyttsx3 library, you can build an efficient and flexible solution to convert large text files into speech. By reading the file in chunks, customizing the voice and speed, and optionally saving the output to audio files, you can create a useful tool for accessibility, education, or content delivery.

This approach can be further extended with graphical interfaces, integration into web apps, or scheduled automation. Whether you're building a personal assistant or a voice-based content reader, Python gives you a powerful and offline-capable starting point.

Happy scripting!
