Batch Excel File Processor Using Python

Introduction

Manually opening, reading, and analyzing dozens of Excel files one by one is time-consuming and prone to error. With Python, we can automate this task by building a batch processor that can:

Read all Excel files in a folder
Perform operations like filtering, aggregation, or transformation
Combine and save results to a new Excel file

This tool is ideal for:

Sales teams consolidating monthly reports
Researchers analyzing repeated survey files
Finance teams tracking expenses across departments

What You’ll Need

Install the required packages:

pip install pandas openpyxl

pandas handles data manipulation
openpyxl enables reading/writing .xlsx files

What We'll Build

The script will:

Loop through all .xlsx files in a folder
Read each file into a Pandas DataFrame
Apply filtering or data transformations (optional)
Combine results from all files
Save the final output to a new Excel file

Step-by-Step Script with Explanations

Step 1: Import Required Libraries

import os
import pandas as pd

Explanation:
We import os to navigate through folders and pandas to handle Excel files.

Step 2: Set Folder Path and Get Excel Files

folder_path = r"C:\Users\YourName\Documents\ExcelData"  # Update this
excel_files = [f for f in os.listdir(folder_path) if f.endswith('.xlsx')]

Explanation:
This gets a list of all .xlsx files in the specified folder. You can change the path as needed.

Step 3: Read and Process Each File

combined_data = []

for file in excel_files:
    file_path = os.path.join(folder_path, file)
    df = pd.read_excel(file_path)

    # Optional: filter rows where "Sales" > 1000
    if "Sales" in df.columns:
        df = df[df["Sales"] > 1000]

    df["Source File"] = file  # Add filename for tracking
    combined_data.append(df)

Explanation:

Each Excel file is read into a DataFrame
Filters can be added based on your needs
A column for "Source File" is added to track which data came from which file

Step 4: Combine All DataFrames

final_df = pd.concat(combined_data, ignore_index=True)

Explanation:
Combines all filtered data into one master DataFrame.

Step 5: Save the Final Output

output_path = os.path.join(folder_path, "combined_output.xlsx")
final_df.to_excel(output_path, index=False)
print("Batch processing complete! Saved to:", output_path)

Full Script: Batch Excel Processor

import os
import pandas as pd

# Step 1: Set folder path
folder_path = r"C:\Users\YourName\Documents\ExcelData"  # Change this

# Step 2: Get all .xlsx files in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith('.xlsx')]

# Step 3: Read and process each file
combined_data = []

for file in excel_files:
    file_path = os.path.join(folder_path, file)
    df = pd.read_excel(file_path)

    # Optional: Filter rows where "Sales" > 1000
    if "Sales" in df.columns:
        df = df[df["Sales"] > 1000]

    df["Source File"] = file  # Track origin file
    combined_data.append(df)

# Step 4: Combine all data
if combined_data:
    final_df = pd.concat(combined_data, ignore_index=True)

    # Step 5: Save the combined result
    output_path = os.path.join(folder_path, "combined_output.xlsx")
    final_df.to_excel(output_path, index=False)
    print("Batch processing complete! File saved to:", output_path)
else:
    print("No Excel files found or files are empty.")

Output Example

The final Excel file (combined_output.xlsx) might look like this:

Date	Product	Sales	Region	Source File
2024-01-03	WidgetA	1500	East	sales_jan.xlsx
2024-02-10	WidgetB	1200	West	sales_feb.xlsx

Customizations

You can easily extend this script to:

Process CSV files: Use pd.read_csv()

Apply multiple filters:

  df = df[(df["Sales"] > 1000) & (df["Region"] == "West")]

Create summary reports:

  summary = final_df.groupby("Region")["Sales"].sum()

Final Thoughts

Batch Excel processing is a powerful automation technique that saves hours of repetitive work. With a bit of Python and pandas, you can:

Scale data processing across multiple files
Add filters, summaries, and exports
Customize workflows for business, finance, or research

Happy Scripting ! !

Batch Excel File Processor Using Python

Introduction

What You’ll Need

What We'll Build

Step-by-Step Script with Explanations

Step 1: Import Required Libraries

Step 2: Set Folder Path and Get Excel Files

Step 3: Read and Process Each File

Step 4: Combine All DataFrames

Step 5: Save the Final Output

Full Script: Batch Excel Processor

Output Example

Customizations

Final Thoughts

Comments

More from this blog

Build a Network Sniffer in Python Using Scapy

Building a Port Scanner with Python

Build a Simple URL Shortener in Python

QR Code Generator with Python

Create a Secure Password Generator Using Python

Command Palette

Introduction

What You’ll Need

What We'll Build

Step-by-Step Script with Explanations

Step 1: Import Required Libraries

Step 2: Set Folder Path and Get Excel Files

Step 3: Read and Process Each File

Step 4: Combine All DataFrames

Step 5: Save the Final Output

Full Script: Batch Excel Processor

Output Example

Customizations

Final Thoughts

Comments

More from this blog