DEV Community

Cover image for Fast Data Entry: Automating Data Entry for Efficiency
Roopkumar Das
Roopkumar Das

Posted on

Fast Data Entry: Automating Data Entry for Efficiency

Introduction

Data entry is a crucial but often time-consuming task. Instead of manually entering data into Excel, we can use Python and Pandas to speed up the process significantly. While some basic Python knowledge is required, don't worry—most of it is just plain English reasoning.

In this blog, we'll explore different methods of data entry based on the format of the source data, such as physical documents, images, PDFs, and more.

Methods of Data Entry

There are two main approaches to data entry:

1. Automated Data Entry: Using OCR & Pandas in Python

If the data is in image or scanned document format, we can extract text using OCR (Optical Character Recognition) and process it with Pandas.

Example: Extracting Data from an Image

We'll use the EasyOCR library to extract text from an image file.

small_business_loan.png

import easyocr

# Creating a reader that processes English text
reader = easyocr.Reader(['en'])

# Reading text from the image
result = reader.readtext('small_business_loan.png', detail=0)

for text in result:
    print(text)
Enter fullscreen mode Exit fullscreen mode

Output (Example OCR Result):

Table B: Value of Small Business Loans Outstanding by Depository Lender Size, 2016 to 2020
Billions of Dollars, Nominal
Percentage Change,
Lenders by Total Asset Size
...
Enter fullscreen mode Exit fullscreen mode

As you can see, OCR might introduce errors in the extracted text. Rather than relying entirely on AI to correct it, a manual review and correction step is often necessary. Once corrected, we can save the text in a structured format like a CSV file.

Converting OCR Output to a CSV File

Once we have a cleaned text file, we can structure it into a CSV format for easy import into Excel.

import pandas as pd
from io import StringIO

if __name__ == "__main__":
    with open("output.txt", "r") as f:
        data = f.read()

    data = data.strip()  # Remove trailing newlines
    data_lines = data.split("\n")

    new_text = []
    col_count = 0

    for text in data_lines:
        new_text.append(text)
        if col_count != 6:
            col_count += 1
            new_text.append(",")
        else:
            col_count = 0
            new_text.append("\n")

    csv_content = "".join(new_text)
    data_csv_file = StringIO(csv_content)
    df = pd.read_csv(data_csv_file)
    print(df)
Enter fullscreen mode Exit fullscreen mode

Saving Data to Excel Format

To store this structured data in an Excel file, simply use:

df.to_excel("business_loans.xlsx", sheet_name="Business Loans", index=False)
Enter fullscreen mode Exit fullscreen mode

2. Manual Data Entry: Why CSV is Faster Than Excel

Manual data entry in Excel requires frequent switching between cells using the mouse or arrow keys, which can slow you down. If you're entering data row-wise (e.g., filling out a loan application form), Excel forces you to press Tab or Arrow keys to navigate between columns.

A faster alternative is to enter data into a CSV file using only the keyboard. Commas (,) separate values, making it easier to stay focused without switching keys frequently.

Example: Small Business Loan Data in CSV Format

Lenders by Total Asset Size,2016,2017,2018,2019,2020,Percentage Change (2019-2020)
Less than $100 million,11.5,10.4,10,8.5,7.4,-13.4
$100 million to $499.9 million,96,90.5,86.4,82.2,87.4,6.3
$500 million to $999.9 million,52.8,55.6,54,51.9,70.7,36.1
$1 billion to $9.9 billion,139.5,133,127,129.9,215.4,65.8
$10 million to $49.9 billion,73.5,85.6,106.9,114.5,162,41.5
$50 billion or more,240.5,243.7,248.2,257.4,352.4,36.9
Enter fullscreen mode Exit fullscreen mode

This structured format allows for quick keyboard entry without mouse interaction.

Converting CSV to Excel Using Python

Once the data is entered in CSV format, we can easily convert it into an Excel file using Pandas

import pandas as pd

df = pd.read_csv("small_business_loans.csv")
df.to_excel("small_business_loans.xlsx", index=False)
Enter fullscreen mode Exit fullscreen mode

Conclusion: Choosing the Right Data Entry Method

To speed up data entry, choose the most efficient method based on your data format:

  • If the data is in an image/PDF → Use OCR + Python for automation.

  • If the data is already in text format → Convert it directly into CSV.

  • If manual entry is necessary → Use CSV format instead of Excel to speed up typing.

By using these techniques, you can automate and streamline data entry, saving time and effort while ensuring accuracy. 🚀

Top comments (0)