DEV Community

Cover image for 📝✨ClearText
Ajinkya Bobade
Ajinkya Bobade

Posted on

📝✨ClearText

This is a submission for the GitHub Copilot Challenge : Transitions and Transformations

What I Built

I have built "ClearText" which is an AI-powered text detection and enhancement tool that makes text in images cleaner.

Title bar

It's Perfect For 🎯

  • 📄 Document Digitization
  • 📚 Book Scanning
  • 📱 Mobile Photos of Text
  • 🖨️ Improving Scanned Documents
  • 📑 Text Enhancement in Images

Demo

ClearText Demo

Repo

Github Repository - ClearText

Here's an example of what ClearText can do:

ClearText Demo

Image description

ClearText takes input image (left hand side), removes all noise and outputs pure text (right hand side).

ClearText has a huge potential where it can be used in the following fields:

Document Processing 📄

  • Banking & Finance
    • 🏦 Check processing
    • 📊 Financial statement digitization

Healthcare 🏥

  • Medical Records
    • 📋 Patient records digitization
    • 🔬 Lab report enhancement

Legal Industry ⚖️

  • Document Management
    • 📜 Contract digitization
    • 🗄️ Case file processing

Academic Use Cases 📚

  • 📖 Textbook scanning
  • 📑 Research paper digitization

Copilot Experience 🤖

I used co-pilot extensively to complete this amazing project. Here are the ways in which co-pilot helped me :

Code Completion 📝

  • Auto-completed common OpenCV operations
  • Suggested image processing parameters
  • Completed function signatures for Streamlit components

Chat Assistance 💬

  • Debugged ONNX model loading issues
  • Explained image processing pipeline
  • Suggested optimizations for image transformations

Inline Suggestions ⚡

  • Recommended error handling patterns
  • Suggested variable names and types

Model Switching 🔄

Used different models for specific tasks:

  • Code Completion: GitHub Copilot
  • Documentation: Claude
  • Debugging: GPT-4

Common Prompts Used 🎯

# Function implementation
/explain image processing pipeline
/suggest error handling
/optimize performance
Enter fullscreen mode Exit fullscreen mode

Code Edits ✏️

  • Refactored image processing functions
  • Added blur/no-blur options
  • Improved error messages
  • Enhanced documentation

Project Evolution & Contributions

Building on Open Source

This project builds upon the excellent CRAFT text detection model by CLOVA AI Research, while making significant architectural and functional improvements:

1. Production-Ready Architecture 🏗️

  • I converted the research-focused PyTorch model to production-ready ONNX format
  • Leveraged ONNX Runtime for optimized inference across different hardware
  • Added complete Docker containerization for reliable deployment

2. Enhanced Text Processing Pipeline 🔄

The original CRAFT model provides basic text detection. ClearText significantly expands on this by:

  • Adding custom image preprocessing for better text clarity
  • Implementing new post-processing transforms for enhanced output quality
  • Creating an entirely new text enhancement pipeline
  • Developing a user-friendly web interface for easy interaction

3. Major Output Improvements 📈

ClearText transforms the basic text detection output into a comprehensive text enhancement solution:

  • Original CRAFT: Basic text region detection
  • ClearText Additions:
    • Text clarity enhancement
    • Document digitization capabilities
    • Support for various document types (books, mobile photos, scanned documents)
    • Complete image processing pipeline

Transparency Statement

While this project builds upon CRAFT's foundational text detection capabilities, ClearText represents a significant evolution with entirely new functionality, architecture, and use cases. All original CRAFT code is properly credited and licensed under MIT License.

Conclusion

Developing ClearText during the GitHub Copilot 1-Day Build Challenge has been an amazing journey. Without co-pilot, transforming complex text detection model into an accessible, user-friendly web application would have been tremendously difficult. The project showcases how AI can bridge the gap between computer vision and practical, everyday use cases.

Top comments (0)