DEV Community

Shahzad Ashraf

Accurate PDF to Text Conversion in C# Apps

Extracting text from PDF files is a common requirement for developers working on document automation, data analysis, or content management systems. The GroupDocs.Conversion Cloud .NET SDK lets you convert PDF files to text in C# with a single conversion call, and it handles both scanned PDFs and standard digital documents. The Cloud API is designed for accurate, high-quality text extraction.

Using our secure REST API, you can integrate PDF-to-text conversion into your C#, VB.NET, and ASP.NET applications without extra dependencies or manual intervention. The API lets you obtain plain text from PDFs programmatically, which suits text analysis, search indexing, and data processing tasks.

Ready to improve your C# document conversion workflow? Visit this step-by-step guide to get started. Adding PDF text extraction to your C# applications reduces manual effort and streamlines document workflows in cross-platform .NET solutions with only a few API calls.

The following C# code sample shows how to add PDF-to-text conversion to your applications.

using System;
using System.IO;
using GroupDocs.Conversion.Cloud.Sdk.Api;
using GroupDocs.Conversion.Cloud.Sdk.Client;
using GroupDocs.Conversion.Cloud.Sdk.Model;
using GroupDocs.Conversion.Cloud.Sdk.Model.Requests;

namespace ConvertPDFtoText
{
    class Program
    {
        public static void Main(string[] args)
        {
            // Define your API credentials
            string appSid = "your-app-sid";
            string appKey = "your-app-secret-key";

            // Initialize configuration object using client credentials
            var configuration = new Configuration(appSid, appKey);

            // Instantiate the ConvertApi class for document conversion
            var apiInstance = new ConvertApi(configuration);

            try
            {
                // Path to the local PDF file
                string sourceFilePath = @"D:\Example Files\pdf-source.pdf";

                // Open the local PDF file for reading
                using (var fileStream = File.OpenRead(sourceFilePath))
                {
                    // Specify conversion settings directly in the ConvertDocumentDirectRequest
                    var convertRequest = new ConvertDocumentDirectRequest(
                        "txt",       // Target format: convert to Text
                        fileStream,  // Input file stream
                        null,        // No output storage (save locally)
                        null,        // No output path (save locally)
                        null         // No special load options required for PDF
                    );

                    // Perform the conversion; the response stream is disposed once it has been copied
                    using (var responseStream = apiInstance.ConvertDocumentDirect(convertRequest))
                    // Save the converted text to a local file
                    using (var outputFile = File.Create(@"D:\Example Files\converted-pdf-to-text.txt"))
                    {
                        responseStream.CopyTo(outputFile);
                    }

                    // Print success message after saving the text file
                    Console.WriteLine("PDF to Text conversion completed and saved locally!");
                }
            }
            catch (Exception ex)
            {
                // Handle any exceptions that may occur
                Console.WriteLine("An error occurred: " + ex.Message);
            }
        }
    }
}
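
Once the text file has been saved, it can feed the text analysis or search indexing tasks mentioned earlier. The sketch below is one possible follow-up step, not part of the GroupDocs SDK: `TextIndexSketch`, `CountWords`, and `TopWords` are hypothetical names introduced for illustration, and the code uses only the .NET base class library. It assumes the converted file is plain text that a simple split on whitespace and punctuation can tokenize.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only; it is not part of the GroupDocs SDK.
public static class TextIndexSketch
{
    // Characters treated as word boundaries (an assumption; adjust for your documents).
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    // Counts how often each lowercase word occurs in the given text.
    public static Dictionary<string, int> CountWords(string text)
    {
        var counts = new Dictionary<string, int>(StringComparer.Ordinal);

        foreach (var token in text.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            var word = token.ToLowerInvariant();

            // TryGetValue leaves 'current' at 0 when the word has not been seen yet.
            int current;
            counts.TryGetValue(word, out current);
            counts[word] = current + 1;
        }

        return counts;
    }

    // Reads a converted text file and returns its most frequent words,
    // ordered by count (descending) and then alphabetically.
    public static List<KeyValuePair<string, int>> TopWords(string textFilePath, int limit)
    {
        var counts = CountWords(File.ReadAllText(textFilePath));

        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(limit)
            .ToList();
    }
}
```

For example, calling `TextIndexSketch.TopWords(@"D:\Example Files\converted-pdf-to-text.txt", 10)` after the conversion completes would return the ten most frequent words in the extracted text. A production search index would normally use a dedicated tokenizer and stop-word list; this split-based approach is only a starting point.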