Docs Swagger Examples Blog Support Release Notes Dashboard

Installation

The package is available at nuget.org and can be installed from the Package Manager Console by executing the following command:

PM> NuGet\Install-Package GroupDocs.Parser-Cloud

Document Parsing & Data Extraction .NET SDK

GroupDocs.Parser Cloud is a robust REST API designed to streamline document parsing and data extraction in your cloud-based .NET applications. Whether you need to extract text, images, metadata, or structured data using custom templates, this API offers high accuracy, fast processing, and scalability for enterprise-level operations. It supports a wide range of document formats, integrates seamlessly with multiple programming languages, and ensures secure API access with JWT authentication. With features like batch processing, Docker support, and comprehensive SDKs, GroupDocs.Parser Cloud is ideal for any document processing or data extraction task.

General Features

Text Extraction

Extract text from a wide range of document formats.

Document Info Extraction

Extract metadata and other document information such as title, author, and subject.

Image Extraction

Extract images embedded within documents.

Container Items Info Extraction

Extract information from container file formats like ZIP, PST, and OST.

Parse by Template

Parse documents by using custom templates for structured data extraction.

Document Processing Features

Metadata Extraction

Extracts metadata such as author, creation date, etc., from supported file formats.

Template-Based Parsing

Define templates for structured data extraction, ideal for processing forms, invoices, and other structured documents.

Batch Processing

Process multiple documents in a single request, making it efficient for large-scale operations.

Integration Features

RESTful API

Access the parser features via a REST API for easy integration into any platform.

SDK Availability

SDKs available for multiple programming languages including .NET, Java, Python, PHP, and more.

Platform Agnostic

Can be used across various platforms such as Windows, macOS, and Linux.

Security and Authentication

JWT Authentication

Ensures secure API access through JSON Web Token (JWT) authentication.

Client ID and Secret

Use Client ID and Secret for making secure API calls.

Data Encryption

Supports secure and encrypted communication between the client and the API.
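
For raw REST calls, the JWT flow described above can be sketched as follows. This is a minimal sketch, not the SDK's own implementation: it assumes the standard OAuth 2.0 client-credentials grant, and the token endpoint URL, the `TokenSketch` class, and its method names are assumptions that do not appear on this page. Confirm the endpoint against the official API reference before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class TokenSketch
{
    // Assumed token endpoint (not stated on this page); verify before use.
    public const string TokenUrl = "https://api.groupdocs.cloud/connect/token";

    // Builds a form-encoded OAuth 2.0 client-credentials request.
    public static HttpRequestMessage BuildTokenRequest(string clientId, string clientSecret)
    {
        var form = new Dictionary<string, string>
        {
            ["grant_type"] = "client_credentials",
            ["client_id"] = clientId,
            ["client_secret"] = clientSecret
        };

        return new HttpRequestMessage(HttpMethod.Post, TokenUrl)
        {
            Content = new FormUrlEncodedContent(form)
        };
    }

    // Sends the request and reads the "access_token" field from the JSON response.
    public static async Task<string> GetAccessTokenAsync(
        HttpClient http, string clientId, string clientSecret)
    {
        using var request = BuildTokenRequest(clientId, clientSecret);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        var json = await response.Content.ReadAsStringAsync();
        using var document = JsonDocument.Parse(json);
        return document.RootElement.GetProperty("access_token").GetString()
               ?? throw new InvalidOperationException("Response contained no access token.");
    }
}
```

The returned token would then be sent as an `Authorization: Bearer <token>` header on later requests. The C# samples on this page pass the credentials to a configuration object instead, so this step is only relevant when calling the REST API without the SDK.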

Performance Features

High Accuracy

Provides accurate text extraction using advanced algorithms.

Fast Processing

Optimized for quick data extraction, suitable for high-performance applications.

Scalability

Can handle large volumes of documents efficiently, supporting enterprise-level operations.

Usability Features

Comprehensive Documentation

Extensive documentation and code samples available to help developers get started quickly.

API Explorer

Built-in API explorer for testing and exploring the API functionalities directly in the browser.

Multi-Platform Support

Compatible with various operating systems including Windows, Linux, and macOS.

Deployment and Hosting

Docker Support

Can be deployed in a Docker container for private cloud or on-premises hosting.

Self-Hosting

Allows running the API on your infrastructure with full control over the environment.

Automatic Scaling

Automatically scales to meet varying workloads, ensuring high availability.

Supported Document Formats

The following table indicates the file formats from which GroupDocs.Parser Cloud can extract data.

Document Type | File Format | Parse Document by Template | Extract Text | Extract Document Info | Extract Images | Extract Container Items Info

Word Processing
DOC - Microsoft Word Document ✔️✔️✔️✔️
DOT - Microsoft Word Document Template ✔️✔️✔️✔️
DOCX - Office Open XML Document ✔️✔️✔️✔️
DOCM - Office Open XML Macro-Enabled Document ✔️✔️✔️✔️
DOTX - Office Open XML Document Template ✔️✔️✔️✔️
DOTM - Office Open XML Document Macro-Enabled Template ✔️✔️✔️✔️
TXT - Plain Text ✔️✔️
ODT - Open Document Text ✔️✔️✔️✔️
OTT - Open Document Text Template ✔️✔️✔️✔️
RTF - Rich Text Format ✔️✔️✔️✔️

PDF
PDF - Portable Document Format File ✔️✔️✔️✔️

Markup
HTML - Hypertext Markup Language File ✔️✔️
XHTML - Extensible Hypertext Markup Language File ✔️✔️
MHTML - MIME HTML File ✔️✔️
MD - Markdown ✔️✔️
XML - XML File ✔️✔️

Ebooks
CHM - Compiled HTML Help File ✔️✔️
EPUB - Digital E-Book File Format ✔️✔️
FB2 - FictionBook 2.0 File ✔️✔️

Spreadsheet
XLS - Microsoft Excel Spreadsheet ✔️✔️✔️✔️
XLT - Microsoft Excel Template ✔️✔️✔️✔️
XLSX - Office Open XML Spreadsheet ✔️✔️✔️✔️
XLSM - Office Open XML Macro-Enabled Spreadsheet ✔️✔️✔️✔️
XLSB - Office Open XML Binary Spreadsheet ✔️✔️✔️✔️
XLTX - Office Open XML Spreadsheet Template ✔️✔️✔️✔️
XLTM - Office Open XML Macro-Enabled Spreadsheet Template ✔️✔️✔️✔️
ODS - Open Document Spreadsheet ✔️✔️✔️✔️
OTS - Open Document Spreadsheet Template ✔️✔️✔️✔️
CSV - Comma Separated Values ✔️✔️
XLA - Excel Add-In File ✔️✔️✔️✔️
XLAM - Excel Open XML Macro-Enabled Add-In ✔️✔️✔️✔️
NUMBERS - Apple iWork Numbers ✔️✔️✔️✔️

Presentations
PPT - PowerPoint Presentation ✔️✔️✔️✔️
PPS - PowerPoint Slideshow ✔️✔️✔️✔️
POT - PowerPoint Template ✔️✔️✔️✔️
PPTX - Office Open XML Presentation ✔️✔️✔️✔️
PPTM - Office Open XML Macro-Enabled Presentation ✔️✔️✔️✔️
POTX - Office Open XML Presentation Template ✔️✔️✔️✔️
POTM - Office Open XML Macro-Enabled Presentation Template ✔️✔️✔️✔️
PPSX - Office Open XML Presentation Slideshow ✔️✔️✔️✔️
PPSM - Office Open XML Macro-Enabled Presentation Slideshow ✔️✔️✔️✔️
ODP - Open Document Presentation ✔️✔️✔️✔️
OTP - Open Document Presentation Template ✔️✔️✔️✔️

Emails
PST - Outlook Personal Information Store File ✔️✔️
OST - Outlook Offline Data File ✔️✔️
EML - E-Mail Message ✔️✔️✔️
EMLX - Apple Mail Message ✔️✔️✔️
MSG - Outlook Mail Message ✔️✔️✔️

Notes
ONE - OneNote Document ✔️✔️

Archives
ZIP - Zipped File ✔️✔️

Get Started

Getting started with GroupDocs.Parser Cloud SDK for .NET requires nothing beyond the NuGet package. Create an account at GroupDocs for Cloud and get your application information (the App SID and App Key used in the samples).

Execute Install-Package GroupDocs.Parser-Cloud from the Package Manager Console in Visual Studio to fetch and reference the GroupDocs.Parser Cloud assembly in your project. If you already have GroupDocs.Parser Cloud SDK for .NET and want to upgrade it, execute Update-Package GroupDocs.Parser-Cloud to get the latest version.

Please check the GitHub Repository for common usage scenarios.
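
Once you have the application credentials, one option is to keep them out of source code and read them from the environment at startup. The helper below is a minimal sketch of that idea. The `CredentialsSketch` class and the two environment variable names are hypothetical, chosen only for illustration, and are not part of the SDK.

```csharp
using System;

public static class CredentialsSketch
{
    // Hypothetical variable names; use whatever fits your deployment.
    public const string SidVariable = "GROUPDOCS_APP_SID";
    public const string KeyVariable = "GROUPDOCS_APP_KEY";

    // Returns both credentials, or throws if either one is missing.
    public static (string AppSid, string AppKey) Load()
    {
        return (Require(SidVariable), Require(KeyVariable));
    }

    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{name}' is not set.");
        }
        return value;
    }
}
```

The two returned values can then replace the "YOUR_APP_SID" and "YOUR_APP_KEY" placeholders when the configuration object is created.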

GroupDocs.Parser Cloud API Code Samples

These code samples demonstrate various parsing capabilities of GroupDocs.Parser Cloud, including extracting text, extracting images, and parsing documents by template.

Extracting Text from a Document

Learn how to extract text from a document using the GroupDocs.Parser Cloud API. This example demonstrates the text extraction process in C#.

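using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Client;          // Configuration
using GroupDocs.Parser.Cloud.Sdk.Model;           // FileInfo
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;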

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Text_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.docx" };

                // Create a text extraction request
                var request = new ExtractTextRequest(fileInfo);

                // Extract text from the document
                var response = apiInstance.ExtractText(request);

                // Output the extracted text to the console
                Console.WriteLine("Extracted Text: " + response.Text);
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractText: " + e.Message);
            }
        }
    }
}

Extracting Images from a Document

Learn how to extract images embedded within a document using the GroupDocs.Parser Cloud API. This example illustrates the process in C#.

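using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Client;          // Configuration
using GroupDocs.Parser.Cloud.Sdk.Model;           // FileInfo
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;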

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Images_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.pdf" };

                // Create an image extraction request
                var request = new ExtractImagesRequest(fileInfo);

                // Extract images from the document
                var response = apiInstance.ExtractImages(request);

                // Loop through and output each extracted image's info
                foreach (var image in response.Images)
                {
                    Console.WriteLine("Image Format: " + image.Format + ", Image Path: " + image.Path);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractImages: " + e.Message);
            }
        }
    }
}

Parsing Document by Template

Learn how to parse a document by using a custom template for structured data extraction with the GroupDocs.Parser Cloud API. This example shows the template-based parsing in C#.

using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Client;          // Configuration
using GroupDocs.Parser.Cloud.Sdk.Model;           // FileInfo
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Parse_Document_By_Template
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document and template file
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "invoice.pdf" };
                var templatePath = "path/to/template.json";

                // Create a template-based parsing request
                var request = new ParseRequest(fileInfo, templatePath);

                // Parse the document using the template
                var response = apiInstance.Parse(request);

                // Output the parsed data to the console
                foreach (var field in response.Fields)
                {
                    Console.WriteLine("Field Name: " + field.Name + ", Field Value: " + field.Value);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.Parse: " + e.Message);
            }
        }
    }
}
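
After template-based parsing, it is often convenient to look up a field by name rather than loop over the whole result. The sketch below collects name/value pairs into a case-insensitive dictionary. `ParsedField` is a hypothetical stand-in for whatever field type the SDK actually returns, and `FieldLookupSketch` is likewise illustrative. The code assumes .NET 5 or later because it uses a record type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK's parsed field type.
public sealed record ParsedField(string Name, string Value);

public static class FieldLookupSketch
{
    // Builds a name-to-value lookup that ignores case in field names.
    public static Dictionary<string, string> ToLookup(IEnumerable<ParsedField> fields)
    {
        var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var field in fields)
        {
            // TryAdd keeps the first value when the same name appears twice.
            lookup.TryAdd(field.Name, field.Value);
        }
        return lookup;
    }
}
```

Keeping the first occurrence of a duplicate name is a design choice made for this sketch. If later values should win, assign through the indexer (`lookup[field.Name] = field.Value`) instead.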
