You can download any of the versions below for testing. The product functions normally, subject to an evaluation limitation. At the time of purchase, we provide a license file via email that allows the product to work at full capacity. If you would also like an evaluation license to test without any restrictions for 30 days, please follow the directions provided here.
If you experience errors when you try to download a file, make sure your network policies (enforced by your company or ISP) allow downloading ZIP and/or MSI files.
Installation
The package is available at nuget.org and can be installed via the Package Manager Console by executing the following command:
PM> NuGet\Install-Package GroupDocs.Parser-Cloud
GroupDocs.Parser Cloud is a robust REST API designed to streamline document parsing and data extraction in your cloud-based .NET applications. Whether you need to extract text, images, metadata, or structured data using custom templates, this API offers high accuracy, fast processing, and scalability for enterprise-level operations. It supports a wide range of document formats, integrates seamlessly with multiple programming languages, and ensures secure API access with JWT authentication. With features like batch processing, Docker support, and comprehensive SDKs, GroupDocs.Parser Cloud is ideal for any document processing or data extraction task.
General Features
Extract text from a wide range of document formats.
Extract metadata and other document information such as title, author, and subject.
Extract images embedded within documents.
Extract information from container file formats like ZIP, PST, and OST.
Parse by Template
Parse documents by using custom templates for structured data extraction.
Document Processing Features
Extracts metadata such as author, creation date, etc., from supported file formats.
Template-Based Parsing
Define templates for structured data extraction, ideal for processing forms, invoices, and other structured documents.
Batch Processing
Process multiple documents in a single request, making it efficient for large-scale operations.
Integration Features
RESTful API
Access the parser features via a REST API for easy integration into any platform.
SDK Availability
SDKs available for multiple programming languages including .NET, Java, Python, PHP, and more.
Can be used across various platforms such as Windows, macOS, and Linux.
Security and Authentication
JWT Authentication
Ensures secure API access through JSON Web Token (JWT) authentication.
Client ID and Secret
Use Client ID and Secret for making secure API calls.
Data Encryption
Supports secure and encrypted communication between the client and the API.
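The authentication features above follow the general shape of an OAuth 2.0 client-credentials exchange: the Client ID and Secret are traded for a JWT, which is then sent as a bearer token on later calls. The sketch below illustrates that pattern using only standard .NET types. The class `JwtAuthSketch` and its method names are hypothetical and are not part of the GroupDocs SDK, and the token endpoint URL is deliberately left as a caller-supplied parameter; take the real value from the official documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

// Illustrative helper only; not part of the GroupDocs SDK.
public static class JwtAuthSketch
{
    // Builds the form fields of an OAuth 2.0 client-credentials token request.
    public static Dictionary<string, string> BuildTokenForm(string clientId, string clientSecret)
    {
        if (string.IsNullOrWhiteSpace(clientId))
            throw new ArgumentException("Client ID is required.", nameof(clientId));
        if (string.IsNullOrWhiteSpace(clientSecret))
            throw new ArgumentException("Client Secret is required.", nameof(clientSecret));

        return new Dictionary<string, string>
        {
            ["grant_type"] = "client_credentials",
            ["client_id"] = clientId,
            ["client_secret"] = clientSecret
        };
    }

    // Reads the "access_token" property from a JSON token response body.
    public static string ReadAccessToken(string json)
    {
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("access_token").GetString()
            ?? throw new InvalidOperationException("Token response has no access_token value.");
    }

    // Posts the form to tokenUrl (assumption: supplied by the caller) and returns the token.
    public static async Task<string> RequestTokenAsync(
        HttpClient http, string tokenUrl, string clientId, string clientSecret)
    {
        using var content = new FormUrlEncodedContent(BuildTokenForm(clientId, clientSecret));
        using var response = await http.PostAsync(tokenUrl, content);
        response.EnsureSuccessStatusCode();
        return ReadAccessToken(await response.Content.ReadAsStringAsync());
    }

    // Attaches the token as a bearer credential for subsequent requests on this client.
    public static void UseBearerToken(HttpClient http, string token)
    {
        http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
    }
}
```

When you use the SDK itself, the `Configuration` object shown in the code samples is expected to handle this exchange for you; the sketch is only meant to show what happens on the wire when calling the REST API directly.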
High Accuracy
Provides accurate text extraction using advanced algorithms.
Fast Processing
Optimized for quick data extraction, suitable for high-performance applications.
Scalability
Can handle large volumes of documents efficiently, supporting enterprise-level operations.
Usability Features
Comprehensive Documentation
Extensive documentation and code samples available to help developers get started quickly.
API Explorer
Built-in API explorer for testing and exploring the API functionalities directly in the browser.
Compatible with various operating systems including Windows, Linux, and macOS.
Deployment and Hosting
Docker Support
Can be deployed in a Docker container for private cloud or on-premises hosting.
Self-Hosting
Allows running the API on your infrastructure with full control over the environment.
Automatic Scaling
Automatically scales to meet varying workloads, ensuring high availability.
The following table indicates the file formats from which GroupDocs.Parser Cloud can extract data.
Document Type | File Format | Parse Document by Template | Extract Text | Extract Document Info | Extract Images | Extract Container Items Info |
---|---|---|---|---|---|---|
Word Processing | DOC - Microsoft Word Document | ✔️ | ✔️ | ✔️ | ✔️ | |
| DOT - Microsoft Word Document Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| DOCX - Office Open XML Document | ✔️ | ✔️ | ✔️ | ✔️ | |
| DOCM - Office Open XML Macro-Enabled Document | ✔️ | ✔️ | ✔️ | ✔️ | |
| DOTX - Office Open XML Document Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| DOTM - Office Open XML Document Macro-Enabled Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| TXT - Plain Text | | ✔️ | ✔️ | | |
| ODT - Open Document Text | ✔️ | ✔️ | ✔️ | ✔️ | |
| OTT - Open Document Text Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| RTF - Rich Text Format | ✔️ | ✔️ | ✔️ | ✔️ | |
PDF | PDF - Portable Document Format File | ✔️ | ✔️ | ✔️ | ✔️ | |
Markup | HTML - Hypertext Markup Language File | | ✔️ | ✔️ | | |
| XHTML - Extensible Hypertext Markup Language File | | ✔️ | ✔️ | | |
| MHTML - MIME HTML File | | ✔️ | ✔️ | | |
| MD - Markdown | | ✔️ | ✔️ | | |
| XML - XML File | | ✔️ | ✔️ | | |
Ebooks | CHM - Compiled HTML Help File | | ✔️ | ✔️ | | |
| EPUB - Digital E-Book File Format | | ✔️ | ✔️ | | |
| FB2 - FictionBook 2.0 File | | ✔️ | ✔️ | | |
Spreadsheet | XLS - Microsoft Excel Spreadsheet | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLT - Microsoft Excel Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLSX - Office Open XML Spreadsheet | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLSM - Office Open XML Macro-Enabled Spreadsheet | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLSB - Office Open XML Binary Spreadsheet | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLTX - Office Open XML Spreadsheet Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLTM - Office Open XML Macro-Enabled Spreadsheet Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| ODS - Open Document Spreadsheet | ✔️ | ✔️ | ✔️ | ✔️ | |
| OTS - Open Document Spreadsheet Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| CSV - Comma Separated Values | | ✔️ | ✔️ | | |
| XLA - Excel Add-In File | ✔️ | ✔️ | ✔️ | ✔️ | |
| XLAM - Excel Open XML Macro-Enabled Add-In | ✔️ | ✔️ | ✔️ | ✔️ | |
| NUMBERS - Apple iWork Numbers | ✔️ | ✔️ | ✔️ | ✔️ | |
Presentations | PPT - PowerPoint Presentation | ✔️ | ✔️ | ✔️ | ✔️ | |
| PPS - PowerPoint Slideshow | ✔️ | ✔️ | ✔️ | ✔️ | |
| POT - PowerPoint Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| PPTX - Office Open XML Presentation | ✔️ | ✔️ | ✔️ | ✔️ | |
| PPTM - Office Open XML Macro-Enabled Presentation | ✔️ | ✔️ | ✔️ | ✔️ | |
| POTX - Office Open XML Presentation Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| POTM - Office Open XML Macro-Enabled Presentation Template | ✔️ | ✔️ | ✔️ | ✔️ | |
| PPSX - Office Open XML Presentation Slideshow | ✔️ | ✔️ | ✔️ | ✔️ | |
| PPSM - Office Open XML Macro-Enabled Presentation Slideshow | ✔️ | ✔️ | ✔️ | ✔️ | |
| ODP - Open Document Presentation | ✔️ | ✔️ | ✔️ | ✔️ | |
| OTP - Open Document Presentation Template | ✔️ | ✔️ | ✔️ | ✔️ | |
Emails | PST - Outlook Personal Information Store File | | | ✔️ | | ✔️ |
| OST - Outlook Offline Data File | | | ✔️ | | ✔️ |
| EML - E-Mail Message | | ✔️ | ✔️ | | ✔️ |
| EMLX - Apple Mail Message | | ✔️ | ✔️ | | ✔️ |
| MSG - Outlook Mail Message | | ✔️ | ✔️ | | ✔️ |
Notes | ONE - OneNote Document | | ✔️ | ✔️ | | |
Archives | ZIP - Zipped File | | | ✔️ | | ✔️ |
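
Because not every operation is available for every format, it can help to check the table before sending a request. The sketch below encodes a handful of rows from the table above as a lookup keyed by file extension. The `ParserOperation` enum and the `FormatSupportSketch` class are hypothetical names introduced for illustration, not SDK types, and only a subset of formats is included.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical flags mirroring the columns of the support table.
[Flags]
public enum ParserOperation
{
    None = 0,
    ParseByTemplate = 1,
    ExtractText = 2,
    ExtractDocumentInfo = 4,
    ExtractImages = 8,
    ExtractContainerItems = 16
}

public static class FormatSupportSketch
{
    private const ParserOperation Full =
        ParserOperation.ParseByTemplate | ParserOperation.ExtractText |
        ParserOperation.ExtractDocumentInfo | ParserOperation.ExtractImages;

    private const ParserOperation TextAndInfo =
        ParserOperation.ExtractText | ParserOperation.ExtractDocumentInfo;

    private const ParserOperation InfoAndContainer =
        ParserOperation.ExtractDocumentInfo | ParserOperation.ExtractContainerItems;

    // A subset of the table rows; extend with further formats as needed.
    private static readonly Dictionary<string, ParserOperation> Matrix =
        new Dictionary<string, ParserOperation>(StringComparer.OrdinalIgnoreCase)
        {
            ["docx"] = Full,
            ["pdf"] = Full,
            ["xlsx"] = Full,
            ["pptx"] = Full,
            ["txt"] = TextAndInfo,
            ["html"] = TextAndInfo,
            ["csv"] = TextAndInfo,
            ["eml"] = TextAndInfo | ParserOperation.ExtractContainerItems,
            ["msg"] = TextAndInfo | ParserOperation.ExtractContainerItems,
            ["pst"] = InfoAndContainer,
            ["zip"] = InfoAndContainer
        };

    // Returns true when the file's extension is listed and supports the operation.
    public static bool Supports(string fileName, ParserOperation operation)
    {
        if (string.IsNullOrEmpty(fileName)) return false;
        string extension = Path.GetExtension(fileName).TrimStart('.');
        return Matrix.TryGetValue(extension, out ParserOperation supported)
            && (supported & operation) == operation;
    }
}
```

For example, `FormatSupportSketch.Supports("invoice.pdf", ParserOperation.ExtractImages)` is true, while `FormatSupportSketch.Supports("mail.pst", ParserOperation.ExtractText)` is false, matching the PDF and PST rows of the table.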
Get Started
You do not need to install anything to get started with GroupDocs.Parser Cloud SDK for .NET. Just create an account at GroupDocs for Cloud and get your application information.
Simply execute Install-Package GroupDocs.Parser-Cloud from the Package Manager Console in Visual Studio to fetch and reference the GroupDocs.Parser Cloud assembly in your project. If you already have GroupDocs.Parser Cloud SDK for .NET and want to upgrade it, execute Update-Package GroupDocs.Parser-Cloud to get the latest version.
Please check the GitHub Repository for common usage scenarios.
GroupDocs.Parser Cloud API Code Samples
These code samples demonstrate various parsing capabilities of GroupDocs.Parser Cloud, including extracting text, extracting images, and parsing documents by template.
Learn how to extract text from a document using the GroupDocs.Parser Cloud API. This example demonstrates the text extraction process in C#.
using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Text_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse (FileInfo is the SDK model type, not System.IO.FileInfo)
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.docx" };

                // Create a text extraction request
                var request = new ExtractTextRequest(fileInfo);

                // Extract text from the document
                var response = apiInstance.ExtractText(request);

                // Output the extracted text to the console
                Console.WriteLine("Extracted Text: " + response.Text);
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractText: " + e.Message);
            }
        }
    }
}
Learn how to extract images embedded within a document using the GroupDocs.Parser Cloud API. This example illustrates the process in C#.
using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Images_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse (FileInfo is the SDK model type, not System.IO.FileInfo)
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.pdf" };

                // Create an image extraction request
                var request = new ExtractImagesRequest(fileInfo);

                // Extract images from the document
                var response = apiInstance.ExtractImages(request);

                // Loop through and output each extracted image's info
                foreach (var image in response.Images)
                {
                    Console.WriteLine("Image Format: " + image.Format + ", Image Path: " + image.Path);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractImages: " + e.Message);
            }
        }
    }
}
Parsing Document by Template
Learn how to parse a document by using a custom template for structured data extraction with the GroupDocs.Parser Cloud API. This example shows the template-based parsing in C#.
using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;
using GroupDocs.Parser.Cloud.Sdk.Model;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Parse_Document_By_Template
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document and template file
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "invoice.pdf" };
                var templatePath = "path/to/template.json";

                // Create a template-based parsing request
                var request = new ParseRequest(fileInfo, templatePath);

                // Parse the document using the template
                var response = apiInstance.Parse(request);

                // Output the parsed data to the console
                foreach (var field in response.Fields)
                {
                    Console.WriteLine("Field Name: " + field.Name + ", Field Value: " + field.Value);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.Parse: " + e.Message);
            }
        }
    }
}
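To make the idea of a template more concrete, the sketch below applies a small set of named patterns to plain text and collects the matched values as fields, which is the general principle behind template-based extraction. This is a local illustration only: `TemplateFieldSketch` and `TemplateParsingSketch` are hypothetical types, and the regular-expression format used here is an assumption that does not represent the template format accepted by the service.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical field definition: a name plus a pattern with one capture group.
public sealed class TemplateFieldSketch
{
    public TemplateFieldSketch(string name, string pattern)
    {
        Name = name;
        Pattern = new Regex(pattern, RegexOptions.IgnoreCase);
    }

    public string Name { get; }
    public Regex Pattern { get; }
}

public static class TemplateParsingSketch
{
    // Applies each field's pattern to the text and returns
    // field name -> value of the first capture group of the first match.
    public static Dictionary<string, string> Apply(
        IEnumerable<TemplateFieldSketch> template, string text)
    {
        var result = new Dictionary<string, string>();
        foreach (TemplateFieldSketch field in template)
        {
            Match match = field.Pattern.Match(text);
            if (match.Success && match.Groups.Count > 1)
            {
                result[field.Name] = match.Groups[1].Value.Trim();
            }
        }
        return result;
    }
}
```

A template with an `InvoiceNumber` field using the pattern `Invoice\s+No\.?\s*:\s*(\S+)` and a `Total` field using `Total\s*:\s*([\d.,]+)` would, for the text "Invoice No: INV-0042" followed by "Total: 1,250.00 USD" on the next line, yield the two fields `INV-0042` and `1,250.00`. Fields whose pattern does not match are simply absent from the result.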