Upstage EduStage

Introduction to Document AI


Last updated 2 months ago

📌 Table of Contents

  • What is Document AI?

  • Why DocAI?

  • Upstage DocAI Comparison

1. What is Document AI?

Document AI (DocAI) is an AI technology that digitizes documents and automatically extracts key information.

Although it may seem unfamiliar, DocAI is already integrated into many everyday tasks, significantly reducing the time spent on document-related processes.

Let’s look at some examples.

✔ Automated Document Scanning and Data Extraction

Have you ever scanned an ID or a contract at a bank and seen the key details extracted automatically?

DocAI recognizes text in documents and accurately extracts essential information like names, dates, and addresses, automatically populating relevant fields.

✔ Receipt Processing Automation

Submitting receipts for reimbursement can be tedious when you have to enter each date and amount by hand.

DocAI scans receipts and automatically extracts details such as date, amount, and items, organizing them efficiently.

This reduces repetitive manual work and streamlines document processing.
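To make the receipt example concrete, here is a minimal sketch of the "plain text in, organized fields out" step. It is not how Upstage DocAI works internally: the receipt text, the `extract_receipt_fields` helper, and the layout assumptions (an ISO-style date, a `TOTAL` line, items separated from prices by two or more spaces) are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical OCR output for a receipt; real OCR text varies widely in layout.
RECEIPT_TEXT = """\
Corner Cafe
Date: 2024-03-15
Americano      4.50
Bagel          3.25
TOTAL          7.75
"""

# Assumed layout: item name, two or more spaces, then a price with two decimals.
ITEM_PATTERN = re.compile(
    r"^(?!TOTAL\b)([A-Za-z][A-Za-z ]*?)\s{2,}(\d+\.\d{2})$", re.IGNORECASE
)


def extract_receipt_fields(text: str) -> dict:
    """Pull the date, total, and line items out of plain receipt text."""
    date_match = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)
    total_match = re.search(
        r"^TOTAL\s+(\d+\.\d{2})$", text, flags=re.MULTILINE | re.IGNORECASE
    )

    items = []
    for line in text.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append({"name": match.group(1), "price": float(match.group(2))})

    return {
        "date": date_match.group(1) if date_match else None,
        "total": float(total_match.group(1)) if total_match else None,
        "items": items,
    }


if __name__ == "__main__":
    print(extract_receipt_fields(RECEIPT_TEXT))
```

Hand-written patterns like these break as soon as the receipt layout changes, which is the main reason to use a DocAI product rather than rules of this kind.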

✔ Automatic Document Classification

Have you seen a system scan a mixed stack of documents, such as insurance papers and invoices, and sort them by type automatically?

DocAI identifies text in various documents, such as contracts, invoices, and IDs, and automatically classifies them for easier document management and retrieval.
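The classification idea can be sketched with a simple keyword count over the recognized text. This is only an illustration under stated assumptions: the `KEYWORDS` table and the `classify_document` function are hypothetical, and a real DocAI classifier would typically rely on a trained model rather than fixed keywords.

```python
# Hypothetical keyword lists per document type, for illustration only.
KEYWORDS = {
    "invoice": ["invoice", "amount due", "bill to"],
    "contract": ["agreement", "party", "hereby"],
    "id_card": ["date of birth", "nationality", "id number"],
}


def classify_document(text: str) -> str:
    """Return the label whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best_label = max(scores, key=scores.get)
    # If no keyword matched at all, do not guess a category.
    return best_label if scores[best_label] > 0 else "unknown"


if __name__ == "__main__":
    print(classify_document("INVOICE #1 Bill To: Jane. Amount due: $10"))
    print(classify_document("This Agreement is made between Party A and Party B"))
```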

2. Why DocAI?

DocAI is transforming document processing in multiple ways.

🚀 (1) Increased Productivity

  • Automated Document Processing: Scanning, data extraction, and classification are automated, enhancing workflow speed and efficiency.

  • Accurate Data Extraction: Extracts key details quickly and accurately, reducing human errors in data entry.

  • Time-Saving: Reduces repetitive manual tasks, allowing employees to focus on more critical work.

🌎 (2) Enhanced Accessibility

  • Support for Various Document Formats: Can process paper documents, PDFs, and images, facilitating digital transformation.

  • Multilingual Document Processing: Recognizes and processes documents in multiple languages, improving international workflows.

  • Structured Data Output: Organizes extracted data for easier searching and analysis.

💰 (3) Cost Reduction and Error Minimization

  • Lower Labor Costs: Automation reduces the need for manual data entry, optimizing workforce efficiency.

  • Fewer Errors: Automated extraction minimizes input errors and enhances data reliability.

3. Upstage DocAI Comparison

Upstage provides three primary tools for document processing and information extraction: Document OCR, Information Extract, and Document Parse.

💡 Upstage Document OCR (Optical Character Recognition)

Document OCR extracts raw text from scanned images or documents.

  • Example: Extracting "Apple Inc." as plain text from a financial statement.

  • Best Use Case: When you only need the text itself and want it quickly.

  • Output Format: Plain text.

💡 Upstage Information Extract

Information Extract automatically extracts key structured information from documents.

  • Example: Extracting {"Company Name": "Apple Inc."} from a financial statement.

  • Best Use Case: When you need to extract not just text but also structured information like company names, dates, and amounts.

  • Output Format: JSON-formatted structured data.

💡 Upstage Document Parse

Document Parse recognizes document structure and converts it into a structured format (HTML or Markdown) that LLMs can process.

  • Example: Extracting "Apple Inc." from a financial statement and structuring it in HTML.

  • Best Use Case:

    • Used for complex documents such as reports, financial statements, or research papers where tables, figures, and formulas need to be structured.

    • Prepares documents in a structured format for LLMs to summarize or analyze.

  • Output Format: Structured data in HTML or Markdown.
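To see why structured output matters for downstream processing, the sketch below reads the rows of a table from HTML using only Python's standard library. The `PARSED_HTML` string is a hypothetical stand-in written for this example, not actual Document Parse output, and `TableExtractor` and `extract_rows` are illustrative names.

```python
from html.parser import HTMLParser

# Hypothetical HTML in the style a document parser might emit (placeholder content).
PARSED_HTML = """
<h1>Financial Statement</h1>
<table>
  <tr><th>Field</th><th>Value</th></tr>
  <tr><td>Company Name</td><td>Apple Inc.</td></tr>
  <tr><td>Document Type</td><td>Financial statement</td></tr>
</table>
"""


class TableExtractor(HTMLParser):
    """Collect the cell text of every table row as a list of lists."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def extract_rows(html: str) -> list:
    """Return all table rows found in the given HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.rows


if __name__ == "__main__":
    for row in extract_rows(PARSED_HTML):
        print(row)
```

Because the table boundaries survive in the markup, each cell stays attached to its row and column, which plain OCR text does not guarantee.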

Wrap Up

This guide covered what DocAI is, why it matters, and how Upstage's three DocAI products compare.

🔹 Key DocAI Functions: Automates document scanning, data extraction, and classification, maximizing efficiency.

🔹 Importance of DocAI: Improves productivity, accessibility, and cost savings.

🔹 Upstage Document OCR: Extracts text from scanned documents.

🔹 Upstage Document Parse: Recognizes document structure, including tables and charts, and converts it into structured formats (HTML or Markdown) for LLM processing.

🔹 Upstage Information Extract: Extracts structured data from complex documents and layouts.

YoungHoon Jeon | AI Edu | Upstage

👉 Now, try Upstage DocAI and explore the differences between each product!
