Turn large, unstructured documents into LLM-ready markdown using LandingAI Document Parsing.
- Automatically watches a Google Drive folder, submits new documents to Landing.ai for parsing, caches processed files in Supabase to avoid reprocessing, and reliably polls for results with retry and timeout handling.
Use Cases
- Automated document ingestion for RAG pipelines
- Invoice, contract, or report parsing
- AI-powered document analysis workflows
- Knowledge base ingestion from Google Drive
- Preventing duplicate document processing in ETL pipelines
External services:
Credentials Required
- Google Drive OAuth2
- Landing.ai API (HTTP Bearer Token)
- Supabase API
How it works
Once a PDF lands in the watched Google Drive folder, the workflow triggers and converts it (even documents longer than 200 pages) into LLM-ready markdown.
It also checks the database to see whether the document has already been parsed, which avoids unnecessary LandingAI API calls.
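The cache-first flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual node logic: the in-memory `cache` dict stands in for the Supabase table, and `parse_document` is a hypothetical placeholder for the LandingAI call.

```python
from typing import Callable, Dict, Tuple


def process_file(
    file_id: str,
    cache: Dict[str, str],
    parse_document: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (markdown, was_cached), parsing only on a cache miss."""
    cached = cache.get(file_id)
    if cached is not None:
        # Early exit: this file was parsed before, so no API call is made.
        return cached, True

    # Cache miss: call the parser (hypothetical stand-in for LandingAI).
    markdown = parse_document(file_id)
    cache[file_id] = markdown
    return markdown, False


# Usage: the second call for the same file never reaches the parser.
calls = []


def fake_parse(file_id: str) -> str:
    calls.append(file_id)
    return f"# Parsed {file_id}"


cache: Dict[str, str] = {}
first = process_file("drive-file-1", cache, fake_parse)
second = process_file("drive-file-1", cache, fake_parse)
```

Keying the cache on the Drive file ID means a renamed file is still recognized, while a re-uploaded copy (which gets a new ID) would be parsed again.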
Setup Instructions
Step 1: Google Drive
- Create or select a folder in Google Drive
- Copy the folder ID
- Update the Google Drive Trigger node with this folder ID
Step 2: Landing.ai
- Create a Landing.ai account
- Generate an API key
- Add it in n8n as an HTTP Bearer Auth credential
- Update the organization-id header if required
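For reference, the headers that the HTTP Bearer Auth credential and the optional organization-id header produce might look like the sketch below. The helper name `build_landing_headers` is hypothetical, and n8n builds these headers for you once the credential is attached; the snippet only shows the resulting shape.

```python
from typing import Dict, Optional


def build_landing_headers(
    api_key: str, organization_id: Optional[str] = None
) -> Dict[str, str]:
    """Build request headers for a Bearer-token API (illustrative helper)."""
    if not api_key:
        raise ValueError("api_key must not be empty")

    # Standard HTTP Bearer scheme: "Authorization: Bearer <token>".
    headers = {"Authorization": f"Bearer {api_key}"}

    # The organization-id header is only sent when a value is configured.
    if organization_id:
        headers["organization-id"] = organization_id
    return headers


headers_basic = build_landing_headers("my-api-key")
headers_org = build_landing_headers("my-api-key", organization_id="org-123")
```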
Step 3: Supabase
- Create a Supabase project
- Create a table named landing_parse_cache
- Add fields such as:
  - file_id
  - document_name
  - mime_type
  - file_size_bytes
  - job_id
  - job_status
  - markdown
  - uploaded_at
  - workflow_run_id
- Connect Supabase credentials in n8n
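One way to picture a row in landing_parse_cache is as a small typed record. The field names below come from the list above; the Python types, the defaults, and the `"pending"` status value are assumptions made for illustration, so adjust them to match the column types you choose in Supabase.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ParseCacheRow:
    """One row of landing_parse_cache (types and defaults are assumed)."""

    file_id: str                            # Google Drive file ID, used as the cache key
    document_name: str
    mime_type: str
    file_size_bytes: int
    job_id: Optional[str] = None            # set once the parse job is submitted
    job_status: str = "pending"             # assumed initial status value
    markdown: Optional[str] = None          # filled in when parsing completes
    uploaded_at: Optional[str] = None       # ISO 8601 timestamp as text
    workflow_run_id: Optional[str] = None


row = ParseCacheRow(
    file_id="1AbCdEf",
    document_name="report.pdf",
    mime_type="application/pdf",
    file_size_bytes=1_048_576,
)
payload = asdict(row)  # plain dict, ready to send as an insert payload
```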
Expected Input
- A document uploaded into the configured Google Drive folder (PDF, DOCX, or other supported formats)
Expected Output
- Parsed markdown content stored in Supabase
- Metadata including:
  - File ID
  - File name
  - MIME type
  - File size
  - Job ID
  - Processing status
- Early exit if the document already exists in cache
Error Handling & Edge Cases
- Cache check to prevent duplicate processing
- Retry-based polling for async job completion
- Timeout detection for stuck jobs
- Large file output URL handling
- Detailed logging for debugging and audits
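The retry-based polling and timeout detection listed above could look roughly like this sketch. It is not the workflow's exact logic: `fetch_status` is a hypothetical callable standing in for the job-status request, the status strings `"completed"` and `"failed"` are assumed values, and the intervals are placeholders.

```python
import time
from typing import Any, Callable, Dict


class JobTimeoutError(Exception):
    """Raised when a parse job does not finish before the deadline."""


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    max_errors: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status, retrying transient errors."""
    deadline = time.monotonic() + timeout_s
    consecutive_errors = 0
    while time.monotonic() < deadline:
        try:
            result = fetch_status(job_id)
            consecutive_errors = 0
        except ConnectionError:
            # Transient failure: retry a limited number of times in a row.
            consecutive_errors += 1
            if consecutive_errors > max_errors:
                raise
            sleep(interval_s)
            continue
        # "completed" and "failed" are assumed terminal status names.
        if result.get("status") in ("completed", "failed"):
            return result
        sleep(interval_s)
    raise JobTimeoutError(f"job {job_id} did not finish within {timeout_s}s")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays; in production the default `time.sleep` is used.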
Customization Ideas
- Push parsed output to a vector database
- Trigger Slack or email notifications
- Store results in cloud storage (S3, GCS)
- Extend into a RAG or AI agent pipeline
Categories
- Document Processing
- AI & LLM
- Knowledge Management
- Automation
Difficulty Level
Advanced
Happy Automating - from Alok