Ragie

Connects your unstructured data to AI agents by automating document ingestion, partitioning, and retrieval. Use this to turn static files and URLs into a queryable knowledge base for your workflows.

Why use Ceven?

  1. AI-native Ragie integration

    • Describe the outcome and Ceven picks the right Ragie calls, fills the parameters, and checks the result.
    • Structured, agent-friendly tool schemas so each call runs reliably instead of by guesswork.
    • Rich coverage for reading, writing, and querying your Ragie data, across all 30 of its actions.
  2. Managed auth

    • Built-in OAuth with automatic token refresh and rotation.
    • One place to manage, scope, and revoke Ragie access.
    • Per-user and per-environment credentials instead of shared keys.
  3. Agent-optimized design

    • Actions are tuned from real success and error rates so reliability climbs over time.
    • Full execution logs so you always know what ran in Ragie, when, and on whose behalf.
    • The agent pauses and asks when a Ragie request is ambiguous instead of plowing ahead.
  4. Enterprise-grade security

    • Fine-grained access so you control which agents and people can reach Ragie.
    • Least privilege by default: read scopes first, then only the writes a workflow needs.
    • A full audit trail of every Ragie action to support review and sign off.

Supported tools

Every action Ceven's agents can run on Ragie, and when to use it.

Create Document
Upload and process a document file in a supported format, including text and images.
Create Document From URL
Ingest a document from a publicly accessible URL to add external sources to your knowledge base.
Create Document Raw
Ingest a document as raw text or JSON data for immediate processing and indexing.
Create Instruction
Define natural language directives for structured data extraction or analysis during ingestion.
Create Partition
Create a new partition to scope documents and set resource limits for different tenants.
Delete Document
Permanently remove a document from the system using its unique identifier.
Get Document Summary
Pull an LLM-generated summary of a specific document by its ID.
List Documents
Retrieve a paginated list of all documents to browse metadata and creation dates.
List Entities By Document
Retrieve all structured entities extracted from a specific document.
Retrieve Document Chunks
Search and pull relevant document chunks based on a query with optional reranking.
Patch Document Metadata
Modify specific metadata fields for a document without replacing the entire object.
Update Document From URL
Refresh the content of an existing document by fetching it from a public URL again.
Create OAuth Redirect URL
Create an OAuth redirect URL to initialize an embedded connector OAuth flow. Use it to set up OAuth authentication for connectors such as Google Drive, Notion, or HubSpot.
Delete Instruction
Delete an instruction and all associated entities. The operation is irreversible and requires the instruction ID (UUID format).
Delete Partition
Irreversibly delete a partition and all associated data. Returns status 200 for synchronous deletion or 202 for asynchronous deletion.
Get Document
Retrieve a specific document by its unique identifier. Use it to get document details, metadata, and processing status, or to check for errors. Returns document information including chunk count and page count.
Get Document Chunk
Retrieve a specific document chunk by its document ID and chunk ID. Use it when you need details about one chunk, including its content, metadata, position index, and optional modality data (for example, for audio).
Get Document Chunk Content
Retrieve a document chunk's content in the requested format, with streaming support for media. Use it to get the actual content of a specific chunk.
Get Document Chunks
Retrieve a document's chunks with pagination. Chunks are listed by index in ascending order (maximum 100 items per page). Documents created before September 18, 2024 that have not been updated since have their chunks sorted by ID.
Get Document Content
Retrieve the full content of a document by its ID. Use the media_type parameter to request the content in a different format.
Get Partition
Retrieve a partition by ID, along with its usage statistics and resource limits.
Get Response
Retrieve a response by its unique identifier. Use it to check the status or details of a previously created response.
List Connections
List all connections, sorted by creation date in descending order, with pagination. Results can optionally be filtered by metadata.
List Connection Source Types
List the available connection source types, such as 'google_drive' and 'notion', along with their metadata. Use it to discover which connector types are available in Ragie.
List Entities by Instruction
Retrieve the entities generated by a specific instruction, that is, the entities extracted from documents when that instruction was applied.
List Instructions
Retrieve all instructions in Ragie. Use it to review the natural language prompts and entity schemas applied to documents.
List Partitions
Retrieve a paginated list of all partitions, sorted by name in ascending order, with their configurations and limits.
Set Partition Limits
Set usage limits on a partition's pages and media. Use it to configure monthly or maximum limits for pages processed or hosted, video and audio processing, or media streaming and hosting.
Update Document Raw
Update a document's content from raw text or JSON data. The document is reprocessed and becomes available for retrieval once it reaches the ready state.
Update Instruction
Activate or deactivate an existing instruction by updating its active status.
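
Several of the list actions above return results one page at a time. The sketch below shows one way a workflow might drain such a listing. It assumes a hypothetical `fetch_page(cursor)` callable that returns a dict with `items` and a `next_cursor` that is `None` on the last page; those names and the in-memory fake are illustrative assumptions, not the documented response shape of Ragie or Ceven.

```python
from typing import Callable, Iterator, Optional

# Assumed page shape (hypothetical): {"items": [...], "next_cursor": str | None}
Page = dict


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every item across pages by following the cursor until it is None."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


def make_fake_fetch(items: list, page_size: int = 100):
    """In-memory stand-in for a paginated endpoint (hypothetical)."""
    def fetch_page(cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor is not None else 0
        end = start + page_size
        next_cursor = str(end) if end < len(items) else None
        return {"items": items[start:end], "next_cursor": next_cursor}
    return fetch_page


# 250 fake chunks served 100 per page, collected in index order.
chunks = [{"index": i} for i in range(250)]
collected = list(iter_all(make_fake_fetch(chunks, page_size=100)))
```

Writing the loop as a generator lets the caller stop early, for example after finding the first chunk that matches a condition, without fetching the remaining pages.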

30 actions

Frequently asked questions

Ragie lets you update documents from raw text or public URLs. When you use an update tool, the system replaces the existing content and sends the document back through the processing pipeline, which re-partitions the text and updates the vector embeddings so that retrieval calls return the most current information. You can track the status of the update through the document metadata until it reaches the ready state. This keeps your AI agents from answering from outdated documentation or old policy files that your team has since revised.
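
Waiting for a reprocessed document to reach the ready state can be sketched as a bounded polling loop. This is a minimal illustration, not Ceven's or Ragie's implementation: `get_status` is a hypothetical callable standing in for whatever reads the document status, and every status name other than `ready` (including `failed`) is an assumption made for the example.

```python
import time
from typing import Callable


def wait_until_ready(get_status: Callable[[], str],
                     max_attempts: int = 10,
                     delay_seconds: float = 0.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the status is 'ready'; give up on 'failed' or after max_attempts."""
    for _ in range(max_attempts):
        status = get_status()
        if status == "ready":
            return True
        if status == "failed":  # assumed terminal error status
            return False
        sleep(delay_seconds)
    return False


# Fake status sequence; the intermediate names are illustrative only.
statuses = iter(["partitioning", "indexing", "ready"])
result = wait_until_ready(lambda: next(statuses))
```

Capping the number of attempts keeps a workflow from waiting forever on a document that never finishes processing.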
Partitions are logical containers that isolate documents and connections. This is critical for multi-tenant applications, where a query for Customer A must never retrieve data belonging to Customer B. Scoping every retrieval and ingestion call to a specific partition ID creates a hard boundary at the data layer. Ceven can automate partition creation during your user onboarding workflow, assigning a unique partition to each new account and setting resource limits so that no single user consumes your entire processing quota or storage capacity.
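
The isolation idea can be illustrated with a toy in-memory store in which every ingest and every retrieval names a partition ID. The class, its methods, and the substring matching are hypothetical stand-ins for illustration; Ragie's actual storage and search work differently.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedStore:
    """Toy store keyed by partition ID (an illustration, not Ragie's implementation)."""
    _docs: dict = field(default_factory=dict)

    def ingest(self, partition_id: str, text: str) -> None:
        # Each document is filed under exactly one partition.
        self._docs.setdefault(partition_id, []).append(text)

    def retrieve(self, partition_id: str, query: str) -> list:
        # Only the caller's partition is searched; other tenants are never read.
        candidates = self._docs.get(partition_id, [])
        return [t for t in candidates if query.lower() in t.lower()]


store = PartitionedStore()
store.ingest("customer-a", "Refund policy for Customer A: 30 days")
store.ingest("customer-b", "Refund policy for Customer B: 14 days")
hits_a = store.retrieve("customer-a", "refund policy")
```

Because the partition ID selects the data before any matching happens, a query that would match both tenants' documents still returns only the caller's own.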
Ragie uses Instructions to apply natural language directives to documents during ingestion. You define a schema of the entities you want to find, such as contract expiration dates or product SKUs, and Ragie processes the document and stores the matches as extracted entities. With Ceven, you can list these entities by document ID and push them into a structured database such as Airtable or Postgres. This turns a folder of messy PDFs into a clean, queryable table without requiring you to build a custom OCR and parsing pipeline.
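
As a rough sketch of the last step, the snippet below loads extracted entities into a table and queries it. It uses SQLite from the Python standard library as a stand-in for Airtable or Postgres so that it runs anywhere, and the entity payloads and their field names (`document_id`, `data`, `sku`, `expires`) are invented for illustration rather than taken from Ragie's response format.

```python
import sqlite3

# Hypothetical entity payloads; field names are assumptions for this example.
entities = [
    {"document_id": "doc-1", "data": {"sku": "A-100", "expires": "2026-01-31"}},
    {"document_id": "doc-2", "data": {"sku": "B-200", "expires": "2025-09-30"}},
]

# In-memory SQLite database standing in for Postgres or Airtable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (document_id TEXT, sku TEXT, expires TEXT)")
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [(e["document_id"], e["data"]["sku"], e["data"]["expires"]) for e in entities],
)
conn.commit()

# ISO-formatted dates sort correctly as text, so ORDER BY gives soonest first.
expiring_first = conn.execute(
    "SELECT document_id, sku FROM contracts ORDER BY expires"
).fetchall()
```

Once the entities sit in a table, questions like "which contract expires next" become ordinary queries instead of document searches.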
Ragie provides tools to update documents from URLs or raw text, but it is not a live mirror: you must trigger each update through an API call or a Ceven workflow. For example, you can schedule a Ceven workflow to run every 24 hours and call the Update Document From URL action. Once the call is made, Ragie handles the partitioning and indexing. There is therefore a lag between the source content changing and the AI agent seeing the update, and its size depends on how often your workflow runs the refresh.
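
The scheduling decision itself is simple to sketch. The helper below, a hypothetical function rather than part of Ceven or Ragie, decides whether a document is due for another refresh given when it was last refreshed and a chosen interval (24 hours by default, matching the example above).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_refresh_due(last_refreshed: Optional[datetime],
                   now: datetime,
                   interval: timedelta = timedelta(hours=24)) -> bool:
    """True if the document was never refreshed or the interval has elapsed."""
    if last_refreshed is None:
        return True
    return now - last_refreshed >= interval


now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
due = is_refresh_due(now - timedelta(hours=25), now)
```

With this kind of check, the worst-case staleness of a document is roughly the refresh interval plus the time Ragie needs to reprocess it.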
A document is the entire file or text block you upload to Ragie. A chunk is a smaller, semantically meaningful piece of that document created during the partitioning process. When you perform a retrieval search, Ragie does not return the whole document, because a full document can exceed an LLM's context window. Instead, it returns the most relevant chunks. Ceven can retrieve these chunks and feed them into a prompt, or it can use the Get Document Content tool when you need the full text for a task such as a complete rewrite.
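
To make the context-window point concrete, here is a minimal sketch of packing ranked chunks into a fixed budget before building a prompt. The function name is hypothetical, and counting whitespace-separated words is only a rough proxy for tokens; a real workflow would use the tokenizer of its target model.

```python
def pack_chunks(ranked_chunks: list, max_words: int) -> list:
    """Keep chunks in rank order until the next one would exceed the word budget."""
    selected = []
    used = 0
    for chunk in ranked_chunks:
        words = len(chunk.split())  # crude stand-in for a token count
        if used + words > max_words:
            break
        selected.append(chunk)
        used += words
    return selected


ranked = ["a b c", "d e", "f g h i"]  # already ordered most relevant first
packed = pack_chunks(ranked, max_words=5)
```

Stopping at the first chunk that does not fit preserves the relevance ordering, at the cost of sometimes leaving a little budget unused.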
Ragie enforces limits on the number of pages and the amount of media processed per partition. Depending on your plan, you may hit a ceiling on the total number of hosted pages or on the amount of video and audio processing allowed per month. If you exceed these limits, the API returns an error and the document is not indexed. You can use the Get Partition tool in Ceven to monitor your current usage statistics and programmatically trigger an alert or a plan upgrade when you approach 80 percent of your limit, avoiding workflow interruptions.
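
The 80 percent check can be sketched as a small helper. The function and the usage field names below are hypothetical, chosen for illustration; the real statistics returned by Get Partition may be named and structured differently. Treating a missing limit as "unlimited" is also an assumption.

```python
from typing import Optional


def should_alert(used: int, limit: Optional[int], threshold_percent: int = 80) -> bool:
    """True when usage has reached the given percentage of the limit."""
    if limit is None:
        return False  # assumption: no limit configured means nothing to alert on
    if limit <= 0:
        return True   # a zero limit is treated as already exhausted
    # Integer arithmetic avoids floating-point edge cases at the exact threshold.
    return used * 100 >= limit * threshold_percent


# Hypothetical usage snapshot for one partition.
usage = {"pages_hosted": 8200, "pages_hosted_limit": 10000}
alert = should_alert(usage["pages_hosted"], usage["pages_hosted_limit"])
```

Running a check like this on a schedule gives you time to raise limits or upgrade before ingestion calls start failing.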
When you trigger a retrieval action, Ceven sends your natural language query to Ragie. Ragie converts that query into a vector and searches its index for the closest matching chunks within the specified partition. It can also rerank the results so that the most helpful content is at the top of the list. Ceven then receives these chunks and their associated metadata. You can instruct your AI agent to answer the user query using only those chunks, which reduces hallucinations and keeps the answer grounded in your own private data.
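
The flow described above can be sketched end to end with toy data. This is an illustration of vector search in general, not Ragie's implementation: the three-dimensional vectors are written by hand where a real system would use an embedding model, cosine similarity is an assumed distance measure, reranking is omitted, and all function names are hypothetical.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list, index: list, k: int = 2) -> list:
    """index holds (chunk_text, vector) pairs; return the k closest chunk texts."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]


def grounded_prompt(question: str, chunks: list) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")


index = [
    ("Refunds are issued within 30 days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on holidays.", [0.0, 0.2, 0.9]),
    ("Refund requests need an order number.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # hand-written toy embedding of the question
best = top_k(query_vec, index, k=2)
prompt = grounded_prompt("How do refunds work?", best)
```

The unrelated chunk about holidays never reaches the prompt, which is what keeps the agent's answer tied to relevant source material.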
Ragie supports various formats, including images. When you upload an image with the Create Document tool, Ragie processes the visual information to make it searchable and retrievable. This is particularly useful for diagrams, software screenshots, and scanned invoices. The AI agent in Ceven can then retrieve the text or descriptions extracted from these images to answer questions, so you can build a knowledge base that includes visual evidence and technical drawings alongside your standard text documents and web pages.

Alternatives to Ragie

Other tools that solve a similar problem. Ceven supports these too, so you can switch or run more than one at once.

Pinecone, LlamaIndex, Weaviate

Try Ceven on your stack

Layer Ceven on top of the tools you already run. Connect Ragie and the rest of your stack, describe the outcome, and its agents handle the work end to end, turning days of work into minutes.

Get started for free