Turn documents into structured, queryable knowledge. Upload files, define a schema, and let your agents traverse entities and relationships — not just match keywords.
What is a knowledge graph?
A knowledge graph extracts structured entities and relationships from your documents and stores them in a queryable graph. Instead of plain vector search ("find chunks that sound similar to my question"), your agent can follow connections: Who founded Acme Corp? → What products does Acme make? → Which customers use those products?
You define what matters by providing a schema — the entity types and relationship types you care about. The system uses an LLM to extract matching entities and relationships from every file you add, then indexes them for both semantic search and graph traversal.
When to use a knowledge graph vs. plain vector search
Use a knowledge graph when:
- Your data has meaningful relationships (people → companies, products → features, papers → citations)
- Your agents need to answer multi-hop questions ("Who manages the team that built Feature X?")
- You want structured extraction, not just fuzzy retrieval
- You need to browse, filter, and inspect what was extracted
Use plain vector search when:
- You just need "find the most relevant passage" for a question
- Your documents are homogeneous and don't have relational structure
- Speed matters more than precision
Quick start
1. Create a knowledge graph with a schema
The schema tells the extraction model what entity types and relationship types to look for.
curl -X POST https://api.ragwalla.com/knowledge_graphs \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Engineering Org",
"description": "People, teams, and projects across engineering",
"embedding_settings": {
"model": "text-embedding-3-small"
},
"extraction_schema": {
"type": "object",
"properties": {
"entities": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"entity_type": { "enum": ["Person", "Team", "Project", "Technology"] },
"properties": { "type": "object" }
},
"additionalProperties": false,
"required": ["name", "entity_type"]
}
},
"relationships": {
"type": "array",
"items": {
"type": "object",
"properties": {
"from_entity": { "type": "string" },
"to_entity": { "type": "string" },
"relationship_type": { "enum": ["manages", "member_of", "works_on", "uses"] },
"properties": { "type": "object" }
},
"additionalProperties": false,
"required": ["from_entity", "to_entity", "relationship_type"]
}
}
},
"additionalProperties": false,
"required": ["entities", "relationships"]
}
}'
{
"id": "kg_abc123",
"object": "knowledge_graph",
"name": "Engineering Org",
"description": "People, teams, and projects across engineering",
"project_id": "proj_xyz",
"embedding_model": "text-embedding-3-small",
"extraction_model": "google/gemini-3-flash-preview",
"extraction_schema": { "..." },
"extraction_prompt": null,
"document_schema": null,
"dimensions": 1536,
"metric": "cosine",
"status": "active",
"entity_count": 0,
"relationship_count": 0,
"file_count": 0,
"created_at": 1710000000
}
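Because the entity and relationship sections of the schema share the same shape, it can be convenient to generate the schema from two lists of type names rather than writing it by hand. The following is a minimal sketch; the helper name build_extraction_schema is hypothetical and not part of the API, and the request body is only built here, not sent.

```python
import json


def build_extraction_schema(entity_types, relationship_types):
    """Return an extraction schema dict in the shape used by the create request above."""

    def item(fields, enum_field, enum_values):
        # Plain string fields first, then the enum field, then the free-form "properties" object.
        props = {name: {"type": "string"} for name in fields}
        props[enum_field] = {"enum": list(enum_values)}
        props["properties"] = {"type": "object"}
        return {
            "type": "object",
            "properties": props,
            "additionalProperties": False,
            "required": list(fields) + [enum_field],
        }

    return {
        "type": "object",
        "properties": {
            "entities": {
                "type": "array",
                "items": item(["name"], "entity_type", entity_types),
            },
            "relationships": {
                "type": "array",
                "items": item(["from_entity", "to_entity"], "relationship_type", relationship_types),
            },
        },
        "additionalProperties": False,
        "required": ["entities", "relationships"],
    }


schema = build_extraction_schema(
    ["Person", "Team", "Project", "Technology"],
    ["manages", "member_of", "works_on", "uses"],
)

# JSON body for POST /knowledge_graphs (same fields as the curl example above).
body = json.dumps({
    "name": "Engineering Org",
    "embedding_settings": {"model": "text-embedding-3-small"},
    "extraction_schema": schema,
})
```

The resulting body can be passed to curl with -d or to any HTTP client.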
2. Upload a file and add it to the graph
Files are uploaded through the Files API first, then associated with a knowledge graph.
# Upload the file
curl -X POST https://api.ragwalla.com/files \
-H "Authorization: Bearer $API_KEY" \
-F purpose=knowledge_graph \
-F file=@team-roster.pdf
# Add it to the knowledge graph
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123/files \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{ "file_id": "file_roster456" }'
{
"knowledge_base_id": "kg_abc123",
"file_id": "file_roster456",
"status": "pending",
"entity_count": 0,
"relationship_count": 0,
"created_at": 1710000000
}
The file is now queued for processing. The system will:
- Extract text (with OCR for PDFs)
- Chunk the content
- Generate embeddings
- Run the extraction model against your schema to find entities and relationships
- Index everything for search and traversal
Processing is asynchronous. Poll the file status to track progress:
curl https://api.ragwalla.com/knowledge_graphs/kg_abc123/files/file_roster456 \
-H "Authorization: Bearer $API_KEY"
{
"knowledge_base_id": "kg_abc123",
"file_id": "file_roster456",
"filename": "team-roster.pdf",
"content_type": "application/pdf",
"bytes": 245000,
"status": "active",
"entity_count": 34,
"relationship_count": 47,
"chunks_extracted": 12,
"total_chunks": 12,
"created_at": 1710000000,
"updated_at": 1710000060
}
File status progresses through: pending → processing → active (or failed).
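The polling loop can be sketched as follows. This is an illustration under stated assumptions: wait_for_file is a hypothetical helper, the 2-second interval and 5-minute timeout are arbitrary choices rather than documented recommendations, and fetch_status stands for whatever code you use to call GET /knowledge_graphs/:id/files/:fileId and parse the JSON body.

```python
import time


def wait_for_file(fetch_status, poll_seconds=2.0, timeout_seconds=300.0, sleep=time.sleep):
    """Poll until the file status is terminal, then return the last response body.

    fetch_status: zero-argument callable returning the parsed JSON body of the
    file status endpoint. Injecting it keeps this loop independent of any HTTP client.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        body = fetch_status()
        status = body.get("status")
        if status == "active":
            return body  # processing finished; counts are populated
        if status == "failed":
            raise RuntimeError(f"file {body.get('file_id')} failed processing")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)  # status is pending or processing; wait and retry
```

Passing the sleep function as a parameter makes the loop easy to exercise in tests without real delays.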
3. Query the graph
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123/query \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "Who works on the payments project?",
"max_hops": 2,
"top_k": 10
}'
{
"entities": [
{
"entity_id": "ent_pay001",
"name": "Payments",
"entity_type": "Project",
"properties_json": "{\"status\": \"active\", \"started\": \"2024-Q1\"}"
}
],
"neighbors": [
{
"entity_id": "ent_per042",
"name": "Alice Chen",
"entity_type": "Person",
"properties_json": "{\"role\": \"Tech Lead\"}"
},
{
"entity_id": "ent_team03",
"name": "Platform Team",
"entity_type": "Team",
"properties_json": null
}
],
"relationships": [
{
"from_entity": "ent_per042",
"to_entity": "ent_pay001",
"relationship_type": "works_on"
},
{
"from_entity": "ent_per042",
"to_entity": "ent_team03",
"relationship_type": "member_of"
}
]
}
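The query endpoint combines semantic vector search with keyword matching, then traverses the graph outward from the matched entities. The response lists the matched entities under entities, the entities reached by traversal under neighbors, and the connections between them under relationships. max_hops controls how far to follow relationships: 1 returns only direct connections, while 2 also returns connections of those connections.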
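Relationships in the response refer to entities by ID, and properties_json arrives as a JSON-encoded string (or null). A small amount of client-side code can turn that into readable triples. The sketch below uses an abbreviated copy of the response above; describe_query_result is a hypothetical helper name.

```python
import json


def describe_query_result(result):
    """Return (entities_by_id, triples) where triples are (from_name, type, to_name)."""
    by_id = {}
    for entity in result.get("entities", []) + result.get("neighbors", []):
        raw = entity.get("properties_json")
        by_id[entity["entity_id"]] = {
            "name": entity["name"],
            "entity_type": entity["entity_type"],
            # properties_json is a JSON string, or null when nothing was captured.
            "properties": json.loads(raw) if raw else {},
        }
    triples = [
        (by_id[r["from_entity"]]["name"], r["relationship_type"], by_id[r["to_entity"]]["name"])
        for r in result.get("relationships", [])
        # Skip edges whose endpoints are not included in this response.
        if r["from_entity"] in by_id and r["to_entity"] in by_id
    ]
    return by_id, triples


response = {
    "entities": [
        {"entity_id": "ent_pay001", "name": "Payments", "entity_type": "Project",
         "properties_json": "{\"status\": \"active\", \"started\": \"2024-Q1\"}"},
    ],
    "neighbors": [
        {"entity_id": "ent_per042", "name": "Alice Chen", "entity_type": "Person",
         "properties_json": "{\"role\": \"Tech Lead\"}"},
        {"entity_id": "ent_team03", "name": "Platform Team", "entity_type": "Team",
         "properties_json": None},
    ],
    "relationships": [
        {"from_entity": "ent_per042", "to_entity": "ent_pay001", "relationship_type": "works_on"},
        {"from_entity": "ent_per042", "to_entity": "ent_team03", "relationship_type": "member_of"},
    ],
}

entities_by_id, triples = describe_query_result(response)
for source, relation, target in triples:
    print(f"{source} --{relation}--> {target}")
```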
4. Attach the graph to an agent
curl -X POST https://api.ragwalla.com/agents/ag_myagent/knowledge_graphs \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{ "knowledge_base_id": "kg_abc123" }'
{
"agent_id": "ag_myagent",
"knowledge_base_id": "kg_abc123",
"created_at": 1710000000
}
Once attached, the agent gets a search_knowledge_graph tool. When users ask questions, the agent can query the graph, follow relationships, and include the results in its response — without you writing any tool code.
Schemas
Extraction schema
The extraction schema is a JSON Schema that defines what the extraction model looks for in your documents. It must follow a specific structure:
{
"type": "object",
"properties": {
"entities": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"entity_type": { "enum": ["Person", "Company", "Product"] },
"properties": { "type": "object" }
},
"additionalProperties": false,
"required": ["name", "entity_type"]
}
},
"relationships": {
"type": "array",
"items": {
"type": "object",
"properties": {
"from_entity": { "type": "string" },
"to_entity": { "type": "string" },
"relationship_type": { "enum": ["founded_by", "works_at", "produces"] },
"properties": { "type": "object" }
},
"additionalProperties": false,
"required": ["from_entity", "to_entity", "relationship_type"]
}
}
},
"additionalProperties": false,
"required": ["entities", "relationships"]
}
Key rules:
- entity_type must be an enum: the extraction model picks from a fixed list of types, not free text
- relationship_type must also be an enum
- additionalProperties: false is required at each level
- The optional properties object on entities and relationships lets you capture extra attributes (role, date, amount, etc.) without constraining them to a schema
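These rules are easy to check locally before sending a schema. The following is a rough client-side sketch, not the service's own validation: check_extraction_schema is a hypothetical helper, and it only covers the enum and additionalProperties rules listed above.

```python
def check_extraction_schema(schema):
    """Return a list of problems; an empty list means the key rules hold."""
    problems = []

    def require_closed(node, path):
        # additionalProperties: false is required at each level.
        if node.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")

    require_closed(schema, "schema")
    for section, enum_field in (("entities", "entity_type"), ("relationships", "relationship_type")):
        items = schema.get("properties", {}).get(section, {}).get("items")
        if not isinstance(items, dict):
            problems.append(f"{section}: missing items definition")
            continue
        require_closed(items, f"{section}.items")
        enum = items.get("properties", {}).get(enum_field, {}).get("enum")
        if not (isinstance(enum, list) and enum):
            problems.append(f"{section}.items.{enum_field}: must be an enum")
    return problems


# A draft with two deliberate mistakes in the entities section:
# entity_type is free text, and additionalProperties is missing.
draft = {
    "type": "object",
    "properties": {
        "entities": {"type": "array", "items": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "entity_type": {"type": "string"}},
            "required": ["name", "entity_type"],
        }},
        "relationships": {"type": "array", "items": {
            "type": "object",
            "properties": {
                "from_entity": {"type": "string"},
                "to_entity": {"type": "string"},
                "relationship_type": {"enum": ["manages"]},
            },
            "additionalProperties": False,
            "required": ["from_entity", "to_entity", "relationship_type"],
        }},
    },
    "additionalProperties": False,
    "required": ["entities", "relationships"],
}

problems = check_extraction_schema(draft)
for problem in problems:
    print(problem)
```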
Document schema
If you already have a JSON Schema that describes the structure of your documents (fields, nested objects, etc.), you can send it as document_schema and the system will infer an appropriate extraction schema from it using an LLM:
curl -X POST https://api.ragwalla.com/knowledge_graphs \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Invoice Graph",
"embedding_settings": { "model": "text-embedding-3-small" },
"document_schema": {
"type": "object",
"properties": {
"invoice_number": { "type": "string" },
"vendor": { "type": "string" },
"line_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": { "type": "string" },
"amount": { "type": "number" }
}
}
}
}
}
}'
You cannot provide both extraction_schema and document_schema in the same request.
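If you build create requests in code, the either/or rule can be enforced before the request is sent. A minimal sketch, assuming a hypothetical create_payload helper; whether a graph may be created with neither schema is not stated here, so the sketch allows that case.

```python
def create_payload(name, embedding_model, extraction_schema=None, document_schema=None):
    """Build the JSON body for POST /knowledge_graphs, rejecting both schemas at once."""
    if extraction_schema is not None and document_schema is not None:
        raise ValueError("provide extraction_schema or document_schema, not both")
    payload = {"name": name, "embedding_settings": {"model": embedding_model}}
    if extraction_schema is not None:
        payload["extraction_schema"] = extraction_schema
    if document_schema is not None:
        payload["document_schema"] = document_schema
    return payload


invoice_payload = create_payload(
    "Invoice Graph",
    "text-embedding-3-small",
    document_schema={"type": "object", "properties": {"vendor": {"type": "string"}}},
)
```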
Schema suggestion
Not sure what schema to use? Upload a few files first, then ask the system to suggest one:
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123/schema/suggest \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"max_files": 5,
"max_chunks_per_file": 3
}'
{
"object": "knowledge_graph.schema_suggestion",
"knowledge_base_id": "kg_abc123",
"model": "google/gemini-3-flash-preview",
"sampled_files": 3,
"sampled_chunks": 9,
"source_file_ids": ["file_a", "file_b", "file_c"],
"extraction_schema": {
"type": "object",
"properties": {
"entities": { "..." },
"relationships": { "..." }
}
},
"summary": "Identified Person, Department, and Policy entity types with manages, belongs_to, and governs relationships.",
"assumptions": [
"Documents are HR policy files",
"Department names are unique identifiers"
]
}
The suggestion samples content from your uploaded files and generates a schema using the extraction model. You can scope the suggestion to specific files with file_ids, or let it sample automatically.
If the suggested schema is too weak (not enough distinct types or relationships), the endpoint returns 422 with an issues array explaining what's wrong.
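A client will usually want to treat the two outcomes differently: apply a usable suggestion, or surface the issues. The sketch below assumes the issues array sits at the top level of the 422 body and that any 2xx status means success; both are assumptions, and suggestion_to_update and WeakSchemaSuggestion are hypothetical names.

```python
class WeakSchemaSuggestion(Exception):
    """Raised when the suggest endpoint answers 422."""

    def __init__(self, issues):
        super().__init__(f"suggested schema rejected with {len(issues)} issue(s)")
        self.issues = issues


def suggestion_to_update(status_code, body):
    """Map a suggest response to (update_payload, assumptions).

    update_payload can be sent to POST /knowledge_graphs/:id to adopt the schema.
    """
    if status_code == 422:
        # Assumption: the issues array is a top-level key of the error body.
        raise WeakSchemaSuggestion(body.get("issues", []))
    if not 200 <= status_code < 300:
        raise RuntimeError(f"unexpected status {status_code}")
    return {"extraction_schema": body["extraction_schema"]}, body.get("assumptions", [])
```

Reviewing the returned assumptions before adopting the schema is a cheap way to catch a suggestion that misread your documents.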
Browsing entities and relationships
After files are processed, you can inspect what was extracted.
List entities
# All entities
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/entities?limit=20" \
-H "Authorization: Bearer $API_KEY"
# Filter by type
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/entities?entity_type=Person&limit=20" \
-H "Authorization: Bearer $API_KEY"
Get a specific entity
curl https://api.ragwalla.com/knowledge_graphs/kg_abc123/entities/ent_per042 \
-H "Authorization: Bearer $API_KEY"
List relationships
# All relationships
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/relationships?limit=20" \
-H "Authorization: Bearer $API_KEY"
# Relationships for a specific entity
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/relationships?entity_id=ent_per042" \
-H "Authorization: Bearer $API_KEY"
# Filter by relationship type
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/relationships?relationship_type=manages" \
-H "Authorization: Bearer $API_KEY"
# Filter by source file
curl "https://api.ragwalla.com/knowledge_graphs/kg_abc123/relationships?source_file_id=file_roster456" \
-H "Authorization: Bearer $API_KEY"
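When these filters are assembled in code, building the query string with a URL encoder avoids quoting mistakes. A small sketch using only the parameters shown above; relationships_url is a hypothetical helper, and whether the API accepts several filters in one request is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ragwalla.com"
# Query parameters that appear in the curl examples above.
RELATIONSHIP_FILTERS = ("entity_id", "relationship_type", "source_file_id", "limit")


def relationships_url(graph_id, **filters):
    """Build the list-relationships URL, rejecting parameters not shown in this guide."""
    unknown = set(filters) - set(RELATIONSHIP_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    query = urlencode({key: value for key, value in filters.items() if value is not None})
    url = f"{BASE_URL}/knowledge_graphs/{graph_id}/relationships"
    return f"{url}?{query}" if query else url


print(relationships_url("kg_abc123", relationship_type="manages"))
```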
Delete an entity
curl -X DELETE https://api.ragwalla.com/knowledge_graphs/kg_abc123/entities/ent_per042 \
-H "Authorization: Bearer $API_KEY"
Search vs. query
There are two ways to retrieve information from a knowledge graph:
Search — flat semantic match
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123/search \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "machine learning engineers",
"top_k": 10,
"entity_type": "Person"
}'
Search finds entities whose names or descriptions are semantically similar to your query. It returns a flat list — no graph traversal, no relationships. Use it when you want a quick lookup.
Query — semantic match + graph traversal
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123/query \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "machine learning engineers",
"max_hops": 2,
"top_k": 10
}'
Query starts with the same semantic match, then walks outward through relationships up to max_hops levels. It returns the matched entities, their neighbors, and the relationships connecting them. Use it when you need context — not just "who matches?" but "who are they connected to, and how?"
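To make the effect of max_hops concrete, here is a toy hop-limited breadth-first traversal over an in-memory edge list. It illustrates the idea only and is not the service's implementation; treating edges as traversable in both directions is an assumption of the sketch, and the third edge is invented for the example.

```python
from collections import deque


def neighbors_within(relationships, seeds, max_hops):
    """Return {entity: hop distance} for everything reachable from seeds within max_hops."""
    adjacency = {}
    for source, target in relationships:
        # Assumption: traversal may follow an edge in either direction.
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # hop budget used up; do not expand further
        for neighbor in adjacency.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance


edges = [
    ("Alice Chen", "Payments"),          # works_on
    ("Alice Chen", "Platform Team"),     # member_of
    ("Example Manager", "Platform Team"),  # manages (hypothetical entity)
]

one_hop = neighbors_within(edges, ["Payments"], max_hops=1)
two_hops = neighbors_within(edges, ["Payments"], max_hops=2)
```

Starting from Payments, one hop reaches only Alice Chen, two hops also reach Platform Team, and Example Manager would need a third hop.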
Managing knowledge graphs
Update a knowledge graph
curl -X POST https://api.ragwalla.com/knowledge_graphs/kg_abc123 \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Engineering Org v2",
"extraction_prompt": "Focus on reporting relationships and project ownership."
}'
You can update name, description, extraction_model, extraction_schema, extraction_prompt, and document_schema after creation. The embedding_settings are immutable — they're locked at creation because existing vectors were generated with that model.
Updating the extraction schema or prompt does not retroactively re-extract existing files. It applies to files added after the change.
Remove a file
curl -X DELETE https://api.ragwalla.com/knowledge_graphs/kg_abc123/files/file_roster456 \
-H "Authorization: Bearer $API_KEY"
{
"knowledge_base_id": "kg_abc123",
"file_id": "file_roster456",
"deleted": false,
"cleanup_queued": true,
"cleanup_status": "queued"
}
File removal is asynchronous. The endpoint returns 202 immediately, so the response reports the cleanup as queued (cleanup_status: queued) rather than complete. Entities, relationships, vectors, and chunks associated with the file are cleaned up in the background.
List an agent's knowledge graphs
curl https://api.ragwalla.com/agents/ag_myagent/knowledge_graphs \
-H "Authorization: Bearer $API_KEY"
{
"object": "list",
"data": [
{
"agent_id": "ag_myagent",
"knowledge_base_id": "kg_abc123",
"name": "Engineering Org",
"description": "People, teams, and projects across engineering",
"embedding_model": "text-embedding-3-small",
"status": "active",
"created_at": 1710000000
}
]
}
Detach a knowledge graph from an agent
curl -X DELETE https://api.ragwalla.com/agents/ag_myagent/knowledge_graphs/kg_abc123 \
-H "Authorization: Bearer $API_KEY"
Delete a knowledge graph
curl -X DELETE https://api.ragwalla.com/knowledge_graphs/kg_abc123 \
-H "Authorization: Bearer $API_KEY"
{
"id": "kg_abc123",
"object": "knowledge_graph",
"deleted": true
}
Deleting a knowledge graph removes all entities, relationships, file associations, and vector indexes.
Use cases
Internal knowledge base
Upload company wikis, onboarding docs, and org charts. Define entity types like Person, Team, Policy, System with relationships like manages, owns, depends_on. Your support agents can then answer questions like "Who owns the billing system?" by traversing the graph rather than hoping a relevant text chunk appears in search results.
Research corpus
Upload academic papers with entity types like Author, Paper, Institution, Method and relationships like authored_by, cites, affiliated_with. Query with max_hops: 3 to discover citation chains and collaboration networks.
Product catalog
Upload product specs and datasheets. Extract Product, Feature, Component, Specification entities with has_feature, contains, compatible_with relationships. Sales agents can answer "Which products support feature X and are compatible with System Y?" with graph traversal instead of keyword guessing.
Compliance and policy
Upload regulatory documents. Extract Regulation, Requirement, Department, Process entities with subject_to, responsible_for, references relationships. Compliance agents can trace which departments are affected by a regulation change by following the graph.
Configuration reference
| Field | Set at | Mutable | Description |
|---|---|---|---|
| name | creation | yes | Display name |
| description | creation | yes | Human-readable description |
| embedding_settings.model | creation | no | Embedding model for vector search |
| embedding_settings.metric | creation | no | Distance metric (default: cosine) |
| extraction_model | creation | yes | LLM used for entity/relationship extraction |
| extraction_schema | creation | yes | JSON Schema defining entity and relationship types |
| extraction_prompt | creation | yes | Custom instructions for the extraction model |
| document_schema | creation | yes | Document structure schema (auto-compiles to extraction schema) |
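The mutability column can be mirrored in client code so that an update never includes a field the API treats as fixed. A minimal sketch based only on the table above; split_update is a hypothetical helper, and how the API itself responds to an immutable field in an update is not covered here.

```python
# Fields marked "yes" in the Mutable column above.
MUTABLE_FIELDS = {
    "name",
    "description",
    "extraction_model",
    "extraction_schema",
    "extraction_prompt",
    "document_schema",
}


def split_update(changes):
    """Return (payload, rejected): fields safe to send, and field names left out."""
    payload = {key: value for key, value in changes.items() if key in MUTABLE_FIELDS}
    rejected = sorted(key for key in changes if key not in MUTABLE_FIELDS)
    return payload, rejected


update_payload, rejected_fields = split_update({
    "name": "Engineering Org v2",
    "extraction_prompt": "Focus on reporting relationships and project ownership.",
    "embedding_settings": {"model": "text-embedding-3-small"},  # immutable: locked at creation
})
```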
API reference summary
| Method | Endpoint | Description |
|---|---|---|
| POST | /knowledge_graphs | Create a knowledge graph |
| GET | /knowledge_graphs | List knowledge graphs |
| GET | /knowledge_graphs/:id | Get a knowledge graph |
| POST | /knowledge_graphs/:id | Update a knowledge graph |
| DELETE | /knowledge_graphs/:id | Delete a knowledge graph |
| POST | /knowledge_graphs/:id/files | Add a file |
| GET | /knowledge_graphs/:id/files | List files |
| GET | /knowledge_graphs/:id/files/:fileId | Get file status |
| DELETE | /knowledge_graphs/:id/files/:fileId | Remove a file |
| GET | /knowledge_graphs/:id/entities | List entities |
| GET | /knowledge_graphs/:id/entities/:entityId | Get an entity |
| DELETE | /knowledge_graphs/:id/entities/:entityId | Delete an entity |
| GET | /knowledge_graphs/:id/relationships | List relationships |
| POST | /knowledge_graphs/:id/search | Semantic entity search |
| POST | /knowledge_graphs/:id/query | Semantic search + graph traversal |
| POST | /knowledge_graphs/:id/schema/suggest | Suggest an extraction schema |
| POST | /agents/:id/knowledge_graphs | Attach graph to agent |
| GET | /agents/:id/knowledge_graphs | List agent's graphs |
| DELETE | /agents/:id/knowledge_graphs/:kgId | Detach graph from agent |