Vector-Grounded Graph Retrieval: The Pattern Behind Modern GraphRAG

Knowledge graphs encode structured facts that LLMs can reason over, but they require precise entry points — while user queries are fuzzy natural language. Vector-grounded graph retrieval solves this by using embedding similarity to discover seed entities, then walking the graph outward to surface connected relationships. This retrieve-and-traverse pattern has become the standard approach in production GraphRAG systems, combining the semantic flexibility of vector search with the structural reasoning of graph traversal.

Knowledge graphs are powerful. They encode entities, relationships, and structured facts in a way that LLMs can reason over. But there's a fundamental tension at the heart of using them in AI applications: graphs require precise entry points, while user queries are fuzzy natural language.

Ask a knowledge graph "Who reports to the VP of Engineering?" and you need to first find the VP of Engineering node. If the user's actual query was "who's on the engineering leadership team?" — good luck with exact string matching.

This is where vector search enters the picture, and the combination of the two has become the dominant pattern in production GraphRAG systems.

The Retrieve-and-Traverse Pattern

The idea is simple and composable:

  1. Embed the query — convert the user's natural-language question into a vector using the same embedding model that indexed the entity descriptions.
  2. Vector search for seed entities — find the top-K entities whose embeddings are closest to the query vector. These are your entry points into the graph.
  3. Graph walk from seeds — starting from those seed entities, traverse relationships outward (typically 1-3 hops) to discover connected entities and the edges between them.
  4. Return the subgraph — the seed entities, their neighbors, and all connecting relationships form the context window for the LLM.

```
User Query
    │
    ▼
┌───────────────┐
│  Embed Query  │
└───────┬───────┘
        │ vector
        ▼
┌───────────────┐     ┌──────────────────┐
│ Vector Index  │────▶│  Seed Entity IDs │
│   (top-K)     │     │  (entry points)  │
└───────────────┘     └────────┬─────────┘
                               │
                               ▼
                      ┌──────────────────┐
                      │  Graph Traversal │
                      │   (N hops out)   │
                      └────────┬─────────┘
                               │
                               ▼
                      ┌──────────────────┐
                      │     Subgraph     │
                      │   (entities +    │
                      │  relationships)  │
                      └────────┬─────────┘
                               │
                               ▼
                      ┌──────────────────┐
                      │    LLM Prompt    │
                      │   (structured    │
                      │     context)     │
                      └──────────────────┘
```

Each stage does what it's best at. Vector search handles the fuzzy-to-precise mapping. The graph handles structured relationship traversal. The LLM handles reasoning over the assembled context.
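The first two stages (embed the query, rank entities by similarity) can be sketched in a few lines of Python. Everything here is illustrative: the character-frequency `embed` is a toy stand-in for a real embedding model, and a production system would use a proper vector index (FAISS, pgvector, and the like) rather than a linear scan.

```python
def embed(text: str) -> list[float]:
    # Toy stand-in for an embedding model: a 26-dim character-frequency
    # vector. Only useful for demonstrating the pipeline shape.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_seeds(query: str, entity_texts: dict[str, str], k: int = 3) -> list[str]:
    # Steps 1-2: embed the query, rank entities by similarity, and
    # return the top-k entity IDs as graph entry points.
    query_vec = embed(query)
    ranked = sorted(
        entity_texts,
        key=lambda eid: cosine(query_vec, embed(entity_texts[eid])),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical entity descriptions indexed alongside the graph.
entities = {
    "alice": "Alice — VP of Engineering",
    "bob": "Bob — Tech Lead, Payments Team",
    "auth": "AuthService — login and OAuth provider",
}
seeds = vector_seeds("payments team lead", entities, k=1)
```

Even this crude similarity function surfaces `bob` as the seed for a payments-team query; steps 3-4 then take over from those entity IDs.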

Why Not Just Use Vector Search Alone?

Vector search over a chunked document corpus (vanilla RAG) retrieves passages. Those passages might mention related entities, but there's no guarantee that the retrieval will surface the connections between them.

Consider a knowledge base about a software company:

  • Chunk A mentions "Alice is the VP of Engineering"
  • Chunk B mentions "Bob reports to Alice"
  • Chunk C mentions "Bob leads the payments team"

A vector search for "who works on payments?" might retrieve Chunk C, giving you Bob. But it won't tell you that Bob reports to Alice, or that Alice is VP of Engineering — unless those facts happen to appear in the same chunk.

A knowledge graph with Bob --[reports_to]--> Alice --[holds_role]--> VP of Engineering and Bob --[leads]--> Payments Team captures these relationships explicitly. Vector search finds Bob as the seed entity, then a 2-hop graph walk surfaces the full reporting chain and team structure.

Graph traversal retrieves structure, not just similarity.

Why Not Just Use Graph Queries Alone?

Pure graph queries (Cypher, SPARQL, Gremlin) are precise but brittle. They require you to know:

  • The exact entity names or IDs to start from
  • The relationship types to traverse
  • The schema of the graph

Users don't think in graph query syntax. They ask "what's the connection between the authentication system and the billing module?" — a question that requires mapping natural language concepts to specific graph nodes before any traversal can begin.

Vector embeddings solve this by creating a semantic bridge between unstructured language and structured graph nodes. The embedding of "authentication system" will be close to entities named "AuthService", "Login Module", "OAuth Provider", etc. — even without exact lexical overlap.

Keyword Search as a Complement

In practice, the best implementations use a two-pronged seed discovery approach:

  1. Semantic search (vector similarity) — catches conceptual matches
  2. Keyword/full-text search — catches exact lexical matches that embeddings might rank lower

The two result sets are merged and deduplicated before feeding into graph traversal. This hedge matters because embeddings occasionally miss obvious exact matches, and keyword search occasionally misses semantic equivalents. Together they provide robust seed discovery.
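One simple merge policy is to interleave the two ranked lists best-first and deduplicate. The sketch below is illustrative (reciprocal rank fusion is a common, more principled alternative):

```python
from itertools import zip_longest

def merge_seed_candidates(semantic_hits: list[str],
                          keyword_hits: list[str],
                          limit: int = 10) -> list[str]:
    """Interleave two ranked lists of entity IDs (best-first),
    dropping duplicates and truncating to `limit` seeds."""
    merged, seen = [], set()
    for pair in zip_longest(semantic_hits, keyword_hits):
        for entity_id in pair:
            if entity_id is not None and entity_id not in seen:
                seen.add(entity_id)
                merged.append(entity_id)
    return merged[:limit]
```

An entity found by both methods appears once, at its best rank; entities unique to either method still make the cut.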

The Graph Walk: BFS vs. Recursive Queries

Once you have seed entity IDs, the graph traversal can be implemented in several ways:

Iterative BFS

The straightforward approach: query for neighbors of seeds, then neighbors of neighbors, up to N hops. Each hop is a separate database query.

Hop 0: seeds = {A, B}
Hop 1: query edges FROM {A, B} → discover {C, D, E}
Hop 2: query edges FROM {C, D, E} → discover {F, G}
Result: {A, B, C, D, E, F, G} + all connecting edges

This works but requires multiple round trips and manual cycle detection.
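A minimal version of this loop, with the visited set providing the cycle detection, might look like the following. `fetch_edges` is a hypothetical stand-in for one database round trip returning every `(from_id, rel_type, to_id)` edge touching the given IDs:

```python
def bfs_subgraph(fetch_edges, seeds, max_hops=2):
    # Iterative BFS: one fetch_edges call (one DB round trip) per hop.
    visited = set(seeds)
    frontier = set(seeds)
    edges = set()
    for _ in range(max_hops):
        if not frontier:
            break
        hop_edges = fetch_edges(frontier)        # one round trip
        edges.update(hop_edges)
        neighbors = {n for (f, _, t) in hop_edges for n in (f, t)}
        frontier = neighbors - visited           # manual cycle detection
        visited |= neighbors
    return visited, edges

# In-memory stand-in for the edge store, including a cycle.
EDGES = [
    ("A", "rel", "B"), ("B", "rel", "C"), ("C", "rel", "D"),
    ("C", "rel", "A"),  # cycle back to a seed
]

def fetch_edges(ids):
    return [e for e in EDGES if e[0] in ids or e[2] in ids]

nodes, subgraph_edges = bfs_subgraph(fetch_edges, {"A"}, max_hops=2)
```

The `frontier = neighbors - visited` line is the cycle guard: already-seen entities never re-enter the frontier, so the cycle back to `A` can't loop.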

Recursive CTEs

SQL databases that support WITH RECURSIVE can handle multi-hop traversal in a single query:

```sql
WITH RECURSIVE traversal(entity_id, depth) AS (
    -- Base case: seed entities
    SELECT entity_id, 0
    FROM entities
    WHERE entity_id IN ('seed_1', 'seed_2')

    UNION

    -- Recursive step: walk edges in either direction
    SELECT
        CASE WHEN r.from_id = t.entity_id
             THEN r.to_id ELSE r.from_id END,
        t.depth + 1
    FROM traversal t
    JOIN relationships r
        ON r.from_id = t.entity_id
        OR r.to_id = t.entity_id
    WHERE t.depth < 3  -- max hops
)
SELECT entity_id, MIN(depth) AS depth
FROM traversal
GROUP BY entity_id;
```

The recursive CTE approach has several advantages:

  • Single query — no application-level loop, no multiple round trips
  • Native cycle prevention — UNION (not UNION ALL) deduplicates rows automatically
  • Query planner optimization — the database can optimize the join strategy across all hops
  • No bind parameter scaling — seed IDs can be inlined as literals; the frontier expansion happens inside the database engine
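As a sanity check, the CTE above runs essentially unchanged (apart from toy data and a 2-hop cap) against an in-memory SQLite database. The entity and relationship names here are illustrative:

```python
import sqlite3

# Build a toy graph: bob -> alice, bob -> payments, payments -> auth.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (entity_id TEXT PRIMARY KEY);
    CREATE TABLE relationships (from_id TEXT, to_id TEXT, rel TEXT);
    INSERT INTO entities VALUES ('alice'), ('bob'), ('payments'), ('auth');
    INSERT INTO relationships VALUES
        ('bob', 'alice', 'reports_to'),
        ('bob', 'payments', 'leads'),
        ('payments', 'auth', 'depends_on');
""")

rows = conn.execute("""
    WITH RECURSIVE traversal(entity_id, depth) AS (
        SELECT entity_id, 0 FROM entities
        WHERE entity_id IN ('bob')               -- seed entities
        UNION
        SELECT CASE WHEN r.from_id = t.entity_id
                    THEN r.to_id ELSE r.from_id END,
               t.depth + 1
        FROM traversal t
        JOIN relationships r
          ON r.from_id = t.entity_id OR r.to_id = t.entity_id
        WHERE t.depth < 2                        -- max hops
    )
    SELECT entity_id, MIN(depth) AS depth
    FROM traversal
    GROUP BY entity_id
    ORDER BY depth, entity_id
""").fetchall()
```

Seeding from `bob` surfaces `alice` and `payments` at hop 1 and `auth` at hop 2, each tagged with its shortest distance from the seed.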

Native Graph Databases

If you're using Neo4j, Neptune, or similar, pattern matching queries like Cypher's variable-length paths handle this natively:

```cypher
MATCH path = (seed)-[*1..3]-(neighbor)
WHERE seed.id IN $seedIds
RETURN neighbor, min(length(path)) AS depth
```

The choice depends on your infrastructure. If you're already running a relational database such as SQLite or PostgreSQL, recursive CTEs give you graph-style traversal without adding a new database to your stack.

Formatting the Subgraph for LLMs

The traversal produces a subgraph: entities with their properties, and relationships connecting them. This needs to be formatted into something an LLM can reason over effectively.

A markdown representation works well in practice:

```markdown
### Entities
- Alice (Person) — role: VP of Engineering
- Bob (Person) — role: Tech Lead
- Payments Team (Team) — size: 12

### Relationships
- Bob --[reports_to]--> Alice
- Bob --[leads]--> Payments Team
- Alice --[oversees]--> Payments Team

### Connected Entities
- AuthService (Service) — status: active
- Payments Team --[depends_on]--> AuthService
```

This gives the LLM structured context that it can use for multi-hop reasoning — answering questions like "who oversees the team that depends on AuthService?" in a single inference step.
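A formatter that produces this shape is straightforward. The data model below (entity ID mapped to a type plus a property dict, and relationship triples) is one possible convention, not a standard:

```python
def format_subgraph(entities: dict, relationships: list) -> str:
    """Render a retrieved subgraph as markdown for the LLM prompt.
    entities:      {entity_id: (entity_type, {prop: value})}
    relationships: [(from_id, rel_type, to_id)]"""
    lines = ["### Entities"]
    for name, (etype, props) in entities.items():
        prop_str = ", ".join(f"{k}: {v}" for k, v in props.items())
        suffix = f" — {prop_str}" if prop_str else ""
        lines.append(f"- {name} ({etype}){suffix}")
    lines += ["", "### Relationships"]
    for src, rel, dst in relationships:
        lines.append(f"- {src} --[{rel}]--> {dst}")
    return "\n".join(lines)
```

Keeping the rendering deterministic (stable ordering, one line per fact) also makes the assembled context easy to diff and cache.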

The Ecosystem

This pattern has converged across the industry:

  • Microsoft's GraphRAG uses vector embeddings of community summaries and entities to seed graph context assembly
  • Neo4j's GenAI integrations pair vector indexes on node properties with Cypher traversal from matched nodes
  • LlamaIndex's KnowledgeGraphRAGRetriever retrieves entities via embeddings, then expands through graph neighbors
  • Amazon Neptune + Bedrock combines vector similarity on node embeddings with Gremlin/SPARQL traversal

The consistency across these implementations isn't coincidental — it reflects the fact that vector search and graph traversal solve complementary problems, and their composition is strictly more powerful than either alone.

Key Design Decisions

If you're implementing this pattern, a few decisions matter:

How many seed entities? Top-10 from each search method (semantic + keyword) is a reasonable starting point. Too few and you miss relevant subgraphs. Too many and you pull in noise.

How many hops? 1-2 hops covers most practical use cases. 3 hops is useful for discovering indirect connections but can explode combinatorially on dense graphs. Cap the frontier size per hop to prevent blowup.

What to embed? Entity names alone are a minimum. Concatenating name + type + key_properties into the embedding input produces better semantic matches.

Direction filtering? Supporting outgoing-only, incoming-only, and bidirectional traversal lets callers control the shape of the subgraph. "What does X depend on?" (outgoing) is a different question than "What depends on X?" (incoming).

Relationship type filtering? Allowing callers to specify relationship types (e.g., only reports_to edges) prevents irrelevant connections from cluttering the context.
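Both filters are easy to apply at the edge-expansion step. This sketch assumes the same hypothetical `(from_id, rel_type, to_id)` edge tuples used earlier:

```python
def expand_hop(edges, frontier, direction="both", rel_types=None):
    """One hop of expansion with direction and relationship-type filters.
    direction: 'outgoing' follows edges leaving frontier nodes,
    'incoming' follows edges arriving at them, 'both' follows either."""
    matched = []
    for src, rel, dst in edges:
        if rel_types is not None and rel not in rel_types:
            continue  # relationship-type filter
        follows_out = direction in ("outgoing", "both") and src in frontier
        follows_in = direction in ("incoming", "both") and dst in frontier
        if follows_out or follows_in:
            matched.append((src, rel, dst))
    return matched

# Illustrative edges for a small service graph.
EDGES = [
    ("app", "depends_on", "auth"),
    ("billing", "depends_on", "app"),
    ("app", "owned_by", "platform_team"),
]
```

"What does app depend on?" is `expand_hop(EDGES, {"app"}, "outgoing", {"depends_on"})`; "what depends on app?" flips the direction to `"incoming"`.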

Conclusion

Vector-grounded graph retrieval is a simple pattern with outsized impact. It solves the entry point problem that makes knowledge graphs hard to query with natural language, while preserving the structural reasoning that makes knowledge graphs valuable in the first place.

The key insight is composability: vector search and graph traversal aren't competing approaches to retrieval — they're complementary stages in a pipeline. Vector search finds what's relevant. Graph traversal finds what's connected. Together, they give LLMs the structured context they need for multi-hop reasoning over complex domains.
