Retrieving Vectors from Vector Store Files
Ragwalla supports retrieving the actual vector embeddings for files in your vector stores. This feature gives you direct access to the processed embeddings, allowing you to examine how your documents were chunked and embedded, or export vectors for use in other systems.
Overview
When you upload files to a vector store, Ragwalla processes them by:
1. Extracting text from your documents
2. Chunking the text into smaller segments
3. Generating embeddings for each chunk using your chosen embedding model
4. Storing the vectors in Cloudflare Vectorize
The vector retrieval API lets you access these stored embeddings along with their metadata, giving you full visibility into how your content was processed.
API Endpoint
List Vectors for a File
GET /v1/vector_stores/{vectorStoreId}/files/{fileId}/vectors
Parameters:
- {vectorStoreId}: The ID of your vector store
- {fileId}: The ID of the file whose vectors you want to retrieve
Get a Single Vector
GET /v1/vector_stores/{vectorStoreId}/vectors/{vectorId}
Parameters:
- {vectorStoreId}: The ID of your vector store
- {vectorId}: The specific vector/chunk ID you want to retrieve
Request Examples
Basic Request
curl -X GET "https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/files/file_xyz789/vectors" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Site-Name: your-site-name"
Request with Parameters
curl -X GET "https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/files/file_xyz789/vectors?limit=50&include_values=false" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Site-Name: your-site-name"
Single Vector Request
curl -X GET "https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/vectors/chunk_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Site-Name: your-site-name"
Query Parameters
Parameter | Type | Default | Description |
---|---|---|---|
include_values | boolean | true | Whether to include the actual vector values in the response |
limit | integer | 100 | Number of vectors to return (max: 1000) |
cursor | string | - | Pagination cursor from previous response |
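The query parameters above can be assembled programmatically rather than by string concatenation. The following is a minimal sketch, not part of the Ragwalla API: `buildVectorsUrl` and its option names (`includeValues`, `limit`, `cursor`) are hypothetical, and the lower bound of 1 on `limit` is an assumption (the table only states the maximum of 1000).

```javascript
// Hypothetical helper: builds the list-vectors URL from the documented query parameters.
function buildVectorsUrl(baseUrl, vectorStoreId, fileId, options = {}) {
  const { includeValues, limit, cursor } = options;

  // The documented maximum is 1000; the minimum of 1 is an assumption.
  if (limit !== undefined && (!Number.isInteger(limit) || limit < 1 || limit > 1000)) {
    throw new RangeError('limit must be an integer between 1 and 1000');
  }

  const url = new URL(
    `/v1/vector_stores/${encodeURIComponent(vectorStoreId)}/files/${encodeURIComponent(fileId)}/vectors`,
    baseUrl
  );

  // Only send parameters the caller set, so the server defaults apply otherwise.
  if (includeValues !== undefined) url.searchParams.set('include_values', String(includeValues));
  if (limit !== undefined) url.searchParams.set('limit', String(limit));
  if (cursor) url.searchParams.set('cursor', cursor);

  return url.toString();
}
```

For example, `buildVectorsUrl('https://your-instance.ragwalla.com', 'vs_abc123', 'file_xyz789', { limit: 50, includeValues: false })` produces the same path as the "Request with Parameters" curl example, with the two query parameters attached.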
Response Examples
Successful Response
When the file has been fully processed and vectors are available:
{
  "object": "list",
  "data": [
    {
      "object": "vector",
      "id": "chunk_abc123",
      "values": [0.1234, -0.5678, 0.9012, ...],
      "metadata": {
        "resource_type": "file",
        "fileId": "file_xyz789",
        "chunkId": "chunk_abc123",
        "chunk_number": 0,
        "chunk_path": "content.text",
        "chunk_start": 0,
        "chunk_end": 512,
        "file_id": "file_xyz789",
        "filename": "document.pdf",
        "page": 1
      },
      "created_at": 1754400370
    },
    {
      "object": "vector",
      "id": "chunk_def456",
      "values": [0.2345, -0.6789, 0.0123, ...],
      "metadata": {
        "resource_type": "file",
        "fileId": "file_xyz789",
        "chunkId": "chunk_def456",
        "chunk_number": 1,
        "chunk_path": "content.text",
        "chunk_start": 512,
        "chunk_end": 1024,
        "file_id": "file_xyz789",
        "filename": "document.pdf",
        "page": 2
      },
      "created_at": 1754400371
    }
  ],
  "has_more": true,
  "next_cursor": "2"
}
File Still Processing
If the file is still being processed, you'll get a 202 status:
{
  "object": "list",
  "data": [],
  "has_more": false,
  "status": "in_progress",
  "message": "Vectors are still being generated. Please try again later.",
  "file_status": "in_progress",
  "retry_after": 30
}
Processing Failed
If file processing failed, you'll get a 400 status:
{
  "error": "File processing failed",
  "file_id": "file_xyz789",
  "status": "failed",
  "error_message": "PDF extraction failed: Invalid PDF format"
}
Without Vector Values
When include_values=false:
{
  "object": "list",
  "data": [
    {
      "object": "vector",
      "id": "chunk_abc123",
      "metadata": {
        "resource_type": "file",
        "fileId": "file_xyz789",
        "chunkId": "chunk_abc123",
        "chunk_number": 0,
        "chunk_path": "content.text",
        "chunk_start": 0,
        "chunk_end": 512,
        "file_id": "file_xyz789",
        "filename": "document.pdf"
      },
      "created_at": 1754400370
    }
  ],
  "has_more": false
}
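The response shapes shown in this section can be folded into one result object so that calling code branches in a single place. This is a sketch under stated assumptions: `classifyVectorsResponse` and the `state` values it returns are hypothetical names, and it relies only on the status codes and fields that appear in the examples above.

```javascript
// Hypothetical helper: maps an HTTP status and parsed JSON body to one result shape.
function classifyVectorsResponse(status, body) {
  if (status === 202) {
    // File still processing: no vectors yet, the body suggests when to retry.
    return { state: 'processing', retryAfter: body.retry_after, vectors: [] };
  }
  if (status === 400 && body.status === 'failed') {
    // Processing failed: prefer the detailed error_message when present.
    return { state: 'failed', reason: body.error_message || body.error, vectors: [] };
  }
  if (status >= 200 && status < 300) {
    return {
      state: 'ready',
      vectors: body.data,
      nextCursor: body.has_more ? body.next_cursor : null
    };
  }
  // Anything else (auth failure, unknown ID, server error) is a generic error.
  return { state: 'error', reason: body.error || `HTTP ${status}`, vectors: [] };
}
```

A caller would pass `response.status` and the result of `await response.json()`, then switch on `state`.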
Understanding Vector Metadata
Each vector includes detailed metadata about the chunk it represents:
Field | Description |
---|---|
chunk_number | Sequential number of this chunk within the file |
chunk_path | Path to the content (usually "content.text") |
chunk_start | Character position where this chunk starts |
chunk_end | Character position where this chunk ends |
filename | Original filename |
page | Page number (for PDFs and similar documents) |
fileId | The original file ID |
chunkId | Unique identifier for this chunk |
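The chunk_start and chunk_end fields make it possible to check how neighbouring chunks relate to each other. The sketch below uses a hypothetical helper, `describeChunkBoundaries`, and assumes that chunk_end is exclusive, as the example response suggests (chunk 0 ends at 512 and chunk 1 starts at 512). It reports any gap or overlap between consecutive chunks.

```javascript
// Hypothetical helper: reports gaps and overlaps between consecutive chunks.
// Assumes chunk_end is exclusive, so contiguous chunks satisfy next.chunk_start === prev.chunk_end.
function describeChunkBoundaries(vectors) {
  // map() copies the array, so sorting does not reorder the caller's data.
  const chunks = vectors
    .map(v => v.metadata)
    .sort((a, b) => a.chunk_number - b.chunk_number);

  const report = [];
  for (let i = 1; i < chunks.length; i++) {
    const prev = chunks[i - 1];
    const curr = chunks[i];
    const delta = curr.chunk_start - prev.chunk_end;
    if (delta > 0) {
      report.push({ between: [prev.chunk_number, curr.chunk_number], kind: 'gap', chars: delta });
    } else if (delta < 0) {
      report.push({ between: [prev.chunk_number, curr.chunk_number], kind: 'overlap', chars: -delta });
    }
  }
  return report;
}
```

An empty report means every chunk starts exactly where the previous one ends. Overlaps are not necessarily a problem, since some chunking configurations overlap adjacent chunks on purpose.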
Pagination
When working with large files that have many chunks, use pagination:
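// Collect every vector for a file by following next_cursor until it is absent
let cursor = null;
const allVectors = [];
do {
  const params = new URLSearchParams({
    limit: '100',
    ...(cursor && { cursor })
  });
  const response = await fetch(
    `https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/files/file_xyz789/vectors?${params}`,
    {
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'X-Site-Name': 'your-site-name'
      }
    }
  );
  // Stop on errors (for example 400 when processing failed) instead of reading data.data
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  allVectors.push(...data.data);
  // The last page has no next_cursor, which ends the loop
  cursor = data.next_cursor;
} while (cursor);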
Common Use Cases
Debugging Chunking Strategy
Examine how your documents were split into chunks:
// Get vectors without values to see chunk boundaries
const response = await fetch(
  'https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/files/file_xyz789/vectors?include_values=false',
  { /* same Authorization and X-Site-Name headers as above */ }
);
const data = await response.json();
data.data.forEach(vector => {
  console.log(`Chunk ${vector.metadata.chunk_number}: chars ${vector.metadata.chunk_start}-${vector.metadata.chunk_end}`);
});
Exporting Vectors for Analysis
Export embeddings for use in other systems:
// This fetches only the first page (default limit: 100); paginate for larger files
const response = await fetch(
  'https://your-instance.ragwalla.com/v1/vector_stores/vs_abc123/files/file_xyz789/vectors',
  { /* same Authorization and X-Site-Name headers as above */ }
);
const data = await response.json();
const embeddings = data.data.map(vector => ({
  id: vector.id,
  // Descriptive label built from metadata, not the chunk's text
  text_chunk: `${vector.metadata.filename} chunk ${vector.metadata.chunk_number}`,
  embedding: vector.values,
  metadata: vector.metadata
}));
// Now you can use embeddings in your own analysis
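Many analysis tools accept one JSON object per line (JSON Lines). As a minimal sketch, assuming that format suits the target system, the hypothetical helper `toJsonl` below serializes retrieved vectors and skips any entry without a values array, which is what you get when the request used include_values=false.

```javascript
// Hypothetical helper: serializes vectors as JSON Lines, one object per line.
function toJsonl(vectors) {
  return vectors
    // Entries fetched with include_values=false have no values array; skip them.
    .filter(v => Array.isArray(v.values))
    .map(v => JSON.stringify({ id: v.id, embedding: v.values, metadata: v.metadata }))
    .join('\n');
}
```

The returned string can be written to a file or streamed to another service; each line parses independently with `JSON.parse`.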
Quality Assurance
Check that file processing completed successfully:
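async function checkFileProcessing(vectorStoreId, fileId) {
  // limit=1 keeps the response small; only the status matters here
  const response = await fetch(
    `https://your-instance.ragwalla.com/v1/vector_stores/${vectorStoreId}/files/${fileId}/vectors?limit=1`,
    { /* same Authorization and X-Site-Name headers as above */ }
  );
  if (response.status === 202) {
    console.log('File still processing...');
    return false;
  } else if (response.status === 400) {
    const error = await response.json();
    console.log('Processing failed:', error.error_message);
    return false;
  } else if (!response.ok) {
    // Any other non-2xx status (auth, unknown ID, server error) is not a success
    console.log(`Request failed with status ${response.status}`);
    return false;
  } else {
    const data = await response.json();
    // With limit=1, data.data holds at most one vector, so report availability rather than a count
    console.log(`File processed successfully: ${data.data.length > 0 ? 'vectors are available' : 'no vectors returned'}`);
    return true;
  }
}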
Important Notes
- Vector dimensions depend on your embedding model (e.g., 1536 for text-embedding-3-small)
- Large files may have hundreds or thousands of chunks; page through them with limit (max: 1000) and cursor rather than expecting a single response
- Processing time varies based on file size and complexity
- Rate limits apply - avoid making too many concurrent requests
- Vector values are the actual embeddings generated by your chosen model
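Because every vector in a store should have the dimension of its embedding model, a quick length check can catch mixed or truncated exports. This is a sketch only: `findDimensionMismatches` is a hypothetical helper, and the expected dimension is something you supply (for example 1536 if you use text-embedding-3-small).

```javascript
// Hypothetical helper: returns the IDs of vectors whose values array is missing
// or does not have the expected number of dimensions.
function findDimensionMismatches(vectors, expectedDimensions) {
  return vectors
    .filter(v => !Array.isArray(v.values) || v.values.length !== expectedDimensions)
    .map(v => v.id);
}
```

An empty result means every retrieved vector has the expected length. Vectors fetched with include_values=false are always reported, since they carry no values to check.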
Error Handling
Always handle potential errors when retrieving vectors:
async function getVectors(vectorStoreId, fileId) {
  try {
    const response = await fetch(
      `https://your-instance.ragwalla.com/v1/vector_stores/${vectorStoreId}/files/${fileId}/vectors`,
      {
        headers: {
          'Authorization': 'Bearer YOUR_API_KEY',
          'X-Site-Name': 'your-site-name'
        }
      }
    );
    if (response.status === 202) {
      // Still processing
      const data = await response.json();
      console.log(data.message);
      return { status: 'processing', retryAfter: data.retry_after };
    } else if (!response.ok) {
      // Error occurred; prefer the detailed error_message when the response includes one
      const error = await response.json();
      throw new Error(error.error_message || error.error || 'Failed to retrieve vectors');
    }
    return await response.json();
  } catch (error) {
    console.error('Error retrieving vectors:', error.message);
    throw error;
  }
}
Next Steps
Now that you can retrieve vectors from your files, you might want to:
- Analyze the quality of your chunking strategy
- Export embeddings for use in custom similarity search
- Debug issues with document processing
- Integrate vectors with external analytics tools
For more advanced vector operations, see our Vector Search Guide and Custom Embedding Models documentation.
Need help with vector retrieval? Contact our support team for assistance with implementing vector analysis in your applications.