To export search results from LlamaIndex, you can serialize the output or convert it into common formats like JSON or CSV. When you run a query with LlamaIndex, the results are typically returned as objects containing text, metadata, and references to source data. To export these, extract the relevant fields and write them to a file or database. For example, after running a query, you can access properties like response.response (the text answer) or response.source_nodes (the source documents) and save them programmatically.
A straightforward approach is to convert the results into JSON. For instance, if your query returns a Response object, you can extract its response attribute and its source nodes. Using Python's json module, you can serialize this data into a JSON file. Here's a simplified example:
```python
import json

# Recent LlamaIndex versions query through a query engine rather than index.query()
query_engine = index.as_query_engine()
response = query_engine.query("Your query")

result_data = {
    "answer": response.response,
    "sources": [node.text for node in response.source_nodes],
}

with open("output.json", "w") as f:
    json.dump(result_data, f, indent=2)
```
This creates a structured file with the answer and its sources. For CSV exports, you could iterate over multiple results and write rows using the csv module, with columns like query, answer, and source_url.
For more advanced use cases, you might process metadata or batch-export results. If your search includes scores, timestamps, or document IDs, you can extend the export logic to include these fields. For example, when working with multiple responses, you could aggregate results into a list of dictionaries and use libraries like Pandas to create a DataFrame, then export it to CSV or Excel. Additionally, LlamaIndex's Response objects can be serialized directly using pickle or jsonpickle for later reloading, though this may require handling custom object encoders.
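The Pandas aggregation step can be sketched as follows; the `records` list is hypothetical data standing in for fields you would pull from each response:

```python
import pandas as pd

# Hypothetical batch of result dictionaries, one per query,
# including a score and document ID pulled from source nodes
records = [
    {"query": "q1", "answer": "a1", "score": 0.87, "doc_id": "doc-1"},
    {"query": "q2", "answer": "a2", "score": 0.74, "doc_id": "doc-2"},
]

df = pd.DataFrame(records)
df.to_csv("results.csv", index=False)
# df.to_excel("results.xlsx")  # Excel export needs an engine such as openpyxl
```

A DataFrame also makes it easy to filter or sort by score before exporting.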
If you need to integrate with external systems, consider formatting results for databases or APIs. For instance, you could write a script that converts LlamaIndex responses into SQL inserts or uploads them to a cloud storage service like AWS S3. LlamaIndex can also persist an entire index to disk with index.storage_context.persist(), which preserves documents, metadata, and text for future reloading. The key is to map the structure of LlamaIndex's output to your target format, ensuring consistency for downstream applications.
Zilliz Cloud is a managed vector database built on Milvus, well suited to building GenAI applications.