Tutoriel : Mise en œuvre d'un classement basé sur le temps dans MilvusCompatible with Milvus 2.6.x
Dans de nombreuses applications de recherche, la fraîcheur du contenu est tout aussi importante que sa pertinence. Les articles d'actualité, les listes de produits, les messages sur les médias sociaux et les documents de recherche bénéficient tous de systèmes de classement qui équilibrent la pertinence sémantique et la récence. Ce didacticiel montre comment mettre en œuvre un classement basé sur le temps dans Milvus à l'aide de classificateurs de décroissance.
Comprendre les classeurs de décroissance dans Milvus
Les classificateurs de décroissance vous permettent d'améliorer ou de pénaliser les documents en fonction de valeurs numériques (comme les horodatages) par rapport à un point de référence. Pour le classement basé sur le temps, cela signifie que les documents les plus récents peuvent recevoir des scores plus élevés que les plus anciens, même si leur pertinence sémantique est similaire.
Milvus prend en charge trois types de classeurs de décroissance :
Gaussien (
gauss) : Une courbe en forme de cloche offrant une décroissance douce et graduelle.Exponentiel (
exp) : Crée une chute initiale plus nette pour mettre fortement l'accent sur le contenu récent.Linéaire (
linear) : Une décroissance linéaire prévisible et facile à comprendre.
Chaque classificateur présente des caractéristiques différentes qui le rendent adapté à divers cas d'utilisation. Pour plus d'informations, reportez-vous à la section Vue d'ensemble des classificateurs de décroissance.
Construire un système de recherche tenant compte du temps
Nous allons créer un système de recherche d'articles d'actualité qui montre comment classer efficacement le contenu en fonction de la pertinence et du temps. Commençons par la mise en œuvre :
import datetime
import matplotlib.pyplot as plt
import numpy as np
from pymilvus import (
MilvusClient,
DataType,
Function,
FunctionType,
AnnSearchRequest,
)
# Create connection to Milvus
milvus_client = MilvusClient("http://localhost:19530")
# Define collection name
collection_name = "news_articles_tutorial"
# Clean up any existing collection with the same name
milvus_client.drop_collection(collection_name)
Étape 1 : Conception du schéma
Pour la recherche temporelle, nous devons stocker l'horodatage de la publication avec le contenu :
# Create schema with fields for content and temporal information
schema = milvus_client.create_schema(enable_dynamic_field=False, auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("headline", DataType.VARCHAR, max_length=200, enable_analyzer=True)
schema.add_field("content", DataType.VARCHAR, max_length=2000, enable_analyzer=True)
schema.add_field("dense", DataType.FLOAT_VECTOR, dim=1024) # For dense embeddings
schema.add_field("sparse_vector", DataType.SPARSE_FLOAT_VECTOR) # For sparse (BM25) search
schema.add_field("publish_date", DataType.INT64) # Timestamp for decay ranking
Étape 2 : Mise en place des fonctions d'intégration
Nous allons configurer des fonctions d'intégration denses (sémantiques) et éparses (mots-clés) :
# Create embedding function for semantic search
text_embedding_function = Function(
name="siliconflow_embedding",
function_type=FunctionType.TEXTEMBEDDING,
input_field_names=["content"],
output_field_names=["dense"],
params={
"provider": "siliconflow",
"model_name": "BAAI/bge-large-en-v1.5",
"credential": "your-api-key"
}
)
schema.add_function(text_embedding_function)
# Create BM25 function for keyword search
bm25_function = Function(
name="bm25",
input_field_names=["content"],
output_field_names=["sparse_vector"],
function_type=FunctionType.BM25,
)
schema.add_function(bm25_function)
Pour plus de détails sur l'utilisation des fonctions d'intégration de Milvus, voir Vue d'ensemble des fonctions d'intégration.
Etape 3 : Configuration des paramètres d'index
Configurons les paramètres d'index appropriés pour la recherche vectorielle rapide :
# Set up indexes for fast search
index_params = milvus_client.prepare_index_params()
# Dense vector index
index_params.add_index(field_name="dense", index_type="AUTOINDEX", metric_type="L2")
# Sparse vector index
index_params.add_index(
field_name="sparse_vector",
index_name="sparse_inverted_index",
index_type="AUTOINDEX",
metric_type="BM25",
)
# Create the collection with our schema and indexes
milvus_client.create_collection(
collection_name,
schema=schema,
index_params=index_params,
consistency_level="Bounded"
)
Étape 4 : Préparer les données d'échantillon
Pour ce tutoriel, nous allons créer un ensemble d'articles de presse avec différentes dates de publication. Remarquez que nous avons inclus des paires d'articles au contenu presque identique mais avec des dates différentes pour démontrer clairement l'effet de classement décroissant :
# Get current time
current_time = int(datetime.datetime.now().timestamp())
current_date = datetime.datetime.fromtimestamp(current_time)
print(f"Current time: {current_date.strftime('%Y-%m-%d %H:%M:%S')}")
# Sample news articles spanning different dates
articles = [
{
"headline": "AI Breakthrough Enables Medical Diagnosis Advancement",
"content": "Researchers announced a major breakthrough in AI-based medical diagnostics, enabling faster and more accurate detection of rare diseases.",
"publish_date": int((current_date - datetime.timedelta(days=120)).timestamp()) # ~4 months ago
},
{
"headline": "Tech Giants Compete in New AI Race",
"content": "Major technology companies are investing billions in a new race to develop the most advanced artificial intelligence systems.",
"publish_date": int((current_date - datetime.timedelta(days=60)).timestamp()) # ~2 months ago
},
{
"headline": "AI Ethics Guidelines Released by International Body",
"content": "A consortium of international organizations has released new guidelines addressing ethical concerns in artificial intelligence development and deployment.",
"publish_date": int((current_date - datetime.timedelta(days=30)).timestamp()) # 1 month ago
},
{
"headline": "Latest Deep Learning Models Show Remarkable Progress",
"content": "The newest generation of deep learning models demonstrates unprecedented capabilities in language understanding and generation.",
"publish_date": int((current_date - datetime.timedelta(days=15)).timestamp()) # 15 days ago
},
# Articles with identical content but different dates
{
"headline": "AI Research Advancements Published in January",
"content": "Breakthrough research in artificial intelligence shows remarkable advancements in multiple domains.",
"publish_date": int((current_date - datetime.timedelta(days=90)).timestamp()) # ~3 months ago
},
{
"headline": "New AI Research Results Released This Week",
"content": "Breakthrough research in artificial intelligence shows remarkable advancements in multiple domains.",
"publish_date": int((current_date - datetime.timedelta(days=5)).timestamp()) # Very recent - 5 days ago
},
{
"headline": "AI Development Updates Released Yesterday",
"content": "Recent developments in artificial intelligence research are showing promising results across various applications.",
"publish_date": int((current_date - datetime.timedelta(days=1)).timestamp()) # Just yesterday
},
]
# Insert articles into the collection
milvus_client.insert(collection_name, articles)
print(f"Inserted {len(articles)} articles into the collection")
Étape 5 : Configurer différents classeurs de décroissance
Créons maintenant trois classeurs de décroissance différents, chacun avec des paramètres distincts pour mettre en évidence leurs différences :
# Use current time as reference point
print(f"Using current time as reference point")
# Create a Gaussian decay ranker
gaussian_ranker = Function(
name="time_decay_gaussian",
input_field_names=["publish_date"],
function_type=FunctionType.RERANK,
params={
"reranker": "decay",
"function": "gauss", # Gaussian/bell curve decay
"origin": current_time, # Current time as reference point
"offset": 7 * 24 * 60 * 60, # One week (full relevance)
"decay": 0.5, # Articles from two weeks ago have half relevance
"scale": 14 * 24 * 60 * 60 # Two weeks scale parameter
}
)
# Create an exponential decay ranker with different parameters
exponential_ranker = Function(
name="time_decay_exponential",
input_field_names=["publish_date"],
function_type=FunctionType.RERANK,
params={
"reranker": "decay",
"function": "exp", # Exponential decay
"origin": current_time, # Current time as reference point
"offset": 3 * 24 * 60 * 60, # Shorter offset (3 days vs 7 days)
"decay": 0.3, # Steeper decay (0.3 vs 0.5)
"scale": 10 * 24 * 60 * 60 # Different scale (10 days vs 14 days)
}
)
# Create a linear decay ranker
linear_ranker = Function(
name="time_decay_linear",
input_field_names=["publish_date"],
function_type=FunctionType.RERANK,
params={
"reranker": "decay",
"function": "linear", # Linear decay
"origin": current_time, # Current time as reference point
"offset": 7 * 24 * 60 * 60, # One week (full relevance)
"decay": 0.5, # Articles from two weeks ago have half relevance
"scale": 14 * 24 * 60 * 60 # Two weeks scale parameter
}
)
Dans le code précédent :
reranker: Réglé surdecaypour les fonctions de décroissance basées sur le tempsfunction: Le type de fonction de désintégration (gauss, exp, ou linéaire)origin: Le point de référence (généralement l'heure actuelle)offset: La période pendant laquelle les documents conservent toute leur pertinencescale: Contrôle la vitesse à laquelle la pertinence diminue au-delà du décalagedecay: Le facteur de décroissance au décalage+échelle (par exemple, 0,5 signifie la moitié de la pertinence).
Remarquez que nous avons configuré le classificateur exponentiel avec différents paramètres pour montrer comment vous pouvez adapter ces fonctions à différents comportements.
Étape 6 : Visualiser les classeurs de décroissance
Avant d'effectuer des recherches, comparons visuellement le comportement de ces classeurs de décroissance configurés différemment :
# Visualize the decay functions with different parameters
days = np.linspace(0, 90, 100)
# Gaussian: offset=7, scale=14, decay=0.5
gaussian_values = [1.0 if d <= 7 else (0.5 ** ((d - 7) / 14)) for d in days]
# Exponential: offset=3, scale=10, decay=0.3
exponential_values = [1.0 if d <= 3 else (0.3 ** ((d - 3) / 10)) for d in days]
# Linear: offset=7, scale=14, decay=0.5
linear_values = [1.0 if d <= 7 else max(0, 1.0 - ((d - 7) / 14) * 0.5) for d in days]
plt.figure(figsize=(10, 6))
plt.plot(days, gaussian_values, label='Gaussian (offset=7, scale=14, decay=0.5)')
plt.plot(days, exponential_values, label='Exponential (offset=3, scale=10, decay=0.3)')
plt.plot(days, linear_values, label='Linear (offset=7, scale=14, decay=0.5)')
plt.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5, label='Half relevance')
plt.xlabel('Days ago')
plt.ylabel('Relevance factor')
plt.title('Decay Functions Comparison')
plt.legend()
plt.grid(True)
plt.savefig('decay_functions.png')
plt.close()
# Print numerical representation
print("\n=== TIME DECAY EFFECT VISUALIZATION ===")
print("Days ago | Gaussian | Exponential | Linear")
print("-----------------------------------------")
for days in [0, 3, 7, 10, 14, 21, 30, 60, 90]:
# Calculate decay factors based on the parameters in our rankers
gaussian_decay = 1.0 if days <= 7 else (0.5 ** ((days - 7) / 14))
exponential_decay = 1.0 if days <= 3 else (0.3 ** ((days - 3) / 10))
linear_decay = 1.0 if days <= 7 else max(0, 1.0 - ((days - 7) / 14) * 0.5)
print(f"{days:2d} days | {gaussian_decay:.4f} | {exponential_decay:.4f} | {linear_decay:.4f}")
Résultat attendu :
=== TIME DECAY EFFECT VISUALIZATION ===
Days ago | Gaussian | Exponential | Linear
-----------------------------------------
0 days | 1.0000 | 1.0000 | 1.0000
3 days | 1.0000 | 1.0000 | 1.0000
7 days | 1.0000 | 0.6178 | 1.0000
10 days | 0.8620 | 0.4305 | 0.8929
14 days | 0.7071 | 0.2660 | 0.7500
21 days | 0.5000 | 0.1145 | 0.5000
30 days | 0.3202 | 0.0387 | 0.1786
60 days | 0.0725 | 0.0010 | 0.0000
90 days | 0.0164 | 0.0000 | 0.0000
Étape 7 : Fonction d'aide pour l'affichage des résultats
# Helper function to format search results with dates and scores
def print_search_results(results, title):
print(f"\n=== {title} ===")
for i, hit in enumerate(results[0]):
publish_date = datetime.datetime.fromtimestamp(hit.get('publish_date'))
days_from_now = (current_time - hit.get('publish_date')) / (24 * 60 * 60)
print(f"{i+1}. {hit.get('headline')}")
print(f" Published: {publish_date.strftime('%Y-%m-%d')} ({int(days_from_now)} days ago)")
print(f" Score: {hit.score:.4f}")
print()
Étape 8 : Comparer la recherche standard et la recherche basée sur la décroissance
Lançons maintenant une requête de recherche et comparons les résultats avec et sans classement par décroissance :
# Define our search query
query = "artificial intelligence advancements"
# 1. Search without decay ranking (purely based on semantic relevance)
standard_results = milvus_client.search(
collection_name,
data=[query],
anns_field="dense",
limit=7, # Get all our articles
output_fields=["headline", "content", "publish_date"],
consistency_level="Bounded"
)
print_search_results(standard_results, "SEARCH RESULTS WITHOUT DECAY RANKING")
# Store original scores for later comparison
original_scores = {}
for hit in standard_results[0]:
original_scores[hit.get('headline')] = hit.score
# 2. Search with each decay function
# Gaussian decay
gaussian_results = milvus_client.search(
collection_name,
data=[query],
anns_field="dense",
limit=7,
output_fields=["headline", "content", "publish_date"],
ranker=gaussian_ranker,
consistency_level="Bounded"
)
print_search_results(gaussian_results, "SEARCH RESULTS WITH GAUSSIAN DECAY RANKING")
# Exponential decay
exponential_results = milvus_client.search(
collection_name,
data=[query],
anns_field="dense",
limit=7,
output_fields=["headline", "content", "publish_date"],
ranker=exponential_ranker,
consistency_level="Bounded"
)
print_search_results(exponential_results, "SEARCH RESULTS WITH EXPONENTIAL DECAY RANKING")
# Linear decay
linear_results = milvus_client.search(
collection_name,
data=[query],
anns_field="dense",
limit=7,
output_fields=["headline", "content", "publish_date"],
ranker=linear_ranker,
consistency_level="Bounded"
)
print_search_results(linear_results, "SEARCH RESULTS WITH LINEAR DECAY RANKING")
Résultats attendus :
=== SEARCH RESULTS WITHOUT DECAY RANKING ===
1. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
2. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.4315
3. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
4. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.6671
5. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.6674
6. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.7279
7. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.7661
=== SEARCH RESULTS WITH GAUSSIAN DECAY RANKING ===
1. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.5322
2. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
3. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.1180
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0000
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
=== SEARCH RESULTS WITH EXPONENTIAL DECAY RANKING ===
1. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
2. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.3392
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.1574
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.0297
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0007
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
=== SEARCH RESULTS WITH LINEAR DECAY RANKING ===
1. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.4767
2. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
3. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.3831
4. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
5. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.3640
6. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.3335
7. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.2158
Étape 9 : Comprendre le calcul du score
Voyons comment les scores finaux sont calculés en combinant la pertinence d'origine et les facteurs de décroissance :
# Add a detailed breakdown for the first 3 results from Gaussian decay
print("\n=== SCORE CALCULATION BREAKDOWN (GAUSSIAN DECAY) ===")
for item in gaussian_results[0][:3]:
headline = item.get('headline')
publish_date = datetime.datetime.fromtimestamp(item.get('publish_date'))
days_ago = (current_time - item.get('publish_date')) / (24 * 60 * 60)
# Get the original score
original_score = original_scores.get(headline, 0)
# Calculate decay factor
decay_factor = 1.0 if days_ago <= 7 else (0.5 ** ((days_ago - 7) / 14))
# Show breakdown
print(f"Item: {headline}")
print(f" Published: {publish_date.strftime('%Y-%m-%d')} ({int(days_ago)} days ago)")
print(f" Original relevance score: {original_score:.4f}")
print(f" Decay factor (Gaussian): {decay_factor:.4f}")
print(f" Expected final score = Original × Decay: {original_score * decay_factor:.4f}")
print(f" Actual final score: {item.score:.4f}")
print()
Résultat attendu :
=== SCORE CALCULATION BREAKDOWN (GAUSSIAN DECAY) ===
Item: Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Original relevance score: 0.6674
Decay factor (Gaussian): 0.6730
Expected final score = Original × Decay: 0.4491
Actual final score: 0.5322
Item: New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Original relevance score: 0.4316
Decay factor (Gaussian): 1.0000
Expected final score = Original × Decay: 0.4316
Actual final score: 0.4316
Item: AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Original relevance score: 0.3670
Decay factor (Gaussian): 1.0000
Expected final score = Original × Decay: 0.3670
Actual final score: 0.3670
Étape 10 : Recherche hybride avec décroissance temporelle
Pour des scénarios plus complexes, nous pouvons combiner des vecteurs denses (sémantiques) et épars (mots-clés) à l'aide de la recherche hybride :
# Set up hybrid search (combining dense and sparse vectors)
dense_search = AnnSearchRequest(
data=[query],
anns_field="dense", # Search dense vectors
param={},
limit=7
)
sparse_search = AnnSearchRequest(
data=[query],
anns_field="sparse_vector", # Search sparse vectors (BM25)
param={},
limit=7
)
# Execute hybrid search with each decay function
# Gaussian decay
hybrid_gaussian_results = milvus_client.hybrid_search(
collection_name,
[dense_search, sparse_search],
ranker=gaussian_ranker,
limit=7,
output_fields=["headline", "content", "publish_date"]
)
print_search_results(hybrid_gaussian_results, "HYBRID SEARCH RESULTS WITH GAUSSIAN DECAY RANKING")
# Exponential decay
hybrid_exponential_results = milvus_client.hybrid_search(
collection_name,
[dense_search, sparse_search],
ranker=exponential_ranker,
limit=7,
output_fields=["headline", "content", "publish_date"]
)
print_search_results(hybrid_exponential_results, "HYBRID SEARCH RESULTS WITH EXPONENTIAL DECAY RANKING")
Résultat attendu :
=== HYBRID SEARCH RESULTS WITH GAUSSIAN DECAY RANKING ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 2.1467
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.7926
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.5322
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.1180
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0000
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
=== HYBRID SEARCH RESULTS WITH EXPONENTIAL DECAY RANKING ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 1.6873
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.7926
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.1574
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.0297
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0007
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0001
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
Étape 11 : Expérimentation avec différentes valeurs de paramètres
Voyons comment l'ajustement du paramètre d'échelle affecte la fonction de décroissance gaussienne :
# Create variations of the Gaussian decay function with different scale parameters
print("\n=== PARAMETER VARIATION EXPERIMENT: SCALE ===")
for scale_days in [7, 14, 30]:
scaled_ranker = Function(
name=f"time_decay_gaussian_{scale_days}",
input_field_names=["publish_date"],
function_type=FunctionType.RERANK,
params={
"reranker": "decay",
"function": "gauss",
"origin": current_time,
"offset": 7 * 24 * 60 * 60, # Fixed offset of 7 days
"decay": 0.5, # Fixed decay of 0.5
"scale": scale_days * 24 * 60 * 60 # Variable scale
}
)
# Get results
scale_results = milvus_client.search(
collection_name,
data=[query],
anns_field="dense",
limit=7,
output_fields=["headline", "content", "publish_date"],
ranker=scaled_ranker,
consistency_level="Bounded"
)
print_search_results(scale_results, f"SEARCH WITH GAUSSIAN DECAY (SCALE = {scale_days} DAYS)")
Résultat attendu :
=== PARAMETER VARIATION EXPERIMENT: SCALE ===
=== SEARCH WITH GAUSSIAN DECAY (SCALE = 7 DAYS) ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.2699
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.0004
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0000
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
=== SEARCH WITH GAUSSIAN DECAY (SCALE = 14 DAYS) ===
1. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.5322
2. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
3. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
4. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.1180
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0000
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
=== SEARCH WITH GAUSSIAN DECAY (SCALE = 30 DAYS) ===
1. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.6353
2. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.5097
3. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.4316
4. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.3670
5. Tech Giants Compete in New AI Race
Published: 2025-03-16 (60 days ago)
Score: 0.0767
6. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0021
7. AI Breakthrough Enables Medical Diagnosis Advancement
Published: 2025-01-15 (120 days ago)
Score: 0.0000
Étape 12 : Test avec différentes requêtes
Voyons comment le classement de décroissance se comporte avec différentes requêtes de recherche :
# Try different queries with Gaussian decay
for test_query in ["machine learning", "neural networks", "ethics in AI"]:
print(f"\n=== TESTING QUERY: '{test_query}' WITH GAUSSIAN DECAY ===")
test_results = milvus_client.search(
collection_name,
data=[test_query],
anns_field="dense",
limit=4,
output_fields=["headline", "content", "publish_date"],
ranker=gaussian_ranker,
consistency_level="Bounded"
)
print_search_results(test_results, f"TOP 4 RESULTS FOR '{test_query}'")
Résultat attendu :
=== TESTING QUERY: 'machine learning' WITH GAUSSIAN DECAY ===
=== TOP 4 RESULTS FOR 'machine learning' ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.8208
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.7287
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.6633
4. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
=== TESTING QUERY: 'neural networks' WITH GAUSSIAN DECAY ===
=== TOP 4 RESULTS FOR 'neural networks' ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.8509
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.7574
3. Latest Deep Learning Models Show Remarkable Progress
Published: 2025-04-30 (15 days ago)
Score: 0.6364
4. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
=== TESTING QUERY: 'ethics in AI' WITH GAUSSIAN DECAY ===
=== TOP 4 RESULTS FOR 'ethics in AI' ===
1. New AI Research Results Released This Week
Published: 2025-05-10 (5 days ago)
Score: 0.7977
2. AI Development Updates Released Yesterday
Published: 2025-05-14 (1 days ago)
Score: 0.7322
3. AI Ethics Guidelines Released by International Body
Published: 2025-04-15 (30 days ago)
Score: 0.0814
4. AI Research Advancements Published in January
Published: 2025-02-14 (90 days ago)
Score: 0.0000
Conclusion
Le classement basé sur le temps à l'aide des fonctions de décroissance dans Milvus constitue un moyen puissant d'équilibrer la pertinence sémantique et la récence. En configurant la fonction de décroissance et les paramètres appropriés, vous pouvez créer des expériences de recherche qui mettent en évidence le contenu récent tout en respectant la pertinence sémantique.
Cette approche est particulièrement utile pour
les plateformes d'information et de médias
les listes de produits de commerce électronique
les flux de contenu des médias sociaux
les bases de connaissances et les systèmes de documentation
les référentiels de documents de recherche.
En comprenant les mathématiques qui sous-tendent les fonctions de décroissance et en expérimentant différents paramètres, vous pouvez affiner votre système de recherche afin d'obtenir l'équilibre optimal entre pertinence et fraîcheur pour votre cas d'utilisation spécifique.