
From Elasticsearch

This guide provides a step-by-step process for migrating data from Elasticsearch to Milvus 2.x. By following it, you can transfer your data efficiently and take advantage of the advanced features and improved performance of Milvus 2.x.

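Prerequisites

Software versions:
- Source Elasticsearch: 7.x or 8.x
- Target Milvus: 2.x
- For installation details, refer to Installing Elasticsearch and Install Milvus.
Required tools:
- Milvus-migration tool. For installation details, refer to Install Migration Tool.
Supported data types for migration: The fields to migrate from the source Elasticsearch index must be of the following types: dense_vector, keyword, text, long, integer, double, float, boolean, object. Data types not listed here are currently not supported for migration. For details on data mapping between Milvus collections and Elasticsearch indexes, refer to Field mapping reference.
Elasticsearch index requirements:
- The source Elasticsearch index must contain a vector field of the dense_vector type. Migration cannot start without a vector field.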
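The prerequisites above can be checked before starting a migration. The following is a minimal sketch, not part of the migration tool: the function name `check_index_mapping` and the input shape (the `properties` dict from an index mapping) are assumptions made for illustration. It also assumes that a field without an explicit `type` is an object field.

```python
# Field types this guide lists as supported for migration.
SUPPORTED_TYPES = {
    "dense_vector", "keyword", "text", "long", "integer",
    "double", "float", "boolean", "object",
}


def check_index_mapping(properties):
    """Return a list of problems found in an Elasticsearch `properties` mapping.

    `properties` is assumed to be the dict under mappings.properties,
    e.g. {"my_vector": {"type": "dense_vector", "dims": 128}}.
    """
    problems = []
    has_vector = False
    for name, spec in properties.items():
        # Assumption: a field with no explicit "type" is treated as an object field.
        field_type = spec.get("type", "object")
        if field_type == "dense_vector":
            has_vector = True
        if field_type not in SUPPORTED_TYPES:
            problems.append(f"{name}: unsupported type '{field_type}'")
    if not has_vector:
        problems.append("index has no dense_vector field")
    return problems


# Hypothetical mapping used only to demonstrate the check.
mapping = {
    "my_vector": {"type": "dense_vector", "dims": 128},
    "id": {"type": "long"},
    "created": {"type": "date"},
}
print(check_index_mapping(mapping))
# → ["created: unsupported type 'date'"]
```

An empty list means the mapping satisfies both conditions stated in this section.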

Configure the migration file

Save the example migration config file as migration.yaml and modify the configs based on your actual conditions. You are free to put the config file in any local directory.

dumper: # configs for the migration job.
  worker:
    workMode: "elasticsearch" # operational mode of the migration job.
    reader:
      bufferSize: 2500 # buffer size to read from Elasticsearch in each batch. A value ranging from 2000 to 4000 is recommended.
meta: # meta configs for the source Elasticsearch index and target Milvus 2.x collection.
  mode: "config" # specifies the source for meta configs. currently, only `config` is supported.
  version: "8.9.1"
  index: "qatest_index" # identifies the Elasticsearch index to migrate data from.
  fields: # fields within the Elasticsearch index to be migrated.
  - name: "my_vector" # name of the Elasticsearch field.
    type: "dense_vector" # data type of the Elasticsearch field.
    dims: 128 # dimension of the vector field. required only when `type` is `dense_vector`.
  - name: "id"
    pk: true # specifies if the field serves as a primary key.
    type: "long"
  - name: "num"
    type: "integer"
  - name: "double1"
    type: "double"
  - name: "text1"
    maxLen: 1000 # max. length of data fields. required only for `keyword` and `text` data types.
    type: "text"
  - name: "bl1"
    type: "boolean"
  - name: "float1"
    type: "float"
  milvus: # configs specific to creating the collection in Milvus 2.x
    collection: "Collection_01" # name of the Milvus collection. defaults to the Elasticsearch index name if not specified.
    closeDynamicField: false # specifies whether to disable the dynamic field in the collection. defaults to `false`.
    shardNum: 2 # number of shards to be created in the collection.
    consistencyLevel: Strong # consistency level for Milvus collection.
source: # connection configs for the source Elasticsearch server
  es:
    urls:
    - "http://10.15.1.***:9200" # address of the source Elasticsearch server.
    username: "" # username for the Elasticsearch server.
    password: "" # password for the Elasticsearch server.
target:
  mode: "remote" # storage location for dumped files. valid values: `remote` and `local`.
  remote: # configs for remote storage
    outputDir: "migration/milvus/test" # output directory path in the cloud storage bucket.
    cloud: "aws" # cloud storage service provider. Examples: `aws`, `gcp`, `azure`, etc.
    region: "us-west-2" # region of the cloud storage; can be any value if using local Minio.
    bucket: "zilliz-aws-us-****-*-********" # bucket name for storing data; must align with configs in milvus.yaml for Milvus 2.x.
    useIAM: true # whether to use an IAM Role for connection.
    checkBucket: false # checks if the specified bucket exists in the storage.
  milvus2x: # connection configs for the target Milvus 2.x server
    endpoint: "http://10.102.*.**:19530" # address of the target Milvus server.
    username: "****" # username for the Milvus 2.x server.
    password: "******" # password for the Milvus 2.x server.

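The following table describes the parameters in the example config file. For a full list of configs, refer to Milvus Migration: Elasticsearch to Milvus 2.x.

dumper

Parameter	Description
`dumper.worker.workMode`	The operational mode of the migration job. Set to `elasticsearch` when migrating from Elasticsearch indexes.
`dumper.worker.reader.bufferSize`	Buffer size to read from Elasticsearch in each batch. Unit: KB. A value from 2000 to 4000 is recommended.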

meta

Parameter	Description
`meta.mode`	Specifies the source for meta configs. Currently, only `config` is supported.
`meta.index`	Identifies the Elasticsearch index to migrate data from.
`meta.fields`	Fields within the Elasticsearch index to be migrated.
`meta.fields.name`	Name of the Elasticsearch field.
`meta.fields.maxLen`	Maximum length of the field. Required only when `meta.fields.type` is `keyword` or `text`.
`meta.fields.pk`	Specifies whether the field serves as the primary key.
`meta.fields.type`	Data type of the Elasticsearch field. Currently, the following Elasticsearch data types are supported: dense_vector, keyword, text, long, integer, double, float, boolean, object.
`meta.fields.dims`	Dimension of the vector field. Required only when `meta.fields.type` is `dense_vector`.
`meta.milvus`	Configs specific to creating the collection in Milvus 2.x.
`meta.milvus.collection`	Name of the Milvus collection. Defaults to the Elasticsearch index name if not specified.
`meta.milvus.closeDynamicField`	Specifies whether to disable the dynamic field in the collection. Defaults to `false`. For more information on dynamic fields, refer to Enable Dynamic Field.
`meta.milvus.shardNum`	Number of shards to be created in the collection. For more information on shards, refer to Terminology.
`meta.milvus.consistencyLevel`	Consistency level of the collection in Milvus. For more information, refer to Consistency.

source

Parameter	Description
`source.es`	Connection configs for the source Elasticsearch server.
`source.es.urls`	Address of the source Elasticsearch server.
`source.es.username`	Username for the Elasticsearch server.
`source.es.password`	Password for the Elasticsearch server.

target

Parameter	Description
`target.mode`	Storage location for dumped files. Valid values: `local` (store dumped files on local disks) and `remote` (store dumped files on object storage).
`target.remote.outputDir`	Output directory path in the cloud storage bucket.
`target.remote.cloud`	Cloud storage service provider. Example values: `aws`, `gcp`, `azure`.
`target.remote.region`	Cloud storage region. Can be any value if you use local MinIO.
`target.remote.bucket`	Bucket name for storing data. The value must match the config in Milvus 2.x. For more information, refer to System Configurations.
`target.remote.useIAM`	Whether to use an IAM Role for the connection.
`target.remote.checkBucket`	Whether to check if the specified bucket exists in object storage.
`target.milvus2x`	Connection configs for the target Milvus 2.x server.
`target.milvus2x.endpoint`	Address of the target Milvus server.
`target.milvus2x.username`	Username for the Milvus 2.x server. Required if user authentication is enabled on your Milvus server. For more information, refer to Enable Authentication.
`target.milvus2x.password`	Password for the Milvus 2.x server. Required if user authentication is enabled on your Milvus server. For more information, refer to Enable Authentication.

Start the migration task

Start the migration task with the following command. Replace {YourConfigFilePath} with the local directory where the config file migration.yaml resides.

./milvus-migration start --config=/{YourConfigFilePath}/migration.yaml

The following is an example of a successful migration log output:

[task/load_base_task.go:94] ["[LoadTasker] Dec Task Processing-------------->"] [Count=0] [fileName=testfiles/output/zwh/migration/test_mul_field4/data_1_1.json] [taskId=442665677354739304]
[task/load_base_task.go:76] ["[LoadTasker] Progress Task --------------->"] [fileName=testfiles/output/zwh/migration/test_mul_field4/data_1_1.json] [taskId=442665677354739304]
[dbclient/cus_field_milvus2x.go:86] ["[Milvus2x] begin to ShowCollectionRows"]
[loader/cus_milvus2x_loader.go:66] ["[Loader] Static: "] [collection=test_mul_field4_rename1] [beforeCount=50000] [afterCount=100000] [increase=50000]
[loader/cus_milvus2x_loader.go:66] ["[Loader] Static Total"] ["Total Collections"=1] [beforeTotalCount=50000] [afterTotalCount=100000] [totalIncrease=50000]
[migration/es_starter.go:25] ["[Starter] migration ES to Milvus finish!!!"] [Cost=80.009174459]
[starter/starter.go:106] ["[Starter] Migration Success!"] [Cost=80.00928425]
[cleaner/remote_cleaner.go:27] ["[Remote Cleaner] Begin to clean files"] [bucket=a-bucket] [rootPath=testfiles/output/zwh/migration]
[cmd/start.go:32] ["[Cleaner] clean file success!"]
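The counters in the `[Loader] Static Total` line can be read programmatically to confirm that the increase matches the difference between the before and after counts. The following is a minimal sketch that assumes log lines keep the bracketed `key=value` layout shown in the sample; `parse_log_fields` and `totals_are_consistent` are hypothetical helper names, not part of the migration tool.

```python
import re

# Matches bracketed key=value pairs such as [beforeTotalCount=50000]
# or ["Total Collections"=1] in a migration log line.
PAIR = re.compile(r'\[(?:"([^"]+)"|(\w+))=([^\]]*)\]')


def parse_log_fields(line):
    """Extract bracketed key=value pairs from one log line into a dict of strings."""
    return {quoted or bare: value for quoted, bare, value in PAIR.findall(line)}


def totals_are_consistent(line):
    """Check that afterTotalCount - beforeTotalCount equals totalIncrease."""
    fields = parse_log_fields(line)
    before = int(fields["beforeTotalCount"])
    after = int(fields["afterTotalCount"])
    return after - before == int(fields["totalIncrease"])
```

Applied to the `Static Total` line in the sample log, `parse_log_fields` yields the four counters as strings, and `totals_are_consistent` compares them as integers.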

Verify the result

Once the migration task is executed, you can make API calls or use Attu to view the number of entities migrated. For more information, refer to Attu and get_collection_stats().
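For the API route, a small helper along these lines may be enough. This is a sketch under assumptions: `migrated_row_count` is a hypothetical name, and `client` is expected to behave like `pymilvus.MilvusClient`, whose `get_collection_stats()` returns a dict containing `row_count`. The URI, token, and collection name in the commented usage are placeholders.

```python
def migrated_row_count(client, collection_name):
    """Return the row count that Milvus reports for a collection.

    `client` is assumed to expose get_collection_stats(collection_name=...)
    returning a dict with a "row_count" entry, as pymilvus.MilvusClient does.
    """
    stats = client.get_collection_stats(collection_name=collection_name)
    return int(stats["row_count"])


# Example usage against a running server (placeholder connection values):
# from pymilvus import MilvusClient
# client = MilvusClient(uri="http://localhost:19530", token="user:password")
# print(migrated_row_count(client, "Collection_01"))
```

Comparing this number with the document count of the source Elasticsearch index gives a quick sanity check on the migration.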

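Field mapping reference

Review the table below to understand how field types in Elasticsearch indexes are mapped to field types in Milvus collections.

For more information on supported data types in Milvus, refer to Supported data types.

Elasticsearch Field Type	Milvus Field Type	Description
dense_vector	FloatVector	Vector dimensions remain unchanged during migration.
keyword	VarChar	Set the maximum length (1 to 65,535). Strings exceeding the limit can trigger migration errors.
text	VarChar	Set the maximum length (1 to 65,535). Strings exceeding the limit can trigger migration errors.
long	Int64	-
integer	Int32	-
double	Double	-
float	Float	-
boolean	Bool	-
object	JSON	-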
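The mapping table can be expressed as a lookup that also enforces the VarChar length range. The following is an illustrative sketch only: `ES_TO_MILVUS` and `milvus_type_for` are hypothetical names, and the input is assumed to be one `meta.fields` entry from migration.yaml loaded as a dict.

```python
# Elasticsearch field type -> Milvus field type, as listed in the table.
ES_TO_MILVUS = {
    "dense_vector": "FloatVector",
    "keyword": "VarChar",
    "text": "VarChar",
    "long": "Int64",
    "integer": "Int32",
    "double": "Double",
    "float": "Float",
    "boolean": "Bool",
    "object": "JSON",
}


def milvus_type_for(field):
    """Map one `meta.fields` entry (a dict) to its Milvus field type name."""
    es_type = field["type"]
    if es_type not in ES_TO_MILVUS:
        raise ValueError(f"unsupported Elasticsearch type: {es_type}")
    milvus_type = ES_TO_MILVUS[es_type]
    if milvus_type == "VarChar":
        # keyword and text fields need maxLen within 1 to 65,535.
        max_len = field.get("maxLen")
        if max_len is None or not 1 <= max_len <= 65535:
            raise ValueError(f"{field['name']}: maxLen must be between 1 and 65535")
    return milvus_type
```

For example, `milvus_type_for({"name": "text1", "type": "text", "maxLen": 1000})` returns `"VarChar"`, while a `text` field without `maxLen` raises `ValueError`.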
