Index Document | Easysearch | Distributed Search Database

Indexes (adds or updates) a document in the specified index.

API #

POST /{index}/_doc
PUT /{index}/_doc
POST /{index}/_doc/{id}
PUT /{index}/_doc/{id}

Purpose #

This API adds a document to an index or updates an existing one. It is the core data write operation in Easysearch.

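Operation types #

Case	Operation	Description
No document ID specified	Create	An ID is generated automatically and a new document is created
Document ID specified, document does not exist	Create	A new document is created with the specified ID
Document ID specified, document already exists	Update	The entire document is overwritten (use `_update` for partial updates)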
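
The three cases above can be sketched with a toy in-memory model. This is only an illustration of the create/replace semantics, not how Easysearch is implemented; the `ToyIndex` class and its UUID-based ID scheme are hypothetical.

```python
import uuid


class ToyIndex:
    """In-memory stand-in that mimics only the create/replace semantics."""

    def __init__(self):
        self.docs = {}  # doc_id -> (version, source)

    def index(self, source, doc_id=None):
        if doc_id is None:
            # Hypothetical ID scheme; Easysearch generates its own IDs.
            doc_id = uuid.uuid4().hex
        version = self.docs.get(doc_id, (0, None))[0] + 1
        # The stored document is replaced as a whole, never merged.
        self.docs[doc_id] = (version, dict(source))
        return {
            "_id": doc_id,
            "_version": version,
            "result": "created" if version == 1 else "updated",
        }


idx = ToyIndex()
first = idx.index({"user": "kimchy", "message": "Hello"}, doc_id="1")
second = idx.index({"user": "kimchy"}, doc_id="1")
auto = idx.index({"user": "kimchy"})
```

After the second call the stored document no longer has a `message` field, which is why partial changes belong to `_update`.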

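Differences from other APIs #

API	Purpose
`/_doc`	Index or replace an entire document
`/_create/{id}`	Create a new document only (fails if the document already exists)
`/_update/{id}`	Partially update a document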

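Parameters #

Path parameters #

Parameter	Type	Required	Description
`{index}`	String	Yes	Index name
`{id}`	String	No	Document ID. Generated automatically if omitted

Query string parameters #

Parameter	Type	Required	Default	Description
`routing`	String	No	-	Routing value used to route the document to a specific shard
`pipeline`	String	No	-	Ingest pipeline to run on the document before indexing
`refresh`	String	No	false	Refresh policy: `true`, `false`, or `wait_for`
`timeout`	Time value	No	1m	How long to wait for the operation to complete
`version`	Integer	No	-	Document version number
`version_type`	String	No	internal	Version type: `internal`, `external`, or `external_gte`
`op_type`	String	No	index	Operation type: `index` or `create`
`wait_for_active_shards`	String	No	1	Number of active shard copies required before the operation proceeds: `1`, `all`, or a specific number
`if_seq_no`	Integer	No	-1	Sequence number condition: the write proceeds only if the document's current sequence number matches
`if_primary_term`	Integer	No	-1	Primary term condition: the write proceeds only if the document's current primary term matches
`require_alias`	Boolean	No	false	Whether the target must be an alias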
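
As a minimal sketch of how these parameters end up on the request line, the helper below appends them to a path with the Python standard library. The function name `with_query` is hypothetical. Unset parameters are dropped so that the server-side defaults apply, and booleans are lowercased because the API expects `true`/`false`.

```python
from urllib.parse import urlencode


def with_query(path, **params):
    """Append query string parameters to an index API path."""
    cleaned = {}
    for name, value in params.items():
        if value is None:
            continue  # leave the server default in place
        if isinstance(value, bool):
            value = str(value).lower()  # True -> "true"
        cleaned[name] = value
    query = urlencode(cleaned)
    return f"{path}?{query}" if query else path


conditional = with_query("/my_index/_doc/1", if_seq_no=10, if_primary_term=1)
alias_only = with_query("/my_index/_doc", require_alias=True, routing=None)
```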

Request body #

The request body must contain the document as JSON:

{
  "field1": "value1",
  "field2": "value2",
  ...
}
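
For readers calling the API from code rather than a console, the following sketch builds (but does not send) such a request with Python's standard library. The base URL `http://localhost:9200` and the helper name `build_index_request` are assumptions; a real cluster will typically also need TLS and authentication, which are omitted here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node without security


def build_index_request(index, document, doc_id=None):
    """Build an index request; POST without an ID, PUT with one."""
    url = f"{BASE_URL}/{index}/_doc"
    if doc_id is not None:
        url += f"/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),  # body is the JSON document
        headers={"Content-Type": "application/json"},
        method="PUT" if doc_id is not None else "POST",
    )


req = build_index_request("my_index", {"field1": "value1", "field2": "value2"})

# Sending requires a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```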

Examples #

Index a document with an auto-generated ID #

POST /my_index/_doc
{
  "user": "kimchy",
  "post_date": "2026-02-04T08:00:00",
  "message": "Trying out Easysearch"
}

Example response:

{
  "_index": "my_index",
  "_type": "_doc",
  "_id": "W0psoYsB7H8j7X2L9-8Q",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1,
  "created": true
}

Index a document with a specified ID #

PUT /my_index/_doc/1
{
  "user": "kimchy",
  "post_date": "2026-02-04T08:00:00",
  "message": "Trying out Easysearch"
}

Example response:

{
  "_index": "my_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": { ... },
  "_seq_no": 0,
  "_primary_term": 1
}

Create a new document only (fail if it already exists) #

PUT /my_index/_doc/1?op_type=create
{
  "user": "kimchy",
  "message": "Hello"
}

Or use the /_create endpoint:

PUT /my_index/_create/1
{
  "user": "kimchy",
  "message": "Hello"
}

Use routing #

POST /my_index/_doc?routing=user123
{
  "user": "kimchy",
  "message": "Hello"
}

Use an ingest pipeline #

POST /my_index/_doc?pipeline=timestamp_pipeline
{
  "user": "kimchy",
  "message": "Hello"
}

Refresh immediately #

POST /my_index/_doc?refresh=true
{
  "user": "kimchy",
  "message": "Hello"
}

Wait for a refresh #

POST /my_index/_doc?refresh=wait_for
{
  "user": "kimchy",
  "message": "Hello"
}

Use version control #

PUT /my_index/_doc/1?version=2&version_type=external
{
  "user": "kimchy",
  "message": "Updated message"
}

Use a conditional update #

PUT /my_index/_doc/1?if_seq_no=10&if_primary_term=1
{
  "user": "kimchy",
  "message": "Conditional update"
}
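
The values for `if_seq_no` and `if_primary_term` normally come from an earlier read of the same document. The sketch below assumes that the response to a previous GET of the document carries `_seq_no` and `_primary_term` (as the index response shown earlier does); the helper name `conditional_put_path` is hypothetical.

```python
def conditional_put_path(fetched):
    """Build a PUT path that succeeds only if the document is unchanged
    since `fetched` was read."""
    return (
        f"/{fetched['_index']}/_doc/{fetched['_id']}"
        f"?if_seq_no={fetched['_seq_no']}"
        f"&if_primary_term={fetched['_primary_term']}"
    )


# Metadata as it might look in a previously fetched document.
fetched = {"_index": "my_index", "_id": "1", "_seq_no": 10, "_primary_term": 1}
path = conditional_put_path(fetched)
```

If another writer changes the document in between, the stored sequence number no longer matches and the write is rejected instead of silently overwriting the newer data.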

Wait for all shards to be active #

PUT /my_index/_doc/1?wait_for_active_shards=all
{
  "user": "kimchy",
  "message": "Hello"
}

Set a timeout #

PUT /my_index/_doc/1?timeout=5m
{
  "user": "kimchy",
  "message": "Hello"
}

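Response fields #

Field	Description
`_index`	Index that contains the document
`_type`	Document type (always `_doc`)
`_id`	Document ID
`_version`	Document version number
`result`	Operation result: `created` or `updated`
`_shards`	Shard information: number of shard copies targeted (`total`), written (`successful`), and failed (`failed`)
`_seq_no`	Sequence number
`_primary_term`	Primary term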
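
A client usually needs only a few of these fields. The following sketch parses a response like the one shown in the examples and pulls out the generated ID, whether the document was newly created, and the shard counts. The function and key names are hypothetical.

```python
import json


def summarize_index_response(response_text):
    """Extract the fields a caller typically acts on."""
    resp = json.loads(response_text)
    shards = resp["_shards"]
    return {
        "id": resp["_id"],
        "created": resp["result"] == "created",
        "all_copies_written": shards["successful"] == shards["total"],
        "failed_copies": shards["failed"],
    }


sample = (
    '{"_index": "my_index", "_id": "W0psoYsB7H8j7X2L9-8Q", "_version": 1, '
    '"result": "created", "_shards": {"total": 2, "successful": 1, "failed": 0}, '
    '"_seq_no": 0, "_primary_term": 1}'
)
summary = summarize_index_response(sample)
```

In the sample, `successful` is lower than `total` while `failed` is 0. That combination typically means a replica copy was not assigned at write time, not that the write failed.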

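Operation type (op_type) #

Value	Description
`index`	Index operation (default): creates the document if it does not exist, replaces it if it does
`create`	Create only: the operation fails if the document already exists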

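Version control (version_type) #

Type	Description
`internal`	Internal versioning (default): Easysearch assigns the version number and increments it on each write
`external`	External versioning: the supplied version must be greater than the current version
`external_gte`	External versioning, greater than or equal: the supplied version must be greater than or equal to the current version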
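
The two external version types differ only in whether an equal version is accepted. The sketch below restates that rule as a pure function; it models the comparison described in the table, not the engine's actual code, and the function name is hypothetical.

```python
def external_version_accepted(version_type, current, supplied):
    """Return True if a write with `supplied` would pass the version check."""
    if version_type == "external":
        return supplied > current
    if version_type == "external_gte":
        return supplied >= current
    raise ValueError(f"not an external version type: {version_type}")


newer = external_version_accepted("external", current=1, supplied=2)
same_strict = external_version_accepted("external", current=2, supplied=2)
same_gte = external_version_accepted("external_gte", current=2, supplied=2)
```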

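Refresh policy (refresh) #

Value	Description
`true`	Refresh immediately so that the document becomes searchable
`false`	Do not refresh (default); the document becomes searchable after the next scheduled refresh
`wait_for`	Do not force a refresh; the request returns only after a refresh has made the document searchable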

Error handling #

Document already exists (op_type=create) #

{
  "error": {
    "type": "version_conflict_engine_exception",
    "reason": "[1]: version conflict, document already exists (current version [1])"
  },
  "status": 409
}

Version conflict #

{
  "error": {
    "type": "version_conflict_engine_exception",
    "reason": "[1]: version conflict"
  },
  "status": 409
}

Condition not met #

{
  "error": {
    "type": "version_conflict_engine_exception",
    "reason": "[1]: version conflict"
  },
  "status": 409
}
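
All three cases share the same status code and error type, so a client can detect them with one check and then decide whether to re-read the document and retry. A minimal sketch, with a hypothetical helper name:

```python
def is_version_conflict(status, body):
    """True for the 409 responses shown above."""
    error = body.get("error") or {}
    return status == 409 and error.get("type") == "version_conflict_engine_exception"


conflict_body = {
    "error": {
        "type": "version_conflict_engine_exception",
        "reason": "[1]: version conflict",
    },
    "status": 409,
}
conflict = is_version_conflict(409, conflict_body)
success = is_version_conflict(201, {"result": "created"})
```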

Use cases #

Scenario 1: Logging #

POST /logs/_doc
{
  "@timestamp": "2026-02-04T08:00:00",
  "level": "INFO",
  "message": "Application started",
  "service": "api"
}

Scenario 2: User data #

PUT /users/_doc/12345
{
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30
}

Scenario 3: Time series data #

POST /metrics-2026.02.04/_doc
{
  "@timestamp": "2026-02-04T08:00:00",
  "metric_name": "cpu_usage",
  "value": 75.5,
  "host": "server1"
}

Scenario 4: Bulk import #

POST /products/_doc?pipeline=product_processor
{
  "name": "Product 1",
  "price": 99.99,
  "category": "electronics"
}

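Best practices #

Auto-generated IDs: for log-style data, letting Easysearch generate the ID is faster
Specified IDs: for data that needs precise control (such as user data), use a business ID
Refresh policy: use refresh=false or refresh=wait_for during bulk imports
Version control: use version numbers when you need optimistic concurrency control
Routing: use routing to keep related documents on the same shard
Pipelines: use ingest pipelines to preprocess data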

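Notes #

Types are deprecated: the type parameter in the URL is deprecated; the _doc type is used by default
Replace vs. update: this API replaces the entire document; use /_update for partial updates
ID limit: document IDs are limited to 512 bytes
Field limit: the number of fields per index is limited
Performance: frequent use of refresh=true hurts performance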
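
Because the ID limit is expressed in bytes rather than characters, IDs containing non-ASCII text reach it sooner than their length suggests. The sketch below checks an ID on the client before sending; it assumes the limit applies to the UTF-8 encoding of the ID, and the names are hypothetical.

```python
MAX_ID_BYTES = 512  # limit stated in the notes above


def id_within_limit(doc_id):
    """Check the encoded size of a document ID (assumes UTF-8)."""
    return len(doc_id.encode("utf-8")) <= MAX_ID_BYTES


ascii_ok = id_within_limit("a" * 512)
ascii_too_long = id_within_limit("a" * 513)
# Each of these characters takes 3 bytes in UTF-8, so 171 of them is 513 bytes.
cjk_too_long = id_within_limit("文" * 171)
```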
