failed to get type [type] and id [id] with includes/excludes set - typed get failed while includes/excludes were set | Easysearch | Distributed search database

Applies to: 5.6-7.x (typed APIs are deprecated and were removed in 8.x)

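1. Basic description of the error #

Failed to get type [type] and id [id] with includes/excludes set means that a legacy typed get request had already fetched the document's _source and converted it to a map, but then failed while rebuilding the response after applying the includes/excludes filter. The exception points at the source filtering and reserialization stage, not at the initial document lookup.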

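The source code first converts _source to a Map, then filters it with XContentMapValues.filter(...) according to includes/excludes, and finally rebuilds the output with contentBuilder(...).map(sourceAsMap). If that rebuild step throws an IOException, it is wrapped in this exception.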
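The three steps described above (parse to a map, filter, reserialize) can be sketched in Python. This is only an illustration of the flow, not Elasticsearch's implementation: `filter_source` and `get_with_source_filter` are hypothetical names, wildcard matching is approximated with `fnmatch`, and arrays are kept or dropped as a whole rather than filtered element by element.

```python
import json
from fnmatch import fnmatchcase


def _matches(path, patterns):
    """True if the dotted path, or any of its parent paths, matches a pattern."""
    parts = path.split(".")
    prefixes = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def filter_source(source, includes=(), excludes=(), prefix=""):
    """Simplified stand-in for includes/excludes filtering on a nested dict."""
    result = {}
    for key, value in source.items():
        path = prefix + key
        if excludes and _matches(path, excludes):
            continue
        included = not includes or _matches(path, includes)
        if isinstance(value, dict):
            # An included object keeps all children (minus excludes); otherwise
            # descend and keep it only if some child matches an include.
            child = filter_source(
                value, () if included else includes, excludes, path + "."
            )
            if included or child:
                result[key] = child
        elif included:
            result[key] = value
    return result


def get_with_source_filter(raw_source, includes, excludes):
    source_as_map = json.loads(raw_source.decode("utf-8"))       # 1. bytes -> map
    filtered = filter_source(source_as_map, includes, excludes)  # 2. filter
    return json.dumps(filtered).encode("utf-8")                  # 3. map -> bytes
```

A failure in step 1 or step 3 of this sketch corresponds to the stage the exception reports: the document was found, but its content could not be turned back into a response.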

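Common symptoms #

A typed-path request with _source filtering parameters returns a 500 Internal Server Error, for example GET /index/type/id?_source_includes=field1,field2.
The same document may return normally without source filtering (GET /index/type/id with no parameters) but fail once a filter is added.
Logs often show _source conversion, content-building, or I/O exceptions such as JsonParseException or IOException.
In a multi-get (_mget) that uses the typed format with source filtering, only some of the documents may fail.
In Elasticsearch 7.x the typed APIs are deprecated but still available; in 8.x types were removed entirely.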
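Because an _mget batch can fail for only some documents, the response has to be checked entry by entry. The sketch below assumes the usual multi-get response shape, in which a failed entry under `docs` carries an `error` object with a `reason`; `split_mget_docs` is a hypothetical helper name, and the exact error shape should be confirmed against your cluster version.

```python
def split_mget_docs(response):
    """Split an _mget response into successful ids and {failed id: reason}."""
    ok, failed = [], {}
    for doc in response.get("docs", []):
        if "error" in doc:
            err = doc["error"]
            # Assumption: the error is an object with a "reason" field;
            # fall back to its string form otherwise.
            reason = err.get("reason") if isinstance(err, dict) else str(err)
            failed[doc.get("_id")] = reason
        else:
            ok.append(doc.get("_id"))
    return ok, failed
```

The ids collected in `failed` are the ones worth retrying individually with and without the source filter.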

Typical error and stack trace #

Log entries usually look something like this:

ElasticsearchException: Failed to get type [my_type] and id [abc123] with includes/excludes set
Caused by: java.io.IOException: read past EOF
	at org.elasticsearch.common.xcontent.XContentMapValues...

Or a _source parsing exception:

ElasticsearchException: Failed to get type [_doc] and id [xyz789] with includes/excludes set
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected character
	at com.fasterxml.jackson.core.JsonParser...

Or a map filtering exception:

ElasticsearchException: Failed to get type [logs] and id [doc456] with includes/excludes set
Caused by: java.lang.IllegalArgumentException: Cannot filter source
	at org.elasticsearch.common.xcontent.XContentMapValues.filter(XContentMapValues.java:...)
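To see which documents are affected and what the underlying cause is, log entries shaped like the ones above can be parsed with a short script. This is a sketch that assumes the message and "Caused by:" lines appear exactly as in these samples; real log layouts vary with the logging configuration, and `extract_failures` is a hypothetical helper.

```python
import re

FAILURE = re.compile(
    r"Failed to get type \[(?P<type>[^\]]*)\] and id \[(?P<id>[^\]]*)\] "
    r"with includes/excludes set"
)
CAUSE = re.compile(r"Caused by: ([\w.$]+)")


def extract_failures(log_text):
    """Return (type, id, cause exception class or None) for each failure."""
    lines = log_text.splitlines()
    failures = []
    for i, line in enumerate(lines):
        match = FAILURE.search(line)
        if not match:
            continue
        cause = None
        # Look at the next few lines for the first "Caused by:" entry,
        # stopping early if another failure message starts.
        for follow in lines[i + 1 : i + 4]:
            if FAILURE.search(follow):
                break
            found = CAUSE.match(follow.strip())
            if found:
                cause = found.group(1)
                break
        failures.append((match.group("type"), match.group("id"), cause))
    return failures
```

Grouping the result by cause class quickly shows whether the failures are dominated by parsing errors or by I/O errors.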

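2. Why this error occurs #

The root cause of Failed to get type [type] and id [id] with includes/excludes set is that filtering and reserializing _source failed inside a legacy typed Get request. Elasticsearch first fetches the full _source and converts it to a Map, then filters fields according to the includes/excludes rules, and finally reserializes the result into the output format. If the _source structure is abnormal or the filtering step goes wrong, this exception is thrown.

Common causes include: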

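Malformed _source in historical documents: documents in legacy typed indices have an abnormal _source JSON structure, so rebuilding the content after filtering fails.
Bad _source format: the data was malformed at write time (for example non-UTF-8 encoding or illegal JSON characters), so filtering fails after conversion to a Map.
Complex includes/excludes paths: filter paths combined with complex nested structures (deeply nested objects, arrays) trigger filtering or reserialization errors.
Low-level I/O or content-building errors: the node hits a low-level I/O problem while processing _source, or the content builder itself fails.
Leftovers from upgrading legacy typed indices: after upgrading from an older version (such as 5.x or 6.x) to a newer one (such as 7.x), historical typed indices may carry irregular _source data accumulated along the upgrade path.
Large documents: documents with a very large _source may hit memory or buffer problems during filtering and reserialization.
Nonexistent field paths: a path named in includes/excludes does not exist in the document, and the filtering logic handles that case poorly.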

3. How to troubleshoot this exception #

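Work through it in this order: reproduce the problem first, then inspect _source, then tune the filter rules.

Reproduce the problem: test the typed get request both without and with includes/excludes to confirm what triggers the failure.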

# Without filtering: check whether the document is returned normally (legacy typed API)
curl -X GET "localhost:9200/my_index/my_type/my_id?pretty"
   
# With filtering: check whether the exception is triggered
curl -X GET "localhost:9200/my_index/my_type/my_id?_source_includes=field1,field2&pretty"

Inspect the _source content: pull the original _source and confirm that its JSON structure is complete and parseable.

# Fetch the document together with its _source
curl -X GET "localhost:9200/my_index/my_type/my_id?_source=true" > source.json

# Validate the JSON
jq . source.json  # if jq fails to parse the file, the JSON is malformed
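Where jq is not available, the same check can be approximated in Python. This sketch assumes `source.json` holds the full get response saved above, with the document body under `_source`; `check_get_response` is a hypothetical helper that returns a description of the first problem found, or `None` if the response looks well formed.

```python
import json


def check_get_response(raw):
    """Return a problem description for a saved get response, or None."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return f"not valid UTF-8: {err.reason} at byte {err.start}"
    try:
        body = json.loads(text)
    except json.JSONDecodeError as err:
        return f"not valid JSON: {err.msg} (line {err.lineno}, column {err.colno})"
    source = body.get("_source") if isinstance(body, dict) else None
    if not isinstance(source, dict):
        return "response has no object-valued _source"
    return None
```

Reading the file in binary mode (`open("source.json", "rb").read()`) and passing the bytes in keeps encoding problems visible instead of letting the file reader mask them.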

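Simplify the filter rules: narrow down the field paths step by step to find which field's filter triggers the exception.

# Test one field at a time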
curl -X GET "localhost:9200/my_index/my_type/my_id?_source_includes=field1"
curl -X GET "localhost:9200/my_index/my_type/my_id?_source_includes=field2"
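With many candidate fields, the one-field-at-a-time test can be scripted. The sketch below keeps the HTTP call separate from the search logic so the logic can be exercised without a cluster; `typed_get_url`, `http_status`, and `find_failing_fields` are hypothetical helpers, and the host, index, type, and id are the placeholder values used in the curl examples.

```python
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def typed_get_url(base, index, doc_type, doc_id, includes=()):
    """Build a legacy typed get URL, optionally with _source_includes."""
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    url = f"{base.rstrip('/')}/{path}"
    if includes:
        url += "?" + urlencode({"_source_includes": ",".join(includes)})
    return url


def http_status(url):
    """Return the HTTP status code for a GET request, including error codes."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        return err.code


def find_failing_fields(fields, status_for_field):
    """Return the fields whose single-field filtered get answers with 5xx."""
    return [field for field in fields if status_for_field(field) >= 500]


def main():
    # Placeholder values; adjust to the affected cluster and document.
    base, index, doc_type, doc_id = "http://localhost:9200", "my_index", "my_type", "my_id"
    failing = find_failing_fields(
        ["field1", "field2"],
        lambda f: http_status(typed_get_url(base, index, doc_type, doc_id, [f])),
    )
    print("fields that trigger the failure:", failing)
```

If no single field fails on its own, the problem is more likely in the combination of paths or in the document as a whole than in one field.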

Consider migrating to the typeless API: on 7.x or later, moving to the typeless API is recommended.

# Typeless API (recommended)
curl -X GET "localhost:9200/my_index/_doc/my_id?_source_includes=field1,field2"
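When an application builds many typed document paths, the rewrite to the typeless form can be done mechanically. The helper below is a hypothetical sketch that only handles the plain /index/type/id shape shown in this article; it deliberately leaves any path containing an underscore-prefixed segment untouched, which also means document ids that begin with an underscore are not rewritten.

```python
def to_typeless_path(path):
    """Rewrite /index/type/id[?query] to /index/_doc/id[?query]."""
    route, sep, query = path.partition("?")
    parts = route.strip("/").split("/")
    # Only plain three-segment document paths are rewritten; API endpoints
    # such as /_cluster/health or /index/_search stay as they are.
    if len(parts) != 3 or any(part.startswith("_") for part in parts):
        return path
    index, _doc_type, doc_id = parts
    return f"/{index}/_doc/{doc_id}{sep}{query}"
```

Query parameters such as _source_includes are carried over unchanged, so the same filter can be retested against the typeless endpoint.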

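Check the node logs: use them to confirm whether a lower-level I/O or content-building exception is also present.

# Search for related error entries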
grep -r "Failed to get type.*includes/excludes" /var/log/elasticsearch/
grep -r "XContentMapValues" /var/log/elasticsearch/

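Evaluate index migration: if the problem lies in historical typed indices, assess whether migrating and rebuilding them is feasible.

Points to watch while troubleshooting #

If _source itself is corrupted, an unfiltered request may still return data (with bad content); filtering has to parse the full content into a Map, which is where the problem surfaces.
This error occurs in the legacy typed API. Where possible, migrate to the typeless API to avoid typed API compatibility problems.
includes/excludes support wildcards and path expressions, and complex rules can match unexpected fields, so check them carefully.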
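To preview which fields a wildcard rule would catch before sending it, the document's field paths can be flattened and matched locally. This sketch approximates the matching with Python's `fnmatch`, in which `*` also matches across dots; that is an assumption about the server's behavior rather than a statement of it, so surprising matches should be confirmed against the cluster. `leaf_paths` and `preview_matches` are hypothetical helpers.

```python
from fnmatch import fnmatchcase


def leaf_paths(obj, prefix=""):
    """List the dotted paths of all non-object values in a nested dict."""
    paths = []
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            paths.extend(leaf_paths(value, path + "."))
        else:
            paths.append(path)
    return paths


def preview_matches(doc, pattern):
    """Return the sorted leaf paths of doc that the pattern would match."""
    return sorted(p for p in leaf_paths(doc) if fnmatchcase(p, pattern))
```

For a document with both a `user` object and a top-level `username` field, the pattern `user*` matches fields from both, while `user.*` is limited to the object's children.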

4. How to fix this error #

Common fixes #

Repair or delete corrupted documents: a document with a corrupted _source can be reindexed or deleted.

# Delete the corrupted document (legacy typed API)
curl -X DELETE "localhost:9200/my_index/my_type/my_id"

# Reindex it (if the original data is available)
curl -X PUT "localhost:9200/my_index/my_type/my_id" -H 'Content-Type: application/json' -d'
{
  "field1": "value1",
  "field2": "value2"
}
'

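Simplify the filter rules: correct the includes/excludes expressions and avoid deep filtering on abnormal structures.

# Use a simpler filter rule
# The original may be overly complex: ?_source_includes=obj1.*.field,obj2.nested[*].value
# Change it to something simpler: ?_source_includes=field1,field2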
curl -X GET "localhost:9200/my_index/my_type/my_id?_source_includes=field1,field2"

Migrate to the typeless API: on 7.x or later, moving to the typeless API is recommended; it avoids the typed API problems at the root.
```
# Use the typeless API instead of the typed API
curl -X GET "localhost:9200/my_index/_doc/my_id?_source_includes=field1,field2"
```

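Rebuild the index: for historical typed indices, consider rebuilding them as typeless indices and fixing the data problems at the same time.

# Reindex (from the old index into the new one)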
curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "old_typed_index"
  },
  "dest": {
    "index": "new_typeless_index"
  },
  "conflicts": "proceed"
}
'
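Since "conflicts": "proceed" lets the reindex continue past version conflicts, the response should be inspected before the new index is trusted. The sketch below reads the counters commonly present in a _reindex response (`total`, `created`, `updated`, `version_conflicts`, `failures`); `summarize_reindex` is a hypothetical helper, and its notion of a clean run is an assumption that does not account for intentionally skipped (noop) documents.

```python
def summarize_reindex(response):
    """Condense a _reindex response into a few counters and a clean flag."""
    failures = response.get("failures", [])
    conflicts = response.get("version_conflicts", 0)
    copied = response.get("created", 0) + response.get("updated", 0)
    return {
        "copied": copied,
        "version_conflicts": conflicts,
        "failures": len(failures),
        # Clean means: nothing failed, nothing conflicted, everything copied.
        "clean": not failures
        and conflicts == 0
        and copied == response.get("total", copied),
    }
```

A run that is not clean points at the documents that still need manual repair, which are often the same ones that triggered the original exception.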

Disable source filtering: as a temporary workaround, skip source filtering for now, or use the stored_fields parameter instead (the fields must be mapped with store=true).

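Follow-up notes and recommendations #

If you are still using the typed API (Elasticsearch 7.x), migrate to the typeless API as soon as possible to prepare for the upgrade to 8.x.
Validate data at the application layer before writing it to Elasticsearch, so that _source content is always valid JSON.
For historical typed indices, assess whether they need to be rebuilt as typeless indices to avoid risk in later upgrades.
Monitor document quality so that abnormal document structures raise an alert early.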
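The application-layer validation recommended above can be as small as a strict serialization step before each write. This is a minimal sketch, assuming documents are built as Python dicts; `validate_for_indexing` is a hypothetical helper, and the rules it enforces (object at the top level, no NaN or Infinity, only JSON-serializable values) are a starting point rather than a complete policy.

```python
import json


def validate_for_indexing(doc):
    """Serialize a document strictly, raising if it is not valid JSON."""
    if not isinstance(doc, dict):
        raise ValueError("document must be a JSON object")
    # allow_nan=False rejects NaN and Infinity, which are not valid JSON;
    # values that cannot be serialized at all raise TypeError.
    text = json.dumps(doc, allow_nan=False)
    return text.encode("utf-8")
```

Sending the returned bytes as the request body means the cluster only ever receives content that has already round-tripped through a strict JSON encoder.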

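Using INFINI products to speed up troubleshooting #

INFINI Console is suited to viewing document samples, field distributions, and _source content for an index. It helps determine quickly whether the problem is limited to specific documents or is systemic, and provides visual document viewing, editing, and index rebuilding.
INFINI Gateway is suited to sitting in front of Elasticsearch for request observation and traffic governance. It can log every Get request in detail, including the _source filtering parameters, which helps separate request-parameter problems from data problems, and its request caching reduces repeated filtering work.
Feed metrics such as document read failures and source filtering exceptions into a single monitoring dashboard, and combine them with INFINI Console alerting so that frequent document-quality problems trigger a timely notification.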

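5. Summary #

Failed to get type [type] and id [id] with includes/excludes set means that a legacy typed get failed while filtering and rebuilding _source. Compared with an ordinary read failure, it points more specifically at the combination of _source structure and filter rules. Where possible, migrate to the typeless API and rebuild historical indices to avoid typed API compatibility problems at the root.

With data validation, document monitoring, and version upgrade planning in place, most _source filtering exceptions in the typed API can be resolved effectively, and ongoing protection is easier to achieve with INFINI Console and INFINI Gateway.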

Appendix: log context #

The source excerpt from the original page is kept below so it can be read alongside the exception stack trace:

sourceAsMap = typeMapTuple.v2();
    sourceAsMap = XContentMapValues.filter(sourceAsMap, fetchSourceContext.includes(), fetchSourceContext.excludes());
    try {
        source = BytesReference.bytes(XContentFactory.contentBuilder(sourceContentType).map(sourceAsMap));
    } catch (IOException e) {
        throw new ElasticsearchException("Failed to get type [" + type + "] and id [" + id + "] with includes/excludes set", e);
    }
}
return new GetResult(
        shardId.getIndexName(),
