
Logstash fails to write logs: retrying failed action with response code: 503

Problem

This morning a colleague reported that logs were no longer visible on the ELK platform.

The Logstash application log showed:

[INFO ] 2022-01-11 11:07:17.735 [[main]>worker1] elasticsearch - retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[log-center-2022.01.11][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[log-center-2022.01.11][0]] containing [11] requests]"})

Solution

1. Check the cluster health status:

The cluster health status is red, which means at least one primary shard is unassigned.

[root@netmgmt-prod-elk-03 ~]# curl '10.7.1.8:9200/_cluster/health?pretty'
{
"cluster_name" : "es-e679l179",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 6,
"number_of_data_nodes" : 3,
"active_primary_shards" : 5831,
"active_shards" : 11422,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 250,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 97.8581220013708
}

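When the check has to be repeated (for example before and after a fix), it can help to pull just the status field out of the health response. The following is a minimal sketch, not part of the original troubleshooting session: the function name `health_status` is hypothetical, and it assumes the JSON layout shown in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read _cluster/health JSON on stdin and print
# only the value of the "status" field (green, yellow or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Usage (the address is the one used in this post; adjust for your cluster):
#   curl -s '10.7.1.8:9200/_cluster/health?pretty' | health_status
```

The `-s` flag in the usage line only silences curl's progress meter so that the pipe receives clean JSON.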
2. Find the abnormal index

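One index is in the red state. It can simply be deleted, provided its data is no longer needed, since deleting an index removes its data permanently.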

[root@netmgmt-prod-elk-03 ~]# curl 'http://10.7.1.8:9200/_cat/indices' | grep red
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  1  144k    1  2704    0     0    907      0  0:02:43  0:00:02  0:02:41   907
red    open gps_lte-mode-2019.04.03 _N2IkwVeSxiP4s1gMyFQgw 5 1   
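Filtering with `grep red` also matches any index whose name happens to contain "red". A slightly stricter variant, sketched below under the assumption that `_cat/indices` prints its default columns (health first, index name third, as in the output above), compares only the health column. The function name `red_indices` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read _cat/indices output on stdin and print the
# names of indices whose health column (field 1) is exactly "red".
red_indices() {
  awk '$1 == "red" { print $3 }'
}

# Usage (-s hides curl's progress meter):
#   curl -s 'http://10.7.1.8:9200/_cat/indices' | red_indices
```

The `_cat/indices` endpoint also accepts a `health=red` query parameter, which performs the same filtering on the server side.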

Delete the index:

curl -XDELETE 'http://10.7.1.8:9200/gps_lte-mode-2019.04.03'
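Because a DELETE cannot be undone, it may be worth guarding against an empty or wildcard index name before the request is sent. The sketch below is one possible guard, not something from the original session: `safe_delete_url` is a hypothetical helper that only builds the URL and refuses names that are empty, `_all`, or contain `*` or `,`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the DELETE URL for exactly one named index,
# or fail if the name is empty, "_all", or contains a wildcard or comma.
safe_delete_url() {
  local base="$1" index="$2"
  case "$index" in
    ""|_all|*\**|*,*)
      echo "refusing to delete: '$index'" >&2
      return 1
      ;;
  esac
  # Strip one trailing slash from the base so the result has a single "/".
  printf '%s/%s\n' "${base%/}" "$index"
}

# Usage:
#   url="$(safe_delete_url 'http://10.7.1.8:9200' 'gps_lte-mode-2019.04.03')" \
#     && curl -XDELETE "$url"
```

After the delete, re-running the `_cluster/health` query from step 1 shows whether the status has left red.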

Reference: https://blog.csdn.net/stefan1240/article/details/88988587
