Redis Cluster: Spring's default Lettuce connection pool configuration does not support high availability (RedisConnectionException, RedisCommandExecutionException)

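1. Error

A newly inherited project hit a serious incident: pages would not load and the backend logged several kinds of errors. In the Redis cluster (three masters, three replicas), two shards had each lost a node: one shard lost its master, the other lost its replica.

2. Troubleshooting

2.1 Application logs

Failure to connect to a Redis node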

javax.servlet.ServletException: org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: io.lettuce.core.RedisConnectionException: Unable to connect to 【ip:port】

Cluster down

org.springframework.data.redis.RedisSystemException: Error in execution; nested exception is io.lettuce.core.RedisCommandExecutionException: CLUSTERDOWN The cluster is down

2.2 Redis logs

Log from the replica of one shard, whose master had gone down

5494:S 30 Nov 2021 16:02:06.100 # Cluster state changed: fail
5494:S 30 Nov 2021 16:02:17.363 * FAIL message received from 479538c5f5fc7f752cfce6777c9ec86e745f2fb0 about 3090f06bf479c917dc9929792ecde57689f57ef6
5494:S 30 Nov 2021 16:02:17.363 * FAIL message received from 479538c5f5fc7f752cfce6777c9ec86e745f2fb0 about c1f0239774dd6f2a69fa17ddeac1bfa44ef2ec18
5494:S 30 Nov 2021 16:02:17.363 * FAIL message received from 479538c5f5fc7f752cfce6777c9ec86e745f2fb0 about 6f95ad87511cb567eb8265503ab4018c133453fa
5494:S 30 Nov 2021 16:02:17.424 * Clear FAIL state for node 6f95ad87511cb567eb8265503ab4018c133453fa: replica is reachable again.
5494:S 30 Nov 2021 16:02:18.426 * Clear FAIL state for node 3090f06bf479c917dc9929792ecde57689f57ef6: replica is reachable again.
5494:S 30 Nov 2021 16:02:48.475 * Clear FAIL state for node c1f0239774dd6f2a69fa17ddeac1bfa44ef2ec18: is reachable again and nobody is serving its slots after some time.
5494:S 30 Nov 2021 16:02:48.475 # Cluster state changed: ok
5494:S 30 Nov 2021 16:03:37.629 * FAIL message received from 0c4b89c5756b61b59bdebbc6630d356440fc96e9 about 6f95ad87511cb567eb8265503ab4018c133453fa
5494:S 30 Nov 2021 16:03:43.557 * Clear FAIL state for node 6f95ad87511cb567eb8265503ab4018c133453fa: replica is reachable again.
5494:S 30 Nov 2021 16:05:24.416 # Cluster state changed: fail
5494:S 30 Nov 2021 16:06:07.598 # Cluster state changed: ok
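
The "Cluster state changed" lines in the log above bound each outage window, so the downtime seen by this node can be computed directly from them. The following is a minimal sketch using only the JDK; the class name ClusterStateDowntime and its method are hypothetical helpers, and the regular expression assumes the log line layout shown above (pid:role, timestamp, level marker, message).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ClusterStateDowntime {
    // Matches e.g. "5494:S 30 Nov 2021 16:02:06.100 # Cluster state changed: fail"
    private static final Pattern STATE_CHANGE = Pattern.compile(
            "^\\d+:\\w (\\d{1,2} \\w{3} \\d{4} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}) # Cluster state changed: (ok|fail)$");
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("d MMM yyyy HH:mm:ss.SSS", Locale.ENGLISH);

    /** Returns the length of each fail -> ok window, in log order. */
    static List<Duration> downtimeWindows(List<String> logLines) {
        List<Duration> windows = new ArrayList<>();
        LocalDateTime failedAt = null;
        for (String line : logLines) {
            Matcher m = STATE_CHANGE.matcher(line.trim());
            if (!m.matches()) {
                continue; // not a cluster state transition
            }
            LocalDateTime at = LocalDateTime.parse(m.group(1), TIMESTAMP);
            if ("fail".equals(m.group(2))) {
                if (failedAt == null) {
                    failedAt = at; // start of an outage window
                }
            } else if (failedAt != null) {
                windows.add(Duration.between(failedAt, at)); // end of the window
                failedAt = null;
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        // The four state transitions taken from the replica log above
        List<String> lines = Arrays.asList(
                "5494:S 30 Nov 2021 16:02:06.100 # Cluster state changed: fail",
                "5494:S 30 Nov 2021 16:02:48.475 # Cluster state changed: ok",
                "5494:S 30 Nov 2021 16:05:24.416 # Cluster state changed: fail",
                "5494:S 30 Nov 2021 16:06:07.598 # Cluster state changed: ok");
        for (Duration window : downtimeWindows(lines)) {
            System.out.println(window.toMillis() + " ms");
        }
    }
}
```

For this log the two windows come out at 42375 ms and 43182 ms, i.e. the node considered the cluster down for roughly 43 seconds on each occasion.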

Log from the master of another shard, whose replica had gone down

20733:S 30 Nov 2021 16:02:05.764 # Cluster state changed: fail
20733:S 30 Nov 2021 16:02:32.658 # Cluster state changed: ok
20733:S 30 Nov 2021 16:03:37.629 * FAIL message received from 0c4b89c5756b61b59bdebbc6630d356440fc96e9 about 6f95ad87511cb567eb8265503ab4018c133453fa
20733:S 30 Nov 2021 16:03:39.935 * Clear FAIL state for node 6f95ad87511cb567eb8265503ab4018c133453fa: replica is reachable again.
20733:S 30 Nov 2021 16:04:45.142 * 10000 changes in 60 seconds. Saving...
20733:S 30 Nov 2021 16:04:45.146 * Background saving started by pid 96306
96306:C 30 Nov 2021 16:04:45.247 * DB saved on disk
96306:C 30 Nov 2021 16:04:45.249 * RDB: 3 MB of memory used by copy-on-write
20733:S 30 Nov 2021 16:04:45.348 * Background saving terminated with success
20733:S 30 Nov 2021 16:05:27.119 # Cluster state changed: fail
20733:S 30 Nov 2021 16:06:07.597 # Cluster state changed: ok
20733:S 30 Nov 2021 16:07:28.534 # Cluster state changed: fail
20733:S 30 Nov 2021 16:07:39.351 # Cluster state changed: ok
20733:S 30 Nov 2021 16:07:39.757 * Starting automatic rewriting of AOF on 1190% growth
20733:S 30 Nov 2021 16:07:39.760 * Background append only file rewriting started by pid 532
20733:S 30 Nov 2021 16:07:39.882 * AOF rewrite child asks to stop sending diffs.
532:C 30 Nov 2021 16:07:39.883 * Parent agreed to stop sending diffs. Finalizing AOF...
532:C 30 Nov 2021 16:07:39.883 * Concatenating 0.07 MB of AOF diff received from parent.
532:C 30 Nov 2021 16:07:39.884 * SYNC append only file rewrite performed
532:C 30 Nov 2021 16:07:39.887 * AOF rewrite: 3 MB of memory used by copy-on-write
20733:S 30 Nov 2021 16:07:39.961 * Background AOF rewrite terminated with success
20733:S 30 Nov 2021 16:07:39.961 * Residual parent diff successfully flushed to the rewritten AOF (0.02 MB)
20733:S 30 Nov 2021 16:07:39.961 * Background AOF rewrite finished successfully
20733:S 30 Nov 2021 16:09:44.500 * FAIL message received from 0c4b89c5756b61b59bdebbc6630d356440fc96e9 about c1f0239774dd6f2a69fa17ddeac1bfa44ef2ec18
20733:S 30 Nov 2021 16:09:44.500 * FAIL message received from 0c4b89c5756b61b59bdebbc6630d356440fc96e9 about 6f95ad87511cb567eb8265503ab4018c133453fa
20733:S 30 Nov 2021 16:09:44.500 # Cluster state changed: fail
20733:S 30 Nov 2021 16:09:45.598 * Clear FAIL state for node c1f0239774dd6f2a69fa17ddeac1bfa44ef2ec18: master without slots is reachable again.

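3. Root cause

3.1 Application high-availability issue

The application uses Lettuce as its Redis client and connection pool, and Lettuce does not enable cluster topology refresh by default. When a failover happens or a node is lost, the cluster topology changes, but Lettuce keeps connecting to the nodes it discovered originally and keeps throwing exceptions.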
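
The effect of a stale topology view can be illustrated with a toy model. This is only a sketch of the idea, not Lettuce's actual implementation: ToyCluster, ToyClient, and the node names are all hypothetical, and "refresh" here simply means reloading the shard-to-node map after a failed call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class StaleTopologyDemo {

    /** Toy cluster: which node serves each shard, and which nodes are alive. */
    static final class ToyCluster {
        final Map<String, String> ownerByShard = new HashMap<>();
        final Set<String> aliveNodes = new HashSet<>();

        void addShard(String shard, String master, String replica) {
            ownerByShard.put(shard, master);
            aliveNodes.add(master);
            aliveNodes.add(replica);
        }

        /** The master dies and its replica is promoted. */
        void failover(String shard, String deadMaster, String promotedReplica) {
            aliveNodes.remove(deadMaster);
            ownerByShard.put(shard, promotedReplica);
        }
    }

    /** Toy client that routes by a topology snapshot taken at startup. */
    static final class ToyClient {
        private final ToyCluster cluster;
        private final boolean refreshOnFailure;
        private Map<String, String> cachedOwners;

        ToyClient(ToyCluster cluster, boolean refreshOnFailure) {
            this.cluster = cluster;
            this.refreshOnFailure = refreshOnFailure;
            this.cachedOwners = new HashMap<>(cluster.ownerByShard);
        }

        /** Returns true if the command reached a live node. */
        boolean call(String shard) {
            String target = cachedOwners.get(shard);
            if (cluster.aliveNodes.contains(target)) {
                return true;
            }
            if (refreshOnFailure) {
                // Reload the view so that the next call is routed correctly
                cachedOwners = new HashMap<>(cluster.ownerByShard);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster();
        cluster.addShard("shard-1", "node-9001", "node-9004");
        ToyClient stale = new ToyClient(cluster, false);
        ToyClient refreshing = new ToyClient(cluster, true);

        cluster.failover("shard-1", "node-9001", "node-9004");
        for (int i = 1; i <= 3; i++) {
            System.out.println("call " + i + ": stale=" + stale.call("shard-1")
                    + ", refreshing=" + refreshing.call("shard-1"));
        }
    }
}
```

In this model the client without refresh fails on every call after the failover, while the refreshing client fails once and then recovers, which mirrors the continuous exceptions described above.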

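3.2 Server issue

The logs consistently show read and write failures against the Redis nodes on one particular server. After checking with the people involved, it turned out that this server is unstable and has caused problems before.

4. Solution

4.1 Application high availability

ClusterTopologyRefreshOptions is the configuration class for topology refresh. Enable adaptive refresh or periodic refresh of the cluster topology, pass the options (through ClusterClientOptions) into the LettuceConnectionFactory, and then set that LettuceConnectionFactory as the ConnectionFactory of the RedisTemplate.

4.2 Server

To deal with the unstable server, migrate the Redis nodes off it.

5. Verifying high availability of the application's Redis connection

5.1 Reference commands for operating the Redis cluster

Clear all Redis data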

rm -rf /usr/local/redis-cluster/9001/data/*
rm -rf /usr/local/redis-cluster/9002/data/*
rm -rf /usr/local/redis-cluster/9003/data/*
rm -rf /usr/local/redis-cluster/9004/data/*
rm -rf /usr/local/redis-cluster/9005/data/*
rm -rf /usr/local/redis-cluster/9006/data/*

Start the Redis instances

/usr/local/bin/redis-server /usr/local/redis-cluster/9001/redis.conf 
/usr/local/bin/redis-server /usr/local/redis-cluster/9002/redis.conf 
/usr/local/bin/redis-server /usr/local/redis-cluster/9003/redis.conf 
/usr/local/bin/redis-server /usr/local/redis-cluster/9004/redis.conf 
/usr/local/bin/redis-server /usr/local/redis-cluster/9005/redis.conf 
/usr/local/bin/redis-server /usr/local/redis-cluster/9006/redis.conf

List the Redis processes

ps -ef|grep redis

Create the cluster

/usr/local/redis-cluster/bin/redis-cli --cluster create <vm-ip>:9001 <vm-ip>:9002 <vm-ip>:9003 <vm-ip>:9004 <vm-ip>:9005 <vm-ip>:9006 --cluster-replicas 1

Open the Redis client

/usr/local/redis-cluster/bin/redis-cli -h <vm-ip> -p 9001
cluster nodes
cluster keyslot <key-name>
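
To see which shard a key lands on without a running cluster, the slot can also be computed locally. Redis Cluster maps a key to one of 16384 slots using CRC16 (XMODEM variant) of the key, or of the hash tag between the first `{` and the next `}` when that tag is non-empty. The sketch below follows that rule using only the JDK; the class name KeySlot is a hypothetical helper, not part of Lettuce or Spring.

```java
import java.nio.charset.StandardCharsets;

final class KeySlot {
    private static final int SLOT_COUNT = 16384;

    /** CRC16/XMODEM over data[from, to): polynomial 0x1021, initial value 0x0000. */
    static int crc16(byte[] data, int from, int to) {
        int crc = 0;
        for (int i = from; i < to; i++) {
            crc ^= (data[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** Slot for a key, honoring a non-empty {hash tag} if one is present. */
    static int slot(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int open = indexOf(bytes, (byte) '{', 0);
        if (open >= 0) {
            int close = indexOf(bytes, (byte) '}', open + 1);
            if (close > open + 1) {
                // Only the bytes between the braces are hashed
                return crc16(bytes, open + 1, close) % SLOT_COUNT;
            }
        }
        return crc16(bytes, 0, bytes.length) % SLOT_COUNT;
    }

    private static int indexOf(byte[] data, byte value, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == value) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Compare the printed values with the output of "cluster keyslot <key-name>"
        for (String key : new String[] {"foo", "test", "foo{hash_tag}", "bar{hash_tag}"}) {
            System.out.println(key + " -> " + slot(key));
        }
    }
}
```

Keys that share a hash tag, such as foo{hash_tag} and bar{hash_tag}, resolve to the same slot, which is useful when choosing test keys that should hit a specific shard during a failover test.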

Kill all Redis processes

ps -ef | grep redis | grep -v grep | awk '{print "kill -9 "$2}' | sh

Add a replica node

/usr/local/redis-cluster/bin/redis-cli --cluster add-node <vm-ip>:9007 <vm-ip>:9002 --cluster-slave --cluster-master-id a9c04d8bae78286a82a59767d79559c553549bbe

5.2 Configuration file

#Redis Configuration
spring.redis.cluster.max-redirects=10
spring.redis.cluster.nodes=<vm-ip>:9001,<vm-ip>:9002,<vm-ip>:9003,<vm-ip>:9004,<vm-ip>:9005,<vm-ip>:9006
spring.redis.timeout=60000ms
#spring.redis.password=123456
spring.redis.lettuce.pool.max-active=1024
spring.redis.lettuce.pool.max-idle=16
spring.redis.lettuce.pool.min-idle=4
spring.redis.lettuce.pool.max-wait=-1ms

5.3 High-availability configuration code

import java.time.Duration;

import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.autoconfigure.data.redis.RedisProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettucePoolingClientConfiguration;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.StringRedisSerializer;

import io.lettuce.core.TimeoutOptions;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;

@Configuration
public class RedisConfig {
    private static final Logger LOG = LoggerFactory.getLogger(RedisConfig.class);

    @Autowired
    private RedisProperties redisProperties;

    @Bean
    @ConditionalOnProperty(prefix = "spring.redis", name = "cluster.nodes")
    public LettuceConnectionFactory redisConnectionFactory() {
        LOG.debug("redisConnectionFactory...");
        RedisClusterConfiguration clusterConfiguration =
                new RedisClusterConfiguration(redisProperties.getCluster().getNodes());
        // Honor spring.redis.cluster.max-redirects when it is set
        Integer maxRedirects = redisProperties.getCluster().getMaxRedirects();
        if (maxRedirects != null) {
            clusterConfiguration.setMaxRedirects(maxRedirects);
        }

        // Apply spring.redis.lettuce.pool.*; a bare GenericObjectPoolConfig would silently use pool defaults
        GenericObjectPoolConfig poolConfig = new GenericObjectPoolConfig();
        RedisProperties.Pool pool = redisProperties.getLettuce().getPool();
        if (pool != null) {
            poolConfig.setMaxTotal(pool.getMaxActive());
            poolConfig.setMaxIdle(pool.getMaxIdle());
            poolConfig.setMinIdle(pool.getMinIdle());
            if (pool.getMaxWait() != null) {
                poolConfig.setMaxWaitMillis(pool.getMaxWait().toMillis());
            }
        }

        // Refresh the cached topology on adaptive triggers (MOVED/ASK redirects, repeated reconnect
        // failures, and so on), at most once every 10 seconds
        ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
                .enableAllAdaptiveRefreshTriggers()
                .adaptiveRefreshTriggersTimeout(Duration.ofSeconds(10L))
                .build();
        ClusterClientOptions clientOptions = ClusterClientOptions.builder()
                .timeoutOptions(TimeoutOptions.enabled(Duration.ofSeconds(30L)))
                .topologyRefreshOptions(topologyRefreshOptions)
                .build();
        LettuceClientConfiguration clientConfiguration = LettucePoolingClientConfiguration.builder()
                .poolConfig(poolConfig)
                .clientOptions(clientOptions)
                .build();
        return new LettuceConnectionFactory(clusterConfiguration, clientConfiguration);
    }

    @Bean
    public RedisTemplate<Object, Object> redisTemplate(RedisConnectionFactory redisConnectionFactory) {
        RedisTemplate<Object, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(redisConnectionFactory);
        template.setKeySerializer(new StringRedisSerializer());
        // RedisObjectSerializer is the project's own serializer class; it is not shown in this article
        template.setValueSerializer(new RedisObjectSerializer());
        return template;
    }
}

5.4 Test code

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.ValueOperations;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest
public class RedisclusterApplicationTests {

    @Autowired
    RedisTemplate<Object, Object> redisTemplate;

    @Test
    public void contextLoads() {
        System.out.println("Starting Redis cluster availability test");

        ValueOperations<Object, Object> operations = redisTemplate.opsForValue();
        operations.set("test", "test");
        int i = 0;
        try {
            // Loop until the test is stopped manually, so nodes can be killed and restarted while it runs
            while (true) {
                Thread.sleep(2000);
                try {
                    System.out.println(operations.get("test"));
                } catch (Exception e) {
                    System.out.println("Connection lost: " + e.getMessage());
                }
                System.out.println(i++);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag before leaving the loop
            Thread.currentThread().interrupt();
            System.out.println("Interrupted");
        }
    }
}

Compiled by 极客之音. Article link: https://www.bmabk.com/index.php/post/93692.html
