Thread deadlock in SpringBoot Infinispan application

Time:11-19

We have a Spring Boot REST application running with Infinispan 13.0.12 caches, and periodically, at seemingly random times, the application becomes unresponsive. A thread dump shows over 200 threads in this state:

"http-nio-8080-exec-379" #11999 daemon prio=5 os_prio=0 tid=0x00007f28900f9800 nid=0x2c68 waiting on condition [0x00007f28485c2000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c09af3e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
    at org.jgroups.util.Credit.decrementIfEnoughCredits(Credit.java:65)
    at org.jgroups.protocols.UFC.handleDownMessage(UFC.java:119)
    at org.jgroups.protocols.FlowControl.down(FlowControl.java:323)
    at org.jgroups.protocols.FlowControl.down(FlowControl.java:317)
    at org.jgroups.protocols.FRAG3.down(FRAG3.java:139)
    at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:927)
    at org.jgroups.JChannel.down(JChannel.java:645)
    at org.jgroups.JChannel.send(JChannel.java:484)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.send(JGroupsTransport.java:1161)

Our Java configuration looks like this:

@Autowired
@Bean
public SpringEmbeddedCacheManagerFactoryBean springEmbeddedCacheManagerFactoryBean(GlobalConfigurationBuilder gcb, ConfigurationBuilder configurationBuilder) {
    SpringEmbeddedCacheManagerFactoryBean springEmbeddedCacheManagerFactoryBean = new SpringEmbeddedCacheManagerFactoryBean();
    springEmbeddedCacheManagerFactoryBean.addCustomGlobalConfiguration(gcb);
    springEmbeddedCacheManagerFactoryBean.addCustomCacheConfiguration(configurationBuilder);
    return springEmbeddedCacheManagerFactoryBean;
}

@Autowired
@Bean
public EmbeddedCacheManager defaultCacheManager(SpringEmbeddedCacheManager springEmbeddedCacheManager) throws Exception {
    return springEmbeddedCacheManager.getNativeCacheManager();
}

@Bean
public GlobalConfigurationBuilder globalConfigurationBuilder() {
    GlobalConfigurationBuilder result = GlobalConfigurationBuilder.defaultClusteredBuilder();
    
    result.transport().addProperty("configurationFile", jgroupsConfigFile);
    
    result.cacheManagerName(IDENTITY_CACHE);
    result.defaultCacheName(IDENTITY_CACHE + "-default");

    result.serialization()
            .marshaller(new JavaSerializationMarshaller())
            .allowList()
            .addClasses(
                    LinkedMultiValueMap.class, 
                    String.class
                );
    
    result.globalState().enable().persistentLocation(DATA_DIR);
                
    return result;      
}

@Bean
public ConfigurationBuilder configurationBuilder() {
    ConfigurationBuilder result = new ConfigurationBuilder();
    result.clustering().cacheMode(CacheMode.REPL_SYNC);
    return result;
}

@Bean
public org.infinispan.configuration.cache.Configuration cacheConfiguration() {
    ConfigurationBuilder builder = new ConfigurationBuilder();

    return builder
            .clustering()
                .cacheMode(CacheMode.REPL_SYNC)
                .remoteTimeout(replicationTimeoutSeconds, TimeUnit.SECONDS)
                .stateTransfer().timeout(stateTransferTimeoutMinutes, TimeUnit.MINUTES)
            
            .persistence()
                .addSoftIndexFileStore()
                .shared(false)
                .fetchPersistentState(true)

            .expiration().lifespan(expirationHours, TimeUnit.HOURS)
            
            .build();
}

@Autowired
@Bean
public Cache<String, MultiValueMap<String, String>> identityCache(EmbeddedCacheManager manager, org.infinispan.configuration.cache.Configuration cacheConfiguration) throws IOException {
    Cache<String, MultiValueMap<String, String>> result = manager
            .administration().withFlags(CacheContainerAdmin.AdminFlag.VOLATILE)
            .getOrCreateCache(IDENTITY_CACHE, cacheConfiguration);
    result.getAdvancedCache().getStats().setStatisticsEnabled(true);
    return result;
}

We run a three-node cluster with the default-jgroups-udp.xml configuration. Can anyone suggest a likely cause? Is the configuration perhaps suboptimal?

TIA

CodePudding user response:

You have a replicated cache. This means that reads are always local, so the stack trace must be on a write (or rebalance).

The block means that the sender is waiting for credits from a receiver. Those credits never arrive, so the receiver must itself be stuck on something. Also, the stack trace is not complete; can you show the entire trace?
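To make the credit mechanism concrete, here is a minimal sketch in plain Java. It is a simplified model and not the JGroups implementation: the class CreditSketch, the replenish method, the byte counts and the timeouts are all hypothetical, and only the method name decrementIfEnoughCredits is borrowed from the stack trace. The idea it illustrates is that a sender may only send while it holds credits for the receiver, and it parks (with a timeout) when they run out, until the receiver grants more.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Simplified, hypothetical model of sender-side flow-control credits. */
public class CreditSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private long credits; // bytes the sender may still send to one receiver

    public CreditSketch(long initialCredits) {
        this.credits = initialCredits;
    }

    /**
     * Consumes `bytes` credits if available, otherwise parks for up to
     * timeoutMs waiting for a replenishment. Returns false on timeout
     * or interruption, in which case nothing is consumed.
     */
    public boolean decrementIfEnoughCredits(long bytes, long timeoutMs) {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (credits < bytes) {
                if (remaining <= 0) {
                    return false; // the receiver never sent more credits
                }
                // This is the TIMED_WAITING (parking) state seen in the dump.
                remaining = creditsAvailable.awaitNanos(remaining);
            }
            credits -= bytes;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Called when the receiver grants more credits after processing messages. */
    public void replenish(long bytes) {
        lock.lock();
        try {
            credits += bytes;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        CreditSketch toReceiver = new CreditSketch(100);
        System.out.println(toReceiver.decrementIfEnoughCredits(80, 50)); // true: 100 >= 80
        System.out.println(toReceiver.decrementIfEnoughCredits(80, 50)); // false: only 20 left, receiver silent
        toReceiver.replenish(100);                                       // receiver catches up
        System.out.println(toReceiver.decrementIfEnoughCredits(80, 50)); // true: 120 >= 80
    }
}
```

In this model, a receiver that is stuck never calls replenish, so every sender ends up looping in the timed wait, which would match 200 request threads parked in the same frame.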

To know what's going on, it would be good to see thread dumps of all members. I suggest zipping them up and posting a link to the zip here... Cheers
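As a quick triage step alongside the full dumps, each member could also report how many of its threads are parked in the flow-control frame. The following is only a sketch using the standard Thread.getAllStackTraces() call; the class name BlockedSenderCount and the method countThreadsIn are hypothetical, and the frame being searched for is taken from the stack trace in the question. It does not replace real thread dumps, since it says nothing about what the receiver is stuck on.

```java
import java.util.Arrays;

/** Hypothetical helper: counts live threads that have a given frame on their stack. */
public class BlockedSenderCount {

    /** Returns the number of live threads whose stack contains className.methodName. */
    public static long countThreadsIn(String className, String methodName) {
        return Thread.getAllStackTraces().values().stream()
                .filter(stack -> Arrays.stream(stack).anyMatch(frame ->
                        frame.getClassName().equals(className)
                                && frame.getMethodName().equals(methodName)))
                .count();
    }

    public static void main(String[] args) {
        // Frame taken from the stack trace posted in the question.
        long parked = countThreadsIn("org.jgroups.util.Credit", "decrementIfEnoughCredits");
        System.out.println("Threads waiting for flow-control credits: " + parked);
    }
}
```

A member that reports zero such threads while the others report many would be a plausible first candidate for the stuck receiver, though only its full dump can confirm that.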

CodePudding user response:

Could this be related to https://issues.redhat.com/browse/ISPN-14260 ?

Are you using ACL cache authorisations?

Our temporary fix was to disable cache authorisation, which resolved all our locking issues. We are waiting for the patch for 13 and 14 to be finalised.
