ActiveMQ Artemis: javax.jms.ConnectionFactory does not receive cluster topology in case of failover

On my Windows machine I set up a local test cluster of two brokers (version 2.19.0): one master, one slave. The ha-policy is replication, and cluster communication goes over JGroups.

master broker.xml:

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>broker1</name>

      <persistence-enabled>true</persistence-enabled>
      
      <connectors>
         <!-- Connector used to be announced through cluster connections and notifications -->
         <connector name="netty-connector">tcp://127.0.0.2:61616</connector>
      </connectors> 


      <graceful-shutdown-enabled>true</graceful-shutdown-enabled>


      <acceptors>
         <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://127.0.0.2:61616</acceptor>

         <!-- STOMP Acceptor. -->
         <acceptor name="stomp">tcp://127.0.0.2:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true;anycastPrefix=/queue/;multicastPrefix=/topic/</acceptor>

      </acceptors>


      <cluster-user>user</cluster-user>
      <cluster-password>pw</cluster-password>
      
      <ha-policy>
         <replication>
            <master>
               <check-for-live-server>true</check-for-live-server>
            </master>
         </replication>   
      </ha-policy>

      <broadcast-groups>
         <broadcast-group name="bg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <connector-ref>netty-connector</connector-ref>
         </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
         <discovery-group name="dg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <refresh-timeout>10000</refresh-timeout>
         </discovery-group>
      </discovery-groups>

      <cluster-connections>
         <cluster-connection name="my-cluster">
            <connector-ref>netty-connector</connector-ref>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <discovery-group-ref discovery-group-name="dg-group1"/>
         </cluster-connection>
      </cluster-connections>

       [...]
   </core>
</configuration>

slave broker.xml:

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>broker2</name>

      <persistence-enabled>true</persistence-enabled>      
      
      <connectors>
         <!-- Connector used to be announced through cluster connections and notifications -->
         <connector name="netty-connector">tcp://127.0.0.3:61616</connector>
         <connector name="master1-netty-connector">tcp://127.0.0.2:61616</connector>
      </connectors>

      <graceful-shutdown-enabled>true</graceful-shutdown-enabled>

      <acceptors>
          <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://127.0.0.3:61616</acceptor>

         <!-- STOMP Acceptor. -->
         <acceptor name="stomp">tcp://127.0.0.3:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true;anycastPrefix=/queue/;multicastPrefix=/topic/</acceptor>
      </acceptors>


      <cluster-user>user</cluster-user>
      <cluster-password>pw</cluster-password>
      
      <!-- failover config -->
      <ha-policy>
         <replication>
            <slave>
               <allow-failback>true</allow-failback>
            </slave>
         </replication>
      </ha-policy>

      <broadcast-groups>
         <broadcast-group name="bg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <connector-ref>netty-connector</connector-ref>
         </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
         <discovery-group name="dg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <refresh-timeout>10000</refresh-timeout>
         </discovery-group>
      </discovery-groups>

      <cluster-connections>
         <cluster-connection name="my-cluster">
            <connector-ref>netty-connector</connector-ref>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <discovery-group-ref discovery-group-name="dg-group1"/>
         </cluster-connection>
      </cluster-connections>
     [...]

   </core>
</configuration>

I left out the default config lines so this doesn't get too long. To reproduce the example, just replace the corresponding sections of a default broker.xml with the ones above.

I am also running a Spring Boot application (2.3.8) that owns a javax.jms.ConnectionFactory and some consumers/producers connected to the Artemis server cluster.

For the initial connection, the application connects to the master broker (a single host).

import javax.jms.ConnectionFactory;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

[...]

    @Bean
    public ConnectionFactory filterConnectionFactory(ArtemisConfig artemisConfig) {
        return new ActiveMQConnectionFactory(
                "tcp://127.0.0.2:61616?ha=true&blockOnDurableSend=false",
                artemisConfig.getUserName(), artemisConfig.getPassword());
    }

My expectation in case of failover:

  • master broker crashes.
  • backup broker becomes live.
  • ConnectionFactory receives the updated cluster topology and connects to the slave broker (the slave's host is received via the topology update).

Reality: it does not. So when failover occurs, the application cannot recover its connections.

I guess the broker config itself should be OK, because the master replicates all addresses to the slave, and in case of failover the slave takes over.
Failback is also working.

Is my setup wrong at any point? And is there a way to log the reception of topology updates?

test-jgroups-jdbc_ping.xml:

<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:org:jgroups"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <TCP recv_buf_size="${tcp.recv_buf_size:5M}"
         send_buf_size="${tcp.send_buf_size:5M}"
         max_bundle_size="64K"
         max_bundle_timeout="30"
         sock_conn_timeout="300"

         timer_type="new3"
         timer.min_threads="4"
         timer.max_threads="10"
         timer.keep_alive_time="3000"
         timer.queue_max_size="500"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="10000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="discard"/>
         
   <!--  <TRACE/> -->
    
    <!-- <SSL_KEY_EXCHANGE
        keystore_name="./activemq.example.keystore"
        keystore_password="activemqexample"
    /> -->

    <JDBC_PING connection_url="jdbc:postgresql://127.0.0.1:5432/test"
               connection_username="test"
               connection_password="test"
               connection_driver="org.postgresql.Driver"
               initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200),bind_addr varchar(200),created timestamp DEFAULT CURRENT_TIMESTAMP,cluster_name varchar(200),ping_data BYTEA,constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name))"/>
    
    <MERGE3  min_interval="10000"
             max_interval="30000"/>
    <FD_SOCK/>
    <FD timeout="3000" max_tries="3" />
    <VERIFY_SUSPECT timeout="1500"  />
    <BARRIER />
    <pbcast.NAKACK2 use_mcast_xmit="false"
                    discard_delivered_msgs="true"/>
    <UNICAST3 />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"
                view_bundling="true"/>
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K"  />
    <!--RSVP resend_interval="2000" timeout="10000"/-->
    <pbcast.STATE_TRANSFER/>
</config>

master logs:

2021-11-10 16:58:53,209 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
2021-11-10 16:58:53,330 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: live Message Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=data/journal,bindingsDirectory=data/bindings,largeMessagesDirectory=data/large-messages,pagingDirectory=data/paging)
2021-11-10 16:59:15,465 INFO  [org.apache.activemq.artemis.core.server] AMQ221013: Using NIO Journal
2021-11-10 16:59:15,528 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx). being defined as 536.870.912
2021-11-10 16:59:20,988 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2021-11-10 16:59:20,990 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2021-11-10 16:59:20,990 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2021-11-10 16:59:20,991 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2021-11-10 16:59:20,992 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2021-11-10 16:59:20,992 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2021-11-10 16:59:22,317 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address DLQ supporting [ANYCAST]
2021-11-10 16:59:22,328 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue DLQ on address DLQ
2021-11-10 16:59:22,393 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address ExpiryQueue supporting [ANYCAST]
2021-11-10 16:59:22,394 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue ExpiryQueue on address ExpiryQueue
2021-11-10 16:59:24,953 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started NIO Acceptor at 127.0.0.2:61616 for protocols [CORE,MQTT,AMQP,HORNETQ,STOMP,OPENWIRE]
2021-11-10 16:59:24,972 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started NIO Acceptor at 127.0.0.2:61613 for protocols [STOMP]
2021-11-10 16:59:24,973 INFO  [org.apache.activemq.artemis.core.server] AMQ221007: Server is now live
2021-11-10 16:59:24,973 INFO  [org.apache.activemq.artemis.core.server] AMQ221001: Apache ActiveMQ Artemis Message Broker version 2.19.0 [broker1, nodeID=1a4624d9-423f-11ec-b875-40167e37963a] 
2021-11-10 16:59:25,776 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2021-11-10 16:59:26,092 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2021-11-10 16:59:26,267 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\journal\activemq-data-2.amq (size=10.485.760) to replica.
2021-11-10 16:59:27,597 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\bindings\activemq-bindings-3.bindings (size=1.048.576) to replica.
2021-11-10 16:59:27,628 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\bindings\activemq-bindings-2.bindings (size=1.048.576) to replica.
2021-11-10 16:59:28,338 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2021-11-10 16:59:28,355 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2021-11-10 16:59:28,358 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.14.0
2021-11-10 16:59:28,365 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2021-11-10 16:59:28,376 INFO  [io.hawt.web.auth.LoginRedirectFilter] Hawtio loginRedirectFilter is using 1800 sec. HttpSession timeout
2021-11-10 16:59:28,394 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2021-11-10 16:59:28,403 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/D:/apache-artemis-2.19.0-bin/apache-artemis-2.19.0/bin/master1/etc/\jolokia-access.xml]
2021-11-10 16:59:29,401 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://127.0.0.2:8261
2021-11-10 16:59:29,401 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://127.0.0.2:8261/console/jolokia
2021-11-10 16:59:29,403 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://127.0.0.2:8261/console

slave logs:

2021-11-10 16:58:55,465 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
2021-11-10 16:58:55,597 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: backup Message Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=data/journal,bindingsDirectory=data/bindings,largeMessagesDirectory=data/large-messages,pagingDirectory=data/paging)
2021-11-10 16:58:55,625 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\journal to D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\journal\oldreplica.1
2021-11-10 16:58:55,631 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too many old replicated folders upon startup, removing D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging\oldreplica.22
2021-11-10 16:58:55,721 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging to D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging\oldreplica.24
2021-11-10 16:58:55,847 INFO  [org.apache.activemq.artemis.core.server] AMQ221013: Using NIO Journal
2021-11-10 16:58:55,955 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx). being defined as 536.870.912
2021-11-10 16:58:56,419 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2021-11-10 16:58:56,683 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2021-11-10 16:58:58,188 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2021-11-10 16:58:58,210 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2021-11-10 16:58:58,214 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.14.0
2021-11-10 16:58:58,223 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2021-11-10 16:58:58,234 INFO  [io.hawt.web.auth.LoginRedirectFilter] Hawtio loginRedirectFilter is using 1800 sec. HttpSession timeout
2021-11-10 16:58:58,258 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2021-11-10 16:58:58,273 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/D:/apache-artemis-2.19.0-bin/apache-artemis-2.19.0/bin/backup1/etc/\jolokia-access.xml]
2021-11-10 16:58:59,913 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://127.0.0.4:8261
2021-11-10 16:58:59,913 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://127.0.0.4:8261/console/jolokia
2021-11-10 16:58:59,915 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://127.0.0.4:8261/console
2021-11-10 16:59:06,946 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2021-11-10 16:59:06,947 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2021-11-10 16:59:06,948 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2021-11-10 16:59:06,948 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2021-11-10 16:59:06,949 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2021-11-10 16:59:06,949 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2021-11-10 16:59:20,433 WARNING [org.jgroups.protocols.TCP] JGRP000032: D114336-63392: no physical address for 53866421-fef5-9b4b-fdcb-ed1196697da8, dropping message
2021-11-10 16:59:25,161 INFO  [org.apache.activemq.artemis.core.server] AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.19.0 [null] started, waiting live to fail before it gets active
2021-11-10 16:59:28,533 INFO  [org.apache.activemq.artemis.core.server] AMQ221024: Backup server ActiveMQServerImpl::name=broker2 is synchronized with live server, nodeID=1a4624d9-423f-11ec-b875-40167e37963a.
2021-11-10 16:59:28,679 INFO  [org.apache.activemq.artemis.core.server] AMQ221031: backup announced

To be clear, I am using JGroups because this setup will later run in a Kubernetes environment and I wanted to test it. I first tried the standard UDP configuration, but I did not receive the topology with that either, so my mistake must be somewhere else.

CodePudding user response:

I believe the problem is that you need reconnectAttempts > 0 in your URL (e.g. reconnectAttempts=25). The default value of reconnectAttempts is 0, which means the client will not try to reconnect when the connection fails, so it never gets the chance to fail over to the backup.
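
For example, the URL from the question with only the reconnect parameter added:

tcp://127.0.0.2:61616?ha=true&reconnectAttempts=25&blockOnDurableSend=false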

Also, if you didn't receive a topology update when you connected, you'd see an ActiveMQConnectionTimedOutException with a message like this:

Timed out waiting to receive cluster topology.

I also recommend that you specify the backup's host & port in your URL along with the primary's host & port. That way, if the primary is down when the application starts, it will still be able to connect to the backup. Since you only have the primary's host & port, the connection will fail any time the primary is down when the application starts, or even when the application just re-creates the ActiveMQConnectionFactory. Here's a simple example:

@Bean
public ConnectionFactory filterConnectionFactory(ArtemisConfig artemisConfig) {
    return new ActiveMQConnectionFactory(
                    "(tcp://127.0.0.2:61616,tcp://127.0.0.3:61616)?ha=true&reconnectAttempts=25&blockOnDurableSend=false",
                    artemisConfig.getUserName(), artemisConfig.getPassword());
}
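
Regarding logging the reception of topology updates: here is a minimal sketch, assuming you use the core client API underneath the JMS factory (ActiveMQConnectionFactory exposes getServerLocator(), and the locator accepts a ClusterTopologyListener). The helper name logTopologyUpdates is hypothetical:

import org.apache.activemq.artemis.api.core.client.ClusterTopologyListener;
import org.apache.activemq.artemis.api.core.client.TopologyMember;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

[...]

    // Hypothetical helper: registers a listener that prints every topology
    // update the client receives (node up/down, live and backup connectors).
    public static void logTopologyUpdates(ActiveMQConnectionFactory factory) {
        factory.getServerLocator().addClusterTopologyListener(new ClusterTopologyListener() {
            @Override
            public void nodeUP(TopologyMember member, boolean last) {
                System.out.println("Topology update, node up: live=" + member.getLive()
                        + ", backup=" + member.getBackup() + ", last=" + last);
            }

            @Override
            public void nodeDown(long eventUID, String nodeID) {
                System.out.println("Topology update, node down: " + nodeID);
            }
        });
    }

Alternatively, the core client logs topology handling at debug/trace level, so raising the client's log level (e.g. logging.level.org.apache.activemq.artemis.core.client=TRACE in a Spring Boot application) should also show topology updates arriving.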