Is it possible to monitor Hadoop, HBase and YARN using the monit tool?


I want to monitor some services so that they are restarted automatically whenever they go down, and I found a great tool for this: monit. It works fine for ZooKeeper, because I can match the process on "QuorumPeerMain" in the monitrc file, as shown below:

check process Zookeeper matching "QuorumPeerMain"
        start program = "path/to/zkServer.sh start"
        stop program  = "path/to/zkServer.sh stop"

In the same way, I want to monitor Hadoop, YARN and HBase:

check process Hadoop matching "?"
        start program = "startorstop.sh start"  #equivalent to start-dfs.sh
        stop program  = "startorstop.sh stop"   #equivalent to stop-dfs.sh

What should be written in place of the "?"

My questions are:

  • With Hadoop, any one of NameNode, DataNode or SecondaryNameNode may go down on its own. The monit documentation says that "The top-most matching parent with highest uptime is selected". So if, for example, DataNode goes down, monit still finds NameNode and won't try to restart Hadoop. Another option would be to use a pid file, but I can't find Hadoop's pid file in /var/run/.
  • I want something like a top-down start order (not strictly): the remaining services (HBase, Hadoop and YARN) should start only after ZooKeeper has started.

CodePudding user response:

Thanks to @OneCricketeer's comment, I found a way to start NameNode, DataNode and SecondaryNameNode independently with the hadoop-daemon.sh shell script. So in my monit configuration, the NameNode check looks like this:

check process NameNode matching "NameNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start namenode
    stop program  = "startorstop.sh stop"   #hadoop-daemon.sh stop namenode
    group hadoop
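
One caveat with a bare "NameNode" pattern: monit treats the pattern as a regular expression matched against the process command line, so it can also select the SecondaryNameNode process. Below is a minimal sketch of two tighter alternatives. It assumes the daemons are launched through the stock Hadoop scripts, which pass a -Dproc_<command> flag to the JVM and, by default, write pid files named hadoop-<user>-<command>.pid into HADOOP_PID_DIR (/tmp unless overridden, which may be why nothing shows up in /var/run/). The /path/to/ prefix and the "hadoop" user name are placeholders to adapt, and the exact flag and file names should be checked against the installed Hadoop version.

```
# Alternative 1: match the -Dproc_<command> JVM flag instead of the class name.
# "proc_namenode" does not occur inside "proc_secondarynamenode".
check process NameNode matching "proc_namenode"
    start program = "/path/to/hadoop-daemon.sh start namenode"
    stop program  = "/path/to/hadoop-daemon.sh stop namenode"
    group hadoop

# Alternative 2: use the pid file written by hadoop-daemon.sh.
# Assumes HADOOP_PID_DIR is left at /tmp and the daemon runs as user "hadoop".
check process SecondaryNameNode with pidfile /tmp/hadoop-hadoop-secondarynamenode.pid
    start program = "/path/to/hadoop-daemon.sh start secondarynamenode"
    stop program  = "/path/to/hadoop-daemon.sh stop secondarynamenode"
    group hadoop
```

Running monit procmatch "proc_namenode" on the host prints the processes a pattern would select, which is a quick way to test a pattern before reloading monit.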

For the other part of my question, I found the depends option; see Service Dependencies in the monit documentation for more detail. In my case, I wanted HRegionServer to be restarted whenever DataNode goes down, so the configuration below works:

check process HRegionServer matching "HRegionServer"
    start program = "startorstop.sh start"  #hbase-daemon.sh start regionserver
    stop program =  "startorstop.sh stop"   #hbase-daemon.sh stop regionserver
    depends on DataNode

check process DataNode matching "DataNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start datanode
    stop program =  "startorstop.sh stop"   #hadoop-daemon.sh stop datanode
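
The same two ideas (one check per daemon, plus depends on for ordering) could be extended to the rest of the stack from the question. The following is only a sketch under several assumptions: the Zookeeper, NameNode and DataNode checks shown earlier are already defined, the Hadoop 2 style yarn-daemon.sh and hbase-daemon.sh scripts are in use, and /path/to/ stands in for real absolute paths. The check names HMaster, ResourceManager and NodeManager are illustrative, and the dependency choices are one possible ordering rather than a requirement of these services.

```
# HBase master: needs ZooKeeper and HDFS, so monit starts those first.
check process HMaster matching "HMaster"
    start program = "/path/to/hbase-daemon.sh start master"
    stop program  = "/path/to/hbase-daemon.sh stop master"
    depends on Zookeeper, NameNode

# YARN daemons, one check each so that either can be restarted on its own.
check process ResourceManager matching "ResourceManager"
    start program = "/path/to/yarn-daemon.sh start resourcemanager"
    stop program  = "/path/to/yarn-daemon.sh stop resourcemanager"
    group hadoop

check process NodeManager matching "NodeManager"
    start program = "/path/to/yarn-daemon.sh start nodemanager"
    stop program  = "/path/to/yarn-daemon.sh stop nodemanager"
    depends on ResourceManager
    group hadoop
```

With depends on, monit starts the services a check depends on before the check itself, and stops or restarts the dependent check when one of those services is stopped or restarted. That gives the rough "ZooKeeper first, then everything else" order asked about in the question.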