Tuesday, August 9, 2016

Exploration on Cloudera: managing services without Cloudera Manager

hi,

Cloudera's Hadoop ecosystem product is one of the most wonderful projects ever created. Many engineers have wondered how Cloudera controls the Hadoop processes, e.g. how to start/stop the NameNode, ResourceManager and other services without actually logging in to the Cloudera Manager portal. Actually, I like the way Cloudera engineered the solution: the core of the technology is supervisord. For a more detailed explanation, you can visit the Cloudera documentation website (https://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_intro_primer.html).

The bottom line of this post is to share my findings on how to control the Hadoop processes, e.g. start/stop/status, from the command line rather than from the web portal.

I am assuming that you have an up and running Cloudera Hadoop cluster installed. I installed a bare-minimum set of cluster products, including ZooKeeper, HDFS, and YARN, on the AWS cloud with 4 x m4.xlarge instances. That's all.

To start with, I will explore the roles running on a node.

[root@ip-172-31-17-183 ec2-user]# jps
2572 NameNode
2669 ResourceManager
3562 Jps

From here, I know this node is serving as a NameNode and a ResourceManager. That's beautiful. Next, I dug further into the running processes, e.g. the ResourceManager pid. What caught my attention in the full command line is the path /var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER that shows up in the classpath.
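If you want to pull that command line up yourself, something like the following should print it (2669 is the ResourceManager pid reported by jps above; the exact ps invocation here is just a sketch, adjust it to taste):

[root@ip-172-31-17-183 ec2-user]# ps -ww -o args= -p 2669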

/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_resourcemanager -Xmx1000m -Djava.net.preferIPv4Stack=true -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-cmf-yarn-RESOURCEMANAGER-ip-172-31-17-183.ap-southeast-1.compute.internal.log.out -Dyarn.log.file=hadoop-cmf-yarn-RESOURCEMANAGER-ip-172-31-17-183.ap-southeast-1.compute.internal.log.out -Dyarn.home.dir=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/lib/native -classpath /var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/.//*:/usr/share/cmf/lib/plugins/tt-instrumentation-5.8.1.jar:/usr/share/cmf/lib/plugins/event-publish-5.8.1-shaded.jar:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/lib/*:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER/rm-config/log4j.properties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager

Also, from the pstree output I know that the ResourceManager is not started up as a classic system daemon. There is a python process that actually forks the java processes.
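(For reference, the two snippets below are pstree output; plain pstree gives the compact tree, and something like pstree -a adds the full command-line arguments.)

[root@ip-172-31-17-183 ec2-user]# pstree
[root@ip-172-31-17-183 ec2-user]# pstree -a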

`-python-+-python2.6
              |-python2.6---5*[{python2.6}]
              |-java---107*[{java}]
              `-java---213*[{java}]

  |-python /usr/lib64/cmf/agent/build/env/bin/supervisord
  |   |-java -Dproc_resourcemanager -Xmx1000m -Djava.net.preferIPv4Stack=true-
  |   |   |-{java}
  |   |   |-{java}

Yes! That's the supervisord I was expecting. I was also curious about /var/run/cloudera-scm-agent/, so I dug into the directory to find out what we have there.
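A quick listing is enough to get oriented (just a sketch of the commands I would run; the exact layout may differ between agent versions):

[root@ip-172-31-17-183 ec2-user]# ls /var/run/cloudera-scm-agent/
[root@ip-172-31-17-183 ec2-user]# ls /var/run/cloudera-scm-agent/supervisor/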

Surprise, surprise... there is a supervisord.conf configuration file within the directory:

[root@ip-172-31-17-183 supervisor]# cat /var/run/cloudera-scm-agent/supervisor/supervisord.conf
[unix_http_server]
file=%(here)s/supervisord.sock
username=6434554715077552454
password=8561047171289009924

[inet_http_server]
port=127.0.0.1:19001
username=6434554715077552454
password=8561047171289009924

[supervisord]
nodaemon=false
logfile=/var/log/cloudera-scm-agent/supervisord.log
identifier=agent-1626-1470791793

[include]
files = /var/run/cloudera-scm-agent/supervisor/include/*.conf

[supervisorctl]
serverurl=http://127.0.0.1:19001/
username=6434554715077552454
password=8561047171289009924

Aha! Now I know there is a port open and listening at 19001, and the config has the credentials listed in it. It also pulls in the sub conf files for the other daemons via the [include] section. I am satisfied, indeed. Now, I want to know more about the port and the web UI of supervisord/supervisorctl.
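Since the server URL and the credentials are sitting right there, you could in principle point supervisorctl straight at the port with them, without even touching the config file. A rough sketch (the username/password are the ones from my supervisord.conf above; yours will differ):

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -s http://127.0.0.1:19001 -u 6434554715077552454 -p 8561047171289009924 status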

Sure enough, there is a port listening at 19001 on localhost. Perfect!

[root@ip-172-31-17-183 supervisor]# netstat -atun | grep 19001
tcp        0      0 127.0.0.1:19001             0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:41185             127.0.0.1:19001             ESTABLISHED
tcp        0      0 127.0.0.1:19001             127.0.0.1:41185             ESTABLISHED

Now, I just need to expose localhost to the outside world with SSH tunnelling. That's easy: just pass the -L forwarding option shown below when you log in.

MacBook-Pro:Downloads yenonn$ ssh -i hadoop.pem -L19001:localhost:19001 ec2-user@ec2-54-179-147-37.ap-southeast-1.compute.amazonaws.com
Last login: Tue Aug  9 21:21:18 2016 from 223.197.191.42
-bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
[ec2-user@ip-172-31-17-183 ~]$

And I am ready to explore the supervisord web UI from my browser at http://localhost:19001/, logging in with the username and password from supervisord.conf. Beautiful! It means I can start up Hadoop services from here. Pretty neat!

If, let's say, you are not a big fan of web UIs, we can make use of supervisorctl to achieve the same purpose.

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl
49-cloudera-mgmt-SERVICEMONITOR  RUNNING    pid 5526, uptime 0:03:36
53-hdfs-NAMENODE                 RUNNING    pid 2572, uptime 0:32:13
59-yarn-RESOURCEMANAGER          RUNNING    pid 2669, uptime 0:32:13
cmflistener                      RUNNING    pid 1801, uptime 0:32:18
flood                            RUNNING    pid 1991, uptime 0:32:16

I can also make use of the supervisord.conf to check status and start/stop the services from here.

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf status
49-cloudera-mgmt-SERVICEMONITOR  RUNNING    pid 5526, uptime 0:05:01
53-hdfs-NAMENODE                 RUNNING    pid 2572, uptime 0:33:38
59-yarn-RESOURCEMANAGER          RUNNING    pid 2669, uptime 0:33:38
cmflistener                      RUNNING    pid 1801, uptime 0:33:43
flood                            RUNNING    pid 1991, uptime 0:33:41

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf stop 49-cloudera-mgmt-SERVICEMONITOR
49-cloudera-mgmt-SERVICEMONITOR: stopped

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf start 49-cloudera-mgmt-SERVICEMONITOR
49-cloudera-mgmt-SERVICEMONITOR: started
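
And if you ever need to bounce a role in one go, supervisorctl also has a restart action (again just a sketch, reusing the same role name as above):

[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf restart 49-cloudera-mgmt-SERVICEMONITOR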


Hope you like it. I love Cloudera and will continue my exploration!!
