Friday, August 12, 2016
HDFS HA failover script
hi,
Assuming you have HDFS HA enabled, here is a small utility script that performs the failover for you, handily. Enjoy... :)
#!/bin/bash
SUHDFS="sudo -u hdfs hdfs"

nameservice=$($SUHDFS getconf -confKey dfs.nameservices)
echo "Nameservice: $nameservice"

serviceIds=$($SUHDFS getconf -confKey dfs.ha.namenodes.$nameservice | sed 's|,| |g')
is_active=""
is_standby=""
for Id in $serviceIds
do
  namenode_hostname=$($SUHDFS getconf -confKey dfs.namenode.rpc-address.$nameservice.$Id)
  state=$($SUHDFS haadmin -getServiceState $Id)
  if [ "$state" == "active" ]
  then
    is_active="$Id"
  fi
  if [ "$state" == "standby" ]
  then
    is_standby="$Id"
  fi
  echo "Hostname  : $namenode_hostname"
  echo "Service ID: $Id ($state)"
done

echo ""
echo -n "Do you want to do a failover from $is_active (active) -> $is_standby (standby)? [y/n] "
read ans
if [ "$ans" == "y" ]
then
  echo " >> failing over now ...."
  echo " Executing >> hdfs haadmin -failover $is_active $is_standby"
  $SUHDFS haadmin -failover $is_active $is_standby
  if [ "$?" == "0" ]
  then
    echo " >> Done"
  else
    echo " >> Failed"
  fi
else
  echo " >> Exiting ..."
fi
Here is the result.
[root@ip-172-31-17-185 ~]# ./haadmin.sh
Nameservice: nameservice1
Hostname : ip-172-31-17-183.ap-southeast-1.compute.internal:8020
Service ID: namenode22 (standby)
Hostname : ip-172-31-17-184.ap-southeast-1.compute.internal:8020
Service ID: namenode37 (active)
Do you want to do a failover from namenode37 (active) -> namenode22 (standby)?: [y/n]y
>> failing over now ....
Executing >>hdfs haadmin -failover namenode37 namenode22
Failover to NameNode at ip-172-31-17-183.ap-southeast-1.compute.internal/172.31.17.183:8022 successful
>> Done
[root@ip-172-31-17-185 ~]# ./haadmin.sh
Nameservice: nameservice1
Hostname : ip-172-31-17-183.ap-southeast-1.compute.internal:8020
Service ID: namenode22 (active)
Hostname : ip-172-31-17-184.ap-southeast-1.compute.internal:8020
Service ID: namenode37 (standby)
Do you want to do a failover from namenode22 (active) -> namenode37 (standby)?: [y/n]n
>> Exitting ...
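As a lighter-weight alternative to the full script, a quick one-liner can print each NameNode's HA state. A sketch, assuming the nameservice is called nameservice1 as in the session above (adjust it to yours):

```shell
# Print the HA state of every NameNode in the nameservice.
# "nameservice1" is taken from the example output above; substitute your own.
if command -v hdfs >/dev/null 2>&1; then
  for id in $(hdfs getconf -confKey dfs.ha.namenodes.nameservice1 | tr ',' ' '); do
    echo "$id: $(hdfs haadmin -getServiceState "$id")"
  done
else
  echo "hdfs command not found on this host"
fi
```

The `command -v` guard just keeps the snippet from erroring out on a machine without the Hadoop client installed.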
Wednesday, August 10, 2016
Installing Python3.5 on CENTOS 6.8
hi all,
I would like to list down all the steps to install Python 3.5 from source on CentOS 6.8.
1. Install the prerequisite packages to build Python from source.
yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
2. Download and unpack the Python source.
wget https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tgz
tar xvfz Python-3.5.*.tgz
cd Python-3.5.*
3. Compile Python from source.
./configure --prefix=/usr/local --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
make && make altinstall
ln -s /usr/local/bin/python3.5 /usr/bin/python3.5
4. Download and install pip.
wget https://bootstrap.pypa.io/get-pip.py
python3.5 get-pip.py
ln -s /usr/local/bin/pip /usr/bin/pip
5. Voila, you have python3.5 ready.
[root@ip-172-31-18-184 ec2-user]# python3.5
Python 3.5.2 (default, Aug 10 2016, 23:27:34)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
6. Pip is ready too
[root@ip-172-31-18-184 ec2-user]# pip --version
pip 8.1.2 from /usr/local/lib/python3.5/site-packages (python 3.5)
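As an optional sanity check, you can confirm that the extension modules which depend on the -devel packages from step 1 actually compiled in (a missing openssl-devel or sqlite-devel at build time would make these imports fail):

```shell
# Verify the C extension modules built against the -devel packages.
# python3.5 is the interpreter installed in the steps above.
if command -v python3.5 >/dev/null 2>&1; then
  python3.5 -c "import ssl, sqlite3, zlib, bz2; print('extension modules OK')"
else
  echo "python3.5 not on PATH"
fi
```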
Hope it helps.
Tuesday, August 9, 2016
Exploration on Cloudera: managing services without Cloudera Manager
hi,
Cloudera's Hadoop distribution is a wonderful product. Many engineers are curious about how Cloudera controls the Hadoop processes, e.g. how to start/stop a NameNode, ResourceManager, and other services without actually logging in to the Cloudera Manager portal. I really like the way Cloudera engineered the solution: the core of the technology is supervisord. For a more detailed explanation, you can visit the Cloudera documentation website (https://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_intro_primer.html).
The bottom line of this post is to share my findings on how to control the Hadoop processes (start/stop/status) from the command line rather than from the web portal.
I am assuming that you have an up-and-running Cloudera Hadoop cluster. I have installed a bare-minimum set of cluster products, including ZooKeeper, HDFS, and YARN, on AWS with 4 x m4.xlarge instances. That's all.
To start with, I will explore the role of a node.
[root@ip-172-31-17-183 ec2-user]# jps
2572 NameNode
2669 ResourceManager
3562 Jps
From here, I know this node is serving as a NameNode and ResourceManager. That's beautiful. To dig further, I look at the running processes, e.g. the ResourceManager's full command line. What caught my attention is the part I have highlighted: /var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER.
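One way to dump that command line is straight from /proc; a sketch, where 2669 is the ResourceManager PID reported by jps above (substitute the PID from your own node):

```shell
# Dump the full command line of a running process via /proc.
# The arguments in /proc/PID/cmdline are NUL-separated, so translate them to spaces.
PID=2669   # ResourceManager PID from the jps output above; substitute yours
if [ -r /proc/$PID/cmdline ]; then
  tr '\0' ' ' < /proc/$PID/cmdline; echo
else
  echo "no such PID on this host"
fi
```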
/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_resourcemanager -Xmx1000m -Djava.net.preferIPv4Stack=true -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-cmf-yarn-RESOURCEMANAGER-ip-172-31-17-183.ap-southeast-1.compute.internal.log.out -Dyarn.log.file=hadoop-cmf-yarn-RESOURCEMANAGER-ip-172-31-17-183.ap-southeast-1.compute.internal.log.out -Dyarn.home.dir=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/lib/native -classpath /var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/.//*:/usr/share/cmf/lib/plugins/tt-instrumentation-5.8.1.jar:/usr/share/cmf/lib/plugins/event-publish-5.8.1-shaded.jar:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0
.42/lib/hadoop-yarn/lib/*:/var/run/cloudera-scm-agent/process/59-yarn-RESOURCEMANAGER/rm-config/log4j.properties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
Also, from the pstree output I know that the ResourceManager is not started as a classic system daemon; there is a Python script (supervisord) that actually forks the processes.
`-python-+-python2.6
|-python2.6---5*[{python2.6}]
|-java---107*[{java}]
`-java---213*[{java}]
|-python /usr/lib64/cmf/agent/build/env/bin/supervisord
| |-java -Dproc_resourcemanager -Xmx1000m -Djava.net.preferIPv4Stack=true-
| | |-{java}
| | |-{java}
Yes! That's the supervisord I was expecting. I was curious about /var/run/cloudera-scm-agent/ too, so I dug into the directory to find out what it holds.
Surprise, surprise... there is a supervisord.conf configuration file within the directory:
[root@ip-172-31-17-183 supervisor]# cat /var/run/cloudera-scm-agent/supervisor/supervisord.conf
[unix_http_server]
file=%(here)s/supervisord.sock
username=6434554715077552454
password=8561047171289009924
[inet_http_server]
port=127.0.0.1:19001
username=6434554715077552454
password=8561047171289009924
[supervisord]
nodaemon=false
logfile=/var/log/cloudera-scm-agent/supervisord.log
identifier=agent-1626-1470791793
[include]
files = /var/run/cloudera-scm-agent/supervisor/include/*.conf
[supervisorctl]
serverurl=http://127.0.0.1:19001/
username=6434554715077552454
password=8561047171289009924
Aha! Now I know there is a port listening at 19001, with the credentials listed right in the file. It also includes all the sub-config files for the other daemons. I am satisfied, indeed. Next, I want to know more about that port and the supervisord/supervisorctl web UI.
Sure enough, there is a port listening at 19001 on localhost. Perfect!
[root@ip-172-31-17-183 supervisor]# netstat -atun | grep 19001
tcp 0 0 127.0.0.1:19001 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:41185 127.0.0.1:19001 ESTABLISHED
tcp 0 0 127.0.0.1:19001 127.0.0.1:41185 ESTABLISHED
Now I just need to expose this localhost-only port externally via SSH tunnelling. That's easy: just pass the highlighted -L option when you log in.
MacBook-Pro:Downloads yenonn$ ssh -i hadoop.pem -L19001:localhost:19001 ec2-user@ec2-54-179-147-37.ap-southeast-1.compute.amazonaws.com
Last login: Tue Aug 9 21:21:18 2016 from 223.197.191.42
-bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
[ec2-user@ip-172-31-17-183 ~]$
And I am ready to explore the supervisord web UI from my browser. Beautiful! It means I can start up Hadoop services from here. Pretty neat!
If, let's say, you are not a big fan of web UIs, we can use supervisorctl to achieve the same purpose.
[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl
49-cloudera-mgmt-SERVICEMONITOR RUNNING pid 5526, uptime 0:03:36
53-hdfs-NAMENODE RUNNING pid 2572, uptime 0:32:13
59-yarn-RESOURCEMANAGER RUNNING pid 2669, uptime 0:32:13
cmflistener RUNNING pid 1801, uptime 0:32:18
flood RUNNING pid 1991, uptime 0:32:16
I can make use of the supervisord.conf to check status and start/stop the services from here.
[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf status
49-cloudera-mgmt-SERVICEMONITOR RUNNING pid 5526, uptime 0:05:01
53-hdfs-NAMENODE RUNNING pid 2572, uptime 0:33:38
59-yarn-RESOURCEMANAGER RUNNING pid 2669, uptime 0:33:38
cmflistener RUNNING pid 1801, uptime 0:33:43
flood RUNNING pid 1991, uptime 0:33:41
[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf stop 49-cloudera-mgmt-SERVICEMONITOR
49-cloudera-mgmt-SERVICEMONITOR: stopped
[root@ip-172-31-17-183 supervisor]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c /var/run/cloudera-scm-agent/supervisor/supervisord.conf start 49-cloudera-mgmt-SERVICEMONITOR
49-cloudera-mgmt-SERVICEMONITOR: started
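As an aside, supervisord serves its control API over XML-RPC on that same port, so the status/start/stop calls can also be scripted. A minimal sketch in Python, using the credentials found in supervisord.conf above (the values below are the ones from this cluster; yours will differ):

```python
from xmlrpc.client import ServerProxy

# Port and credentials come from
# /var/run/cloudera-scm-agent/supervisor/supervisord.conf on the node.
USER, PASSWORD = "6434554715077552454", "8561047171289009924"
server = ServerProxy("http://%s:%s@127.0.0.1:19001/RPC2" % (USER, PASSWORD))

try:
    # List every process supervisord manages, with its current state,
    # equivalent to "supervisorctl status".
    for proc in server.supervisor.getAllProcessInfo():
        print(proc["name"], proc["statename"])
except OSError:
    print("supervisord is not reachable on this host")
```

Starting or stopping a role would then be `server.supervisor.stopProcess("49-cloudera-mgmt-SERVICEMONITOR")` and `startProcess(...)`, mirroring the supervisorctl commands above.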
Hope you like it. I love Cloudera and will continue to do my exploration!!