Wednesday, November 21, 2012

/bin/bash --norc

/bin/bash --norc /sbin/mkinitrd -f --with=vmxnet --with=vmxnet3 --with=pvscsi /boot/initrd-2.6.18-238.12.1.el5.img 2.6.18-238.12.1.el5

What does that mean?

man bash
       --norc Do not read and execute the personal initialization file ~/.bashrc  if  the  shell  is  interactive. This option is on by default if the shell is invoked as sh.

In the command above, bash runs /sbin/mkinitrd as a script, so the shell is not interactive. The --norc option is then only a safeguard that keeps any ~/.bashrc out of the run.

Tuesday, November 6, 2012

MapR Hadoop


Hadoop is such a nice invention for managing big data. I just got the opportunity to attend a training from MapR and found it really inspiring.

I am going to couple a few tools to manage my Hadoop cluster intelligently. I guess this would be a good combo:

Puppet + Splunk + Hadoop

I guess this will burn a lot of calories for me.

Puppet codes - array and looping


It bothered me at first when I started to code in Puppet: where is my array definition, and how am I going to loop? Here is the trick I found after asking for help in the Puppet users group.

This is the example.

I have declared a variable in /etc/puppet/manifests/site.pp:

$all_nodes = ["node1.hiu.com", "node2.hiu.com", "node3.hiu.com", "node4.hiu.com"]

Here is the code for the looping:

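# Evaluated once per title; $name is the title of the current instance.
define install_package_zookeeper {
        if $fqdn == $name {
                package { "mapr-zookeeper":
                        ensure => present,
                }
        }
}

class mapr::install_zookeeper {
        # An array title declares one instance per element.
        install_package_zookeeper { $all_nodes: }
}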


Basically, here is the explanation: a define can take an array as its title, and Puppet then treats each array item as a separate instance. For example,

install_package_zookeeper { ["node1.hiu.com", "node2.hiu.com", "node3.hiu.com"]: }

is treated like this:

install_package_zookeeper {
        "node1.hiu.com": ;
        "node2.hiu.com": ;
        "node3.hiu.com": ;
}

So Puppet's default behaviour of expanding an array title into separate instances is effectively a loop over the array. lol
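
For readers more at home in shell, the expansion behaves roughly like the loop below. This is only an analogy and not how Puppet evaluates a manifest: the function names are made up, echo stands in for the package resource, and the FQDN is passed in as an argument instead of coming from Facter.

```shell
#!/bin/bash
# Shell analogy for declaring a define with an array title.
# All names here are hypothetical; echo stands in for the package resource.

all_nodes="node1.hiu.com node2.hiu.com node3.hiu.com node4.hiu.com"

# One "instance" of the define: $1 plays the role of $name, $2 of $fqdn.
install_package_zookeeper_for() {
        if [ "$2" = "$1" ]; then
                echo "package mapr-zookeeper: ensure present on $1"
        fi
}

# Puppet creates one instance per array element; shell needs an explicit loop.
apply_on() {
        for node in $all_nodes; do
                install_package_zookeeper_for "$node" "$1"
        done
}

apply_on "node2.hiu.com"
```

Only the instance whose title matches the local FQDN does anything, so a host that is not in the list gets no package at all.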


Writing your own facter code - puppet


The Puppet agent comes with a set of Ruby files that live at /usr/lib/ruby/site_ruby/1.8/facter/*.rb. These Ruby files are what provide the values in facter's output.

[root@test ~]# facter -p
architecture => x86_64
augeasversion => 0.9.0
boardmanufacturer => Intel Corporation
boardproductname => 440BX Desktop Reference Platform
boardserialnumber => None
domain => houston.hp.com
facterversion => 1.6.6
fqdn => test.hiu.com
hardwareisa => x86_64
hardwaremodel => x86_64
hostname => d9t0578g
id => root


If you look in $rubysitedir, /usr/lib/ruby/site_ruby/1.8, you will notice that you can add more Ruby files to define your own facter parameters and values.

[root@test facter]# pwd
/usr/lib/ruby/site_ruby/1.8/facter
[root@d9t0578g facter]# ls
application.rb       kernel.rb                  puppetversion.rb
architecture.rb      kernelrelease.rb           qpk.rb
arp.rb               kernelversion.rb           rawpartition.rb
augeasversion.rb     lsbmajdistrelease.rb       rubysitedir.rb
Cfkey.rb             lsb.rb                     rubyversion.rb
domain.rb            macaddress.rb              selinux.rb
ec2.rb               macosx.rb                  ssh.rb
facterversion.rb     manufacturer.rb            timezone.rb
fqdn.rb              memory.rb                  uniqueid.rb
hardwareisa.rb       netmask.rb                 uptime_days.rb
hardwaremodel.rb     network.rb                 uptime_hours.rb
hostname.rb          operatingsystem.rb         uptime.rb
id.rb                operatingsystemrelease.rb  uptime_seconds.rb
interfaces.rb        osfamily.rb                util
ipaddress6.rb        path.rb                    virtual.rb
ipaddress.rb         physicalprocessorcount.rb  vlans.rb
iphostnumber.rb      processor.rb               xendomains.rb
kernelmajversion.rb  ps.rb


This is how I wrote my simple facter code, called raw_partition.

[root@test facter]# cat rawpartition.rb
# Fact: raw_partition - lists the sd* block devices, excluding sda

Facter.add(:raw_partition) do
        # confine goes outside setcode so the fact is skipped on non-Linux kernels
        confine :kernel => :linux
        setcode do
                Facter::Util::Resolution.exec("cat /proc/partitions | awk '/sd/ {print $4}' | sed -e 's|sd|/dev/sd|g' | grep -v sda")
        end
end

Basically, you can alter the Facter::Util::Resolution.exec line and put in some other meaningful shell command or string to yield a different output.

The output will look like this:

[root@test facter]# facter -p
architecture => x86_64
augeasversion => 0.9.0
boardmanufacturer => Intel Corporation
boardproductname => 440BX Desktop Reference Platform
boardserialnumber => None
.
.
.
raw_partition => /dev/sdd
/dev/sdc
/dev/sdb
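
Before dropping a command into a fact, it helps to try the pipeline on its own. The sketch below runs the same awk/sed/grep chain against a small sample file instead of the real /proc/partitions. The function name list_raw_partitions and the sample data are made up for illustration.

```shell
#!/bin/bash
# Sketch: the raw_partition pipeline, testable outside Facter.
# list_raw_partitions and the sample data are illustrative only.

list_raw_partitions() {
        # $1: file in /proc/partitions format (defaults to the real one)
        awk '/sd/ {print $4}' "${1:-/proc/partitions}" \
                | sed -e 's|sd|/dev/sd|g' \
                | grep -v sda
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
major minor  #blocks  name

   8        0   20971520 sda
   8        1     512000 sda1
   8       16   10485760 sdb
   8       32   10485760 sdc
EOF

list_raw_partitions "$sample"
rm -f "$sample"
```

With the sample data this prints /dev/sdb and /dev/sdc, one per line, because every name containing sda is filtered out.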



Harvesting facter values - puppet_watchdog




This script is meant for harvesting the facter -p values from the Puppet agents. It runs as a long-lived process, pulls data from $PUPPET_YAML_HOME, and then pushes the data into a sqlite db. This script should be set up on the puppetmaster node only.

It is especially useful when you need to do some data analysis on your agents.

At the same time, you can also write your own facter code and retrieve customized information from your agent nodes.

#!/bin/bash
# Hiu, Yen Onn
# This is the watchdog script that will harvest
# all the data of yaml into a db

PUPPET_YAML_HOME="/var/lib/puppet/yaml/node"
PUPPET_WATCHDOG_LOG="/var/log/puppet_watchdogd.log"
HADOOP_DB="/etc/puppet/web/test.db"
LATEST_PUPPET_NODE=""


[[ ! -e $PUPPET_WATCHDOG_LOG ]] && touch $PUPPET_WATCHDOG_LOG

function control_c()
{
        echo -en "\n**** process `basename $0` is terminated. ****\n"
        exit $?
}

function log()
{

        while read data
        do
                echo "[$(date +"%D %T")] $data" >> $PUPPET_WATCHDOG_LOG
        done
}

function harvest_data()
{
        sleep 2
        FILE="$PUPPET_YAML_HOME/$LATEST_PUPPET_NODE"
        SQLITE_BIN="/usr/bin/sqlite3"

        # Pull the facts of interest out of the node's YAML file
        UPTIME=`grep "uptime_days" "$FILE" | awk '{print $2}'|sed -e 's/"//g'`
        PROCESSORCOUNT=`grep "physicalprocessorcount" "$FILE" | awk '{print $2}'|sed -e 's/"//g'`
        CLIENTCERT=`grep clientcert "$FILE" | awk '{print $2}'`
        TIMESTAMP=`grep expiration "$FILE" | awk '{print $2 " " $3}'`

        if [[ -e $HADOOP_DB ]]
        then
                SQL_CMD=""

                # Query if it is a new record
                SQL_CMD_NEW="SELECT hostname FROM PUPPET_FACTER WHERE hostname='$CLIENTCERT';"
                newrecord=`$SQLITE_BIN $HADOOP_DB "$SQL_CMD_NEW"`

                if [[ -z "$newrecord" ]]
                then
                        SQL_CMD="INSERT INTO PUPPET_FACTER (hostname,timestamp,uptime,physicalprocessorcount) \
                                VALUES ('$CLIENTCERT','$TIMESTAMP','$UPTIME','$PROCESSORCOUNT');"
                else
                        SQL_CMD="UPDATE PUPPET_FACTER SET uptime='$UPTIME', \
                                                 physicalprocessorcount='$PROCESSORCOUNT', \
                                                 timestamp='$TIMESTAMP' \
                                WHERE hostname='$CLIENTCERT';"
                fi

                if [[ -z "$SQL_CMD" ]]
                then
                        echo "SQL String is null" | log
                        # return needs a numeric status; "return false" is an error in bash
                        return 1
                fi

                # Run sqlite3 directly so its exit status decides success
                if ! $SQLITE_BIN $HADOOP_DB "$SQL_CMD"
                then
                        return 1
                fi
        else
                echo "Database $HADOOP_DB not found" | log
                exit 1
        fi
}

# Main run
trap control_c SIGINT

while true
do
        # Newest YAML file in the node directory
        NODE=`ls -1t $PUPPET_YAML_HOME | grep -i yaml | head -n1`
        if [[ -n "$NODE" ]] && [[ "$NODE" != "$LATEST_PUPPET_NODE" ]]
        then
                LATEST_PUPPET_NODE=$NODE

                if harvest_data
                then
                        echo -en "Data harvesting from $LATEST_PUPPET_NODE ...... [DONE]\n" | log
                else
                        echo -en "Data harvesting from $LATEST_PUPPET_NODE ...... [FAILED]\n" | log
                fi

        fi

        # Pause between polls so the loop does not spin the CPU
        sleep 1
done
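
The grep/awk/sed lookups in harvest_data all follow the same pattern, so they could be wrapped in one small helper. The sketch below is one possible way to do that. The name yaml_fact is made up, and the sample file is a simplified, invented excerpt: a real node YAML file holds many more keys.

```shell
#!/bin/bash
# Sketch: one helper for the fact lookups done in harvest_data.
# yaml_fact and the sample file are illustrative only.

yaml_fact() {
        # $1: fact name, $2: YAML file. Prints the first matching value, unquoted.
        grep -m1 -E "^[[:space:]]*$1:" "$2" | awk '{print $2}' | sed -e 's/"//g'
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
  parameters:
    uptime_days: "12"
    physicalprocessorcount: "2"
    clientcert: node1.hiu.com
EOF

echo "uptime_days=$(yaml_fact uptime_days "$sample")"
echo "clientcert=$(yaml_fact clientcert "$sample")"
rm -f "$sample"
```

Anchoring the pattern to the start of the key keeps a fact name from matching inside some other key. Adding a new column to the database then only needs one more yaml_fact call.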


Quick and easy init script

Quick and easy init script - puppetwatchdog

This is the init script I developed to launch puppetwatchdog, which gathers all the facter values from the Puppet agents and pushes them into a sqlite3 database for further data analysis.

Puppetwatchdog should be set up on the puppetmaster.

If you want to write a quick and easy daemon script for /etc/init.d/*, you can pretty much take a look at this code.


#!/bin/bash
#
# chkconfig: 2345 90 60
# description: This is a watchdog that collects \
#              all the puppet yaml files and injects them into a db

RETVAL=0
prog="puppet_watchdogd"
exec=/usr/bin/puppet_watchdogd
lockfile=/var/lock/subsys/puppet_watchdogd

# Source function library.
. /etc/rc.d/init.d/functions

start() {
        if [ $UID -ne 0 ]; then
                echo "User has insufficient privilege."
                exit 4
        fi

        [ -x $exec ] || exit 5
        echo -n $"Starting $prog: "
        daemon "$prog 2>&1 &"
        retval=$?
        echo
        [ $retval -eq 0 ] && touch $lockfile
}


stop() {
        if [ $UID -ne 0 ] ; then
                echo "User has insufficient privilege."
                exit 4
        fi

        echo -n $"Stopping $prog: "
        killproc $exec
        retval=$?
        echo
        [ $retval -eq 0 ] && rm -fr $lockfile
}


restart() {
        stop
        start
}

case "$1" in
        start)
                start
        ;;
        stop)
                stop
        ;;
        restart)
                restart
        ;;
        *)
                echo $"Usage: $0 {start|stop|restart}"
                exit 2
        ;;
esac
exit $?
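
Once the script is written, it still has to be registered with the system. The sketch below shows one way to do that on a chkconfig-based distribution. The source file name puppet_watchdogd.init is an assumption, as are the helper names run and install_watchdog_service. Setting DRY_RUN=1 only prints the commands, which is how the example is invoked here.

```shell
#!/bin/bash
# Sketch: install, register and start the init script.
# puppet_watchdogd.init (the local copy of the script) is an assumed file name.

run() {
        # Print the command when DRY_RUN is set, otherwise execute it
        if [ -n "$DRY_RUN" ]; then
                echo "$*"
        else
                "$@"
        fi
}

install_watchdog_service() {
        run install -m 755 puppet_watchdogd.init /etc/init.d/puppet_watchdogd
        run chkconfig --add puppet_watchdogd
        run service puppet_watchdogd start
}

DRY_RUN=1 install_watchdog_service
```

chkconfig --add reads the "# chkconfig: 2345 90 60" header, so the daemon is enabled in runlevels 2 to 5 with start priority 90 and stop priority 60.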