Wednesday, August 3, 2022

Golang is blissful and painful at the same time.

Coming from python background, when started to learn Golang, I found these are the things are painful.

* Golang deletes a item in a slice, map or array

* Golang detecting a item exists in a slice, map or array

* Golang managing error handling, keep repeating writing the boilerplate code e.g. if error != nil statement.

* Golang does not comes with [list|tuple,dictionary]-comprehension techniques. Every operation has to be on the boilerplate codes on native for loop.

* Golang does not comes with inheritance but it does come with concept of composition. 

* Golang manages string, runes in a some complicated way.

 

On the contrary, Golang does showcasing it real power in certain area when solving problem.

* Golang go-routine and channels. Way impressive and powerful than many programming.

* Golang duck typing - interface concept, aka, if a animal walk like a duck, sound like a duck, then the animal is most likely a duck! 

* Golang is able to compile as a single binary for several platforms. wow! this is the most selling feature in the microservice environment.

* Golang testing framework is robust!

* Golang is performant!


Golang is well-received by community, and a lot of good engines are developed on top of Golang, it proves that Golang able to cope with large scale of technologies. Moreover, Golang is easy to learn, it makes the adoption of Golang in a lot of projects becoming even more seamless. With that said, even with all the pains listed above, the benefit is clearly outweigh the pains. 

Hopefully, Golang development team will have put those in mind in the upcoming release to solve the simple problems. That's would make a lot of developers even love Golang. Go! Golang.

Monday, July 25, 2022

Golang handling os.signal with goroutines

Learning goroutines and practising with `os.Signal`

This is the small snippet of golang codes that showing how goroutines work perfectly.

package main


import (

"fmt"

"os"

"os/signal"

"syscall"

"time"

)


func signalHandler(signal os.Signal) {

switch signal {

case syscall.SIGHUP:

fmt.Println("Signal:", signal.String())

case syscall.SIGINT:

fmt.Println("Signal:", signal.String())

case syscall.SIGTERM:

fmt.Println("Signal:", signal.String())

case syscall.SIGQUIT:

fmt.Println("Signal:", signal.String())

default:

fmt.Println("Unhandled/unknown signal")

}

}


func keepDoingStuffs() {

for {

fmt.Printf("Starting job...")

time.Sleep(time.Second * 5)

fmt.Println("completed.")

}

}


func printNumbers() {

for {

time.Sleep(time.Second * 1)

fmt.Printf(".")

}

}


func trapTimeout(done chan bool) {

for {

select {

case <-time.After(time.Second * 20):

fmt.Println("Timeout.")

done <- true

case <-done:

return

}

}

}


func trapSignal(done chan bool) {

sigc := make(chan os.Signal, 1)

defer close(sigc)

signal.Notify(sigc, syscall.SIGINT, syscall.SIGHUP, syscall.SIGTERM, syscall.SIGQUIT)

for {

select {

case s := <-sigc:

signalHandler(s)

done <- true

case <-done:

return

}

}

}


func main() {

done := make(chan bool)

defer close(done)


go trapSignal(done)

go trapTimeout(done)

go keepDoingStuffs()

go printNumbers()


if <-done {

fmt.Println("Exit gracefully.")

return

}

}


Monday, May 23, 2022

Part 1: Compiling elasticsearch from source

I recently came across some complication where I need to change some of the codes in the elasticsearch and compile them into the binary tar.gz, build a docker image and consume it. It is rarely out there in term of documentation, so I decide to share my findings here.


1. Getting the source codes from elasticsearch. I believe this should be easy, and no explanation needed here.

https://github.com/elastic/elasticsearch

For example, I am looking for the latest 8.2 release, so I will checkout the 8.2 branch from now.

2. Install the right Java version. Here, from the builddoc, it says JAVA17 is needed. So, get ready your gear with Java 17, and set a JAVA_HOME pointing to the right version.

❯ update-alternatives --config java 

There are 3 choices for the alternative java (providing /usr/bin/java). Selection Path Priority Status ------------------------------------------------------------ 

* 0 /usr/lib/jvm/java-17-openjdk-amd64/bin/java 1711 auto mode 

1 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 manual mode 

2 /usr/lib/jvm/java-17-openjdk-amd64/bin/java 1711 manual mode

manual mode Press to keep the current choice[*], or type selection number: ^C

3. Fun begins now, from the source codes itself, you can see a gradlew build tool is provided. Digging more onto it, you will discover to build a linux version of binary, you need to run a command like this.

./gradlew :distribution:archives:linux-tar:assemble 

Then end result will be like this,

❯ ./gradlew :distribution:archives:linux-tar:assemble 

Starting a Gradle Daemon (subsequent builds will be faster) ======================================= 

Elasticsearch Build Hamster says Hello! 

Gradle Version : 7.4.2 

OS Info : Linux 5.13.0-41-generic (amd64) 

JDK Version : 17.0.3 (Private Build) 

JAVA_HOME : /usr/lib/jvm/java-17-openjdk-amd64 

Random Testing Seed : D75C9C369C262E41 In FIPS 140 mode : false ======================================= 

> Task :server:compileJava 

Note: Some input files use or override a deprecated API. 

Note: Recompile with -Xlint:deprecation for details. 

Note: Some input files use or override a deprecated API that is marked for removal. 

Note: Recompile with -Xlint:removal for details. 

 BUILD SUCCESSFUL in 2m 18s 499 actionable tasks: 6 executed, 493 up-to-date 


4. Horray! Now you have your build success. You can get the binary tar.gz at here.

❯ ls -al distribution/archives/linux-tar/build/distributions/elasticsearch-8.2.1-SNAPSHOT-linux-x86_64.tar.gz


Friday, December 31, 2021

Typescript: Interface with function

 Typescript: interface


interface Owner {

  name: string;

  gender: string;

}

interface Vehicle {

  name: string;

  year: Date;

  broken: boolean;

  owner: Owner;

  summary: () => string;

}


const civic: Vehicle = {

  name: "civic",

  year: new Date(),

  broken: true,

  owner: { name: "yenonn", gender: "male" },

  summary: function (): string {

    return `summary: name ${this.name} year: ${this.year} broken: ${this.broken} owned by ${this.owner.name}`;

  },

};


console.log(civic.summary());


Typescript: 5 ways to declare a function

 There are different ways to write a function in typescript


//Method 1

function func0(age: number) {

  console.log(age);

}

func0(0);


//Method 2

const func1 = (age: number) => {

  console.log(age);

};

func1(1);


//Method 3

const func2 = function (age: number): void {

  console.log(age);

};

func2(2);


//Method 4

const func3: (age: number) => void = function (age: number): void {

  console.log(age);

};

func3(3);


//Method 5

const func4: (age: number) => void = (age: number): void => {

  console.log(age);

};

func4(4);


Wednesday, March 25, 2020

Releasing TIME_WAIT connection

If you are noticing that heavy connection piling up on a port and never released. Please try this kernel tuning

# Network tunning
net.ipv4.tcp_fin_timeout = 35
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 35
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1 

[root@host1 hiuy]# netstat -atunlp | grep TIME
tcp        0      0 10.108.194.225:8443     10.119.209.8:62844      TIME_WAIT   -
tcp        0      0 10.108.194.225:8443     10.119.209.8:62735      TIME_WAIT   -
tcp        0      0 10.108.194.225:8443     10.119.209.8:62771      TIME_WAIT   -
tcp        0      0 10.108.194.225:8443     10.119.209.8:62784      TIME_WAIT   -
tcp        0      0 10.108.194.225:8443     10.119.209.8:62818      TIME_WAIT   -
tcp        0      0 10.108.194.225:8443     10.119.209.8:62833      TIME_WAIT   -

Sunday, November 17, 2019

Exposing system journal to HTTP REST endpoint.

Journal log naturally printing a lot of useful information, ranging from system kernel, critical, information, and etc log. If I would able to expose those log files onto a HTTP rest endpoint, then I could easily query all of the status of server without login to it. How cool, isn't it? Using the Flask and exposing any port(e.g. 5577).

from flask import Response
from flask import Flask
import json
import subprocess
import socket

app = Flask(__name__)

@app.route("/journal.json", methods = ['GET'])
def get_journal():
  journal_dict = {}
  cmd = "sudo journalctl -k -S today -o json"
  (output, _) = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, encoding='utf-8').communicate()
  json_list = [line.strip() for line in output.split("\n") if len(line) != 0]
  journal_dict["journal"] = json_list
  js = json.dumps(journal_dict)
  resp = Response(js, status=200, mimetype='application/json')
  resp.headers['Link'] = 'http://{}'.format(socket.gethostname())
  return resp

def main(port=5577):
  app.run(host='0.0.0.0', port=port)

if __name__ == '__main__':
  main()

Monday, November 4, 2019

Namenode migration in CDH cluster

  • Stop Namenode when they in the standby mode.
  • Backup the `dfs.name.dir, dfs.namenode.name.dir` directory.
  • Make sure the backup is restored at the same location on the target Namenode host and permission preserved.
  • Usually the permission is `hdfs:hadoop` if in case of missing it.
  • There are 5 important Namenode setting that need to be inherited to the target Namenode
    1. `dfs.ha.automatic-failover.enabled` checked
    2. `NameNode Nameservice`, usually it is `nameservice1`. But however it could vary depends on the case. It is good to record it before hand.
    3. `Mount Points`, ditto as per above
    4. `Quorum-based Storage Journal name` ditto as per above
    5. Java Opts setting
  • Namenode migration will be resonating with failover controller as well. Make sure they are migrated at the same time, same node.
  • Once we are ready for all the backup and recorded information, then we are good to kick the tyre get rolling.
  • First of all, delete the "Namenode(Standby)" role and "Failover Controller" from the host.
  • Add new role > select "Namenode" and "Failover Controller" to the new host.
  • Make sure all above mentioned Namenode settings are in place.
  • Close your palms and pray when you hit the start button on both. Should you start with Failover Controller first, later followed by Namenode.
  • Now you are relief once Namenode and Failover Controller started, you should now followed with a series of services restart.
  • Do a rolling restart on Data nodes 
  • Do a rolling restart on Hive Metastore Server
  • Do a rolling restart on Hive server
  • Do a rolling restart on Node Manager
  • Do a rolling restart on Resource Manager
  • Do a rolling restart on Oozie
  • Do a rolling restart on Httpfs
  • Do a rolling restart on Journal Nodes
  • Lastly Namenode
  • Show is completed. The End

Tuesday, November 20, 2018

Hadoop utils: YML to XML parser

Dealing with XML files especially those from hadoop configuration is really painful. So, I have an idea to keep all the configurations in the YAML format, and write a parser to convert them into XML format.

e.g. a hadoop-site.yml file

---
dfs.name.dir : /var/local/hadoop/hdfs/name
dfs.data.dir : /var/local/hadoop/hdfs/data 
dfs.heartbeat.interval : 3
dfs.datanode.address : 0.0.0.0:1004
dfs.datanode.http.address : 0.0.0.0:1006 | Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.

It will later been converted to hadoop-site.xml


 
    dfs.datanode.http.address
    0.0.0.0:1006
    Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
 
 
    dfs.datanode.address
    0.0.0.0:1004
 
 
    dfs.data.dir
    /var/local/hadoop/hdfs/data
 
 
    dfs.heartbeat.interval
    3
 
 
    dfs.name.dir
    /var/local/hadoop/hdfs/name
 


       
Here is the small python codes that I wrote to do the conversion

import yaml
from xml.etree.ElementTree import Element, SubElement, Comment
from xml.etree import ElementTree
from xml.dom import minidom


def prettify(element):
  rough_string = ElementTree.tostring(element, 'utf-8')
  reparsed = minidom.parseString(rough_string)
  return reparsed.toprettyxml(indent="  ")


def read_yaml_file(yaml_file):
  with open(yaml_file, "r") as file:
    site_config = yaml.load(file)
    return site_config


def generate_xml(yaml_file):
  config = read_yaml_file(yaml_file)
  _top = Element('configuration')
  for key, values in config.iteritems():
    _property = SubElement(_top, 'property')
    _name = SubElement(_property, 'name')
    _value = SubElement(_property, 'value')
    if "|" in str(values).split():
      [value, description] = values.split("|")
    else:
      value = values
      description = ""
    _name.text = key
    _value.text = str(value)
    if description:
      _description = SubElement(_property, 'description')
      _description.text = description
  xml_file = yaml_file.split(".")[0] + ".xml"
  with open(xml_file, "w") as file:
    file.write(prettify(_top))


To test it you can import function and use it like this

from hadoop_xml_parser import generate_xml

if __name__ == '__main__':
  generate_xml("hadoop-site.yml")
                                                                                       

Friday, July 13, 2018

Part 2: Docker networking domain sharing

Following up this post, I am giving out the solution to the problem.

Please revise the alipapa.yml, on the network part

networks:
  papa.com:
    driver: bridge 

If I want to give a name to the network bridge, can I do thing like this?

networks:
  papa.com:
    driver: bridge
    name: papa.com

Let's try out, and docker-compose it up!

ubuntu@ip-172-31-11-243:~$ docker-compose -f alipapa.yml up -d
Creating network "papa.com" with driver "bridge"
Recreating ali01 ... done
Recreating ali02 ... done

I can sense the smell of success. But, let's find out more.

ubuntu@ip-172-31-11-243:~$ docker exec -ti ali01 bash
root@ali01:/# hostname -f
ali01.papa.com
root@ali01:/# ping ali01
PING ali01.papa.com (172.18.0.2) 56(84) bytes of data.
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=2 ttl=64 time=0.034 ms
root@ali01:/# ping ali02
PING ali02 (172.18.0.3) 56(84) bytes of data.
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=1 ttl=64 time=0.076 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=3 ttl=64 time=0.065 ms 

ubuntu@ip-172-31-11-243:~$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8e0851dec6fd        papa.com            bridge              local
07cb97e27689        bridge              bridge              local
e81946364c3d        host                host                local
b285c6c7236e        none                null                local

5fb1bdc9f13e        ubuntu_papa.com     bridge              local

 Indeed! Problem has been solved. ali01.papa.com and ali02.papa.com is returning nicely to me as well. Thank you so much for the docker embedded DNS engine! The DNS resolution is working out of the box!