Hiu and Linux

Wednesday, August 3, 2022

Golang is blissful and painful at the same time.

Coming from python background, when started to learn Golang, I found these are the things are painful.

* Golang deletes a item in a slice, map or array

* Golang detecting a item exists in a slice, map or array

* Golang managing error handling, keep repeating writing the boilerplate code e.g. if error != nil statement.

* Golang does not comes with [list|tuple,dictionary]-comprehension techniques. Every operation has to be on the boilerplate codes on native for loop.

* Golang does not comes with inheritance but it does come with concept of composition.

* Golang manages string, runes in a some complicated way.

On the contrary, Golang does showcasing it real power in certain area when solving problem.

* Golang go-routine and channels. Way impressive and powerful than many programming.

* Golang duck typing - interface concept, aka, if a animal walk like a duck, sound like a duck, then the animal is most likely a duck!

* Golang is able to compile as a single binary for several platforms. wow! this is the most selling feature in the microservice environment.

* Golang testing framework is robust!

* Golang is performant!

Golang is well-received by community, and a lot of good engines are developed on top of Golang, it proves that Golang able to cope with large scale of technologies. Moreover, Golang is easy to learn, it makes the adoption of Golang in a lot of projects becoming even more seamless. With that said, even with all the pains listed above, the benefit is clearly outweigh the pains.

Hopefully, Golang development team will have put those in mind in the upcoming release to solve the simple problems. That's would make a lot of developers even love Golang. Go! Golang.

Monday, July 25, 2022

Golang handling os.signal with goroutines

Learning goroutines and practising with `os.Signal`

This is the small snippet of golang codes that showing how goroutines work perfectly.

package main

import (

"fmt"

"os"

"os/signal"

"syscall"

"time"

)

func signalHandler(signal os.Signal) {

switch signal {

case syscall.SIGHUP:

fmt.Println("Signal:", signal.String())

case syscall.SIGINT:

fmt.Println("Signal:", signal.String())

case syscall.SIGTERM:

fmt.Println("Signal:", signal.String())

case syscall.SIGQUIT:

fmt.Println("Signal:", signal.String())

default:

fmt.Println("Unhandled/unknown signal")

}

func keepDoingStuffs() {

for {

fmt.Printf("Starting job...")

time.Sleep(time.Second * 5)

fmt.Println("completed.")

}

func printNumbers() {

for {

time.Sleep(time.Second * 1)

fmt.Printf(".")

}

func trapTimeout(done chan bool) {

for {

select {

case <-time.After(time.Second * 20):

fmt.Println("Timeout.")

done <- true

case <-done:

return

}

func trapSignal(done chan bool) {

sigc := make(chan os.Signal, 1)

defer close(sigc)

signal.Notify(sigc, syscall.SIGINT, syscall.SIGHUP, syscall.SIGTERM, syscall.SIGQUIT)

for {

select {

case s := <-sigc:

signalHandler(s)

done <- true

case <-done:

return

}

func main() {

done := make(chan bool)

defer close(done)

go trapSignal(done)

go trapTimeout(done)

go keepDoingStuffs()

go printNumbers()

if <-done {

fmt.Println("Exit gracefully.")

return

}

Monday, May 23, 2022

Part 1: Compiling elasticsearch from source

I recently came across some complication where I need to change some of the codes in the elasticsearch and compile them into the binary tar.gz, build a docker image and consume it. It is rarely out there in term of documentation, so I decide to share my findings here.

1. Getting the source codes from elasticsearch. I believe this should be easy, and no explanation needed here.

https://github.com/elastic/elasticsearch

For example, I am looking for the latest 8.2 release, so I will checkout the 8.2 branch from now.

2. Install the right Java version. Here, from the builddoc, it says JAVA17 is needed. So, get ready your gear with Java 17, and set a JAVA_HOME pointing to the right version.

❯ update-alternatives --config java

There are 3 choices for the alternative java (providing /usr/bin/java). Selection Path Priority Status ------------------------------------------------------------

* 0 /usr/lib/jvm/java-17-openjdk-amd64/bin/java 1711 auto mode

1 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 manual mode

2 /usr/lib/jvm/java-17-openjdk-amd64/bin/java 1711 manual mode

manual mode Press to keep the current choice[*], or type selection number: ^C

3. Fun begins now, from the source codes itself, you can see a gradlew build tool is provided. Digging more onto it, you will discover to build a linux version of binary, you need to run a command like this.

./gradlew :distribution:archives:linux-tar:assemble

Then end result will be like this,

❯ ./gradlew :distribution:archives:linux-tar:assemble

Starting a Gradle Daemon (subsequent builds will be faster) =======================================

Elasticsearch Build Hamster says Hello!

Gradle Version : 7.4.2

OS Info : Linux 5.13.0-41-generic (amd64)

JDK Version : 17.0.3 (Private Build)

JAVA_HOME : /usr/lib/jvm/java-17-openjdk-amd64

Random Testing Seed : D75C9C369C262E41 In FIPS 140 mode : false =======================================

> Task :server:compileJava

Note: Some input files use or override a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

Note: Some input files use or override a deprecated API that is marked for removal.

Note: Recompile with -Xlint:removal for details.

BUILD SUCCESSFUL in 2m 18s 499 actionable tasks: 6 executed, 493 up-to-date

4. Horray! Now you have your build success. You can get the binary tar.gz at here.

❯ ls -al distribution/archives/linux-tar/build/distributions/elasticsearch-8.2.1-SNAPSHOT-linux-x86_64.tar.gz

Friday, December 31, 2021

Typescript: Interface with function

Typescript: interface

interface Owner {

gender: string;

}

interface Vehicle {

year: Date;

broken: boolean;

owner: Owner;

summary: () => string;

}

const civic: Vehicle = {

year: new Date(),

broken: true,

owner: { name: "yenonn", gender: "male" },

summary: function (): string {

return `summary: name ${this.name} year: ${this.year} broken: ${this.broken} owned by ${this.owner.name}`;

};

console.log(civic.summary());

Typescript: 5 ways to declare a function

There are different ways to write a function in typescript

//Method 1

function func0(age: number) {

console.log(age);

}

func0(0);

//Method 2

const func1 = (age: number) => {

console.log(age);

};

func1(1);

//Method 3

const func2 = function (age: number): void {

console.log(age);

};

func2(2);

//Method 4

const func3: (age: number) => void = function (age: number): void {

console.log(age);

};

func3(3);

//Method 5

const func4: (age: number) => void = (age: number): void => {

console.log(age);

};

func4(4);

Wednesday, March 25, 2020

Releasing TIME_WAIT connection

If you are noticing that heavy connection piling up on a port and never released. Please try this kernel tuning

# Network tunning

net.ipv4.tcp_fin_timeout = 35

net.ipv4.tcp_keepalive_time = 1800

net.ipv4.tcp_keepalive_intvl = 35

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_tw_reuse = 1

[root@host1 hiuy]# netstat -atunlp | grep TIME
tcp 0 0 10.108.194.225:8443 10.119.209.8:62844 TIME_WAIT -
tcp 0 0 10.108.194.225:8443 10.119.209.8:62735 TIME_WAIT -
tcp 0 0 10.108.194.225:8443 10.119.209.8:62771 TIME_WAIT -
tcp 0 0 10.108.194.225:8443 10.119.209.8:62784 TIME_WAIT -
tcp 0 0 10.108.194.225:8443 10.119.209.8:62818 TIME_WAIT -
tcp 0 0 10.108.194.225:8443 10.119.209.8:62833 TIME_WAIT -

Sunday, November 17, 2019

Exposing system journal to HTTP REST endpoint.

Journal log naturally printing a lot of useful information, ranging from system kernel, critical, information, and etc log. If I would able to expose those log files onto a HTTP rest endpoint, then I could easily query all of the status of server without login to it. How cool, isn't it? Using the Flask and exposing any port(e.g. 5577).

from flask import Response
from flask import Flask
import json
import subprocess
import socket

app = Flask(__name__)

@app.route("/journal.json", methods = ['GET'])
def get_journal():
journal_dict = {}
cmd = "sudo journalctl -k -S today -o json"
(output, _) = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, encoding='utf-8').communicate()
json_list = [line.strip() for line in output.split("\n") if len(line) != 0]
journal_dict["journal"] = json_list
js = json.dumps(journal_dict)
resp = Response(js, status=200, mimetype='application/json')
resp.headers['Link'] = 'http://{}'.format(socket.gethostname())
return resp

def main(port=5577):

app.run(host='0.0.0.0', port=port)

if __name__ == '__main__':

main()

Monday, November 4, 2019

Namenode migration in CDH cluster

Stop Namenode when they in the standby mode.
Backup the `dfs.name.dir, dfs.namenode.name.dir` directory.
Make sure the backup is restored at the same location on the target Namenode host and permission preserved.
Usually the permission is `hdfs:hadoop` if in case of missing it.
There are 5 important Namenode setting that need to be inherited to the target Namenode

1. `dfs.ha.automatic-failover.enabled` checked
2. `NameNode Nameservice`, usually it is `nameservice1`. But however it could vary depends on the case. It is good to record it before hand.
3. `Mount Points`, ditto as per above
4. `Quorum-based Storage Journal name` ditto as per above
5. Java Opts setting

Namenode migration will be resonating with failover controller as well. Make sure they are migrated at the same time, same node.
Once we are ready for all the backup and recorded information, then we are good to kick the tyre get rolling.
First of all, delete the "Namenode(Standby)" role and "Failover Controller" from the host.
Add new role > select "Namenode" and "Failover Controller" to the new host.
Make sure all above mentioned Namenode settings are in place.
Close your palms and pray when you hit the start button on both. Should you start with Failover Controller first, later followed by Namenode.
Now you are relief once Namenode and Failover Controller started, you should now followed with a series of services restart.
Do a rolling restart on Data nodes
Do a rolling restart on Hive Metastore Server
Do a rolling restart on Hive server
Do a rolling restart on Node Manager
Do a rolling restart on Resource Manager
Do a rolling restart on Oozie
Do a rolling restart on Httpfs
Do a rolling restart on Journal Nodes
Lastly Namenode
Show is completed. The End

Tuesday, November 20, 2018

Hadoop utils: YML to XML parser

Dealing with XML files especially those from hadoop configuration is really painful. So, I have an idea to keep all the configurations in the YAML format, and write a parser to convert them into XML format.

e.g. a hadoop-site.yml file

---
dfs.name.dir : /var/local/hadoop/hdfs/name
dfs.data.dir : /var/local/hadoop/hdfs/data
dfs.heartbeat.interval : 3
dfs.datanode.address : 0.0.0.0:1004
dfs.datanode.http.address : 0.0.0.0:1006 | Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.

It will later been converted to hadoop-site.xml

dfs.datanode.http.address
0.0.0.0:1006
Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.

dfs.datanode.address
0.0.0.0:1004

dfs.data.dir
/var/local/hadoop/hdfs/data

dfs.heartbeat.interval
3

dfs.name.dir
/var/local/hadoop/hdfs/name

Here is the small python codes that I wrote to do the conversion

import yaml
from xml.etree.ElementTree import Element, SubElement, Comment
from xml.etree import ElementTree
from xml.dom import minidom

def prettify(element):
rough_string = ElementTree.tostring(element, 'utf-8')
reparsed = minidom.parseString(rough_string)
return reparsed.toprettyxml(indent=" ")

def read_yaml_file(yaml_file):
with open(yaml_file, "r") as file:
site_config = yaml.load(file)
return site_config

def generate_xml(yaml_file):
config = read_yaml_file(yaml_file)
_top = Element('configuration')
for key, values in config.iteritems():
_property = SubElement(_top, 'property')
_name = SubElement(_property, 'name')
_value = SubElement(_property, 'value')
if "|" in str(values).split():
[value, description] = values.split("|")
else:
value = values
description = ""
_name.text = key
_value.text = str(value)
if description:
_description = SubElement(_property, 'description')
_description.text = description
xml_file = yaml_file.split(".")[0] + ".xml"
with open(xml_file, "w") as file:
file.write(prettify(_top))

To test it you can import function and use it like this

from hadoop_xml_parser import generate_xml

if __name__ == '__main__':
generate_xml("hadoop-site.yml")

Friday, July 13, 2018

Part 2: Docker networking domain sharing

Following up this post, I am giving out the solution to the problem.

Please revise the alipapa.yml, on the network part

networks:
papa.com:
driver: bridge

If I want to give a name to the network bridge, can I do thing like this?

networks:
papa.com:
driver: bridge
name: papa.com

Let's try out, and docker-compose it up!

ubuntu@ip-172-31-11-243:~$ docker-compose -f alipapa.yml up -d
Creating network "papa.com" with driver "bridge"
Recreating ali01 ... done
Recreating ali02 ... done

I can sense the smell of success. But, let's find out more.

ubuntu@ip-172-31-11-243:~$ docker exec -ti ali01 bash
root@ali01:/# hostname -f
ali01.papa.com
root@ali01:/# ping ali01
PING ali01.papa.com (172.18.0.2) 56(84) bytes of data.
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=2 ttl=64 time=0.034 ms
root@ali01:/# ping ali02
PING ali02 (172.18.0.3) 56(84) bytes of data.
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=1 ttl=64 time=0.076 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=3 ttl=64 time=0.065 ms

ubuntu@ip-172-31-11-243:~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
8e0851dec6fd papa.com bridge local
07cb97e27689 bridge bridge local
e81946364c3d host host local
b285c6c7236e none null local

5fb1bdc9f13e ubuntu_papa.com bridge local

Indeed! Problem has been solved. ali01.papa.com and ali02.papa.com is returning nicely to me as well. Thank you so much for the docker embedded DNS engine! The DNS resolution is working out of the box!