Friday, July 13, 2018

Part 2: Docker networking domain sharing

Following up on this post, here is the solution to the problem.

Let's revisit the networks section of alipapa.yml:

networks:
  papa.com:
    driver: bridge 

If I want to give the bridge network an explicit name, can I do something like this?

networks:
  papa.com:
    driver: bridge
    name: papa.com

Let's try it out and bring it up with docker-compose!

ubuntu@ip-172-31-11-243:~$ docker-compose -f alipapa.yml up -d
Creating network "papa.com" with driver "bridge"
Recreating ali01 ... done
Recreating ali02 ... done

I can already smell success, but let's dig a little deeper.

ubuntu@ip-172-31-11-243:~$ docker exec -ti ali01 bash
root@ali01:/# hostname -f
ali01.papa.com
root@ali01:/# ping ali01
PING ali01.papa.com (172.18.0.2) 56(84) bytes of data.
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ali01.papa.com (172.18.0.2): icmp_seq=2 ttl=64 time=0.034 ms
root@ali01:/# ping ali02
PING ali02 (172.18.0.3) 56(84) bytes of data.
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=1 ttl=64 time=0.076 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from ali02.papa.com (172.18.0.3): icmp_seq=3 ttl=64 time=0.065 ms 

ubuntu@ip-172-31-11-243:~$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8e0851dec6fd        papa.com            bridge              local
07cb97e27689        bridge              bridge              local
e81946364c3d        host                host                local
b285c6c7236e        none                null                local
5fb1bdc9f13e        ubuntu_papa.com     bridge              local

Indeed! The problem has been solved. Both ali01.papa.com and ali02.papa.com resolve nicely as well. Thanks to Docker's embedded DNS server, name resolution works out of the box!
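If you want to convince yourself further, docker network inspect papa.com on the host should show both containers attached to the explicitly named network, and /etc/resolv.conf inside each container should point at Docker's embedded DNS server on 127.0.0.11, which is what answers these lookups.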

Part 1: Docker networking domain sharing

Take a look at this test yaml file.

ubuntu@ip-172-31-11-243:~$ cat alipapa.yml
version: "3.5"
services:
  ali01:
    image: ubuntu:16.04
    hostname: ali01
    container_name: ali01
    domainname: papa.com
    networks:
      papa.com:
        aliases:
          - ali01.papa.com
    entrypoint: sleep infinity
  ali02:
    image: ubuntu:16.04
    hostname: ali02
    container_name: ali02
    domainname: papa.com
    networks:
      papa.com:
        aliases:
          - ali02.papa.com
    entrypoint: sleep infinity
networks:
  papa.com:
    driver: bridge

When you bring up these containers, then exec into ali01 and ping ali02, what result do you expect?

ubuntu@ip-172-31-11-243:~$ docker-compose -f alipapa.yml up -d
Creating network "ubuntu_papa.com" with driver "bridge"
Creating ali02 ... done
Creating ali01 ... done
ubuntu@ip-172-31-11-243:~$ docker ps -a
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                       PORTS               NAMES
fdbfb1a48686        ubuntu:16.04           "sleep infinity"         7 seconds ago       Up 6 seconds                                     ali02
7566e8c7db8b        ubuntu:16.04           "sleep infinity"         7 seconds ago       Up 5 seconds                                     ali01

ubuntu@ip-172-31-11-243:~$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
07cb97e27689        bridge              bridge              local
e81946364c3d        host                host                local
b285c6c7236e        none                null                local
5fb1bdc9f13e        ubuntu_papa.com     bridge              local

Wait... I never defined a network called ubuntu_papa.com. Why is that? Shouldn't it be just papa.com?

root@ali01:/# ping ali02
PING ali02 (172.20.0.3) 56(84) bytes of data.
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=1 ttl=64 time=0.087 ms
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=3 ttl=64 time=0.062 ms

root@ali01:/# ping ali02.papa.com
PING ali02.papa.com (172.20.0.3) 56(84) bytes of data.
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=1 ttl=64 time=0.063 ms
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=2 ttl=64 time=0.061 ms
64 bytes from ali02.ubuntu_papa.com (172.20.0.3): icmp_seq=3 ttl=64 time=0.065 ms

A head-scratching moment. 😓😓😓




docker-compose dilemma, don't think twice, just upgrade it!

Hi,

I kept hitting the same problem with docker-compose lately. It looks like this:

ubuntu@ip-172-31-11-243:~/apache-hadoop-docker$ docker-compose -v

docker-compose version 1.8.0, build unknown

ubuntu@ip-172-31-11-243:~/apache-hadoop-docker$ docker-compose -f hdfs-cluster-nonkerberized.yml  up
ERROR: Version in "./hdfs-cluster-nonkerberized.yml" is unsupported. You might be seeing this error because you're using the wrong Compose file version. Either specify a version of "2" (or "2.0") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/
ubuntu@ip-172-31-11-243:~/apache-hadoop-docker$


There is nothing wrong with a docker-compose YAML file that starts with version: "3.0" or above. Don't blame your YAML file, even if you are new to writing docker-compose files. The problem lies with the docker-compose package that ships with Ubuntu 16.04 (Xenial): version 1.8.0 is simply too old to understand the version 3 file format. Don't think twice, just go ahead and upgrade it. Your life will be much better afterwards.

Here are the upgrade steps, assuming you are on Ubuntu Xenial (run them as root):

> mv /usr/bin/docker-compose /usr/bin/docker-compose-old
> curl -L https://github.com/docker/compose/releases/download/1.20.0/docker-compose-`uname -s`-`uname -m` -o /usr/bin/docker-compose
> chmod +x /usr/bin/docker-compose
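Afterwards, docker-compose --version should report 1.20.0 instead of 1.8.0, and compose files declaring version 3.x should be accepted.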


Enjoy!

Sunday, November 12, 2017

Part 2: AWS pricing

This is an extension of the earlier post on AWS pricing. Here is a small script that finds out the price of each On-Demand instance. Please check requirements.txt for the libraries that need to be installed first.

Here is what the output looks like.

(python3.6) nasilemak:aws yenonnhiu$ python3 aws_pricing.py
Asia Pacific (Singapore): server2 t2.medium 2017-10-11 06:16:19
 * Price OnDemand t2.medium effective from 2017-11-09T22:41:06Z: 0.0584000000 USD/hour
 * Total accumulated price in USD: 45.44
 * Monthly charged price in USD: 16.37
US East (N. Virginia): server1 t2.nano 2016-11-21 13:59:05
 * Price OnDemand t2.nano effective from 2017-11-09T22:41:06Z: 0.0058000000 USD/hour
 * Total accumulated price in USD: 49.57
 * Monthly charged price in USD: 1.63
US West (Oregon): ubuntu2 t2.micro 2017-01-23 04:28:35
 * Price OnDemand t2.micro effective from 2017-11-09T22:41:06Z: 0.0116000000 USD/hour
 * Total accumulated price in USD: 81.71
 * Monthly charged price in USD: 3.25
** Total monthly price for all instances in USD: 21.25
** Total accumulated price for all instances in USD: 176.72
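The script itself is not reproduced here, but the numbers above are consistent with a very simple calculation: the accumulated price is the number of hours since the instance's LaunchTime multiplied by the hourly On-Demand rate from Part 1, and the monthly price only counts the hours since the start of the current month. A minimal sketch of that idea (price_summary is a hypothetical helper, not the actual aws_pricing.py):

from datetime import datetime, timezone

# Minimal sketch, not the real aws_pricing.py: price = elapsed hours * hourly rate.
def hours_between(start, end):
    return (end - start).total_seconds() / 3600

def price_summary(launch_time, usd_per_hour):
    """launch_time is timezone-aware, e.g. the LaunchTime field from
    ec2.describe_instances(); usd_per_hour comes from the pricing API."""
    now = datetime.now(timezone.utc)
    month_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    accumulated = hours_between(launch_time, now) * usd_per_hour
    # If the instance was launched during the current month, only count from launch.
    monthly = hours_between(max(launch_time, month_start), now) * usd_per_hour
    return accumulated, monthly

# Example with the t2.nano from the output above (0.0058 USD/hour).
launched = datetime(2016, 11, 21, 13, 59, 5, tzinfo=timezone.utc)
accumulated, monthly = price_summary(launched, 0.0058)
print(" * Total accumulated price in USD: {:.2f}".format(accumulated))
print(" * Monthly charged price in USD: {:.2f}".format(monthly))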

Friday, November 10, 2017

Part 1: AWS Pricing

Hi

AWS pricing seems difficult to find out, because AWS publishes a list of JSON files for users to parse for prices (https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/index.json). This index.json eventually guides you to download another set of index.json files per service code, for example AmazonEC2. Here is a link describing how you could achieve this.

Recently, boto3 gained the ability to query the pricing JSON objects online without downloading the files, through the client returned by boto3.client('pricing'); Amazon also published a blog post about this. With that set of API calls you can do a full range of filtering based on the EC2 attributes and values in order to nail down an instance's price.

For example, I have a Linux instance located in US East (N. Virginia) with instance type t2.nano. The filter argument for the call will be as below:

Filters = [
         {'Type' :'TERM_MATCH', 'Field':'operatingSystem', 'Value':'Linux'},
         {'Type' :'TERM_MATCH', 'Field':'instanceType', 'Value':'t2.nano'},
         {'Type' :'TERM_MATCH', 'Field':'location', 'Value':'US East (N. Virginia)'}
     ],

However, the resulting JSON object is a bit overwhelming to parse for the price per hour. A Python library such as objectpath helps us drill down to the price easily. The small Python program looks like this:

import boto3
import json
import objectpath


pricing = boto3.client('pricing')
print("Selected EC2 Products")
print("=====================")
# Narrow the price list down to a single product: Linux, t2.nano, N. Virginia.
response = pricing.get_products(
     ServiceCode='AmazonEC2',
     Filters=[
         {'Type': 'TERM_MATCH', 'Field': 'operatingSystem', 'Value': 'Linux'},
         {'Type': 'TERM_MATCH', 'Field': 'instanceType', 'Value': 't2.nano'},
         {'Type': 'TERM_MATCH', 'Field': 'location', 'Value': 'US East (N. Virginia)'}
     ],
     MaxResults=100
)
# With those filters there is exactly one entry, delivered as a JSON string.
[price_info_dump] = response['PriceList']
price_tree = objectpath.Tree(json.loads(price_info_dump))
publish_date = price_tree.execute("$.publicationDate")
# The OnDemand terms and priceDimensions are dicts keyed by a single generated
# id, so unpacking the one-element iterable gives us that key directly.
[sku] = price_tree.execute("$.terms.OnDemand")
[rateCode] = price_tree.execute("$.terms.OnDemand.'{}'.priceDimensions".format(sku))
print("Price per hours for OnDemand t2.nano effective from {}: {}".format(
        publish_date,
        price_tree.execute("$.terms.OnDemand.'{}'.priceDimensions.'{}'.pricePerUnit".format(sku, rateCode))
        ))

Here is what the output looks like:

(python3-env1) nasilemak:Developments hiuy$ python3 aws_pricing.py
Selected EC2 Products
=====================
Price per hours for OnDemand t2.nano effective from 2017-11-09T22:41:06Z: {'USD': '0.0116000000'}

Here is the range of Amazon EC2 filtering attributes and values that you can use (a small sketch for generating a list like this yourself follows after the list).

Selected EC2 Attributes & Values
================================
  volumeType: Cold HDD, General Purpose, Magnetic, Provisioned IOPS, Throughput Optimized HDD
  maxIopsvolume: 10000, 20000, 250 - based on 1 MiB I/O size, 40 - 200, 500 - based on 1 MiB I/O size
  instanceCapacity10xlarge: 1
  locationType: AWS Region
  instanceFamily: Compute optimized, GPU instance, General purpose, Memory optimized, Micro instances, Storage optimized
  operatingSystem: Linux, NA, RHEL, SUSE, Windows
  clockSpeed: 2 GHz, 2.3 GHz, 2.4  GHz, 2.4 GHz, 2.5 GHz, 2.6 GHz, 2.8 GHz, 2.9 GHz, 3.0 Ghz, Up to 3.0 GHz, Up to 3.3 GHz
  LeaseContractLength: 1 yr, 1yr, 3 yr, 3yr
  ecu: 0, 104, 108, 116, 124.5, 12, 132, 135, 139, 13, 14, 16, 188, 20, 26, 278, 27, 28, 2, 31, 33.5, 340, 349, 35, 3, 4, 52, 53.5, 53, 55, 56, 6.5, 62, 7, 88, 8, 94, 99, NA, Variable
  networkPerformance: 10 Gigabit, 20 Gigabit, 25 Gigabit, High, Low to Moderate, Low, Moderate, NA, Up to 10 Gigabit, Very Low
  instanceCapacity8xlarge: 1, 2
  group: EBS I/O Requests, EBS IOPS, EC2-Dedicated Usage, ELB:Balancer, ELB:Balancing, ElasticIP:AdditionalAddress, ElasticIP:Address, ElasticIP:Remap, NGW:NatGateway
  maxThroughputvolume: 160 MB/sec, 250 MiB/s, 320 MB/sec, 40 - 90 MB/sec, 500 MiB/s
  ebsOptimized: Yes
  maxVolumeSize: 1 TiB, 16 TiB
  gpu: 16, 1, 2, 4, 8
  processorFeatures: Intel AVX, Intel AVX2, Intel AVX512, Intel Turbo, Intel AVX, Intel AVX2, Intel Turbo, Intel AVX; Intel AVX2; Intel Turbo, Intel AVX; Intel Turbo
  intelAvxAvailable: Yes
  instanceCapacity4xlarge: 2, 4
  servicecode: AmazonEC2
  groupDescription: Additional Elastic IP address attached to a running instance, Charge for per GB data processed by NAT Gateways with provisioned bandwidth, Charge for per GB data processed by NatGateways, Data processed by Elastic Load Balancer, Elastic IP address attached to a running instance, Elastic IP address remap, Fee for running at least one Dedicated Instance in the region, Hourly charge for NAT Gateways, IOPS, Input/Output Operation, LoadBalancer hourly usage by Application Load Balancer, LoadBalancer hourly usage by Network Load Balancer, Per hour and per Gbps charge for NAT Gateways with provisioned bandwidth, Standard Elastic Load Balancer, Used Application load balancer capacity units-hr, Used Network load balancer capacity units-hr
  processorArchitecture: 32-bit or 64-bit, 64-bit
  physicalCores: 20, 24, 36, 72
  productFamily: Compute Instance, Dedicated Host, Fee, IP Address, Load Balancer-Application, Load Balancer-Network, Load Balancer, NAT Gateway, Storage Snapshot, Storage, System Operation
  enhancedNetworkingSupported: Yes
  intelTurboAvailable: Yes
  memory: 0.5 GiB, 0.613 GiB, 1 GiB, 1,952 Gib, 1.7 GiB, 117 GiB, 122 GiB, 144 GiB, 15 GiB, 15.25 GiB, 16 GiB, 160 GiB, 17.1 GiB, 2 GiB, 22.5 GiB, 23 GiB, 244 GiB, 256 GiB, 3,904 GiB, 3.75 GiB, 30 GiB, 30.5 GiB, 32 GiB, 34.2 GiB, 4 GiB, 488 GiB, 60 GiB, 60.5 GiB, 61 GiB, 64 GiB, 68.4 GiB, 7 GiB, 7.5 GiB, 72 GiB, 768 GiB, 8 GiB, 976 Gib, NA
  dedicatedEbsThroughput: 1000 Mbps, 10000 Mbps, 12000 Mbps, 14000 Mbps, 1600 Mbps, 1750 Mbps, 2000 Mbps, 3000 Mbps, 3500 Mbps, 400 Mbps, 4000 Mbps, 425 Mbps, 450 Mbps, 4500 Mbps, 500 Mbps, 6000 Mbps, 7000 Mbps, 750 Mbps, 800 Mbps, 850 Mbps, 9000 Mbps, Upto 2250 Mbps
  vcpu: 128, 16, 17, 1, 2, 32, 36, 40, 4, 64, 72, 8
  OfferingClass: convertible, standard
  instanceCapacityLarge: 16, 22, 32, 36
  termType: OnDemand, Reserved
  storage: 1 x 0.475 NVMe SSD, 1 x 0.95 NVMe SSD, 1 x 1,920, 1 x 1.9 NVMe SSD, 1 x 160 SSD, 1 x 160, 1 x 32 SSD, 1 x 320 SSD, 1 x 350, 1 x 4 SSD, 1 x 410, 1 x 420, 1 x 60 SSD, 1 x 80 SSD, 1 x 800 SSD, 1 x 850, 12 x 2000 HDD, 2 x 1,920, 2 x 1.9 NVMe SSD, 2 x 1024 SSD, 2 x 120 SSD, 2 x 16 SSD, 2 x 160 SSD, 2 x 320 SSD, 2 x 40 SSD, 2 x 420, 2 x 80 SSD, 2 x 800 SSD, 2 x 840 GB, 2 x 840, 24 x 2000 HDD, 24 x 2000, 3 x 2000 HDD, 4 x 1.9 NVMe SSD, 4 x 420, 4 x 800 SSD, 4 x 840, 6 x 2000 HDD, 8 x 1.9 NVMe SSD, 8 x 800 SSD, EBS only, NA
  intelAvx2Available: Yes
  storageMedia: Amazon S3, HDD-backed, SSD-backed
  physicalProcessor: High Frequency Intel Xeon E7-8880 v3 (Haswell), Intel Xeon E5-2650, Intel Xeon E5-2666 v3 (Haswell), Intel Xeon E5-2670 (Sandy Bridge), Intel Xeon E5-2670 v2 (Ivy Bridge), Intel Xeon E5-2670 v2 (Ivy Bridge/Sandy Bridge), Intel Xeon E5-2670, Intel Xeon E5-2676 v3 (Haswell), Intel Xeon E5-2676v3 (Haswell), Intel Xeon E5-2680 v2 (Ivy Bridge), Intel Xeon E5-2686 v4 (Broadwell), Intel Xeon Family, Intel Xeon Platinum 8124M, Intel Xeon x5570, Variable
  provisioned: No, Yes
  servicename: Amazon Elastic Compute Cloud
  PurchaseOption: All Upfront, AllUpfront, No Upfront, NoUpfront, Partial Upfront, PartialUpfront
  instanceCapacity18xlarge: 1
  instanceType: c1.medium, c1.xlarge, c3.2xlarge, c3.4xlarge, c3.8xlarge, c3.large, c3.xlarge, c3, c4.2xlarge, c4.4xlarge, c4.8xlarge, c4.large, c4.xlarge, c4, c5.18xlarge, c5.2xlarge, c5.4xlarge, c5.9xlarge, c5.large, c5.xlarge, c5, cc1.4xlarge, cc2.8xlarge, cg1.4xlarge, cr1.8xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge, d2.xlarge, d2, f1.16xlarge, f1.2xlarge, f1, g2.2xlarge, g2.8xlarge, g2, g3.16xlarge, g3.4xlarge, g3.8xlarge, g3, hi1.4xlarge, hs1.8xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge, i2.xlarge, i2, i3.16xlarge, i3.2xlarge, i3.4xlarge, i3.8xlarge, i3.large, i3.xlarge, i3, m1.large, m1.medium, m1.small, m1.xlarge, m2.2xlarge, m2.4xlarge, m2.xlarge, m3.2xlarge, m3.large, m3.medium, m3.xlarge, m3, m4.10xlarge, m4.16xlarge, m4.2xlarge, m4.4xlarge, m4.large, m4.xlarge, m4, p2.16xlarge, p2.8xlarge, p2.xlarge, p2, p3.16xlarge, p3.2xlarge, p3.8xlarge, p3, r3.2xlarge, r3.4xlarge, r3.8xlarge, r3.large, r3.xlarge, r3, r4.16xlarge, r4.2xlarge, r4.4xlarge, r4.8xlarge, r4.large, r4.xlarge, r4, t1.micro, t2.2xlarge, t2.large, t2.medium, t2.micro, t2.nano
  tenancy: Dedicated, Host, NA, Reserved, Shared
  usagetype: APN1-BoxUsage:c1.medium, APN1-BoxUsage:c1.xlarge, APN1-BoxUsage:c3.2xlarge, APN1-BoxUsage:c3.4xlarge, APN1-BoxUsage:c3.8xlarge, APN1-BoxUsage:c3.large, APN1-BoxUsage:c3.xlarge, APN1-BoxUsage:c4.2xlarge, APN1-BoxUsage:c4.4xlarge, APN1-BoxUsage:c4.8xlarge, APN1-BoxUsage:c4.large, APN1-BoxUsage:c4.xlarge, APN1-BoxUsage:cc2.8xlarge, APN1-BoxUsage:cr1.8xlarge, APN1-BoxUsage:d2.2xlarge, APN1-BoxUsage:d2.4xlarge, APN1-BoxUsage:d2.8xlarge, APN1-BoxUsage:d2.xlarge, APN1-BoxUsage:g2.2xlarge, APN1-BoxUsage:g2.8xlarge, APN1-BoxUsage:g3.16xlarge, APN1-BoxUsage:g3.4xlarge, APN1-BoxUsage:g3.8xlarge, APN1-BoxUsage:hi1.4xlarge, APN1-BoxUsage:hs1.8xlarge, APN1-BoxUsage:i2.2xlarge, APN1-BoxUsage:i2.4xlarge, APN1-BoxUsage:i2.8xlarge, APN1-BoxUsage:i2.xlarge, APN1-BoxUsage:i3.16xlarge, APN1-BoxUsage:i3.2xlarge, APN1-BoxUsage:i3.4xlarge, APN1-BoxUsage:i3.8xlarge, APN1-BoxUsage:i3.large, APN1-BoxUsage:i3.xlarge, APN1-BoxUsage:m1.large, APN1-BoxUsage:m1.medium, APN1-BoxUsage:m1.xlarge, APN1-BoxUsage:m2.2xlarge, APN1-BoxUsage:m2.4xlarge, APN1-BoxUsage:m2.xlarge, APN1-BoxUsage:m3.2xlarge, APN1-BoxUsage:m3.large, APN1-BoxUsage:m3.medium, APN1-BoxUsage:m3.xlarge, APN1-BoxUsage:m4.10xlarge, APN1-BoxUsage:m4.16xlarge, APN1-BoxUsage:m4.2xlarge, APN1-BoxUsage:m4.4xlarge, APN1-BoxUsage:m4.large, APN1-BoxUsage:m4.xlarge, APN1-BoxUsage:p2.16xlarge, APN1-BoxUsage:p2.8xlarge, APN1-BoxUsage:p2.xlarge, APN1-BoxUsage:p3.16xlarge, APN1-BoxUsage:p3.2xlarge, APN1-BoxUsage:p3.8xlarge, APN1-BoxUsage:r3.2xlarge, APN1-BoxUsage:r3.4xlarge, APN1-BoxUsage:r3.8xlarge, APN1-BoxUsage:r3.large, APN1-BoxUsage:r3.xlarge, APN1-BoxUsage:r4.16xlarge, APN1-BoxUsage:r4.2xlarge, APN1-BoxUsage:r4.4xlarge, APN1-BoxUsage:r4.8xlarge, APN1-BoxUsage:r4.large, APN1-BoxUsage:r4.xlarge, APN1-BoxUsage:t1.micro, APN1-BoxUsage:t2.2xlarge, APN1-BoxUsage:t2.large, APN1-BoxUsage:t2.medium, APN1-BoxUsage:t2.micro, APN1-BoxUsage:t2.nano, APN1-BoxUsage:t2.small, APN1-BoxUsage:t2.xlarge, APN1-BoxUsage:x1.16xlarge, APN1-BoxUsage:x1.32xlarge, APN1-BoxUsage:x1e.32xlarge, APN1-BoxUsage, APN1-DataProcessing-Bytes, APN1-DedicatedUsage:c1.medium, APN1-DedicatedUsage:c1.xlarge, APN1-DedicatedUsage:c3.2xlarge, APN1-DedicatedUsage:c3.4xlarge, APN1-DedicatedUsage:c3.8xlarge, APN1-DedicatedUsage:c3.large, APN1-DedicatedUsage:c3.xlarge, APN1-DedicatedUsage:c4.2xlarge, APN1-DedicatedUsage:c4.4xlarge, APN1-DedicatedUsage:c4.8xlarge, APN1-DedicatedUsage:c4.large, APN1-DedicatedUsage:c4.xlarge, APN1-DedicatedUsage:cc2.8xlarge, APN1-DedicatedUsage:cr1.8xlarge, APN1-DedicatedUsage:d2.2xlarge, APN1-DedicatedUsage:d2.4xlarge, APN1-DedicatedUsage:d2.8xlarge, APN1-DedicatedUsage:d2.xlarge, APN1-DedicatedUsage:g2.2xlarge
  normalizationSizeFactor: 0.25, 0.5, 128, 144, 16, 1, 256, 2, 32, 4, 64, 72, 80, 8, NA
  instanceCapacity16xlarge: 1, 2
  instanceCapacity2xlarge: 4, 5, 8
  maxIopsBurstPerformance: 3000 for volumes <= 1 TiB, Hundreds
  instanceCapacity32xlarge: 1
  instanceCapacityXlarge: 11, 16, 18, 8
  licenseModel: Bring your own license, NA, No License required
  currentGeneration: No, Yes
  preInstalledSw: NA, SQL Ent, SQL Std, SQL Web
  location: AWS GovCloud (US), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), South America (Sao Paulo), US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon)
  instanceCapacity9xlarge: 2
  instanceCapacityMedium: 32
  operation: Hourly, LoadBalancing:Application, LoadBalancing:Network, LoadBalancing, NatGateway, RunInstances:0002, RunInstances:0006, RunInstances:000g, RunInstances:0010, RunInstances:0102, RunInstances:0202, RunInstances:0800, RunInstances, Surcharge
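As mentioned above the list, something like it can be regenerated with the same pricing client, using describe_services to fetch the attribute names and get_attribute_values for each one. A minimal sketch (pagination and error handling omitted, so long value lists may be truncated at 100 entries):

import boto3

# Sketch: enumerate the EC2 pricing attributes and their possible values.
pricing = boto3.client('pricing')

service = pricing.describe_services(ServiceCode='AmazonEC2', MaxResults=1)
attribute_names = service['Services'][0]['AttributeNames']

print("Selected EC2 Attributes & Values")
print("================================")
for name in sorted(attribute_names):
    response = pricing.get_attribute_values(
        ServiceCode='AmazonEC2', AttributeName=name, MaxResults=100)
    values = [item['Value'] for item in response['AttributeValues']]
    print("  {}: {}".format(name, ", ".join(values)))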

Tuesday, October 17, 2017

Finding out more on aws instances

hi all,

There is a quick way to print all of your AWS instances. Here is the small Python code to help you. However, you do need to install the boto3 library before anything will work; please read the README.md to set things up.

https://github.com/yenonn/server-inventory/blob/master/aws/aws_lib.py

The small function show_ec2_instances will print out all of the instances. Enjoy!

(python3-env1) nasilemak:aws hiuy$ python3
Python 3.5.2 (default, Oct 11 2016, 04:59:56)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from aws_lib import *
>>> show_ec2_instances()
###(Region:ap-south-1)###
###(Region:eu-west-2)###
###(Region:eu-west-1)###
###(Region:ap-northeast-2)###
###(Region:ap-northeast-1)###
  * linux1: stopped
  * linux2: stopped
  * linux3: stopped
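The real implementation lives in the linked aws_lib.py; roughly, it has to walk every region and print each instance's Name tag and state. A minimal sketch of that shape, assuming plain describe_regions/describe_instances calls and no error handling:

import boto3

# Sketch of what a show_ec2_instances() could look like; the actual code
# lives in the repository linked above.
def show_ec2_instances():
    regions = [r['RegionName']
               for r in boto3.client('ec2').describe_regions()['Regions']]
    for region in regions:
        print("###(Region:{})###".format(region))
        ec2 = boto3.client('ec2', region_name=region)
        for reservation in ec2.describe_instances()['Reservations']:
            for instance in reservation['Instances']:
                tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}
                name = tags.get('Name', instance['InstanceId'])
                print("  * {}: {}".format(name, instance['State']['Name']))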




Tuesday, September 19, 2017

Potential problem to start name node after kerberizing HDP2.6

hi all,

I am not sure whether this is a problem you will face, but I have hit it twice when enabling Kerberos on HDP 2.6. Here is what you can find in the log file.

2017-09-19 02:56:17,375 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/current/hadoop-client/libexec'}, 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2017-09-19 02:56:21,432 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-hiuy@EXAMPLE.COM'] {'user': 'hdfs'}
2017-09-19 02:56:25,454 - Waiting for this NameNode to leave Safemode due to the following conditions: HA: False, isActive: True, upgradeType: None
2017-09-19 02:56:25,454 - Waiting up to 19 minutes for the NameNode to leave Safemode...
2017-09-19 02:56:25,454 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://ip-172-31-9-254.ap-southeast-1.compute.internal:8020 -safemode get | grep 'Safe mode is OFF''] {'logoutput': True, 'tries': 115, 'user': 'hdfs', 'try_sleep': 10} safemode: Call From ip-172-31-9-254.ap-southeast-1.compute.internal/172.31.9.254 to ip-172-31-9-254.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2017-09-19 02:56:27,484 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://ip-172-31-9-254.ap-southeast-1.compute.internal:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From ip-172-31-9-254.ap-southeast-1.compute.internal/172.31.9.254 to ip-172-31-9-254.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
After all that, when you dig into the details in hadoop-hdfs.log, you will find entries like the one below.

Caused by: javax.security.auth.login.LoginException: Receive timed out at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:808) at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) at 


Pay attention to the "Receive timed out" part. With that hint, we can tell that the cause of the problem is a communication timeout with the KDC server. There is a useful link that documents the problem.



The solution to this problem is to add a line to krb5.conf under the [libdefaults] section:

udp_preference_limit = 1

There is a small trick to make this setting available on all nodes in the cluster: go to Ambari > Kerberos > Configs > Advanced krb5-conf and make the change there.
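For reference, the relevant fragment of /etc/krb5.conf ends up looking something like this (all other [libdefaults] settings omitted):

[libdefaults]
  udp_preference_limit = 1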

Hope that helps. Thanks for reading.

  





Wednesday, August 2, 2017

Parsing XML hadoop files

hi all,

I find reading the Cloudera Hadoop XML files to be one of the most tedious jobs in the world, partly because some of them, e.g. hdfs-site.xml and yarn-site.xml, are long and repetitive. So I had the idea of reading the XML files with a Python parser and printing them to stdout and to an output file (.out) in a name = value pattern. With that, my eyes are way more comfortable reading them. I would like to share the simple Python code with you.

from xml.etree import ElementTree
import os
import sys

# Take the XML file name from the command line and derive the .out file name.
input_file_name = sys.argv[1]
full_file_name = os.path.abspath(input_file_name)
file_name = input_file_name.split('.')[0] + '.out'

dom = ElementTree.parse(full_file_name)
properties = dom.findall('property')

# Print each property as "name = value" and write the same line to the .out file.
with open(file_name, 'w') as write_xml_conf:
  for p in properties:
    name = p.find('name').text
    value = p.find('value').text
    print("{} = {}".format(name, value))
    write_xml_conf.write("{} = {}\n".format(name, value))

Here is the output of the script. Besides printing to standard output on the screen, it writes the same lines to a file too.

[root@ip-172-31-9-77 21-yarn-NODEMANAGER]# decode_xml.py hdfs-site.xml
dfs.namenode.name.dir = file:///dfs/nn
dfs.namenode.servicerpc-address = ip-172-31-14-234.ap-southeast-1.compute.internal:8022
dfs.https.address = ip-172-31-14-234.ap-southeast-1.compute.internal:50470
dfs.https.port = 50470
dfs.namenode.http-address = ip-172-31-14-234.ap-southeast-1.compute.internal:50070
dfs.replication = 3
dfs.blocksize = 134217728
dfs.client.use.datanode.hostname = false
fs.permissions.umask-mode = 022
dfs.namenode.acls.enabled = false
dfs.client.use.legacy.blockreader = false
dfs.client.read.shortcircuit = false
dfs.domain.socket.path = /var/run/hdfs-sockets/dn
dfs.client.read.shortcircuit.skip.checksum = false
dfs.client.domain.socket.data.traffic = false
dfs.datanode.hdfs-blocks-metadata.enabled = true

With this small script, I can explore more of the XML files from Cloudera, especially those residing under /var/run/cloudera-scm-agent/process/.

[root@ip-172-31-9-77 process]# cd /var/run/cloudera-scm-agent/process/
[root@ip-172-31-9-77 process]# find . -type f -name *.xml
./38-yarn-NODEMANAGER/yarn-site.xml
./38-yarn-NODEMANAGER/mapred-site.xml
./38-yarn-NODEMANAGER/ssl-server.xml
./38-yarn-NODEMANAGER/hdfs-site.xml
./38-yarn-NODEMANAGER/hadoop-policy.xml
./38-yarn-NODEMANAGER/ssl-client.xml
./38-yarn-NODEMANAGER/core-site.xml
./43-yarn-RESOURCEMANAGER/yarn-site.xml
./43-yarn-RESOURCEMANAGER/ssl-server.xml
./43-yarn-RESOURCEMANAGER/fair-scheduler.xml
./43-yarn-RESOURCEMANAGER/hdfs-site.xml
./43-yarn-RESOURCEMANAGER/core-site.xml
./43-yarn-RESOURCEMANAGER/capacity-scheduler.xml
./43-yarn-RESOURCEMANAGER/hadoop-policy.xml
./43-yarn-RESOURCEMANAGER/mapred-site.xml
./43-yarn-RESOURCEMANAGER/ssl-client.xml
./34-hdfs-DATANODE/hdfs-site.xml
./34-hdfs-DATANODE/ssl-client.xml
./34-hdfs-DATANODE/core-site.xml
./34-hdfs-DATANODE/hadoop-policy.xml
./34-hdfs-DATANODE/ssl-server.xml
./34-hdfs-DATANODE/hdfs-site-refreshable.xml
./40-yarn-JOBHISTORY/yarn-site.xml
./40-yarn-JOBHISTORY/mapred-site.xml
./40-yarn-JOBHISTORY/core-site.xml
./40-yarn-JOBHISTORY/ssl-server.xml
./40-yarn-JOBHISTORY/hdfs-site.xml
./40-yarn-JOBHISTORY/hadoop-policy.xml
./40-yarn-JOBHISTORY/ssl-client.xml

Monday, July 24, 2017

yarn application stops when uid is too low


hi all,

I just want to share an experience: when min.user.id is left at its default, no YARN application can be submitted by a user with a low UID. Usually you will see something like the output below.

Diagnostics: Application application_1500628462670_0096 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is hiuy
main : requested yarn user is hiuy
Requested user hiuy is not whitelisted and has id 501,which is below the minimum allowed 1000

Failing this attempt. Failing the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: root.users.hiuy
 start time: 1500632228942
 final status: FAILED
 user: hiuy
I2017-07-21 06:17:10,181 Client:[ForkJoinPool-1-worker-3] Deleted staging directory hdfs://ip-172-31-16-195.ap-southeast-1.compute.internal:8020/user/hiuy/.sparkStaging/application_1500628462670_0096
E2017-07-21 06:17:10,185 SparkContext:[ForkJoinPool-1-worker-3] Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext.(SparkContext.scala:509) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320) [spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868) [spark-sql_2.11-2.1.1.jar:2.1.1]


To solve the problem, you can either bump up the UID of the submitting user to an integer value above 1000, or lower min.user.id in the YARN configuration. Make sure you restart YARN after the configuration change is done.
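For context, this threshold is enforced by the LinuxContainerExecutor; depending on your distribution it is exposed in the YARN configuration UI or set in container-executor.cfg on the NodeManager hosts. An illustrative fragment, with example values only:

yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=500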

Hope that helps. Thanks!

Thursday, July 13, 2017

Python code: python acting like awk

hi all,

I really love using Unix utilities such as "awk". It is useful when I want to retrieve a certain column from a delimited file, e.g. /etc/passwd. I believe everyone knows it; it is simple and easy.

root@lynx-vm:~# cat /etc/passwd | awk -F":" '{print $1}'
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy


However, I would like to show you how to get the same thing done in Python. Personally, I have struggled quite a bit doing this kind of operation on the command line compared to doing it in Python. I hope these small tricks save you some time.

>>>
>>>
>>> passwdlines = [ line.split(":") for line in open("/etc/passwd") if "#" not in line]
>>> passwdlines[:5]
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n'], ['daemon', 'x', '1', '1', 'daemon', '/usr/sbin', '/usr/sbin/nologin\n'], ['bin', 'x', '2', '2', 'bin', '/bin', '/usr/sbin/nologin\n'], ['sys', 'x', '3', '3', 'sys', '/dev', '/usr/sbin/nologin\n'], ['sync', 'x', '4', '65534', 'sync', '/bin', '/bin/sync\n']]
>>> 


If you would like to get just the first column:
>>>
>>> firstcolumn = [column[0] for column in passwdlines]
>>>
>>>
>>> firstcolumn[:5]
['root', 'daemon', 'bin', 'sys', 'sync']
>>> 


If you would like to find the line that contains "root":
>>>
>>> rootline = [ line for line in passwdlines if "root" in line]
>>>
>>>
>>> rootline
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n']]
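To wrap the pattern up, the whole thing fits in a tiny helper. A small sketch (awk_column is my own name for it), roughly equivalent to awk -F":" '{print $1}' for column 0:

# Roughly equivalent to: awk -F":" '{print $(column+1)}' path
def awk_column(path, sep=":", column=0):
    with open(path) as handle:
        return [line.rstrip("\n").split(sep)[column]
                for line in handle if line.strip() and "#" not in line]

print(awk_column("/etc/passwd")[:5])
# e.g. ['root', 'daemon', 'bin', 'sys', 'sync']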