Sunday, November 12, 2017

Part 2: AWS pricing

This is a follow-up to the earlier post on AWS pricing. Here is a small script that finds the price of each of your On-Demand instances. Please check requirements.txt for the libraries that need to be installed first.

Here is what the output looks like.

(python3.6) nasilemak:aws yenonnhiu$ python3 aws_pricing.py
Asia Pacific (Singapore): server2 t2.medium 2017-10-11 06:16:19
 * Price OnDemand t2.medium effective from 2017-11-09T22:41:06Z: 0.0584000000 USD/hour
 * Total accumulated price in USD: 45.44
 * Monthly charged price in USD: 16.37
US East (N. Virginia): server1 t2.nano 2016-11-21 13:59:05
 * Price OnDemand t2.nano effective from 2017-11-09T22:41:06Z: 0.0058000000 USD/hour
 * Total accumulated price in USD: 49.57
 * Monthly charged price in USD: 1.63
US West (Oregon): ubuntu2 t2.micro 2017-01-23 04:28:35
 * Price OnDemand t2.micro effective from 2017-11-09T22:41:06Z: 0.0116000000 USD/hour
 * Total accumulated price in USD: 81.71
 * Monthly charged price in USD: 3.25
** Total monthly price for all instances in USD: 21.25
** Total accumulated price for all instances in USD: 176.72
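
The script itself is not shown here, so the following is only a minimal sketch of how the two figures per instance could be derived from the hourly price and the launch time. It assumes the instance ran continuously at a single rate, and the "now" timestamp is my guess at when the script was run; the function names are made up for illustration.

```python
from datetime import datetime


def hours_between(start, end):
    """Return the elapsed time between two datetimes in fractional hours."""
    return (end - start).total_seconds() / 3600.0


def accumulated_cost(hourly_rate, launch_time, now):
    """Cost since launch, assuming the instance ran continuously at one rate."""
    return hourly_rate * hours_between(launch_time, now)


def month_to_date_cost(hourly_rate, launch_time, now):
    """Cost since the start of the current month (or since launch, if later)."""
    month_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return hourly_rate * hours_between(max(launch_time, month_start), now)


# Assumed values: the rate and launch time come from the t2.medium line above;
# the "now" timestamp is a guess at when the script was run.
rate = 0.0584
launched = datetime(2017, 10, 11, 6, 16, 19)
now = datetime(2017, 11, 12, 16, 20, 0)

print("Total accumulated price in USD: {:.2f}".format(accumulated_cost(rate, launched, now)))
print("Monthly charged price in USD: {:.2f}".format(month_to_date_cost(rate, launched, now)))
```

Keep in mind that a stopped instance is not billed for instance hours, so a real calculation would have to take the stopped periods into account.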

Friday, November 10, 2017

Part 1: AWS Pricing

Hi

AWS pricing is difficult to look up, because AWS publishes it as a set of JSON files that the user has to parse (https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/index.json). This index.json points you to further index.json files, one per service code, for example AmazonEC2. Here is a link describing how to do that.

Recently, boto3 started allowing developers to query the JSON pricing objects online, without downloading the JSON files. Amazon recently published an important blog post on the same topic. Through the API exposed by boto3.client('pricing'), you can filter on the full range of EC2 attributes and values in order to narrow down the price of an instance.

For example, I have a Linux instance located in US East (N. Virginia), and the instance type is t2.nano. The filter for the function call will be as below.

Filters = [
         {'Type' :'TERM_MATCH', 'Field':'operatingSystem', 'Value':'Linux'},
         {'Type' :'TERM_MATCH', 'Field':'instanceType', 'Value':'t2.nano'},
         {'Type' :'TERM_MATCH', 'Field':'location', 'Value':'US East (N. Virginia)'}
     ],

However, the resulting JSON object is a bit overwhelming to parse when all we want is the price per hour. A Python library such as objectpath helps us drill down to the price easily. The small Python script looks like this.

import boto3
import json
import objectpath


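# The Pricing API is served from only a few regions (us-east-1 is one of them),
# so set the region explicitly instead of relying on your default region.
pricing = boto3.client('pricing', region_name='us-east-1')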
print("Selected EC2 Products")
print("=====================")
response = pricing.get_products(
     ServiceCode='AmazonEC2',
     Filters = [
         {'Type' :'TERM_MATCH', 'Field':'operatingSystem', 'Value':'Linux'},
         {'Type' :'TERM_MATCH', 'Field':'instanceType', 'Value':'t2.nano'},
         {'Type' :'TERM_MATCH', 'Field':'location', 'Value':'US East (N. Virginia)'}
     ],
     MaxResults=100
)
# Each entry in PriceList is a JSON string; this unpacking expects exactly one match.
[price_info_dump] = response['PriceList']
price_tree = objectpath.Tree(json.loads(price_info_dump))
publish_date = price_tree.execute("$.publicationDate")
# terms.OnDemand and priceDimensions are dicts with a single key each
# (the offer term code and the rate code), so unpacking yields that key.
[sku] = price_tree.execute("$.terms.OnDemand")
[rateCode] = price_tree.execute("$.terms.OnDemand.'{}'.priceDimensions".format(sku))
print("Price per hours for OnDemand t2.nano effective from {}: {}".format(
        publish_date,
        price_tree.execute("$.terms.OnDemand.'{}'.priceDimensions.'{}'.pricePerUnit".format(sku, rateCode))
        ))

Here is what the output looks like.

(python3-env1) nasilemak:Developments hiuy$ python3 aws_pricing.py
Selected EC2 Products
=====================
Price per hours for OnDemand t2.nano effective from 2017-11-09T22:41:06Z: {'USD': '0.0116000000'}
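
If you would rather not depend on objectpath, the same drill-down can be sketched with plain dictionaries. This is only an illustration: the sample document below is a hypothetical, heavily trimmed price entry, its SKU, term and rate codes are placeholders rather than real AWS identifiers, and the function name is made up.

```python
import json
from decimal import Decimal

# Hypothetical, heavily trimmed price document; the SKU, term and rate codes
# are placeholders, not real AWS identifiers.
SAMPLE_PRICE_JSON = """
{
  "publicationDate": "2017-11-09T22:41:06Z",
  "terms": {
    "OnDemand": {
      "EXAMPLESKU.EXAMPLETERM": {
        "priceDimensions": {
          "EXAMPLESKU.EXAMPLETERM.EXAMPLERATE": {
            "unit": "Hrs",
            "pricePerUnit": {"USD": "0.0058000000"}
          }
        }
      }
    }
  }
}
"""


def on_demand_hourly_usd(price_json):
    """Return (publication date, USD price) from one PriceList entry."""
    doc = json.loads(price_json)
    # Take the first (normally only) On-Demand offer term and its first rate.
    term = next(iter(doc["terms"]["OnDemand"].values()))
    dimension = next(iter(term["priceDimensions"].values()))
    return doc["publicationDate"], Decimal(dimension["pricePerUnit"]["USD"])


published, price = on_demand_hourly_usd(SAMPLE_PRICE_JSON)
print("Price per hour effective from {}: {} USD".format(published, price))
```

Parsing the price as a Decimal rather than a float keeps the value exact, which matters once you start multiplying by thousands of hours.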

Here is the range of Amazon EC2 filtering attributes and values that you can use.

Selected EC2 Attributes & Values
================================
  volumeType: Cold HDD, General Purpose, Magnetic, Provisioned IOPS, Throughput Optimized HDD
  maxIopsvolume: 10000, 20000, 250 - based on 1 MiB I/O size, 40 - 200, 500 - based on 1 MiB I/O size
  instanceCapacity10xlarge: 1
  locationType: AWS Region
  instanceFamily: Compute optimized, GPU instance, General purpose, Memory optimized, Micro instances, Storage optimized
  operatingSystem: Linux, NA, RHEL, SUSE, Windows
  clockSpeed: 2 GHz, 2.3 GHz, 2.4  GHz, 2.4 GHz, 2.5 GHz, 2.6 GHz, 2.8 GHz, 2.9 GHz, 3.0 Ghz, Up to 3.0 GHz, Up to 3.3 GHz
  LeaseContractLength: 1 yr, 1yr, 3 yr, 3yr
  ecu: 0, 104, 108, 116, 124.5, 12, 132, 135, 139, 13, 14, 16, 188, 20, 26, 278, 27, 28, 2, 31, 33.5, 340, 349, 35, 3, 4, 52, 53.5, 53, 55, 56, 6.5, 62, 7, 88, 8, 94, 99, NA, Variable
  networkPerformance: 10 Gigabit, 20 Gigabit, 25 Gigabit, High, Low to Moderate, Low, Moderate, NA, Up to 10 Gigabit, Very Low
  instanceCapacity8xlarge: 1, 2
  group: EBS I/O Requests, EBS IOPS, EC2-Dedicated Usage, ELB:Balancer, ELB:Balancing, ElasticIP:AdditionalAddress, ElasticIP:Address, ElasticIP:Remap, NGW:NatGateway
  maxThroughputvolume: 160 MB/sec, 250 MiB/s, 320 MB/sec, 40 - 90 MB/sec, 500 MiB/s
  ebsOptimized: Yes
  maxVolumeSize: 1 TiB, 16 TiB
  gpu: 16, 1, 2, 4, 8
  processorFeatures: Intel AVX, Intel AVX2, Intel AVX512, Intel Turbo, Intel AVX, Intel AVX2, Intel Turbo, Intel AVX; Intel AVX2; Intel Turbo, Intel AVX; Intel Turbo
  intelAvxAvailable: Yes
  instanceCapacity4xlarge: 2, 4
  servicecode: AmazonEC2
  groupDescription: Additional Elastic IP address attached to a running instance, Charge for per GB data processed by NAT Gateways with provisioned bandwidth, Charge for per GB data processed by NatGateways, Data processed by Elastic Load Balancer, Elastic IP address attached to a running instance, Elastic IP address remap, Fee for running at least one Dedicated Instance in the region, Hourly charge for NAT Gateways, IOPS, Input/Output Operation, LoadBalancer hourly usage by Application Load Balancer, LoadBalancer hourly usage by Network Load Balancer, Per hour and per Gbps charge for NAT Gateways with provisioned bandwidth, Standard Elastic Load Balancer, Used Application load balancer capacity units-hr, Used Network load balancer capacity units-hr
  processorArchitecture: 32-bit or 64-bit, 64-bit
  physicalCores: 20, 24, 36, 72
  productFamily: Compute Instance, Dedicated Host, Fee, IP Address, Load Balancer-Application, Load Balancer-Network, Load Balancer, NAT Gateway, Storage Snapshot, Storage, System Operation
  enhancedNetworkingSupported: Yes
  intelTurboAvailable: Yes
  memory: 0.5 GiB, 0.613 GiB, 1 GiB, 1,952 Gib, 1.7 GiB, 117 GiB, 122 GiB, 144 GiB, 15 GiB, 15.25 GiB, 16 GiB, 160 GiB, 17.1 GiB, 2 GiB, 22.5 GiB, 23 GiB, 244 GiB, 256 GiB, 3,904 GiB, 3.75 GiB, 30 GiB, 30.5 GiB, 32 GiB, 34.2 GiB, 4 GiB, 488 GiB, 60 GiB, 60.5 GiB, 61 GiB, 64 GiB, 68.4 GiB, 7 GiB, 7.5 GiB, 72 GiB, 768 GiB, 8 GiB, 976 Gib, NA
  dedicatedEbsThroughput: 1000 Mbps, 10000 Mbps, 12000 Mbps, 14000 Mbps, 1600 Mbps, 1750 Mbps, 2000 Mbps, 3000 Mbps, 3500 Mbps, 400 Mbps, 4000 Mbps, 425 Mbps, 450 Mbps, 4500 Mbps, 500 Mbps, 6000 Mbps, 7000 Mbps, 750 Mbps, 800 Mbps, 850 Mbps, 9000 Mbps, Upto 2250 Mbps
  vcpu: 128, 16, 17, 1, 2, 32, 36, 40, 4, 64, 72, 8
  OfferingClass: convertible, standard
  instanceCapacityLarge: 16, 22, 32, 36
  termType: OnDemand, Reserved
  storage: 1 x 0.475 NVMe SSD, 1 x 0.95 NVMe SSD, 1 x 1,920, 1 x 1.9 NVMe SSD, 1 x 160 SSD, 1 x 160, 1 x 32 SSD, 1 x 320 SSD, 1 x 350, 1 x 4 SSD, 1 x 410, 1 x 420, 1 x 60 SSD, 1 x 80 SSD, 1 x 800 SSD, 1 x 850, 12 x 2000 HDD, 2 x 1,920, 2 x 1.9 NVMe SSD, 2 x 1024 SSD, 2 x 120 SSD, 2 x 16 SSD, 2 x 160 SSD, 2 x 320 SSD, 2 x 40 SSD, 2 x 420, 2 x 80 SSD, 2 x 800 SSD, 2 x 840 GB, 2 x 840, 24 x 2000 HDD, 24 x 2000, 3 x 2000 HDD, 4 x 1.9 NVMe SSD, 4 x 420, 4 x 800 SSD, 4 x 840, 6 x 2000 HDD, 8 x 1.9 NVMe SSD, 8 x 800 SSD, EBS only, NA
  intelAvx2Available: Yes
  storageMedia: Amazon S3, HDD-backed, SSD-backed
  physicalProcessor: High Frequency Intel Xeon E7-8880 v3 (Haswell), Intel Xeon E5-2650, Intel Xeon E5-2666 v3 (Haswell), Intel Xeon E5-2670 (Sandy Bridge), Intel Xeon E5-2670 v2 (Ivy Bridge), Intel Xeon E5-2670 v2 (Ivy Bridge/Sandy Bridge), Intel Xeon E5-2670, Intel Xeon E5-2676 v3 (Haswell), Intel Xeon E5-2676v3 (Haswell), Intel Xeon E5-2680 v2 (Ivy Bridge), Intel Xeon E5-2686 v4 (Broadwell), Intel Xeon Family, Intel Xeon Platinum 8124M, Intel Xeon x5570, Variable
  provisioned: No, Yes
  servicename: Amazon Elastic Compute Cloud
  PurchaseOption: All Upfront, AllUpfront, No Upfront, NoUpfront, Partial Upfront, PartialUpfront
  instanceCapacity18xlarge: 1
  instanceType: c1.medium, c1.xlarge, c3.2xlarge, c3.4xlarge, c3.8xlarge, c3.large, c3.xlarge, c3, c4.2xlarge, c4.4xlarge, c4.8xlarge, c4.large, c4.xlarge, c4, c5.18xlarge, c5.2xlarge, c5.4xlarge, c5.9xlarge, c5.large, c5.xlarge, c5, cc1.4xlarge, cc2.8xlarge, cg1.4xlarge, cr1.8xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge, d2.xlarge, d2, f1.16xlarge, f1.2xlarge, f1, g2.2xlarge, g2.8xlarge, g2, g3.16xlarge, g3.4xlarge, g3.8xlarge, g3, hi1.4xlarge, hs1.8xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge, i2.xlarge, i2, i3.16xlarge, i3.2xlarge, i3.4xlarge, i3.8xlarge, i3.large, i3.xlarge, i3, m1.large, m1.medium, m1.small, m1.xlarge, m2.2xlarge, m2.4xlarge, m2.xlarge, m3.2xlarge, m3.large, m3.medium, m3.xlarge, m3, m4.10xlarge, m4.16xlarge, m4.2xlarge, m4.4xlarge, m4.large, m4.xlarge, m4, p2.16xlarge, p2.8xlarge, p2.xlarge, p2, p3.16xlarge, p3.2xlarge, p3.8xlarge, p3, r3.2xlarge, r3.4xlarge, r3.8xlarge, r3.large, r3.xlarge, r3, r4.16xlarge, r4.2xlarge, r4.4xlarge, r4.8xlarge, r4.large, r4.xlarge, r4, t1.micro, t2.2xlarge, t2.large, t2.medium, t2.micro, t2.nano
  tenancy: Dedicated, Host, NA, Reserved, Shared
  usagetype: APN1-BoxUsage:c1.medium, APN1-BoxUsage:c1.xlarge, APN1-BoxUsage:c3.2xlarge, APN1-BoxUsage:c3.4xlarge, APN1-BoxUsage:c3.8xlarge, APN1-BoxUsage:c3.large, APN1-BoxUsage:c3.xlarge, APN1-BoxUsage:c4.2xlarge, APN1-BoxUsage:c4.4xlarge, APN1-BoxUsage:c4.8xlarge, APN1-BoxUsage:c4.large, APN1-BoxUsage:c4.xlarge, APN1-BoxUsage:cc2.8xlarge, APN1-BoxUsage:cr1.8xlarge, APN1-BoxUsage:d2.2xlarge, APN1-BoxUsage:d2.4xlarge, APN1-BoxUsage:d2.8xlarge, APN1-BoxUsage:d2.xlarge, APN1-BoxUsage:g2.2xlarge, APN1-BoxUsage:g2.8xlarge, APN1-BoxUsage:g3.16xlarge, APN1-BoxUsage:g3.4xlarge, APN1-BoxUsage:g3.8xlarge, APN1-BoxUsage:hi1.4xlarge, APN1-BoxUsage:hs1.8xlarge, APN1-BoxUsage:i2.2xlarge, APN1-BoxUsage:i2.4xlarge, APN1-BoxUsage:i2.8xlarge, APN1-BoxUsage:i2.xlarge, APN1-BoxUsage:i3.16xlarge, APN1-BoxUsage:i3.2xlarge, APN1-BoxUsage:i3.4xlarge, APN1-BoxUsage:i3.8xlarge, APN1-BoxUsage:i3.large, APN1-BoxUsage:i3.xlarge, APN1-BoxUsage:m1.large, APN1-BoxUsage:m1.medium, APN1-BoxUsage:m1.xlarge, APN1-BoxUsage:m2.2xlarge, APN1-BoxUsage:m2.4xlarge, APN1-BoxUsage:m2.xlarge, APN1-BoxUsage:m3.2xlarge, APN1-BoxUsage:m3.large, APN1-BoxUsage:m3.medium, APN1-BoxUsage:m3.xlarge, APN1-BoxUsage:m4.10xlarge, APN1-BoxUsage:m4.16xlarge, APN1-BoxUsage:m4.2xlarge, APN1-BoxUsage:m4.4xlarge, APN1-BoxUsage:m4.large, APN1-BoxUsage:m4.xlarge, APN1-BoxUsage:p2.16xlarge, APN1-BoxUsage:p2.8xlarge, APN1-BoxUsage:p2.xlarge, APN1-BoxUsage:p3.16xlarge, APN1-BoxUsage:p3.2xlarge, APN1-BoxUsage:p3.8xlarge, APN1-BoxUsage:r3.2xlarge, APN1-BoxUsage:r3.4xlarge, APN1-BoxUsage:r3.8xlarge, APN1-BoxUsage:r3.large, APN1-BoxUsage:r3.xlarge, APN1-BoxUsage:r4.16xlarge, APN1-BoxUsage:r4.2xlarge, APN1-BoxUsage:r4.4xlarge, APN1-BoxUsage:r4.8xlarge, APN1-BoxUsage:r4.large, APN1-BoxUsage:r4.xlarge, APN1-BoxUsage:t1.micro, APN1-BoxUsage:t2.2xlarge, APN1-BoxUsage:t2.large, APN1-BoxUsage:t2.medium, APN1-BoxUsage:t2.micro, APN1-BoxUsage:t2.nano, APN1-BoxUsage:t2.small, APN1-BoxUsage:t2.xlarge, APN1-BoxUsage:x1.16xlarge, 
APN1-BoxUsage:x1.32xlarge, APN1-BoxUsage:x1e.32xlarge, APN1-BoxUsage, APN1-DataProcessing-Bytes, APN1-DedicatedUsage:c1.medium, APN1-DedicatedUsage:c1.xlarge, APN1-DedicatedUsage:c3.2xlarge, APN1-DedicatedUsage:c3.4xlarge, APN1-DedicatedUsage:c3.8xlarge, APN1-DedicatedUsage:c3.large, APN1-DedicatedUsage:c3.xlarge, APN1-DedicatedUsage:c4.2xlarge, APN1-DedicatedUsage:c4.4xlarge, APN1-DedicatedUsage:c4.8xlarge, APN1-DedicatedUsage:c4.large, APN1-DedicatedUsage:c4.xlarge, APN1-DedicatedUsage:cc2.8xlarge, APN1-DedicatedUsage:cr1.8xlarge, APN1-DedicatedUsage:d2.2xlarge, APN1-DedicatedUsage:d2.4xlarge, APN1-DedicatedUsage:d2.8xlarge, APN1-DedicatedUsage:d2.xlarge, APN1-DedicatedUsage:g2.2xlarge
  normalizationSizeFactor: 0.25, 0.5, 128, 144, 16, 1, 256, 2, 32, 4, 64, 72, 80, 8, NA
  instanceCapacity16xlarge: 1, 2
  instanceCapacity2xlarge: 4, 5, 8
  maxIopsBurstPerformance: 3000 for volumes <= 1 TiB, Hundreds
  instanceCapacity32xlarge: 1
  instanceCapacityXlarge: 11, 16, 18, 8
  licenseModel: Bring your own license, NA, No License required
  currentGeneration: No, Yes
  preInstalledSw: NA, SQL Ent, SQL Std, SQL Web
  location: AWS GovCloud (US), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), South America (Sao Paulo), US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon)
  instanceCapacity9xlarge: 2
  instanceCapacityMedium: 32
  operation: Hourly, LoadBalancing:Application, LoadBalancing:Network, LoadBalancing, NatGateway, RunInstances:0002, RunInstances:0006, RunInstances:000g, RunInstances:0010, RunInstances:0102, RunInstances:0202, RunInstances:0800, RunInstances, Surcharge
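
The code that produced the listing above is not included in this post, so here is a rough sketch of how such a listing could be generated with the describe_services and get_attribute_values calls of the pricing client. The function names list_attribute_values and main are my own, and main needs boto3 plus AWS credentials, so it is not called automatically.

```python
def list_attribute_values(pricing_client, service_code="AmazonEC2"):
    """Map every attribute name of a service to its sorted list of values."""
    service = pricing_client.describe_services(ServiceCode=service_code)["Services"][0]
    paginator = pricing_client.get_paginator("get_attribute_values")
    result = {}
    for name in service["AttributeNames"]:
        values = []
        # Some attributes (e.g. usagetype) have many values, so follow the pages.
        for page in paginator.paginate(ServiceCode=service_code, AttributeName=name):
            values.extend(item["Value"] for item in page["AttributeValues"])
        result[name] = sorted(values)
    return result


def main():
    # Needs boto3 and AWS credentials; call main() yourself to hit the live API.
    import boto3

    pricing = boto3.client("pricing", region_name="us-east-1")
    print("Selected EC2 Attributes & Values")
    print("================================")
    for name, values in list_attribute_values(pricing).items():
        print("  {}: {}".format(name, ", ".join(values)))
```

Passing the client in as a parameter keeps the listing logic easy to try out against a stub before pointing it at the real API.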

Tuesday, October 17, 2017

Finding out more on aws instances

hi all,

There is a quick way to print all of your AWS instances. Here is a small piece of Python code to help you. However, you need to install the boto3 library before anything will work. Please read the README.md to set things up.

https://github.com/yenonn/server-inventory/blob/master/aws/aws_lib.py

The small function show_ec2_instances will print out all of the instances. Enjoy!

(python3-env1) nasilemak:aws hiuy$ python3
Python 3.5.2 (default, Oct 11 2016, 04:59:56)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from aws_lib import *
>>> show_ec2_instances()
###(Region:ap-south-1)###
###(Region:eu-west-2)###
###(Region:eu-west-1)###
###(Region:ap-northeast-2)###
###(Region:ap-northeast-1)###
  * linux1: stopped
  * linux2: stopped
  * linux3: stopped
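
For readers who do not want to open the repository, here is a minimal sketch of what a function like this could look like. It is not the code from aws_lib.py; it is my own approximation that assumes boto3 is installed, credentials are configured, and instances are named through a "Name" tag.

```python
def instance_name(tags):
    """Return the value of the 'Name' tag, or a placeholder when it is missing."""
    for tag in tags or []:
        if tag.get("Key") == "Name":
            return tag.get("Value", "")
    return "(unnamed)"


def show_ec2_instances():
    """Print every EC2 instance and its state, region by region."""
    import boto3  # imported here so the helper above works without boto3

    regions = boto3.client("ec2", region_name="us-east-1").describe_regions()["Regions"]
    for region in regions:
        region_name = region["RegionName"]
        print("###(Region:{})###".format(region_name))
        ec2 = boto3.resource("ec2", region_name=region_name)
        for instance in ec2.instances.all():
            print("  * {}: {}".format(instance_name(instance.tags), instance.state["Name"]))
```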




Tuesday, September 19, 2017

Potential problem to start name node after kerberizing HDP2.6

hi all,

I am not sure whether you will face this problem, but I have hit it twice when enabling Kerberos on HDP 2.6. Here is what you will find in the log file.

  1. 2017-09-19 02:56:17,375 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/current/hadoop-client/libexec'}, 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
  2. 2017-09-19 02:56:21,432 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-hiuy@EXAMPLE.COM'] {'user': 'hdfs'}
  3. 2017-09-19 02:56:25,454 - Waiting for this NameNode to leave Safemode due to the following conditions: HA: False, isActive: True, upgradeType: None
  4. 2017-09-19 02:56:25,454 - Waiting up to 19 minutes for the NameNode to leave Safemode...
  5. 2017-09-19 02:56:25,454 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://ip-172-31-9-254.ap-southeast-1.compute.internal:8020 -safemode get | grep 'Safe mode is OFF''] {'logoutput': True, 'tries': 115, 'user': 'hdfs', 'try_sleep': 10} safemode: Call From ip-172-31-9-254.ap-southeast-1.compute.internal/172.31.9.254 to ip-172-31-9-254.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  6. 2017-09-19 02:56:27,484 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://ip-172-31-9-254.ap-southeast-1.compute.internal:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From ip-172-31-9-254.ap-southeast-1.compute.internal/172.31.9.254 to ip-172-31-9-254.ap-southeast-1.compute.internal:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
When you dig into hadoop-hdfs.log for more details, you will find entries like the one below.

Caused by: javax.security.auth.login.LoginException: Receive timed out at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:808) at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) at 


Please pay attention to the "Receive timed out" message in the LoginException. With that hint, we can tell that the problem is caused by a communication timeout with the KDC server. There is a useful link that documents the problem.



The solution to this problem is to make the Kerberos client talk to the KDC over TCP instead of UDP, by adding the following line to krb5.conf under the [libdefaults] section:

udp_preference_limit = 1

There is a small trick to apply this setting to all nodes in the cluster: go to Ambari > Kerberos > Configs > Advanced krb5-conf and make the change there.

Hope that helps. Thanks for reading.

  





Wednesday, August 2, 2017

Parsing XML hadoop files

hi all,

I find reading the Cloudera Hadoop XML files to be one of the most tedious jobs in the world, partly because some of them, e.g. hdfs-site.xml and yarn-site.xml, are long and repetitive. So I came up with the idea of reading the XML files with a Python parser and printing them to stdout and to an output file (.out) in a name = value pattern. That way, my eyes are much more comfortable reading them. I would like to share the simple Python code with you.

from xml.etree import ElementTree
import os
import sys

input_file_name = sys.argv[1]
full_file_name = os.path.abspath(input_file_name)
# Replace only the final extension, e.g. hdfs-site.xml -> hdfs-site.out
file_name = os.path.splitext(input_file_name)[0] + '.out'

dom = ElementTree.parse(full_file_name)
properties = dom.findall('property')

with open(file_name, 'w') as write_xml_conf:
  for p in properties:
    # findtext returns None instead of raising when an element is missing
    name = p.findtext('name')
    value = p.findtext('value')
    print("{} = {}".format(name, value))
    write_xml_conf.write("{} = {}\n".format(name, value))

Here is the output of the script. Besides printing to the screen, it writes the same output to a file.

[root@ip-172-31-9-77 21-yarn-NODEMANAGER]# decode_xml.py hdfs-site.xml
dfs.namenode.name.dir = file:///dfs/nn
dfs.namenode.servicerpc-address = ip-172-31-14-234.ap-southeast-1.compute.internal:8022
dfs.https.address = ip-172-31-14-234.ap-southeast-1.compute.internal:50470
dfs.https.port = 50470
dfs.namenode.http-address = ip-172-31-14-234.ap-southeast-1.compute.internal:50070
dfs.replication = 3
dfs.blocksize = 134217728
dfs.client.use.datanode.hostname = false
fs.permissions.umask-mode = 022
dfs.namenode.acls.enabled = false
dfs.client.use.legacy.blockreader = false
dfs.client.read.shortcircuit = false
dfs.domain.socket.path = /var/run/hdfs-sockets/dn
dfs.client.read.shortcircuit.skip.checksum = false
dfs.client.domain.socket.data.traffic = false
dfs.datanode.hdfs-blocks-metadata.enabled = true
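
To make the input side clearer, here is a small self-contained sketch that runs the same kind of parsing on an inline sample. The two-property XML excerpt is a made-up example in the usual *-site.xml layout, and site_xml_to_dict is just a name I picked for illustration.

```python
from xml.etree import ElementTree

# Hypothetical two-property excerpt in the Hadoop *-site.xml layout.
SAMPLE_XML = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
"""


def site_xml_to_dict(xml_text):
    """Return the name -> value pairs of every <property> element."""
    root = ElementTree.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


for name, value in site_xml_to_dict(SAMPLE_XML).items():
    print("{} = {}".format(name, value))
```

Having the properties in a dictionary also makes it easy to compare two files, for example the hdfs-site.xml of two different roles.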

With this small script, I can explore more XML files from Cloudera, especially those that reside in /var/run/cloudera-scm-agent/process/

[root@ip-172-31-9-77 process]# cd /var/run/cloudera-scm-agent/process/
[root@ip-172-31-9-77 process]# find . -type f -name '*.xml'
./38-yarn-NODEMANAGER/yarn-site.xml
./38-yarn-NODEMANAGER/mapred-site.xml
./38-yarn-NODEMANAGER/ssl-server.xml
./38-yarn-NODEMANAGER/hdfs-site.xml
./38-yarn-NODEMANAGER/hadoop-policy.xml
./38-yarn-NODEMANAGER/ssl-client.xml
./38-yarn-NODEMANAGER/core-site.xml
./43-yarn-RESOURCEMANAGER/yarn-site.xml
./43-yarn-RESOURCEMANAGER/ssl-server.xml
./43-yarn-RESOURCEMANAGER/fair-scheduler.xml
./43-yarn-RESOURCEMANAGER/hdfs-site.xml
./43-yarn-RESOURCEMANAGER/core-site.xml
./43-yarn-RESOURCEMANAGER/capacity-scheduler.xml
./43-yarn-RESOURCEMANAGER/hadoop-policy.xml
./43-yarn-RESOURCEMANAGER/mapred-site.xml
./43-yarn-RESOURCEMANAGER/ssl-client.xml
./34-hdfs-DATANODE/hdfs-site.xml
./34-hdfs-DATANODE/ssl-client.xml
./34-hdfs-DATANODE/core-site.xml
./34-hdfs-DATANODE/hadoop-policy.xml
./34-hdfs-DATANODE/ssl-server.xml
./34-hdfs-DATANODE/hdfs-site-refreshable.xml
./40-yarn-JOBHISTORY/yarn-site.xml
./40-yarn-JOBHISTORY/mapred-site.xml
./40-yarn-JOBHISTORY/core-site.xml
./40-yarn-JOBHISTORY/ssl-server.xml
./40-yarn-JOBHISTORY/hdfs-site.xml
./40-yarn-JOBHISTORY/hadoop-policy.xml
./40-yarn-JOBHISTORY/ssl-client.xml

Monday, July 24, 2017

yarn application stops when uid is too low


hi all,

I just want to share an experience: when min.user.id is left at its default, no YARN application can be submitted by a user whose UID is below that minimum. Usually, you will see something like the output below.

Diagnostics: Application application_1500628462670_0096 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is hiuy
main : requested yarn user is hiuy
Requested user hiuy is not whitelisted and has id 501,which is below the minimum allowed 1000

Failing this attempt. Failing the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: root.users.hiuy
 start time: 1500632228942
 final status: FAILED
 user: hiuy
I2017-07-21 06:17:10,181 Client:[ForkJoinPool-1-worker-3] Deleted staging directory hdfs://ip-172-31-16-195.ap-southeast-1.compute.internal:8020/user/hiuy/.sparkStaging/application_1500628462670_0096
E2017-07-21 06:17:10,185 SparkContext:[ForkJoinPool-1-worker-3] Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320) [spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868) [spark-sql_2.11-2.1.1.jar:2.1.1]


To solve the problem, you can either raise the UID of the YARN application user to 1000 or above, or lower min.user.id in the YARN configuration. Make sure that you restart YARN after changing the configuration.
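
The rule behind the error message can be sketched in a few lines. This is only my simplified reading of the log line above ("not whitelisted and has id 501, which is below the minimum allowed 1000"), not the actual YARN implementation; the function name and the allowed_system_users parameter are assumptions made for illustration.

```python
def is_user_allowed(uid, username, min_user_id=1000, allowed_system_users=()):
    """Simplified check: whitelisted users pass, everyone else needs uid >= minimum."""
    return username in allowed_system_users or uid >= min_user_id


# Values taken from the log above: user hiuy has UID 501, the minimum is 1000.
for min_uid in (1000, 500):
    verdict = "allowed" if is_user_allowed(501, "hiuy", min_user_id=min_uid) else "rejected"
    print("min.user.id={}: user hiuy (uid 501) is {}".format(min_uid, verdict))
```

Both fixes described above show up here: either the UID goes up to the minimum or beyond, or the minimum comes down to the UID or below.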

Hope that helps. Thanks!

Thursday, July 13, 2017

Python code: python acting like awk

hi all,

I really love using Unix utilities such as "awk". It is useful when I want to retrieve a certain column from a delimited file, e.g. /etc/passwd. I believe everyone knows it; it is simple and easy.

root@lynx-vm:~# cat /etc/passwd | awk -F":" '{print $1}'
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy


However, I would like to show you how to get the same thing done with Python. Personally, I have struggled a lot when doing similar operations with Unix commands, as compared with Python. I hope these small tricks save you some time.

>>>
>>>
>>> passwdlines = [ line.split(":") for line in open("/etc/passwd") if "#" not in line]
>>> passwdlines[:5]
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n'], ['daemon', 'x', '1', '1', 'daemon', '/usr/sbin', '/usr/sbin/nologin\n'], ['bin', 'x', '2', '2', 'bin', '/bin', '/usr/sbin/nologin\n'], ['sys', 'x', '3', '3', 'sys', '/dev', '/usr/sbin/nologin\n'], ['sync', 'x', '4', '65534', 'sync', '/bin', '/bin/sync\n']]
>>> 


If you would like to get the first column:
>>>
>>> firstcolumn = [column[0] for column in passwdlines]
>>>
>>>
>>> firstcolumn[:5]
['root', 'daemon', 'bin', 'sys', 'sync']
>>> 


If you would like to find the line that contains "root":
>>>
>>> rootline = [ line for line in passwdlines if "root" in line]
>>>
>>>
>>> rootline
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n']]
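
The list comprehensions above can be wrapped into a small reusable helper. This is just a sketch; awk_column is a name I made up, and the two sample lines are copied from the /etc/passwd output shown earlier so that the example runs anywhere.

```python
def awk_column(lines, column, delimiter=":"):
    """Yield the 1-based column of each line, roughly like awk -F: '{print $N}'."""
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Like awk, give back an empty string when the line has too few fields.
        yield fields[column - 1] if len(fields) >= column else ""


sample_lines = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
]

for user in awk_column(sample_lines, 1):
    print(user)
```

To run it on the real file, pass an open file object instead of the sample list, e.g. awk_column(open("/etc/passwd"), 1).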

Wednesday, July 12, 2017

Python code: Calculating stack of balanced brackets in a string

Hi

I am solving a problem about a string of brackets: detecting whether the brackets in the string are matched and balanced, e.g. for strings like the following:


[]][{]{(({{)[})(}[[))}{}){[{]}{})()[{}]{{]]]){{}){({(}](({[{[{)]{)}}}({[)}}([{{]]({{
or 
(}{(()[][[){{}{{[}][]{{{{[{{[](}{)}](}}()]}(}(}}]}[](]]){{{()}({[[}}{{[]}(]}{(]{}}[()(}]{[[]{){{

I felt this was something fun and challenging to do. If you have good and creative answers, or find that my code is buggy in some cases, please keep me posted. Thanks!

def is_matched(expression):
    # Map each opening bracket to its closing partner.
    pairs = {"{": "}", "[": "]", "(": ")"}
    open_brackets = []  # stack of opening brackets that are still unmatched
    for char in expression:
        if char in pairs:
            open_brackets.append(char)
        elif not open_brackets or pairs[open_brackets.pop()] != char:
            # A closer (or any other character) without a matching opener.
            return False
    # Balanced only if every opening bracket has been closed.
    return not open_brackets


expression = input().strip()
if is_matched(expression):
    print("YES")
else:
    print("NO")

Wednesday, March 22, 2017

HDFS Explorer

Hi all,

The Hadoop HDFS command is good, but it goes haywire when we have long argument lists; those are error-prone and tedious to construct. With this in mind, I developed a wrapper script to make these operations easier. It is easy to install and easy to use. More description and details about the tool are at the link below.

Thank you.

https://github.com/yenonn/hdfs-explorer

Example of the output:

Hadoop explorer
HDFS > ls /
 * Running: hdfs dfs -ls /

Found 4 items
drwxr-xr-x   - hdfs supergroup          0 2016-12-01 11:56 /system
drwxr-xr-x   - kite kite                0 2015-11-04 16:36 /testing
drwxrwxrwt   - hdfs supergroup          0 2017-03-23 11:00 /tmp
drwxr-xr-x   - hdfs supergroup          0 2016-02-19 16:17 /user


Thursday, December 1, 2016

Cloudera HDFS kerberized failure: GSSException No Valid credentials

hi hadooper,

This is a problem that you may face once you have enabled Kerberos on a Cloudera server.

Here is what the Hadoop log looks like:

2016-12-01 20:28:21,650 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8022: readAndProcess from client 172.31.6.120 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]]
2016-12-01 20:28:23,457 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8022: readAndProcess from client 172.31.0.159 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]]
2016-12-01 20:28:23,627 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8022: readAndProcess from client 172.31.0.158 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]]

The quick remedy is to install the Java Cryptography Extension (JCE) policy files on all the nodes in the cluster.

Here are the steps to apply the JCE jar files.

1. Download the tarball, which contains the following jar files.

US_export_policy.jar
local_policy.jar

2. Copy them onto each node, overwriting the existing jar files.

/usr/java/jdk1.7.0_67-cloudera/jre/lib/security/US_export_policy.jar
/usr/java/jdk1.7.0_67-cloudera/jre/lib/security/local_policy.jar

3. Make sure the permissions and ownership of the files are retained.

4. Restart the Hadoop HDFS services.

5. Check the log files under /var/log/hadoop-hdfs/* to see whether the same errors still appear.

6. Verify your kerberized HDFS is working properly.

[root@ip-172-31-0-157 ~]# kinit cloudera-scm/admin
Password for cloudera-scm/admin@EXAMPLE.COM:
[root@ip-172-31-0-157 ~]# hadoop fs -ls /
Found 1 items
drwxrwxrwt   - hdfs supergroup          0 2016-11-30 21:59 /tmp

7. If you wish to increase the verbosity of the output, you can export an environment variable, e.g.

export HADOOP_OPTS="-Dsun.security.krb5.debug=true"

8. If you wish to renew the ticket:

kinit -R

9. The Kerberos client configuration in /etc/krb5.conf must also specify the encryption types correctly, e.g. as below; otherwise, you will get errors like the following.

[root@ip-172-31-11-158 197-hdfs-NAMENODE]# cat /etc/krb5.conf
[libdefaults]
default_realm = EXAMPLE.COM
dns_lookup_kdc = false
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
udp_preference_limit = 1
kdc_timeout = 3000
[realms]
EXAMPLE.COM = {
kdc = ip-172-31-25-156.ap-southeast-1.compute.internal
admin_server = ip-172-31-25-156.ap-southeast-1.compute.internal
}

[root@ip-172-31-25-156 ~]# hadoop fs -ls /
Java config name: null Native config name: /etc/krb5.conf
Loaded from native config
>>>KinitOptions cache name is /tmp/krb5cc_0
>>>DEBUG  client principal is hiuy@EXAMPLE.COM
>>>DEBUG server principal is krbtgt/EXAMPLE.COM@EXAMPLE.COM
>>>DEBUG key type: 18
>>>DEBUG auth time: Fri Dec 02 03:42:32 EST 2016
>>>DEBUG start time: Fri Dec 02 03:42:32 EST 2016
>>>DEBUG end time: Sat Dec 03 03:42:32 EST 2016
>>>DEBUG renew_till time: Fri Dec 02 03:42:32 EST 2016
>>> CCacheInputStream: readFlags()  FORWARDABLE; RENEWABLE; INITIAL;
>>>DEBUG  client principal is hiuy@EXAMPLE.COM
>>>DEBUG server principal is X-CACHECONF:/krb5_ccache_conf_data/fast_avail/krbtgt/EXAMPLE.COM@EXAMPLE.COM
>>>DEBUG key type: 0
>>>DEBUG auth time: Wed Dec 31 19:00:00 EST 1969
>>>DEBUG start time: null
>>>DEBUG end time: Wed Dec 31 19:00:00 EST 1969
>>>DEBUG renew_till time: null
>>> CCacheInputStream: readFlags()
>>> unsupported key type found the default TGT: 18
16/12/02 04:23:11 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
16/12/02 04:23:11 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
16/12/02 04:23:11 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
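
The "key type" numbers in the debug output are Kerberos encryption type identifiers, and a small lookup makes them easier to compare with the enctypes named in krb5.conf. The sketch below is only a reading aid built on my own helper names; the sample strings are shortened copies of the output and the configuration shown above, and the table covers just a few common types.

```python
import re

# Kerberos encryption type numbers (from the Kerberos RFCs) and their krb5.conf names.
ENCTYPE_NAMES = {
    16: "des3-cbc-sha1",
    17: "aes128-cts-hmac-sha1-96",
    18: "aes256-cts-hmac-sha1-96",
    23: "rc4-hmac",
}


def ticket_key_types(debug_output):
    """Return the distinct non-zero 'key type' numbers found in the debug output."""
    found = {int(n) for n in re.findall(r"key type: (\d+)", debug_output)}
    return sorted(found - {0})


def permitted_enctypes(krb5_conf_text):
    """Return the whitespace-separated names on the permitted_enctypes line, if any."""
    match = re.search(r"^\s*permitted_enctypes\s*=\s*(.+)$", krb5_conf_text, re.MULTILINE)
    return match.group(1).split() if match else []


# Shortened samples taken from the debug output and krb5.conf above.
debug = ">>>DEBUG key type: 18\n>>>DEBUG key type: 0\n"
conf = "[libdefaults]\npermitted_enctypes = rc4-hmac\n"

allowed = permitted_enctypes(conf)
for number in ticket_key_types(debug):
    name = ENCTYPE_NAMES.get(number, "unknown")
    status = "permitted" if name in allowed else "not in permitted_enctypes"
    print("key type {} = {} ({})".format(number, name, status))
```

Key type 18 is the AES256 type that the first log in this post complains about, so this kind of comparison is a quick way to spot a mismatch between the ticket and the client configuration.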