Monday, July 24, 2017

yarn application stops when uid is too low


hi all,

I just want to share an experience: when min.user.id is left at its default value and the submitting user's UID is below it, that user cannot submit any YARN application. Usually, you will see something like the output below.

Diagnostics: Application application_1500628462670_0096 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is hiuy
main : requested yarn user is hiuy
Requested user hiuy is not whitelisted and has id 501,which is below the minimum allowed 1000

Failing this attempt. Failing the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: root.users.hiuy
 start time: 1500632228942
 final status: FAILED
 user: hiuy
I2017-07-21 06:17:10,181 Client:[ForkJoinPool-1-worker-3] Deleted staging directory hdfs://ip-172-31-16-195.ap-southeast-1.compute.internal:8020/user/hiuy/.sparkStaging/application_1500628462670_0096
E2017-07-21 06:17:10,185 SparkContext:[ForkJoinPool-1-worker-3] Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62) ~[spark-yarn_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509) ~[spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320) [spark-core_2.11-2.1.1.jar:2.1.1]
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868) [spark-sql_2.11-2.1.1.jar:2.1.1]


To solve the problem, you can either change the UID of the user who submits the YARN application to 1000 or higher, or lower min.user.id in the YARN configuration so that it is no greater than that user's UID. Remember to restart YARN after changing the configuration.
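Before changing anything, it can help to confirm that the UID really is the issue. The following is only a minimal sketch: the threshold of 1000 is taken from the error message above, the helper names are made up for illustration, and it does not model the whitelist that the message mentions. It relies on Python's standard pwd module, so it only works on Unix-like hosts.

```python
import pwd

MIN_USER_ID = 1000  # assumption: the threshold reported in the error message


def is_below_minimum(uid, min_uid=MIN_USER_ID):
    """Return True if a UID is lower than the given minimum."""
    return uid < min_uid


def check_user(name, min_uid=MIN_USER_ID):
    """Look up a local account and report (uid, too_low)."""
    uid = pwd.getpwnam(name).pw_uid  # raises KeyError if the user is unknown
    return uid, is_below_minimum(uid, min_uid)


print(is_below_minimum(501))   # True: the UID from the log above
print(is_below_minimum(1000))  # False: exactly at the minimum is allowed
```

On a cluster node, check_user("hiuy") would return the account's UID together with a flag saying whether it is under the threshold.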

Hope that helps. Thanks!

Thursday, July 13, 2017

Python code: python acting like awk

hi all,

I really love using Unix utilities such as "awk". It is useful when I want to retrieve a certain column from a delimited file, e.g. /etc/passwd. I believe most people already know it. It is simple and easy.

root@lynx-vm:~# cat /etc/passwd | awk -F":" '{print $1}'
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy


However, I would like to show you how to get the same thing done in Python. Personally, I have struggled a lot more doing this kind of operation with Unix commands than with Python. I hope these small tricks save you some time.

>>> passwdlines = [ line.split(":") for line in open("/etc/passwd") if "#" not in line]
>>> passwdlines[:5]
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n'], ['daemon', 'x', '1', '1', 'daemon', '/usr/sbin', '/usr/sbin/nologin\n'], ['bin', 'x', '2', '2', 'bin', '/bin', '/usr/sbin/nologin\n'], ['sys', 'x', '3', '3', 'sys', '/dev', '/usr/sbin/nologin\n'], ['sync', 'x', '4', '65534', 'sync', '/bin', '/bin/sync\n']]
>>> 


To get the first column:
>>> firstcolumn = [column[0] for column in passwdlines]
>>> firstcolumn[:5]
['root', 'daemon', 'bin', 'sys', 'sync']
>>> 


To find the entries that have a field exactly equal to "root" (each entry is now a list, so "in" compares whole fields rather than substrings):
>>> rootline = [ line for line in passwdlines if "root" in line]
>>> rootline
[['root', 'x', '0', '0', 'root', '/root', '/bin/bash\n']]
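
The same idea can be wrapped in a small reusable helper. This is only a sketch: the function name and the sample lines are made up for illustration, the index is 0-based (unlike awk's $1), and it skips lines that start with "#" instead of any line that contains "#". It also strips the trailing newline, so the last field comes out clean.

```python
def column(lines, index, sep=":"):
    """Yield field `index` (0-based) from each non-blank, non-comment line."""
    for line in lines:
        line = line.rstrip("\n")  # drop the newline that stays on the last field
        if not line or line.startswith("#"):
            continue
        fields = line.split(sep)
        if index < len(fields):  # skip lines that are too short
            yield fields[index]


# Made-up sample lines in /etc/passwd format.
sample = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "# a comment line\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
]
print(list(column(sample, 0)))  # ['root', 'daemon']
print(list(column(sample, 6)))  # ['/bin/bash', '/usr/sbin/nologin']
```

Because the helper accepts any iterable of lines, an open file object can be passed in directly, e.g. column(open("/etc/passwd"), 0).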

Wednesday, July 12, 2017

Python code: Calculating stack of balanced brackets in a string

Hi

I am working on a bracket problem: detecting whether the brackets in a string are matched and balanced, e.g. for a string like the following:


[]][{]{(({{)[})(}[[))}{}){[{]}{})()[{}]{{]]]){{}){({(}](({[{[{)]{)}}}({[)}}([{{]]({{
or 
(}{(()[][[){{}{{[}][]{{{{[{{[](}{)}](}}()]}(}(}}]}[](]]){{{()}({[[}}{{[]}(]}{(]{}}[()(}]{[[]{){{

I found this fun and challenging to do. If you have a good and creative answer, or find that my code is buggy for some cases, please keep me posted. Thanks!

def is_matched(expression):
    # Map each opening bracket to its closing partner.
    DictBracket = {"{": "}", "[": "]", "(": ")"}
    OpenBracket = []  # stack of opening brackets that are still unclosed
    for char in expression:
        if char in DictBracket:
            OpenBracket.append(char)
        elif char in DictBracket.values():
            # A closing bracket must match the most recent unclosed opener.
            if not OpenBracket or DictBracket[OpenBracket.pop()] != char:
                return False
        else:
            # Anything that is not a bracket makes the string invalid.
            return False
    return len(OpenBracket) == 0


expression = input().strip()
if is_matched(expression):
    print("YES")
else:
    print("NO")
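
As one possible variation, here is a sketch that reports where the first problem occurs instead of only answering yes or no. The function name and the sample strings are made up for illustration. It uses the same stack idea: push every opening bracket, and pop when a matching closing bracket arrives.

```python
def first_mismatch(expression):
    """Return the index of the first offending character, or None if balanced."""
    pairs = {"{": "}", "[": "]", "(": ")"}
    closers = set(pairs.values())
    stack = []  # (index, opener) for every bracket still waiting to be closed
    for index, char in enumerate(expression):
        if char in pairs:
            stack.append((index, char))
        elif char in closers:
            if not stack or pairs[stack[-1][1]] != char:
                return index  # closer that does not match the latest opener
            stack.pop()
        else:
            return index  # not a bracket at all
    # Any leftover openers were never closed; report the earliest one.
    return stack[0][0] if stack else None


for sample in ["{[()]}", "(]", "()]", "(("]:
    print(sample, first_mismatch(sample))
```

A result of None means the string is balanced. Any other result is the 0-based position of the first character that breaks the balance, or of the earliest opener that was never closed.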