Posts

Showing posts from May, 2018

Unix Tutorial 17 : Advance Unix

Unix includes commands for:
Testing various conditions associated with specified files.
Testing various conditions associated with specified strings.
Performing file read/write operations.

Arrays

Arrays are used to store a series of values in an indexed list. Items in an array are stored and retrieved using an index. Note that arrays are not supported by the original Bourne shell, but are supported by bash and other newer shells.

File Test Operators

Shell scripts often need to check various properties of files as part of the control flow. Unix provides a number of options for this purpose.

File existence checks:
-f file  True if the file exists and is an ordinary file.
-d file  True if the file exists and is a directory.
-s file  True if the file exists and is not empty.
-c file  True if the file exists and is a character device file.
-b file  True if the file exists and is a block device file.

File access checks:
-r file  True if the file exists and has read perm...
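The file test operators listed above can be combined in an ordinary if statement. The following is a minimal sketch, not code from the tutorial: describe_path is a hypothetical helper name, and the temporary directory exists only to give the tests something to inspect.

```shell
# describe_path is a hypothetical helper used only for this sketch.
# It classifies its argument using the -d, -f and -s file tests.
describe_path() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    else
        echo "missing or special"
    fi
}

# Build a scratch directory with one empty and one non-empty file.
tmpdir=$(mktemp -d)
: > "$tmpdir/empty.txt"
echo "data" > "$tmpdir/full.txt"

describe_path "$tmpdir"             # prints: directory
describe_path "$tmpdir/empty.txt"   # prints: empty file
describe_path "$tmpdir/full.txt"    # prints: non-empty file

rm -r "$tmpdir"
```

Quoting "$1" keeps the tests well-formed even when a path contains spaces.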

Unix Tutorial 16 : Command Line Arguments

While running a command, the user can pass a variable number of parameters on the command line. Within the command script, the passed parameters are accessible using ‘positional parameters’. These range from $0 to $9, where $0 refers to the name of the command itself, and $1 to $9 are the first through ninth parameters, depending on how many parameters were actually passed.

Example:
    $ sh hello how to do you do
Here sh runs the script named hello, so:
$0 would be assigned hello
$1 would be assigned how
$2 would be assigned to
And so on …

We will now look at some additional commands to process these parameters.

#1) set
Each process is also associated with a priority. This is used to ensure that the OS is able to fairly allocate time to various processing tasks. The ‘nice’ command can be used to reduce the priority of a process and thus be ‘nice’ to the other processes, i.e.
    $ nice <command>
This line will run the specified command at a lower priority – by default, the priority wi...
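Positional parameters behave the same way inside a shell function, which makes them easy to try out without writing a separate script. The sketch below is an illustration only: show_args and reset_args are hypothetical names, and the use of `set --` to replace the current parameters is an assumption about what the set section goes on to cover.

```shell
# show_args is a hypothetical helper: it reports how many parameters
# it received ($#) and echoes the first two ($1 and $2).
show_args() {
    echo "count=$#"
    echo "first=$1"
    echo "second=$2"
}

# reset_args is also hypothetical: "set --" replaces the positional
# parameters of the current scope (here, the function) with new values.
reset_args() {
    set -- alpha beta
    echo "$1 $2 $#"
}

show_args how do you do   # prints count=4, first=how, second=do
reset_args ignored        # prints: alpha beta 2
```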

Unix Tutorial 15 : Text Processing Commands 2

Pipes in Unix

A series of filter commands can be piped together using the pipe symbol: ‘|’. When two commands are piped together, the stdin of the second program is read from the stdout of the first program. This creates a powerful mechanism for running complex commands quickly.

Command sort: this command is used to sort the contents of a file. It can also merge sorted files and store the result in a file. The contents of the original file remain unaltered.
Common Syntax: sort [OPTION]... [FILE]...
Example 1: sort file1
This command will sort the contents of file1.
Example 2: sort -o output_file file1 file2
This will sort the contents of file1 and file2 and save the result in output_file.

Command cut: this command is used to cut a given number of characters or columns from a file. For cutting a certain number of columns it is important to specify the delimiter. A delimiter specifies how the columns are separated in a text file e.g. number of spaces,...
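As a rough illustration of the pipe, sort and cut descriptions above, the sketch below pipes cut into sort and then uses sort -o on two files. The helper name first_column_sorted, the colon delimiter and the sample data are all assumptions made for the example.

```shell
# first_column_sorted is a hypothetical helper name used only for this sketch.
first_column_sorted() {
    # cut keeps field 1 of each colon-delimited line read from stdin;
    # its stdout becomes the stdin of sort.
    cut -d: -f1 | sort
}

printf 'ram:30\nanil:25\nmeena:41\n' | first_column_sorted
# prints anil, meena, ram on separate lines

# sort -o writes the combined, sorted contents of both input files
# to output_file and leaves the input files unchanged.
tmpdir=$(mktemp -d)
printf 'pear\napple\n' > "$tmpdir/file1"
printf 'mango\nbanana\n' > "$tmpdir/file2"
sort -o "$tmpdir/output_file" "$tmpdir/file1" "$tmpdir/file2"
cat "$tmpdir/output_file"
rm -r "$tmpdir"
```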

Unix Tutorial 14 : Text Processing Commands 1

Unix Filter Commands

grep:  Find lines in stdin that match a pattern and print them to stdout.
sort:  Sort the lines in stdin, and print the result to stdout.
uniq:  Read from stdin and print unique lines (those that differ from the adjacent line) to stdout.
cat:  Read lines from stdin (and more files), and concatenate them to stdout.
more:  Read lines from stdin, and provide a paginated view to stdout.
cut:  Cut specified bytes, characters or fields from each line of stdin and print to stdout.
paste:  Read lines from stdin (and more files), and paste them together line-by-line to stdout.
head:  Read the first few lines from stdin (and more files) and print them to stdout.
tail:  Read the last few lines from stdin (and more files) and print them to stdout.
wc:  Read from stdin, and print the number of newlines, words, and bytes to stdout.
tr:  Translate or delete characters read from stdin and print to stdout.

Command grep...
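Several of the filters above are commonly chained. Because uniq only compares adjacent lines, it is usually fed sorted input. The following sketch counts distinct words regardless of case. The name count_unique_words and the sample text are assumptions for illustration.

```shell
# count_unique_words is a hypothetical helper used only for this sketch.
count_unique_words() {
    # 1. tr -s turns each run of spaces into a single newline (one word per line).
    # 2. tr lowercases the words so "The" and "the" compare equal.
    # 3. sort places identical words on adjacent lines.
    # 4. uniq drops the adjacent duplicates.
    # 5. wc -l counts the remaining lines; the final tr strips any padding.
    tr -s ' ' '\n' | tr '[:upper:]' '[:lower:]' | sort | uniq | wc -l | tr -d ' '
}

printf 'the cat The dog\nthe bird\n' | count_unique_words   # prints: 4
```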

Useful links to learn Data Science in Python and R for beginners

Python:
1) Python Data Science Handbook: https://lnkd.in/gxrUQKm
2) Analytics Vidhya: https://lnkd.in/gg9ruwi
3) Data Science from scratch: https://lnkd.in/g8Xu2GX
4) Youtube playlists by the Engineering world: https://lnkd.in/gsTwfkE
5) Elite Data science: https://lnkd.in/gv8a5Dd

R:
1) R for Data Science: http://r4ds.had.co.nz/
2) Analytics Vidhya: https://lnkd.in/gAFHxXN
3) Analytics edge: https://lnkd.in/g_sBRdD
4) R Programming for Data Science by Roger D. Peng: https://lnkd.in/gJ55fDD
5) Youtube videos by edureka: https://lnkd.in/ghRuUjA

In addition, we highly recommend following the blogs, posts and videos by Kate, Favio, Aparna, Karthikeyan P.T.R., Randy, Kyle, Tarry, Sarah, Eric, Nic, Beau, Abhishek, Megan, Maribeth and Kristen. They have done great work in the data science field and provide valuable knowledge to all.

Unix Tutorial 13 : Functions

Syntax for defining functions:
    function_name() {
        … <statements> …
    }

To invoke a function, simply use the function name as a command.
Example:
    $ function_name

To pass parameters to the function, add space-separated arguments as with other commands.
Example:
    $ function_name $arg1 $arg2 $arg3

The passed parameters can be accessed inside the function using the standard positional variables i.e. $1, $2, $3 etc. ($0 still holds the name of the script, not of the function.)
Example:
    function_name() {
        …
        c=`expr $1 + $2`
        …
    }

Functions can return values using any one of the three methods:

#1) Change the state of a variable or variables.

#2) Use the return command to end the function and return the supplied value to the calling section of the shell script.
Example:
    function_name() {
        echo "hello $1"
        return 1
    }
Running the function with a single parameter will echo the value.
    $ function_name ram
    hello ram
The return value (stored in $?) can be captured as follows:
    $ echo $?
    1

#3) Capture the ou...
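The first two ways of returning a value can be sketched as below. This is an illustration rather than the tutorial's own code: add and greet are hypothetical function names, and result is a hypothetical variable chosen for the example.

```shell
# Method 1: change the state of a variable.
# add is a hypothetical function that stores its answer in "result".
add() {
    result=`expr $1 + $2`
}

# Method 2: use return to hand back a numeric status.
# greet is a hypothetical function that prints a greeting and returns 1.
greet() {
    echo "hello $1"
    return 1
}

add 2 3
echo "result is $result"                 # prints: result is 5

# $? holds the value passed to return; here it is read on the
# right-hand side of ||, which runs because the status is non-zero.
greet ram || echo "return value: $?"     # prints: hello ram, then return value: 1
```

Keep in mind that return can only carry a small integer status, so method 1 is the usual choice for text or larger numbers.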

Unix Tutorial 12 : Loop Types

Loops in Unix

You may use different loops based on the situation. They are:

#1) Unix For loop statement
Example: This program will add 1+2+3+4+5 and the result will be 15.
    sum=0
    for i in 1 2 3 4 5
    do
        sum=`expr $sum + $i`
    done
    echo $sum

#2) Unix While loop statement
Example: This program will print the value of ‘a’ five times, from 1 to 5.
    a=1
    while [ $a -le 5 ]
    do
        echo "value of a=" $a
        a=`expr $a + 1`
    done

#3) Unix Until loop statement
Example: This program will print the value of ‘a’ two times, from 1 to 2.
    a=1
    until [ $a -ge 3 ]
    do
        echo "value of a=" $a
        a=`expr $a + 1`
    done

While running these loops, there may be a need to break out of the loop on some condition before completing all the iterations, or to restart the loop before completing the remaining statements. This can be achieved with the ‘break’ and ‘continue’ statements. The following program illustrates the ‘break’ operation:
    num=1
    while [ $num -le 5 ]
    do
        read var
        if [ $var -lt 0 ]
        then ...
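A small sketch of how ‘break’ and ‘continue’ behave inside a loop follows. It is an assumption-laden illustration rather than the tutorial's program: sum_until_zero is a hypothetical function that takes its numbers as arguments instead of reading them from the keyboard.

```shell
# sum_until_zero is a hypothetical helper used only for this sketch.
# It adds up its arguments, skipping negatives and stopping at the first 0.
sum_until_zero() {
    total=0
    for n in "$@"
    do
        if [ "$n" -eq 0 ]
        then
            break       # leave the loop; remaining arguments are ignored
        fi
        if [ "$n" -lt 0 ]
        then
            continue    # skip the rest of this iteration, move to the next value
        fi
        total=`expr $total + $n`
    done
    echo $total
}

sum_until_zero 4 -2 6 0 9   # prints: 10
```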

Unix Tutorial 11 : Conditional Statements

Unix Conditional Statements

The if-elif-fi

Unix provides a number of relational operators in addition to the logical operators mentioned earlier. These can be used to compare numeric values.
-lt  less than
-le  less than or equal to
-gt  greater than
-ge  greater than or equal to
-eq  equal to
-ne  not equal to

Unix provides a number of ways for conditionally executing other commands. These are covered below:

#1) The if statement
Example:
    if <control command>
    then
        <statements>
    fi

#2) The if…else statement
Example:
    if <control command>
    then
        <statements>
    else
        <statements>
    fi

#3) The if…elif…else…fi statement
Example:
    if <control command>
    then
        <statements>
    elif <control command>
    then
        <statements>
    else
        <statements>
    fi

Given below are some example programs that illustrate these conditional statements:

#1) Check if an input number is positive:
    $ echo "Enter a number"
    $ read num
    $ if [ $num -gt 0 ]
    $ t...
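The if…elif…else…fi form and the numeric operators above can be put together in a short sketch. The function name classify is hypothetical, and the number is passed as an argument here rather than read from the user.

```shell
# classify is a hypothetical helper used only for this sketch.
# It reports whether its numeric argument is positive, negative or zero.
classify() {
    if [ "$1" -gt 0 ]
    then
        echo "positive"
    elif [ "$1" -lt 0 ]
    then
        echo "negative"
    else
        echo "zero"
    fi
}

classify 7     # prints: positive
classify -3    # prints: negative
classify 0     # prints: zero
```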

Unix Tutorial 10 : Variables

Defining a Variable

A variable is defined by simply assigning a value to a name using the ‘=’ operator, with no spaces around it. A variable name is a series of alphanumeric characters starting with a letter or ‘_’. Variables are all treated as text strings unless the context requires them to be treated as a numeric value.
Example:
    $ myvar=hello
    $ mycount=1
If the value needs to include spaces, it can be surrounded by double quotes.
Example:
    $ myvar1="hello there!"

Accessing a Variable

A variable name is de-referenced by simply adding a ‘$’ prefix to it. The ‘echo’ command is often used to print the value of a variable.
Example:
    $ echo $myvar
    hello
A variable will be treated as a numeric value if it is used with the ‘expr’ command.
Example:
    $ expr $mycount + 2
    3
The variable name needs to be surrounded by braces if it is immediately followed by other characters rather than a space.
Example:
    $ echo ${myvar}oo
    hellooo
When using double quotes, special characters like ‘$’ retain their special m...
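The rules above can be tried together in one short script. This is only a sketch: the variable greeting and its value are assumptions added for the example, while myvar and mycount follow the excerpt.

```shell
# Assignment: no spaces around '='; quote values that contain spaces.
myvar=hello
mycount=1
greeting="hello there"    # hypothetical extra variable for this sketch

echo "$myvar"             # prints: hello
echo "${myvar}oo"         # braces separate the name from the text: hellooo
expr $mycount + 2         # expr treats the value as a number: 3

# Inside double quotes, $ still expands the variable.
echo "greeting is: $greeting"   # prints: greeting is: hello there
```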

Unix Tutorial 9 : Vi Editor

Modes of Unix Vi Editor

The vi editor has three modes of operation viz. the command mode, the insert mode, and the ex-command mode.

#1) Command mode: In this mode, all the keys work as commands. These keys are used for inserting, appending, deleting, opening new lines, moving the cursor over the paragraphs and sentences, etc. In this mode, the keys are not displayed but each key performs an operation. By default the vi editor is in command mode, hence we cannot type text in command mode. In order to write programs or text in the vi editor, we need to switch to the insert mode, which can be done by pressing I or A (or i or a).

#2) Insert mode: In this mode, we can insert, append, edit or replace text. We can switch from the command mode to insert mode by pressing I or A; pressing the escape button first ensures that the editor is in command mode, and pressing it while in insert mode returns to command mode.

#3) Ex command mode: This mode is used for entering commands at the bottom line of the vi editor, called the command line. To switch to Ex command ...

Unix Tutorial 8 : Shell scripting basics

Shells read configuration files in several circumstances, which differ depending on the shell. These files usually contain commands for that particular shell and are executed when loaded; they are usually used to set important variables that are used to find executables, like $PATH, and others that control the behavior and appearance of the shell.

The Bourne Shell (sh): This was one of the first shell programs that came with Unix and is also the most widely used one. It was developed by Stephen Bourne. The ~/.profile file is used as a configuration file for sh. This is also the standard shell used for scripting.

The C Shell (csh): The C shell was developed by Bill Joy, modeled on the C programming language. It was intended to improve interactivity with features such as listing the command history and editing commands. The ~/.cshrc and the ~/.login files are used as configuration files by csh.

The Bourne Again Shell (bash): The bash shell was developed...

Hive Tutorial 35 : Export hive table data into a file

There are a couple of ways to do this.

Process 1:
    hive -e 'select books from table' | sed 's/[[:space:]]\+/,/g' > /home/lvermeer/temp.csv

Process 2:
    INSERT OVERWRITE LOCAL DIRECTORY 'dir_path' SELECT FIELD1, FIELD2, FIELD3 FROM TABLE1;
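The sed step in Process 1 can be tried on its own without a Hive installation. In the sketch below, printf stands in for the output of hive -e, spaces_to_commas is a hypothetical helper name, and the sample rows are made up. The pattern is written without the \+ shorthand so that it also works with sed implementations other than GNU sed.

```shell
# spaces_to_commas is a hypothetical helper used only for this sketch.
# It replaces every run of whitespace (tabs or spaces) with one comma,
# which is what the sed command in Process 1 does to Hive's output.
spaces_to_commas() {
    sed 's/[[:space:]][[:space:]]*/,/g'
}

# printf stands in for "hive -e '...'" here; the rows are sample data.
printf 'id\ttitle\tyear\n1\tdune\t1965\n' | spaces_to_commas
# prints:
# id,title,year
# 1,dune,1965
```

Note that this approach also turns spaces inside a field value into commas, so it suits data whose values contain no whitespace.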

Hive Tutorial 34 : Update Feature example

The update feature is complex, as it involves modifying existing data instead of a simple append. As with the insert operation, the first step in an update is to load the data into an un-partitioned staging table from a local data source.

a. Insert into Un-Partitioned table: Inserting into an un-partitioned table is pretty straightforward. The following code takes the input data file path and loads the file into the un-partitioned table. The records in the file will get appended to the already existing records in the table.

    insert into un-partitioned table(tableName, inputDataPath) {
        run the query : "load data inpath '" + inputDataPath + "/" + "' into table " + tableName;
        client.execute(insertQuery);
    }

The query will look like:
    load data inpath '/home/hadoop/inputData/input.txt' into table EmployeeData

Once the query is executed, the three records present in the input.txt file get appended ...