Chandresh Singh #16

Open
wants to merge 3 commits into
base: master
24 changes: 24 additions & 0 deletions Chandresh singh/BIgData-1.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
Ques 1- How to show the folders and files in human-readable format?
hdfs dfs -ls -h /chan

Ques 2- How to check the total disk usage?
hdfs dfs -du -s -h /

Ques 3- How to remove a file using -rm?
hdfs dfs -rm /chan/file.txt

Ques 4- How to remove a file permanently?
hdfs dfs -rm -skipTrash /chan/demo.txt

Ques 5- How to retrieve a file from its Hadoop file location to the local system?
hdfs dfs -copyToLocal /chan/test.txt .

Ques 6- How to change the replication factor of a file from 3 to 5?
hdfs dfs -setrep -R 5 /chan
hdfs dfs -setrep -w 5 /chan/test.txt

Ques 7- How to show the details of NameNode?
hdfs dfsadmin -report

Ques 8- How to change the owner of the file from hdfs to cloudera?
hdfs dfs -chown cloudera /chan/test.txt
54 changes: 54 additions & 0 deletions Chandresh singh/BigData-2.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
Ques 1- Difference between Hadoop1 and Hadoop2?
Apache Hadoop V.1.x has the following two major components:
HDFS (HDFS V1)
MapReduce (MR V1)

Apache Hadoop V.2.x has the following three major components:
HDFS V.2
YARN (resource management)
MapReduce (MR V2, running on YARN)

Hadoop 1.x supports only one namespace for managing the HDFS filesystem, whereas Hadoop 2.x supports multiple namespaces.
Hadoop 1.x supports one and only one programming model: MapReduce. Hadoop 2.x supports multiple programming models through the YARN component, such as MapReduce, interactive, streaming and graph processing (for example Spark and Storm).
Hadoop 1.x has significant limitations in scalability. Hadoop 2.x has overcome them with its new architecture.
Hadoop 2.x has multi-tenancy support, but Hadoop 1.x doesn't.
Hadoop 1.x MapReduce uses a fixed-size slot mechanism for running tasks, whereas Hadoop 2.x uses variable-sized containers.
Hadoop 1.x supports a maximum of about 4,000 nodes per cluster, whereas Hadoop 2.x supports more than 10,000 nodes per cluster.

How Hadoop 2.x solves Hadoop 1.x Limitations
Hadoop 2.x has resolved most of the Hadoop 1.x limitations by using new architecture.

By decoupling MapReduce component responsibilities into different components.
By Introducing new YARN component for Resource management.
By decoupling component’s responsibilities, it supports multiple namespace, Multi-tenancy, Higher Availability and Higher Scalability.

Hadoop 2.x YARN Benefits

High Scalability
High Availability
Supports Multiple Programming Models
Supports Multi-Tenancy
Supports Multiple Namespaces
Improved Cluster Utilization
Supports Horizontal Scalability

Ques 2- In Hadoop2 why is the block size 128 MB?
In Hadoop, input data files are divided into blocks of a particular size (128 MB by default), and these blocks are stored on different DataNodes.
Hadoop is designed to process large volumes of data. Each block's information (its address, i.e. which DataNodes store it) is kept on the NameNode. If the block size is too small, there will be a large number of blocks to store on the DataNodes and a large amount of metadata to keep on the NameNode. In addition, each block is typically processed by one mapper task, so a large number of small blocks requires a lot of mapper tasks. A small block size is therefore not very efficient.
The block size should not be too large either, otherwise parallelism cannot be achieved and the system waits a very long time for one unit of data processing to finish.
A balance needs to be maintained. That's why the default block size is 128 MB. It can be changed depending on the size of the input files.
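
To make the trade-off concrete, here is a small sketch that counts how many blocks one file produces at different block sizes. Each block means one NameNode metadata entry and, roughly, one mapper task. The 1 TiB file size and the helper name `block_count` are assumptions chosen only for illustration.

```python
MIB = 1024 ** 2


def block_count(file_size_bytes, block_size_bytes):
    """Number of HDFS blocks needed to hold one file (last block may be partial)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


file_size = 1024 ** 4  # hypothetical 1 TiB input file

for block_mib in (1, 64, 128, 1024):
    blocks = block_count(file_size, block_mib * MIB)
    print(f"{block_mib:>5} MiB block size -> {blocks:>9,} blocks")
```

With very small blocks the count runs into the millions, while very large blocks leave only a handful of units to process in parallel.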

Ques 3- Why does the NameNode rely on memory more than the DataNode?
The NameNode stores only metadata about files and their blocks, and it keeps this metadata in RAM so that lookups are fast; for this reason it needs a large amount of memory. DataNodes store the actual block data on disk, so they don't need as much memory.

Ques 4- Suppose you have 10 PB of data. Metadata is stored as objects for files and folders (each object 200 B).
What is the minimum NameNode RAM you need in a cluster to manage the metadata?
Estimate the minimum NameNode RAM size for HDFS with 10 PB capacity, block size 64 MB, average metadata size of 200 B per block, and replication factor 3.
10 PB / (64 MB * 3) * 200 B = (10 * 10^15) / (64 * 10^6 * 3) * 200 B = 10^10 / (64 * 3) * 200 B ≈ 1.04e10 B (about 10.4 GB)
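
The estimate can be reproduced in a few lines of Python using the figures stated in the question (10 PB capacity, 64 MB blocks, replication factor 3, 200 B of metadata per block). The function name `namenode_ram_bytes` is a hypothetical helper, and decimal units (1 PB = 10^15 B) are assumed, as in the calculation above.

```python
PB = 10 ** 15  # decimal petabyte (assumption, matches the hand calculation)
MB = 10 ** 6   # decimal megabyte


def namenode_ram_bytes(capacity_bytes, block_size_bytes, replication, bytes_per_block):
    """Rough NameNode memory estimate: unique blocks times metadata per block."""
    unique_blocks = capacity_bytes / (block_size_bytes * replication)
    return unique_blocks * bytes_per_block


ram = namenode_ram_bytes(10 * PB, 64 * MB, 3, 200)
print(f"{ram:.4e} bytes, about {ram / 10 ** 9:.1f} GB")
```

This is only a lower bound under the stated assumptions; a real NameNode also needs memory for file and directory objects and for the JVM itself.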

Ques 5- At the time of failure, Which will recover first DataNode or NameNode ?
Datanode
5 changes: 5 additions & 0 deletions Chandresh singh/Data Brick Scala 1.url
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,11
[InternetShortcut]
IDList=
URL=https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4508079072012638/803859408819685/4184702918332430/latest.html
1 change: 1 addition & 0 deletions Chandresh singh/Data Science/Hive-1/Ques 1/Creation.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
CREATE DATABASE employe_backup;
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
create external table IF NOT EXISTS emp1(name string, age int, email string)
row format delimited fields terminated by ',' STORED AS TEXTFILE LOCATION '/test';
Binary file added Chandresh singh/Data Science/Hive-2/partition.png
16 changes: 16 additions & 0 deletions Chandresh singh/Data Science/Scala 1.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
Ques 1-
Write a program in Scala to print Magic Numbers in the range 1 to 250
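// Assumes the usual definition: a magic number is one whose repeated
// digit sum reduces to 1 (for example 19 -> 10 -> 1).
for(a <- 1 to 250)
{
  var n=a
  // keep summing the digits until a single digit remains
  while(n>9)
  {
    var sum=0
    while(n!=0)
    {
      sum+=n%10
      n=n/10
    }
    n=sum
  }
  if(n==1)
    println(a)
}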
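
As a cross-check, the same search can be sketched in Python. It assumes the common definition of a magic number (repeatedly summing the digits ends at 1); the helper names `digit_sum` and `is_magic` are illustrative. The result is compared against the digital-root shortcut `1 + (n - 1) % 9`.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    total = 0
    while n:
        total += n % 10
        n //= 10
    return total


def is_magic(n):
    """True when repeatedly summing the digits of n ends at 1."""
    while n > 9:
        n = digit_sum(n)
    return n == 1


magic = [n for n in range(1, 251) if is_magic(n)]
print(magic)

# Cross-check: the repeated digit sum of n (n >= 1) equals 1 + (n - 1) % 9.
assert magic == [n for n in range(1, 251) if 1 + (n - 1) % 9 == 1]
```

Note that a plain "digit sum equals 10" test would miss values such as 1, 10, 100 and 199, which is why the digits are summed repeatedly.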
Binary file added Chandresh singh/Data Science/Talend/Filtering.png
Binary file added Chandresh singh/Data Science/Talend/Replace.png
Binary file added Chandresh singh/Data Science/Talend/csv data.png
Binary file added Chandresh singh/Data Science/Talend/first csv.png
1 change: 1 addition & 0 deletions Chandresh singh/Hive-1/Ques 1/Creation.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
CREATE DATABASE employe_backup;
2 changes: 2 additions & 0 deletions Chandresh singh/Hive-1/Ques 2/External File Creation.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
create external table IF NOT EXISTS emp1(name string, age int, email string)
row format delimited fields terminated by ',' STORED AS TEXTFILE LOCATION '/test';
Binary file added Chandresh singh/Hive-1/Ques 3/csv file$.png
Binary file added Chandresh singh/Hive-1/Ques 3/where clause.png
5 changes: 5 additions & 0 deletions Chandresh singh/Hive-1/open in DataBricks.url
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,11
[InternetShortcut]
IDList=
URL=https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4508079072012638/940631139913891/4184702918332430/latest.html
Binary file added Chandresh singh/Hive-2/detailed view.png
5 changes: 5 additions & 0 deletions Chandresh singh/Hive-2/open in DataBricks.url
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,11
[InternetShortcut]
IDList=
URL=https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4508079072012638/940631139913891/4184702918332430/latest.html
Binary file added Chandresh singh/Hive-2/partition.png
3 changes: 3 additions & 0 deletions Chandresh singh/Linux/6 April.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Difference between adduser and useradd
adduser command - adduser creates the user and also sets up the user's home directory under /home, along with the account's other default settings.
useradd command - useradd only creates the user and does not create the user's home directory automatically unless -m is specified.
25 changes: 25 additions & 0 deletions Chandresh singh/Linux/7 April.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
1) To change umask value permanently
Open the /etc/profile file, change the default umask value, then save & exit.
The umask value is now changed permanently for new login sessions; you can check it in a terminal with the umask command.
(vi /etc/profile. edit umask value in it. save & exit.)

2) Add new user without using adduser & useradd command
Step 1 : add an entry for user in /etc/passwd file
username:password:UID:GID:Comments:Home_Directory:Login Shell
# vi /etc/passwd
user:x:501:501:test user:/home/user:/bin/bash
Step 2 : add an entry for group in /etc/group file
Step 3 : create home directory for added user with mkdir command
Step 4 : set new user password using passwd command

3)Can we change the umask value to 0888 ?
No. umask digits are octal, so each digit must be between 0 and 7; 8 is not a valid digit. The maximum umask is 0777.
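
To illustrate how a umask is applied, the sketch below clears the umask bits from the usual default modes (0666 for new files, 0777 for new directories). The helper name `apply_umask` and the sample masks are assumptions for illustration. Because the values are octal, every digit lies between 0 and 7.

```python
def apply_umask(default_mode, umask):
    """Clear from default_mode every permission bit that is set in umask."""
    return default_mode & ~umask


for mask in (0o022, 0o027, 0o777):
    file_mode = apply_umask(0o666, mask)
    dir_mode = apply_umask(0o777, mask)
    print(f"umask {mask:04o}: files {file_mode:04o}, directories {dir_mode:04o}")

# "0888" cannot be parsed as octal, which is why it is not a valid umask.
try:
    int("0888", 8)
except ValueError as exc:
    print("invalid umask:", exc)
```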

4)To add new user with unique user id & check it
useradd -u UID USERNAME (for example: useradd -u 1501 abc, where 1501 is an example UID)
to check uid : id -u abc

5)Change group of any folder
chgrp [OPTIONS] GROUP FILE..
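GROUP, name of the new group or the group ID (GID). A numeric GID may be prefixed with the + symbol to force it to be treated as a number.
FILE.., name of one or more files or folders. Use -R to change a folder and its contents recursively.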
11 changes: 11 additions & 0 deletions Chandresh singh/Linux/8 April.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
1)Create new user also add it into the group that you already have
useradd -g grpname usrname

2)To unzip bzip2 file
bzip2 -d filename.bz2

3)Archive & compress some files with data & a folder into a bz2 file
tar -cjvf xyz.tar.bz2 /home/ubuntu/d1/ /home/ubuntu/file1.txt /home/ubuntu/file2.txt

4)Add a user & at the same time change its shell to /bin/sh
useradd -s /bin/sh usrname
27 changes: 27 additions & 0 deletions Chandresh singh/Python/python 1.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
1) What is Jython & CPython ?
The default implementation of the Python programming language is CPython.
As the name suggests, CPython is written in the C language.
CPython compiles the Python source code into intermediate bytecode, which is executed by the CPython virtual machine.
CPython is distributed with a large standard library written in a mixture of C and Python. CPython provides the highest level of compatibility with Python packages and C extension modules.
Every version of the Python language has a CPython release, because CPython is the reference implementation.
Jython is an implementation of the Python programming language that runs on the Java platform.
Jython programs can use Java classes in addition to Python modules.
Jython compiles Python code into Java bytecode, which is then run by the Java virtual machine.
Jython enables the use of Java class library functions from the Python program.
Jython is generally slower than CPython and cannot load CPython's C extension modules.


2) Difference between python2 & python3
Python 3 syntax is more consistent (for example, print is a function), whereas Python 2 keeps older forms such as the print statement.
In Python 3, strings are stored as Unicode by default, whereas in Python 2 a Unicode string has to be marked with the "u" prefix.
In Python 3, the loop variable of a list comprehension does not leak into the enclosing scope, whereas in Python 2 it can overwrite a global variable of the same name.
In Python 3, exception arguments are enclosed in parentheses (raise IOError("message")) and exceptions are caught with "as", whereas Python 2 also allowed the comma notation (raise IOError, "message").
In Python 3, the rules for ordering comparisons are simplified: comparing values of unrelated types raises TypeError, whereas Python 2 ordered them by complex, arbitrary rules.
Python 3 offers the range() function for lazy iteration, whereas in Python 2 xrange() is used for that and range() builds a full list.
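
A few of these differences can be observed directly. The snippet below is a sketch meant to run under Python 3; the comments describe what Python 2 would do instead, and the variable names are illustrative only.

```python
# Strings are Unicode text by default (Python 2 needed the u"..." prefix).
text = "caf\u00e9"
print(type(text).__name__, len(text))

# range() is lazy, like Python 2's xrange(); no list of a billion items is built.
r = range(10 ** 9)
print(type(r).__name__, len(r))

# The loop variable of a list comprehension stays local to the comprehension.
i = "outer"
squares = [i * i for i in range(3)]
print(i, squares)

# "/" is true division; Python 2 would give 1 for 3 / 2 on integers.
print(3 / 2, 3 // 2)

# Comparing unrelated types raises TypeError instead of using an arbitrary order.
try:
    1 < "a"
except TypeError as exc:
    print("TypeError:", exc)
```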

3) Difference between ASCII & Unicode
ASCII defines 128 characters, which map to the numbers 0–127. Unicode defines (fewer than) 2^21 code points, which similarly map to the numbers 0 to 0x10FFFF (though not all numbers are currently assigned, and some are reserved).

Unicode is a superset of ASCII, and the numbers 0–127 have the same meaning in ASCII as they have in Unicode. For example, the number 65 means "Latin capital 'A'".

Because Unicode characters don't generally fit into one 8-bit byte, there are numerous ways of storing Unicode characters in byte sequences, such as UTF-32 and UTF-8.
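
The difference between the encodings can be seen by encoding a few characters in Python. This is a small sketch; the sample characters are chosen only for illustration.

```python
# One ASCII letter, one Latin-1 letter, the euro sign, and an emoji.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    code_point = ord(ch)
    utf8 = ch.encode("utf-8")
    utf32 = ch.encode("utf-32-be")  # big-endian, no byte-order mark
    print(f"U+{code_point:04X}: UTF-8 uses {len(utf8)} byte(s), UTF-32 uses {len(utf32)} bytes")

# ASCII characters keep the same single-byte value in UTF-8.
assert "A".encode("ascii") == "A".encode("utf-8") == b"A"
```

UTF-8 uses between one and four bytes per character, while UTF-32 always uses four.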
21 changes: 21 additions & 0 deletions Chandresh singh/Python/python 2.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
4054

string = "hello this side regex"
count = 0
# count every character that is a vowel
for i in string:
    if i in ['a', 'e', 'i', 'o', 'u']:
        count += 1
print(count)


h = int(input("Enter the height of the triangle"))
b = int(input("Enter the base of the triangle"))
print("the area of the triangle is: ")
area = (1/2)*h*b
print(area)

import calendar
print("Which Year ")
y = int(input())
# y is an int, so convert it before concatenating with strings
print("Calendar of year " + str(y) + " is ")
print(calendar.calendar(y))
29 changes: 29 additions & 0 deletions Chandresh singh/Python/python 3.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
Ques 1-

Take input from the user and find the Armstrong numbers in the range.
What is an Armstrong number : 153 -> 1*1*1 + 5*5*5 + 3*3*3 = 153

n1 = int(input("Enter the first number"))
n2 = int(input("Enter the last number"))

for i in range(n1, n2+1):
    total = 0  # sum of the cubes of the digits of i
    temp = i
    while temp > 0:
        digit = temp % 10
        total = total + digit ** 3
        temp = temp // 10

    if i == total:
        print(i)
Ques 2-

You have a list with words - ["Apple", "Mango", "Banana", "Grapes"]


ls = ["Apple", "Mango", "Banana", "Grapes"]
sortls = sorted(ls)
print(sortls)



49 changes: 49 additions & 0 deletions Chandresh singh/Python/python 4.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
Ques 1-

You have to run your Program at 9:00PM on 14th April 2020.
Whatsapp Automation


import datetime
import time

# target: 9:00 PM on 14th April 2020
target = datetime.datetime(2020, 4, 14, 21, 0, 0)

# check the clock once per second until the target time is reached
# (if the target has already passed, the loop is skipped)
while datetime.datetime.now() < target:
    time.sleep(1)
print("It's 9 PM")


Ques 2-

GIve a tuple:

t = ('R','e','g','e','X')

# iterating over a tuple yields its elements directly
for i in t:
    print(i)

Ques 3-

Create a list of integers print number greater than 20 also separate the numbers greater than 10 to another list

a = []  # numbers greater than 20
b = []  # numbers greater than 10

ls = [1,2,3,4,5,6,7,8,9,10,11,13,14,15,25,26,27,28,29,30]
for i in ls:
    if i > 20:
        print(i)
        a.append(i)
    if i > 10:
        b.append(i)
print(a)
print(b)
64 changes: 64 additions & 0 deletions Chandresh singh/Python/python 5.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
Ques 1-

Time Module loop
Loading.
Loading..
Loading...
Loading....
Loading.....

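import time
# flush=True makes each piece appear immediately instead of waiting for a newline
print("Loading", end="", flush=True)
i = 1
while i < 6:
    print('.', end="", flush=True)
    time.sleep(1)
    i += 1
print()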

Ques 2-
Difference between Return and Yield?
The keyword return causes the function to exit and hands back a value to its caller. The return statement is used when a function is ready to exit and return a value to its caller.
A function that contains the keyword yield hands back a generator object to its caller when it is called. Each yield hands one value to the consumer and suspends the function without exiting it or terminating its loop; execution resumes from that point when the next value is requested. A generator can be converted into a list with list().
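
A minimal sketch of the difference, using two hypothetical functions `first_square` and `all_squares` over the same input:

```python
def first_square(numbers):
    """return: the function exits as soon as the first value is produced."""
    for n in numbers:
        return n * n


def all_squares(numbers):
    """yield: each value is handed back and the function resumes afterwards."""
    for n in numbers:
        yield n * n


print(first_square([1, 2, 3]))  # only the first square, then the function is done

gen = all_squares([1, 2, 3])    # nothing runs yet; gen is a generator object
print(next(gen))                # runs up to the first yield
print(list(gen))                # resumes and collects the remaining values
```

Because the generator remembers where it stopped, `list(gen)` after one `next(gen)` contains only the values that were not consumed yet.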

Ques 3-

Create a digital clock that run for 5 seconds
import time

i = 0
while i < 5:
    t = time.ctime()
    print(t[11:19])  # characters 11-18 of ctime() are hh:mm:ss
    time.sleep(1)
    i = i + 1

Ques 4-

Whatsapp Automation

import time
import urllib.parse
import webbrowser

import pyautogui

number = int(input("Enter the number to send message: "))
message = input("Enter the message")
YourTime = input("Enter time at which u want to send message as format (hh:mm:ss): ")
print(time.ctime())

# poll the clock until it matches the requested time
while time.strftime("%H:%M:%S") != YourTime:
    time.sleep(0.5)

print("send")
# URL-encode the message so spaces and special characters survive in the link
url = 'https://wa.me/' + str(number) + '?text=' + urllib.parse.quote(message)
webbrowser.open(url)
time.sleep(5)
# these screen coordinates depend on the display; adjust them as needed
pyautogui.moveTo(670, 315)
pyautogui.click()
time.sleep(5)
pyautogui.press('enter')
