HADOOP
1. version
Print the Hadoop version
hadoopuser@ub16043lts00:~$ hadoop version
Hadoop 3.0.0
Source code repository https://git-wip-us.apache.org/repos/asf/hadoop.git -r c25427ceca461ee979d30edd7a4b0f50718e6533
Compiled by andrew on 2017-12-08T19:16Z
Compiled with protoc 2.5.0
From source with checksum 397832cb5529187dc8cd74ad54ff22
This command was run using /usr/local/hadoop-3.0.0/share/hadoop/common/hadoop-common-3.0.0.jar
2. checknative
Check the availability of native Hadoop and compression libraries
hadoopuser@ub16043lts00:~$ hadoop checknative
2018-02-17 11:45:24,249 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2018-02-17 11:45:24,264 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2018-02-17 11:45:24,308 WARN erasurecode.ErasureCodeNative: ISA-L support is not available in your platform... using builtin-java codec where applicable
Native library checking:
hadoop: true /usr/local/hadoop-3.0.0/lib/native/libhadoop.so.1.0.0
zlib: true /lib/x86_64-linux-gnu/libz.so.1
zstd : false
snappy: true /usr/lib/x86_64-linux-gnu/libsnappy.so.1
lz4: true revision:10301
bzip2: true /lib/x86_64-linux-gnu/libbz2.so.1
openssl: true /usr/lib/x86_64-linux-gnu/libcrypto.so
ISA-L: false libhadoop was built without ISA-L support
3. jar
Run a jar file; the examples below use hadoop-mapreduce-examples-3.0.0.jar
pi (Hadoop PI Estimation example)
Usage: pi <num_maps> <num_samples>
hadoopuser@ub16043lts00:~$ hadoop jar /usr/local/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar pi 10 20
teragen (Hadoop Teragen example to generate data for the terasort)
Usage: teragen <number of 100-byte rows> <output dir>
The actual TeraGen data format per row is:
<10 bytes key><10 bytes rowid><78 bytes filler>\r\n
where:
The keys are random characters from the set ' ' .. '~'.
The rowid is the right-justified row id as an int.
The filler consists of 7 runs of 10 characters from 'A' to 'Z'.
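A sample teragen invocation (the row count and output directory are illustrative; the output directory matches the terasort example below):
hadoopuser@ub16043lts00:~$ hadoop jar /usr/local/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar teragen 1000000 /teragen_output_directory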
terasort (Hadoop TeraSort example to sort the data generated by teragen)
Usage: terasort <input dir> <output dir>
hadoopuser@ub16043lts00:~$ hadoop jar /usr/local/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar terasort /teragen_output_directory /terasort_output_directory
teravalidate (Hadoop Teravalidate example to check the results of terasort)
Usage: teravalidate <terasort output dir (= input data)> <teravalidate output dir>
hadoopuser@ub16043lts00:~$ hadoop jar /usr/local/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar teravalidate /terasort_output_directory /teravalidate_output_directory
wordcount (Hadoop Word Count example)
Usage: wordcount <input_file> <output_dir>
hadoopuser@ub16043lts00:~$ hadoop jar /usr/local/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /gauravb/inputData/hadoop_installation /gauravb/outputData/wordCountOutput01
2017-09-27 13:45:50,135 INFO client.RMProxy: Connecting to ResourceManager at ub16043lts00/10.0.1.1:8032
2017-09-27 13:45:51,726 INFO input.FileInputFormat: Total input files to process : 1
2017-09-27 13:45:52,003 INFO mapreduce.JobSubmitter: number of splits:1
2017-09-27 13:45:52,224 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2017-09-27 13:45:52,517 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1506498574236_0003
2017-09-27 13:45:53,254 INFO impl.YarnClientImpl: Submitted application application_1506498574236_0003
2017-09-27 13:45:53,373 INFO mapreduce.Job: The url to track the job: http://ub16043lts00:8088/proxy/application_1506498574236_0003/
2017-09-27 13:45:53,374 INFO mapreduce.Job: Running job: job_1506498574236_0003
2017-09-27 13:46:04,815 INFO mapreduce.Job: Job job_1506498574236_0003 running in uber mode : false
2017-09-27 13:46:04,816 INFO mapreduce.Job: map 0% reduce 0%
2017-09-27 13:46:31,512 INFO mapreduce.Job: map 100% reduce 0%
2017-09-27 13:46:54,959 INFO mapreduce.Job: map 100% reduce 100%
2017-09-27 13:46:56,990 INFO mapreduce.Job: Job job_1506498574236_0003 completed successfully
2017-09-27 13:46:57,353 INFO mapreduce.Job: Counters: 53
File System Counters
FILE: Number of bytes read=12215
FILE: Number of bytes written=405925
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16392
HDFS: Number of bytes written=9392
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=47810
Total time spent by all reduces in occupied slots (ms)=52560
Total time spent by all map tasks (ms)=23905
Total time spent by all reduce tasks (ms)=17520
Total vcore-milliseconds taken by all map tasks=23905
Total vcore-milliseconds taken by all reduce tasks=17520
Total megabyte-milliseconds taken by all map tasks=48957440
Total megabyte-milliseconds taken by all reduce tasks=53821440
Map-Reduce Framework
Map input records=365
Map output records=1701
Map output bytes=22630
Map output materialized bytes=12215
Input split bytes=127
Combine input records=1701
Combine output records=708
Reduce input groups=708
Reduce shuffle bytes=12215
Reduce input records=708
Reduce output records=708
Spilled Records=1416
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=264
CPU time spent (ms)=3620
Physical memory (bytes) snapshot=778432512
Virtual memory (bytes) snapshot=7249113088
Total committed heap usage (bytes)=593760256
Peak Map Physical memory (bytes)=641290240
Peak Map Virtual memory (bytes)=2796630016
Peak Reduce Physical memory (bytes)=137142272
Peak Reduce Virtual memory (bytes)=4452483072
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16265
File Output Format Counters
Bytes Written=9392
MAPRED
1. version
hadoopuser@ub16043lts00:~$ mapred version
Hadoop 3.0.0
Source code repository https://git-wip-us.apache.org/repos/asf/hadoop.git -r c25427ceca461ee979d30edd7a4b0f50718e6533
Compiled by andrew on 2017-12-08T19:16Z
Compiled with protoc 2.5.0
From source with checksum 397832cb5529187dc8cd74ad54ff22
This command was run using /usr/local/hadoop-3.0.0/share/hadoop/common/hadoop-common-3.0.0.jar
2. classpath
hadoopuser@ub16043lts00:~$ mapred classpath
/usr/local/hadoop-3.0.0/etc/hadoop:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/common/*:/usr/local/hadoop-3.0.0/share/hadoop/hdfs:/usr/local/hadoop-3.0.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/hdfs/*:/usr/local/hadoop-3.0.0/share/hadoop/mapreduce/*:/usr/local/hadoop-3.0.0/share/hadoop/yarn:/usr/local/hadoop-3.0.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/yarn/*
3. envvars
hadoopuser@ub16043lts00:~$ mapred envvars
JAVA_HOME='/usr/lib/jvm/java-8-openjdk-amd64/jre/'
HADOOP_MAPRED_HOME='/usr/local/hadoop-3.0.0'
MAPRED_DIR='share/hadoop/mapreduce'
MAPRED_LIB_JARS_DIR='share/hadoop/mapreduce/lib'
HADOOP_CONF_DIR='/usr/local/hadoop-3.0.0/etc/hadoop'
HADOOP_TOOLS_HOME='/usr/local/hadoop-3.0.0'
HADOOP_TOOLS_DIR='share/hadoop/tools'
HADOOP_TOOLS_LIB_JARS_DIR='share/hadoop/tools/lib'
4. historyserver
hadoopuser@ub16043lts00:~$ mapred historyserver
STARTUP_MSG: Starting JobHistoryServer
STARTUP_MSG: host = ub16043lts00/10.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 3.0.0
STARTUP_MSG: classpath = /usr/local/hadoop-3.0.0/etc/hadoop:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/jackson-core-2.7.8.jar:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/jetty-servlet-9.3.19.v20170502.jar:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/accessors-smart-1.2.jar
:/usr/local/hadoop-3.0.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.0.0.jar:/usr/local/hadoop-3.0.0/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.0.0.jar:/usr/local/hadoop-3.0.0/share/hadoop/yarn/hadoop-yarn-server-timelineservice-hbase-3.0.0.jar
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r c25427ceca461ee979d30edd7a4b0f50718e6533; compiled by 'andrew' on 2017-12-08T19:16Z
STARTUP_MSG: java = 1.8.0_151
************************************************************/
2018-02-18 21:13:00,901 INFO hs.JobHistoryServer: registered UNIX signal handlers for [TERM, HUP, INT]
2018-02-18 21:13:02,855 INFO beanutils.FluentPropertyBeanIntrospector: Error when creating PropertyDescriptor for public final void org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
2018-02-18 21:13:02,954 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-02-18 21:13:03,207 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2018-02-18 21:13:03,208 INFO impl.MetricsSystemImpl: JobHistoryServer metrics system started
2018-02-18 21:13:03,236 INFO hs.JobHistory: JobHistory Init
2018-02-18 21:13:05,189 INFO jobhistory.JobHistoryUtils: Default file system [hdfs://ub16043lts00:9820]
2018-02-18 21:13:05,946 INFO hs.HistoryFileManager: Perms after creating 504, Expected: 504
2018-02-18 21:13:05,968 INFO jobhistory.JobHistoryUtils: Default file system [hdfs://ub16043lts00:9820]
2018-02-18 21:13:06,017 INFO hs.HistoryFileManager: Initializing Existing Jobs...
2018-02-18 21:13:06,047 INFO hs.HistoryFileManager: Found 0 directories to load
2018-02-18 21:13:06,047 INFO hs.HistoryFileManager: Existing job initialization finished. 0.0% of cache is occupied.
2018-02-18 21:13:06,051 INFO hs.CachedHistoryStorage: CachedHistoryStorage Init
2018-02-18 21:13:06,192 INFO ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2018-02-18 21:13:06,231 INFO ipc.Server: Starting Socket Reader #1 for port 10033
2018-02-18 21:13:06,653 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2018-02-18 21:13:06,656 INFO delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2018-02-18 21:13:06,656 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2018-02-18 21:13:06,877 INFO util.log: Logging initialized @6980ms
2018-02-18 21:13:07,127 INFO server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2018-02-18 21:13:07,132 INFO http.HttpRequestLog: Http request log for http.requests.jobhistory is not defined
2018-02-18 21:13:07,172 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2018-02-18 21:13:07,181 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context jobhistory
2018-02-18 21:13:07,181 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2018-02-18 21:13:07,181 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2018-02-18 21:13:07,202 INFO http.HttpServer2: adding path spec: /jobhistory/*
2018-02-18 21:13:07,202 INFO http.HttpServer2: adding path spec: /ws/*
2018-02-18 21:13:08,306 INFO webapp.WebApps: Registered webapp guice modules
2018-02-18 21:13:08,307 INFO http.HttpServer2: Jetty bound to port 19888
2018-02-18 21:13:08,309 INFO server.Server: jetty-9.3.19.v20170502
2018-02-18 21:13:08,488 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@780ec4a5{/logs,file:///usr/local/hadoop-3.0.0/logs/,AVAILABLE}
2018-02-18 21:13:08,489 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@5aabbb29{/static,jar:file:/usr/local/hadoop-3.0.0/share/hadoop/yarn/hadoop-yarn-common-3.0.0.jar!/webapps/static,AVAILABLE}
Feb 18, 2018 9:13:08 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices as a root resource class
Feb 18, 2018 9:13:08 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.hs.webapp.JAXBContextResolver as a provider class
Feb 18, 2018 9:13:08 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Feb 18, 2018 9:13:08 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
Feb 18, 2018 9:13:09 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.hs.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Feb 18, 2018 9:13:09 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Feb 18, 2018 9:13:10 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
2018-02-18 21:13:10,714 INFO handler.ContextHandler: Started o.e.j.w.WebAppContext@4f0f7849{/,file:///tmp/jetty-ub16043lts00-19888-jobhistory-_-any-4369259852477421612.dir/webapp/,AVAILABLE}{/jobhistory}
2018-02-18 21:13:10,724 INFO server.AbstractConnector: Started ServerConnector@716a412{HTTP/1.1,[http/1.1]}{ub16043lts00:19888}
2018-02-18 21:13:10,725 INFO server.Server: Started @10830ms
2018-02-18 21:13:10,725 INFO webapp.WebApps: Web app jobhistory started at 19888
2018-02-18 21:13:10,773 INFO ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 1000 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2018-02-18 21:13:10,796 INFO ipc.Server: Starting Socket Reader #1 for port 10020
2018-02-18 21:13:10,867 INFO pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB to the server
2018-02-18 21:13:10,867 INFO ipc.Server: IPC Server Responder: starting
2018-02-18 21:13:10,870 INFO ipc.Server: IPC Server listener on 10020: starting
2018-02-18 21:13:10,872 INFO hs.HistoryClientService: Instantiated HistoryClientService at ub16043lts00/10.0.1.1:10020
2018-02-18 21:13:10,873 INFO ipc.Server: IPC Server Responder: starting
2018-02-18 21:13:10,873 INFO ipc.Server: IPC Server listener on 10033: starting
2018-02-18 21:13:10,882 INFO util.JvmPauseMonitor: Starting JVM pause monitor
2018-02-18 21:13:36,659 INFO hs.JobHistory: History Cleaner started
2018-02-18 21:13:36,666 INFO hs.JobHistory: History Cleaner complete
2018-02-18 21:13:51,847 INFO webapp.View: Getting list of all Jobs.
2018-02-18 21:13:52,704 INFO jobhistory.JobSummary: jobId=job_1518966178519_0001,submitTime=1518966449827,launchTime=1518966473634,firstMapTaskLaunchTime=1518966476862,firstReduceTaskLaunchTime=1518966687322,finishTime=1518966722432,resourcesPerMap=1536,resourcesPerReduce=3072,numMaps=1,numReduces=1,succededMaps=1,succeededReduces=1,failedMaps=0,failedReduces=0,killedMaps=0,killedReduces=0,user=hadoopuser,queue=default,status=SUCCEEDED,mapSlotSeconds=278,reduceSlotSeconds=101,jobName=word count
2018-02-18 21:13:52,705 INFO hs.HistoryFileManager: Deleting JobSummary file: [hdfs://ub16043lts00:9820/mr-history/tmp/hadoopuser/job_1518966178519_0001.summary]
2018-02-18 21:13:52,798 INFO hs.HistoryFileManager: Perms after creating 504, Expected: 504
2018-02-18 21:13:52,799 INFO hs.HistoryFileManager: Moving hdfs://ub16043lts00:9820/mr-history/tmp/hadoopuser/job_1518966178519_0001-1518966449827-hadoopuser-word+count-1518966722432-1-1-SUCCEEDED-default-1518966473634.jhist to hdfs://ub16043lts00:9820/mr-history/done/2018/02/18/000000/job_1518966178519_0001-1518966449827-hadoopuser-word+count-1518966722432-1-1-SUCCEEDED-default-1518966473634.jhist
2018-02-18 21:13:53,144 INFO hs.HistoryFileManager: Moving hdfs://ub16043lts00:9820/mr-history/tmp/hadoopuser/job_1518966178519_0001_conf.xml to hdfs://ub16043lts00:9820/mr-history/done/2018/02/18/000000/job_1518966178519_0001_conf.xml
2018-02-18 21:14:35,630 INFO hs.CompletedJob: Loading job: job_1518966178519_0001 from file: hdfs://ub16043lts00:9820/mr-history/done/2018/02/18/000000/job_1518966178519_0001-1518966449827-hadoopuser-word+count-1518966722432-1-1-SUCCEEDED-default-1518966473634.jhist
2018-02-18 21:14:35,630 INFO hs.CompletedJob: Loading history file: [hdfs://ub16043lts00:9820/mr-history/done/2018/02/18/000000/job_1518966178519_0001-1518966449827-hadoopuser-word+count-1518966722432-1-1-SUCCEEDED-default-1518966473634.jhist]
JobHistory web UI
http://<machine_hostname>:<port> -- Default port is 19888 (in this cluster: http://ub16043lts00:19888)
HDFS
1. hdfs jmxget
hadoopuser@ub16043lts00:~$ hdfs jmxget
init: server=localhost;port=;service=NameNode;localVMUrl=null
Domains:
Domain = JMImplementation
Domain = com.sun.management
Domain = java.lang
Domain = java.nio
Domain = java.util.logging
MBeanServer default domain = DefaultDomain
MBean count = 22
Query MBeanServer MBeans:
List of all the available keys:
2. getconf -namenodes
hadoopuser@ub16043lts00:~$ hdfs getconf -namenodes
ub16043lts00
3. getconf -secondaryNameNodes
hadoopuser@ub16043lts00:~$ hdfs getconf -secondaryNameNodes
0.0.0.0
4. getconf -backupNodes
hadoopuser@ub16043lts00:~$ hdfs getconf -backupNodes
0.0.0.0
5. getconf -nnRpcAddresses
hadoopuser@ub16043lts00:~$ hdfs getconf -nnRpcAddresses
ub16043lts00:9820
6. groups
hadoopuser@ub16043lts00:~$ hdfs groups
hadoopuser : hadoopgroup sudo
7. balancer
Run a cluster balancing utility
hadoopuser@ub16043lts00:~$ hdfs balancer
2017-08-31 14:08:10,602 INFO balancer.Balancer: namenodes = [hdfs://10.0.1.1]
2017-08-31 14:08:10,623 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]
2017-08-31 14:08:10,623 INFO balancer.Balancer: included nodes = []
2017-08-31 14:08:10,623 INFO balancer.Balancer: excluded nodes = []
2017-08-31 14:08:10,623 INFO balancer.Balancer: source nodes = []
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
2017-08-31 14:08:17,754 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
2017-08-31 14:08:17,829 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
2017-08-31 14:08:17,829 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)
2017-08-31 14:08:17,910 INFO net.NetworkTopology: Adding a new node: /default-rack/10.0.1.2:9866
2017-08-31 14:08:17,911 INFO net.NetworkTopology: Adding a new node: /default-rack/10.0.1.1:9866
2017-08-31 14:08:17,915 INFO balancer.Balancer: 0 over-utilized: []
2017-08-31 14:08:17,915 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
31 Aug, 2017 2:08:17 PM 0 0 B 0 B 0 B
31 Aug, 2017 2:08:18 PM Balancing took 8.444 seconds
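The balancer's utilization threshold (default 10%) can be overridden; for example, to rebalance until every DataNode is within 5% of the cluster average (the value is illustrative):
hadoopuser@ub16043lts00:~$ hdfs balancer -threshold 5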
8. dfsadmin -printTopology
hadoopuser@ub16043lts00:~$ hdfs dfsadmin -printTopology
Rack: /default-rack
127.0.0.1:50010 (localhost)
9. dfsadmin -report
hadoopuser@ub16043lts00:~$ hdfs dfsadmin -report
Configured Capacity: 142186881024 (132.42 GB)
Present Capacity: 123893272576 (115.38 GB)
DFS Remaining: 123893207040 (115.38 GB)
DFS Used: 65536 (64 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (2):
Name: 10.0.1.1:9866 (ub16043lts00)
Hostname: ub16043lts00
Decommission Status : Normal
Configured Capacity: 95145664512 (88.61 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 5825781760 (5.43 GB)
DFS Remaining: 84463058944 (78.66 GB)
DFS Used%: 0.00%
DFS Remaining%: 88.77%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Aug 31 14:09:22 IST 2017
Last Block Report: Thu Aug 31 13:52:52 IST 2017
Name: 10.0.1.2:9866 (ub16043lts01)
Hostname: ub16043lts01
Decommission Status : Normal
Configured Capacity: 47041216512 (43.81 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 5222047744 (4.86 GB)
DFS Remaining: 39430148096 (36.72 GB)
DFS Used%: 0.00%
DFS Remaining%: 83.82%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Aug 31 14:09:23 IST 2017
Last Block Report: Thu Aug 31 13:52:52 IST 2017
10. dfs -mkdir
hadoopuser@ub16043lts00:~$ hdfs dfs -mkdir /user_gauravb
hadoopuser@ub16043lts00:~$ hdfs dfs -mkdir /user_hadoop3x
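The -p flag creates parent directories as needed, so a nested path can be made in one step (the path is illustrative):
hadoopuser@ub16043lts00:~$ hdfs dfs -mkdir -p /user_gauravb/inputData/2018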
11. dfs -ls
List the contents of the root directory in HDFS
hadoopuser@ub16043lts00:~$ hdfs dfs -ls /
Found 4 items
drwxr-xr-x - hadoopuser supergroup 0 2017-08-31 14:08 /system
drwxr-xr-x - hadoopuser supergroup 0 2017-08-31 14:32 /user
drwxr-xr-x - hadoopuser supergroup 0 2017-08-31 14:32 /user_gauravb
drwxr-xr-x - hadoopuser supergroup 0 2017-08-31 14:32 /user_hadoop3x
12. dfs -ls -R
Behaves like -ls, but recursively displays entries in all subdirectories of path.
hadoopuser@ub16043lts00:~$ hdfs dfs -ls -R /
Found 20 items
drwxr-xr-x - hadoopuser supergroup 0 2014-05-08 13:00 /user/hdfs
-rw-r--r-- 1 hadoopuser supergroup 0 2014-05-02 11:17 /user/hdfs/_SUCCESS
-rw-r--r-- 10 hadoopuser supergroup 0 2014-05-02 11:14 /user/hdfs/_partition.lst
-rw-r--r-- 1 hadoopuser supergroup 100000000 2014-05-02 11:17 /user/hdfs/part-r-00000
drwxr-xr-x - hadoopuser supergroup 0 2014-04-17 17:03 /user/hdfs/terasort-input
drwxr-xr-x - hadoopuser supergroup 0 2014-04-22 15:51 /user/hdfs/terasort-input01
-rw-r--r-- 1 hadoopuser supergroup 0 2014-04-22 15:51 /user/hdfs/terasort-input01/_SUCCESS
-rw-r--r-- 1 hadoopuser supergroup 50000 2014-04-22 15:50 /user/hdfs/terasort-input01/part-m-00000
-rw-r--r-- 1 hadoopuser supergroup 50000 2014-04-22 15:50 /user/hdfs/terasort-input01/part-m-00001
drwxr-xr-x - hadoopuser supergroup 0 2014-05-02 11:08 /user/hdfs/terasort-input02
-rw-r--r-- 1 hadoopuser supergroup 0 2014-05-02 11:08 /user/hdfs/terasort-input02/_SUCCESS
-rw-r--r-- 1 hadoopuser supergroup 50000000 2014-05-02 11:07 /user/hdfs/terasort-input02/part-m-00000
-rw-r--r-- 1 hadoopuser supergroup 50000000 2014-05-02 11:07 /user/hdfs/terasort-input02/part-m-00001
drwxr-xr-x - hadoopuser supergroup 0 2014-05-02 11:28 /user/hdfs/terasort-output02
-rw-r--r-- 1 hadoopuser supergroup 0 2014-05-02 11:28 /user/hdfs/terasort-output02/_SUCCESS
-rw-r--r-- 10 hadoopuser supergroup 0 2014-05-02 11:25 /user/hdfs/terasort-output02/_partition.lst
-rw-r--r-- 1 hadoopuser supergroup 100000000 2014-05-02 11:28 /user/hdfs/terasort-output02/part-r-00000
drwxr-xr-x - hadoopuser supergroup 0 2014-05-02 11:30 /user/hdfs/teravalidate-output02
-rw-r--r-- 1 hadoopuser supergroup 0 2014-05-02 11:30 /user/hdfs/teravalidate-output02/_SUCCESS
-rw-r--r-- 1 hadoopuser supergroup 23 2014-05-02 11:30 /user/hdfs/teravalidate-output02/part-r-00000
13. dfs -copyFromLocal
copyFromLocal is similar to the put command, except that the source is restricted to a local file reference. In other words, everything you can do with copyFromLocal you can also do with put, but not vice versa.
hadoopuser@ub16043lts00:~$ hdfs dfs -copyFromLocal /home/hadoopuser/Documents/hadoop_installation /user_gauravb
14. dfs -put
hadoopuser@ub16043lts00:~$ hdfs dfs -put /home/hadoopuser/Documents/hadoop_installation /user_gauravb
hadoopuser@ub16043lts00:~$ hdfs dfs -put /home/gb/pagecounts-20081001-000000.gz /user/HiveInputData
hadoopuser@ub16043lts00:~$ hdfs dfs -put /home/gb/hadoop/tables.ddl /user/HiveInputData
Note: files with different extensions, such as .gz and .ddl, are added the same way.
Difference between "copyFromLocal" and "put"
If your HDFS contains the path /tmp/files/file_name.txt and your local disk also contains the same path, the HDFS API won't know which one you mean unless you specify a scheme such as file:// or hdfs://.
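For example, fully qualified URIs remove the ambiguity (the paths below are illustrative; the NameNode address matches this cluster's default file system):
hadoopuser@ub16043lts00:~$ hdfs dfs -put file:///tmp/files/file_name.txt hdfs://ub16043lts00:9820/tmp/files/file_name.txt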
15. chown
hadoopuser@ub16043lts00:~$ hdfs dfs -chown root:supergroup /user/HDFSInputData
16. chmod
hadoopuser@ub16043lts00:~$ hdfs dfs -chmod -R 777 /user/HDFSInputData
17. fsck
hadoopuser@ub16043lts00:~$ hdfs fsck /user/hive/warehouse/
....
....
/user/hive/warehouse/movielens.db/users/users.txt: Under replicated BP-1678988347-10.0.1.2-1524751491718:blk_1073741944_1120. Target Replicas is 2 but found 1 replica(s).
Status: HEALTHY
Total size: 41460685 B
Total dirs: 12
Total files: 9
Total symlinks: 0
Total blocks (validated): 9 (avg. block size 4606742 B)
Minimally replicated blocks: 9 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 9 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 9 (50.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Tue May 08 13:06:03 IST 2018 in 4 milliseconds
The filesystem under path '/user/hive/warehouse' is HEALTHY
18. fsck
hadoopuser@ub16043lts00:~$ hdfs fsck /user/hive/warehouse/ -locations -blocks -files
Connecting to namenode via http://localhost:50070/fsck?ugi=bigdatauser&locations=1&blocks=1&files=1&path=%2Fuser%2Fhive%2Fwarehouse
FSCK started by bigdatauser (auth:SIMPLE) from /127.0.0.1 for path /user/hive/warehouse at Tue May 08 13:09:09 IST 2018
/user/hive/warehouse <dir>
/user/hive/warehouse/movielens.db <dir>
/user/hive/warehouse/movielens.db/movies <dir>
/user/hive/warehouse/movielens.db/movies/movies.txt 163542 bytes, 1 block(s): Under replicated BP-1678988347-10.0.1.2-1524751491718:blk_1073741946_1122. Target Replicas is 2 but found 1 replica(s).
0. BP-1678988347-10.0.1.2-1524751491718:blk_1073741946_1122 len=163542 repl=1 [DatanodeInfoWithStorage[127.0.0.1:50010,DS-0842bbd5-3c1f-4ad2-a792-afc01460a63d,DISK]]
/user/hive/warehouse/movielens.db/occupations <dir>
/user/hive/warehouse/movielens.db/occupations/occupations.txt 345 bytes, 1 block(s): Under replicated BP-1678988347-10.0.1.2-1524751491718:blk_1073741945_1121. Target Replicas is 2 but found 1 replica(s).
0. BP-1678988347-10.0.1.2-1524751491718:blk_1073741945_1121 len=345 repl=1 [DatanodeInfoWithStorage[127.0.0.1:50010,DS-0842bbd5-3c1f-4ad2-a792-afc01460a63d,DISK]]
/user/hive/warehouse/movielens.db/users <dir>
/user/hive/warehouse/movielens.db/users/users.txt 110208 bytes, 1 block(s): Under replicated BP-1678988347-10.0.1.2-1524751491718:blk_1073741944_1120. Target Replicas is 2 but found 1 replica(s).
0. BP-1678988347-10.0.1.2-1524751491718:blk_1073741944_1120 len=110208 repl=1 [DatanodeInfoWithStorage[127.0.0.1:50010,DS-0842bbd5-3c1f-4ad2-a792-afc01460a63d,DISK]]
Status: HEALTHY
Total size: 41460685 B
Total dirs: 12
Total files: 9
Total symlinks: 0
Total blocks (validated): 9 (avg. block size 4606742 B)
Minimally replicated blocks: 9 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 9 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 9 (50.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Tue May 08 13:09:09 IST 2018 in 41 milliseconds
The filesystem under path '/user/hive/warehouse' is HEALTHY
19. fsck
hadoopuser@ub16043lts00:~$ hdfs fsck -list-corruptfileblocks
Connecting to namenode via http://localhost:50070/fsck?ugi=bigdatauser&listcorruptfileblocks=1&path=%2F
The filesystem under path '/' has 0 CORRUPT files
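fsck can also act on damaged files: -move relocates corrupted files to /lost+found and -delete removes them permanently (use with care; the path is illustrative):
hadoopuser@ub16043lts00:~$ hdfs fsck /user/hive/warehouse -delete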
20. setrep [-w (wait for replication to complete), -R (recursive)]
hadoopuser@ub16043lts00:~$ hdfs dfs -setrep 4 /user/hive/warehouse/movielens.db/users/users.txt
Replication 4 set: /user/hive/warehouse/movielens.db/users/users.txt
hadoopuser@ub16043lts00:~$ hdfs dfs -setrep -w 4 /user/hive/warehouse/movielens.db/users/users.txt
hadoopuser@ub16043lts00:~$ hdfs dfs -setrep -R 4 /user/hive/warehouse/movielens.db/users/users.txt
21. du
hadoopuser@ub16043lts00:~$ hdfs dfs -du -h /
1.3 K /README.txt
3.9 M /cleaned_user_train.csv
1.3 K /g
1.3 K /gauravb
1.3 K /gauravb1
3.3 M /mr-history
575 /pig_data
223 /pig_output
223.7 M /spark-jars
181.9 M /tmp
39.5 M /user
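Adding -s prints a single aggregated summary for the path instead of one line per entry:
hadoopuser@ub16043lts00:~$ hdfs dfs -du -s -h /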
22. df
hadoopuser@ub16043lts00:~$ hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://localhost:8020 43.8 G 457.5 M 3.4 G 1%
23. storagepolicies
hadoopuser@ub16043lts00:~$ hdfs storagepolicies -listPolicies
Block Storage Policies:
BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], replicationFallbacks=[]}
BlockStoragePolicy{WARM:5, storageTypes=[DISK, ARCHIVE], creationFallbacks=[DISK, ARCHIVE], replicationFallbacks=[DISK, ARCHIVE]}
BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
BlockStoragePolicy{ONE_SSD:10, storageTypes=[SSD, DISK], creationFallbacks=[SSD, DISK], replicationFallbacks=[SSD, DISK]}
BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}
BlockStoragePolicy{LAZY_PERSIST:15, storageTypes=[RAM_DISK, DISK], creationFallbacks=[DISK], replicationFallbacks=[DISK]}
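A policy can be assigned to a path and read back with -setStoragePolicy and -getStoragePolicy (the path and policy below are illustrative):
hadoopuser@ub16043lts00:~$ hdfs storagepolicies -setStoragePolicy -path /user_gauravb -policy HOT
hadoopuser@ub16043lts00:~$ hdfs storagepolicies -getStoragePolicy -path /user_gauravb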
24. fetchImage
hadoopuser@ub16043lts00:~$ hdfs dfsadmin -fetchImage /home/bigdatauser/hadoop_image_08_may_2018
18/05/08 13:32:05 INFO namenode.TransferFsImage: Opening connection to http://localhost:50070/imagetransfer?getimage=1&txid=latest
18/05/08 13:32:05 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
18/05/08 13:32:05 INFO namenode.TransferFsImage: Transfer took 0.14s at 240.88 KB/s
YARN
1. version
hadoopuser@ub16043lts00:~$ yarn version
Hadoop 3.0.0
Source code repository https://git-wip-us.apache.org/repos/asf/hadoop.git -r c25427ceca461ee979d30edd7a4b0f50718e6533
Compiled by andrew on 2017-12-08T19:16Z
Compiled with protoc 2.5.0
From source with checksum 397832cb5529187dc8cd74ad54ff22
This command was run using /usr/local/hadoop-3.0.0/share/hadoop/common/hadoop-common-3.0.0.jar
2. node -list
hadoopuser@ub16043lts00:~$ yarn node -list
2017-08-31 14:14:12,233 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total Nodes:1
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
ub16043lts00:35581 RUNNING ub16043lts00:8042 0
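By default only nodes in the RUNNING state are shown; the -all option should list nodes in every state:
hadoopuser@ub16043lts00:~$ yarn node -list -all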
3. classpath
hadoopuser@ub16043lts00:~$ yarn classpath
/usr/local/hadoop-3.0.0/etc/hadoop:/usr/local/hadoop-3.0.0/share/hadoop/common/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/common/*:/usr/local/hadoop-3.0.0/share/hadoop/hdfs:/usr/local/hadoop-3.0.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/hdfs/*:/usr/local/hadoop-3.0.0/share/hadoop/mapreduce/*:/usr/local/hadoop-3.0.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.0.0/share/hadoop/yarn/*
4.