Hadoop

The NameNode (SE) keeps an image of the entire file system namespace and file Blockmap in memory. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. In the current implementation, a checkpoint only occurs when the NameNode starts up.
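The checkpoint sequence described above can be sketched as a toy model. The file formats here are invented for illustration (one path per line for the image, "add PATH" or "del PATH" per line for the log, paths without whitespace); they are not Hadoop's real FsImage or EditLog formats, and checkpoint is a hypothetical helper name.

```shell
#!/bin/sh
# Toy model of a NameNode checkpoint. Formats are assumptions for this sketch:
#   fsimage: one path per line
#   editlog: lines of "add PATH" or "del PATH", applied in order

checkpoint() {
    fsimage=$1
    editlog=$2
    new_image="$fsimage.new"

    # Load the old image into memory, then replay every EditLog transaction.
    awk '
        FILENAME == ARGV[1] { if ($0 != "") ns[$0] = 1; next }  # load image
        $1 == "add" { ns[$2] = 1 }                              # replay edits
        $1 == "del" { delete ns[$2] }
        END { for (p in ns) print p }
    ' "$fsimage" "$editlog" | sort > "$new_image"

    # Flush the merged namespace as the new image, then truncate the log,
    # because its transactions are now part of the persistent image.
    mv "$new_image" "$fsimage"
    : > "$editlog"
}
```

Calling checkpoint with an image file and a log file leaves the merged namespace in the image file and an empty log, which mirrors the order of operations at NameNode startup.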

The conf/hadoop-default.xml file contains default values for every parameter in Hadoop and is considered read-only. To override a default, set a new value in conf/hadoop-site.xml. The hadoop-site.xml file should be replicated consistently across all machines in the cluster.
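A minimal sketch of an override file and one way to copy it out. The helper names write_site_conf and push_site_conf, the value 2, the hosts file, and the assumption that every node uses the same conf path are illustrative and not taken from this page; dfs.replication is a standard HDFS property.

```shell
#!/bin/sh
# Write a hadoop-site.xml that overrides a single default.
# Usage: write_site_conf CONF_DIR
write_site_conf() {
    conf_dir=$1
    mkdir -p "$conf_dir"
    cat > "$conf_dir/hadoop-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF
}

# Copy the override file to every host listed in HOSTS_FILE (one per line).
# Assumes passwordless ssh and the same CONF_DIR path on every node.
# Usage: push_site_conf CONF_DIR HOSTS_FILE
push_site_conf() {
    conf_dir=$1
    hosts_file=$2
    while read -r host; do
        [ -n "$host" ] || continue
        rsync -a "$conf_dir/hadoop-site.xml" "$host:$conf_dir/" < /dev/null
    done < "$hosts_file"
}
```

Keeping the overrides in one file and pushing that same file to all nodes avoids per-machine drift in the configuration.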

Hadoop Administration

Put the cluster in Safemode

$ hadoop dfsadmin -safemode enter

The -safemode option accepts one of: enter, leave, get, wait.

Generate a list of DataNodes

$ hadoop dfsadmin -report

Decommission the DataNode named datanodename

$ hadoop dfsadmin -decommission datanodename

Provide more usage information about a particular command

$ hadoop dfsadmin -help cmd

Balance data nodes

$ hadoop balancer

Check Filesystem

$ hadoop fsck /

Move corrupt files to /lost+found

$ hadoop fsck / -move

Delete corrupted files

$ hadoop fsck / -delete
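Since -move and -delete change the file system, it can help to look at the plain fsck result first. A minimal sketch, assuming the fsck summary line contains "is HEALTHY" or "is CORRUPT"; fsck_status is a hypothetical helper name, and the exact wording of the summary should be checked against the installed Hadoop version.

```shell
#!/bin/sh
# Classify fsck output read from stdin as HEALTHY, CORRUPT, or UNKNOWN.
# Assumption: the summary line ends in "is HEALTHY" or "is CORRUPT".
fsck_status() {
    out=$(cat)
    case $out in
        *"is CORRUPT"*) echo CORRUPT ;;
        *"is HEALTHY"*) echo HEALTHY ;;
        *)              echo UNKNOWN ;;
    esac
}

# Usage on a live cluster (not executed here):
#   status=$(hadoop fsck / | fsck_status)
#   [ "$status" = CORRUPT ] && hadoop fsck / -move
```

Checking for CORRUPT before HEALTHY means ambiguous output is never reported as healthy.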

To decommission a data node, first add it to /etc/hadoop/hosts_exclude, then run

$ hadoop dfsadmin -refreshNodes

The key dfs.hosts.exclude in conf/hadoop-site.xml sets the path of the hosts_exclude file.
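The two decommission steps can be wrapped in a small script. This is a sketch under stated assumptions: decommission_node is a hypothetical helper, the HADOOP and EXCLUDE_FILE variables are introduced here so the paths can be overridden, and the host name in the usage line is a placeholder.

```shell
#!/bin/sh
# In conf/hadoop-site.xml the exclude file would be declared roughly as:
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/etc/hadoop/hosts_exclude</value>
#   </property>

# Overridable settings; the defaults follow the path used on this page.
HADOOP=${HADOOP:-hadoop}
EXCLUDE_FILE=${EXCLUDE_FILE:-/etc/hadoop/hosts_exclude}

# Usage: decommission_node HOSTNAME
decommission_node() {
    node=$1
    # Add the host only once: -x matches whole lines, -F disables regex.
    grep -qxF "$node" "$EXCLUDE_FILE" 2>/dev/null || echo "$node" >> "$EXCLUDE_FILE"
    # Ask the NameNode to re-read its host lists.
    $HADOOP dfsadmin -refreshNodes
}

# Example (not executed here):
#   decommission_node dn1.example.org
```

Progress can then be followed with hadoop dfsadmin -report until the node is listed as decommissioned.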

tier3/hadoop.txt · Last modified: 2012/12/14 14:59 by tlknight