

  • Category Archives: Uncategorized
  • Hadoop installation configuration files and HDFS initialization

    --core-site.xml
    <configuration>
     <property>
       <name>fs.default.name</name>
       <value>hdfs://hadoop1:54321</value>
     </property>
     <property>
       <name>hadoop.tmp.dir</name>
       <value>/data/hdfs/tmp</value>
     </property>
    </configuration>
    
    
     
    --hdfs-site.xml
    <configuration>
     <property>
       <name>dfs.name.dir</name>
       <value>/data/hdfs/name</value>
     </property>
     <property>
       <name>dfs.data.dir</name>
       <value>/data/hdfs/data</value>
     </property>
     <property>
       <name>fs.checkpoint.dir</name>
       <value>/data/hdfs/namesecondary</value>
     </property>
    </configuration>
    
    
    --mapred-site.xml
    <configuration>
     <property>
       <name>mapred.job.tracker</name>
       <value>hadoop1:54320</value>
     </property>
     <property>
       <name>mapred.local.dir</name>
       <value>/data/mapred/local</value>
     </property>
     <property>
       <name>mapred.system.dir</name>
       <value>/data/mapred/system</value>
     </property>
     <property>
       <name>mapred.tasktracker.map.tasks.maximum</name>
       <value>7</value>
     </property>
     <property>
       <name>mapred.tasktracker.reduce.tasks.maximum</name>
       <value>7</value>
     </property>
     <property>
       <name>mapred.child.java.opts</name>
       <value>-Xmx400m</value>
     </property>
    </configuration>
    
    
    
    
    
    --yarn-site.xml
    <configuration>
     <property>
       <name>yarn.resourcemanager.address</name>
       <value>hadoop1:8080</value>
     </property>
     <property>
       <name>yarn.resourcemanager.scheduler.address</name>
       <value>hadoop1:8081</value>
     </property>
     <property>
       <name>yarn.resourcemanager.resource-tracker.address</name>
       <value>hadoop1:8082</value>
     </property>
     <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
     </property>
     <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
     </property>
    </configuration>
    
    
    
    ======================================================================================
    =========================Format HDFS=================================================
    
    [hadoop@hadoop1 sbin]$ hdfs namenode -format                                                 // the old 'hadoop namenode -format' syntax is deprecated
    14/04/21 20:43:41 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = hadoop1/192.168.0.201
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 2.2.0
    ....
    
    ************************************************************/
    14/04/21 20:43:41 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    14/04/21 20:43:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    14/04/21 20:43:41 WARN common.Util: Path /data/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    14/04/21 20:43:41 WARN common.Util: Path /data/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    Formatting using clusterid: CID-99c8a03a-d159-45f3-9493-5737343c5209
    14/04/21 20:43:41 INFO namenode.HostFileManager: read includes:
    HostSet(
    )
    14/04/21 20:43:41 INFO namenode.HostFileManager: read excludes:
    HostSet(
    )
    14/04/21 20:43:41 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
    14/04/21 20:43:41 INFO util.GSet: Computing capacity for map BlocksMap
    14/04/21 20:43:41 INFO util.GSet: VM type       = 32-bit
    14/04/21 20:43:41 INFO util.GSet: 2.0% max memory = 966.7 MB
    14/04/21 20:43:41 INFO util.GSet: capacity      = 2^22 = 4194304 entries
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: defaultReplication         = 3
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: maxReplication             = 512
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: minReplication             = 1
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
    14/04/21 20:43:43 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
    14/04/21 20:43:43 INFO namenode.FSNamesystem: fsOwner             = hadoop (auth:SIMPLE)
    14/04/21 20:43:43 INFO namenode.FSNamesystem: supergroup          = supergroup
    14/04/21 20:43:43 INFO namenode.FSNamesystem: isPermissionEnabled = true
    14/04/21 20:43:43 INFO namenode.FSNamesystem: HA Enabled: false
    14/04/21 20:43:43 INFO namenode.FSNamesystem: Append Enabled: true
    14/04/21 20:43:43 INFO util.GSet: Computing capacity for map INodeMap
    14/04/21 20:43:43 INFO util.GSet: VM type       = 32-bit
    14/04/21 20:43:43 INFO util.GSet: 1.0% max memory = 966.7 MB
    14/04/21 20:43:43 INFO util.GSet: capacity      = 2^21 = 2097152 entries
    14/04/21 20:43:44 INFO namenode.NameNode: Caching file names occuring more than 10 times
    14/04/21 20:43:44 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
    14/04/21 20:43:44 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
    14/04/21 20:43:44 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
    14/04/21 20:43:44 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
    14/04/21 20:43:44 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
    14/04/21 20:43:44 INFO util.GSet: Computing capacity for map Namenode Retry Cache
    14/04/21 20:43:44 INFO util.GSet: VM type       = 32-bit
    14/04/21 20:43:44 INFO util.GSet: 0.029999999329447746% max memory = 966.7 MB
    14/04/21 20:43:44 INFO util.GSet: capacity      = 2^16 = 65536 entries
    
    Re-format filesystem in Storage Directory /data/hdfs/name ? (Y or N) Y
    14/04/21 20:46:18 INFO common.Storage: Storage directory /data/hdfs/name has been successfully formatted.
    14/04/21 20:46:19 INFO namenode.FSImage: Saving image file /data/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
    14/04/21 20:46:19 INFO namenode.FSImage: Image file /data/hdfs/name/current/fsimage.ckpt_0000000000000000000 of size 198 bytes saved in 0 seconds.
    14/04/21 20:46:19 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    14/04/21 20:46:19 INFO util.ExitUtil: Exiting with status 0
    14/04/21 20:46:19 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at hadoop1/192.168.0.201
    ************************************************************/
    
    
    ===============================================================================================
    =========================================Start HDFS============================================
    
    
    
    [hadoop@hadoop1 hadoop]$ start-dfs.sh 
    14/04/21 20:54:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [hadoop1]
    hadoop1: Error: JAVA_HOME is not set and could not be found.
    hadoop2: Error: JAVA_HOME is not set and could not be found.
    Starting secondary namenodes [0.0.0.0]
    
    Fix: in hadoop-env.sh, set JAVA_HOME to the directory of the newly installed JDK:
    export JAVA_HOME=/opt/jdk
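
    One way to make that edit non-interactively (a sketch; in Hadoop 2.x hadoop-env.sh lives under $HADOOP_HOME/etc/hadoop):

    sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/opt/jdk|' $HADOOP_HOME/etc/hadoop/hadoop-env.sh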
    
    
    
    
    [hadoop@hadoop1 hadoop]$ start-dfs.sh 
    14/04/21 21:50:45 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
    14/04/21 21:50:45 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
    14/04/21 21:50:45 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
    14/04/21 21:50:45 DEBUG security.Groups:  Creating new Groups object
    14/04/21 21:50:45 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
    14/04/21 21:50:45 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /usr/local/hadoop2.2/lib/native/libhadoop.so.1.0.0: /lib/libc.so.6: version `GLIBC_2.6' not found (required by /usr/local/hadoop2.2/lib/native/libhadoop.so.1.0.0)
    14/04/21 21:50:45 DEBUG util.NativeCodeLoader: java.library.path=/usr/local/hadoop2.2/lib/native
    14/04/21 21:50:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    14/04/21 21:50:45 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
    14/04/21 21:50:45 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
    14/04/21 21:50:45 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000
    14/04/21 21:50:45 DEBUG security.UserGroupInformation: hadoop login
    14/04/21 21:50:45 DEBUG security.UserGroupInformation: hadoop login commit
    14/04/21 21:50:45 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hadoop
    14/04/21 21:50:45 DEBUG security.UserGroupInformation: UGI loginUser:hadoop (auth:SIMPLE)
    14/04/21 21:50:45 DEBUG security.UserGroupInformation: PrivilegedAction as:hadoop (auth:SIMPLE) from:org.apache.hadoop.hdfs.tools.GetConf.run(GetConf.java:314)
    14/04/21 21:50:45 DEBUG impl.MetricsSystemImpl: StartupProgress, NameNode startup progress
    Starting namenodes on [hadoop1]
    hadoop1: starting namenode, logging to /usr/local/hadoop2.2/logs/hadoop-hadoop-namenode-hadoop1.out
    hadoop2: starting datanode, logging to /usr/local/hadoop2.2/logs/hadoop-hadoop-datanode-hadoop2.out
    Starting secondary namenodes [0.0.0.0]
    The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
    RSA key fingerprint is a1:60:f5:71:da:5a:ca:75:f8:e5:8a:d5:eb:84:95:60.
    Are you sure you want to continue connecting (yes/no)? yes
    0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
    0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop2.2/logs/hadoop-hadoop-secondarynamenode-hadoop1.out
    14/04/21 21:52:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
    14/04/21 21:52:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
    14/04/21 21:52:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
    14/04/21 21:52:56 DEBUG security.Groups:  Creating new Groups object
    14/04/21 21:52:56 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
    14/04/21 21:52:56 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /usr/local/hadoop2.2/lib/native/libhadoop.so.1.0.0: /lib/libc.so.6: version `GLIBC_2.6' not found (required by /usr/local/hadoop2.2/lib/native/libhadoop.so.1.0.0)
    14/04/21 21:52:56 DEBUG util.NativeCodeLoader: java.library.path=/usr/local/hadoop2.2/lib/native
    14/04/21 21:52:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    14/04/21 21:52:56 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
    14/04/21 21:52:56 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
    14/04/21 21:52:56 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000
    14/04/21 21:52:56 DEBUG security.UserGroupInformation: hadoop login
    14/04/21 21:52:56 DEBUG security.UserGroupInformation: hadoop login commit
    14/04/21 21:52:56 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hadoop
    14/04/21 21:52:56 DEBUG security.UserGroupInformation: UGI loginUser:hadoop (auth:SIMPLE)
    14/04/21 21:52:56 DEBUG security.UserGroupInformation: PrivilegedAction as:hadoop (auth:SIMPLE) from:org.apache.hadoop.hdfs.tools.GetConf.run(GetConf.java:314)
    
    
    
    
    
    
    
    


  • Hadoop Installation Manual (Part 1)

    
    ########################################################
    Configure hosts
    The NameNode machine needs the IPs of every machine in the cluster.
    Edit /etc/hosts:
    10.10.236.190   master
    10.10.236.191   slave-A
    10.10.236.193   slave-B
    On the other DataNodes, /etc/hosts only needs the NameNode's IP and the local machine's IP:
    10.10.236.190   master
    10.10.236.191   slave-A
    #######################################################
    
    
    1. Topology

    Directory layout:
    /usr/local/hadoop2.2               -- Hadoop home directory
    /data/hdfs                         -- HDFS data directory
    Architecture: the NameNode storage is built on RAID 5
    
    
    2. Create a dedicated hadoop user
    useradd -u 1001 hadoop
    passwd hadoop
    
    3. Download and install the software
       1) JDK 1.6 or later (/opt/jdk1.6)
       2) hadoop-xx.tar.gz
    
    mkdir -p /usr/local/hadoop2.2
    tar -zxvf hadoop-2.2.0.tar.gz -C /usr/local/hadoop2.2
    
    chown hadoop:hadoop -R /usr/local/hadoop2.2
    chmod 755 -R /usr/local/hadoop2.2
    
    4. Set up SSH mutual trust between the hadoop users

    4.1) Each node generates its own key pair
    
    --node1:
    mkdir -p ~/.ssh
    chmod 755 ~/.ssh
    --node2:
    mkdir -p ~/.ssh
    chmod 755 ~/.ssh
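
    The key pairs aggregated below must exist first; a sketch that generates them on each node, with empty passphrases so logins stay non-interactive:

    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa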
    
    --node1:
    
    ssh hadoop1 cat ~/.ssh/id_dsa.pub >> authorized_keys
    ssh hadoop1 cat ~/.ssh/id_rsa.pub >> authorized_keys
    ssh hadoop2 cat ~/.ssh/id_dsa.pub >> authorized_keys
    ssh hadoop2 cat ~/.ssh/id_rsa.pub >> authorized_keys
    
    4.2) Distribute authorized_keys to the other nodes
    Copy node1's authorized_keys to the remote node2:
    scp authorized_keys hadoop2:~/.ssh/
    --node2:
    chmod 600 authorized_keys
    
    4.3) Test
    --node1
    [hadoop@hadoop1 .ssh]$ ssh hadoop1 date
    Mon Apr 21 18:37:09 PDT 2014
    [hadoop@hadoop1 .ssh]$ ssh hadoop2 date
    Mon Apr 21 18:48:02 PDT 2014
    
    --node2
    ssh hadoop1 date
    The authenticity of host 'hadoop1 (192.168.0.201)' can't be established.
    RSA key fingerprint is a1:60:f5:71:da:5a:ca:75:f8:e5:8a:d5:eb:84:95:60.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'hadoop1,192.168.0.201' (RSA) to the list of known hosts. -- node2 has no known_hosts entry yet, so the host must be added on the first connection
    
    [hadoop@hadoop2 .ssh]$ ssh hadoop1 date
    Mon Apr 21 18:37:37 PDT 2014                                                                               -- confirms the trust works
    
    [hadoop@hadoop2 .ssh]$ ssh hadoop2 date
    The authenticity of host 'hadoop2 (192.168.0.202)' can't be established.
    RSA key fingerprint is 7c:06:f6:12:09:ce:33:1b:8b:ad:88:94:f5:14:f5:15.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'hadoop2,192.168.0.202' (RSA) to the list of known hosts.  -- added to known_hosts
    Mon Apr 21 18:48:32 PDT 2014
    [hadoop@hadoop2 .ssh]$ ssh hadoop2 date                                 
    Mon Apr 21 18:48:34 PDT 2014                                                                                -- confirms the trust works
    
    
    
    5. Edit the hadoop user's default configuration, .bash_profile

    export JAVA_HOME=/opt/jdk1.6
    export JRE_HOME=$JAVA_HOME/jre
    export HADOOP_HOME=/usr/local/hadoop2.2
    export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
    export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
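
    After editing, reload the profile and spot-check the environment (a quick sanity check):

    source ~/.bash_profile
    echo $JAVA_HOME $HADOOP_HOME
    which hadoop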
    
    6. Cluster roles
    hadoop1 is the NameNode
    hadoop2 (and any further nodes) are DataNodes
    
    7. Configure Hadoop
    cd $HADOOP_HOME/etc/hadoop
    7.1.1) Edit hadoop-env.sh and point it at the JDK    // not needed here if already set in the user's environment variables

    In hadoop-env.sh:
    export JAVA_HOME=/opt/jdk1.6                    // must be changed
    --  If the SSH port is not the default 22, change it in conf/hadoop-env.sh, e.g.:
    --export HADOOP_SSH_OPTS="-p 1234"       // no change needed if the port is the default 22
    7.1.2) Configure yarn-env.sh
    Change:
    export JAVA_HOME=/opt/jdk1.6
    
    7.2) Edit core-site.xml and add the following:
    fs.default.name    hdfs://master:54321                       // this is what actually determines the NameNode
    hadoop.tmp.dir     /data/hdfs/tmp                            // temporary files; can be deleted when troubleshooting. A base for other temporary directories.
    7.3) Edit hdfs-site.xml and add the following:
    dfs.name.dir  /data/hdfs/name                                // local path where the NameNode persists the namespace and transaction log
    dfs.data.dir  /data/hdfs/data                                // path where DataNodes store data

    --dfs.datanode.max.xcievers  4096
    --dfs.replication  1                                         // number of replicas; default is 3
    --mapred.job.tracker  master:54311                           // JobTracker host
    
    
    7.4) Edit mapred-site.xml and add the following:
    mapred.job.tracker  hadoop1:54320

    7.5) Edit masters; this decides which node runs the SecondaryNameNode   // create the file by hand
    hadoop1
    7.6) Edit slaves; this lists all DataNode machines                      // create the file by hand
    hadoop2
    hadoop3 (append a line per additional node)
    
    7.7) Configure YARN resource management (the yarn-site.xml settings shown in the previous post)
    
    
    7.8) Copy the finished Hadoop configuration files to all DataNodes, e.g. as sketched below
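
    A minimal sketch, assuming every node uses the same $HADOOP_HOME layout (add one scp per extra DataNode):

    scp -r $HADOOP_HOME/etc/hadoop/* hadoop2:$HADOOP_HOME/etc/hadoop/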
     
    
    
    
    8. Format the NameNode of the HDFS filesystem
    hdfs namenode -format                 // the old 'hadoop namenode -format' form is deprecated
    Y
    
    9. Start the Hadoop cluster
    --$HADOOP_HOME/bin/start-all.sh       -- deprecated
    $HADOOP_HOME/sbin/start-dfs.sh

    The command above is equivalent to starting the following daemons:
     hdfs namenode  
     hdfs secondarynamenode  
     hdfs datanode  
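
    In Hadoop 2.x the daemons can also be brought up one at a time with the sbin scripts; a sketch, run on whichever host should carry each role:

    $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode              -- on hadoop1
    $HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode     -- on hadoop1
    $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode              -- on each datanode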
    10. Verify:
    --name node
    [hadoop@hadoop1 hadoop]$ start-yarn.sh 
    starting yarn daemons
    starting resourcemanager, logging to /usr/local/hadoop2.2/logs/yarn-hadoop-resourcemanager-hadoop1.out
    hadoop2: starting nodemanager, logging to /usr/local/hadoop2.2/logs/yarn-hadoop-nodemanager-hadoop2.out
    
    [hadoop@hadoop1 hadoop]$ jps 
    32318 Jps
    32054 SecondaryNameNode
    31885 NameNode
    32254 ResourceManager
    
    --datanode
    [hadoop@hadoop2 sbin]$ jps 
    29711 DataNode
    29822 Jps
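
    Beyond jps, the NameNode can report cluster-wide state (an optional extra check):

    hdfs dfsadmin -report          -- lists live datanodes, capacity and usage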
    
    
    11. Web UI verification (YARN defaults)
    http://hadoop1:8088
    http://192.168.0.201:8088       // management console; view MapReduce jobs
    http://192.168.0.201:50070      // view node information
    //http://192.168.0.201:50030    // JobTracker (MRv1; not used in Hadoop 2.x)
    
    12. HDFS operations

    Create a directory:
    $HADOOP_HOME/bin/hdfs dfs -mkdir /testdir                    // note: this '/' is the HDFS root, not the local filesystem root
    List existing files:
    $HADOOP_HOME/bin/hdfs dfs -ls /
    Found 1 items
    drwxr-xr-x   - hadoop supergroup          0 2014-04-21 23:19 /testdir
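
    A few more everyday operations, for reference (the paths are illustrative):

    $HADOOP_HOME/bin/hdfs dfs -put /etc/hosts /testdir/        -- upload a local file
    $HADOOP_HOME/bin/hdfs dfs -cat /testdir/hosts              -- print a file's contents
    $HADOOP_HOME/bin/hdfs dfs -rm -r /testdir                  -- remove a directory recursively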
    
    
    13. Stop HDFS
    $HADOOP_HOME/sbin/stop-dfs.sh
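    YARN, started earlier with start-yarn.sh, is stopped the same way:
    $HADOOP_HOME/sbin/stop-yarn.sh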
    
     
    
    14. Run a test Hadoop computation
     hadoop jar  /usr/local/hadoop2.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar    pi   2 5
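
    Another smoke test from the same examples jar, this time with wordcount (input/output paths are illustrative):

    hdfs dfs -mkdir -p /wc/in
    hdfs dfs -put /etc/hosts /wc/in/
    hadoop jar /usr/local/hadoop2.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /wc/in /wc/out
    hdfs dfs -cat /wc/out/part-r-00000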
    
    
    
    
    
    
    
    
    
    
    


