Configuring automatic log cleanup in Hadoop

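After Hadoop has been running for a long time, its log files can take up a lot of space. In extreme cases they fill the disk and disrupt normal business operations.

Solution: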

 

Step 1: Edit the core-site.xml configuration file

 

<property>
    <name>hadoop.logfile.size</name>
    <value>10000000</value>
    <description>Maximum size of each log file, in bytes</description>
</property>

 

<property>
    <name>hadoop.logfile.count</name>
    <value>10</value>
    <description>Maximum number of log files</description>
</property>
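
To get a feel for what these two values mean together, the arithmetic can be sketched in shell. This assumes the limits are actually enforced for a given log, in which case one log would occupy at most size times count bytes. The function name max_log_bytes is made up for illustration.

```shell
# Upper bound on disk space used by one rotated log,
# assuming both limits are enforced.
# $1 = maximum size of each file in bytes, $2 = maximum number of files
max_log_bytes() {
    echo $(( $1 * $2 ))
}

# Values from the core-site.xml snippet: 10000000 bytes x 10 files
max_log_bytes 10000000 10   # prints 100000000 (about 95 MiB)
```

The same function can be reused later with other values to estimate the expected footprint of a test configuration.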


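Step 2: Edit the log4j configuration files for HDFS, YARN, and MapReduce to set the number of log files that are kept.

(The official documentation describes the log4j settings. It also covers ZooKeeper, Kafka, HBase, Hive, Storm, Oozie, HDFS, and YARN, and the procedure is similar for each.)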

1. HDFS log4j configuration

2. YARN log4j configuration

3. MapReduce log4j configuration
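
The log4j settings themselves are not reproduced here. As a rough sketch, size-based rotation in a Hadoop log4j configuration is typically driven by the MaxFileSize and MaxBackupIndex properties of log4j's RollingFileAppender. The property names below follow the log4j.properties shipped with Apache Hadoop 2.x, and it is an assumption that this cluster uses the same names. The helper show_rolling_settings and the sample values are made up for illustration.

```shell
# Print the size-based rolling settings found in a log4j.properties file.
# $1 = path to the properties file
show_rolling_settings() {
    grep -E '^(hadoop\.log\.(maxfilesize|maxbackupindex)|log4j\.appender\.RFA(\.(MaxFileSize|MaxBackupIndex))?)=' "$1"
}

# Demo on a throwaway sample file; the values are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=10
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
EOF
show_rolling_settings "$sample"
rm -f "$sample"
```

With a RollingFileAppender, the active file plus MaxBackupIndex rotated files are kept, so lowering either value reduces the space a single log can occupy.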

 

 

Log paths and log types for a Hadoop installation deployed with Ambari are as follows.

Hadoop log directories:

 

/var/log/hadoop  
    |-    hdfs
        |- gc.log
        |- hadoop-hdfs-datanode-bj-rack001-hadoop002.
        |- hadoop-hdfs-datanode-bj-rack001-hadoop002.out
        |- hadoop-hdfs-namenode-bj-rack001-hadoop002.log
        |- hadoop-hdfs-namenode-bj-rack001-hadoop002.out
        |- hdfs-audit.log
        |- SecurityAuth.audit

/var/log/hadoop-mapreduce  
    |-    mapred


/var/log/hadoop-hdfs 

    |-    (empty for now)


/var/log/hadoop-yarn
    |-    nodemanager
    |-  yarn   
        |- nm-audit.log
        |- yarn-yarn-nodemanager-bj-rack001-hadoop002.log
        |- yarn-yarn-nodemanager-bj-rack001-hadoop002.out


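Current file sizes:

1. Files under /var/log/hadoop-yarn/yarn, sorted by size (du reports sizes in KB by default; the listing below shows them in human-readable units):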


[hadoop@bj-rack001-hadoop002 yarn]$ du -a /var/log/hadoop-yarn/yarn | sort -n -r | head -n 10
            256M    /var/log/hadoop-yarn/yarn    
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.3
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.2
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.7
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.6
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.5
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.4
            256M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.1
            230M    /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log
            1M       /var/log/hadoop-yarn/yarn/nm-audit.log.2018-12-03


2. Files under /var/log/hadoop/hdfs, sorted by size (du reports sizes in KB by default; the listing below shows them in human-readable units):


[hadoop@bj-rack001-hadoop002 hdfs]$ du -a /var/log/hadoop/hdfs | sort -n -r | head -n 10
            1.15G    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-03
            1.05G    /var/log/hadoop/hdfs/hdfs-audit.log.2018-11-29
            1.04G    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-29
            491MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-11-30
            489MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-01
            488MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-02
            404MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-13
            389MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-14
            328MB    /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-27
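
The two listings above can be reproduced with a small helper. This is a minimal sketch, not the exact command used for the listings: it counts regular files only (the directory itself is left out) and prints sizes in KiB as reported by du -k. The function name largest_files is made up for illustration.

```shell
# List the N largest regular files under a directory, largest first.
# $1 = directory, $2 = number of entries to show (default 10)
largest_files() {
    dir=$1
    n=${2:-10}
    # du -k prints "<size in KiB><TAB><path>" for each file
    find "$dir" -type f -exec du -k {} + | sort -n -r | head -n "$n"
}

# Example, using a path from the listings:
# largest_files /var/log/hadoop-yarn/yarn 10
```

Sorting numerically on KiB values avoids the problem that sort -n cannot order mixed suffixes such as M and G.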
 

 

Verification:

To verify these two parameters, we run a test.

1. Back up the data

Create the backup directories:
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop/hdfs
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/nodemanager
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/yarn 

Back up the existing data (MapReduce has no data, so it is not backed up):
sudo mv /var/log/hadoop/hdfs/* /data/temp/zl/yarn-log-bak/var/log/hadoop/hdfs
sudo mv /var/log/hadoop-yarn/nodemanager/* /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/nodemanager
sudo mv /var/log/hadoop-yarn/yarn/* /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/yarn 
 

 

2. Clean up the directories

rm -fr /var/log/hadoop/hdfs/*
rm -fr /var/log/hadoop-yarn/nodemanager/*
rm -fr /var/log/hadoop-yarn/yarn/*
 

 

3. Adjust the parameters

 

<property>
    <name>hadoop.logfile.size</name>
    <value>10000</value>
    <description>Maximum size of each log file, in bytes; currently set to about 9.77 KB</description>
</property>

<property>
    <name>hadoop.logfile.count</name>
    <value>5</value>
    <description>Maximum number of log files</description>
</property>

4. Start the services

 

 

5. Inspect the files

 

ls -lh /var/log/hadoop/hdfs/
ls -lh /var/log/hadoop-yarn/yarn
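
Instead of reading the sizes by eye, one way to check whether the limit from step 3 is respected is to list every file that exceeds it. This is a sketch under the assumption that the 10000-byte limit should apply to each file in these directories. The function name files_over_limit is made up for illustration.

```shell
# Print every regular file under a directory that is larger than a byte limit.
# $1 = directory, $2 = limit in bytes
files_over_limit() {
    # "-size +Nc" matches files larger than N bytes
    find "$1" -type f -size +"$2"c
}

# Example, using the limit configured in step 3:
# files_over_limit /var/log/hadoop-yarn/yarn 10000
```

An empty result means no file is over the limit; any path printed is a log that the setting did not constrain.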

 

 

 

Looking at the result, yarn-yarn-nodemanager-bj-rack001-hadoop006.log is 257M, which does not match the limit set above.

So what controls the output of this file?

 

 

Hadoop writes its logs through log4j, so it is the log4j configuration that needs to be changed.
 

A few days later, the YARN logs look like this:

HDFS logs:

 

 

 

 

Official Ambari documentation:

https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-upgrade/content/upgrading_log_rotation_configuration.html

