After Hadoop has been running for a long time, its log files can grow very large; in the extreme case they fill the disk and disrupt normal operation.
Solution:
Step 1: Modify the core-site.xml configuration file
Add the hadoop.logfile.size and hadoop.logfile.count properties; the exact values used in this test are shown in Step 3 below.
Step 2: Modify the log4j configuration files for HDFS, YARN, and MapReduce to limit the number and size of output files.
(The official site documents the log4j settings; the same approach also applies to ZooKeeper, Kafka, HBase, Hive, Storm, Oozie, HDFS, and YARN.)
1. HDFS log4j configuration
2. YARN log4j configuration
3. MapReduce log4j configuration
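In the log4j.properties shipped with Hadoop, the rolling of each daemon's main log file is governed by the RollingFileAppender (RFA) settings. A sketch of the relevant knobs follows; the property names come from the stock file, but the values shown are the usual shipped defaults and should be verified against your distribution (note that a 256MB limit would match the 256M files observed below):

```properties
# Relevant knobs in the Hadoop log4j.properties (values are the common
# shipped defaults -- verify against your own distribution)
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20

log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
```

Lowering MaxFileSize and MaxBackupIndex (separately for HDFS, YARN, and MapReduce) caps the space one daemon's rolling log can consume at roughly MaxFileSize × (MaxBackupIndex + 1).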
The log paths and types for a Hadoop cluster installed via Ambari (screenshot omitted):
Hadoop log directories:
/var/log/hadoop /var/log/hadoop-mapreduce
Current file statistics:
1. Files under /var/log/hadoop-yarn/yarn, sorted by size (human-readable units):
[hadoop@bj-rack001-hadoop002 yarn]$ du -ah /var/log/hadoop-yarn/yarn | sort -rh | head -n 10
256M /var/log/hadoop-yarn/yarn
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.3
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.2
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.7
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.6
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.5
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.4
256M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log.1
230M /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-bj-rack001-hadoop002.log
1M /var/log/hadoop-yarn/yarn/nm-audit.log.2018-12-03
2. Files under /var/log/hadoop/hdfs, sorted by size (human-readable units):
[hadoop@bj-rack001-hadoop002 hdfs]$ du -ah /var/log/hadoop/hdfs | sort -rh | head -n 10
1.15G /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-03
1.05G /var/log/hadoop/hdfs/hdfs-audit.log.2018-11-29
1.04G /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-29
491M /var/log/hadoop/hdfs/hdfs-audit.log.2018-11-30
489M /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-01
488M /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-02
404M /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-13
389M /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-14
328M /var/log/hadoop/hdfs/hdfs-audit.log.2018-12-27
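The dated hdfs-audit.log.YYYY-MM-DD files above come from a daily-rolling appender, which keeps no bound on the number of old files, so they accumulate until removed by hand. A minimal cleanup sketch (the 7-day retention and the LOG_DIR default are assumptions, not from the original setup):

```shell
#!/bin/sh
# Delete dated audit logs older than RETENTION_DAYS, printing each file removed.
# LOG_DIR and RETENTION_DAYS are illustrative defaults -- adjust to your cluster.
LOG_DIR="${LOG_DIR:-/var/log/hadoop/hdfs}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"
find "$LOG_DIR" -maxdepth 1 -name 'hdfs-audit.log.*' \
     -mtime +"$RETENTION_DAYS" -print -delete
```

Run it from cron (e.g. nightly) on each node so the audit logs cannot fill the disk between manual cleanups.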
Verification:
To verify these two parameters, we run a quick test.
1. Back up the data
Create the backup directories:
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop/hdfs
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/nodemanager
mkdir -p /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/yarn
Back up the existing data (MapReduce has no log data, so it is skipped):
sudo mv /var/log/hadoop/hdfs/* /data/temp/zl/yarn-log-bak/var/log/hadoop/hdfs
sudo mv /var/log/hadoop-yarn/nodemanager/* /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/nodemanager
sudo mv /var/log/hadoop-yarn/yarn/* /data/temp/zl/yarn-log-bak/var/log/hadoop-yarn/yarn
2. Clean the directories
rm -fr /var/log/hadoop/hdfs/*
rm -fr /var/log/hadoop-yarn/nodemanager/*
rm -fr /var/log/hadoop-yarn/yarn/*
3. Adjust the parameters
<property>
<name>hadoop.logfile.size</name>
<value>10000</value>
<description>Maximum size of each log file, in bytes; currently set to 10000 bytes (about 9.77 KB)</description>
</property>
<property>
<name>hadoop.logfile.count</name>
<value>5</value>
<description>Maximum number of log files</description>
</property>
4. Restart the services
5. Observe the files
cd /var/log/hadoop/hdfs/
cd /var/log/hadoop-yarn/yarn
From the results, yarn-yarn-nodemanager-bj-rack001-hadoop006.log is 257 MB, which does not match the limit set above.
So what actually controls this file's size?
Hadoop writes its logs through log4j, so the log4j configuration is what has to be changed.
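In other words, the effective fix is to lower the log4j rolling limits rather than the hadoop.logfile.* properties. A sketch of such a change in log4j.properties (the 10MB / 5 values are illustrative assumptions chosen to mirror the test above, not recommended production settings):

```properties
# Shrink each daemon's rolling log: at most 5 backups of at most 10MB each
hadoop.log.maxfilesize=10MB
hadoop.log.maxbackupindex=5
```

The same two settings exist in the HDFS, YARN, and MapReduce log4j files; on an Ambari-managed cluster, make the change through the Ambari configuration pages so it is pushed to every node and survives restarts.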
A few days later (output screenshots omitted):
YARN logs:
HDFS logs:
Ambari official link: