Using VCSA as remote syslog - Don't forget the log rotation!
Important note: It seems that vCenter Server Appliance updates revert the changes. Please check the settings after each update!
The VMware vCenter Server Appliance (VCSA) can act as a remote syslog destition for ESXi hosts. This is very handy for troubleshooting and I really recommend to use this feature. But VMware ESXi hosts can be really chatty and therefore it’s a good idea to keep an eye on the free disk space of the VCSA.
Yesterday, a colleague had an interesting support case. A customer reported that his Veeam Backup & Replication jobs failed and that he was unable to login to the vCenter with the vSphere Client and vSphere Web Client. My colleague checked the VCSA VM and noticed that the VPXD failed to start (“Waiting for vpxd to initialize: ….failed”). Together we checked the appliance and the log files. The vpxd.log (/var/log/vmware/vpx) was updated weeks ago, but the last entry was interesting: No space left on device. But there was free disk space on /storage/log. I immediately checked the inode count with df -i and there it was: No free inodes. Why is this a problem? Each name entry in the file system consumes an inode. If there are no free inodes, no new directories and files can be created. The error message is the same as for missing disk space. Something had to have created a lot of files on /storage/log. Because /var/log/vmware is a symbolic linkt to /storage/log/vmware, it had to be something on the /storage/log partition. We checked the remote syslog location under /storage/log/remote and found gigabytes and an incredible number of logs. After removing the logs, the VPXD was able to start and the inode count was on a normal level.
But why were there so many logs? We checked the logrotate config and found a faulty config for the remote syslog files. Instead of rotating logs and remove old ones, this config rotated all logs every day and potentiated the number of logs. Please note that there is no logrotate config to rotate remote syslog files by default! This one was added manually.
This is the default config for the remote syslog-collector of the VCSA:
destination log_remote {
file("/var/log/remote/$HOST_FROM/$YEAR-$MONTH/messages-$YEAR-$MONTH-$DAY"
create_dirs(yes) frac-digits(3)
template("$ISODATE $PROGRAM $MSGONLYn")
template_escape(no)
);
};
As you can see, with these settings a folder for each host and each month is created. According to this VMTN posting, we changed the syslog-collector config a bit:
destination log_remote {
file("/var/log/remote/$HOST_FROM/messages"
create_dirs(yes) frac-digits(3)
template("$ISODATE $PROGRAM $MSGONLYn")
template_escape(no)
);
};
With this settings, only a single file per host is created. We made also a change to /etc/logrotate.d/syslog and added this at the end:
/var/log/remote/*/messages {
daily
compress
delaycompress
rotate 30
postrotate
/etc/init.d/syslog-collector reload > /dev/null
endscript
}
With this configuration 30 log files will be preserved. The number of log files or how often log rotation should happen (weekly or daily) can easily be adjusted. But these settings should be sufficient for small environments.
It’s important to understand that the VCSA has different disks and that the disks are mountend to different mount points within the root filesystem. This is from a vSphere 5.5 VCSA:
vcsa1:~ # mount
/dev/sda3 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,mode=1777)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/sda1 on /boot type ext3 (rw,noexec,nosuid,nodev,noacl)
/dev/sdb1 on /storage/core type ext3 (rw,nosuid,nodev)
/dev/sdb2 on /storage/log type ext3 (rw,nosuid,nodev)
/dev/sdb3 on /storage/db type ext3 (rw,nosuid,nodev)
/var/log/vmware
and /var/log/remote
are links to /storage/log/vmware
and /storage/log/remote
. Make sure that there is always enough free diskspace on ALL disks! I also want to highlight VMware KB2092127 (After upgrading to vCenter Server Appliance 5.5 Update 2, pg_log file reports this error: WARNING: there is already a transaction in progress). This error hit me a couple of times…