Docunext


Debugging and Reducing I/O Wait

October 29th, 2009

CDROM Problems?

rmmod ide_cd_mod
[11720813.923100] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[11720813.923100] hda: task_in_intr: error=0x04 { AbortedCommand }
[11720813.923100] ide: failed opcode was: 0xec

http://www.linuxforums.org/forum/debian-linux-help/61930-weird-kernel-behavior.html

/boot/grub/menu.lst

# defoptions=hda=noprobe elevator=deadline
update-grub
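
After the next reboot it is worth confirming that the options actually reached the kernel. A minimal sketch, assuming a POSIX shell; cmdline_has is a helper name made up for these notes, not part of GRUB or Debian:

```shell
# cmdline_has OPTION [FILE]
# Succeeds if OPTION appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the second argument exists only so the
# helper can be tried against a saved copy.
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   cmdline_has hda=noprobe && echo "hda=noprobe is active"
```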

http://www.lesswatts.org/tips/disks.php

VM Problems?

apt-get install sysstat
pidstat -d 5
echo 1 > /proc/sys/vm/block_dump
dmesg | egrep "READ|WRITE|dirtied" | egrep -o '([a-zA-Z]*)' | sort | uniq -c | sort -rn | head
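
The egrep -o step above splits every line into words, so tokens such as "block" and "on" are counted alongside process names. A rough sketch that counts by process name only, assuming block_dump lines look like "name(pid): WRITE block N on dev" or "name(pid): dirtied inode N (file) on dev", optionally prefixed with a "[timestamp]". count_block_dump is a made-up name:

```shell
# Count block_dump events per process name from kernel log lines on stdin.
count_block_dump() {
    grep -E 'READ|WRITE|dirtied' \
        | sed -e 's/^\[[^]]*\] *//' -e 's/([0-9][0-9]*):.*$//' \
        | sort | uniq -c | sort -rn | head
}

# Typical session (as root). Remember to switch block_dump off again,
# since it floods the kernel log while enabled:
#   echo 1 > /proc/sys/vm/block_dump
#   sleep 30
#   dmesg | count_block_dump
#   echo 0 > /proc/sys/vm/block_dump
```

If syslog is writing kernel messages to disk, the logging itself will show up as writes, so the counts are only a rough guide.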

VERY USEFUL: http://www.westnet.com/~gsmith/content/linux-pdflush.htm

MY OLD SETTINGS:

echo "100" > /proc/sys/vm/dirty_writeback_centisecs
echo '10' > /proc/sys/vm/dirty_ratio
echo '5' > /proc/sys/vm/dirty_background_ratio

NEW: (defaults)

echo '500' > /proc/sys/vm/dirty_writeback_centisecs
echo '40' > /proc/sys/vm/dirty_ratio
echo '10' > /proc/sys/vm/dirty_background_ratio

CONSIDER:

echo '8' > /proc/sys/vm/dirty_background_ratio
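
Values echoed into /proc are lost on reboot, so it helps to check what is actually in effect before and after experimenting. A small sketch; show_dirty_settings is a hypothetical helper, and its optional directory argument is only there so it can be pointed at a copy of the files:

```shell
# Print the three writeback tunables discussed above.
# $1 = directory holding the files (defaults to /proc/sys/vm).
show_dirty_settings() {
    dir=${1:-/proc/sys/vm}
    for name in dirty_writeback_centisecs dirty_ratio dirty_background_ratio; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# To keep a value across reboots, put it in /etc/sysctl.conf, e.g.
#   vm.dirty_background_ratio = 8
# and load it with: sysctl -p
```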

Correct I/O Scheduler?

tw_cli

//pro-12-gl> info c4

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0   RAID-5    OK             -      64K     1117.52   ON     OFF      OFF
cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
echo deadline > /sys/block/sda/queue/scheduler
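
The echo only changes one device, and only until the next reboot. A read-only sketch for checking every block device at once, assuming the active scheduler is the bracketed entry as in the output above. Both function names are made up for these notes:

```shell
# Print the bracketed (active) entry from a scheduler line on stdin.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# List the active scheduler for each device under /sys/block.
# $1 = alternative sysfs block directory (defaults to /sys/block).
list_schedulers() {
    sysblock=${1:-/sys/block}
    for f in "$sysblock"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$sysblock"/}   # strip the leading directory
        dev=${dev%%/*}          # keep only the device name
        printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
    done
}
```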

Filesystem Configuration

On my setup, kjournald is one busy beaver:

# pidstat -d 5
Linux 2.6.26-2-openvz-686 (pro-12-gl.savonix.com) 	10/29/2009 	_i686_	(2 CPU)

11:19:41 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:19:46 PM      1895      0.00     23.06      0.00  kjournald
11:19:46 PM      7309      0.00      0.80      0.00  syslog-ng
11:19:46 PM     21404      0.00      0.80      0.00  tlsmgr

11:19:46 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:19:51 PM     18208      0.00      0.80      0.00  syslog-ng

11:19:51 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:19:56 PM      1895      0.00      9.60      0.00  kjournald
11:19:56 PM     17120      0.00      1.60      0.00  tlsmgr
11:19:56 PM     27257      0.00      1.60      0.00  apache2

11:19:56 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:20:01 PM       423      0.00      0.80      0.00  apache2
11:20:01 PM      1851      0.00      0.80      0.00  nginx
11:20:01 PM      1895      0.00      8.80      0.00  kjournald

11:20:01 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:20:06 PM      1895      0.00      8.80      0.00  kjournald
11:20:06 PM      1898      0.00      0.80      0.00  courierpop3d

11:20:06 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:20:11 PM       928      0.00     13.60      0.00  kjournald
11:20:11 PM      1895      0.00      3.20      0.00  kjournald
11:20:11 PM      7309      0.00      1.60      0.00  syslog-ng
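
Eyeballing interval after interval gets tedious, so here is a sketch that adds up the kB_wr/s column per command. It assumes the 12-hour timestamp layout shown above (time, AM/PM, PID, then the three rate columns); sum_writes is a made-up name. The result is a sum of per-interval rates, useful for ranking rather than as a byte count:

```shell
# Sum the kB_wr/s column of `pidstat -d` output per command name.
# Header, banner and "Average:" lines are skipped because their second
# field is not AM/PM or their third field is not a numeric PID.
sum_writes() {
    LC_ALL=C awk '
        $2 ~ /^(AM|PM)$/ && $3 ~ /^[0-9]+$/ { wr[$NF] += $5 }
        END { for (c in wr) printf "%.2f %s\n", wr[c], c }
    ' | LC_ALL=C sort -rn
}

# Example: one minute of samples, busiest writers first
#   pidstat -d 5 12 | sum_writes
```

Fed the sample above, kjournald comes out on top at 67.06 (both kjournald PIDs are merged, since the grouping is by command name).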

Turns out there are some settings which can be set to change the way kjournald behaves:

Much of the ext3 tuning advice I'm finding was written for the Linux 2.4 kernel, so I'm not going to do any tune2fs tweaking tonight.

UPDATE: While making these changes, it appears that my iowait has settled to an acceptable level. I'm no longer getting messages from monit about resource limits getting matched.

UPDATE 2: I noticed that rsync may have been a cause of the high iowait in general. To remedy this I've changed /etc/default/rsync to include:

RSYNC_NICE='10'
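
That file is read by the init script for the rsync daemon, so rsync runs started by hand or from cron are not covered, and nice is primarily a CPU scheduling knob. For those runs, a sketch of a wrapper that also asks for idle I/O priority when ionice is available. run_gently is a made-up name and the paths in the example are hypothetical. As far as I know the idle I/O class is only honored by the cfq scheduler, so it may do nothing under deadline:

```shell
# Run a command at nice level 10, and in the idle I/O class where possible.
# Falls back to plain nice if ionice is missing or not permitted.
run_gently() {
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 nice -n 10 "$@"
    else
        nice -n 10 "$@"
    fi
}

# Example (hypothetical paths):
#   run_gently rsync -a /srv/data/ backuphost:/srv/data/
```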