Chapter 14. Monitoring and tuning z/VM and Linux 241
being measured in 100s rather than 1000s. The higher values seen tend to build up over
time, and are sustained over periods of intense system activity. However, there are times
when the MIGRATE value may spike for brief periods of time.
Minidisk cache (MDC) statistics are given on the third line. The effectiveness of MDC can be
judged by the combination of the READS rate and the HIT RATIO. If both are high, then a
large number of physical I/Os are avoided due to the MDC feature. For a system with an
appreciably high I/O rate (reads plus writes), a high proportion of reads, and a good hit
ratio for those reads (tending to 90% or greater), the real, physical I/O avoidance can be
high. This author has seen the avoidance as high as 50% in some cases. Conversely, a high
HIT RATIO with a low READS rate should not be taken as a good sign: a 100% hit ratio
when doing only 1 I/O per second is effectively meaningless.
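The arithmetic behind this judgment can be sketched as follows. This is a minimal illustration, not a z/VM tool: the function name and its parameters are hypothetical, the rates are assumed to be read by hand from the MDC line of the display, and the sketch assumes that every MDC hit avoids exactly one physical read.

```python
def mdc_io_avoidance(reads_per_sec, writes_per_sec, hit_ratio):
    """Estimate how much physical I/O the minidisk cache avoids.

    Assumption (ours, not from the manual): each MDC hit avoids exactly
    one physical read. hit_ratio is a fraction between 0 and 1.
    Returns (avoided I/Os per second, fraction of total I/O avoided).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    total = reads_per_sec + writes_per_sec
    avoided = reads_per_sec * hit_ratio
    fraction = avoided / total if total > 0 else 0.0
    return avoided, fraction


# Hypothetical busy system: many reads, good hit ratio.
busy = mdc_io_avoidance(reads_per_sec=550, writes_per_sec=450, hit_ratio=0.90)

# Hypothetical idle system: perfect hit ratio, but almost no I/O to avoid.
idle = mdc_io_avoidance(reads_per_sec=1, writes_per_sec=0, hit_ratio=1.00)
```

Returning the absolute number of avoided I/Os alongside the fraction makes the point in the text visible: the idle case has a perfect fraction, yet it avoids only one I/O per second.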
Line 4 describes more storage (memory) management. The PAGING rate is important.
Higher values will often impact performance. This can be at least partially offset by
increasing the number of page volumes, but a more thorough examination of this problem
is advisable whenever it arises. The STEAL percentage is often misleading. This is basically
the percentage of pages taken from guests that z/VM believes are non-dormant. Because
some guests have periodic timers going off, they appear to be active to z/VM even when
relatively idle. Pages taken from these guests are still considered to be stolen. There
are therefore scenarios where the user set consists entirely of guests that appear active,
in which case all pages taken would be considered stolen. Bearing this in mind, if a high STEAL value is
observed, the paging rate needs to be checked. If the paging rate is relatively low, then the
STEAL value is not important.
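The rule of thumb above can be written down as a small check. This is only a sketch: the function and parameter names are hypothetical, and the manual gives no numeric cut-offs, so both thresholds are assumptions that the caller must choose for the installation.

```python
def steal_needs_attention(steal_pct, paging_rate, steal_threshold, paging_threshold):
    """Return True only when a high STEAL value coincides with a high paging rate.

    Both thresholds are site-specific assumptions supplied by the caller;
    the source text does not define what counts as "high".
    """
    high_steal = steal_pct >= steal_threshold
    high_paging = paging_rate >= paging_threshold
    # A high STEAL percentage alone is not treated as a problem.
    return high_steal and high_paging
```

Note that a False result says nothing about paging on its own: a high paging rate with a low STEAL value still deserves the examination described earlier.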
On lines 5 through 8 you also see a series of counters that represent the users in various
queues. The z/VM scheduler classifies work into 3 different classes (1 through 3) and a
special additional class labelled zero. So the columns of Qx and Ex values represent the
virtual machines in the dispatch list and the eligible list, respectively. The most important
thing to verify here is that there are no virtual machines in the eligible list (E1, E2, E3).
Any entries there imply that z/VM has stopped dispatching some virtual machines to avoid
overcommitting resources. Such a system would require further investigation, possibly
leading to some tuning work, or even hardware addition in extreme cases. Ignore the values
in parentheses.
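As a minimal sketch of this check, the function below flags any eligible-list class that is not empty. The function name and the `counters` mapping are hypothetical; the sketch assumes the counter values have already been copied from the display into a dictionary and does not parse command output itself.

```python
def nonempty_eligible_classes(counters):
    """Return the eligible-list classes (E1 to E3) holding at least one virtual machine.

    `counters` is a hypothetical mapping such as {"Q0": 1, "Q1": 0, "E1": 0},
    filled in by hand from the queue counters; missing keys count as zero.
    """
    return [name for name in ("E1", "E2", "E3") if counters.get(name, 0) > 0]


# Hypothetical counter values: nothing is waiting in the eligible list.
healthy = nonempty_eligible_classes({"Q0": 1, "Q1": 0, "Q2": 2, "Q3": 14,
                                     "E1": 0, "E2": 0, "E3": 0})

# Hypothetical counter values: four class 3 machines are being held back.
constrained = nonempty_eligible_classes({"Q3": 10, "E3": 4})
```

An empty result corresponds to the healthy case described above; any class name in the result is a signal to investigate further.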
INDICATE QUEUES EXP
Another useful command to understand the state of the system is the INDICATE QUEUES
EXP command. Here is an example:
==> ind q exp
DATAMGT1 Q3 AP 00000537/00000537 .... -2.025 A02
BITNER Q1 R00 00000785/00000796 .I.. -1.782 A00
EDLLNX4 Q3 PS 00007635/00007635 .... -1.121 A00
TCPIP Q0 R01 00004016/00003336 .I.. -.9324 A01
APCTEST1 Q2 IO 00003556/00003512 .I.. -.7847 A01
EDLWRK20 Q3 AP 00001495/00001462 .... -.6996 A01
EDL Q3 IO 00000918/00000902 .... -.2409 A01
EDLWRK11 Q3 AP 00002323/00002299 .... -.0183 A00
EDLWRK18 Q3 IO 00001052/00000388 .... -.0047 A00
EDLWRK4 Q3 AP 00004792/00002295 .... .0055 A01
EDLWRK8 Q3 AP 00004804/00004797 .... .0089 A02
EDLWRK16 Q3 AP 00002378/00002378 .... .0170 A02
EDLWRK2 Q3 AP 00005544/00002956 .... .0360 A00
EDLWRK12 Q3 AP 00004963/00002348 .... .0677 A01
EDLWRK6 Q3 IO 00000750/00000302 .... .0969 A02
EDLWRK3 Q3 AP 00005098/00005096 .... .0999 A02
EDLWRK17 Q3 AP 00004786/00004766 .... .1061 A01
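Output in this layout is regular enough to be split into fields for sorting or filtering. The following is a rough sketch, not an official parser: the names `QueueEntry` and `parse_queue_line` are hypothetical, the pattern is derived only from the sample lines shown above, and the labels given to the two page counts (resident pages and working set size) are an assumption that should be confirmed against the CP command reference.

```python
import re
from typing import NamedTuple, Optional


class QueueEntry(NamedTuple):
    userid: str
    queue: str              # list and class as printed, e.g. "Q3"
    state: str              # status code as printed, e.g. "AP", "IO", "R00"
    resident_pages: int     # first count (assumed meaning: resident pages)
    working_set_pages: int  # second count (assumed meaning: working set size)
    flags: str              # four-character flag field, e.g. ".I.."
    priority: float         # signed value as printed, e.g. -0.9324
    affinity: str           # last column as printed, e.g. "A01"


# Pattern inferred from the sample output only; other releases may differ.
LINE_RE = re.compile(
    r"^(\S+)\s+([QE]\d)\s+(\S+)\s+(\d+)/(\d+)\s+(\S{4})\s+(-?\d*\.\d+)\s+(A\d+)$"
)


def parse_queue_line(line: str) -> Optional[QueueEntry]:
    """Parse one output line; return None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    userid, queue, state, first, second, flags, priority, affinity = m.groups()
    return QueueEntry(userid, queue, state, int(first), int(second),
                      flags, float(priority), affinity)


entry = parse_queue_line("TCPIP Q0 R01 00004016/00003336 .I.. -.9324 A01")
print(entry)
```

Lines that do not fit the pattern, such as the command echo itself, return None, so the function can be applied to a whole captured console log and the non-matching lines discarded.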