-
Notifications
You must be signed in to change notification settings - Fork 139
/
irqbalance.1
222 lines (201 loc) · 8.74 KB
/
irqbalance.1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
.de Sh \" Subsection
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Ip \" List item
.br
.ie \\n(.$>=3 .ne \\$3
.el .ne 3
.IP "\\$1" \\$2
..
.TH "IRQBALANCE" 1 "Dec 2006" "Linux" "irqbalance"
.SH NAME
irqbalance \- distribute hardware interrupts across processors on a multiprocessor system
.SH "SYNOPSIS"
.nf
\fBirqbalance\fR
.fi
.SH "DESCRIPTION"
.PP
The purpose of \fBirqbalance\fR is to distribute hardware interrupts across
processors on a multiprocessor system in order to increase performance\&.
.SH "OPTIONS"
.TP
.B -o, --oneshot
Causes irqbalance to be run once, after which the daemon exits.
.TP
.B -d, --debug
Causes irqbalance to print extra debug information. Implies --foreground.
.TP
.B -f, --foreground
Causes irqbalance to run in the foreground (without --debug).
.TP
.B -j, --journal
Enables log output optimized for systemd-journal.
.TP
.B -p, --powerthresh=<threshold>
Set the threshold at which we attempt to move a CPU into powersave mode.
If more than <threshold> CPUs are more than 1 standard deviation below the
average CPU softirq workload, and no CPUs are more than 1 standard deviation
above (and have more than 1 IRQ assigned to them), attempt to place 1 CPU in
powersave mode. In powersave mode, a CPU will not have any IRQs balanced to it,
in an effort to prevent that CPU from waking up without need.
.TP
.B -i, --banirq=<irqnum>
Add the specified IRQ to the set of banned IRQs. irqbalance will not affect
the affinity of any IRQs on the banned list, allowing them to be specified
manually. This option is additive and can be specified multiple times. For
example to ban IRQs 43 and 44 from balancing, use the following command line:
.B irqbalance --banirq=43 --banirq=44
.TP
.B -m, --banmod=<module_name>
Add the specified module to the set of banned modules, similar to --banirq.
irqbalance will not affect the affinity of any IRQs of given modules, allowing
them to be specified manually. This option is additive and can be specified
multiple times. For example to ban all IRQs of module foo and module bar from
balancing, use the following command line:
.B irqbalance --banmod=foo --banmod=bar
.TP
.B -c, --deepestcache=<integer>
This allows a user to specify the cache level at which irqbalance partitions
cache domains. Specifying a deeper cache may allow a greater degree of
flexibility for irqbalance to assign IRQ affinity to achieve greater performance
increases, but setting a cache depth too large on some systems (specifically
where all CPUs on a system share the deepest cache level), will cause irqbalance
to see balancing as unnecessary.
.B irqbalance --deepestcache=2
.P
The default value for deepestcache is 2.
.TP
.B -l, --policyscript=<script>
When specified, the referenced script or directory will execute once for each discovered IRQ,
with the sysfs device path and IRQ number passed as arguments. Note that the
device path argument will point to the parent directory from which the IRQ
attributes directory may be directly opened.
Policy scripts specified need to be owned and executable by the user of irqbalance process,
if a directory is specified, non-executable files will be skipped.
The script may specify zero or more key=value pairs that will guide irqbalance in
the management of that IRQ. Key=value pairs are printed by the script on stdout
and will be captured and interpreted by irqbalance. Irqbalance expects a zero
exit code from the provided utility. Recognized key=value pairs are:
.TP
.I ban=[true | false]
Directs irqbalance to exclude the passed in IRQ from balancing.
.TP
.I balance_level=[none | package | cache | core]
This allows a user to override the balance level of a given IRQ. By default the
balance level is determined automatically based on the pci device class of the
device that owns the IRQ.
.TP
.I numa_node=<integer>
This allows a user to override the NUMA node that sysfs indicates a given device
IRQ is local to. Often, systems will not specify this information in ACPI, and as a
result devices are considered equidistant from all NUMA nodes in a system.
This option allows for that hardware provided information to be overridden, so
that irqbalance can bias IRQ affinity for these devices toward its most local
node. Note that specifying a -1 here forces irqbalance to consider an interrupt
from a device to be equidistant from all nodes.
.TP
Note that, if a directory is specified rather than a regular file, all files in
the directory will be considered policy scripts, and executed on adding of an
irq to a database. If such a directory is specified, scripts in the directory
must additionally exit with one of the following exit codes:
.TP
.I 0
This indicates the script has a policy for the referenced irq, and that further
script processing should stop
.TP
.I 1
This indicates that the script has no policy for the referenced irq, and that
script processing should continue
.TP
.I 2
This indicates that an error has occurred in the script, and it should be skipped
(further processing to continue)
.TP
.B --migrateval, -e <val>
Specify a minimum migration ratio to trigger a rebalancing
Normally any improvement in load distribution will trigger the migration of an
irq, as long as preforming the migration will not simply move the load to a new
cpu. By specifying a migration value, the load balance improvement is subject
to hysteresis defined by this value, which is inversely propotional to the
value. For example, a value of 2 in this option tells irqbalance that the
improvement in load distribution must be at least 50%, a value of 4 indicates
the load distribution improvement must be at least 25%, etc
.TP
.B -s, --pid=<file>
Have irqbalance write its process id to the specified file. By default no
pidfile is written. The written pidfile is automatically unlinked when
irqbalance exits. It is ignored when used with --debug or --foreground.
.TP
.B -t, --interval=<time>
Set the measurement time for irqbalance. irqbalance will sleep for <time>
seconds between samples of the irq load on the system cpus. Defaults to 10.
.SH "ENVIRONMENT VARIABLES"
.TP
.B IRQBALANCE_ONESHOT
Same as --oneshot.
.TP
.B IRQBALANCE_DEBUG
Same as --debug.
.TP
.B IRQBALANCE_BANNED_CPUS
Provides a mask of CPUs which irqbalance should ignore and never assign interrupts to.
If not specified, irqbalance use mask of isolated and adaptive-ticks CPUs on the
system as the default value. The "isolcpus=" boot parameter specifies the isolated CPUs. The "nohz_full=" boot parameter specifies the adaptive-ticks CPUs. By default, no CPU will be an isolated or adaptive-ticks CPU.
This is a hexmask without the leading ’0x’. On systems with large numbers of
processors, each group of eight hex digits is separated by a comma ’,’. i.e.
‘export IRQBALANCE_BANNED_CPUS=fc0‘ would prevent irqbalance from assigning irqs
to the 7th-12th cpus (cpu6-cpu11) or ‘export IRQBALANCE_BANNED_CPUS=ff000000,00000001‘
would prevent irqbalance from assigning irqs to the 1st (cpu0) and 57th-64th cpus
(cpu56-cpu63).
Notes: This environment variable will be discarded, please use IRQBALANCE_BANNED_CPULIST
instead. Before deleting this environment variable, Introduce a deprecation period first
for the consider of compatibility.
.TP
.B IRQBALANCE_BANNED_CPULIST
Provides a cpulist which irqbalance should ignore and never assign interrupts to.
If not specified, irqbalance use mask of isolated and adaptive-ticks CPUs on the
system as the default value.
.SH "SIGNALS"
.TP
.B SIGHUP
Forces a rescan of the available IRQs and system topology.
.SH "API"
irqbalance is able to communicate via socket and return it's current assignment
tree and setup, as well as set new settings based on sent values. Socket is abstract,
with a name in form of
.BR irqbalance<PID>.sock ,
where <PID> is the process ID of irqbalance instance to communicate with.
Possible values to send:
.TP
.B stats
Retrieve assignment tree of IRQs to CPUs, in recursive manner. For each CPU node
in tree, its type, number, load and whether the save mode is active are sent. For
each assigned IRQ type, it's number, load, number of IRQs since last rebalancing
and it's class are sent. Refer to types.h file for explanation of defines.
.TP
.B setup
Get the current value of sleep interval, mask of banned CPUs and list of banned IRQs.
.TP
.B settings sleep <s>
Set new value of sleep interval, <s> >= 1.
.TP
.B settings cpus <cpu_number1> <cpu_number2> ...
Ban listed CPUs from IRQ handling, all old values of banned CPUs are forgotten.
.TP
.B settings ban irqs <irq1> <irq2> ...
Ban listed IRQs from being balanced, all old values of banned IRQs are forgotten.
.PP
irqbalance checks SCM_CREDENTIALS of sender (only root user is allowed to interact).
Based on chosen tools, ancillary message with credentials needs to be sent with request.
.SH "HOMEPAGE"
https://github.com/Irqbalance/irqbalance