author | Denis Vlasenko <vda.linux@googlemail.com> | 2006-10-31 18:41:29 +0000 |
---|---|---|
committer | Denis Vlasenko <vda.linux@googlemail.com> | 2006-10-31 18:41:29 +0000 |
commit | 4126b1f5c6983b7c2dd4f92d635ab762d861c2d6 (patch) | |
tree | 7199827f4e50eb610c64e3735c730666348a864b /docs | |
parent | 8c76487f06023f33ea0d92f686b29b958748d3f8 (diff) | |
add useful info on SIGINT handling peculiarities
Diffstat (limited to 'docs')
-rw-r--r-- | docs/sigint.htm | 627 |
1 files changed, 627 insertions, 0 deletions
diff --git a/docs/sigint.htm b/docs/sigint.htm new file mode 100644 index 000000000..6fe76bbef --- /dev/null +++ b/docs/sigint.htm | |||
@@ -0,0 +1,627 @@ | |||
1 | <HTML> | ||
2 | <HEAD> | ||
3 | <link rel="SHORTCUT ICON" href="http://www.cons.org/favicon.ico"> | ||
4 | <TITLE>Proper handling of SIGINT/SIGQUIT [http://www.cons.org/cracauer/sigint.html]</TITLE> | ||
5 | <!-- Created by: GNU m4 using $Revision: 1.20 $ of crawww.m4lib on 11-Feb-2005 --> | ||
6 | <BODY BGCOLOR="#fff8e1"> | ||
7 | <CENTER><H2>Proper handling of SIGINT/SIGQUIT</H2></CENTER> | ||
8 | <img src=linie.png width="100%" alt=" "> | ||
9 | <P> | ||
10 | |||
11 | <table border=1 cellpadding=4> | ||
12 | <tr><th valign=top align=left>Abstract: </th> | ||
13 | <td valign=top align=left> | ||
14 | In UNIX terminal sessions, you usually have a key like | ||
15 | <code>C-c</code> (Control-C) to immediately end whatever program you | ||
16 | have running in the foreground. This should work even when the program | ||
17 | you called has called other programs in turn. Everything should be | ||
18 | aborted, giving you your command prompt back, no matter how deep the | ||
19 | call stack is. | ||
20 | |||
21 | <p>Basically, it's trivial. But the existence of interactive | ||
22 | applications that use SIGINT and/or SIGQUIT for other purposes than a | ||
23 | complete immediate abort makes matters complicated and, as was to be | |
24 | expected, has left us with several ways to solve the problems. Of course, | |
25 | existing shells and applications follow different ways. | ||
26 | |||
27 | <P>This Web page outlines different ways to solve the problem and | |
28 | argues that only one of them can do everything right, although it | ||
29 | means that we have to fix some existing software. | ||
30 | |||
31 | |||
32 | |||
33 | </td></tr><tr><th valign=top align=left>Intended audience: </th> | ||
34 | <td valign=top align=left>Programmers who implement programs that catch SIGINT/SIGQUIT. | ||
35 | <BR>Programmers who implement shells or shell-like programs that | |
36 | execute batches of programs. | ||
37 | |||
38 | <p>Users who have problems getting rid of runaway shell | |
39 | scripts using <code>Control-C</code>, or who have interactive applications | |
40 | that don't behave right when sent SIGINT. Examples are emacs sessions | |
41 | that die on Control-g or shellscript statements that sometimes are | ||
42 | executed and sometimes not, apparently not determined by the user's | ||
43 | intention. | ||
44 | |||
45 | |||
46 | </td></tr><tr><th valign=top align=left>Required knowledge: </th> | ||
47 | <td valign=top align=left>You have to know what it means to catch SIGINT or SIGQUIT and how | ||
48 | processes wait for other processes (children) they spawned. | |
49 | |||
50 | |||
51 | </td></tr></table> | ||
52 | <img src=linie.png width="100%" alt=" "> | ||
53 | |||
54 | |||
55 | <H3>Basic concepts</H3> | ||
56 | |||
57 | What technically happens when you press Control-C is that all programs | ||
58 | running in the foreground in your current terminal (or virtual | ||
59 | terminal) are sent the signal SIGINT. | |
60 | |||
61 | <p>You may change the key that triggers the signal using | ||
62 | <code>stty</code> and running programs may remap the SIGINT-sending | ||
63 | key at any time they like, without your intervention and without | ||
64 | asking you first. | ||
65 | |||
66 | <p>The usual reaction of a running program to SIGINT is to exit. | ||
67 | However, not all programs exit on SIGINT; programs are free to | |
68 | use the signal for other actions or to ignore it altogether. | |
69 | |||
70 | <p>All programs running in the foreground receive the signal. This may | ||
71 | be a nested "stack" of programs: You started a program that started | ||
72 | another and the outer is waiting for the inner to exit. This nesting | ||
73 | may be arbitrarily deep. | ||
74 | |||
75 | <p>The innermost program is the one that decides what to do on SIGINT. | ||
76 | It may exit, do something else or do nothing. Still, when the user sends | |
77 | SIGINT, all the outer programs are woken up, get the signal and may | |
78 | react to it. | |
79 | |||
80 | <H3>What we try to achieve</H3> | ||
81 | |||
82 | The problem is with shell scripts (or similar programs that call | ||
83 | several subprograms one after another). | ||
84 | |||
85 | <p>Let us consider the most basic script: | ||
86 | <PRE> | ||
87 | #! /bin/sh | ||
88 | program1 | ||
89 | program2 | ||
90 | </PRE> | ||
91 | and the usual run looks like this: | ||
92 | <PRE> | ||
93 | $ sh myscript | ||
94 | [output of program1] | ||
95 | [output of program2] | ||
96 | $ | ||
97 | </PRE> | ||
98 | |||
99 | <p>Let us assume that both programs do nothing special on SIGINT, they | ||
100 | just exit. | ||
101 | |||
102 | <p>Now imagine the user hits C-c while a shellscript is executing its | ||
103 | first program. The following programs receive SIGINT: program1 and | ||
104 | also the shell executing the script. program1 exits. | ||
105 | |||
106 | <p>But what should the shell do? If we say that it is only the | ||
107 | innermost program's business to react to SIGINT, the shell will do | |
108 | nothing special (not exit) and it will continue the execution of the | ||
109 | script and run program2. But this is wrong: The user's intention in | ||
110 | hitting C-c is to abort the whole script, to get his prompt back. If | ||
111 | he hits C-c while the first program is running, he does not want | ||
112 | program2 to be even started. | ||
113 | |||
114 | <p>Here is what would happen if the shell did nothing: | |
115 | <PRE> | ||
116 | $ sh myscript | ||
117 | [first half of program1's output] | ||
118 | C-c [user presses C-c] | |
119 | [second half of program1's output will not be displayed] | ||
120 | [output of program2 will appear] | ||
121 | </PRE> | ||
122 | |||
123 | |||
124 | <p>Consider a more annoying example: | ||
125 | <pre> | ||
126 | #! /bin/sh | ||
127 | # let's assume there are 300 *.dat files | ||
128 | for file in *.dat ; do | ||
129 | dat2ascii "$file" | |
130 | done | ||
131 | </pre> | ||
132 | |||
133 | If your shell didn't exit when the user hits <code>C-c</code>, | |
134 | <code>C-c</code> would just end <strong>one</strong> dat2ascii run and | |
135 | the script would continue. Thus, you would have to hit <code>C-c</code> up to | |
136 | 300 times to end this script. | |
137 | |||
138 | <H3>Alternatives to do so</H3> | ||
139 | |||
140 | <p>There are several ways to handle abortion of shell scripts when | ||
141 | SIGINT is received while a foreground child runs: | ||
142 | |||
143 | <menu> | ||
144 | |||
145 | <li>As just outlined, the shellscript may just continue, ignoring the | ||
146 | fact that the user hit <code>C-c</code>. That way, your shellscript - | ||
147 | including any loops - would continue and you would have no chance of aborting | |
148 | it except using the kill command after finding out the outermost | ||
149 | shell's PID. This "solution" will not be discussed further, as it is | ||
150 | obviously not desirable. | ||
151 | |||
152 | <p><li>The shell itself exits immediately when it receives SIGINT. Not | ||
153 | only the program called will exit, but also the calling (the | |
154 | script-executing) shell. This variant exits the shell (and | |
155 | therefore discontinues execution of the script) immediately, while | |
156 | the called program may still be executing (remember that although | |
157 | the shell is just waiting for the called program to exit, it is woken | |
158 | up and may act). I will call this way of doing things the "IUE" (for | |
159 | "immediate unconditional exit") for the rest of this document. | ||
160 | |||
161 | <p><li>As a variant of the former, when the shell receives SIGINT | ||
162 | while it is waiting for a child to exit, the shell does not exit | ||
163 | immediately, but it remembers the fact that a SIGINT happened. After | |
164 | the called program exits and the shell's wait ends, the shell will | |
165 | exit itself and hence discontinue the script. I will call this way of | |
166 | doing things the "WUE" (for "wait and unconditional exit") for the | ||
167 | rest of this document. | ||
168 | |||
169 | <p><li>There is also a way that the calling shell can tell whether the | ||
170 | called program exited on SIGINT or whether it ignored SIGINT (or used it | |
171 | for other purposes). As in the <sl>WUE</sl> way, the shell waits for | |
172 | the child to complete. It figures out whether the program was ended by | |
173 | SIGINT and if so, it discontinues the script. If the program exited in any | |
174 | other way, the script will be continued. I will call this way of doing | |
175 | things the "WCE" (for "wait and cooperative exit") for the rest of | ||
176 | this document. | ||
177 | |||
178 | </menu> | ||
179 | |||
180 | <H3>The problem</H3> | ||
181 | |||
182 | On first sight, all three solutions (IUE, WUE and WCE) seem to do | |
183 | what we want: If C-c is hit while the first program of the shell | ||
184 | script runs, the script is discontinued. The user gets his prompt back | ||
185 | immediately. So what is the difference between these ways of handling | |
186 | SIGINT? | ||
187 | |||
188 | <p>There are programs that use the signal SIGINT for other purposes | ||
189 | than exiting. They use it as a normal keystroke. The user is expected | ||
190 | to use the key that sends SIGINT during a perfectly normal program | ||
191 | run. As a result, the user sends SIGINT in situations where he/she | ||
192 | does not want the program or the script to end. | ||
193 | |||
194 | <p>The primary example is the emacs editor: C-g does what ESC does in | ||
195 | other applications: It cancels a partially executed or prepared | ||
196 | operation. Technically, emacs remaps the key that sends SIGINT from | ||
197 | C-c to C-g and catches SIGINT. | ||
198 | |||
199 | <p>Remember that the SIGINT is sent to all programs running in the | ||
200 | foreground. If emacs is executing from a shell script, both emacs and | ||
201 | the shell get SIGINT. emacs is the program that decides what to do: | ||
202 | Exit on SIGINT or not. emacs decides not to exit. The problem arises | ||
203 | when the shell draws its own conclusions from receiving SIGINT without | ||
204 | consulting emacs for its opinion. | ||
205 | |||
206 | <p>Consider this script: | ||
207 | <PRE> | ||
208 | #! /bin/sh | ||
209 | emacs /tmp/foo | ||
210 | cp /tmp/foo /home/user/mail/sent | ||
211 | </PRE> | ||
212 | |||
213 | <p>If C-g is used in emacs, both the shell and emacs will receive | |
214 | SIGINT. Emacs will not exit: the user used C-g as a normal editing | |
215 | keystroke and does not want the script to be aborted on C-g. | |
216 | |||
217 | <p>The central problem is that the second command (cp) may | ||
218 | unintentionally be skipped when the shell draws its own conclusion | |
219 | about the user's intention. The innermost program is the only one to | ||
220 | judge. | ||
221 | |||
222 | <H3>One more example</H3> | ||
223 | |||
224 | <p>Imagine a mail session using a curses mailer in a tty. You called | ||
225 | your mailer and started to compose a message. Your mailer calls emacs. | ||
226 | <code>C-g</code> is a normal editing key in emacs. Technically it | ||
227 | sends SIGINT (it was <code>C-c</code>, but emacs remapped the key) to | ||
228 | <menu> | ||
229 | <li>emacs | ||
230 | <li>the shell between your mailer and emacs, the one from your mailer's | |
231 | system("emacs /tmp/bla.44") command | ||
232 | <li>the mailer itself | ||
233 | <li>possibly another shell if your mailer was called by a shell script | ||
234 | or from another application using system(3) | ||
235 | <li>your interactive shell (which ignores it since it is interactive | ||
236 | and hence is not relevant to this discussion) | ||
237 | </menu> | ||
238 | |||
239 | <p>If everyone just exits on SIGINT, you will be left with nothing but | ||
240 | your login shell, without asking. | ||
241 | |||
242 | <p>But for sure you don't want to be dropped out of your editor and | ||
243 | out of your mailer back to the commandline, having your edited data | ||
244 | and mailer status deleted. | ||
245 | |||
246 | <p>Understand the difference: While <code>C-g</code> is used as a kind | |
247 | of abort key in emacs, it isn't the major "abort everything" key. When | ||
248 | you use <code>C-g</code> in emacs, you want to end some internal emacs | ||
249 | command. You don't want your whole emacs and mailer session to end. | ||
250 | |||
251 | <p>So, if the shell exits immediately if the user sends SIGINT (the | ||
252 | second of the four ways shown above), the parent of emacs would die, | ||
253 | leaving emacs without the controlling tty. The user will lose the | |
254 | editing session immediately and unrecoverably. If the "main" shell of | |
255 | the operating system defaults to this behavior, every editor session | ||
256 | that is spawned from a mailer or such will break (because it is | ||
257 | usually executed by system(3), which calls /bin/sh). This was the case | ||
258 | in FreeBSD before Bruce Evans and I changed it in 1998. | |
259 | |||
260 | <p>If the shell recognizes that SIGINT was sent and exits after the | |
261 | current foreground process exited (the third way of the four), the | ||
262 | editor session will not be disturbed, but things will still not work | ||
263 | right. | ||
264 | |||
265 | <H3>A further look at the alternatives</H3> | ||
266 | |||
267 | <p>We still consider this script to examine the shell's actions in the | |
268 | IUE, WUE and WCE ways of handling SIGINT: | |
269 | <PRE> | ||
270 | #! /bin/sh | ||
271 | emacs /tmp/foo | ||
272 | cp /tmp/foo /home/user/mail/sent | ||
273 | </PRE> | ||
274 | |||
275 | <p>The IUE ("immediate unconditional exit") way does not work at all: | ||
276 | emacs wants to survive the SIGINT (it's a normal editing key for | ||
277 | emacs), but its parent shell unconditionally thinks "We received | ||
278 | SIGINT. Abort everything. Now.". The shell will exit even before emacs | ||
279 | exits. But this will leave emacs in an unusable state, since the death | ||
280 | of its calling shell will leave it without required resources (file | ||
281 | descriptors). This way does not work at all for shellscripts that call | ||
282 | programs that use SIGINT for other purposes than immediate exit. Even | ||
283 | programs that exit on SIGINT, but want to do some cleanup between | |
284 | the signal and the exit, may fail before they complete their cleanup. | ||
285 | |||
286 | <p>It should be noted that this way has one advantage: If a child | ||
287 | blocks SIGINT and does not exit at all, this way will get control back | ||
288 | to the user's terminal. Since such programs should be banned from your | ||
289 | system anyway, I don't think that weighs against the disadvantages. | ||
290 | |||
291 | <p>WUE ("wait and unconditional exit") is a little more clever: If C-g | ||
292 | was used in emacs, the shell will get SIGINT. It will not immediately | ||
293 | exit, but remember the fact that a SIGINT happened. When emacs ends | ||
294 | (maybe a long time after the SIGINT), it will say "Ok, a SIGINT | ||
295 | happened sometime while the child was executing, the user wants the | ||
296 | script to be discontinued". It will then exit. The cp will not be | ||
297 | executed. But that's bad. The "cp" will be executed when the emacs | ||
298 | session ended without the C-g key ever used, but it will not be | ||
299 | executed when the user used C-g at least one time. That is clearly not | ||
300 | desired. Since C-g is a normal editing key in emacs, the user expects | ||
301 | the rest of the script to behave identically no matter what keys he | ||
302 | used. | ||
303 | |||
304 | <p>As a result, the "WUE" way is better than the "IUE" way in that it | ||
305 | does not break SIGINT-using programs completely. The emacs session | ||
306 | will end undisturbed. But it still does not support scripts where | ||
307 | other actions should be performed after a program that uses SIGINT for | |
308 | non-exit purposes. Since the behavior is basically unpredictable for | |
309 | the user, this can lead to nasty surprises. | ||
310 | |||
311 | <p>The "WCE" way fixes this by "asking" the called program whether it | ||
312 | exited on SIGINT or not. While emacs receives SIGINT, it does not exit | ||
313 | on it and a calling shell waiting for its exit will not be told that | ||
314 | it exited on SIGINT. (Although it receives SIGINT at some point in | ||
315 | time, the system does not enforce that emacs will exit with | ||
316 | "I-exited-on-SIGINT" status. This is under emacs' control, see below). | ||
317 | |||
318 | <p>This still works for the normal script without SIGINT-using | |
319 | programs:</p> | ||
320 | <PRE> | ||
321 | #! /bin/sh | ||
322 | program1 | ||
323 | program2 | ||
324 | </PRE> | ||
325 | |||
326 | Unless program1 and program2 mess around with signal handling, the | ||
327 | system will tell the calling shell whether the programs exited | ||
328 | normally or as a result of SIGINT. | ||
329 | |||
330 | <p>The "WCE" way then has an easy way to do things right: When one called | |
331 | program exited with "I-exited-on-SIGINT" status, it will discontinue | ||
332 | the script after this program. If the program ends without this | ||
333 | status, the next command in the script is started. | ||
334 | |||
335 | <p>It is important to understand that a shell in "WCE" mode does not | |
336 | need to listen to the SIGINT signal at all. Both in the | ||
337 | "emacs-then-cp" script and in the "several-normal-programs" script, it | ||
338 | will be woken up and receive SIGINT when the user hits the | ||
339 | corresponding key. But the shell does not need to react to this event | |
340 | and it doesn't need to remember the event of any SIGINT, either. | ||
341 | Telling whether the user wants to end a script is done by asking that | ||
342 | program that has to decide, that program that interprets keystrokes | ||
343 | from the user, the innermost program. | ||
344 | |||
345 | <H3>So everything is well with WCE?</H3> | ||
346 | |||
347 | Well, almost. | ||
348 | |||
349 | <p>The problem with the "WCE" mode is that there are broken programs | |
350 | that do not properly communicate the required information up to the | ||
351 | calling program. | ||
352 | |||
353 | <p>Unless a program messes with signal handling, the system does this | ||
354 | automatically. | ||
355 | |||
356 | <p>There are programs that want to exit on SIGINT, but they don't let | ||
357 | the system do the automatic exit, because they want to do some | ||
358 | cleanup. To do so, they catch SIGINT, do the cleanup and then exit by | ||
359 | themselves. | ||
360 | |||
361 | <p>And here is where the problem arises: Once they catch the signal, | ||
362 | the system will no longer communicate the "I-exited-on-SIGINT" status | ||
363 | to the calling program automatically, even if the program exits | |
364 | immediately in the signal handler of SIGINT. Once it catches the | ||
365 | signal, it has to take care of communicating the signal status | ||
366 | itself. | ||
367 | |||
368 | <p>Some programs don't do this. On SIGINT, they do cleanup and exit | ||
369 | immediately, but the calling shell isn't told about the non-normal exit | |
370 | and it will call the next program in the script. | ||
371 | |||
372 | <p>As a result, the user sends SIGINT and while one program exits, the | |
373 | shellscript continues. To him/her it looks like the shell fails to | |
374 | obey the abort command. | |
375 | |||
376 | <p>Neither an IUE nor a WUE shell would have this problem, since they | |
377 | discontinue the script on their own. But as I said, they don't support | ||
378 | programs using SIGINT for non-exiting purposes, no matter whether | ||
379 | these programs properly communicate their signal status to the calling | ||
380 | shell or not. | ||
381 | |||
382 | <p>Since some shells in wide use implement the WUE way (and some even | |
383 | IUE), there is a considerable number of broken programs out there that | ||
384 | break WCE shells. The programmers just don't recognize it if their | ||
385 | shell isn't WCE. | ||
386 | |||
387 | <H3>How to be a proper program</H3> | ||
388 | |||
389 | <p>(Short note in advance: What you need to achieve is that | ||
390 | WIFSIGNALED(status) is true in the calling program and that | ||
391 | WTERMSIG(status) returns SIGINT.) | ||
392 | |||
393 | <p>If you don't catch SIGINT, the system automatically does the right | ||
394 | thing for you: Your program exits and the calling program gets the | ||
395 | right "I-exited-on-SIGINT" status after waiting for your exit. | ||
396 | |||
397 | <p>But once you catch SIGINT, you have to act. | ||
398 | |||
399 | <p>Decide whether the SIGINT is used for exit/abort purposes and hence | ||
400 | a shellscript calling this program should discontinue. This is | ||
401 | hopefully obvious. If you just need to do some cleanup on SIGINT, but | ||
402 | then exit immediately, the answer is "yes". | ||
403 | |||
404 | <p>If so, you have to tell the calling program about it by exiting | ||
405 | with the "I-exited-on-SIGINT" status. | ||
406 | |||
407 | <p>There is no other way of doing this than to kill yourself with a | ||
408 | SIGINT signal. Do it by resetting the SIGINT handler to SIG_DFL, then | ||
409 | send yourself the signal. | ||
410 | |||
411 | <PRE> | ||
412 | void sigint_handler(int sig) | ||
413 | { | ||
414 | /* do some cleanup */ | |
415 | signal(SIGINT, SIG_DFL); | ||
416 | kill(getpid(), SIGINT); | ||
417 | } | ||
418 | </PRE> | ||
419 | |||
420 | Notes: | ||
421 | |||
422 | <MENU> | ||
423 | |||
424 | <LI>You cannot "fake" the proper exit status by an exit(3) with a | ||
425 | special numeric value. People often assume this since the manuals for | ||
426 | shells often list some return value for exactly this. But this is just | ||
427 | a convention for your shell script. It does not work from one UNIX API | ||
428 | program to another. | ||
429 | |||
430 | <P>All that happens is that the shell sets the "$?" variable to a | ||
431 | special numeric value for the convenience of your script, because your | ||
432 | script does not have access to the lower-level UNIX status evaluation | |
433 | functions. This is just an agreement between your script and the | ||
434 | executing shell, it does not have any meaning in other contexts. | ||
435 | |||
436 | <P><LI>Do not use kill(0, SIGINT) without consulting the manual for | |
437 | your OS implementation. E.g. on BSD, this would send the signal not only to | |
438 | the current process, but to all processes in its process group. | |
439 | |||
440 | <P><LI>POSIX 1003.1 allows all these calls to appear in signal | ||
441 | handlers, so it is portable. | ||
442 | |||
443 | </MENU> | ||
444 | |||
445 | <p>In a bourne shell script, you can catch signals using the | ||
446 | <code>trap</code> command. Here, the same applies as for C programs. If | |
447 | the intention of SIGINT is to end your program, you have to exit in a | |
448 | way that the calling program "sees" that you have been killed. If | |
449 | you don't catch SIGINT, this happens automatically, but if you catch | |
450 | SIGINT, e.g. to do cleanup work, you have to end the program by | |
451 | killing yourself, not by calling exit. | ||
452 | |||
453 | <p>Consider this example from FreeBSD's <code>mkdep</code>, which is a | ||
454 | bourne shell script. | ||
455 | |||
456 | <pre> | ||
457 | TMP=_mkdep$$ | ||
458 | trap 'rm -f $TMP ; trap 2 ; kill -2 $$' 1 2 3 13 15 | ||
459 | </pre> | ||
460 | |||
461 | Yes, you have to do it the hard way. It's even more annoying in shell | ||
462 | scripts than in C programs since you can't "pre-delete" temporary | ||
463 | files (which isn't really portable in C, though). | ||
464 | |||
465 | <P>All this applies to programs in all languages, not only C and | ||
466 | bourne shell. Every language implementation that lets you catch SIGINT | ||
467 | should also give you the option to reset the signal and kill yourself. | ||
468 | |||
469 | <P>It is always desirable to exit the right way. Even if you don't | |
470 | expect your usual callers to depend on it, some unusual one will come | ||
471 | along. This proper exit status will be needed for WCE and will not | ||
472 | hurt when the calling shell uses IUE or WUE. | ||
473 | |||
474 | <H3>How to be a proper shell</H3> | ||
475 | |||
476 | All this applies only to the script-executing case. Most shells will | |
477 | also have interactive modes where things are different. | ||
478 | |||
479 | <MENU> | ||
480 | |||
481 | <LI>Do nothing special when SIGINT appears while you wait for a child. | ||
482 | You don't even have to remember that one happened. | ||
483 | |||
484 | <P><LI>Wait for child to exit, get the exit status. Do not truncate it | ||
485 | to type char. | ||
486 | |||
487 | <P><LI>Look at WIFSIGNALED(status) and WTERMSIG(status) to tell | ||
488 | whether the child says "I exited on SIGINT: in my opinion the user | ||
489 | wants the shellscript to be discontinued". | ||
490 | |||
491 | <P><LI>If the latter applies, discontinue the script. | ||
492 | |||
493 | <P><LI>Exit. But since a shellscript may in turn be called by a | ||
494 | shellscript, you need to make sure that you properly communicate the | ||
495 | discontinue intention to the calling program. As in any other program | ||
496 | (see above), do | ||
497 | |||
498 | <PRE> | ||
499 | signal(SIGINT, SIG_DFL); | ||
500 | kill(getpid(), SIGINT); | ||
501 | </PRE> | ||
502 | |||
503 | </MENU> | ||
504 | |||
505 | <H3>Other remarks</H3> | ||
506 | |||
507 | Although this web page talks about SIGINT only, almost the same issues | ||
508 | apply to SIGQUIT, including proper exiting by killing yourself after | ||
509 | catching the signal and proper reaction on the WIFSIGNALED(status) | ||
510 | value. One notable difference for SIGQUIT is that you have to make | ||
511 | sure that not the whole call tree dumps core. | ||
512 | |||
513 | <H3>What to fight</H3> | ||
514 | |||
515 | Make sure all programs <em>really</em> kill themselves if they react | ||
516 | to SIGINT or SIGQUIT and intend to abort their operation as a result | ||
517 | of this signal. Programs that don't use SIGINT/SIGQUIT as a | ||
518 | termination trigger - but as part of normal operation - don't kill | ||
519 | themselves, but do a normal exit instead. | ||
520 | |||
521 | <p>Make sure people understand why you can't fake an exit-on-signal by | ||
522 | doing exit(...) using any numerical status. | ||
523 | |||
524 | <p>Make sure you use a shell that behaves right. Especially if you | ||
525 | develop programs, since it will help seeing problems. | ||
526 | |||
527 | <H3>Concrete examples how to fix programs:</H3> | ||
528 | <ul> | ||
529 | |||
530 | <li>The fix for FreeBSD's | ||
531 | <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/time/time.c.diff?r1=1.10&r2=1.11">time(1)</A>. This fix is the best example, it's quite short and clear and | ||
532 | it fixes a case where someone tried to fake signal exit status by a | ||
533 | numerical value. And the complete program is small. | ||
534 | |||
535 | <p><li>Fix for FreeBSD's | ||
536 | <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/truss/main.c.diff?r1=1.9&r2=1.10">truss(1)</A>. | ||
537 | |||
538 | <p><li>The fix for FreeBSD's | ||
539 | <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/mkdep/mkdep.gcc.sh.diff?r1=1.8.2.1&r2=1.8.2.2">mkdep(1)</A>, a shell script. | ||
540 | |||
541 | |||
542 | <p><li>Fix for FreeBSD's make(1), <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/make/job.c.diff?r1=1.9&r2=1.10">part 1</A>, | ||
543 | <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/make/compat.c.diff?r1=1.10&r2=1.11">part 2</A>. | ||
544 | |||
545 | </ul> | ||
546 | |||
547 | <H3>Testsuite for shells</H3> | ||
548 | |||
549 | I have a collection of shellscripts that test shells for this | |
550 | behavior. See my <A HREF="download/">download dir</A> to get the newest | |
551 | "sh-interrupt" files, either as a tarfile or as individual files for | |
552 | online browsing. This isn't really documented, apart from the | |
553 | comments the scripts echo. | ||
554 | |||
555 | <H3>Appendix 1 - table of implementation choices</H3> | ||
556 | |||
557 | <table border cellpadding=2> | ||
558 | |||
559 | <tr valign=top> | ||
560 | <th>Method sign</th> | ||
561 | <th>Does what?</th> | ||
562 | <th>Example shells that implement it:</th> | ||
563 | <th>What happens when a shellscript called emacs, the user used | ||
564 | <code>C-g</code> and the script has additional commands in it?</th> | ||
565 | <th>What happens when a shellscript called emacs, the user did not use | ||
566 | <code>C-g</code> and the script has additional commands in it?</th> | |
567 | <th>What happens if a non-interactive child catches SIGINT?</th> | ||
568 | <th>To behave properly, children must do what?</th> | |
569 | </tr> | ||
570 | |||
571 | <tr valign=top align=left> | ||
572 | <td>IUE</td> | ||
573 | <td>The shell executing a script exits immediately if it receives | ||
574 | SIGINT.</td> | ||
575 | <td>4.4BSD ash (ash), NetBSD, FreeBSD prior to 3.0/2.2.8</td> | |
576 | <td>The editor session is lost and subsequent commands are not | ||
577 | executed.</td> | ||
578 | <td>The editor continues as normal and the subsequent commands are | ||
579 | executed. </td> | ||
580 | <td>The script ends immediately, returning to the caller even before | |
581 | the current foreground child of the shell exits. </td> | ||
582 | <td>It doesn't matter what the child does or how it exits, even if the | ||
583 | child continues to operate, the shell returns. </td> | ||
584 | </tr> | ||
585 | |||
586 | <tr valign=top align=left> | ||
587 | <td>WUE</td> | ||
588 | <td>If the shell executing a script received SIGINT while a foreground | ||
589 | process was running, it will exit after that child's exit.</td> | ||
590 | <td>pdksh (OpenBSD /bin/sh)</td> | ||
591 | <td>The editor continues as normal, but subsequent commands from the | ||
592 | script are not executed.</td> | ||
593 | <td>The editor continues as normal and subsequent commands are | ||
594 | executed. </td> | ||
595 | <td>The script returns to its caller after the current foreground | |
596 | child exits, no matter how the child exited. </td> | ||
597 | <td>It doesn't matter how the child exits (signal status or not), but | ||
598 | if it doesn't return at all, the shell will not return. In no case | ||
599 | will further commands from the script be executed. </td> | ||
600 | </tr> | ||
601 | |||
602 | <tr valign=top align=left> | ||
603 | <td>WCE</td> | ||
604 | <td>The shell exits if a child signaled that it was killed on a | ||
605 | signal (either it had the default handler for SIGINT or it killed | ||
606 | itself). </td> | ||
607 | <td>bash (Linux /bin/sh), most commercial /bin/sh, FreeBSD /bin/sh | ||
608 | from 3.0/2.2.8.</td> | ||
609 | <td>The editor continues as normal and subsequent commands are | ||
610 | executed. </td> | ||
611 | <td>The editor continues as normal and subsequent commands are | ||
612 | executed. </td> | ||
613 | <td>The script returns to its caller after the current foreground | |
614 | child exits, but only if the child exited with signal status. If | |
615 | the child did a normal exit (even if it received SIGINT, but caught | |
616 | it), the script will continue. </td> | |
617 | <td>The child must be implemented right, or the user will not be able | ||
618 | to break shell scripts reliably.</td> | ||
619 | </tr> | ||
620 | |||
621 | </table> | ||
622 | |||
623 | <P><img src=linie.png width="100%" alt=" "> | ||
624 | <BR>©2005 Martin Cracauer <cracauer @ cons.org> | ||
625 | <A HREF="http://www.cons.org/cracauer/">http://www.cons.org/cracauer/</A> | ||
626 | <BR>Last changed: $Date: 2005/02/11 21:44:43 $ | ||
627 | </BODY></HTML> | ||