author | Rob Landley <rob@landley.net> | 2006-05-11 15:00:32 +0000 |
---|---|---|
committer | Rob Landley <rob@landley.net> | 2006-05-11 15:00:32 +0000 |
commit | b73d2bf4bfac8f43cb068ba7a63a057eb9ca88ce (patch) | |
tree | 55b5e472cdb4d24c07621941769976dd99702c88 /docs/busybox.net | |
parent | 8d2cb8be3b4c6be632911a2705b13e084ab9ef72 (diff) | |
download | busybox-w32-b73d2bf4bfac8f43cb068ba7a63a057eb9ca88ce.tar.gz busybox-w32-b73d2bf4bfac8f43cb068ba7a63a057eb9ca88ce.tar.bz2 busybox-w32-b73d2bf4bfac8f43cb068ba7a63a057eb9ca88ce.zip |
Reorganize FAQ, update a few entries, and consolidate with programming.html.
Diffstat (limited to 'docs/busybox.net')
-rw-r--r-- | docs/busybox.net/FAQ.html | 901 | ||||
-rw-r--r-- | docs/busybox.net/programming.html | 584 |
2 files changed, 758 insertions, 727 deletions
diff --git a/docs/busybox.net/FAQ.html b/docs/busybox.net/FAQ.html index b21f722b6..07c1fd4e9 100644 --- a/docs/busybox.net/FAQ.html +++ b/docs/busybox.net/FAQ.html | |||
@@ -1,38 +1,62 @@ | |||
1 | <!--#include file="header.html" --> | 1 | <!--#include file="header.html" --> |
2 | 2 | ||
3 | |||
4 | <h3>Frequently Asked Questions</h3> | 3 | <h3>Frequently Asked Questions</h3> |
5 | 4 | ||
6 | This is a collection of some of the more frequently asked questions | 5 | This is a collection of some of the more frequently asked questions |
7 | about BusyBox. Some of the questions even have answers. If you | 6 | about BusyBox. Some of the questions even have answers. If you |
8 | have additions to this FAQ document, we would love to add them, | 7 | have additions to this FAQ document, we would love to add them, |
9 | 8 | ||
9 | <h2>General questions</h2> | ||
10 | <ol> | 10 | <ol> |
11 | <li><a href="#getting_started">How can I get started using BusyBox?</a> | 11 | <li><a href="#getting_started">How can I get started using BusyBox?</a> |
12 | <li><a href="#build_system">How do I build a BusyBox-based system?</a> | 12 | <li><a href="#build_system">How do I build a BusyBox-based system?</a> |
13 | <li><a href="#init">Busybox init isn't working!</a> | ||
14 | <li><a href="#kernel">Which Linux kernel versions are supported?</a> | 13 | <li><a href="#kernel">Which Linux kernel versions are supported?</a> |
15 | <li><a href="#arch">Which architectures does BusyBox run on?</a> | 14 | <li><a href="#arch">Which architectures does BusyBox run on?</a> |
16 | <li><a href="#libc">Which C libraries are supported?</a> | 15 | <li><a href="#libc">Which C libraries are supported?</a> |
17 | <li><a href="#commercial">Can I include BusyBox as part of the software on my device?</a> | 16 | <li><a href="#commercial">Can I include BusyBox as part of the software on my device?</a> |
18 | <li><a href="#bugs">I think I found a bug in BusyBox! What should I do?!</a> | ||
19 | <li><a href="#job_control">Why do I keep getting "sh: can't access tty; job control | ||
20 | turned off" errors? Why doesn't Control-C work within my shell?</a> | ||
21 | <li><a href="#demanding">I demand that you to add <favorite feature> right now! How come | ||
22 | you don't answer all my questions on the mailing list instantly? I demand | ||
23 | that you help me with all of my problems <em>Right Now</em>!</a> | ||
24 | <li><a href="#helpme">I need help with BusyBox! What should I do?</a> | ||
25 | <li><a href="#contracts">I need you to add <favorite feature>! Are the BusyBox developers willing to | ||
26 | be paid in order to fix bugs or add in <favorite feature>? Are you willing to provide | ||
27 | support contracts?</a> | ||
28 | <li><a href="#external">Where can I find other small utilities since busybox does not include the features I want?</a></li> | 17 | <li><a href="#external">Where can I find other small utilities since busybox does not include the features I want?</a></li> |
29 | <li><a href="#support">I think you guys are great and I want to help support your work!</a> | 18 | <li><a href="#demanding">I demand that you to add <favorite feature> right now! How come you don't answer all my questions on the mailing list instantly? I demand that you help me with all of my problems <em>Right Now</em>!</a> |
30 | <li><a href="#optimize">I want to make busybox even smaller, how do I go about it?</a> | 19 | <li><a href="#helpme">I need help with BusyBox! What should I do?</a> |
20 | <li><a href="#contracts">I need you to add <favorite feature>! Are the BusyBox developers willing to be paid in order to fix bugs or add in <favorite feature>? Are you willing to provide support contracts?</a> | ||
21 | </ol> | ||
31 | 22 | ||
23 | <h2>Troubleshooting</h2> | ||
24 | <ol> | ||
25 | <li><a href="#bugs">I think I found a bug in BusyBox! What should I do?!</a></li> | ||
26 | <li><a href="#init">Busybox init isn't working!</a></li> | ||
27 | <li><a href="#sed">I can't configure busybox on my system.</a></li> | ||
28 | <li><a href="#job_control">Why do I keep getting "sh: can't access tty; job control turned off" errors? Why doesn't Control-C work within my shell?</a></li> | ||
29 | </ol> | ||
30 | |||
31 | <h2>Programming questions</h2> | ||
32 | <ol> | ||
33 | <li><a href="#goals">What are the goals of busybox?</a></li> | ||
34 | <li><a href="#design">What is the design of busybox?</a></li> | ||
35 | <li><a href="#source">How is the source code organized?</a></li> | ||
36 | <ul> | ||
37 | <li><a href="#source_applets">The applet directories.</a></li> | ||
38 | <li><a href="#source_libbb">The busybox shared library (libbb)</a></li> | ||
39 | </ul> | ||
40 | <li><a href="#optimize">I want to make busybox even smaller, how do I go about it?</a></li> | ||
41 | <li><a href="#adding">Adding an applet to busybox</a></li> | ||
42 | <li><a href="#standards">What standards does busybox adhere to?</a></li> | ||
43 | <li><a href="#portability">Portability.</a></li> | ||
44 | <li><a href="#tips">Tips and tricks.</a></li> | ||
45 | <ul> | ||
46 | <li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li> | ||
47 | <li><a href="#tips_vfork">Fork and vfork</a></li> | ||
48 | <li><a href="#tips_short_read">Short reads and writes</a></li> | ||
49 | <li><a href="#tips_memory">Memory used by relocatable code, PIC, and static linking.</a></li> | ||
50 | <li><a href="#tips_kernel_headers">Including Linux kernel headers.</a></li> | ||
51 | </ul> | ||
52 | <li><a href="#who">Who are the BusyBox developers?</a></li> | ||
53 | </ul> | ||
32 | 54 | ||
33 | 55 | ||
34 | </ol> | 56 | </ol> |
35 | 57 | ||
58 | <h1>General questions</h1> | ||
59 | |||
36 | <hr /> | 60 | <hr /> |
37 | <p> | 61 | <p> |
38 | <h2><a name="getting_started">How can I get started using BusyBox?</a></h2> | 62 | <h2><a name="getting_started">How can I get started using BusyBox?</a></h2> |
@@ -116,34 +140,6 @@ have additions to this FAQ document, we would love to add them, | |||
116 | 140 | ||
117 | <hr /> | 141 | <hr /> |
118 | <p> | 142 | <p> |
119 | <h2><a name="init">Busybox init isn't working!</a></h2> | ||
120 | <p> | ||
121 | Build a statically linked version of the following "hello world" program | ||
122 | with your cross compiler toolchain. | ||
123 | </p> | ||
124 | <pre> | ||
125 | #include <stdio.h> | ||
126 | |||
127 | int main(int argc, char *argv) | ||
128 | { | ||
129 | printf("Hello world!\n"); | ||
130 | sleep(999999999); | ||
131 | } | ||
132 | </pre> | ||
133 | |||
134 | <p> | ||
135 | Now try to boot your device with an "init=" argument pointing to your | ||
136 | hello world program. Did you see the hello world message? Until you | ||
137 | do, don't bother messing with busybox init. | ||
138 | </p> | ||
139 | |||
140 | <p> | ||
141 | Once you've got it working statically linked, try getting it to work | ||
142 | dynamically linked. Then read the FAQ entry before this one. | ||
143 | </p> | ||
144 | |||
145 | <hr /> | ||
146 | <p> | ||
147 | <h2><a name="kernel">Which Linux kernel versions are supported?</a></h2> | 143 | <h2><a name="kernel">Which Linux kernel versions are supported?</a></h2> |
148 | <p> | 144 | <p> |
149 | Full functionality requires Linux 2.4.x or better. (Earlier versions may | 145 | Full functionality requires Linux 2.4.x or better. (Earlier versions may |
@@ -185,73 +181,29 @@ int main(int argc, char *argv) | |||
185 | <a href="http://www.busybox.net/lists/busybox/2005-March/013759.html">this thread</a>). | 181 | <a href="http://www.busybox.net/lists/busybox/2005-March/013759.html">this thread</a>). |
186 | This is still experimental, but may be supported in a future release. | 182 | This is still experimental, but may be supported in a future release. |
187 | </p> | 183 | </p> |
184 | |||
188 | <hr /> | 185 | <hr /> |
189 | <p> | 186 | <p> |
190 | <h2><a name="commercial">Can I include BusyBox as part of the software on my device?</a></h2> | 187 | <h2><a name="commercial">Can I include BusyBox as part of the software on my device?</a></h2> |
188 | <p> | ||
191 | 189 | ||
190 | <p> | ||
192 | Yes. As long as you <a href="http://busybox.net/license.html">fully comply | 191 | Yes. As long as you <a href="http://busybox.net/license.html">fully comply |
193 | with the generous terms of the GPL BusyBox license</a> you can ship BusyBox | 192 | with the generous terms of the GPL BusyBox license</a> you can ship BusyBox |
194 | as part of the software on your device. | 193 | as part of the software on your device. |
195 | 194 | </p> | |
196 | <br> | ||
197 | <a href="#support">Please consider sharing some of the money you make.</a> | ||
198 | |||
199 | |||
200 | <hr /> | ||
201 | <p> | ||
202 | <h2><a name="bugs">I think I found a bug in BusyBox! What should I do?</a></h2> | ||
203 | <p> | ||
204 | |||
205 | |||
206 | <p> | ||
207 | |||
208 | If you simply need help with using or configuring BusyBox, please submit a | ||
209 | detailed description of your problem to the BusyBox mailing list at <a | ||
210 | href="mailto:busybox@mail.busybox.net"> busybox@mail.busybox.net</a>. | ||
211 | Please do not send email to individual developers asking | ||
212 | for private help unless you are planning on paying for consulting services. | ||
213 | When we answer questions on the BusyBox mailing list, it helps everyone, | ||
214 | while private answers help only you... | ||
215 | |||
216 | <p> | ||
217 | |||
218 | The developers of BusyBox are busy people, and have only so much they can | ||
219 | keep in their brains at a time. As a result, bug reports sometimes get | ||
220 | lost when posted to the mailing list. To prevent your bug report from | ||
221 | getting lost, if you find a bug in BusyBox, please use the <a | ||
222 | href="http://bugs.busybox.net/">BusyBox Bug and Patch Tracking System</a> | ||
223 | to submit a detailed bug report. | ||
224 | |||
225 | <p> | ||
226 | |||
227 | The same also applies to patches... Regardless of whether your patch is a | ||
228 | bug fix or adds shiny new features, please post your patch to the <a | ||
229 | href="http://bugs.busybox.net/">BusyBox Bug and Patch Tracking System</a> | ||
230 | to make certain it is properly considered. | ||
231 | |||
232 | 195 | ||
233 | <hr /> | 196 | <hr /> |
234 | <p> | 197 | <p> |
235 | <h2><a name="job_control">Why do I keep getting "sh: can't access tty; job control | 198 | <h2><a name="external">where can i find other small utilities since busybox |
236 | turned off" errors? Why doesn't Control-C work within my shell?</a></h2> | 199 | does not include the features i want?</a></h2> |
237 | <p> | 200 | <p> |
238 | 201 | we maintain such a <a href="tinyutils.html">list</a> on this site! | |
239 | Job control will be turned off since your shell can not obtain a controlling | 202 | </p> |
240 | terminal. This typically happens when you run your shell on /dev/console. | ||
241 | The kernel will not provide a controlling terminal on the /dev/console | ||
243 | device. You should run your shell on a normal tty such as tty1 or ttyS0 | ||
244 | and everything will work perfectly. If you <em>REALLY</em> want your shell | ||
245 | to run on /dev/console, then you can hack your kernel (if you are into that | ||
246 | sort of thing) by changing drivers/char/tty_io.c to change the lines where | ||
246 | it sets "noctty = 1;" to instead set it to "0". I recommend you instead | ||
247 | run your shell on a real console... | ||
248 | |||
249 | 203 | ||
250 | <hr /> | 204 | <hr /> |
251 | <p> | 205 | <p> |
252 | <h2><a name="demanding">I demand that you to add <favorite feature> right now! How come | 206 | <h2><a name="demanding">I demand that you to add <favorite feature> right now! How come you don't answer all my questions on the mailing list instantly? I demand that you help me with all of my problems <em>Right Now</em>!</a></h2> |
253 | you don't answer all my questions on the mailing list instantly? I demand | ||
254 | that you help me with all of my problems <em>Right Now</em>!</a></h2> | ||
255 | <p> | 207 | <p> |
256 | 208 | ||
257 | You have not paid us a single cent and yet you still have the product of | 209 | You have not paid us a single cent and yet you still have the product of |
@@ -266,81 +218,243 @@ int main(int argc, char *argv) | |||
266 | <p> | 218 | <p> |
267 | 219 | ||
268 | If you find that you need help with BusyBox, you can ask for help on the | 220 | If you find that you need help with BusyBox, you can ask for help on the |
269 | BusyBox mailing list at busybox@mail.busybox.net. In addition to the BusyBox | 221 | BusyBox mailing list at busybox@busybox.net.</p> |
270 | mailing list, Erik (andersee), Manuel (mjn3), Rob (landley) and others are | 222 | |
271 | known to hang out on the uClibc IRC channel: #uclibc on irc.freenode.net. | 223 | <p> In addition to the mailing list, Erik Andersen (andersee), Manuel Nova |
272 | (Daily logs of that IRC channel, going back to 2002, are available | 224 | (mjn3), Rob Landley (landley), Mike Frysinger (SpanKY), Bernhard Fischer |
273 | <a href="http://ibot.Rikers.org/%23uclibc/">here</a>.) | 225 | (blindvt), and other long-time BusyBox developers are known to hang out |
274 | 226 | on the uClibc IRC channel: #uclibc on irc.freenode.net. There is a | |
275 | <p> | 227 | <a href="http://ibot.Rikers.org/%23uclibc/">web archive of |
228 | daily logs of the #uclibc IRC channel</a> going back to 2002. | ||
229 | </p> | ||
276 | 230 | ||
231 | <p> | ||
277 | <b>Please do not send private email to Rob, Erik, Manuel, or the other | 232 | <b>Please do not send private email to Rob, Erik, Manuel, or the other |
278 | BusyBox contributors asking for private help unless you are planning on | 233 | BusyBox contributors asking for private help unless you are planning on |
279 | paying for consulting services.</b> | 234 | paying for consulting services.</b> |
235 | </p> | ||
280 | 236 | ||
281 | <p> | 237 | <p> |
282 | |||
283 | When we answer questions on the BusyBox mailing list, it helps everyone | 238 | When we answer questions on the BusyBox mailing list, it helps everyone |
284 | since people with similar problems in the future will be able to get help | 239 | since people with similar problems in the future will be able to get help |
285 | by searching the mailing list archives. Private help is reserved as a paid | 240 | by searching the mailing list archives. Private help is reserved as a paid |
286 | service. If you need to use private communication, or if you are serious | 241 | service. If you need to use private communication, or if you are serious |
287 | about getting timely assistance with BusyBox, you should seriously consider | 242 | about getting timely assistance with BusyBox, you should seriously consider |
288 | paying for consulting services. | 243 | paying for consulting services. |
244 | </p> | ||
289 | 245 | ||
290 | <p> | 246 | <hr /> |
247 | <p> | ||
248 | <h2><a name="contracts">I need you to add <favorite feature>! Are the BusyBox developers willing to be paid in order to fix bugs or add in <favorite feature>? Are you willing to provide support contracts?</a></h2> | ||
249 | </p> | ||
291 | 250 | ||
251 | <p> | ||
252 | Yes we are. The easy way to sponsor a new feature is to post an offer on | ||
253 | the mailing list to see who's interested. You can also email the project's | ||
254 | maintainer and ask them to recommend someone. | ||
255 | </p> | ||
292 | 256 | ||
257 | <p> If you prefer to deal with an organization rather than an individual, Rob | ||
258 | Landley (the current BusyBox maintainer) works for | ||
259 | <a href="http://www.timesys.com">TimeSys</a>, and Erik Andersen (the previous | ||
260 | busybox maintainer and current uClibc maintainer) owns | ||
261 | <a href="http://codepoet-consulting.com/">CodePoet Consulting</a>. Both | ||
262 | companies offer support contracts and handle new development, and there | ||
263 | are plenty of other companies that do the same. | ||
264 | </p> | ||
265 | |||
266 | |||
267 | |||
268 | |||
269 | <h1>Troubleshooting</h1> | ||
293 | 270 | ||
294 | <hr /> | 271 | <hr /> |
272 | <p></p> | ||
273 | <h2><a name="bugs">I think I found a bug in BusyBox! What should I do?</a></h2> | ||
274 | <p></p> | ||
275 | |||
295 | <p> | 276 | <p> |
296 | <h2><a name="contracts">I need you to add <favorite feature>! Are the BusyBox | 277 | If you simply need help with using or configuring BusyBox, please submit a |
297 | developers willing to be paid in order to fix bugs or add in <favorite feature>? | 278 | detailed description of your problem to the BusyBox mailing list at <a |
298 | Are you willing to provide support contracts?</a></h2> | 279 | href="mailto:busybox@busybox.net"> busybox@busybox.net</a>. |
280 | Please do not send email to individual developers asking | ||
281 | for private help unless you are planning on paying for consulting services. | ||
282 | When we answer questions on the BusyBox mailing list, it helps everyone, | ||
283 | while private answers help only you... | ||
284 | </p> | ||
285 | |||
299 | <p> | 286 | <p> |
287 | The developers of BusyBox are busy people, and have only so much they can | ||
288 | keep in their brains at a time. As a result, bug reports and new feature | ||
289 | patches sometimes get lost when posted to the mailing list. To prevent | ||
290 | your bug report from getting lost, if you find a bug in BusyBox that isn't | ||
291 | immediately addressed, please use the <a | ||
292 | href="http://bugs.busybox.net/">BusyBox Bug and Patch Tracking System</a> | ||
293 | to submit a detailed explanation and we'll get to it as soon as we can. | ||
294 | </p> | ||
300 | 295 | ||
301 | Sure! Now you have our attention! What you should do is contact <a | 296 | <hr /> |
302 | href="mailto:andersen@codepoet.org">Erik Andersen</a> of <a | 297 | <p> |
303 | href="http://codepoet-consulting.com/">CodePoet Consulting</a> to bid | 298 | <h2><a name="init">Busybox init isn't working!</a></h2> |
304 | on your project. If Erik is too busy to personally add your feature, there | 299 | <p> |
305 | are many other active BusyBox contributors who will almost certainly be able | 300 | Build a statically linked version of the following "hello world" program |
306 | to help you out. Erik can contact them privately, and may even let you to | 301 | with your cross compiler toolchain. |
307 | post your request for services on the mailing list. | 302 | </p> |
303 | <pre> | ||
304 | #include <stdio.h> | ||
308 | 305 | ||
306 | int main(int argc, char *argv) | ||
307 | { | ||
308 | printf("Hello world!\n"); | ||
309 | sleep(999999999); | ||
310 | } | ||
311 | </pre> | ||
312 | |||
313 | <p> | ||
314 | Now try to boot your device with an "init=" argument pointing to your | ||
315 | hello world program. Did you see the hello world message? Until you | ||
316 | do, don't bother messing with busybox init. | ||
317 | </p> | ||
318 | |||
319 | <p> | ||
320 | Once you've got it working statically linked, try getting it to work | ||
321 | dynamically linked. Then read the FAQ entry <a href="#build_system">How | ||
322 | do I build a BusyBox-based system?</a> | ||
323 | </p> | ||
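The test program quoted above declares `char *argv` and calls sleep() without including `<unistd.h>`. A minimal corrected sketch follows, assuming a Linux target. The helper name `announce` and the `BUILD_AS_INIT` macro are hypothetical; the macro is there only so the helper can be compiled and checked on its own.

```c
#include <stdio.h>
#include <unistd.h>

/* Print the marker line and flush it explicitly, since a process that
 * never exits cannot rely on exit-time flushing. Returns 0 on success. */
int announce(FILE *out)
{
    if (fputs("Hello world!\n", out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}

#ifdef BUILD_AS_INIT /* hypothetical macro: define it to get a bootable init */
int main(void)
{
    announce(stdout);
    /* Process 1 must never return: the kernel panics if init exits. */
    for (;;)
        pause();
}
#endif
```

Built with something like `-static -DBUILD_AS_INIT` using your cross compiler, the result can be passed to the kernel via "init=" as the FAQ entry describes.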
309 | 324 | ||
310 | <hr /> | 325 | <hr /> |
311 | <p> | 326 | <p> |
312 | <h2><a name="external">Where can I find other small utilities since busybox | 327 | <h2><a name="sed">I can't configure busybox on my system.</a></h2> |
313 | does not include the features I want?</a></h2> | ||
314 | <p> | 328 | <p> |
315 | We maintain such a <a href="tinyutils.html">list</a> on this site! | 329 | Configuring Busybox depends on a recent version of sed. Older |
330 | distributions (Red Hat 7.2, Debian 3.0) may not come with a | ||
331 | usable version. Luckily BusyBox can use its own sed to configure itself, | ||
332 | although this leads to a bit of a chicken and egg problem. | ||
333 | You can work around this by hand-configuring busybox to build with just | ||
334 | sed, then putting that sed in your path to configure the rest of busybox | ||
335 | with, like so: | ||
336 | </p> | ||
316 | 337 | ||
338 | <pre> | ||
339 | tar xvjf sources/busybox-x.x.x.tar.bz2 | ||
340 | cd busybox-x.x.x | ||
341 | make allnoconfig | ||
342 | make include/bb_config.h | ||
343 | echo "CONFIG_SED=y" >> .config | ||
344 | echo "#undef ENABLE_SED" >> include/bb_config.h | ||
345 | echo "#define ENABLE_SED 1" >> include/bb_config.h | ||
346 | make | ||
347 | mv busybox sed | ||
348 | export PATH=`pwd`:"$PATH" | ||
349 | </pre> | ||
350 | |||
351 | <p>Then you can run "make defconfig" or "make menuconfig" normally.</p> | ||
317 | 352 | ||
318 | <hr /> | 353 | <hr /> |
319 | <p> | 354 | <p> |
320 | <h2><a name="support">I think you guys are great and I want to help support your work!</a></h2> | 355 | <h2><a name="job_control">Why do I keep getting "sh: can't access tty; job control turned off" errors? Why doesn't Control-C work within my shell?</a></h2> |
321 | <p> | 356 | <p> |
322 | 357 | ||
323 | Wow, that would be great! If you would like to make a donation to help | 358 | Job control will be turned off since your shell can not obtain a controlling |
324 | support BusyBox, and/or request features, you can click here: | 359 | terminal. This typically happens when you run your shell on /dev/console. |
325 | 360 | The kernel will not provide a controlling terminal on the /dev/console | |
326 | <!-- Begin PayPal Logo --> | 361 | device. You should run your shell on a normal tty such as tty1 or ttyS0 | |
327 | <center> | 362 | and everything will work perfectly. If you <em>REALLY</em> want your shell | |
328 | <form action="https://www.paypal.com/cgi-bin/webscr" method="post"> | 363 | to run on /dev/console, then you can hack your kernel (if you are into that | |
329 | <input type="hidden" name="cmd" value="_xclick"> | 364 | sort of thing) by changing drivers/char/tty_io.c to change the lines where | |
330 | <input type="hidden" name="business" value="andersen@codepoet.org"> | 365 | it sets "noctty = 1;" to instead set it to "0". I recommend you instead |
331 | <input type="hidden" name="item_name" value="Support BusyBox"> | 366 | run your shell on a real console... |
332 | <input type="hidden" name="image_url" value="http://codepoet-consulting.com/images/codepoet.png"> | 367 | </p> |
333 | <input type="hidden" name="no_shipping" value="1"> | ||
334 | <input type="image" src="images/donate.png" name="submit" alt="Make donation using PayPal"> | ||
335 | </form> | ||
336 | </center> | ||
337 | <!-- End PayPal Logo --> | ||
338 | 368 | ||
339 | If you prefer to contact Erik directly to make a donation, donate hardware, | 369 | <h1>Development</h1> |
340 | request support, etc, you can contact | 370 | |
341 | <a href="http://codepoet-consulting.com/">CodePoet Consulting</a> here. | 371 | <h2><b><a name="goals">What are the goals of busybox?</a></b></h2> |
342 | CodePoet Consulting can accept both Visa and MasterCard for those that do | 372 | |
343 | not trust PayPal... | 373 | <p>Busybox aims to be the smallest and simplest correct implementation of the |
374 | standard Linux command line tools. First and foremost, this means the | ||
375 | smallest executable size we can manage. We also want to have the simplest | ||
376 | and cleanest implementation we can manage, be <a href="#standards">standards | ||
377 | compliant</a>, minimize run-time memory usage (heap and stack), run fast, and | ||
378 | take over the world.</p> | ||
379 | |||
380 | <h2><b><a name="design">What is the design of busybox?</a></b></h2> | ||
381 | |||
382 | <p>Busybox is like a swiss army knife: one thing with many functions. | ||
383 | The busybox executable can act like many different programs depending on | ||
384 | the name used to invoke it. Normal practice is to create a bunch of symlinks | ||
385 | pointing to the busybox binary, each of which triggers a different busybox | ||
386 | function. (See <a href="FAQ.html#getting_started">getting started</a> in the | ||
387 | FAQ for more information on usage, and <a href="BusyBox.html">the | ||
388 | busybox documentation</a> for a list of symlink names and what they do.) | ||
389 | |||
390 | <p>The "one binary to rule them all" approach is primarily for size reasons: a | ||
391 | single multi-purpose executable is smaller than many small files could be. | ||
392 | This way busybox only has one set of ELF headers, it can easily share code | ||
393 | between different apps even when statically linked, it has better packing | ||
394 | efficiency by avoiding gaps between files or compression dictionary resets, | ||
395 | and so on.</p> | ||
396 | |||
397 | <p>Work is underway on new options such as "make standalone" to build separate | ||
398 | binaries for each applet, and a "libbb.so" to make the busybox common code | ||
399 | available as a shared library. Neither is ready yet at the time of this | ||
400 | writing.</p> | ||
401 | |||
402 | <a name="source"></a> | ||
403 | |||
404 | <h2><a name="source_applets"><b>The applet directories</b></a></h2> | ||
405 | |||
406 | <p>The directory "applets" contains the busybox startup code (applets.c and | ||
407 | busybox.c), and several subdirectories containing the code for the individual | ||
408 | applets.</p> | ||
409 | |||
410 | <p>Busybox execution starts with the main() function in applets/busybox.c, | ||
411 | which sets the global variable bb_applet_name to argv[0] and calls | ||
412 | run_applet_by_name() in applets/applets.c. That uses the applets[] array | ||
413 | (defined in include/busybox.h and filled out in include/applets.h) to | ||
414 | transfer control to the appropriate APPLET_main() function (such as | ||
415 | cat_main() or sed_main()). The individual applet takes it from there.</p> | ||
416 | |||
417 | <p>This is why calling busybox under a different name triggers different | ||
418 | functionality: main() looks up argv[0] in applets[] to get a function pointer | ||
419 | to APPLET_main().</p> | ||
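The argv[0] dispatch described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code in applets/applets.c: the struct layout, the three stub applets, and the names `find_applet` and `sketch_run_applet` are all hypothetical. It does show why the table has to stay sorted, because a binary search silently misses entries that are out of order.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for real APPLET_main() functions. */
static int cat_main(int argc, char **argv)  { (void)argc; (void)argv; return 0; }
static int sed_main(int argc, char **argv)  { (void)argc; (void)argv; return 0; }
static int true_main(int argc, char **argv) { (void)argc; (void)argv; return 0; }

struct applet {
    const char *name;
    int (*main)(int argc, char **argv);
};

/* Must stay in alphabetical order: bsearch() assumes sorted input. */
static const struct applet applets[] = {
    { "cat",  cat_main  },
    { "sed",  sed_main  },
    { "true", true_main },
};

static int applet_cmp(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct applet *)elem)->name);
}

/* Strip any directory prefix so "/bin/cat" and "cat" find the same entry. */
const struct applet *find_applet(const char *argv0)
{
    const char *base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;
    return bsearch(base, applets, sizeof(applets) / sizeof(applets[0]),
                   sizeof(applets[0]), applet_cmp);
}

/* Transfer control to the applet named by argv[0]; 127 if there is none. */
int sketch_run_applet(int argc, char **argv)
{
    const struct applet *a = find_applet(argv[0]);
    if (a == NULL) {
        fprintf(stderr, "%s: applet not found\n", argv[0]);
        return 127;
    }
    return a->main(argc, argv);
}
```

A symlink named "sed" pointing at the binary makes argv[0] end in "sed", so the lookup lands on sed_main() without any other configuration.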
420 | |||
421 | <p>Busybox applets may also be invoked through the multiplexor applet | ||
422 | "busybox" (see busybox_main() in applets/busybox.c), and through the | ||
423 | standalone shell (grep for STANDALONE_SHELL in applets/shell/*.c). | ||
424 | See <a href="FAQ.html#getting_started">getting started</a> in the | ||
425 | FAQ for more information on these alternate usage mechanisms, which are | ||
426 | just different ways to reach the relevant APPLET_main() function.</p> | ||
427 | |||
428 | <p>The applet subdirectories (archival, console-tools, coreutils, | ||
429 | debianutils, e2fsprogs, editors, findutils, init, loginutils, miscutils, | ||
430 | modutils, networking, procps, shell, sysklogd, and util-linux) correspond | ||
431 | to the configuration sub-menus in menuconfig. Each subdirectory contains the | ||
432 | code to implement the applets in that sub-menu, as well as a Config.in | ||
433 | file defining that configuration sub-menu (with dependencies and help text | ||
434 | for each applet), and the makefile segment (Makefile.in) for that | ||
435 | subdirectory.</p> | ||
436 | |||
437 | <p>The run-time --help is stored in usage_messages[], which is initialized at | ||
438 | the start of applets/applets.c and gets its help text from usage.h. During the | ||
439 | build this help text is also used to generate the BusyBox documentation (in | ||
440 | html, txt, and man page formats) in the docs directory. See | ||
441 | <a href="#adding">adding an applet to busybox</a> for more | ||
442 | information.</p> | ||
443 | |||
444 | <h2><a name="source_libbb"><b>libbb</b></a></h2> | ||
445 | |||
446 | <p>Most non-setup code shared between busybox applets lives in the libbb | ||
447 | directory. It's a mess that evolved over the years without much auditing | ||
448 | or cleanup. For anybody looking for a great project to break into busybox | ||
449 | development with, documenting libbb would be both incredibly useful and good | ||
450 | experience.</p> | ||
451 | |||
452 | <p>Common themes in libbb include allocation functions that test | ||
453 | for failure and abort the program with an error message so the caller doesn't | ||
454 | have to test the return value (xmalloc(), xstrdup(), etc), wrapped versions | ||
455 | of open(), close(), read(), and write() that test for their own failures | ||
456 | and/or retry automatically, linked list management functions (llist.c), | ||
457 | command line argument parsing (getopt_ulflags.c), and a whole lot more.</p> | ||
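As a rough illustration of those two themes, here is a minimal sketch of an allocate-or-die wrapper and a write loop that retries short writes. The `sketch_` names are hypothetical and the bodies are not copied from libbb; the real xmalloc(), xstrdup(), and write wrappers may differ in messages, exit codes, and edge cases.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Allocate or die, so callers never have to test the return value. */
void *sketch_xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0) {
        fputs("memory exhausted\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Duplicate a string using the allocate-or-die wrapper. */
char *sketch_xstrdup(const char *s)
{
    size_t len = strlen(s) + 1;
    return memcpy(sketch_xmalloc(len), s, len);
}

/* Keep calling write() until every byte is out, retrying on EINTR.
 * Returns len on success, -1 on any other error. */
ssize_t sketch_full_write(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Centralizing the failure handling this way keeps each applet smaller, since the error message and exit path exist once instead of at every call site.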
344 | 458 | ||
345 | <hr /> | 459 | <hr /> |
346 | <p> | 460 | <p> |
@@ -352,16 +466,517 @@ int main(int argc, char *argv) | |||
352 | so a small change may not even be visible by itself, but many small | 466 | so a small change may not even be visible by itself, but many small |
353 | savings add up). | 467 | savings add up). |
354 | </p> | 468 | </p> |
469 | |||
470 | <p> The busybox Makefile builds two versions of busybox, one of which | ||
471 | (busybox_unstripped) has extra information that various analysis tools | ||
472 | can use. (This has nothing to do with CONFIG_DEBUG, leave that off | ||
473 | when trying to optimize for size.) | ||
474 | </p> | ||
475 | |||
476 | <p> The <b>"make bloatcheck"</b> option uses Matt Mackall's bloat-o-meter | ||
477 | script to compare two versions of busybox (busybox_unstripped vs | ||
478 | busybox_old), and report which symbols changed size and by how much. | ||
479 | To use it, first build a base version, rename busybox_unstripped to | ||
480 | busybox_old, and then build a new version with your changes and run | ||
481 | "make bloatcheck" to see the size differences from the old version. | ||
482 | </p> | ||
483 | <p> | ||
484 | The first line of output has totals: how many symbols were added or | ||
485 | removed, how many symbols grew or shrank, the number of bytes added | ||
486 | and number of bytes removed by these changes, and finally the total | ||
487 | number of bytes difference between the two files. The remaining | ||
488 | lines show each individual symbol, the old and new sizes, and the | ||
489 | increase or decrease in size (which results are sorted by). | ||
490 | </p> | ||
491 | <p> | ||
492 | The <b>"make sizes"</b> option produces raw symbol size information for | ||
493 | busybox_unstripped. This is the output from the "nm --size-sort" | ||
494 | command (see "man nm" for more information), and is the information | ||
495 | bloat-o-meter parses to produce the comparison report above. For | ||
496 | defconfig, this is a good way to find the largest symbols in the tree | ||
497 | (which is a good place to start when trying to shrink the code). To | ||
498 | take a closer look at individual applets, configure busybox with just | ||
499 | one applet (run "make allnoconfig" and then switch on a single applet | ||
500 | with menuconfig), and then use "make sizes" to see the size of that | ||
501 | applet's components. | ||
502 | </p> | ||
355 | <p> | 503 | <p> |
356 | The busybox Makefile can generate a report of how much space is actually | 504 | The "showasm" command (in the scripts directory) produces an assembly |
357 | being used by each function and variable. Run "<b>make sizes</b>" (preferably | 505 | dump of a function, providing a closer look at what changed. Try |
358 | with CONFIG_DEBUG off) to get a list of symbols and the amount of | 506 | "scripts/showasm busybox_unstripped" to list available symbols, and |
359 | space allocated for each one, sorted by size. | 507 | "scripts/showasm busybox_unstripped symbolname" to see the assembly |
508 | for a specific symbol. | ||
360 | </p> | 509 | </p> |
361 | <hr /> | 510 | <hr /> |
362 | 511 | ||
363 | 512 | ||
364 | 513 | ||
514 | <h2><a name="adding"><b>Adding an applet to busybox</b></a></h2> | ||
515 | |||
516 | <p>To add a new applet to busybox, first pick a name for the applet and | ||
517 | a corresponding CONFIG_NAME. Then do this:</p> | ||
518 | |||
519 | <ul> | ||
520 | <li>Figure out where in the busybox source tree your applet best fits, | ||
521 | and put your source code there. Be sure to use APPLET_main() instead | ||
522 | of main(), where APPLET is the name of your applet.</li> | ||
523 | |||
524 | <li>Add your applet to the relevant Config.in file (which file you add | ||
525 | it to determines where it shows up in "make menuconfig"). This uses | ||
526 | the same general format as the linux kernel's configuration system.</li> | ||
527 | |||
528 | <li>Add your applet to the relevant Makefile.in file (in the same | ||
529 | directory as the Config.in you chose), using the existing entries as a | ||
530 | template and the same CONFIG symbol as you used for Config.in. (Don't | ||
531 | forget "needlibm" or "needcrypt" if your applet needs libm or | ||
532 | libcrypt.)</li> | ||
533 | |||
534 | <li>Add your applet to "include/applets.h", using one of the existing | ||
535 | entries as a template. (Note: this is in alphabetical order. Applets | ||
536 | are found via binary search, and if you add an applet out of order it | ||
537 | won't work.)</li> | ||
538 | |||
539 | <li>Add your applet's runtime help text to "include/usage.h". You need | ||
540 | at least appname_trivial_usage (the minimal help text, always included | ||
541 | in the busybox binary when this applet is enabled) and appname_full_usage | ||
542 | (extra help text included in the busybox binary when | ||
543 | CONFIG_FEATURE_VERBOSE_USAGE is enabled), or it won't compile. | ||
544 | The other two help entry types (appname_example_usage and | ||
545 | appname_notes_usage) are optional. They don't take up space in the binary, | ||
546 | but instead show up in the generated documentation (BusyBox.html, | ||
547 | BusyBox.txt, and the man page BusyBox.1).</li> | ||
548 | |||
549 | <li>Run menuconfig, switch your applet on, compile, test, and fix the | ||
550 | bugs. Be sure to try both "allyesconfig" and "allnoconfig" (and | ||
551 | "allbareconfig" if relevant).</li> | ||
552 | |||
553 | </ul> | ||
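The steps above can be sketched in miniature. This is a minimal, hypothetical illustration and not busybox's actual applets.h machinery: the struct applet layout, find_applet(), and the hello and true entries are assumptions made up for the example. It shows an APPLET_main()-style entry point and why the table has to stay in alphabetical order when lookup is a binary search.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical applet entry point, named APPLET_main() as the text asks. */
static int hello_main(int argc, char **argv)
{
    (void)argv;
    printf("hello from an applet (argc=%d)\n", argc);
    return 0;
}

/* A second hypothetical applet so the table has more than one entry. */
static int true_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

/* Illustrative table entry; the real busybox structure may differ. */
struct applet {
    const char *name;
    int (*main)(int argc, char **argv);
};

/* Must stay sorted by name: bsearch() can miss out-of-order entries. */
static const struct applet applets[] = {
    { "hello", hello_main },
    { "true",  true_main  },
};

/* Compare a lookup key (a name) against one table entry. */
static int applet_cmp(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct applet *)elem)->name);
}

/* Return the matching entry, or NULL if the name is unknown. */
static const struct applet *find_applet(const char *name)
{
    assert(name != NULL);
    return bsearch(name, applets, sizeof(applets) / sizeof(applets[0]),
                   sizeof(applets[0]), applet_cmp);
}
```

A dispatcher would call find_applet() with the name it was invoked under and, on a match, transfer control through the entry's main pointer.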
554 | |||
555 | <h2><a name="standards">What standards does busybox adhere to?</a></h2> | ||
556 | |||
557 | <p>The standard we're paying attention to is the "Shell and Utilities" | ||
558 | portion of the <a href="http://www.opengroup.org/onlinepubs/009695399/">Open | ||
559 | Group Base Standards</a> (also known as the Single Unix Specification version | ||
560 | 3 or SUSv3). Note that paying attention isn't necessarily the same thing as | ||
561 | following it.</p> | ||
562 | |||
563 | <p>SUSv3 doesn't even mention things like init, mount, tar, or losetup, nor | ||
564 | commonly used options like echo's '-e' and '-n', or sed's '-i'. Busybox is | ||
565 | driven by what real users actually need, not the fact the standard believes | ||
566 | we should implement ed or sccs. For size reasons, we're unlikely to include | ||
567 | much internationalization support beyond UTF-8, and on top of all that, our | ||
568 | configuration menu lets developers chop out features to produce smaller but | ||
569 | very non-standard utilities.</p> | ||
570 | |||
571 | <p>Also, Busybox is aimed primarily at Linux. Unix standards are interesting | ||
572 | because Linux tries to adhere to them, but portability to dozens of platforms | ||
573 | is only interesting in terms of offering a restricted feature set that works | ||
574 | everywhere, not growing dozens of platform-specific extensions. Busybox | ||
575 | should be portable to all hardware platforms Linux supports, and any other | ||
576 | similar operating systems that are easy to do and won't require much | ||
577 | maintenance.</p> | ||
578 | |||
579 | <p>In practice, standards compliance tends to be a clean-up step once an | ||
580 | applet is otherwise finished. When polishing and testing a busybox applet, | ||
581 | we ensure we have at least the option of full standards compliance, or else | ||
582 | document where we (intentionally) fall short.</p> | ||
583 | |||
584 | <h2><a name="portability">Portability.</a></h2> | ||
585 | |||
586 | <p>Busybox is a Linux project, but that doesn't mean we don't have to worry | ||
587 | about portability. First of all, there are different hardware platforms, | ||
588 | different C library implementations, different versions of the kernel and | ||
589 | build toolchain... The file "include/platform.h" exists to centralize and | ||
590 | encapsulate various platform-specific things in one place, so most busybox | ||
591 | code doesn't have to care where it's running.</p> | ||
592 | |||
593 | <p>To start with, Linux runs on dozens of hardware platforms. We try to test | ||
594 | each release on x86, x86-64, arm, power pc, and mips. (Since qemu can handle | ||
595 | all of these, this isn't that hard.) This means we have to care about a number | ||
596 | of portability issues like endianness, word size, and alignment, all of which | ||
597 | belong in platform.h. That header handles conditional #includes and gives | ||
598 | us macros we can use in the rest of our code. At some point in the future | ||
599 | we might grow a platform.c, possibly even a platform subdirectory. As long | ||
600 | as the applets themselves don't have to care.</p> | ||
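As a rough illustration of the endianness and alignment concerns mentioned above, here is a minimal sketch of the kind of helper that keeps such details out of applet code. The name read_le32() is an assumption for this example; it is not claimed to exist in platform.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. Assembling the
 * value one byte at a time gives the same result on big- and little-endian
 * hosts, and never performs an unaligned 32-bit load. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Callers can pass a pointer at any offset into a buffer, which is where a plain cast to a 32-bit pointer would risk an alignment fault on some architectures.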
601 | |||
602 | <p>On a related note, we made the "default signedness of char varies" problem | ||
603 | go away by feeding the compiler -funsigned-char. This gives us consistent | ||
604 | behavior on all platforms, and defaults to 8-bit clean text processing (which | ||
605 | gets us halfway to UTF-8 support). NOMMU support is less easily separated | ||
606 | (see the tips section later in this document), but we're working on it.</p> | ||
607 | |||
608 | <p>Another type of portability is build environments: we unapologetically use | ||
609 | a number of gcc and glibc extensions (as does the Linux kernel), but these have | ||
610 | been picked up by packages like uClibc, TCC, and Intel's C Compiler. As for | ||
611 | gcc, we take advantage of newer compiler optimizations to get the smallest | ||
612 | possible size, but we also regression test against an older build environment | ||
613 | using the Red Hat 9 image at "http://busybox.net/downloads/qemu". This has a | ||
614 | 2.4 kernel, gcc 3.2, make 3.79.1, and glibc 2.3, and is the oldest | ||
615 | build/deployment environment we still put any effort into maintaining. (If | ||
616 | anyone takes an interest in older kernels you're welcome to submit patches, | ||
617 | but the effort would probably be better spent | ||
618 | <a href="http://www.selenic.com/linux-tiny/">trimming | ||
619 | down the 2.6 kernel</a>.) Older gcc versions than that are uninteresting since | ||
620 | we now use c99 features, although | ||
621 | <a href="http://fabrice.bellard.free.fr/tcc/">tcc</a> might be worth a | ||
622 | look.</p> | ||
623 | |||
624 | <p>We also test busybox against the current release of uClibc. Older versions | ||
625 | of uClibc aren't very interesting (they were buggy, and uClibc wasn't really | ||
626 | usable as a general-purpose C library before version 0.9.26 anyway).</p> | ||
627 | |||
628 | <p>Other unix implementations are mostly uninteresting, since Linux binaries | ||
629 | have become the new standard for portable Unix programs. Specifically, | ||
630 | the ubiquity of Linux was cited as the main reason the Intel Binary | ||
631 | Compatibility Standard 2 died, by the standards group organized to name a | ||
632 | successor to ibcs2: <a href="http://www.telly.org/86open/">the 86open | ||
633 | project</a>. That project disbanded in 1999 with the endorsement of an | ||
634 | existing standard: Linux ELF binaries. Since then, the major players at the | ||
635 | time (such as <a | ||
636 | href=http://www-03.ibm.com/servers/aix/products/aixos/linux/index.html>AIX</a>, <a | ||
637 | href=http://www.sun.com/software/solaris/ds/linux_interop.jsp#3>Solaris</a>, and | ||
638 | <a href=http://www.onlamp.com/pub/a/bsd/2000/03/17/linuxapps.html>FreeBSD</a>) | ||
639 | have all either grown Linux support or folded.</p> | ||
640 | |||
641 | <p>The major exceptions are newcomer MacOS X, some embedded environments | ||
642 | (such as newlib+libgloss) which provide a posix environment but not a full | ||
643 | Linux environment, and environments like Cygwin that provide only partial Linux | ||
644 | emulation. Also, some embedded Linux systems run a Linux kernel but amputate | ||
645 | things like the /proc directory to save space.</p> | ||
646 | |||
647 | <p>Supporting these systems is largely a question of providing a clean subset | ||
648 | of BusyBox's functionality -- whichever applets can easily be made to | ||
649 | work in that environment. Annotating the configuration system to | ||
650 | indicate which applets require which prerequisites (such as procfs) is | ||
651 | also welcome. Other efforts to support these systems (swapping #include | ||
652 | files to build in different environments, adding adapter code to platform.h, | ||
653 | adding more extensive special-case supporting infrastructure such as mount's | ||
654 | legacy mtab support) are handled on a case-by-case basis. Support that can be | ||
655 | cleanly hidden in platform.h is reasonably attractive, and failing that | ||
656 | support that can be cleanly separated into a separate conditionally compiled | ||
657 | file is at least worth a look. Special-case code in the body of an applet is | ||
658 | something we're trying to avoid.</p> | ||
659 | |||
660 | <h2><a name="tips">Programming tips and tricks.</a></h2> | ||
661 | |||
662 | <p>Various things busybox uses that aren't particularly well documented | ||
663 | elsewhere.</p> | ||
664 | |||
665 | <h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2> | ||
666 | |||
667 | <p>Password fields in /etc/passwd and /etc/shadow are in a special format. | ||
668 | If the first character isn't '$', then it's an old DES style password. If | ||
669 | the first character is '$' then the password is actually three fields | ||
670 | separated by '$' characters:</p> | ||
671 | <pre> | ||
672 | <b>$type$salt$encrypted_password</b> | ||
673 | </pre> | ||
674 | |||
675 | <p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p> | ||
676 | |||
677 | <p>The "salt" is a bunch of random characters (generally 8) the encryption | ||
678 | algorithm uses to perturb the password in a known and reproducible way (such | ||
679 | as by appending the random data to the unencrypted password, or combining | ||
680 | them with exclusive or). Salt is randomly generated when setting a password, | ||
681 | and then the same salt value is re-used when checking the password. (Salt is | ||
682 | thus stored unencrypted.)</p> | ||
683 | |||
684 | <p>The advantage of using salt is that the same cleartext password encrypted | ||
685 | with a different salt value produces a different encrypted value. | ||
686 | If each encrypted password uses a different salt value, an attacker is forced | ||
687 | to do the cryptographic math all over again for each password they want to | ||
688 | check. Without salt, they could simply produce a big dictionary of commonly | ||
689 | used passwords ahead of time, and look up each password in a stolen password | ||
690 | file to see if it's a known value. (Even if there are billions of possible | ||
691 | passwords in the dictionary, checking each one is just a binary search against | ||
692 | a file only a few gigabytes long.) With salt they can't even tell if two | ||
693 | different users share the same password without guessing what that password | ||
694 | is and decrypting it. They also can't precompute the attack dictionary for | ||
695 | a specific password until they know what the salt value is.</p> | ||
696 | |||
697 | <p>The third field is the encrypted password (plus the salt). For md5 this | ||
698 | is 22 bytes.</p> | ||
699 | |||
700 | <p>The busybox function to handle all this is pw_encrypt(clear, salt) in | ||
701 | "libbb/pw_encrypt.c". The first argument is the clear text password to be | ||
702 | encrypted, and the second is a string in "$type$salt$password" format, from | ||
703 | which the "type" and "salt" fields will be extracted to produce an encrypted | ||
704 | value. (Only the first two fields are needed, the third $ is equivalent to | ||
705 | the end of the string.) The return value is an encrypted password in | ||
706 | /etc/passwd format, with all three $ separated fields. It's stored in | ||
707 | a static buffer, 128 bytes long.</p> | ||
708 | |||
709 | <p>So when checking an existing password, if pw_encrypt(text, | ||
710 | old_encrypted_password) returns a string that compares identical to | ||
711 | old_encrypted_password, you've got the right password. When setting a new | ||
712 | password, generate a random 8 character salt string, put it in the right | ||
713 | format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the | ||
714 | second argument to pw_encrypt(text,buffer).</p> | ||
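The field handling described above can be sketched without touching pw_encrypt() itself. The two helpers below, make_salt_prefix() and extract_salt(), are hypothetical names invented for this example. They only build and parse the "$type$salt" text and do no encryption. Unlike the bare sprintf() in the text, the sketch uses snprintf() so an oversized salt cannot overrun the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the "$type$salt" prefix that is passed as the second argument.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int make_salt_prefix(char *out, size_t outlen, char type,
                            const char *salt)
{
    assert(out != NULL && salt != NULL);
    int n = snprintf(out, outlen, "$%c$%s", type, salt);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Copy the salt field out of a "$type$salt$hash" string.
 * Returns 0 on success, -1 if the field does not start with '$'
 * (an old DES-style entry), is malformed, or the salt does not fit. */
static int extract_salt(const char *field, char *salt, size_t saltlen)
{
    assert(field != NULL && salt != NULL);
    if (field[0] != '$')
        return -1;
    const char *start = strchr(field + 1, '$');
    if (start == NULL)
        return -1;
    start++;
    /* The salt ends at the third '$' or at the end of the string. */
    size_t len = strcspn(start, "$");
    if (len + 1 > saltlen)
        return -1;
    memcpy(salt, start, len);
    salt[len] = '\0';
    return 0;
}
```

Checking a password then amounts to encrypting the typed text with the stored field as the salt argument and comparing the result to the stored field with strcmp().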
715 | |||
716 | <h2><a name="tips_vfork">Fork and vfork</a></h2> | ||
717 | |||
718 | <p>On systems that haven't got a Memory Management Unit, fork() is unreasonably | ||
719 | expensive to implement (and sometimes even impossible), so a less capable | ||
720 | function called vfork() is used instead. (Using vfork() on a system with an | ||
721 | MMU is like pounding a nail with a wrench. Not the best tool for the job, but | ||
722 | it works.)</p> | ||
723 | |||
724 | <p>Busybox hides the difference between fork() and vfork() in | ||
725 | libbb/bb_fork_exec.c. If you ever want to fork and exec, use bb_fork_exec() | ||
726 | (which returns a pid and takes the same arguments as execve(), although in | ||
727 | this case envp can be NULL) and don't worry about it. This description is | ||
728 | here in case you want to know why that does what it does.</p> | ||
729 | |||
730 | <p>Implementing fork() depends on having a Memory Management Unit. With an | ||
731 | MMU you can simply set up a second set of page tables and share the | ||
732 | physical memory via copy-on-write. So a fork() followed quickly by exec() | ||
733 | only copies a few pages of the parent's memory, just the ones it changes | ||
734 | before freeing them.</p> | ||
735 | |||
736 | <p>With a very primitive MMU (using a base pointer plus length instead of page | ||
737 | tables, which can provide virtual addresses and protect processes from each | ||
738 | other, but no copy on write) you can still implement fork. But it's | ||
739 | unreasonably expensive, because you have to copy all the parent process' | ||
740 | memory into the new process (which could easily be several megabytes per fork). | ||
741 | And you have to do this even though that memory gets freed again as soon as the | ||
742 | exec happens. (This is not just slow and a waste of space but causes memory | ||
743 | usage spikes that can easily cause the system to run out of memory.)</p> | ||
744 | |||
745 | <p>Without even a primitive MMU, you have no virtual addresses. Every process | ||
746 | can reach out and touch any other process' memory, because all pointers are to | ||
747 | physical addresses with no protection. Even if you copy a process' memory to | ||
748 | new physical addresses, all of its pointers point to the old objects in the | ||
749 | old process. (Searching through the new copy's memory for pointers and | ||
750 | redirecting them to the new locations is not an easy problem.)</p> | ||
751 | |||
752 | <p>So with a primitive or missing MMU, fork() is just not a good idea.</p> | ||
753 | |||
754 | <p>In theory, vfork() is just a fork() that writeably shares the heap and stack | ||
755 | rather than copying it (so what one process writes the other one sees). In | ||
756 | practice, vfork() has to suspend the parent process until the child does exec, | ||
757 | at which point the parent wakes up and resumes by returning from the call to | ||
758 | vfork(). All modern kernel/libc combinations implement vfork() to put the | ||
759 | parent to sleep until the child does its exec. There's just no other way to | ||
760 | make it work: the parent has to know the child has done its exec() or exit() | ||
761 | before it's safe to return from the function it's in, so it has to block | ||
762 | until that happens. In fact without suspending the parent there's no way to | ||
763 | even store separate copies of the return value (the pid) from the vfork() call | ||
764 | itself: both assignments write into the same memory location.</p> | ||
765 | |||
766 | <p>One way to understand (and in fact implement) vfork() is this: imagine | ||
767 | the parent does a setjmp and then continues on (pretending to be the child) | ||
768 | until the exec() comes around, then the _exec_ does the actual fork, and the | ||
769 | parent does a longjmp back to the original vfork call and continues on from | ||
770 | there. (It thus becomes obvious why the child can't return, or modify | ||
771 | local variables it doesn't want the parent to see changed when it resumes.)</p> | ||
772 | |||
773 | <p>Note a common mistake: the need for vfork doesn't mean you can't have two | ||
774 | processes running at the same time. It means you can't have two processes | ||
775 | sharing the same memory without stomping all over each other. As soon as | ||
776 | the child calls exec(), the parent resumes.</p> | ||
777 | |||
778 | <p>If the child's attempt to call exec() fails, the child should call _exit() | ||
779 | rather than a normal exit(). This avoids any atexit() code that might confuse | ||
780 | the parent. (The parent should never call _exit(), only a vforked child that | ||
781 | failed to exec.)</p> | ||
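The rules above (do nothing in the child except exec, and call _exit() if the exec fails) look roughly like this in practice. This is a minimal sketch, not the code in libbb/bb_fork_exec.c. The function name spawn_and_wait() and the exit code 127 for a failed exec are assumptions chosen for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] via vfork() and exec, then return the child's exit status,
 * or -1 if the child could not be created or did not exit normally. */
static int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: it is borrowing the parent's memory, so only exec. */
        execvp(argv[0], argv);
        /* exec failed: leave without running the parent's atexit() code. */
        _exit(127);
    }

    /* Parent: resumes here only after the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the child touches no local variables between vfork() and execvp(), so the parent finds its stack exactly as it left it.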
782 | |||
783 | <p>(Now in theory, a nommu system could just copy the _stack_ when it forks | ||
784 | (which presumably is much shorter than the heap), and leave the heap shared. | ||
785 | Even with no MMU at all | ||
786 | In practice, you've just wound up in a multi-threaded situation and you can't | ||
787 | do a malloc() or free() on your heap without freeing the other process' memory | ||
788 | (and if you don't have the proper locking for being threaded, corrupting the | ||
789 | heap if both of you try to do it at the same time and wind up stomping on | ||
790 | each other while traversing the free memory lists). The thing about vfork is | ||
791 | that it's a big red flag warning "there be dragons here" rather than | ||
792 | something subtle and thus even more dangerous.)</p> | ||
793 | |||
794 | <h2><a name="tips_sort_read">Short reads and writes</a></h2> | ||
795 | |||
796 | <p>Busybox has special functions, bb_full_read() and bb_full_write(), to | ||
797 | check that all the data we asked for got read or written. Is this a real | ||
798 | world consideration? Try the following:</p> | ||
799 | |||
800 | <pre>while true; do echo hello; sleep 1; done | tee out.txt</pre> | ||
801 | |||
802 | <p>If tee is implemented with bb_full_read(), tee doesn't display output | ||
803 | in real time but blocks until its entire input buffer (generally a couple | ||
804 | kilobytes) is read, then displays it all at once. In that case, we _want_ | ||
805 | the short read, for user interface reasons. (Note that read() should never | ||
806 | return 0 unless it has hit the end of input, and an attempt to write 0 | ||
807 | bytes should be ignored by the OS.)</p> | ||
808 | |||
809 | <p>As for short writes, play around with two processes piping data to each | ||
810 | other on the command line (cat bigfile | gzip > out.gz) and suspend and | ||
811 | resume a few times (ctrl-z to suspend, "fg" to resume). The writer can | ||
812 | experience short writes, which are especially dangerous because if you don't | ||
813 | notice them you'll discard data. They can also happen when a system is under | ||
814 | load and a fast process is piping to a slower one. (Such as an xterm waiting | ||
815 | on x11 when the scheduler decides X is being a CPU hog with all that | ||
816 | text console scrolling...)</p> | ||
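A loop that guards against short writes might look like the sketch below. It is an illustration of the idea rather than the actual bb_full_write() source. The name write_all() and its return convention are assumptions for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying after short writes and EINTR.
 * Returns len on success, or -1 on any other error. */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        /* Short write: advance past what was accepted and keep going. */
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

A caller that uses plain write() and ignores the return value silently drops whatever the kernel did not accept, which is the data loss the paragraph above warns about.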
817 | |||
818 | <p>So will data always be read from the far end of a pipe at the | ||
819 | same chunk sizes it was written in? Nope. Don't rely on that. For one | ||
820 | counterexample, see <a href="http://www.faqs.org/rfcs/rfc896.html">rfc 896 | ||
821 | for Nagle's algorithm</a>, which waits a fraction of a second or so before | ||
822 | sending out small amounts of data through a TCP/IP connection in case more | ||
823 | data comes in that can be merged into the same packet. (In case you were | ||
824 | wondering why action games that use TCP/IP set TCP_NODELAY to lower the latency | ||
825 | on their sockets, now you know.)</p> | ||
826 | |||
827 | <h2><a name="tips_memory">Memory used by relocatable code, PIC, and static linking.</a></h2> | ||
828 | |||
829 | <p>The downside of standard dynamic linking is that it results in self-modifying | ||
830 | code. Although each executable's pages are mmaped() into a process' address | ||
831 | space from the executable file and are thus naturally shared between processes | ||
832 | out of the page cache, the library loader (ld-linux.so.2 or ld-uClibc.so.0) | ||
833 | writes to these pages to supply addresses for relocatable symbols. This | ||
834 | dirties the pages, triggering copy-on-write allocation of new memory for each | ||
835 | process' dirtied pages.</p> | ||
836 | |||
837 | <p>One solution to this is Position Independent Code (PIC), a way of linking | ||
838 | a file so all the relocations are grouped together. This dirties fewer | ||
839 | pages (often just a single page) for each process' relocations. The down | ||
840 | side is this results in larger executables, which take up more space on disk | ||
841 | (and a correspondingly larger space in memory). But when many copies of the | ||
842 | same program are running, PIC dynamic linking trades a larger disk footprint | ||
843 | for a smaller memory footprint, by sharing more pages.</p> | ||
844 | |||
845 | <p>A third solution is static linking. A statically linked program has no | ||
846 | relocations, and thus the entire executable is shared between all running | ||
847 | instances. This tends to have a significantly larger disk footprint, but | ||
848 | on a system with only one or two executables, shared libraries aren't much | ||
849 | of a win anyway.</p> | ||
850 | |||
851 | <p>You can tell the glibc linker to display debugging information about its | ||
852 | relocations with the environment variable "LD_DEBUG". Try | ||
853 | "LD_DEBUG=help /bin/true" for a list of commands. Learning to interpret | ||
854 | "LD_DEBUG=statistics cat /proc/self/statm" could be interesting.</p> | ||
855 | |||
856 | <p>For more on this topic, here's Rich Felker:</p> | ||
857 | <blockquote> | ||
858 | <p>Dynamic linking (without fixed load addresses) fundamentally requires | ||
859 | at least one dirty page per dso that uses symbols. Making calls (but | ||
860 | never taking the address explicitly) to functions within the same dso | ||
861 | does not require a dirty page by itself, but will with ELF unless you | ||
862 | use -Bsymbolic or hidden symbols when linking.</p> | ||
863 | |||
864 | <p>ELF uses significant additional stack space for the kernel to pass all | ||
865 | the ELF data structures to the newly created process image. These are | ||
866 | located above the argument list and environment. This normally adds 1 | ||
867 | dirty page to the process size.</p> | ||
868 | |||
869 | <p>The ELF dynamic linker has its own data segment, adding one or more | ||
870 | dirty pages. I believe it also performs relocations on itself.</p> | ||
871 | |||
872 | <p>The ELF dynamic linker makes significant dynamic allocations to manage | ||
873 | the global symbol table and the loaded dso's. This data is never | ||
874 | freed. It will be needed again if libdl is used, so unconditionally | ||
875 | freeing it is not possible, but normal programs do not use libdl. Of | ||
876 | course with glibc all programs use libdl (due to nsswitch) so the | ||
877 | issue was never addressed.</p> | ||
878 | |||
879 | <p>ELF also has the issue that segments are not page-aligned on disk. | ||
880 | This saves up to 4k on disk, but at the expense of using an additional | ||
881 | dirty page in most cases, due to a large portion of the first data | ||
882 | page being filled with a duplicate copy of the last text page.</p> | ||
883 | |||
884 | <p>The above is just a partial list of the tiny memory penalties of ELF | ||
885 | dynamic linking, which eventually add up to quite a bit. The smallest | ||
886 | I've been able to get a process down to is 8 dirty pages, and the | ||
887 | above factors seem to mostly account for it (but some were difficult | ||
888 | to measure).</p> | ||
889 | </blockquote> | ||
890 | |||
891 | <h2><a name="tips_kernel_headers"></a>Including kernel headers</h2> | ||
892 | |||
893 | <p>The "linux" or "asm" directories of /usr/include contain Linux kernel | ||
894 | headers, so that the C library can talk directly to the Linux kernel. In | ||
895 | a perfect world, applications shouldn't include these headers directly, but | ||
896 | we don't live in a perfect world.</p> | ||
897 | |||
898 | <p>For example, Busybox's losetup code wants linux/loop.h because nothing else | ||
899 | #defines the structures to call the kernel's loopback device setup ioctls. | ||
900 | Attempts to cut and paste the information into a local busybox header file | ||
901 | proved incredibly painful, because portions of the loop_info structure vary by | ||
902 | architecture, namely the type __kernel_dev_t has different sizes on alpha, | ||
903 | arm, x86, and so on. Meaning we either #include <linux/posix_types.h> or | ||
904 | we hardwire #ifdefs to check what platform we're building on and define this | ||
905 | type appropriately for every single hardware architecture supported by | ||
906 | Linux, which is simply unworkable.</p> | ||
907 | |||
908 | <p>This is aside from the fact that the relevant type defined in | ||
909 | posix_types.h was renamed to __kernel_old_dev_t during the 2.5 series, so | ||
910 | to cut and paste the structure into our header we have to #include | ||
911 | <linux/version.h> to figure out which name to use. (What we actually do is | ||
912 | check if we're building on 2.6, and if so just use the new 64 bit structure | ||
913 | instead to avoid the rename entirely.) But we still need the version | ||
914 | check, since 2.4 didn't have the 64 bit structure.</p> | ||
915 | |||
916 | <p>The BusyBox developers spent <u>two years</u> trying to figure | ||
917 | out a clean way to do all this. There isn't one. The losetup in the | ||
918 | util-linux package from kernel.org isn't doing it cleanly either, they just | ||
919 | hide the ugliness by nesting #include files. Their mount/loop.h | ||
920 | #includes "my_dev_t.h", which #includes <linux/posix_types.h> and | ||
921 | <linux/version.h> just like we do. There simply is no alternative.</p> | ||
922 | |||
923 | <p>Just because directly #including kernel headers is sometimes | ||
924 | unavoidable doesn't mean we should include them when there's a better | ||
925 | way to do it. However, block copying information out of the kernel headers | ||
926 | is not a better way.</p> | ||
927 | |||
928 | <h2><a name="who">Who are the BusyBox developers?</a></h2> | ||
929 | |||
930 | <p>The following login accounts currently exist on busybox.net. (I.E. these | ||
931 | people can commit <a href="http://busybox.net/downloads/patches">patches</a> | ||
932 | into subversion for the BusyBox, uClibc, and buildroot projects.)</p> | ||
933 | |||
934 | <pre> | ||
935 | aldot :Bernhard Fischer | ||
936 | andersen :Erik Andersen <- uClibc and BuildRoot maintainer. | ||
937 | bug1 :Glenn McGrath | ||
938 | davidm :David McCullough | ||
939 | gkajmowi :Garrett Kajmowicz <- uClibc++ maintainer | ||
940 | jbglaw :Jan-Benedict Glaw | ||
941 | jocke :Joakim Tjernlund | ||
942 | landley :Rob Landley <- BusyBox maintainer | ||
943 | lethal :Paul Mundt | ||
944 | mjn3 :Manuel Novoa III | ||
945 | osuadmin :osuadmin | ||
946 | pgf :Paul Fox | ||
947 | pkj :Peter Kjellerstedt | ||
948 | prpplague :David Anders | ||
949 | psm :Peter S. Mazinger | ||
950 | russ :Russ Dill | ||
951 | sandman :Robert Griebl | ||
952 | sjhill :Steven J. Hill | ||
953 | solar :Ned Ludd | ||
954 | timr :Tim Riker | ||
955 | tobiasa :Tobias Anderberg | ||
956 | vapier :Mike Frysinger | ||
957 | </pre> | ||
958 | |||
959 | <p>The following accounts used to exist on busybox.net, but don't anymore so | ||
960 | I can't ask /etc/passwd for their names. (If anybody would like to make | ||
961 | a stab at it...)</p> | ||
962 | |||
963 | <pre> | ||
964 | aaronl | ||
965 | beppu | ||
966 | dwhedon | ||
967 | erik : Also Erik Andersen? | ||
968 | gfeldman | ||
969 | jimg | ||
970 | kraai | ||
971 | markw | ||
972 | miles | ||
973 | proski | ||
974 | rjune | ||
975 | tausq | ||
976 | vodz :Vladimir N. Oleynik | ||
977 | </pre> | ||
978 | |||
979 | |||
365 | <br> | 980 | <br> |
366 | <br> | 981 | <br> |
367 | <br> | 982 | <br> |
diff --git a/docs/busybox.net/programming.html b/docs/busybox.net/programming.html deleted file mode 100644 index b73e6ef95..000000000 --- a/docs/busybox.net/programming.html +++ /dev/null | |||
@@ -1,584 +0,0 @@ | |||
1 | <!--#include file="header.html" --> | ||
2 | |||
3 | <h2>Rob's notes on programming busybox.</h2> | ||
4 | |||
5 | <ul> | ||
6 | <li><a href="#goals">What are the goals of busybox?</a></li> | ||
7 | <li><a href="#design">What is the design of busybox?</a></li> | ||
8 | <li><a href="#source">How is the source code organized?</a></li> | ||
9 | <ul> | ||
10 | <li><a href="#source_applets">The applet directories.</a></li> | ||
11 | <li><a href="#source_libbb">The busybox shared library (libbb)</a></li> | ||
12 | </ul> | ||
13 | <li><a href="#adding">Adding an applet to busybox</a></li> | ||
14 | <li><a href="#standards">What standards does busybox adhere to?</a></li> | ||
15 | <li><a href="#portability">Portability.</a></li> | ||
16 | <li><a href="#tips">Tips and tricks.</a></li> | ||
17 | <ul> | ||
18 | <li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li> | ||
19 | <li><a href="#tips_vfork">Fork and vfork</a></li> | ||
20 | <li><a href="#tips_short_read">Short reads and writes</a></li> | ||
21 | <li><a href="#tips_memory">Memory used by relocatable code, PIC, and static linking.</a></li> | ||
22 | <li><a href="#tips_kernel_headers">Including Linux kernel headers.</a></li> | ||
23 | </ul> | ||
24 | <li><a href="#who">Who are the BusyBox developers?</a></li> | ||
25 | </ul> | ||
26 | |||
27 | <h2><b><a name="goals">What are the goals of busybox?</a></b></h2> | ||
28 | |||
29 | <p>Busybox aims to be the smallest and simplest correct implementation of the | ||
30 | standard Linux command line tools. First and foremost, this means the | ||
31 | smallest executable size we can manage. We also want to have the simplest | ||
32 | and cleanest implementation we can manage, be <a href="#standards">standards | ||
33 | compliant</a>, minimize run-time memory usage (heap and stack), run fast, and | ||
34 | take over the world.</p> | ||
35 | |||
36 | <h2><b><a name="design">What is the design of busybox?</a></b></h2> | ||
37 | |||
38 | <p>Busybox is like a swiss army knife: one thing with many functions. | ||
39 | The busybox executable can act like many different programs depending on | ||
40 | the name used to invoke it. Normal practice is to create a bunch of symlinks | ||
41 | pointing to the busybox binary, each of which triggers a different busybox | ||
42 | function. (See <a href="FAQ.html#getting_started">getting started</a> in the | ||
43 | FAQ for more information on usage, and <a href="BusyBox.html">the | ||
44 | busybox documentation</a> for a list of symlink names and what they do.)</p> | ||
45 | |||
46 | <p>The "one binary to rule them all" approach is primarily for size reasons: a | ||
47 | single multi-purpose executable is smaller than many small files could be. | ||
48 | This way busybox only has one set of ELF headers, it can easily share code | ||
49 | between different apps even when statically linked, it has better packing | ||
50 | efficiency by avoiding gaps between files or compression dictionary resets, | ||
51 | and so on.</p> | ||
52 | |||
53 | <p>Work is underway on new options such as "make standalone" to build separate | ||
54 | binaries for each applet, and a "libbb.so" to make the busybox common code | ||
55 | available as a shared library. Neither is ready yet at the time of this | ||
56 | writing.</p> | ||
57 | |||
58 | <a name="source"></a> | ||
59 | |||
60 | <h2><a name="source_applets"><b>The applet directories</b></a></h2> | ||
61 | |||
62 | <p>The directory "applets" contains the busybox startup code (applets.c and | ||
63 | busybox.c), and several subdirectories containing the code for the individual | ||
64 | applets.</p> | ||
65 | |||
66 | <p>Busybox execution starts with the main() function in applets/busybox.c, | ||
67 | which sets the global variable bb_applet_name to argv[0] and calls | ||
68 | run_applet_by_name() in applets/applets.c. That uses the applets[] array | ||
69 | (defined in include/busybox.h and filled out in include/applets.h) to | ||
70 | transfer control to the appropriate APPLET_main() function (such as | ||
71 | cat_main() or sed_main()). The individual applet takes it from there.</p> | ||
72 | |||
73 | <p>This is why calling busybox under a different name triggers different | ||
74 | functionality: main() looks up argv[0] in applets[] to get a function pointer | ||
75 | to APPLET_main().</p> | ||
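<p>The lookup described above can be sketched roughly as follows. This is a
minimal illustration, not the code in applets/applets.c: the struct layout,
the *_sketch names, the three placeholder applets, and the stripping of a
leading directory from argv[0] are all assumptions made for the example. It
does rely on the table being sorted by name, since it uses a binary search.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: a name and the function that implements it. */
struct applet_sketch {
    const char *name;
    int (*main)(int argc, char **argv);
};

/* Placeholder applets; real ones would do actual work. */
static int cat_main_sketch(int argc, char **argv)  { (void)argc; (void)argv; return 0; }
static int echo_main_sketch(int argc, char **argv) { (void)argc; (void)argv; return 0; }
static int sed_main_sketch(int argc, char **argv)  { (void)argc; (void)argv; return 0; }

/* Must stay in alphabetical order, or bsearch() will miss entries. */
static const struct applet_sketch applets_sketch[] = {
    { "cat",  cat_main_sketch  },
    { "echo", echo_main_sketch },
    { "sed",  sed_main_sketch  },
};

static int compare_name(const void *key, const void *elem)
{
    return strcmp(key, ((const struct applet_sketch *)elem)->name);
}

/* Look up the name the program was invoked as; NULL if unknown. */
const struct applet_sketch *find_applet_sketch(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = base ? base + 1 : argv0;   /* assumption: ignore any directory part */
    return bsearch(base, applets_sketch,
                   sizeof(applets_sketch) / sizeof(applets_sketch[0]),
                   sizeof(applets_sketch[0]), compare_name);
}

/* Transfer control to the matching function, or return 127 if none matches. */
int run_applet_sketch(int argc, char **argv)
{
    const struct applet_sketch *a = find_applet_sketch(argv[0]);

    return a ? a->main(argc, argv) : 127;
}
```

<p>A real main() would simply return run_applet_sketch(argc, argv), so a
symlink named "sed" pointing at the binary ends up in sed_main_sketch().</p>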
76 | |||
77 | <p>Busybox applets may also be invoked through the multiplexor applet | ||
78 | "busybox" (see busybox_main() in applets/busybox.c), and through the | ||
79 | standalone shell (grep for STANDALONE_SHELL in applets/shell/*.c). | ||
80 | See <a href="FAQ.html#getting_started">getting started</a> in the | ||
81 | FAQ for more information on these alternate usage mechanisms, which are | ||
82 | just different ways to reach the relevant APPLET_main() function.</p> | ||
83 | |||
84 | <p>The applet subdirectories (archival, console-tools, coreutils, | ||
85 | debianutils, e2fsprogs, editors, findutils, init, loginutils, miscutils, | ||
86 | modutils, networking, procps, shell, sysklogd, and util-linux) correspond | ||
87 | to the configuration sub-menus in menuconfig. Each subdirectory contains the | ||
88 | code to implement the applets in that sub-menu, as well as a Config.in | ||
89 | file defining that configuration sub-menu (with dependencies and help text | ||
90 | for each applet), and the makefile segment (Makefile.in) for that | ||
91 | subdirectory.</p> | ||
92 | |||
93 | <p>The run-time --help is stored in usage_messages[], which is initialized at | ||
94 | the start of applets/applets.c and gets its help text from usage.h. During the | ||
95 | build this help text is also used to generate the BusyBox documentation (in | ||
96 | html, txt, and man page formats) in the docs directory. See | ||
97 | <a href="#adding">adding an applet to busybox</a> for more | ||
98 | information.</p> | ||
99 | |||
100 | <h2><a name="source_libbb"><b>libbb</b></a></h2> | ||
101 | |||
102 | <p>Most non-setup code shared between busybox applets lives in the libbb | ||
103 | directory. It's a mess that evolved over the years without much auditing | ||
104 | or cleanup. For anybody looking for a great project to break into busybox | ||
105 | development with, documenting libbb would be both incredibly useful and good | ||
106 | experience.</p> | ||
107 | |||
108 | <p>Common themes in libbb include allocation functions that test | ||
109 | for failure and abort the program with an error message so the caller doesn't | ||
110 | have to test the return value (xmalloc(), xstrdup(), etc), wrapped versions | ||
111 | of open(), close(), read(), and write() that test for their own failures | ||
112 | and/or retry automatically, linked list management functions (llist.c), | ||
113 | command line argument parsing (getopt_ulflags.c), and a whole lot more.</p> | ||
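<p>To illustrate the "abort on failure so the caller doesn't have to check"
idea, here is a minimal sketch. The function names and the error message are
made up for this example and are not copied from libbb; the real xmalloc() and
xstrdup() may differ in detail.</p>

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate memory or exit; the caller never sees a NULL result
 * for a nonzero size. */
void *xmalloc_sketch(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0) {
        fputs("memory exhausted\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Duplicate a string using the checked allocator above. */
char *xstrdup_sketch(const char *s)
{
    size_t len = strlen(s) + 1;   /* include the terminating NUL */

    return memcpy(xmalloc_sketch(len), s, len);
}
```

<p>Callers can then write char *copy = xstrdup_sketch(name); and use the
result directly, which is where the code size saving comes from.</p>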
114 | |||
115 | <h2><a name="adding"><b>Adding an applet to busybox</b></a></h2> | ||
116 | |||
117 | <p>To add a new applet to busybox, first pick a name for the applet and | ||
118 | a corresponding CONFIG_NAME. Then do this:</p> | ||
119 | |||
120 | <ul> | ||
121 | <li>Figure out where in the busybox source tree your applet best fits, | ||
122 | and put your source code there. Be sure to use APPLET_main() instead | ||
123 | of main(), where APPLET is the name of your applet.</li> | ||
124 | |||
125 | <li>Add your applet to the relevant Config.in file (which file you add | ||
126 | it to determines where it shows up in "make menuconfig"). This uses | ||
127 | the same general format as the linux kernel's configuration system.</li> | ||
128 | |||
129 | <li>Add your applet to the relevant Makefile.in file (in the same | ||
130 | directory as the Config.in you chose), using the existing entries as a | ||
131 | template and the same CONFIG symbol as you used for Config.in. (Don't | ||
132 | forget "needlibm" or "needcrypt" if your applet needs libm or | ||
133 | libcrypt.)</li> | ||
134 | |||
135 | <li>Add your applet to "include/applets.h", using one of the existing | ||
136 | entries as a template. (Note: this is in alphabetical order. Applets | ||
137 | are found via binary search, and if you add an applet out of order it | ||
138 | won't work.)</li> | ||
139 | |||
140 | <li>Add your applet's runtime help text to "include/usage.h". You need | ||
141 | at least appname_trivial_usage (the minimal help text, always included | ||
142 | in the busybox binary when this applet is enabled) and appname_full_usage | ||
143 | (extra help text included in the busybox binary when | ||
144 | CONFIG_FEATURE_VERBOSE_USAGE is enabled), or it won't compile. | ||
145 | The other two help entry types (appname_example_usage and | ||
146 | appname_notes_usage) are optional. They don't take up space in the binary, | ||
147 | but instead show up in the generated documentation (BusyBox.html, | ||
148 | BusyBox.txt, and the man page BusyBox.1).</li> | ||
149 | |||
150 | <li>Run menuconfig, switch your applet on, compile, test, and fix the | ||
151 | bugs. Be sure to try both "allyesconfig" and "allnoconfig" (and | ||
152 | "allbareconfig" if relevant).</li> | ||
153 | |||
154 | </ul> | ||
155 | |||
156 | <h2><a name="standards">What standards does busybox adhere to?</a></h2> | ||
157 | |||
158 | <p>The standard we're paying attention to is the "Shell and Utilities" | ||
159 | portion of the <a href="http://www.opengroup.org/onlinepubs/009695399/">Open | ||
160 | Group Base Standards</a> (also known as the Single Unix Specification version | ||
161 | 3 or SUSv3). Note that paying attention isn't necessarily the same thing as | ||
162 | following it.</p> | ||
163 | |||
164 | <p>SUSv3 doesn't even mention things like init, mount, tar, or losetup, nor | ||
165 | commonly used options like echo's '-e' and '-n', or sed's '-i'. Busybox is | ||
166 | driven by what real users actually need, not the fact the standard believes | ||
167 | we should implement ed or sccs. For size reasons, we're unlikely to include | ||
168 | much internationalization support beyond UTF-8, and on top of all that, our | ||
169 | configuration menu lets developers chop out features to produce smaller but | ||
170 | very non-standard utilities.</p> | ||
171 | |||
172 | <p>Also, Busybox is aimed primarily at Linux. Unix standards are interesting | ||
173 | because Linux tries to adhere to them, but portability to dozens of platforms | ||
174 | is only interesting in terms of offering a restricted feature set that works | ||
175 | everywhere, not growing dozens of platform-specific extensions. Busybox | ||
176 | should be portable to all hardware platforms Linux supports, and any other | ||
177 | similar operating systems that are easy to do and won't require much | ||
178 | maintenance.</p> | ||
179 | |||
180 | <p>In practice, standards compliance tends to be a clean-up step once an | ||
181 | applet is otherwise finished. When polishing and testing a busybox applet, | ||
182 | we ensure we have at least the option of full standards compliance, or else | ||
183 | document where we (intentionally) fall short.</p> | ||
184 | |||
185 | <h2><a name="portability">Portability.</a></h2> | ||
186 | |||
187 | <p>Busybox is a Linux project, but that doesn't mean we don't have to worry | ||
188 | about portability. First of all, there are different hardware platforms, | ||
189 | different C library implementations, different versions of the kernel and | ||
190 | build toolchain... The file "include/platform.h" exists to centralize and | ||
191 | encapsulate various platform-specific things in one place, so most busybox | ||
192 | code doesn't have to care where it's running.</p> | ||
193 | |||
194 | <p>To start with, Linux runs on dozens of hardware platforms. We try to test | ||
195 | each release on x86, x86-64, arm, PowerPC, and mips. (Since qemu can handle | ||
196 | all of these, this isn't that hard.) This means we have to care about a number | ||
197 | of portability issues like endianness, word size, and alignment, all of which | ||
198 | belong in platform.h. That header handles conditional #includes and gives | ||
199 | us macros we can use in the rest of our code. At some point in the future | ||
200 | we might grow a platform.c, possibly even a platform subdirectory. As long | ||
201 | as the applets themselves don't have to care.</p> | ||
202 | |||
203 | <p>On a related note, we made the "default signedness of char varies" problem | ||
204 | go away by feeding the compiler -funsigned-char. This gives us consistent | ||
205 | behavior on all platforms, and defaults to 8-bit clean text processing (which | ||
206 | gets us halfway to UTF-8 support). NOMMU support is less easily separated | ||
207 | (see the tips section later in this document), but we're working on it.</p> | ||
208 | |||
209 | <p>Another type of portability is build environments: we unapologetically use | ||
210 | a number of gcc and glibc extensions (as does the Linux kernel), but these have | ||
211 | been picked up by packages like uClibc, TCC, and Intel's C Compiler. As for | ||
212 | gcc, we take advantage of newer compiler optimizations to get the smallest | ||
213 | possible size, but we also regression test against an older build environment | ||
214 | using the Red Hat 9 image at "http://busybox.net/downloads/qemu". This has a | ||
215 | 2.4 kernel, gcc 3.2, make 3.79.1, and glibc 2.3, and is the oldest | ||
216 | build/deployment environment we still put any effort into maintaining. (If | ||
217 | anyone takes an interest in older kernels you're welcome to submit patches, | ||
218 | but the effort would probably be better spent | ||
219 | <a href="http://www.selenic.com/linux-tiny/">trimming | ||
220 | down the 2.6 kernel</a>.) Older gcc versions than that are uninteresting since | ||
221 | we now use c99 features, although | ||
222 | <a href="http://fabrice.bellard.free.fr/tcc/">tcc</a> might be worth a | ||
223 | look.</p> | ||
224 | |||
225 | <p>We also test busybox against the current release of uClibc. Older versions | ||
226 | of uClibc aren't very interesting (they were buggy, and uClibc wasn't really | ||
227 | usable as a general-purpose C library before version 0.9.26 anyway).</p> | ||
228 | |||
229 | <p>Other unix implementations are mostly uninteresting, since Linux binaries | ||
230 | have become the new standard for portable Unix programs. Specifically, | ||
231 | the ubiquity of Linux was cited as the main reason the Intel Binary | ||
232 | Compatibility Standard 2 died, by the standards group organized to name a | ||
233 | successor to ibcs2: <a href="http://www.telly.org/86open/">the 86open | ||
234 | project</a>. That project disbanded in 1999 with the endorsement of an | ||
235 | existing standard: Linux ELF binaries. Since then, the major players at the | ||
236 | time (such as <a | ||
237 | href="http://www-03.ibm.com/servers/aix/products/aixos/linux/index.html">AIX</a>, <a | ||
238 | href="http://www.sun.com/software/solaris/ds/linux_interop.jsp#3">Solaris</a>, and | ||
239 | <a href="http://www.onlamp.com/pub/a/bsd/2000/03/17/linuxapps.html">FreeBSD</a>) | ||
240 | have all either grown Linux support or folded.</p> | ||
241 | |||
242 | <p>The major exceptions are newcomer MacOS X, some embedded environments | ||
243 | (such as newlib+libgloss) which provide a posix environment but not a full | ||
244 | Linux environment, and environments like Cygwin that provide only partial Linux | ||
245 | emulation. Also, some embedded Linux systems run a Linux kernel but amputate | ||
246 | things like the /proc directory to save space.</p> | ||
247 | |||
248 | <p>Supporting these systems is largely a question of providing a clean subset | ||
249 | of BusyBox's functionality -- whichever applets can easily be made to | ||
250 | work in that environment. Annotating the configuration system to | ||
251 | indicate which applets require which prerequisites (such as procfs) is | ||
252 | also welcome. Other efforts to support these systems (swapping #include | ||
253 | files to build in different environments, adding adapter code to platform.h, | ||
254 | adding more extensive special-case supporting infrastructure such as mount's | ||
255 | legacy mtab support) are handled on a case-by-case basis. Support that can be | ||
256 | cleanly hidden in platform.h is reasonably attractive, and failing that | ||
257 | support that can be cleanly separated into a separate conditionally compiled | ||
258 | file is at least worth a look. Special-case code in the body of an applet is | ||
259 | something we're trying to avoid.</p> | ||
260 | |||
261 | <h2><a name="tips">Programming tips and tricks.</a></h2> | ||
262 | |||
263 | <p>Various things busybox uses that aren't particularly well documented | ||
264 | elsewhere.</p> | ||
265 | |||
266 | <h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2> | ||
267 | |||
268 | <p>Password fields in /etc/passwd and /etc/shadow are in a special format. | ||
269 | If the first character isn't '$', then it's an old DES style password. If | ||
270 | the first character is '$' then the password is actually three fields | ||
271 | separated by '$' characters:</p> | ||
272 | <pre> | ||
273 | <b>$type$salt$encrypted_password</b> | ||
274 | </pre> | ||
275 | |||
276 | <p>The "type" indicates which encryption algorithm to use: 1 for MD5, with other values (such as 2a for Blowfish-based bcrypt) available only where the C library supports them.</p> | ||
277 | |||
278 | <p>The "salt" is a bunch of random characters (generally 8) the encryption | ||
279 | algorithm uses to perturb the password in a known and reproducible way (such | ||
280 | as by appending the random data to the unencrypted password, or combining | ||
281 | them with exclusive or). Salt is randomly generated when setting a password, | ||
282 | and then the same salt value is re-used when checking the password. (Salt is | ||
283 | thus stored unencrypted.)</p> | ||
284 | |||
285 | <p>The advantage of using salt is that the same cleartext password encrypted | ||
286 | with a different salt value produces a different encrypted value. | ||
287 | If each encrypted password uses a different salt value, an attacker is forced | ||
288 | to do the cryptographic math all over again for each password they want to | ||
289 | check. Without salt, they could simply produce a big dictionary of commonly | ||
290 | used passwords ahead of time, and look up each password in a stolen password | ||
291 | file to see if it's a known value. (Even if there are billions of possible | ||
292 | passwords in the dictionary, checking each one is just a binary search against | ||
293 | a file only a few gigabytes long.) With salt they can't even tell if two | ||
294 | different users share the same password without guessing what that password | ||
295 | is and decrypting it. They also can't precompute the attack dictionary for | ||
296 | a specific password until they know what the salt value is.</p> | ||
297 | |||
298 | <p>The third field is the encrypted password (plus the salt). For md5 this | ||
299 | is 22 bytes.</p> | ||
300 | |||
301 | <p>The busybox function to handle all this is pw_encrypt(clear, salt) in | ||
302 | "libbb/pw_encrypt.c". The first argument is the clear text password to be | ||
303 | encrypted, and the second is a string in "$type$salt$password" format, from | ||
304 | which the "type" and "salt" fields will be extracted to produce an encrypted | ||
305 | value. (Only the first two fields are needed, the third $ is equivalent to | ||
306 | the end of the string.) The return value is an encrypted password in | ||
307 | /etc/passwd format, with all three $ separated fields. It's stored in | ||
308 | a static buffer, 128 bytes long.</p> | ||
309 | |||
310 | <p>So when checking an existing password, if pw_encrypt(text, | ||
311 | old_encrypted_password) returns a string that compares identical to | ||
312 | old_encrypted_password, you've got the right password. When setting a new | ||
313 | password, generate a random 8 character salt string, put it in the right | ||
314 | format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the | ||
315 | second argument to pw_encrypt(text,buffer).</p> | ||
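<p>The two steps above might look something like this in C. This is only a
sketch: make_salt_arg_sketch() and password_matches_sketch() are hypothetical
helpers, and the hashing function is passed in as a pointer (an assumption made
so the example stands alone) rather than calling pw_encrypt() directly. Any
function with the same (clear, salt) calling convention would fit.</p>

```c
#include <stdio.h>
#include <string.h>

/* Same shape as pw_encrypt(clear, salt): returns the full
 * "$type$salt$encrypted" string, or NULL on failure. */
typedef char *(*encrypt_fn_sketch)(const char *clear, const char *salt);

/* Build the "$type$salt" string used when setting a new password.
 * Returns 0 on success, -1 if the result would not fit in buf. */
int make_salt_arg_sketch(char *buf, size_t size, char type, const char *salt)
{
    int n = snprintf(buf, size, "$%c$%s", type, salt);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Check a typed-in password against a stored "$type$salt$encrypted" entry.
 * The stored entry doubles as the salt argument, since the type and salt
 * fields are read from its first two fields. */
int password_matches_sketch(encrypt_fn_sketch encrypt,
                            const char *clear, const char *stored)
{
    const char *result = encrypt(clear, stored);

    return result != NULL && strcmp(result, stored) == 0;
}
```

<p>Using snprintf() instead of sprintf() here is a deliberate choice, so an
overlong salt is reported instead of overrunning the buffer.</p>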
316 | |||
317 | <h2><a name="tips_vfork">Fork and vfork</a></h2> | ||
318 | |||
319 | <p>On systems that haven't got a Memory Management Unit, fork() is unreasonably | ||
320 | expensive to implement (and sometimes even impossible), so a less capable | ||
321 | function called vfork() is used instead. (Using vfork() on a system with an | ||
322 | MMU is like pounding a nail with a wrench. Not the best tool for the job, but | ||
323 | it works.)</p> | ||
324 | |||
325 | <p>Busybox hides the difference between fork() and vfork() in | ||
326 | libbb/bb_fork_exec.c. If you ever want to fork and exec, use bb_fork_exec() | ||
327 | (which returns a pid and takes the same arguments as execve(), although in | ||
328 | this case envp can be NULL) and don't worry about it. This description is | ||
329 | here in case you want to know why that does what it does.</p> | ||
330 | |||
331 | <p>Implementing fork() depends on having a Memory Management Unit. With an | ||
332 | MMU you can simply set up a second set of page tables and share the | ||
333 | physical memory via copy-on-write. So a fork() followed quickly by exec() | ||
334 | only copies a few pages of the parent's memory, just the ones it changes | ||
335 | before freeing them.</p> | ||
336 | |||
337 | <p>With a very primitive MMU (using a base pointer plus length instead of page | ||
338 | tables, which can provide virtual addresses and protect processes from each | ||
339 | other, but no copy on write) you can still implement fork. But it's | ||
340 | unreasonably expensive, because you have to copy all the parent process' | ||
341 | memory into the new process (which could easily be several megabytes per fork). | ||
342 | And you have to do this even though that memory gets freed again as soon as the | ||
343 | exec happens. (This is not just slow and a waste of space but causes memory | ||
344 | usage spikes that can easily cause the system to run out of memory.)</p> | ||
345 | |||
346 | <p>Without even a primitive MMU, you have no virtual addresses. Every process | ||
347 | can reach out and touch any other process' memory, because all pointers are to | ||
348 | physical addresses with no protection. Even if you copy a process' memory to | ||
349 | new physical addresses, all of its pointers point to the old objects in the | ||
350 | old process. (Searching through the new copy's memory for pointers and | ||
351 | redirecting them to the new locations is not an easy problem.)</p> | ||
352 | |||
353 | <p>So with a primitive or missing MMU, fork() is just not a good idea.</p> | ||
354 | |||
355 | <p>In theory, vfork() is just a fork() that writeably shares the heap and stack | ||
356 | rather than copying it (so what one process writes the other one sees). In | ||
357 | practice, vfork() has to suspend the parent process until the child does exec, | ||
358 | at which point the parent wakes up and resumes by returning from the call to | ||
359 | vfork(). All modern kernel/libc combinations implement vfork() to put the | ||
360 | parent to sleep until the child does its exec. There's just no other way to | ||
361 | make it work: the parent has to know the child has done its exec() or exit() | ||
362 | before it's safe to return from the function it's in, so it has to block | ||
363 | until that happens. In fact without suspending the parent there's no way to | ||
364 | even store separate copies of the return value (the pid) from the vfork() call | ||
365 | itself: both assignments write into the same memory location.</p> | ||
366 | |||
367 | <p>One way to understand (and in fact implement) vfork() is this: imagine | ||
368 | the parent does a setjmp and then continues on (pretending to be the child) | ||
369 | until the exec() comes around, then the _exec_ does the actual fork, and the | ||
370 | parent does a longjmp back to the original vfork call and continues on from | ||
371 | there. (It thus becomes obvious why the child can't return, or modify | ||
372 | local variables it doesn't want the parent to see changed when it resumes.)</p> | ||
373 | |||
374 | <p>Note a common mistake: the need for vfork doesn't mean you can't have two | ||
375 | processes running at the same time. It means you can't have two processes | ||
376 | sharing the same memory without stomping all over each other. As soon as | ||
377 | the child calls exec(), the parent resumes.</p> | ||
378 | |||
379 | <p>If the child's attempt to call exec() fails, the child should call _exit() | ||
380 | rather than a normal exit(). This avoids any atexit() code that might confuse | ||
381 | the parent. (The parent should never call _exit(), only a vforked child that | ||
382 | failed to exec.)</p> | ||
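<p>Putting the rules above together, a vfork-and-exec helper could be sketched
like this. It is not the code in libbb/bb_fork_exec.c; the function name, the
exit code 127 for a failed exec, and the decision to wait for the child right
away are all assumptions made for illustration.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run argv[0] with the given arguments and wait for it to finish.
 * Returns the child's exit status, or -1 if it could not be started
 * or did not exit normally. */
int spawn_and_wait_sketch(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: shares memory with the suspended parent, so it touches
         * nothing and goes straight to exec. */
        execve(argv[0], argv, environ);
        /* exec failed: leave with _exit() so no atexit() handlers run. */
        _exit(127);
    }

    /* Parent: resumes here once the child has called exec or _exit. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

<p>Note that the child never returns from the function and never writes to a
local variable, which is what keeps the parent's stack intact when it wakes
up.</p>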
383 | |||
384 | <p>(Now in theory, a nommu system could just copy the _stack_ when it forks | ||
385 | (which presumably is much shorter than the heap), and leave the heap shared, | ||
386 | even with no MMU at all. | ||
387 | In practice, you've just wound up in a multi-threaded situation and you can't | ||
388 | do a malloc() or free() on your heap without freeing the other process' memory | ||
389 | (and if you don't have the proper locking for being threaded, corrupting the | ||
390 | heap if both of you try to do it at the same time and wind up stomping on | ||
391 | each other while traversing the free memory lists). The thing about vfork is | ||
392 | that it's a big red flag warning "there be dragons here" rather than | ||
393 | something subtle and thus even more dangerous.)</p> | ||
394 | |||
395 | <h2><a name="tips_short_read">Short reads and writes</a></h2> | ||
396 | |||
397 | <p>Busybox has special functions, bb_full_read() and bb_full_write(), to | ||
398 | check that all the data we asked for got read or written. Is this a real | ||
399 | world consideration? Try the following:</p> | ||
400 | |||
401 | <pre>while true; do echo hello; sleep 1; done | tee out.txt</pre> | ||
402 | |||
403 | <p>If tee is implemented with bb_full_read(), tee doesn't display output | ||
404 | in real time but blocks until its entire input buffer (generally a couple | ||
405 | kilobytes) is read, then displays it all at once. In that case, we _want_ | ||
406 | the short read, for user interface reasons. (Note that read() should never | ||
407 | return 0 unless it has hit the end of input, and an attempt to write 0 | ||
408 | bytes should be ignored by the OS.)</p> | ||
409 | |||
410 | <p>As for short writes, play around with two processes piping data to each | ||
411 | other on the command line (cat bigfile | gzip > out.gz) and suspend and | ||
412 | resume a few times (ctrl-z to suspend, "fg" to resume). The writer can | ||
413 | experience short writes, which are especially dangerous because if you don't | ||
414 | notice them you'll discard data. They can also happen when a system is under | ||
415 | load and a fast process is piping to a slower one. (Such as an xterm waiting | ||
416 | on x11 when the scheduler decides X is being a CPU hog with all that | ||
417 | text console scrolling...)</p> | ||
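<p>A loop that copes with short reads and writes can be sketched as below.
The names full_write_sketch() and full_read_sketch() are made up for this
example and the real bb_full_write() and bb_full_read() may behave differently,
for instance in how they treat interrupted system calls (retrying on EINTR is
an assumption here).</p>

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all len bytes are out.
 * Returns len on success, -1 on error. */
ssize_t full_write_sketch(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, try again */
            return -1;
        }
        done += (size_t)n;         /* short write: loop for the rest */
    }
    return (ssize_t)done;
}

/* Keep calling read() until len bytes arrive or input ends.
 * Returns the number of bytes read (less than len only at end of
 * input), or -1 on error. */
ssize_t full_read_sketch(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                 /* end of input */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

<p>The read version is exactly what the tee example should not use, since it
blocks until the buffer fills; the write version is what protects against
silently dropped data.</p>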
418 | |||
419 | <p>So will data always be read from the far end of a pipe at the | ||
420 | same chunk sizes it was written in? Nope. Don't rely on that. For one | ||
421 | counterexample, see <a href="http://www.faqs.org/rfcs/rfc896.html">rfc 896 | ||
422 | for Nagle's algorithm</a>, which waits a fraction of a second or so before | ||
423 | sending out small amounts of data through a TCP/IP connection in case more | ||
424 | data comes in that can be merged into the same packet. (In case you were | ||
425 | wondering why action games that use TCP/IP set TCP_NODELAY to lower the latency | ||
426 | on their sockets, now you know.)</p> | ||
427 | |||
428 | <h2><a name="tips_memory">Memory used by relocatable code, PIC, and static linking.</a></h2> | ||
429 | |||
430 | <p>The downside of standard dynamic linking is that it results in self-modifying | ||
431 | code. Although each executable's pages are mmaped() into a process' address | ||
432 | space from the executable file and are thus naturally shared between processes | ||
433 | out of the page cache, the library loader (ld-linux.so.2 or ld-uClibc.so.0) | ||
434 | writes to these pages to supply addresses for relocatable symbols. This | ||
435 | dirties the pages, triggering copy-on-write allocation of new memory for each | ||
436 | process' dirtied pages.</p> | ||
437 | |||
438 | <p>One solution to this is Position Independent Code (PIC), a way of linking | ||
439 | a file so all the relocations are grouped together. This dirties fewer | ||
440 | pages (often just a single page) for each process' relocations. The down | ||
441 | side is this results in larger executables, which take up more space on disk | ||
442 | (and a correspondingly larger space in memory). But when many copies of the | ||
443 | same program are running, PIC dynamic linking trades a larger disk footprint | ||
444 | for a smaller memory footprint, by sharing more pages.</p> | ||
445 | |||
446 | <p>A third solution is static linking. A statically linked program has no | ||
447 | relocations, and thus the entire executable is shared between all running | ||
448 | instances. This tends to have a significantly larger disk footprint, but | ||
449 | on a system with only one or two executables, shared libraries aren't much | ||
450 | of a win anyway.</p> | ||
451 | |||
452 | <p>You can tell the glibc linker to display debugging information about its | ||
453 | relocations with the environment variable "LD_DEBUG". Try | ||
454 | "LD_DEBUG=help /bin/true" for a list of commands. Learning to interpret | ||
455 | "LD_DEBUG=statistics cat /proc/self/statm" could be interesting.</p> | ||
456 | |||
457 | <p>For more on this topic, here's Rich Felker:</p> | ||
458 | <blockquote> | ||
459 | <p>Dynamic linking (without fixed load addresses) fundamentally requires | ||
460 | at least one dirty page per dso that uses symbols. Making calls (but | ||
461 | never taking the address explicitly) to functions within the same dso | ||
462 | does not require a dirty page by itself, but will with ELF unless you | ||
463 | use -Bsymbolic or hidden symbols when linking.</p> | ||
464 | |||
465 | <p>ELF uses significant additional stack space for the kernel to pass all | ||
466 | the ELF data structures to the newly created process image. These are | ||
467 | located above the argument list and environment. This normally adds 1 | ||
468 | dirty page to the process size.</p> | ||
469 | |||
470 | <p>The ELF dynamic linker has its own data segment, adding one or more | ||
471 | dirty pages. I believe it also performs relocations on itself.</p> | ||
472 | |||
473 | <p>The ELF dynamic linker makes significant dynamic allocations to manage | ||
474 | the global symbol table and the loaded dso's. This data is never | ||
475 | freed. It will be needed again if libdl is used, so unconditionally | ||
476 | freeing it is not possible, but normal programs do not use libdl. Of | ||
477 | course with glibc all programs use libdl (due to nsswitch) so the | ||
478 | issue was never addressed.</p> | ||
479 | |||
480 | <p>ELF also has the issue that segments are not page-aligned on disk. | ||
481 | This saves up to 4k on disk, but at the expense of using an additional | ||
482 | dirty page in most cases, due to a large portion of the first data | ||
483 | page being filled with a duplicate copy of the last text page.</p> | ||
484 | |||
485 | <p>The above is just a partial list of the tiny memory penalties of ELF | ||
486 | dynamic linking, which eventually add up to quite a bit. The smallest | ||
487 | I've been able to get a process down to is 8 dirty pages, and the | ||
488 | above factors seem to mostly account for it (but some were difficult | ||
489 | to measure).</p> | ||
490 | </blockquote> | ||
491 | |||
492 | <h2><a name="tips_kernel_headers"></a>Including kernel headers</h2> | ||
493 | |||
494 | <p>The "linux" or "asm" directories of /usr/include contain Linux kernel | ||
495 | headers, so that the C library can talk directly to the Linux kernel. In | ||
496 | a perfect world, applications shouldn't include these headers directly, but | ||
497 | we don't live in a perfect world.</p> | ||
498 | |||
499 | <p>For example, Busybox's losetup code wants linux/loop.h because nothing else | ||
500 | #defines the structures to call the kernel's loopback device setup ioctls. | ||
501 | Attempts to cut and paste the information into a local busybox header file | ||
502 | proved incredibly painful, because portions of the loop_info structure vary by | ||
503 | architecture, namely the type __kernel_dev_t has different sizes on alpha, | ||
504 | arm, x86, and so on. Meaning we either #include &lt;linux/posix_types.h&gt; or | ||
505 | we hardwire #ifdefs to check what platform we're building on and define this | ||
506 | type appropriately for every single hardware architecture supported by | ||
507 | Linux, which is simply unworkable.</p> | ||
508 | |||
509 | <p>This is aside from the fact that the relevant type defined in | ||
510 | posix_types.h was renamed to __kernel_old_dev_t during the 2.5 series, so | ||
511 | to cut and paste the structure into our header we have to #include | ||
512 | &lt;linux/version.h&gt; to figure out which name to use. (What we actually do is | ||
513 | check if we're building on 2.6, and if so just use the new 64 bit structure | ||
514 | instead to avoid the rename entirely.) But we still need the version | ||
515 | check, since 2.4 didn't have the 64 bit structure.</p> | ||
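<p>The version check mentioned above boils down to a preprocessor test
against the macros in linux/version.h. The sketch below shows only that test;
the function name is invented for the example, and the actual losetup code
wraps structure definitions in the conditional rather than returning a
flag.</p>

```c
#include <linux/version.h>

/* Returns 1 when compiled against 2.6 or newer kernel headers (where the
 * 64 bit loop structure is available), 0 for older headers. */
int built_against_2_6_headers_sketch(void)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
    return 1;
#else
    return 0;
#endif
}
```

<p>Because the decision is made at compile time, it reflects the headers the
binary was built against, not the kernel it later runs on.</p>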
516 | |||
517 | <p>The BusyBox developers spent <u>two years</u> trying to figure | ||
518 | out a clean way to do all this. There isn't one. The losetup in the | ||
519 | util-linux package from kernel.org isn't doing it cleanly either, they just | ||
520 | hide the ugliness by nesting #include files. Their mount/loop.h | ||
521 | #includes "my_dev_t.h", which #includes &lt;linux/posix_types.h&gt; and | ||
522 | &lt;linux/version.h&gt; just like we do. There simply is no alternative.</p> | ||
523 | |||
524 | <p>We should never directly include kernel headers when there's a better | ||
525 | way to do it, but block copying information out of the kernel headers is not | ||
526 | a better way.</p> | ||
527 | |||
528 | <h2><a name="who">Who are the BusyBox developers?</a></h2> | ||
529 | |||
530 | <p>The following login accounts currently exist on busybox.net. (That is, these | ||
531 | people can commit <a href="http://busybox.net/downloads/patches">patches</a> | ||
532 | into subversion for the BusyBox, uClibc, and buildroot projects.)</p> | ||
533 | |||
534 | <pre> | ||
535 | aldot :Bernhard Fischer | ||
536 | andersen :Erik Andersen <- uClibc and BuildRoot maintainer. | ||
537 | bug1 :Glenn McGrath | ||
538 | davidm :David McCullough | ||
539 | gkajmowi :Garrett Kajmowicz <- uClibc++ maintainer | ||
540 | jbglaw :Jan-Benedict Glaw | ||
541 | jocke :Joakim Tjernlund | ||
542 | landley :Rob Landley <- BusyBox maintainer | ||
543 | lethal :Paul Mundt | ||
544 | mjn3 :Manuel Novoa III | ||
545 | osuadmin :osuadmin | ||
546 | pgf :Paul Fox | ||
547 | pkj :Peter Kjellerstedt | ||
548 | prpplague :David Anders | ||
549 | psm :Peter S. Mazinger | ||
550 | russ :Russ Dill | ||
551 | sandman :Robert Griebl | ||
552 | sjhill :Steven J. Hill | ||
553 | solar :Ned Ludd | ||
554 | timr :Tim Riker | ||
555 | tobiasa :Tobias Anderberg | ||
556 | vapier :Mike Frysinger | ||
557 | </pre> | ||
558 | |||
559 | <p>The following accounts used to exist on busybox.net but no longer do, so | ||
560 | I can't ask /etc/passwd for their names. (If anybody would like to make | ||
561 | a stab at it...)</p> | ||
562 | |||
563 | <pre> | ||
564 | aaronl | ||
565 | beppu | ||
566 | dwhedon | ||
567 | erik : Also Erik Andersen? | ||
568 | gfeldman | ||
569 | jimg | ||
570 | kraai | ||
571 | markw | ||
572 | miles | ||
573 | proski | ||
574 | rjune | ||
575 | tausq | ||
576 | vodz :Vladimir N. Oleynik | ||
577 | </pre> | ||
578 | |||
579 | |||
580 | <br> | ||
581 | <br> | ||
582 | <br> | ||
583 | |||
584 | <!--#include file="footer.html" --> | ||