diff options
author | landley <landley@69ca8d6d-28ef-0310-b511-8ec308f3f277> | 2006-01-29 06:29:01 +0000 |
---|---|---|
committer | landley <landley@69ca8d6d-28ef-0310-b511-8ec308f3f277> | 2006-01-29 06:29:01 +0000 |
commit | 71b7f50951eaff7d522ef239103348092ce8e11a (patch) | |
tree | ae7a94253a6b4769793cd246e506ec2f665a10ca /docs | |
parent | b46881bc3faffde3bdaa87bd66df576c3226f13d (diff) | |
download | busybox-w32-71b7f50951eaff7d522ef239103348092ce8e11a.tar.gz busybox-w32-71b7f50951eaff7d522ef239103348092ce8e11a.tar.bz2 busybox-w32-71b7f50951eaff7d522ef239103348092ce8e11a.zip |
Add explanations of encrypted passwords, and fork vs vfork.
git-svn-id: svn://busybox.net/trunk/busybox@13650 69ca8d6d-28ef-0310-b511-8ec308f3f277
Diffstat (limited to 'docs')
-rw-r--r-- | docs/busybox.net/programming.html | 115 |
1 files changed, 115 insertions, 0 deletions
diff --git a/docs/busybox.net/programming.html b/docs/busybox.net/programming.html index e44f291b3..f77f3c3a6 100644 --- a/docs/busybox.net/programming.html +++ b/docs/busybox.net/programming.html | |||
@@ -12,6 +12,11 @@ | |||
12 | </ul> | 12 | </ul> |
13 | <li><a href="#adding">Adding an applet to busybox</a></li> | 13 | <li><a href="#adding">Adding an applet to busybox</a></li> |
14 | <li><a href="#standards">What standards does busybox adhere to?</a></li> | 14 | <li><a href="#standards">What standards does busybox adhere to?</a></li> |
15 | <li><a href="#tips">Tips and tricks.</a></li> | ||
16 | <ul> | ||
17 | <li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li> | ||
18 | <li><a href="#tips_vfork">Fork and vfork</a></li> | ||
19 | </ul> | ||
15 | </ul> | 20 | </ul> |
16 | 21 | ||
17 | <h2><b><a name="goals" />What are the goals of busybox?</b></h2> | 22 | <h2><b><a name="goals" />What are the goals of busybox?</b></h2> |
@@ -172,6 +177,116 @@ applet is otherwise finished. When polishing and testing a busybox applet, | |||
172 | we ensure we have at least the option of full standards compliance, or else | 177 | we ensure we have at least the option of full standards compliance, or else |
173 | document where we (intentionally) fall short.</p> | 178 | document where we (intentionally) fall short.</p> |
174 | 179 | ||
180 | <h2><a name="tips" />Programming tips and tricks.</a></h2> | ||
181 | |||
182 | <p>Various things busybox uses that aren't particularly well documented | ||
183 | elsewhere.</p> | ||
184 | |||
185 | <h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2> | ||
186 | |||
187 | <p>Password fields in /etc/passwd and /etc/shadow are in a special format. | ||
188 | If the first character isn't '$', then it's an old DES style password. If | ||
189 | the first character is '$' then the password is actually three fields | ||
190 | separated by '$' characters:</p> | ||
191 | <pre> | ||
192 | <b>$type$salt$encrypted_password</b> | ||
193 | </pre> | ||
194 | |||
195 | <p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p> | ||
196 | |||
197 | <p>The "salt" is a bunch of ramdom characters (generally 8) the encryption | ||
198 | algorithm uses to perturb the password in a known and reproducible way (such | ||
199 | as by appending the random data to the unencrypted password, or combining | ||
200 | them with exclusive or). Salt is randomly generated when setting a password, | ||
201 | and then the same salt value is re-used when checking the password. (Salt is | ||
202 | thus stored unencrypted.)</p> | ||
203 | |||
204 | <p>The advantage of using salt is that the same cleartext password encrypted | ||
205 | with a different salt value produces a different encrypted value. | ||
206 | If each encrypted password uses a different salt value, an attacker is forced | ||
207 | to do the cryptographic math all over again for each password they want to | ||
208 | check. Without salt, they could simply produce a big dictionary of commonly | ||
209 | used passwords ahead of time, and look up each password in a stolen password | ||
210 | file to see if it's a known value. (Even if there are billions of possible | ||
211 | passwords in the dictionary, checking each one is just a binary search against | ||
212 | a file only a few gigabytes long.) With salt they can't even tell if two | ||
213 | different users share the same password without guessing what that password | ||
214 | is and decrypting it. They also can't precompute the attack dictionary for | ||
215 | a specific password until they know what the salt value is.</p> | ||
216 | |||
217 | <p>The third field is the encrypted password (plus the salt). For md5 this | ||
218 | is 22 bytes.</p> | ||
219 | |||
220 | <p>The busybox function to handle all this is pw_encrypt(clear, salt) in | ||
221 | "libbb/pw_encrypt.c". The first argument is the clear text password to be | ||
222 | encrypted, and the second is a string in "$type$salt$password" format, from | ||
223 | which the "type" and "salt" fields will be extracted to produce an encrypted | ||
224 | value. (Only the first two fields are needed, the third $ is equivalent to | ||
225 | the end of the string.) The return value is an encrypted password in | ||
226 | /etc/passwd format, with all three $ separated fields. It's stored in | ||
227 | a static buffer, 128 bytes long.</p> | ||
228 | |||
229 | <p>So when checking an existing password, if pw_encrypt(text, | ||
230 | old_encrypted_password) returns a string that compares identical to | ||
231 | old_encrypted_password, you've got the right password. When setting a new | ||
232 | password, generate a random 8 character salt string, put it in the right | ||
233 | format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the | ||
234 | second argument to pw_encrypt(text,buffer).</p> | ||
235 | |||
236 | <h2><a name="tips_vfork">Fork and vfork</a></h2> | ||
237 | |||
238 | <p>On systems that haven't got a Memory Management Unit, fork() is unreasonably | ||
239 | expensive to implement, so a less capable function called vfork() is used | ||
240 | instead.</p> | ||
241 | |||
242 | <p>The reason vfork() exists is that if you haven't got an MMU then you can't | ||
243 | simply set up a second set of page tables and share the physical memory via | ||
244 | copy-on-write, which is what fork() normally does. This means that actually | ||
245 | forking has to copy all the parent's memory (which could easily be tens of | ||
246 | megabytes). And you have to do this even though that memory gets freed again | ||
247 | as soon as the exec happens, so it's probably all a big waste of time.</p> | ||
248 | |||
249 | <p>This is not only slow and a waste of space, it also causes totally | ||
250 | unnecessary memory usage spikes based on how big the _parent_ process is (not | ||
251 | the child), and these spikes are quite likely to trigger an out of memory | ||
252 | condition on small systems (which is where nommu is common anyway). So | ||
253 | although you _can_ emulate a real fork on a nommu system, you really don't | ||
254 | want to.</p> | ||
255 | |||
256 | <p>In theory, vfork() is just a fork() that writeably shares the heap and stack | ||
257 | rather than copying it (so what one process writes the other one sees). In | ||
258 | practice, vfork() has to suspend the parent process until the child does exec, | ||
259 | at which point the parent wakes up and resumes by returning from the call to | ||
260 | vfork(). All modern kernel/libc combinations implement vfork() to put the | ||
261 | parent to sleep until the child does its exec. There's just no other way to | ||
262 | make it work: they're sharing the same stack, so if either one returns from its | ||
263 | function it stomps on the callstack so that when the other process returns, | ||
264 | hilarity ensues. In fact without suspending the parent there's no way to even | ||
265 | store separate copies of the return value (the pid) from the vfork() call | ||
266 | itself: both assignments write into the same memory location.</p> | ||
267 | |||
268 | <p>One way to understand (and in fact implement) vfork() is this: imagine | ||
269 | the parent does a setjmp and then continues on (pretending to be the child) | ||
270 | until the exec() comes around, then the _exec_ does the actual fork, and the | ||
271 | parent does a longjmp back to the original vfork call and continues on from | ||
272 | there. (It thus becomes obvious why the child can't return, or modify | ||
273 | local variables it doesn't want the parent to see changed when it resumes.) | ||
274 | |||
275 | <p>Note a common mistake: the need for vfork doesn't mean you can't have two | ||
276 | processes running at the same time. It means you can't have two processes | ||
277 | sharing the same memory without stomping all over each other. As soon as | ||
278 | the child calls exec(), the parent resumes.</p> | ||
279 | |||
280 | <p>(Now in theory, a nommu system could just copy the _stack_ when it forks | ||
281 | (which presumably is much shorter than the heap), and leave the heap shared. | ||
282 | In practice, you've just wound up in a multi-threaded situation and you can't | ||
283 | do a malloc() or free() on your heap without freeing the other process's memory | ||
284 | (and if you don't have the proper locking for being threaded, corrupting the | ||
285 | heap if both of you try to do it at the same time and wind up stomping on | ||
286 | each other while traversing the free memory lists). The thing about vfork is | ||
287 | that it's a big red flag warning "there be dragons here" rather than | ||
288 | something subtle and thus even more dangerous.)</p> | ||
289 | |||
175 | <br> | 290 | <br> |
176 | <br> | 291 | <br> |
177 | <br> | 292 | <br> |