diff options
| author | Rob Landley <rob@landley.net> | 2006-01-29 06:29:01 +0000 |
|---|---|---|
| committer | Rob Landley <rob@landley.net> | 2006-01-29 06:29:01 +0000 |
| commit | b1b3cee831bc8dfcf439ad69f4694d0a8ca3f7e9 (patch) | |
| tree | ae7a94253a6b4769793cd246e506ec2f665a10ca /docs/busybox.net/programming.html | |
| parent | 08a1b5095d710c5d905056a9daa14a1acad5590b (diff) | |
| download | busybox-w32-b1b3cee831bc8dfcf439ad69f4694d0a8ca3f7e9.tar.gz busybox-w32-b1b3cee831bc8dfcf439ad69f4694d0a8ca3f7e9.tar.bz2 busybox-w32-b1b3cee831bc8dfcf439ad69f4694d0a8ca3f7e9.zip | |
Add explanations of encrypted passwords, and fork vs vfork.
Diffstat (limited to 'docs/busybox.net/programming.html')
| -rw-r--r-- | docs/busybox.net/programming.html | 115 |
1 files changed, 115 insertions, 0 deletions
diff --git a/docs/busybox.net/programming.html b/docs/busybox.net/programming.html index e44f291b3..f77f3c3a6 100644 --- a/docs/busybox.net/programming.html +++ b/docs/busybox.net/programming.html | |||
| @@ -12,6 +12,11 @@ | |||
| 12 | </ul> | 12 | </ul> |
| 13 | <li><a href="#adding">Adding an applet to busybox</a></li> | 13 | <li><a href="#adding">Adding an applet to busybox</a></li> |
| 14 | <li><a href="#standards">What standards does busybox adhere to?</a></li> | 14 | <li><a href="#standards">What standards does busybox adhere to?</a></li> |
| 15 | <li><a href="#tips">Tips and tricks.</a></li> | ||
| 16 | <ul> | ||
| 17 | <li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li> | ||
| 18 | <li><a href="#tips_vfork">Fork and vfork</a></li> | ||
| 19 | </ul> | ||
| 15 | </ul> | 20 | </ul> |
| 16 | 21 | ||
| 17 | <h2><b><a name="goals" />What are the goals of busybox?</b></h2> | 22 | <h2><b><a name="goals" />What are the goals of busybox?</b></h2> |
| @@ -172,6 +177,116 @@ applet is otherwise finished. When polishing and testing a busybox applet, | |||
| 172 | we ensure we have at least the option of full standards compliance, or else | 177 | we ensure we have at least the option of full standards compliance, or else |
| 173 | document where we (intentionally) fall short.</p> | 178 | document where we (intentionally) fall short.</p> |
| 174 | 179 | ||
| 180 | <h2><a name="tips" />Programming tips and tricks.</a></h2> | ||
| 181 | |||
| 182 | <p>Various things busybox uses that aren't particularly well documented | ||
| 183 | elsewhere.</p> | ||
| 184 | |||
| 185 | <h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2> | ||
| 186 | |||
| 187 | <p>Password fields in /etc/passwd and /etc/shadow are in a special format. | ||
| 188 | If the first character isn't '$', then it's an old DES style password. If | ||
| 189 | the first character is '$' then the password is actually three fields | ||
| 190 | separated by '$' characters:</p> | ||
| 191 | <pre> | ||
| 192 | <b>$type$salt$encrypted_password</b> | ||
| 193 | </pre> | ||
| 194 | |||
| 195 | <p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p> | ||
| 196 | |||
| 197 | <p>The "salt" is a bunch of ramdom characters (generally 8) the encryption | ||
| 198 | algorithm uses to perturb the password in a known and reproducible way (such | ||
| 199 | as by appending the random data to the unencrypted password, or combining | ||
| 200 | them with exclusive or). Salt is randomly generated when setting a password, | ||
| 201 | and then the same salt value is re-used when checking the password. (Salt is | ||
| 202 | thus stored unencrypted.)</p> | ||
| 203 | |||
| 204 | <p>The advantage of using salt is that the same cleartext password encrypted | ||
| 205 | with a different salt value produces a different encrypted value. | ||
| 206 | If each encrypted password uses a different salt value, an attacker is forced | ||
| 207 | to do the cryptographic math all over again for each password they want to | ||
| 208 | check. Without salt, they could simply produce a big dictionary of commonly | ||
| 209 | used passwords ahead of time, and look up each password in a stolen password | ||
| 210 | file to see if it's a known value. (Even if there are billions of possible | ||
| 211 | passwords in the dictionary, checking each one is just a binary search against | ||
| 212 | a file only a few gigabytes long.) With salt they can't even tell if two | ||
| 213 | different users share the same password without guessing what that password | ||
| 214 | is and decrypting it. They also can't precompute the attack dictionary for | ||
| 215 | a specific password until they know what the salt value is.</p> | ||
| 216 | |||
| 217 | <p>The third field is the encrypted password (plus the salt). For md5 this | ||
| 218 | is 22 bytes.</p> | ||
| 219 | |||
| 220 | <p>The busybox function to handle all this is pw_encrypt(clear, salt) in | ||
| 221 | "libbb/pw_encrypt.c". The first argument is the clear text password to be | ||
| 222 | encrypted, and the second is a string in "$type$salt$password" format, from | ||
| 223 | which the "type" and "salt" fields will be extracted to produce an encrypted | ||
| 224 | value. (Only the first two fields are needed, the third $ is equivalent to | ||
| 225 | the end of the string.) The return value is an encrypted password in | ||
| 226 | /etc/passwd format, with all three $ separated fields. It's stored in | ||
| 227 | a static buffer, 128 bytes long.</p> | ||
| 228 | |||
| 229 | <p>So when checking an existing password, if pw_encrypt(text, | ||
| 230 | old_encrypted_password) returns a string that compares identical to | ||
| 231 | old_encrypted_password, you've got the right password. When setting a new | ||
| 232 | password, generate a random 8 character salt string, put it in the right | ||
| 233 | format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the | ||
| 234 | second argument to pw_encrypt(text,buffer).</p> | ||
| 235 | |||
| 236 | <h2><a name="tips_vfork">Fork and vfork</a></h2> | ||
| 237 | |||
| 238 | <p>On systems that haven't got a Memory Management Unit, fork() is unreasonably | ||
| 239 | expensive to implement, so a less capable function called vfork() is used | ||
| 240 | instead.</p> | ||
| 241 | |||
| 242 | <p>The reason vfork() exists is that if you haven't got an MMU then you can't | ||
| 243 | simply set up a second set of page tables and share the physical memory via | ||
| 244 | copy-on-write, which is what fork() normally does. This means that actually | ||
| 245 | forking has to copy all the parent's memory (which could easily be tens of | ||
| 246 | megabytes). And you have to do this even though that memory gets freed again | ||
| 247 | as soon as the exec happens, so it's probably all a big waste of time.</p> | ||
| 248 | |||
| 249 | <p>This is not only slow and a waste of space, it also causes totally | ||
| 250 | unnecessary memory usage spikes based on how big the _parent_ process is (not | ||
| 251 | the child), and these spikes are quite likely to trigger an out of memory | ||
| 252 | condition on small systems (which is where nommu is common anyway). So | ||
| 253 | although you _can_ emulate a real fork on a nommu system, you really don't | ||
| 254 | want to.</p> | ||
| 255 | |||
| 256 | <p>In theory, vfork() is just a fork() that writeably shares the heap and stack | ||
| 257 | rather than copying it (so what one process writes the other one sees). In | ||
| 258 | practice, vfork() has to suspend the parent process until the child does exec, | ||
| 259 | at which point the parent wakes up and resumes by returning from the call to | ||
| 260 | vfork(). All modern kernel/libc combinations implement vfork() to put the | ||
| 261 | parent to sleep until the child does its exec. There's just no other way to | ||
| 262 | make it work: they're sharing the same stack, so if either one returns from its | ||
| 263 | function it stomps on the callstack so that when the other process returns, | ||
| 264 | hilarity ensues. In fact without suspending the parent there's no way to even | ||
| 265 | store separate copies of the return value (the pid) from the vfork() call | ||
| 266 | itself: both assignments write into the same memory location.</p> | ||
| 267 | |||
| 268 | <p>One way to understand (and in fact implement) vfork() is this: imagine | ||
| 269 | the parent does a setjmp and then continues on (pretending to be the child) | ||
| 270 | until the exec() comes around, then the _exec_ does the actual fork, and the | ||
| 271 | parent does a longjmp back to the original vfork call and continues on from | ||
| 272 | there. (It thus becomes obvious why the child can't return, or modify | ||
| 273 | local variables it doesn't want the parent to see changed when it resumes.) | ||
| 274 | |||
| 275 | <p>Note a common mistake: the need for vfork doesn't mean you can't have two | ||
| 276 | processes running at the same time. It means you can't have two processes | ||
| 277 | sharing the same memory without stomping all over each other. As soon as | ||
| 278 | the child calls exec(), the parent resumes.</p> | ||
| 279 | |||
| 280 | <p>(Now in theory, a nommu system could just copy the _stack_ when it forks | ||
| 281 | (which presumably is much shorter than the heap), and leave the heap shared. | ||
| 282 | In practice, you've just wound up in a multi-threaded situation and you can't | ||
| 283 | do a malloc() or free() on your heap without freeing the other process's memory | ||
| 284 | (and if you don't have the proper locking for being threaded, corrupting the | ||
| 285 | heap if both of you try to do it at the same time and wind up stomping on | ||
| 286 | each other while traversing the free memory lists). The thing about vfork is | ||
| 287 | that it's a big red flag warning "there be dragons here" rather than | ||
| 288 | something subtle and thus even more dangerous.)</p> | ||
| 289 | |||
| 175 | <br> | 290 | <br> |
| 176 | <br> | 291 | <br> |
| 177 | <br> | 292 | <br> |
