crc32 - Python CRC-32 woes -


I'm writing a Python program to extract data from the middle of a 6 GB bz2 file. A bzip2 file is made of independently decompressible blocks of data, so I only need to find a block (they are delimited by magic bits), create a temporary one-block bzip2 file from it in memory, and pass that to the bz2.decompress function. Easy, no?

The bzip2 format has a CRC-32 checksum for the file at the end. No problem, binascii.crc32 to the rescue. But wait: the data to be checksummed does not necessarily end on a byte boundary, and the crc32 function operates on a whole number of bytes.

My plan: use the binascii.crc32 function on all but the last byte, and then a function of my own to update the computed CRC with the last 1–7 bits. But hours of coding and testing have left me bewildered, and my puzzlement can be boiled down to this question: how come crc32("\x00") is not 0x00000000? Shouldn't it be, according to the Wikipedia article?

You start with 0b00000000 and pad it with 32 zeros, then do polynomial division by 0x04C11DB7 until there are no ones left in the first 8 bits, which is immediately. The last 32 bits are the checksum, so how can they not be all zeros?

I've searched Google for answers and looked at the code of several CRC-32 implementations without finding any clue as to why this is so.
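For reference, the plan of extending a byte-wise CRC by a few trailing bits could look roughly like the sketch below. The helper name `crc32_append_bits` is made up for illustration, and the sketch assumes the conventions of binascii.crc32 (bits of each byte consumed least-significant first, with an initial and final inversion). As far as I know, bzip2's own checksum feeds bits most-significant first, so treat this as an illustration of the mechanism and check it against the format before relying on it.

```python
import binascii

# 0x04C11DB7 with its 32 bits reversed, as used by LSB-first implementations.
REFLECTED_POLY = 0xEDB88320


def crc32_append_bits(crc, bits):
    """Extend a binascii.crc32-style CRC by single bits (hypothetical helper).

    `bits` is an iterable of 0/1 values in the order binascii.crc32 would
    consume them, i.e. least significant bit of each byte first.
    """
    state = crc ^ 0xFFFFFFFF  # undo the final inversion to get the raw register
    for bit in bits:
        state ^= bit & 1
        # Shift out one bit; apply the polynomial if the bit shifted out was 1.
        state = (state >> 1) ^ (REFLECTED_POLY if state & 1 else 0)
    return state ^ 0xFFFFFFFF  # re-apply the final inversion


# Feeding all 8 bits of a byte must agree with letting binascii do it.
head = b"hello"
tail = 0x5A
tail_bits = [(tail >> i) & 1 for i in range(8)]
extended = crc32_append_bits(binascii.crc32(head), tail_bits)
print(hex(extended), hex(binascii.crc32(head + bytes([tail]))))
```

Passing fewer than 8 bits gives the CRC of a message that ends in the middle of a byte, which is the case binascii.crc32 alone cannot handle.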

How come crc32("\x00") is not 0x00000000?

The basic CRC algorithm is to treat the input message as a polynomial over GF(2), divide it by the fixed CRC polynomial, and use the polynomial remainder as the resulting hash.
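A minimal sketch of that basic, unmodified algorithm, with polynomials stored as Python integers (bit i is the coefficient of x^i). The names `gf2_mod` and `CRC32_POLY` are my own, not from any library. Under this plain division a zero message really does give a zero remainder, which is what the question expected.

```python
def gf2_mod(dividend, divisor):
    """Remainder of polynomial division over GF(2) (hypothetical helper)."""
    divisor_degree = divisor.bit_length() - 1
    # Cancel the leading term of the dividend until its degree drops
    # below the degree of the divisor. Subtraction in GF(2) is XOR.
    while dividend.bit_length() > divisor_degree:
        shift = dividend.bit_length() - 1 - divisor_degree
        dividend ^= divisor << shift
    return dividend


CRC32_POLY = 0x104C11DB7  # the CRC-32 polynomial, including the x^32 term

# Message byte followed by 32 zero bits, with no other modifications.
plain_zero = gf2_mod(0x00 << 32, CRC32_POLY)
plain_one = gf2_mod(0x01 << 32, CRC32_POLY)
print(hex(plain_zero), hex(plain_one))
```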

CRC-32 makes a number of modifications to the basic algorithm:

  1. The bits in each byte of the message are reversed. For example, the byte 0x01 is treated as the polynomial x^7, not as the polynomial x^0.
  2. The message is padded with 32 zeros on the right side.
  3. The first 4 bytes of this reversed and padded message are XOR'd with 0xFFFFFFFF.
  4. The remainder polynomial is reversed.
  5. The remainder polynomial is XOR'd with 0xFFFFFFFF.

And recall that the CRC-32 polynomial, in non-reversed form, is 0x104C11DB7.
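The modifications above can be written out literally and compared against binascii.crc32. This is a slow, illustrative sketch, not how real implementations work (those use a shift register or lookup table). `reverse_bits`, `gf2_mod` and `crc32_by_steps` are names I made up for the example.

```python
import binascii

CRC32_POLY = 0x104C11DB7


def reverse_bits(value, width):
    """Reverse the order of the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def gf2_mod(dividend, divisor):
    """Remainder of polynomial division over GF(2)."""
    divisor_degree = divisor.bit_length() - 1
    while dividend.bit_length() > divisor_degree:
        dividend ^= divisor << (dividend.bit_length() - 1 - divisor_degree)
    return dividend


def crc32_by_steps(data):
    # 1. Reverse the bits in each byte of the message.
    reversed_bytes = bytes(reverse_bits(b, 8) for b in data)
    # 2. Pad with 32 zero bits on the right.
    message = int.from_bytes(reversed_bytes, "big") << 32
    # 3. XOR the first 4 bytes (the top 32 bits) with 0xFFFFFFFF.
    message ^= 0xFFFFFFFF << (8 * len(data))
    # Plain polynomial division by the CRC-32 polynomial.
    remainder = gf2_mod(message, CRC32_POLY)
    # 4. Reverse the remainder, then 5. XOR it with 0xFFFFFFFF.
    return reverse_bits(remainder, 32) ^ 0xFFFFFFFF


for sample in (b"", b"\x00", b"123456789", b"hello world"):
    print(sample, hex(crc32_by_steps(sample)), hex(binascii.crc32(sample)))
```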

Let's work out the CRC-32 of the one-byte string 0x00:

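  1. Message: 0x00
  2. Reversed: 0x00
  3. Padded: 0x00 00 00 00 00
  4. First 4 bytes XOR'd with 0xFFFFFFFF: 0xFF FF FF FF 00
  5. Remainder when divided by 0x104C11DB7: 0x4E 08 BF B4
  6. XOR'd with 0xFFFFFFFF: 0xB1 F7 40 4B
  7. Reversed: 0xD2 02 EF 8D

(The final XOR and the reversal are applied here in the opposite order from the list of modifications. The result is the same, because inverting every bit commutes with reversing the bit order.)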

And there you have it: the CRC-32 of 0x00 is 0xD202EF8D.
(You should verify this.)
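One way to do that verification is to recompute the intermediate values of the worked example and compare the final result with binascii.crc32. The helper `gf2_mod` is again a made-up name for plain GF(2) polynomial division; Python 3 is assumed, where binascii.crc32 takes bytes and returns an unsigned value.

```python
import binascii


def gf2_mod(dividend, divisor):
    """Remainder of polynomial division over GF(2)."""
    divisor_degree = divisor.bit_length() - 1
    while dividend.bit_length() > divisor_degree:
        dividend ^= divisor << (dividend.bit_length() - 1 - divisor_degree)
    return dividend


# Steps 1-4: 0x00 reversed is still 0x00; padded and XOR'd it becomes:
xored_message = 0xFFFFFFFF00
# Step 5: remainder after division by the CRC-32 polynomial.
remainder = gf2_mod(xored_message, 0x104C11DB7)
# Step 6: XOR with 0xFFFFFFFF.
inverted = remainder ^ 0xFFFFFFFF
# Step 7: reverse all 32 bits.
result = int(format(inverted, "032b")[::-1], 2)

print(f"remainder: 0x{remainder:08X}")
print(f"xor'd:     0x{inverted:08X}")
print(f"reversed:  0x{result:08X}")
print(f"binascii:  0x{binascii.crc32(bytes([0])):08X}")
```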

