OLD | NEW |
1 :mod:`codecs` --- Codec registry and base classes | 1 :mod:`codecs` --- Codec registry and base classes |
2 ================================================= | 2 ================================================= |
3 | 3 |
4 .. module:: codecs | 4 .. module:: codecs |
5 :synopsis: Encode and decode data and streams. | 5 :synopsis: Encode and decode data and streams. |
6 .. moduleauthor:: Marc-Andre Lemburg <mal@lemburg.com> | 6 .. moduleauthor:: Marc-Andre Lemburg <mal@lemburg.com> |
7 .. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com> | 7 .. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com> |
8 .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> | 8 .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> |
9 | 9 |
10 | 10 |
(...skipping 304 matching lines...)
315 | | U+FFFD REPLACEMENT CHARACTER for the built-in | | 315 | | U+FFFD REPLACEMENT CHARACTER for the built-in | |
316 | | Unicode codecs on decoding and '?' on | | 316 | | Unicode codecs on decoding and '?' on | |
317 | | encoding. | | 317 | | encoding. | |
318 +-------------------------+-----------------------------------------------+ | 318 +-------------------------+-----------------------------------------------+ |
319 | ``'xmlcharrefreplace'`` | Replace with the appropriate XML character | | 319 | ``'xmlcharrefreplace'`` | Replace with the appropriate XML character | |
320 | | reference (only for encoding). | | 320 | | reference (only for encoding). | |
321 +-------------------------+-----------------------------------------------+ | 321 +-------------------------+-----------------------------------------------+ |
322 | ``'backslashreplace'`` | Replace with backslashed escape sequences | | 322 | ``'backslashreplace'`` | Replace with backslashed escape sequences | |
323 | | (only for encoding). | | 323 | | (only for encoding). | |
324 +-------------------------+-----------------------------------------------+ | 324 +-------------------------+-----------------------------------------------+ |
| 325 | ``'utf8b'`` | Replace byte with surrogate U+DCxx. | |
| 326 +-------------------------+-----------------------------------------------+ |
325 | 327 |
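As an illustration of the generic handlers listed in the table above, here is a minimal sketch using only handlers that exist in released Python 3. The sample string is arbitrary, and the ``'utf8b'`` row added by this patch is deliberately not exercised, since it may not be available in a given interpreter.

```python
# Sample text containing one character (U+00E9) that ASCII cannot encode.
text = "caf\u00e9"

# 'replace' on encoding substitutes '?' for the unencodable character.
replaced = text.encode("ascii", "replace")

# 'xmlcharrefreplace' substitutes an XML numeric character reference.
xml_ref = text.encode("ascii", "xmlcharrefreplace")

# 'backslashreplace' substitutes a backslashed escape sequence.
escaped = text.encode("ascii", "backslashreplace")

# 'replace' on decoding substitutes U+FFFD for the malformed byte.
decoded = b"caf\xe9".decode("utf-8", "replace")

print(replaced, xml_ref, escaped, ascii(decoded))
```

The same handler names can be passed as the ``errors`` argument to :func:`codecs.encode`, :func:`codecs.decode` and :func:`codecs.open`.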
326 In addition, the following error handlers are specific to a single codec: | 328 In addition, the following error handlers are specific to a single codec: |
327 | 329 |
328 +------------------+---------+--------------------------------------------+ | 330 +------------------+---------+--------------------------------------------+ |
329 | Value | Codec | Meaning | | 331 | Value | Codec | Meaning | |
330 +==================+=========+============================================+ | 332 +==================+=========+============================================+ |
331 | ``'surrogates'`` | utf-8 | Allow encoding and decoding of surrogate | | 333 | ``'surrogates'`` | utf-8 | Allow encoding and decoding of surrogate | |
332 | | | codes in UTF-8. | | 334 | | | codes in UTF-8. | |
333 +------------------+---------+--------------------------------------------+ | 335 +------------------+---------+--------------------------------------------+ |
334 | 336 |
(...skipping 874 matching lines...)
1209 | 1211 |
1210 .. module:: encodings.utf_8_sig | 1212 .. module:: encodings.utf_8_sig |
1211 :synopsis: UTF-8 codec with BOM signature | 1213 :synopsis: UTF-8 codec with BOM signature |
1212 .. moduleauthor:: Walter Dörwald | 1214 .. moduleauthor:: Walter Dörwald |
1213 | 1215 |
1214 This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded | 1216 This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded |
1215 BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this | 1217 BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this |
1216 is only done once (on the first write to the byte stream). For decoding an | 1218 is only done once (on the first write to the byte stream). For decoding an |
1217 optional UTF-8 encoded BOM at the start of the data will be skipped. | 1219 optional UTF-8 encoded BOM at the start of the data will be skipped. |
1218 | 1220 |
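The behaviour described above can be checked with a short sketch. It assumes the codec is reachable under the name ``"utf-8-sig"`` and uses an arbitrary sample string; the variable names are illustrative only.

```python
import codecs

# One-shot encoding prepends the UTF-8 encoded BOM.
data = "abc".encode("utf-8-sig")
has_bom = data.startswith(codecs.BOM_UTF8)

# Decoding skips the BOM when present and tolerates its absence.
with_bom = data.decode("utf-8-sig")
without_bom = b"abc".decode("utf-8-sig")

# The plain UTF-8 codec keeps the BOM as U+FEFF in the decoded text.
plain = data.decode("utf-8")

# The stateful (incremental) encoder emits the BOM only on the first call.
encoder = codecs.getincrementalencoder("utf-8-sig")()
first = encoder.encode("a")
second = encoder.encode("b")

print(data, first, second)
```

Using the incremental encoder, rather than calling ``str.encode`` per chunk, is what avoids writing a BOM in front of every chunk of a stream.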