UnicodeError
UnicodeError: encoding error
Traceback
Traceback (most recent call last):
  File "main.py", line 1, in <module>
    text = b"\xff\xfe".decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
What causes this error
An encoding or decoding operation failed because the data is incompatible with the specified codec. When decoding, the input contains byte sequences that are not valid in the stated encoding. When encoding, the text contains characters that the target encoding cannot represent.
How to fix it
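Specify the correct encoding when opening files (`open(file, encoding='utf-8')`) rather than relying on the platform default. When the encoding is unknown, error handlers such as `errors='replace'` or `errors='ignore'` prevent the exception, but both lose data: the first substitutes a placeholder character and the second silently drops whatever cannot be converted. Use chardet or charset_normalizer to detect unknown encodings, and treat the result as a guess rather than a guarantee.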
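The advice above can be sketched as a small helper that tries a short list of candidate encodings before falling back to a lossy decode. The function name `decode_with_fallback` and the candidate list are illustrative assumptions, not part of any library. The right candidates depend on where the data comes from.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Try each candidate encoding in order (hypothetical helper).

    If none of them decodes the data cleanly, fall back to UTF-8 with
    errors="replace", which substitutes U+FFFD for undecodable bytes.
    """
    for name in encodings:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate does not fit; try the next one
    return data.decode("utf-8", errors="replace")


utf8_text = decode_with_fallback("café".encode("utf-8"))
legacy_text = decode_with_fallback(b"caf\xe9")  # invalid UTF-8, valid cp1252
lossy_text = decode_with_fallback(b"\x81")      # undefined in both candidates
```

Ordering matters here: strict codecs such as UTF-8 should come first, because single-byte codecs accept almost any input and can hide a wrong guess.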
Code that causes this error
text = b"\xff\xfe".decode("utf-8")
Fixed code
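# these two bytes are a UTF-16 little-endian byte order mark, so the result is ""
text = b"\xff\xfe".decode("utf-16")
# or with error handling (each invalid byte becomes U+FFFD):
text = b"\xff\xfe".decode("utf-8", errors="replace")
About UnicodeError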
UnicodeError is the base class for encoding and decoding errors related to Unicode text. It has three subclasses: UnicodeDecodeError, UnicodeEncodeError, and UnicodeTranslateError. In Python 3, all strings are Unicode by default, but errors arise when converting between bytes and strings with an incompatible encoding.
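As a minimal illustration of that hierarchy, the sketch below catches the base class and reports which subclass was actually raised. The helper name `classify` is hypothetical and exists only for this example.

```python
# The three concrete errors all derive from UnicodeError.
subclasses = (UnicodeDecodeError, UnicodeEncodeError, UnicodeTranslateError)
all_derive = all(issubclass(cls, UnicodeError) for cls in subclasses)


def classify(operation):
    """Run a zero-argument callable and name the Unicode error it raises."""
    try:
        operation()
    except UnicodeError as exc:  # the base class catches every subclass
        return type(exc).__name__
    return "ok"


# bytes -> str with the wrong codec fails on the decoding side
decode_result = classify(lambda: b"\xff\xfe".decode("utf-8"))
# str -> bytes with a codec that lacks the character fails on the encoding side
encode_result = classify(lambda: "café".encode("ascii"))
```

Catching `UnicodeError` is convenient when both directions can fail in the same block. Catching the specific subclass is usually clearer when only one conversion is involved.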
This typically happens when reading files with the wrong encoding, when processing web responses with misdetected character sets, or when interacting with system APIs that expect a specific encoding. The error message names the codec, the offending byte or character, its position in the input, and a short reason. Understanding the relationship between str (Unicode text) and bytes (raw byte sequences) is essential for resolving these errors.
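The same details that appear in the message are also available as attributes on the exception object, which can help when logging or debugging. The following is a small sketch using the failing call from this page. The `details` dictionary is just an illustrative container.

```python
try:
    b"\xff\xfe".decode("utf-8")
except UnicodeDecodeError as exc:
    details = {
        "codec": exc.encoding,                        # codec that was attempted
        "bad_bytes": exc.object[exc.start:exc.end],   # slice of the input that failed
        "start": exc.start,                           # index of the first bad byte
        "reason": exc.reason,                         # short human-readable explanation
    }
```

`UnicodeEncodeError` exposes the same attribute names, with `object` holding the original string instead of bytes.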
Common scenarios
Reading a file whose actual encoding differs from the one passed to `open()` or assumed by default
Decoding a web response whose character set was misdetected or mislabeled
Encoding text for a codec that cannot represent some of its characters, such as ASCII
Passing text to system APIs that expect a specific encoding