UnicodeError

UnicodeError: encoding error

Traceback

Traceback (most recent call last):
  File "main.py", line 1, in <module>
    text = b"\xff\xfe".decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

What causes this error

An encoding or decoding operation failed because the data is incompatible with the specified codec. When decoding, the input contains a byte sequence that is not valid in the named encoding. When encoding, the text contains a character that cannot be represented in the target encoding.

How to fix it

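Specify the correct encoding when opening files (`open(file, encoding='utf-8')`). When the encoding is unknown, pass an error handler: `errors='replace'` substitutes a marker for data that cannot be converted (U+FFFD when decoding, `?` when encoding), while `errors='ignore'` silently drops it, so both lose information. To guess an unknown encoding, use a detection library such as chardet or charset_normalizer.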
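The approaches above can be sketched as follows. This is a minimal illustration, not a definitive recipe: `decode_with_fallback`, the candidate encoding list, and the file name `legacy.txt` are hypothetical names chosen for the example, and the right candidates depend on where your data comes from.

```python
import os
import tempfile


def decode_with_fallback(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Try each candidate encoding in order (hypothetical helper)."""
    for enc in encodings:
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue  # this codec rejected the bytes, try the next one
    # Last resort: lossy decoding, each undecodable byte becomes U+FFFD.
    return data.decode("utf-8", errors="replace")


# Bytes in a known encoding decode cleanly once that encoding is tried.
text_utf8 = decode_with_fallback("café".encode("utf-8"))
text_cp1252 = decode_with_fallback("café".encode("cp1252"))

# 0x81 is invalid in UTF-8 and undefined in cp1252, so the lossy path runs.
text_lossy = decode_with_fallback(b"\x81")

# Passing the encoding explicitly when opening a file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "legacy.txt")  # assumed file name
    with open(path, "wb") as f:
        f.write("café".encode("cp1252"))
    with open(path, encoding="cp1252") as f:
        from_file = f.read()
```

Trying UTF-8 first is a deliberate choice in this sketch: UTF-8 rejects most byte sequences produced by other encodings, whereas single-byte codecs such as cp1252 accept almost anything and can return wrong text without raising.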

Code that causes this error

# 0xff can never appear in valid UTF-8, so this raises UnicodeDecodeError
text = b"\xff\xfe".decode("utf-8")

Fixed code

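# These two bytes are the UTF-16 little-endian byte order mark, so decode as UTF-16
# (the BOM is consumed and the result is an empty string):
text = b"\xff\xfe".decode("utf-16")
# or keep UTF-8 and replace each invalid byte with U+FFFD:
text = b"\xff\xfe".decode("utf-8", errors="replace")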

About UnicodeError

UnicodeError is the base class for encoding and decoding errors related to Unicode text. It has three subclasses: UnicodeDecodeError, UnicodeEncodeError, and UnicodeTranslateError. In Python 3, all strings are Unicode by default, but errors arise when converting between bytes and strings with an incompatible encoding.
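Because the three subclasses share one base class, a single `except UnicodeError` clause handles both encoding and decoding failures. A small sketch, where `classify` is a hypothetical helper written only for this illustration:

```python
def classify(operation):
    """Run a callable and report which UnicodeError subclass it raised, if any."""
    try:
        operation()
    except UnicodeError as exc:  # matches decode, encode and translate errors
        return type(exc).__name__
    return None


# Decoding invalid bytes raises the decode subclass.
decode_result = classify(lambda: b"\xff\xfe".decode("utf-8"))

# Encoding a character the codec cannot represent raises the encode subclass.
encode_result = classify(lambda: "café".encode("ascii"))

# Pure ASCII text encodes without error.
ok_result = classify(lambda: "cafe".encode("ascii"))
```

Catching the base class is convenient at a boundary where any conversion may fail. Catch the specific subclass when the handling differs between reading and writing.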

This typically happens when reading files with the wrong encoding, when processing web responses with misdetected character sets, or when interacting with system APIs that expect a specific encoding. The error message names the codec, the offending byte or character, its position in the input, and the reason the conversion failed. Understanding the relationship between str (Unicode text) and bytes (raw byte sequences) is essential for resolving these errors.
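The same details that appear in the message are also available as attributes on the exception object, which helps when logging or locating bad data. A short sketch using a made-up input of three ASCII bytes followed by one invalid byte:

```python
try:
    b"abc\xff".decode("utf-8")
except UnicodeDecodeError as exc:
    details = {
        "encoding": exc.encoding,  # codec that raised the error
        "start": exc.start,        # index of the first offending byte
        "end": exc.end,            # index just past the offending data
        "reason": exc.reason,      # human-readable cause
        "bad_bytes": exc.object[exc.start:exc.end],  # slice of the original input
    }
```

Here `exc.object` is the full input passed to `decode`, so slicing it with `start` and `end` isolates exactly the bytes the codec rejected.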

Common scenarios

Decoding a file saved in a legacy encoding such as Windows-1252 as UTF-8

Encoding text that contains non-ASCII characters with a narrow codec such as ASCII

Decoding web responses whose declared character set does not match the actual bytes

Processing data files whose records mix encodings or contain malformed byte sequences