UnicodeDecodeError

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff

Traceback

terminal

Traceback (most recent call last):
  File "main.py", line 2, in <module>
    text = f.read()  # may fail if binary data
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff

What causes this error

Python tried to decode bytes with a codec that does not match their actual encoding, and hit a byte sequence that is invalid in that codec. In the traceback above, the file was read as UTF-8, but the byte 0xff can never appear in valid UTF-8.
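A minimal reproduction of the mismatch, assuming a short sample string that is encoded as Latin-1 and then decoded as UTF-8 (the sample text and variable names are illustrative and not taken from the traceback above):

```python
# Encode text as Latin-1, then try to decode the same bytes as UTF-8.
raw = "café".encode("latin-1")  # the "é" becomes the single byte 0xe9

try:
    raw.decode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(exc)

# Decoding with the codec that was actually used to encode succeeds.
text = raw.decode("latin-1")
```

The bytes themselves are not damaged; only the choice of codec is wrong.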

How to fix it

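Identify the actual encoding of the data and pass it explicitly, for example `open(file, encoding='latin-1')` or whichever codec applies. If the encoding is unknown, the `chardet` library can estimate it from the bytes. If the data is malformed and some loss is acceptable, pass `errors='replace'` so that invalid bytes become replacement characters instead of raising an exception.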
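When no detection library is available, one possible approach is to try a short list of candidate codecs in order. The sketch below uses only the standard library; the helper name `decode_with_fallback` and the candidate list are assumptions made for illustration, not part of any library.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate codec that decodes cleanly.

    Hypothetical helper: the candidate list is an assumption and should be
    adjusted to the encodings your data is likely to use.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace"), "utf-8 (with replacement)"


# Sample input: text encoded as cp1252, which is not valid UTF-8.
text, used = decode_with_fallback("naïve".encode("cp1252"))
```

A clean decode does not prove the guess is right: `latin-1` maps every possible byte to a character, so it never raises and belongs at the end of the list.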

Code that causes this error

Broken

with open("data.bin", "r") as f:
    text = f.read()  # may fail if binary data

Fixed code

Fixed

# Name the encoding explicitly; invalid bytes become U+FFFD instead of raising.
with open("data.bin", "r", encoding="utf-8", errors="replace") as f:
    text = f.read()

About UnicodeDecodeError

A UnicodeDecodeError is raised when converting bytes to a string and the byte sequence is not valid in the specified encoding. This is one of the most common encoding errors, especially when reading files or network data. The error means that the bytes you have do not represent valid characters in the codec you chose.

Typical examples are reading a Latin-1 encoded file as UTF-8, or processing binary data (images, PDFs) as if it were text. The error message names the codec that was used, the offending byte, and its position in the input. To fix the error, identify the actual encoding of the data; tools like `chardet` or `charset_normalizer` can estimate it from the bytes, although the result is a statistical guess rather than a guarantee.
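The same details that appear in the message are also available as attributes on the exception object. A small sketch, assuming a made-up byte string with one invalid byte in the middle:

```python
data = b"abc\xffdef"  # illustrative input; 0xff is never valid in UTF-8

try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    codec = exc.encoding                      # codec that was attempted
    position = exc.start                      # index of the first bad byte
    bad_byte = exc.object[exc.start:exc.end]  # the offending byte(s)
    reason = exc.reason                       # short description of the failure
    print(f"{codec}: {bad_byte!r} at position {position} ({reason})")
```

Logging these values is often enough to tell whether the input is a differently encoded text file or not text at all.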

If you cannot determine the encoding, using `errors='replace'` substitutes invalid bytes with the Unicode replacement character (U+FFFD), and `errors='ignore'` silently skips them.
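The difference between the two error handlers can be seen on a small sample. The byte string below is an assumption chosen for illustration: Latin-1 text containing one byte that is invalid in UTF-8.

```python
data = b"caf\xe9 au lait"  # Latin-1 bytes; 0xe9 is invalid here as UTF-8

# 'replace' keeps a visible marker where the bad byte was.
replaced = data.decode("utf-8", errors="replace")

# 'ignore' drops the bad byte without leaving any trace.
ignored = data.decode("utf-8", errors="ignore")

print(repr(replaced))
print(repr(ignored))
```

Both handlers lose the original character, so they suit cases where approximate text is acceptable. `replace` at least leaves evidence that something was lost.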

Common scenarios

Reading a file saved in Latin-1 or another legacy encoding as UTF-8

Opening binary files such as images or PDFs in text mode

Decoding network data with a codec other than the one the sender used

Processing data files with inconsistent or mixed encodings