How to check if an array of bytes is a valid UTF-8 string in Python

1 Answer

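def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        # decode() uses errors='strict' by default, so any malformed
        # sequence raises UnicodeDecodeError
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False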

arr1 = "Hello, 世界".encode("utf-8")
arr2 = bytes([0xa3, 0xed, 0xfd])

print(is_valid_utf8(arr1))  # valid: produced by str.encode("utf-8")
print(is_valid_utf8(arr2))  # invalid: 0xa3 is a continuation byte, not a start byte

'''
run:

True
False

'''
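
If it also helps to know where the data stops being valid, the same try/except approach can report the position of the first bad byte. The following is only a sketch: the helper name first_invalid_utf8_offset is hypothetical and not part of the answer above, and it relies on the start attribute that Python sets on UnicodeDecodeError.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first undecodable byte, or None if data is valid UTF-8.

    Hypothetical helper for illustration; not part of the original answer.
    """
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as exc:
        # exc.start is the index in data where decoding failed
        return exc.start
    return None


print(first_invalid_utf8_offset("Hello, 世界".encode("utf-8")))  # None
print(first_invalid_utf8_offset(b"abc\xff"))  # 3
```

Returning None for valid input keeps the result easy to test with `is None`, while an integer offset points at the byte to inspect.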

answered Jul 8, 2025 by avibootz