Python string encode() - Encode a string in Python

The encode() method converts a string to a bytes object. With no arguments, it uses the UTF-8 encoding.

b1 = 'abc'.encode()
b2 = 'русский'.encode()
b3 = 'ä'.encode()
b4 = '@$'.encode()

print(b1)  # b'abc'
print(b2)  # b'\xd1\x80\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xb8\xd0\xb9'
print(b3)  # b'\xc3\xa4'
print(b4)  # b'@$'

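A Python bytes object (not a "byte" object) is an immutable sequence of integers in the range 0 to 255. When printed, bytes that correspond to printable ASCII characters are shown as those characters, and all other bytes are shown as \x escapes.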
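
As a small illustration of the integer nature of bytes, the sketch below encodes a two-character string and inspects the result. The variable name data is chosen only for this example.

```python
# Encode a two-character string with the default UTF-8 encoding.
data = 'ä@'.encode()

print(list(data))     # [195, 164, 64]
print(data[0])        # 195 (indexing a bytes object gives an int)
print(len(data))      # 3 (three bytes for two characters)
print(data.decode())  # ä@ (decode() reverses encode())
```

The non-ASCII character 'ä' takes two bytes in UTF-8, which is why the bytes object is longer than the original string.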


encode(encoding='utf-8', errors='strict')

Parameter Description
encoding Optional. The name of the encoding to use. The default is 'utf-8'.
errors Optional. The error handler to use when a character can't be encoded. The default is 'strict'.

s = 'abcй'

b1 = s.encode(encoding='ascii', errors='ignore')
b2 = s.encode(encoding='ascii', errors='replace')
b3 = s.encode(encoding='ascii', errors='xmlcharrefreplace')
b4 = s.encode(encoding='ascii', errors='backslashreplace')
b5 = s.encode(encoding='ascii', errors='namereplace')

print(b1)  # b'abc'
print(b2)  # b'abc?'
print(b3)  # b'abc&#1081;'
print(b4)  # b'abc\\u0439'
print(b5)  # b'abc\\N{CYRILLIC SMALL LETTER SHORT I}'
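
The encoding parameter changes which bytes are produced for the same character. The following sketch, with variable names picked just for this example, encodes one character with three standard codecs:

```python
s = 'ä'  # U+00E4

utf8 = s.encode('utf-8')
latin1 = s.encode('latin-1')
utf16 = s.encode('utf-16-le')

print(utf8)    # b'\xc3\xa4'
print(latin1)  # b'\xe4'
print(utf16)   # b'\xe4\x00'

# Decoding has to use the same encoding that produced the bytes.
print(latin1.decode('latin-1'))  # ä
```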

strict is the default error handler. It raises a UnicodeEncodeError if encode() can't encode a character with the given encoding.

s = 'abcй'
b = s.encode(encoding='ascii', errors='strict')

# UnicodeEncodeError: 'ascii' codec can't encode character '\u0439' in position 3: ordinal not in range(128)
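
One possible way to deal with the exception is to catch it and inspect its attributes. The sketch below falls back to UTF-8, which is only an assumption made for this example; the right fallback depends on the program.

```python
s = 'abcй'

try:
    b = s.encode('ascii')  # errors='strict' is the default
except UnicodeEncodeError as exc:
    print(exc.encoding)                         # ascii
    print(exc.start, exc.end)                   # 3 4
    print(repr(exc.object[exc.start:exc.end]))  # 'й'
    # Fallback chosen for this sketch only: encode as UTF-8 instead.
    b = s.encode('utf-8')

print(b)  # b'abc\xd0\xb9'
```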

