Python Strings encode() method - GeeksforGeeks (2023)

Python String encode() converts a string into a bytes object, using an encoding scheme specified by the user (UTF-8 if none is given).

Python String encode() Method Syntax:

Syntax: string.encode(encoding='utf-8', errors='strict')

Parameters:

  • encoding: The name of the encoding to use, e.g. ‘utf-8’ or ‘ascii’. Defaults to ‘utf-8’.
  • errors: Decides how characters that cannot be encoded are handled. Defaults to ‘strict’. There are six standard error responses:
    • strict – default response, which raises a UnicodeEncodeError exception on failure
    • ignore – drops the unencodable character from the result
    • replace – replaces the unencodable character with a question mark ?
    • xmlcharrefreplace – inserts an XML character reference (e.g. &#182;) in place of the unencodable character
    • backslashreplace – inserts a backslash escape sequence (\xNN, \uNNNN or \UNNNNNNNN) in place of the unencodable character
    • namereplace – inserts a \N{…} escape sequence in place of the unencodable character

Return: Returns a bytes object containing the encoded form of the string
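
The error responses can be compared side by side. The following is a minimal sketch, assuming an illustrative sample string and the ASCII codec as the target; the names `text`, `handlers` and `results` are chosen for this example only.

```python
# Encode a string containing one non-ASCII character ('¶', U+00B6)
# with each non-strict error handler and collect the results.
text = "123-¶"
handlers = ["ignore", "replace", "xmlcharrefreplace",
            "backslashreplace", "namereplace"]

results = {h: text.encode("ascii", errors=h) for h in handlers}

for name, encoded in results.items():
    print(f"{name:18} {encoded}")
```

With ‘strict’ (the default) the same call would raise UnicodeEncodeError instead of returning bytes.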

Python String encode() Method Example:

Python3

print("¶".encode('utf-8'))

Output:

b'\xc2\xb6'

Example 1: Code to print encoding schemes available

There are certain encoding schemes supported by Python String encode() method. We can get the supported encodings using the Python code below.

Python3
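
A minimal sketch that prints a listing of this shape, assuming the standard-library `encodings.aliases.aliases` dictionary is the source of the names. The keys are alias names rather than one entry per codec, and the exact set varies between Python versions.

```python
from encodings.aliases import aliases

# `aliases` maps each alias name to the canonical codec module name,
# e.g. 'utf8' -> 'utf_8' and 'latin1' -> 'latin_1'.
available = aliases.keys()
print("The available encodings are :", available)
```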


Output:

The available encodings are : dict_keys(['ibm039', 'iso_ir_226', '1140', 'iso_ir_110', '1252', 'iso_8859_8', 'iso_8859_3', 'iso_ir_166', 'cp367', 'uu', 'quotedprintable', 'ibm775', 'iso_8859_16_2001', 'ebcdic_cp_ch', 'gb2312_1980', 'ibm852', 'uhc', 'macgreek', '850', 'iso2022jp_2', 'hz_gb_2312', 'elot_928', 'iso8859_1', 'eucjp', 'iso_ir_199', 'ibm865', 'cspc862latinhebrew', '863', 'iso_8859_5', 'latin4', 'windows_1253', 'csisolatingreek', 'latin5', '855', 'windows_1256', 'rot13', 'ms1361', 'windows_1254', 'ibm863', 'iso_8859_14_1998', 'utf8_ucs2', '500', 'iso8859', '775', 'l7', 'l2', 'gb18030_2000', 'l9', 'utf_32be', 'iso_ir_100', 'iso_8859_4', 'iso_ir_157', 'csibm857', 'shiftjis2004', 'iso2022jp_1', 'iso_8859_2_1987', 'cyrillic', 'ibm861', 'ms950', 'ibm437', '866', 'csibm863', '932', 'iso_8859_14', 'cskoi8r', 'csptcp154', '852', 'maclatin2', 'sjis', 'korean', '865', 'u32', 'csshiftjis', 'dbcs', 'csibm037', 'csibm1026', 'bz2', 'quopri', '860', '1255', '861', 'iso_ir_127', 'iso_celtic', 'chinese', 'l8', '1258', 'u_jis', 'cspc850multilingual', 'iso_2022_jp_2', 'greek8', 'csibm861', '646', 'unicode_1_1_utf_7', 'ibm862', 'latin2', 'ecma_118', 'csisolatinarabic', 'zlib', 'iso2022jp_3', 'ksx1001', '858', 'hkscs', 'shiftjisx0213', 'base64', 'ibm857', 'maccentraleurope', 'latin7', 'ruscii', 'cp_is', 'iso_ir_101', 'us_ascii', 'hebrew', 'ansi_x3.4_1986', 'csiso2022jp', 'iso_8859_15', 'ibm860', 'ebcdic_cp_us', 'x_mac_simp_chinese', 'csibm855', '1250', 'maciceland', 'iso_ir_148', 'iso2022jp', 'u16', 'u7', 's_jisx0213', 'iso_8859_6_1987', 'csisolatinhebrew', 'csibm424', 'quoted_printable', 'utf_16le', 'tis260', 'utf', 'x_mac_trad_chinese', '1256', 'cp866u', 'jisx0213', 'csiso58gb231280', 'windows_1250', 'cp1361', 'kz_1048', 'asmo_708', 'utf_16be', 'ecma_114', 'eucjis2004', 'x_mac_japanese', 'utf8', 'iso_ir_6', 'cp_gr', '037', 'big5_tw', 'eucgb2312_cn', 'iso_2022_jp_3', 'euc_cn', 'iso_8859_13', 'iso_8859_5_1988', 'maccyrillic', 'ks_c_5601_1987', 'greek', 'ibm869', 'roman8', 
'csibm500', 'ujis', 'arabic', 'strk1048_2002', '424', 'iso_8859_11_2001', 'l5', 'iso_646.irv_1991', '869', 'ibm855', 'eucjisx0213', 'latin1', 'csibm866', 'ibm864', 'big5_hkscs', 'sjis_2004', 'us', 'iso_8859_7', 'macturkish', 'iso_2022_jp_2004', '437', 'windows_1255', 's_jis_2004', 's_jis', '1257', 'ebcdic_cp_wt', 'iso2022jp_2004', 'ms949', 'utf32', 'shiftjis', 'latin', 'windows_1251', '1125', 'ks_x_1001', 'iso_8859_10_1992', 'mskanji', 'cyrillic_asian', 'ibm273', 'tis620', '1026', 'csiso2022kr', 'cspc775baltic', 'iso_ir_58', 'latin8', 'ibm424', 'iso_ir_126', 'ansi_x3.4_1968', 'windows_1257', 'windows_1252', '949', 'base_64', 'ms936', 'csisolatin2', 'utf7', 'iso646_us', 'macroman', '1253', '862', 'iso_8859_1_1987', 'csibm860', 'gb2312_80', 'latin10', 'ksc5601', 'iso_8859_10', 'utf8_ucs4', 'csisolatin4', 'ebcdic_cp_be', 'iso_8859_1', 'hzgb', 'ansi_x3_4_1968', 'ks_c_5601', 'l3', 'cspc8codepage437', 'iso_8859_7_1987', '8859', 'ibm500', 'ibm1026', 'iso_8859_6', 'csibm865', 'ibm866', 'windows_1258', 'iso_ir_138', 'l4', 'utf_32le', 'iso_8859_11', 'thai', '864', 'euc_jis2004', 'cp936', '1251', 'zip', 'unicodebigunmarked', 'csHPRoman8', 'csibm858', 'utf16', '936', 'ibm037', 'iso_8859_8_1988', '857', 'csibm869', 'ebcdic_cp_he', 'cp819', 'euccn', 'iso_8859_2', 'ms932', 'iso_2022_jp_1', 'iso_2022_kr', 'csisolatin6', 'iso_2022_jp', 'x_mac_korean', 'latin3', 'csbig5', 'hz_gb', 'csascii', 'u8', 'csisolatin5', 'csisolatincyrillic', 'ms_kanji', 'cspcp852', 'rk1048', 'iso2022jp_ext', 'csibm273', 'iso_2022_jp_ext', 'ibm858', 'ibm850', 'sjisx0213', 'tis_620_2529_1', 'l10', 'iso_ir_109', 'ibm1125', '1254', 'euckr', 'tis_620_0', 'l1', 'ibm819', 'iso2022kr', 'ibm367', '950', 'r8', 'hex', 'cp154', 'tis_620_2529_0', 'iso_8859_16', 'pt154', 'ebcdic_cp_ca', 'ibm1140', 'l6', 'csibm864', 'csisolatin1', 'csisolatin3', 'latin6', 'iso_8859_9_1989', 'iso_8859_3_1988', 'unicodelittleunmarked', 'macintosh', '273', 'latin9', 'iso_8859_4_1988', 'iso_8859_9', 'ebcdic_cp_nl', 'iso_ir_144'])

Example 2: Code to encode the string

Python3
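
A minimal sketch consistent with the output shown for this example, assuming the string being encoded is the single character "¶" as in the first example; the variable names are illustrative.

```python
text = "¶"  # PILCROW SIGN, code point U+00B6

# UTF-8 needs two bytes for this code point.
encoded = text.encode("utf-8")
print(encoded)
```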


Output:

b'\xc2\xb6'

Errors when using wrong encoding scheme

Example 1: Python String encode() method will raise UnicodeEncodeError if the string contains a character that the chosen encoding cannot represent
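
A minimal sketch that triggers this error, assuming the input is the single character "¶" and the target encoding is ASCII (both inferred from the error message in the output). The exception is caught here only so that the snippet runs to completion; without the try/except the traceback would end with the message shown.

```python
text = "¶"  # U+00B6 is outside the 0-127 ASCII range

message = None
try:
    text.encode("ascii")  # errors='strict' is the default
except UnicodeEncodeError as exc:
    message = str(exc)
    print("UnicodeEncodeError:", message)
```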

Output:

UnicodeEncodeError: 'ascii' codec can't encode character '\xb6' in position 0: ordinal not in range(128)

Example 2: Using ‘errors’ parameter to ignore errors while encoding

With the errors parameter set to ‘ignore’, Python String encode() silently drops any character that cannot be represented in the specified encoding scheme instead of raising an exception.

Python3

string = "123-¶"  # '¶' (U+00B6) is not an ASCII character

# ignore if there are any errors

print(string.encode('ascii', errors='ignore'))

Output:

b'123-'

FAQs

What is the purpose of encode() and decode() methods? ›

The encode() method converts a string into bytes using the given encoding scheme. The decode() method works in the opposite direction: it accepts the encoding that the bytes were encoded with and returns the original string.
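
A minimal round-trip sketch, assuming UTF-8 on both sides; the sample string and variable names are illustrative.

```python
original = "pythön!"

encoded = original.encode("utf-8")   # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str

print(encoded)
print(decoded == original)
```

The round trip only restores the original text when decode() is given the same encoding that encode() used.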

What is the default encoding of the str.encode() method in Python? ›

Using the string encode() method, you can convert Unicode strings into any encoding supported by Python. If no encoding argument is given, encode() uses utf-8.
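
A short check of the default, as a sketch with an illustrative sample character:

```python
text = "¶"

# Calling encode() with no arguments is equivalent to asking for UTF-8.
default_bytes = text.encode()
explicit_bytes = text.encode("utf-8")

print(default_bytes == explicit_bytes)
```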

What does decode() do in Python? ›

The Python bytes decode() method converts a bytes object to a string. Both encode() and decode() allow us to specify the error handling scheme to use for encoding/decoding errors. The default is 'strict', meaning that encoding errors raise a UnicodeEncodeError and decoding errors raise a UnicodeDecodeError.
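
A minimal sketch of decode() with different error handling schemes, assuming an illustrative byte string that ends in 0xFF, which is never valid in UTF-8:

```python
data = b"abc\xff"  # the final byte cannot appear in valid UTF-8

error_name = None
try:
    data.decode("utf-8")  # errors='strict' is the default
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# 'replace' substitutes U+FFFD (the replacement character);
# 'ignore' drops the offending byte.
replaced = data.decode("utf-8", errors="replace")
ignored = data.decode("utf-8", errors="ignore")

print(error_name, repr(replaced), repr(ignored))
```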

How do I utf-8 encode a string in Python? ›

How to Convert a String to UTF-8 in Python?
  1. string1 = "apple"
     string2 = "Preeti125"
     string3 = "12345"
     string4 = "pre@12"
  2. string.encode(encoding='UTF-8', errors='strict')
  3. # unicode string
     string = 'pythön!'
     # default encoding to utf-8
     string_utf = string.encode()
     print('The encoded version is:', string_utf)

What is encode method? ›

An encoding method is the application of established industry rules to a coded character set to produce an encoded character scheme. Such rules prescribe the number of bits required for storing the numeric representation of a specific character and its code position in the encoding.

What is the purpose of encode() in the socket? ›

send() Function Of Python Socket Class

Sockets transmit bytes, not strings. If the data is in string format, the encode() method of str can be called to convert it into bytes before sending. flags – This is an optional parameter with a default value of 0. The valid values of the flags parameter are those supported by the operating system.
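
A minimal sketch of encoding before sending and decoding after receiving. It assumes `socket.socketpair()` as a stand-in for a real network connection so that the snippet is self-contained; the message text and variable names are illustrative.

```python
import socket

# socketpair() returns two already-connected sockets, so no server is needed.
sender, receiver = socket.socketpair()
with sender, receiver:
    payload = "héllo".encode("utf-8")  # str -> bytes before sending
    sender.sendall(payload)
    received = receiver.recv(1024)     # the other end receives bytes

message = received.decode("utf-8")     # bytes -> str
print(message)
```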

What is encode in Python example? ›

The encode() function in Python is responsible for returning the encoded form of any given string. The code points are translated into a series of bytes to efficiently store such strings. This process is defined as encoding. Python uses utf-8 as its encoding by default.

What is the default value of encoding in encode()? ›

The default value of encoding is utf-8.

What is str () in Python called? ›

Python has a built-in string class named "str" with many handy features (there is an older module named "string" which you should not use). String literals can be enclosed by either double or single quotes, although single quotes are more commonly used.

What are the types of encoding in Python? ›

UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding. (There are also UTF-16 and UTF-32 encodings, but they are less frequently used than UTF-8.)

What is the difference between encoding and decoding? ›

Decoding involves translating printed words to sounds or reading, and encoding is just the opposite: using individual sounds to build and write words.

What is the difference between encode and decode Python? ›

In the Python programming language, encoding represents a Unicode string as a sequence of bytes. This commonly occurs when you transfer a string over a network or save it to a disk file. Decoding transforms a sequence of bytes back into a Unicode string.

What is encoding =' UTF-8? ›

UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character. This is the meaning of “UTF”, or “Unicode Transformation Format.”

How do I encode a string? ›

Another way to encode a string is to use the Base64 encoding.
...
For example, consider the following Java code:
  1. String str = " Tschüss";
  2. ByteBuffer buffer = StandardCharsets.UTF_8.encode(str);
  3. String encoded_String = StandardCharsets.UTF_8.decode(buffer).toString();
     assertEquals(str, encoded_String);

Why do we use UTF-8 encoding in Python? ›

UTF-8: It uses 1, 2, 3 or 4 bytes to encode every code point. It is backwards compatible with ASCII. All ASCII characters, which include the unaccented English letters, need just 1 byte, which is quite efficient. More bytes are needed only for characters outside the ASCII range.
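
The variable width can be observed directly. A minimal sketch, assuming four illustrative sample characters that need one, two, three and four bytes respectively:

```python
samples = ["A", "¶", "€", "😀"]

# Number of bytes UTF-8 uses for each character.
byte_counts = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, count in byte_counts.items():
    print(f"U+{ord(ch):04X} -> {count} byte(s)")

# ASCII compatibility: the UTF-8 and ASCII encodings of "A" are identical.
ascii_compatible = "A".encode("utf-8") == "A".encode("ascii")
```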

What is encode with example? ›

Encode means to change something into a programming code. For instance, changing a letter into the binary code for that letter or changing an analog sound into a digital file.

Why do we need to encode in Python? ›

Encoding, at its most basic, is a method of converting characters (such as letters, punctuation, symbols, whitespace, and control characters) to numbers and, eventually, bits. Each character can be encoded to a distinct bit sequence. To achieve this encoding process, Python encode() function is used.

How do you encode a message in Python? ›

Steps to develop message encode-decode project
  1. Import Libraries. from tkinter import * ...
  2. Initialize Window. root = Tk() ...
  3. Define variables. Text = StringVar() ...
  4. Function to encode. def Encode(key,message): ...
  5. Function to decode. def Decode(key,message): ...
  6. Function to set mode. def Mode(): ...
  7. Function to exit window. ...
  8. Function to reset window.

How do you encode? ›

Encoding information occurs through automatic processing and effortful processing. If someone asks you what you ate for lunch today, more than likely you could recall this information quite easily. This is known as automatic processing , or the encoding of details like time, space, frequency, and the meaning of words.

Why do we encode characters? ›

A character encoding provides a key to unlock (i.e. crack) the code. It is a set of mappings between the bytes in the computer and the characters in the character set. Without the key, the data looks like garbage.

What are the four types of encoding? ›

There are four different types of encoding: visual, acoustic, semantic, and elaborative. Encoding is how the information is processed, stored, and retrieved; however, if it is encoded incorrectly, this can lead to a false memory.

How do you use encode in a sentence? ›

Each party encodes confidential data in a form not directly readable by the other. The human mind is like a computer which encodes information and stores it for future use.

What are the 3 types of character encoding? ›

There are three different Unicode character encodings: UTF-8, UTF-16 and UTF-32. Of these three, only UTF-8 should be used for Web content.

Which encoding is the best? ›

UTF-8 has been the most common encoding for the World Wide Web since 2008. As of December 2022, UTF-8 accounts for on average 98.0% of all web pages and 989 of the top 1,000 highest ranked web pages. The next most popular encoding, ISO-8859-1, is used by 9 of those sites.

Is str() a method? ›

Strictly speaking, str() is a built-in type constructor rather than a method. It takes three parameters: object - whose string representation is to be returned. encoding - the encoding that the given bytes object is decoded with (can be UTF-8, ASCII, etc). errors - the response when decoding fails (can be strict, ignore, replace, etc).
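
A minimal sketch of the three-parameter form, assuming an illustrative UTF-8 byte string; when `encoding` is given, `str()` decodes the bytes in the same way as `bytes.decode()`.

```python
raw = b"pyth\xc3\xb6n!"

# str(bytes, encoding, errors) decodes the bytes into text.
text = str(raw, encoding="utf-8", errors="strict")
same = raw.decode("utf-8", errors="strict")

# With a non-bytes object, str() returns its string representation.
number_text = str(42)

print(text, same, number_text)
```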

What does str at () return? ›

Return value : Returns character at the specified position in the string.

What is __ init __ in Python? ›

The __init__ method is the Python equivalent of the C++ constructor in an object-oriented approach. The __init__ function is called every time an object is created from a class. The __init__ method lets the class initialize the object's attributes and serves no other purpose. It is only used within classes.

What are the 3 types of methods in Python? ›

There are basically three types of methods in Python: Instance Method. Class Method. Static Method.

How many encoding techniques are there? ›

12 different encoding techniques from basic to advanced

There are 12 basic encoding schemes that you should put in your tool kit.

What are the 3 main ways information is encoded? ›

Summary. Memory encoding is a process by which the sensory information is modified and stored in the brain. The three major types of memory encoding include visual encoding, acoustic encoding, and semantic encoding.

Which comes first encoding or decoding? ›

In order to do one part well, you need the other part. In order to read, you need to decode (sound out) words. In order to spell, you need to encode words. In other words, to spell, you need to pull the sounds apart within a word and match the letters to the sounds.

Is encoding and coding same? ›

No. Coding means writing instructions (a program) for a computer, while encoding means converting data from one representation to another, such as turning text into bytes.

What is encoding and decoding with example? ›

For example, you may realize you're hungry and encode the following message to send to your roommate: “I'm hungry. Do you want to get pizza tonight?” As your roommate receives the message, they decode your communication and turn it back into thoughts to make meaning.

Is encoded and encrypted the same? ›

While encryption does involve encoding data, the two are not interchangeable terms. Encryption is used when referring to data that has been securely encoded, so that it cannot be read without a key. Encoding is used when talking about data that is not securely encoded and can be reversed by anyone who knows the scheme. There are two basic types of encryption: symmetric key and public key.
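
The difference can be illustrated with Base64, which is an encoding and not encryption. A minimal sketch using the standard-library `base64` module with an illustrative message; note that no key is involved in reversing it.

```python
import base64

secret = "secret"

# Base64 changes the representation but provides no confidentiality.
encoded = base64.b64encode(secret.encode("utf-8"))

# Anyone can reverse it without a key.
recovered = base64.b64decode(encoded).decode("utf-8")

print(encoded, recovered)
```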

Is UTF-8 or UTF-16 better? ›

UTF-16 is better where ASCII is not predominant, since it uses 2 bytes per character, primarily. UTF-8 will start to use 3 or more bytes for the higher order characters where UTF-16 remains at just 2 bytes for most characters. UTF-32 will cover all possible characters in 4 bytes.
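
A minimal sketch comparing encoded sizes, assuming four illustrative characters. The little-endian codec names (`utf-16-le`, `utf-32-le`) are used so that no byte order mark is added to the counts.

```python
samples = ["A", "é", "€", "😀"]
encodings = ("utf-8", "utf-16-le", "utf-32-le")

# Bytes needed per character in each encoding.
sizes = {
    ch: {enc: len(ch.encode(enc)) for enc in encodings}
    for ch in samples
}

for ch, per_encoding in sizes.items():
    print(f"U+{ord(ch):04X}", per_encoding)
```

Characters outside the Basic Multilingual Plane, such as the emoji here, take 4 bytes in all three encodings.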

What is the difference between UTF-8 and utf8? ›

UTF-8 is a valid IANA character set name, whereas utf8 is not. It's not even a valid alias. It refers to an implementation-provided locale, where settings of language, territory, and codeset are implementation-defined. Python itself accepts both spellings as names for the same codec.
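
In Python, several spellings resolve to the same codec. A minimal sketch using `codecs.lookup()` from the standard library; the list of labels is an illustrative selection.

```python
import codecs

labels = ["UTF-8", "utf-8", "utf8", "utf_8", "U8"]

# lookup() normalizes the label and returns a CodecInfo
# whose .name is the canonical codec name.
canonical = {label: codecs.lookup(label).name for label in labels}

print(canonical)
```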

Are UTF-8 and Unicode the same? ›

The Difference Between Unicode and UTF-8

Unicode is a character set. UTF-8 is encoding. Unicode is a list of characters with unique decimal numbers (code points).
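
A minimal sketch of the distinction, assuming the illustrative character "¶": the code point is a single number, while the bytes depend on which encoding is chosen.

```python
ch = "¶"

code_point = ord(ch)             # the Unicode code point as an integer
utf8_bytes = ch.encode("utf-8")  # that code point encoded as UTF-8
utf16_bytes = ch.encode("utf-16-le")

print(code_point, hex(code_point), utf8_bytes, utf16_bytes)
```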

How do I know if a string is encoded? ›

In PHP, mb_detect_encoding() is used to detect the character encoding. It can detect the character encoding for a string from an ordered list of candidates. This function is supported in PHP 4.0.6 or higher.

How do you check if a string is encoded? ›

The decoded string will contain a space character instead and it will not be equal. A better way would be to compare the lengths. If the original string is larger than the decoded string, then the original was encoded.

How do you change encoded strings? ›

Strings are immutable in Java, which means we cannot change a String character encoding. To achieve what we want, we need to copy the bytes of the String and then create a new one with the desired encoding.

Is UTF-8 character set or encoding? ›

UTF-8 is a character encoding system. It lets you represent characters as ASCII text, while still allowing for international characters, such as Chinese characters. As of the mid 2020s, UTF-8 is one of the most popular encoding systems.

Is UTF-8 ASCII or Unicode? ›

UTF-8 encodes Unicode characters into a sequence of 8-bit bytes. The standard has a capacity for over a million distinct codepoints and is a superset of all characters in widespread use today. By comparison, ASCII (American Standard Code for Information Interchange) includes 128 character codes.

How do I encode an UTF-8 file? ›

UTF-8 Encoding in Notepad (Windows)
  1. Open your CSV file in Notepad.
  2. Click File in the top-left corner of your screen.
  3. Click Save as...
  4. In the dialog which appears, select the following options: In the "Save as type" drop-down, select All Files. In the "Encoding" drop-down, select UTF-8. ...
  5. Click Save.

Is UTF-8 the default encoding? ›

On .NET Core, the Default property always returns the UTF8Encoding. UTF-8 is supported on all the operating systems (Windows, Linux, and macOS) on which .NET Core applications run.

Why UTF-8 is used in Python? ›

UTF-8 is fairly compact; the majority of commonly used characters can be represented with one or two bytes. If bytes are corrupted or lost, it's possible to determine the start of the next UTF-8-encoded code point and resynchronize. It's also unlikely that random 8-bit data will look like valid UTF-8.

Why is it called UTF-8? ›

UTF-8 is a variable-length character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit.

Is UTF-8 outdated? ›

The encoding itself is not outdated. The deprecation concerns MySQL's character set names: in MySQL, utf8 is currently an alias for utf8mb3, but it is now deprecated as such, and utf8 is expected subsequently to become a reference to utf8mb4. Beginning with MySQL 8.0.

Article information

Author: Horacio Brakus JD

Last Updated: 12/18/2022
