When raw HTML is saved in a database or a variable, special characters that are meant as plain text rather than markup must be escaped so that they are not misinterpreted as markup.
Examples of these characters are <, >, ", ', and &.
If these characters are not escaped, the browser may display a web page incorrectly. For example, the quotation marks around “Python Programs” in the following HTML content could be mistaken for the end of one string and the beginning of another.
Hello this is "Python Programs"
Special entity names and entity numbers are essentially escape sequences that replace these characters in HTML. In HTML, escape sequences begin with an ampersand and terminate with a semicolon.
The table below lists the special characters that HTML 4 recommends escaping, as well as their entity names and numbers:
Character  Entity name  Entity number
>          &gt;         &#62;
<          &lt;         &#60;
"          &quot;       &#34;
&          &amp;        &#38;
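The entity names and numbers in the table can be cross-checked against Python's standard html.entities module. The following is only a minimal sketch; the helper name entity_forms is hypothetical and exists purely for illustration.

```python
from html.entities import codepoint2name

def entity_forms(char):
    """Return the (entity name, entity number) forms for a single character."""
    codepoint = ord(char)
    name = codepoint2name[codepoint]  # e.g. 'gt' for '>'
    return f"&{name};", f"&#{codepoint};"

# Print one row per special character from the table above.
for char in '><"&':
    name_form, number_form = entity_forms(char)
    print(char, name_form, number_form)
```

Each printed row should match the corresponding row of the table, since the entity number is simply the character's code point written in decimal.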
We can use the html.escape() function in Python to escape these characters. It takes the string to escape as its argument, plus one optional argument, quote, which is True by default. When quote is True, the characters " and ' are escaped in addition to &, <, and >. To use html.escape(), you must first import the html module; the function is available in Python 3.2 and higher.
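To make the effect of the quote argument concrete, here is a short sketch that escapes the same string twice. The sample text and variable names are made up for illustration.

```python
import html

text = 'Tom & "Jerry" said <hi> to O\'Brien'

# Default (quote=True): &, <, >, " and ' are all replaced.
escaped_all = html.escape(text)

# quote=False: only &, < and > are replaced; quotes are left alone.
escaped_no_quotes = html.escape(text, quote=False)

print(escaped_all)
print(escaped_no_quotes)
```

Leaving quote at its default is the safer choice when the result may end up inside an HTML attribute value, because an unescaped quote there would end the attribute early.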
Python html.escape() Function:
The html.escape() function returns a copy of a string in which the special HTML characters are replaced with their HTML-safe escape sequences.
Syntax:
html.escape(s, quote=True)
Return Value:
The escape() function returns a new string in which the special characters have been replaced by their escape sequences. All other characters are returned unchanged.
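As a small sketch of what escape() returns: the result is an ordinary str, the input is left untouched, and characters other than the special ones (including non-ASCII letters) pass through as they are. The sample string is an assumption chosen for illustration.

```python
import html

original = 'café <b>crème</b>'
escaped = html.escape(original)

print(type(escaped).__name__)  # the result is a str, not bytes
print(escaped)                 # only < and > were replaced here
print(original)                # the input string is not modified
```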
html.escape() Function in Python
Method #1: Using escape() Function (Static Input)
Approach:
- Import the html module using the import keyword.
- Give the HTML string as static input and store it in a variable.
- Pass the HTML string as an argument to the escape() function to escape its special characters, and store the result in another variable.
- Print the escaped string.
- The program ends.
Below is the implementation:
# Import html module using the import keyword
import html

# Give the HTML string as static input and store it in a variable.
gvn_html = '<html><head></head><body><h1>welcome to "Python-programs"</h1></body></html>'

# Pass the above given HTML string as an argument to the escape() function to
# escape its special characters and store the result in another variable.
rslt = html.escape(gvn_html)

# Print the escaped string for the given HTML string.
print("The escaped string for the given HTML string is:")
print(rslt)

Output:
The escaped string for the given HTML string is:
&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body&gt;&lt;h1&gt;welcome to &quot;Python-programs&quot;&lt;/h1&gt;&lt;/body&gt;&lt;/html&gt;
Method #2: Using escape() Function (User Input)
Approach:
- Import the html module using the import keyword.
- Give the HTML string as user input using the input() function and store it in a variable.
- Pass the HTML string as an argument to the escape() function to escape its special characters, and store the result in another variable.
- Print the escaped string.
- The program ends.
Below is the implementation:
# Import html module using the import keyword
import html

# Give the HTML string as user input using the input() function and store it in a variable.
gvn_html = input("Enter some random HTML string:\n")
print()

# Pass the above given HTML string as an argument to the escape() function to
# escape its special characters and store the result in another variable.
rslt = html.escape(gvn_html)

# Print the escaped string for the given HTML string.
print("The escaped string for the given HTML string is:")
print(rslt)

Output:
Enter some random HTML string:
'<html><body><p>Good morning,&& <all> </p></body></html>'

The escaped string for the given HTML string is:
&#x27;&lt;html&gt;&lt;body&gt;&lt;p&gt;Good morning,&amp;&amp; &lt;all&gt; &lt;/p&gt;&lt;/body&gt;&lt;/html&gt;&#x27;
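In practice, escaping is usually applied to text before it is placed inside a larger HTML fragment. The following is a minimal sketch of that idea, not a complete templating solution; the function name render_comment and the sample values are hypothetical.

```python
import html

def render_comment(author, comment):
    """Build an HTML fragment from untrusted text (hypothetical helper)."""
    # Escape both values so that quotes, angle brackets and ampersands
    # are shown as text instead of being parsed as markup.
    safe_author = html.escape(author)
    safe_comment = html.escape(comment)
    return f'<p title="{safe_author}">{safe_comment}</p>'

fragment = render_comment('Ann "The Dev"', '<script>alert(1)</script> & more')
print(fragment)
```

Because quote is left at its default of True, the double quotes in the author name cannot close the title attribute early, and the script tag in the comment is shown as text rather than parsed as markup.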