Processing Markdown in Python
In a previous article titled 'Introduction To Markdown For Programmers' we covered how to write and work with Markdown. In this article I am going to show you how to process Markdown programmatically in Python. We’ll need Python's Markdown library for this.
Setting Up
To get started you need to have Python installed on your system. You should use the latest version of Python. For writing code you can use any editor or IDE you like. I am going to use IDLE, which ships with the Python installation on Windows.
We need the Markdown library, which we can install with the pip command.
pip install markdown
Create a Python script named mark_it.py and start coding. We need to import the markdown library to start working.
import markdown
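As a quick check that the installation works, the following minimal sketch converts a one-word Markdown snippet. The sample text is only an illustration and is not part of the story used later.

```python
import markdown

# Convert a tiny Markdown snippet; bold text should come back wrapped in <strong>.
html = markdown.markdown("**scared**")
print(html)  # <p><strong>scared</strong></p>
```

If the import fails, the library was most likely installed for a different Python interpreter than the one running the script.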
Writing Some Markdown
Throughout this article we will write some Markdown text. For now we are going to write very simple Markdown to test our code. You can use any plain text editor, or a dedicated Markdown editor if you want a live preview. I am going to write the Markdown directly inside our Python script as a multiline string. Here is our initial Markdown. It is a very small story I made up to show you Markdown processing with Python.
# Tale Of Two Villages

Once upon a time there were two villages in the distant part of the country side by side.
One village was full of doctors and the other was full of engineers. Everyone was living in peace until one day a programmer baby grew up in the engineer's village.

## The Programmer Baby

The programmer baby was different from the very first day that he saw the light of this earth.
The first day in his school when he was instructed to write something on the board, he wrote: `print("Hello World")`. Instead of getting surprised everyone got **scared** - that was Python code.

## The Kingdom Of Python

The villagers were afraid of pythons. Many people were attacked by pythons and thus none in that engineer's village were allowed to learn python programming.

## Doctors' Turn

The engineers went to the doctor’s village with that baby to test what was wrong. If no one taught him python coding then how could he learn it? Doctors searched through the deep web and found a website called [Python Baby](http://python-baby.org).

... *to be continued!*
Converting The Markdown Text To HTML
Open the Python script file and import the Markdown library. The module provides a function called markdown() that accepts the Markdown text as a string in its first parameter and returns the generated HTML as a string. Put the above story in a multiline string and pass it to the function.
import markdown

# The story is kept in a multiline string; blank lines separate the Markdown blocks.
md_text = """
# Tale Of Two Villages

Once upon a time there were two villages in the distant part of the country side by side.
One village was full of doctors and the other was full of engineers. Everyone was living in peace until one day a programmer baby grew up in the engineer's village.

## The Programmer Baby

The programmer baby was different from the very first day that he saw the light of this earth.
The first day in his school when he was instructed to write something on the board, he wrote: `print("Hello World")`. Instead of getting surprised everyone got **scared** - that was Python code.

## The Kingdom Of Python

The villagers were afraid of pythons. Many people were attacked by pythons and thus none in that engineer's village were allowed to learn python programming.

## Doctors' Turn

The engineers went to the doctor’s village with that baby to test what was wrong. If no one taught him python coding then how could he learn it? Doctors searched through the deep web and found a website called [Python Baby](http://python-baby.org).

... *to be continued!*
"""

# markdown.markdown() takes the Markdown source as a string and returns the HTML as a string.
html = markdown.markdown(md_text)
print(html)
The output on the console is:
<h1>Tale Of Two Villages</h1>
<p>Once upon a time there were two villages in the distant part of the country side by side.
One village was full of doctors and the other was full of engineers. Everyone was living in peace until one day a programmer baby grew up in the engineer's village.</p>
<h2>The Programmer Baby</h2>
<p>The programmer baby was different from the very first day that he saw the light of this earth.
The first day in his school when he was instructed to write something on the board, he wrote: <code>print("Hello World")</code>. Instead of getting surprised everyone got <strong>scared</strong> - that was Python code.</p>
<h2>The Kingdom Of Python</h2>
<p>The villagers were afraid of pythons. Many people were attacked by pythons and thus none in that engineer's village were allowed to learn python programming.</p>
<h2>Doctors' Turn</h2>
<p>The engineers went to the doctor’s village with that baby to test what was wrong. If no one taught him python coding then how could he learn it? Doctors searched through the deep web and found a website called <a href="http://python-baby.org">Python Baby</a>.</p>
<p>... <em>to be continued!</em></p>
Look at the output above.
- One hash (#) was converted to an h1 tag.
- Two hashes (##) were converted to an h2 tag.
- Text separated by a blank line resulted in a paragraph, that is, a section of text enclosed in an html p tag. A single newline inside such a section did not start a new paragraph.
- []() was converted to an anchor link. The text inside [] was put inside the a tag as its content, and the text inside () was put inside the href attribute of the anchor tag.
- Text surrounded by double star symbols (**) was surrounded by an html strong tag.
- Text surrounded by a single star symbol (*) was surrounded by an html em tag.
- Code surrounded by backticks was surrounded by an html code tag.
The transformation went according to plan. We put into practice what we learned about Markdown in the previous article.
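The rules listed above can also be checked one at a time. The following is a minimal sketch that feeds short, made-up snippets to markdown() and compares each result with the HTML the list describes. The snippets and the samples name are illustrative assumptions, not part of the story.

```python
import markdown

# Each Markdown snippet is paired with the HTML the default converter should produce.
samples = {
    "# Title": "<h1>Title</h1>",
    "## Subtitle": "<h2>Subtitle</h2>",
    "first line\nsame paragraph\n\nnew paragraph":
        "<p>first line\nsame paragraph</p>\n<p>new paragraph</p>",
    "[Python Baby](http://python-baby.org)":
        '<p><a href="http://python-baby.org">Python Baby</a></p>',
    "**scared**": "<p><strong>scared</strong></p>",
    "*to be continued!*": "<p><em>to be continued!</em></p>",
    "`print(1)`": "<p><code>print(1)</code></p>",
}

for source, expected in samples.items():
    html = markdown.markdown(source)
    # Show the conversion and whether it matches the expectation.
    print(repr(source), "->", repr(html), "| matches:", html == expected)
```

Note the third sample: the single newline stays inside one p tag, while the blank line starts a second paragraph.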
Doing More With Extensions
If the default behavior of the markdown() function is not enough, you can use extensions with it. Pass them through the extensions keyword argument, whose value is a list of extensions. Each element of the list can be an instance of an extension class or a string containing the dotted Python path to the extension. If you provide the dotted path, make sure the module is importable, that is, it exists on the Python path.
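As an illustration, here is a minimal sketch that uses the tables extension bundled with the library. The table content is made up for this example, and the extension is chosen only as a convenient demonstration. It shows the extension passed by short name, by dotted path, and as a class instance.

```python
import markdown
from markdown.extensions.tables import TableExtension

# A small pipe table; plain Markdown has no table syntax, so an extension is needed.
table_md = """
Village | Profession
------- | ----------
First   | Doctors
Second  | Engineers
"""

# Without the extension the text is treated as an ordinary paragraph.
plain = markdown.markdown(table_md)

# Three ways to enable the same extension.
by_name = markdown.markdown(table_md, extensions=["tables"])
by_path = markdown.markdown(table_md, extensions=["markdown.extensions.tables"])
by_instance = markdown.markdown(table_md, extensions=[TableExtension()])

print("<table>" in plain)                  # table markup is absent without the extension
print(by_name == by_path == by_instance)   # all three forms give the same HTML
print(by_name)
```

Passing an instance is mainly useful when an extension accepts configuration options; for the defaults, the string forms are shorter.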
Working From The Command Line
You can also convert Markdown to HTML without writing any code by running the module from the command line. The syntax is:
python -m markdown input_file.md > output.html
The '>' is not specific to Python or the Markdown library. It is the shell's output redirection operator: it writes the generated HTML to output.html instead of printing it on the console.
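The same file-to-file conversion can be done from a script. The following is a minimal sketch that assumes the library's markdownFromFile() helper; the temporary directory and the short sample text are only there to keep the example self-contained.

```python
import tempfile
from pathlib import Path

import markdown

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input_file.md"
    dst = Path(tmp) / "output.html"

    # Write a small Markdown file to convert (illustrative content).
    src.write_text("# Tale Of Two Villages\n\n... *to be continued!*\n", encoding="utf-8")

    # Read Markdown from src and write the generated HTML to dst.
    markdown.markdownFromFile(input=str(src), output=str(dst), encoding="utf-8")

    result = dst.read_text(encoding="utf-8")

print(result)
```

This avoids shell redirection entirely, which can be handy when the conversion is one step in a larger Python program.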
Conclusion
We can’t cover all of Markdown in a single article, but I’ve tried to show how to start coding with it, and I explained each step of the transformation from Markdown to HTML. This article only scratches the surface of Markdown processing with Python. I kept it simple intentionally so that beginners do not get intimidated, and for the same reason I did not show many alternatives. In future articles I will explain more, and I also plan to write a dedicated article on extension development for the Markdown library.