Author: gaojiuli

Description:
Convert HTML to Markdown.
Language: Python
Repository: git://github.com/gaojiuli/tomd.git
Created: 2017-05-25T15:13:41Z
Project community: https://github.com/gaojiuli/tomd

License: GNU General Public License v3.0



tomd


When crawling online articles such as news and blog posts, I want to save them as Markdown files rather than in a database.
Tomd converts HTML that was originally generated from Markdown. If a piece of HTML cannot be expressed in Markdown, tomd cannot convert it correctly.
Tomd is a Python tool.
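
To make the limitation above concrete, here is a minimal sketch of rule-based HTML-to-Markdown conversion using only the standard library. It is not tomd's implementation or rule set; `RULES` and `sketch_convert` are hypothetical names invented for this illustration. Tags that have a Markdown equivalent are rewritten, and anything else (a `<video>` element, for instance) passes through unchanged.

```python
import re

# Illustrative rules only: each pairs an HTML pattern with a Markdown rewrite.
# This is NOT tomd's rule set, just a sketch of the general idea.
RULES = [
    (re.compile(r"<h([1-6])>(.*?)</h\1>", re.S),
     lambda m: "#" * int(m.group(1)) + " " + m.group(2)),
    (re.compile(r"<b>(.*?)</b>", re.S), lambda m: "**" + m.group(1) + "**"),
    (re.compile(r"<i>(.*?)</i>", re.S), lambda m: "*" + m.group(1) + "*"),
    (re.compile(r"<code>(.*?)</code>", re.S), lambda m: "`" + m.group(1) + "`"),
]


def sketch_convert(html: str) -> str:
    """Apply each rule in turn; tags without a rule are left untouched."""
    for pattern, rewrite in RULES:
        html = pattern.sub(rewrite, html)
    return html


if __name__ == "__main__":
    print(sketch_convert("<h2>Title</h2>"))
    print(sketch_convert("<b><i>bold italic</i></b>"))
    # No Markdown equivalent exists, so the tag survives as raw HTML.
    print(sketch_convert('<video src="a.mp4"></video>'))
```

Any converter built this way can only be as complete as its rule table, which is why HTML that did not originate from Markdown may not convert cleanly.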

Road map

  • Basic support
  • Full support (nested lists)
  • Command line tool

Installation

pip install tomd

Getting Started

Input

import tomd
tomd.Tomd('<h1>h1</h1>').markdown
# or
tomd.convert('<h1>h1</h1>')

Output

# h1
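
For the crawling use case that motivates the project, the converted text still has to be written to a `.md` file. The following is a rough sketch of one way to do that, not something tomd provides: `save_article` and its `convert` parameter are hypothetical. By default it falls back to `tomd.convert` as shown in the Input example; passing another callable lets you try the helper without tomd installed.

```python
from pathlib import Path
from typing import Callable, Optional


def save_article(html: str, path: str,
                 convert: Optional[Callable[[str], str]] = None) -> Path:
    """Convert an HTML fragment and write the resulting Markdown to `path`.

    Hypothetical helper for illustration; it is not part of tomd.
    """
    if convert is None:
        # Imported lazily so the helper also works with a custom converter.
        import tomd
        convert = tomd.convert
    target = Path(path)
    target.write_text(convert(html), encoding="utf-8")
    return target
```

A crawler could then call `save_article(page_html, "article.md")` for each fetched page, where `page_html` is assumed to hold the article body.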

Usage

from tomd import Tomd
html = """
<h1>h1</h1>
<h2>h2</h2>
<h3>h3</h3>
<h4>h4</h4>
<h5>h5</h5>
<h6>h6</h6>
<p>paragraph
<a href="https://github.com">link</a>
<img src="https://github.com" class="dsad">img</img>
</p>
<ul>
<li>1</li>
<li>2</li>
<li>3</li>
</ul>
<ol>
<li>1</li>
<li>2</li>
<li>3</li>
</ol>
<blockquote>blockquote</blockquote>
<p><code>inline code</code></p>
<pre><code>block code</code></pre>
<p>
<del>del</del>
<b>bold</b>
<i>italic</i>
<b><i>bold italic</i></b>
</p>
<hr/>
<table>
<thead>
<tr>
<th>th1</th>
<th>th2</th>
</tr>
</thead>
<tbody>
<tr>
<td>td</td>
<td>td</td>
</tr>
<tr>
<td>td</td>
<td>td</td>
</tr></tbody></table>
"""
Tomd(html).markdown

Result

# h1
## h2
### h3
#### h4
##### h5
###### h6
paragraph
[link](https://github.com)
![img](https://github.com)
- 1
- 2
- 3
1. 1
1. 2
1. 3
> blockquote
`inline code`
block code
~~del~~
**bold**
*italic*
***bold italic***
---
|th1|th2
|------
|td|td
|td|td