Pandas: DataFrame could have a to_markdown method.

Created on 10 Sep 2015 · 17 Comments · Source: pandas-dev/pandas

Similar to to_latex and to_html.

IO HTML Output-Formatting

All 17 comments

Is there a widely agreed upon format for markdown tables? It's not in Gruber's original version, and IIRC CommonMark even punted on them.

Good point. I'm not sure. Maybe the method could specify a flavor, e.g., GitHub or Pandoc?

I suspect that anything too complicated won't find much support here (maintaining stuff is no fun). Especially now that we have pipe. df.pipe(to_markdown) isn't much worse than df.to_markdown.
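To make the df.pipe(to_markdown) idea concrete, here is a minimal sketch of what such a function could look like. The to_markdown helper below is hypothetical (it is not a pandas API in this thread), and the alignment rule (right-align numeric columns, left-align everything else) is an assumption chosen to resemble GitHub-style pipe tables.

```python
import pandas as pd


def to_markdown(df: pd.DataFrame) -> str:
    """Render a DataFrame as a GitHub-style pipe table (hypothetical helper)."""
    # The first header cell is left empty for the index column.
    header = [""] + [str(col) for col in df.columns]
    # Assumption: right-align the index and numeric columns, left-align the rest.
    aligns = ["---:"] + [
        "---:" if pd.api.types.is_numeric_dtype(dtype) else ":---"
        for dtype in df.dtypes
    ]
    rows = [
        [str(idx)] + [str(value) for value in row]
        for idx, row in zip(df.index, df.itertuples(index=False, name=None))
    ]
    lines = [header, aligns] + rows
    return "\n".join("| " + " | ".join(cells) + " |" for cells in lines)


df = pd.DataFrame({"test1": [385, 627], "test2": ["apple", "banana"]})
print(df.pipe(to_markdown))
```

Because pipe just passes the DataFrame as the first argument, any function with this shape works without touching pandas itself.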

Is your use case here to present / read the Markdown, or convert it to something else? Since Markdown accepts inline HTML, you may be able to get away with to_html before converting.

My specific use case would be to create a DataFrame, copy/paste to GitHub flavored Markdown document (or GitHub issues, PRs, wikis, etc.), but still be able to read/edit it later without the HTML mess.

As an example, below is a table I copied from a Jupyter Qt console. First I printed the DataFrame to the terminal, copied/pasted here, then entered the dashes and pipes manually for the GH Markdown. Then I generated HTML with the to_html method, which doesn't render quite right here.

GitHub Markdown

Source

    | speed | Re_tip | Re_root | Re_ave | Re_D
---|--------|-------|--------|--------|--------
0 | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05
1 | 6.0e-01 | 7.4e+04 |  1.2e+05 | 9.9e+04 | 6.4e+05
2 | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05
3 | 1.0e+00 | 1.2e+05 |  2.1e+05 | 1.7e+05 | 1.1e+06
4 | 1.2e+00 | 1.5e+05 |  2.5e+05 | 2.0e+05 | 1.3e+06

Results

| | speed | Re_tip | Re_root | Re_ave | Re_D |
| --- | --- | --- | --- | --- | --- |
| 0 | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05 |
| 1 | 6.0e-01 | 7.4e+04 | 1.2e+05 | 9.9e+04 | 6.4e+05 |
| 2 | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05 |
| 3 | 1.0e+00 | 1.2e+05 | 2.1e+05 | 1.7e+05 | 1.1e+06 |
| 4 | 1.2e+00 | 1.5e+05 | 2.5e+05 | 2.0e+05 | 1.3e+06 |

Pandas HTML

Source

<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>speed</th>\n      <th>Re_tip</th>\n      <th>Re_root</th>\n      <th>Re_ave</th>\n      <th>Re_D</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>4.0e-01</td>\n      <td>5.0e+04</td>\n      <td>8.3e+04</td>\n      <td>6.6e+04</td>\n      <td>4.3e+05</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>6.0e-01</td>\n      <td>7.4e+04</td>\n      <td>1.2e+05</td>\n      <td>9.9e+04</td>\n      <td>6.4e+05</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>8.0e-01</td>\n      <td>9.9e+04</td>\n      <td>1.7e+05</td>\n      <td>1.3e+05</td>\n      <td>8.6e+05</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1.0e+00</td>\n      <td>1.2e+05</td>\n      <td>2.1e+05</td>\n      <td>1.7e+05</td>\n      <td>1.1e+06</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1.2e+00</td>\n      <td>1.5e+05</td>\n      <td>2.5e+05</td>\n      <td>2.0e+05</td>\n      <td>1.3e+06</td>\n    </tr>\n  </tbody>\n</table>

Results

n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n

One issue is what the header rows should be for MultiIndex columns/index. It seems that GH registers only the first header row (unless there is some syntactical trick).
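One possible way around the single-header-row limit is to collapse the column levels into one label before formatting. The sketch below is only an illustration: flatten_columns and its sep parameter are hypothetical names, and joining levels with " / " is an arbitrary choice.

```python
import pandas as pd


def flatten_columns(df: pd.DataFrame, sep: str = " / ") -> pd.DataFrame:
    """Collapse MultiIndex column labels into single strings (hypothetical helper)."""
    if not isinstance(df.columns, pd.MultiIndex):
        return df
    flat = df.copy()
    # Each MultiIndex label is a tuple with one entry per level.
    flat.columns = [sep.join(str(level) for level in col) for col in df.columns]
    return flat


columns = pd.MultiIndex.from_product([["Re"], ["tip", "root"]])
df = pd.DataFrame([[5.0e4, 8.3e4]], columns=columns)
print(list(flatten_columns(df).columns))
```

The flattened frame then has a single header row, so it fits whatever pipe-table formatter is used afterwards, at the cost of losing the visual grouping of the levels.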

For _most_ flavors you can include html.

As the new styler uses jinja (https://github.com/pydata/pandas/pull/10250), this shouldn't be too hard? I would also love this feature for knitpy, which is a markdown-based format that is converted into all kinds of formats (docx...pdf...html).

Up to now (and as workarounds...), I recommended tabulate to convert a DataFrame to markdown. There is also pandoc (e.g. via pypandoc), which can take the output of df.to_html() and convert that to markdown.

@JanSchulz yes, I could see a .to_markdown() method in Styler (as a better API than putting it directly in DataFrame, though we could have that as well).

IMO, on .styler it doesn't make sense, markdown unfortunately does not provide styles :-( I only mentioned .styler as it already (soft) requires jinja and that should make it easy to build a markdown representation...

IMO it should be DataFrame.to_markdown() and DataFrame._repr_markdown_()
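For context, _repr_markdown_ is one of the rich display hooks that IPython/Jupyter looks for on an object. A rough sketch of how such a hook could behave, using a wrapper class rather than DataFrame itself, might look like this. MarkdownView is a hypothetical name, and the plain "---" alignment row is a simplifying assumption.

```python
import pandas as pd


class MarkdownView:
    """Hypothetical wrapper that gives a DataFrame a Markdown rich repr."""

    def __init__(self, df: pd.DataFrame) -> None:
        self.df = df

    def _repr_markdown_(self) -> str:
        # Move the index into a regular column so it appears in the table.
        table = self.df.reset_index()
        header = "| " + " | ".join(map(str, table.columns)) + " |"
        rule = "|" + " --- |" * len(table.columns)
        body = [
            "| " + " | ".join(map(str, row)) + " |"
            for row in table.itertuples(index=False, name=None)
        ]
        return "\n".join([header, rule, *body])


view = MarkdownView(pd.DataFrame({"test1": [385], "test2": ["apple"]}))
print(view._repr_markdown_())
```

In a notebook, leaving view as the last expression in a cell should render the table, since the frontend picks up the Markdown representation.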

I'd love DataFrame.to_markdown() too. I was surprised when it didn't work already.

Would pandas be open to adding a dependency? tabulate does exactly this and is pip installable.

from tabulate import tabulate
print(tabulate(df, headers='keys', tablefmt='pipe'))
|    |   test1 | test2   |   test3 |   test4 |    test5 |
|---:|--------:|:--------|--------:|--------:|---------:|
|  0 |     385 | apple   |     288 |     745 |  64.9352 |
|  1 |     627 | banana  |       3 |     792 | 226.955  |
|  2 |     486 | pear    |     446 |     503 | 110.454  |
|  3 |     368 | orange  |     887 |     808 | 297.62   |
|  4 |     550 | grape   |     235 |      96 | 240.324  |
|  5 |     749 | peach   |      22 |     598 | 240.642  |

I'm +0 on adding it as an optional dependency and adding a to_markdown method.

In the meantime, df.pipe(tabulate, headers='keys', tablefmt='pipe') is a workaround. I suppose it would be nice to avoid typing those keyword arguments every time :)

I think adding a to_markdown method which calls tabulate as a dependency would be ok

print(tabulate(df, headers='keys', tablefmt='pipe')) is not as pretty as pandas' own output for a MultiIndex; it would be cool if whatever pandas lands on is.

I'm +0 on adding it as an optional dependency and adding a to_markdown method. In the meantime, df.pipe(tabulate, headers='keys', tablefmt='pipe') is a workaround. I suppose it would be nice to avoid typing those keyword arguments every time :)

@TomAugspurger Is this all the method would have to do? If so, I'd be happy to work on a PR

I think we would likely accept a PR for something like the above

Sure, I've submitted a simple PR.

What should this method return if there's a wide DataFrame? (or is this an enhancement that would get taken care of at a later stage?)
