Pandas: DataFrame could have a to_markdown method.

Created on 10 Sep 2015 · 17 Comments · Source: pandas-dev/pandas

Similar to to_latex and to_html.

IO HTML Output-Formatting

petebachant

👍22

All 17 comments

Is there a widely agreed upon format for markdown tables? It's not in Gruber's original version, and IIRC CommonMark even punted on them.

TomAugspurger on 10 Sep 2015

Good point. I'm not sure. Maybe the method could specify a flavor, e.g., GitHub or Pandoc?

petebachant on 10 Sep 2015

I suspect that anything too complicated won't find much support here (maintaining stuff is no fun). Especially now that we have pipe. df.pipe(to_markdown) isn't much worse than df.to_markdown.

Is your use case here to present / read the Markdown, or to convert it to something else? Since Markdown is a superset of HTML, you may be able to get away with to_html before converting.

TomAugspurger on 10 Sep 2015

My specific use case would be to create a DataFrame and copy/paste it into a GitHub-flavored Markdown document (or GitHub issues, PRs, wikis, etc.), but still be able to read/edit it later without the HTML mess.

As an example, below is a table I copied from a Jupyter Qt console. First I printed the DataFrame to the terminal, copied/pasted here, then entered the dashes and pipes manually for the GH Markdown. Then I generated HTML with the to_html method, which doesn't render quite right here.

GitHub Markdown

Source

    | speed | Re_tip | Re_root | Re_ave | Re_D
---|--------|-------|--------|--------|--------
0 | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05
1 | 6.0e-01 | 7.4e+04 |  1.2e+05 | 9.9e+04 | 6.4e+05
2 | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05
3 | 1.0e+00 | 1.2e+05 |  2.1e+05 | 1.7e+05 | 1.1e+06
4 | 1.2e+00 | 1.5e+05 |  2.5e+05 | 2.0e+05 | 1.3e+06

Results

| | speed | Re_tip | Re_root | Re_ave | Re_D |
| --- | --- | --- | --- | --- | --- |
| 0 | 4.0e-01 | 5.0e+04 | 8.3e+04 | 6.6e+04 | 4.3e+05 |
| 1 | 6.0e-01 | 7.4e+04 | 1.2e+05 | 9.9e+04 | 6.4e+05 |
| 2 | 8.0e-01 | 9.9e+04 | 1.7e+05 | 1.3e+05 | 8.6e+05 |
| 3 | 1.0e+00 | 1.2e+05 | 2.1e+05 | 1.7e+05 | 1.1e+06 |
| 4 | 1.2e+00 | 1.5e+05 | 2.5e+05 | 2.0e+05 | 1.3e+06 |

Pandas HTML

Source

<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>speed</th>\n      <th>Re_tip</th>\n      <th>Re_root</th>\n      <th>Re_ave</th>\n      <th>Re_D</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>4.0e-01</td>\n      <td>5.0e+04</td>\n      <td>8.3e+04</td>\n      <td>6.6e+04</td>\n      <td>4.3e+05</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>6.0e-01</td>\n      <td>7.4e+04</td>\n      <td>1.2e+05</td>\n      <td>9.9e+04</td>\n      <td>6.4e+05</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>8.0e-01</td>\n      <td>9.9e+04</td>\n      <td>1.7e+05</td>\n      <td>1.3e+05</td>\n      <td>8.6e+05</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1.0e+00</td>\n      <td>1.2e+05</td>\n      <td>2.1e+05</td>\n      <td>1.7e+05</td>\n      <td>1.1e+06</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1.2e+00</td>\n      <td>1.5e+05</td>\n      <td>2.5e+05</td>\n      <td>2.0e+05</td>\n      <td>1.3e+06</td>\n    </tr>\n  </tbody>\n</table>

Results

n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n

petebachant on 10 Sep 2015
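The manual step described above (typing the dashes and pipes by hand) can be automated with a few lines of plain Python. The sketch below is only an illustration: `to_pipe_table` is a hypothetical helper, not a pandas function, and it works on plain lists so that it runs without any third-party package.

```python
def to_pipe_table(headers, rows):
    """Render headers and rows as a GitHub-flavored Markdown pipe table."""
    # Convert every cell to text so numbers and strings are handled alike.
    table = [[str(cell) for cell in headers]]
    table += [[str(cell) for cell in row] for row in rows]

    # Each column is as wide as its longest cell, with a minimum of 3 so the
    # delimiter row always contains at least three dashes.
    widths = [max(3, *(len(row[i]) for row in table)) for i in range(len(headers))]

    def fmt(cells):
        # Left-justify each cell and wrap the row in pipes.
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    delimiter = "| " + " | ".join("-" * w for w in widths) + " |"
    lines = [fmt(table[0]), delimiter] + [fmt(r) for r in table[1:]]
    return "\n".join(lines)


# First two rows and columns of the table from the comment above.
headers = ["", "speed", "Re_tip"]
rows = [(0, "4.0e-01", "5.0e+04"), (1, "6.0e-01", "7.4e+04")]
markdown = to_pipe_table(headers, rows)
print(markdown)
```

With a real DataFrame, something like `to_pipe_table([""] + list(df.columns), df.itertuples())` should give a similar result, since `itertuples()` yields the index followed by the row values.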

One issue is what the header rows should be for MultiIndex columns/index. It seems that GH registers only the first header row (unless there is some syntactical trick).

For _most_ flavors you can include html.

hayd on 11 Sep 2015
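One possible way around the single-header-row limitation mentioned above is to flatten each multi-level column label into one string before building the table. This is a sketch of that idea only, not something the thread settled on; `flatten_header` and the `" / "` separator are hypothetical choices.

```python
def flatten_header(column, sep=" / "):
    """Join the levels of one column label into a single header string."""
    # MultiIndex column labels are tuples such as ("Re", "tip");
    # plain labels are passed through unchanged.
    if isinstance(column, tuple):
        # Skip empty levels so ("speed", "") becomes "speed", not "speed / ".
        return sep.join(str(level) for level in column if str(level))
    return str(column)


# Hypothetical two-level column labels, as a MultiIndex would yield them.
columns = [("Re", "tip"), ("Re", "root"), ("speed", "")]
headers = [flatten_header(c) for c in columns]
print(headers)
```

The trade-off is that the grouping of columns under a shared top-level label is no longer visible in the rendered table.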

As the new styler uses jinja (https://github.com/pydata/pandas/pull/10250), this shouldn't be too hard? I would also love this feature for knitpy, which uses a Markdown-based format that is converted into all kinds of formats (docx...pdf...html).

Up to now (and as workarounds...), I recommended tabulate to convert a DataFrame to markdown. There is also pandoc (e.g. via pypandoc), which can take the output of df.to_html() and convert that to markdown.

jankatins on 11 Nov 2015

@JanSchulz yes, I could see a .to_markdown() method in Styler (as a better API than putting it directly in DataFrame, though we could have that as well).

jreback on 11 Nov 2015

IMO, on .styler it doesn't make sense; Markdown unfortunately does not provide styles :-( I only mentioned .styler as it already (soft) requires jinja, and that should make it easy to build a Markdown representation...

IMO it should be DataFrame.to_markdown() and DataFrame._repr_markdown_()

jankatins on 11 Nov 2015
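For context on the `_repr_markdown_()` suggestion above: IPython and Jupyter look for a method of that name on an object and render the string it returns as Markdown. The sketch below shows the protocol on a small stand-in class. `MarkdownTable` is hypothetical and is not how pandas implements anything.

```python
class MarkdownTable:
    """Hypothetical wrapper that renders itself as a Markdown pipe table."""

    def __init__(self, headers, rows):
        self.headers = [str(h) for h in headers]
        self.rows = [[str(cell) for cell in row] for row in rows]

    def to_markdown(self):
        # Header row, delimiter row, then one line per data row.
        lines = [
            "| " + " | ".join(self.headers) + " |",
            "| " + " | ".join("---" for _ in self.headers) + " |",
        ]
        lines += ["| " + " | ".join(row) + " |" for row in self.rows]
        return "\n".join(lines)

    def _repr_markdown_(self):
        # Rich display hook: frontends that support Markdown call this
        # method and render the returned string.
        return self.to_markdown()


table = MarkdownTable(["test1", "test2"], [[385, "apple"], [627, "banana"]])
print(table._repr_markdown_())
```

Keeping `to_markdown()` as the public method and having `_repr_markdown_()` delegate to it mirrors the split proposed in the comment: one method for explicit use, one for automatic display.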

I'd love DataFrame.to_markdown() too. I was surprised when it didn't work already.

ghost on 27 Nov 2015

👍1

Would pandas be open to adding a dependency? tabulate does exactly this and is pip installable.

from tabulate import tabulate
print(tabulate(df, headers='keys', tablefmt='pipe'))

|    |   test1 | test2   |   test3 |   test4 |    test5 |
|---:|--------:|:--------|--------:|--------:|---------:|
|  0 |     385 | apple   |     288 |     745 |  64.9352 |
|  1 |     627 | banana  |       3 |     792 | 226.955  |
|  2 |     486 | pear    |     446 |     503 | 110.454  |
|  3 |     368 | orange  |     887 |     808 | 297.62   |
|  4 |     550 | grape   |     235 |      96 | 240.324  |
|  5 |     749 | peach   |      22 |     598 | 240.642  |

pstjohn on 11 Oct 2017

👍11

I'm +0 to adding it as an optional dependency, and adding a to_markdown method.

In the meantime, df.pipe(tabulate, headers='keys', tablefmt='pipe') is a workaround. I suppose it would be nice to avoid typing those keyword arguments every time :)

TomAugspurger on 11 Oct 2017
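One way to avoid retyping the keyword arguments in the df.pipe workaround is to bind them once with `functools.partial`. The sketch below uses stand-ins (`fake_tabulate`, `FakeFrame`) so that it runs without tabulate or pandas installed; both names are hypothetical, and they only mimic the call pattern of `tabulate(...)` and `DataFrame.pipe(...)`.

```python
from functools import partial


def fake_tabulate(table, headers=(), tablefmt="simple"):
    """Stand-in for tabulate.tabulate so the sketch runs on its own."""
    return f"table={table!r} headers={headers!r} tablefmt={tablefmt!r}"


class FakeFrame:
    """Stand-in for a DataFrame; only mimics the call pattern of .pipe()."""

    def __init__(self, data):
        self.data = data

    def pipe(self, func, *args, **kwargs):
        # Like DataFrame.pipe: call func with the frame as first argument.
        return func(self, *args, **kwargs)

    def __repr__(self):
        return f"FakeFrame({self.data!r})"


# Bind the keyword arguments once...
to_markdown = partial(fake_tabulate, headers="keys", tablefmt="pipe")

# ...so that every later call only needs the function name.
df = FakeFrame({"test1": [385, 627]})
result = df.pipe(to_markdown)
print(result)
```

With the real libraries, the same pattern would presumably be `to_markdown = partial(tabulate, headers="keys", tablefmt="pipe")` followed by `df.pipe(to_markdown)`.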

I think adding a to_markdown method which calls tabulate as a dependency would be ok

jreback on 12 Oct 2017

print(tabulate(df, headers='keys', tablefmt='pipe')) is not as pretty as pandas for a MultiIndex; it would be cool if whatever pandas lands on is.

aflaxman on 7 Nov 2017

FYI : Google found this discussion for me: https://stackoverflow.com/questions/33181846/programmatically-convert-pandas-dataframe-to-markdown-table

kangwonlee on 16 Dec 2018

I'm +0 to adding it as an optional dependency, and adding a to_markdown method. In the meantime, df.pipe(tabulate, headers='keys', tablefmt='pipe') is a workaround. I suppose it would be nice to avoid typing those keyword arguments every time :)

@TomAugspurger Is this all the method would have to do? If so, I'd be happy to work on a PR

MarcoGorelli on 12 Dec 2019

I think we would likely accept a PR for something like the above

jreback on 12 Dec 2019

Sure, I've submitted a simple PR.

What should this method return if there's a wide DataFrame? (or is this an enhancement that would get taken care of at a later stage?)

MarcoGorelli on 19 Dec 2019
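The thread leaves the wide-DataFrame question open. One conceivable answer, sketched below purely as an illustration, is to keep the outermost columns and mark the omitted ones with a `...` column before rendering. `truncate_columns` and its `max_cols` parameter are hypothetical and do not describe what pandas does.

```python
def truncate_columns(headers, rows, max_cols=4):
    """Keep the outer columns and mark the omitted middle with '...'."""
    if len(headers) <= max_cols:
        return list(headers), [list(r) for r in rows]

    # Reserve one slot for the '...' marker, then split the remaining
    # slots between the left and right edges of the table.
    keep = max_cols - 1
    left = (keep + 1) // 2
    right = keep - left

    def cut(cells):
        cells = list(cells)
        tail = cells[len(cells) - right:] if right else []
        return cells[:left] + ["..."] + tail

    return cut(headers), [cut(r) for r in rows]


headers = ["a", "b", "c", "d", "e", "f"]
rows = [[1, 2, 3, 4, 5, 6]]
short_headers, short_rows = truncate_columns(headers, rows, max_cols=4)
print(short_headers)
print(short_rows)
```

The alternative is to emit every column and leave wrapping or scrolling to whatever renders the Markdown, which keeps the output lossless at the cost of very long lines.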

🎉2
