I am writing a python script that should take a excel spreadsheet and convert it to a markdown table. Currently my code looks like this:
import pandas as pd
import numpy as np
df = pd.read_excel('TestMap1.xlsx')
md_table = df.to_markdown(index=False)
print(md_table)
Basically this works, but I have the problem, that if there are multiple lines in a excel cell, the .to_markdown() method puts them into seperate cells in the markdown syntax.
Example:
This is the excel table
and the .to_markdown() method converts it into this:
As you can see, the "Some Info" column of the "Couch" object should have 3 seperate lines in them, but the markdown table has every line in a seperate cell.
I want it to look like this:
That means, that instead of generating a new table row for each line in that cell, the .to_markdown() method has to put <br>
tags between the lines.
The script should work with excel spreadsheets of various sizes.
Is there a way to do that with pandas? If no, is there a different library that supports such operations?
Data as dict: {'Object': {0: 'Lamp', 1: 'Couch', 2: 'Shoes'}, 'Price': {0: 20, 1: 200, 2: 50}, 'Some Info': {0: 'Pretty Bright', 1: '2 meters long\nblue\ncomfy', 2: 'Size 9.5'}}
CodePudding user response:
You can replace all new line characters by 'br'
tag.
Applying string method
to all non numeric values.
df = df.apply(lambda x: x.str.replace('\n', '<br>') if x.dtype == 'object' else x)
which gives us :
Object Price Some Info
0 Lamp 20 Pretty Bright
1 Couch 200 2 meters <br>long<br>blue<br>comfy
2 Shoes 50 Size 9.5
Output :
print(df.to_markdown())
For the data you shared.
This gives us the expected output :
Object | Price | Some Info | |
---|---|---|---|
0 | Lamp | 20 | Pretty Bright |
1 | Couch | 200 | 2 meters long blue comfy |
2 | Shoes | 50 | Size 9.5 |