Is there a way to print a Pandas Dataframe like how Pyspark displays Dataframes?

Time:09-30

I would like to print my pandas DataFrame in the same style as a PySpark table, without converting it to a PySpark DataFrame. Like so:

> print(df.to_string(style='pyspark'))

| Id|groupId|matchId|assists|
+---+-------+-------+-------+
|  0|     24|      0|      0|
|  1| 440875|      1|      1|
|  2| 878242|      2|      0|

instead of:

> print(df.to_string())

   Id  groupId  matchId  assists
0   0       24        0        0
1   1   440875        1        1
2   2   878242        2        0

Does anybody have a small script that reformats this?

CodePudding user response:

DataFrame.to_markdown provides several tablefmt options via tabulate (which must be installed):

import pandas as pd

df = pd.DataFrame({
    'Id': [0, 1, 2], 
    'groupId': [24, 440875, 878242],
    'matchId': [0, 1, 2],
    'assists': [0, 1, 0]
})

Some similar options include:

print(df.to_markdown(tablefmt="orgtbl", index=False))

|   Id |   groupId |   matchId |   assists |
|------+-----------+-----------+-----------|
|    0 |        24 |         0 |         0 |
|    1 |    440875 |         1 |         1 |
|    2 |    878242 |         2 |         0 |

print(df.to_markdown(tablefmt='pretty', index=False))

+----+---------+---------+---------+
| Id | groupId | matchId | assists |
+----+---------+---------+---------+
| 0  |   24    |    0    |    0    |
| 1  | 440875  |    1    |    1    |
| 2  | 878242  |    2    |    0    |
+----+---------+---------+---------+

print(df.to_markdown(tablefmt='psql', index=False))

+------+-----------+-----------+-----------+
|   Id |   groupId |   matchId |   assists |
|------+-----------+-----------+-----------|
|    0 |        24 |         0 |         0 |
|    1 |    440875 |         1 |         1 |
|    2 |    878242 |         2 |         0 |
+------+-----------+-----------+-----------+
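
The formats above all pad each cell with spaces, so none of them matches the PySpark layout character for character. If that matters, a small hand-written formatter may be enough. The following is only a sketch: to_pyspark_string is a hypothetical helper name (it is not part of pandas or tabulate), and the minimum column width of 3 is an assumption chosen to match the Id column in the question's example. It right-aligns every cell and puts a rule above the header, below the header, and at the bottom.

```python
import pandas as pd


def to_pyspark_string(df: pd.DataFrame, min_width: int = 3) -> str:
    """Render df in a layout resembling the table in the question.

    Hypothetical helper; min_width=3 is an assumption taken from the
    question's example, where the 'Id' column is three characters wide.
    """
    headers = [str(c) for c in df.columns]
    # Convert every cell to text first so column widths can be measured.
    rows = [[str(v) for v in row]
            for row in df.itertuples(index=False, name=None)]
    # A column is as wide as its longest header or cell, but at least min_width.
    widths = [
        max([min_width, len(h)] + [len(r[i]) for r in rows])
        for i, h in enumerate(headers)
    ]
    rule = "+" + "+".join("-" * w for w in widths) + "+"

    def fmt(cells):
        # Right-align each cell with no extra padding around the separators.
        return "|" + "|".join(c.rjust(w) for c, w in zip(cells, widths)) + "|"

    return "\n".join([rule, fmt(headers), rule, *map(fmt, rows), rule])


df = pd.DataFrame({
    'Id': [0, 1, 2],
    'groupId': [24, 440875, 878242],
    'matchId': [0, 1, 2],
    'assists': [0, 1, 0]
})
print(to_pyspark_string(df))
```

The index is dropped on purpose, since the layout requested in the question does not show it.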

CodePudding user response:

You can do this with tabulate:

from tabulate import tabulate
import pandas as pd

df = pd.DataFrame({'id' : [1, 2 , 3],
                   'col' : ['a', 'b', 'c']})
print(tabulate(df, headers='keys', tablefmt='psql'))

+----+-----------+-------------+
|    |   id      | col         |
|----+-----------+-------------|
|  0 |    1      | a           |
|  1 |    2      | b           |
|  2 |    3      | c           |
+----+-----------+-------------+