Pandas: how to shorten/hide a list in a dataframe when using print

I'm using a dataframe that holds a lot of relevant data so that I have easy access to all of it. It looks like this:

df
     ion    m/n  qte   C1    C2   rCC  compte
0     H   1.00    1  0.1  0.25  0.50       2
1    H2   1.00    3  0.5  0.30  1.00       4
2  10B2   5.00    1  0.6  0.30  0.50       4
3  11B2   5.50    0  0.2  0.20  1.00       0
4   10B  10.00    0  0.2  0.20  1.00       0
5   11B  11.00    0  0.2  0.20  1.00       0
6   Si2  14.01    1  0.8  0.80  1.00       0
7   Fe2  26.90    1  0.8  0.35  0.65       3
8  Fe2*  27.90    1  0.5  0.50  1.00       7

Later in the code I add two columns which, for each row, contain a very long list. This makes any print(df) unreadable; it looks like this after adding the new columns:

df with fluxC1 and fluxC2 columns 
     ion  ...                                             fluxC2
0     H  ...  [0.0004506072467966082, 0.0004511067891697997,...
1    H2  ...  [9.65067757502627e-05, 9.663517466177602e-05, ...
2  10B2  ...  [9.65067757502627e-05, 9.663517466177602e-05, ...
3  11B2  ...  [0.0021039651287393384, 0.002105830883498985, ...
4   10B  ...  [0.0021039651287393384, 0.002105830883498985, ...
5   11B  ...  [0.0021039651287393384, 0.002105830883498985, ...
6   Si2  ...  [1.9595400763556396e-11, 1.966500053364854e-11...
7   Fe2  ...  [2.0668903644852728e-05, 2.070098966831758e-05...
8  Fe2*  ...  [2.030468908656194e-07, 2.0349733523508614e-07...

Is there any way to print my df so that each list is shown as [...] or something similar, so that the printed df looks like this?

df
         ion    m/n  qte   C1    C2   rCC  compte  fluxC1  fluxC2
    0     H   1.00    1  0.1  0.25  0.50       2   [...]   [...]
    1    H2   1.00    3  0.5  0.30  1.00       4   [...]   [...]
    2  10B2   5.00    1  0.6  0.30  0.50       4   [...]   [...]
    3  11B2   5.50    0  0.2  0.20  1.00       0   [...]   [...]
    4   10B  10.00    0  0.2  0.20  1.00       0   [...]   [...]
    5   11B  11.00    0  0.2  0.20  1.00       0   [...]   [...]
    6   Si2  14.01    1  0.8  0.80  1.00       0   [...]   [...]
    7   Fe2  26.90    1  0.8  0.35  0.65       3   [...]   [...]
    8  Fe2*  27.90    1  0.5  0.50  1.00       7   [...]   [...]

I haven't even found a way to print a list as [...] or with only a few of its elements, like printing a rounded list (without rounding anything in the df), so I'm a bit skeptical.

CodePudding user response：

This might help you. I use these options a lot when working with bigger dataframes:

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
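
The three options above control how many rows and columns are shown and how wide the output may be. The option that shortens long cell contents is display.max_colwidth. A minimal sketch, assuming the fluxC1/fluxC2 column names from the question with made-up values; the helper name preview and the width of 12 are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe (values are made up).
df = pd.DataFrame({
    "ion": ["H", "H2"],
    "qte": [1, 3],
    "fluxC1": [[0.0004506072467966082] * 50, [9.65067757502627e-05] * 50],
    "fluxC2": [[0.0021039651287393384] * 50, [1.9595400763556396e-11] * 50],
})

def preview(frame, colwidth=12):
    """Return the frame's repr with long cells truncated to `colwidth` characters."""
    # option_context restores the previous display settings on exit.
    with pd.option_context(
        "display.max_colwidth", colwidth,
        "display.max_columns", None,
        "display.width", 1000,
    ):
        return repr(frame)

print(preview(df))
```

The lists stay untouched in the dataframe; only the printed text is cut, and each truncated cell ends in "...".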

CodePudding user response：

You can use the style formatting offered by Pandas. There are two ways to go about this. One is to call style.format() on your dataframe (df) each time you want to display it.

df.style.format({'fluxC1': '[...]', 'fluxC2': '[...]'})

You can use whatever format you want for your columns. The syntax is a dict whose keys are column names and whose values are the formats you want, as above. You can also pass a callable if you need more complicated logic. Consult the Styler.format documentation for more information.
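
The callable form mentioned above could look like the sketch below. The function name short_list, the choice of keeping two elements, and the sample values are assumptions made for illustration. Styler needs jinja2 installed and renders as HTML (for example in a notebook), so the styling call is wrapped in a function here:

```python
import pandas as pd

def short_list(value, keep=2):
    """Show only the first `keep` elements of a long list (numeric elements assumed)."""
    if isinstance(value, list) and len(value) > keep:
        head = ", ".join(f"{v:.3g}" for v in value[:keep])
        return f"[{head}, ...]"
    return str(value)

# Made-up sample data using the question's column names.
df = pd.DataFrame({
    "ion": ["H", "H2"],
    "fluxC1": [[0.00045, 0.00046, 0.00047], [9.65e-05, 9.66e-05, 9.67e-05]],
    "fluxC2": [[0.0021, 0.0022, 0.0023], [1.96e-11, 1.97e-11, 1.98e-11]],
})

def styled(frame):
    # A callable formatter applied only to the two list columns.
    return frame.style.format(short_list, subset=["fluxC1", "fluxC2"])
```

In a notebook, evaluating styled(df) as the last expression of a cell displays the shortened lists, while df itself keeps the full values.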

Another approach is to set pd options for style formatting so that you no longer have to pass format each time. Here's one way to do that (with a sample dataframe for better understanding):

import string
import pandas as pd

# A style formatter that replaces only lists with more than 2 elements
def style_formatter(i):
    if isinstance(i, list) and len(i) > 2:
        return '[...]'
    return i

pd.set_option('styler.format.formatter', style_formatter)

# Sample dataframe
df = pd.DataFrame({'colA': [list(string.ascii_letters) for _ in range(6)], 'colB': list(range(6))})

Default df output:

    colA                                                colB
0   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   0
1   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   1
2   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   2
3   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   3
4   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   4
5   [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, ...   5

To get the styled output, display df.style instead of df. Example output after automatic styling based on the pd styling option:

    colA    colB
0   [...]   0
1   [...]   1
2   [...]   2
3   [...]   3
4   [...]   4
5   [...]   5

Important note: styling can take a while if you run it on a big dataframe. A better approach is to select the rows you need first and then style only those, e.g. df.head(20).style
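
Because a Styler is rendered as HTML, print(df.style) does not produce a text table. For plain print output, as asked in the question, one option is DataFrame.to_string with per-column formatters. A minimal sketch; the helper name hide_list and the sample values are assumptions, and only the column names come from the question:

```python
import pandas as pd

def hide_list(value):
    """Replace any list with a fixed placeholder; leave other values as text."""
    return "[...]" if isinstance(value, list) else str(value)

# Made-up sample data with very long lists in the two flux columns.
df = pd.DataFrame({
    "ion": ["H", "H2"],
    "qte": [1, 3],
    "fluxC1": [[0.1] * 1000, [0.2] * 1000],
    "fluxC2": [[0.3] * 1000, [0.4] * 1000],
})

# The formatters only affect the generated text, not the stored lists.
text = df.to_string(formatters={"fluxC1": hide_list, "fluxC2": hide_list})
print(text)
```

This keeps every column visible and leaves the data unchanged, at the cost of calling to_string explicitly instead of a bare print(df).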