How to remove index column when getting latex string from Pandas DataFrame

Time:06-29

Originally, you could simply pass index=False to the to_latex method of the Pandas DataFrame object. Now doing so triggers a warning about upcoming signature changes. Example:

>>> import pandas as pd

>>> import numpy as np

>>> data = {f'Column {i + 1}': np.random.randint(0, 10, size=(10, )) for i in range(5)}

>>> df = pd.DataFrame(data)

>>> df
   Column 1  Column 2  Column 3  Column 4  Column 5
0         1         8         3         3         5
1         8         6         7         7         3
2         2         1         6         1         1
3         9         7         9         5         5
4         5         4         7         8         9
5         9         5         3         6         2
6         6         9         9         6         8
7         8         7         2         6         5
8         4         9         4         6         2
9         2         6         5         3         0

>>> lat_og = df.to_latex(index=False)
<ipython-input-7-986346043a05>:1: FutureWarning: In future versions `DataFrame.to_latex` is expected to utilise the base implementation of `Styler.to_latex` for formatting and rendering. The arguments signature may therefore change. It is recommended instead to use `DataFrame.style.to_latex` which also contains additional functionality.
  lat_og = df.to_latex(index=False)

>>> print(lat_og)
\begin{tabular}{rrrrr}
\toprule
 Column 1 &  Column 2 &  Column 3 &  Column 4 &  Column 5 \\
\midrule
        1 &         8 &         3 &         3 &         5 \\
        8 &         6 &         7 &         7 &         3 \\
        2 &         1 &         6 &         1 &         1 \\
        9 &         7 &         9 &         5 &         5 \\
        5 &         4 &         7 &         8 &         9 \\
        9 &         5 &         3 &         6 &         2 \\
        6 &         9 &         9 &         6 &         8 \\
        8 &         7 &         2 &         6 &         5 \\
        4 &         9 &         4 &         6 &         2 \\
        2 &         6 &         5 &         3 &         0 \\
\bottomrule
\end{tabular}

This gives the desired output with no index column, but I don't want to keep relying on it if the signature is going to change, or to have to import warnings every time just to silence the message.

The warning message recommends using the style attribute instead. How can I use the style attribute to leave out the index column? I read the documentation of the to_latex method associated with the style attribute, but it has no simple index argument like the one above. Example:

>>> lat_new = df.style.to_latex(hrules=True)

>>> print(lat_new)
\begin{tabular}{lrrrrr}
\toprule
 & Column 1 & Column 2 & Column 3 & Column 4 & Column 5 \\
\midrule
0 & 1 & 8 & 3 & 3 & 5 \\
1 & 8 & 6 & 7 & 7 & 3 \\
2 & 2 & 1 & 6 & 1 & 1 \\
3 & 9 & 7 & 9 & 5 & 5 \\
4 & 5 & 4 & 7 & 8 & 9 \\
5 & 9 & 5 & 3 & 6 & 2 \\
6 & 6 & 9 & 9 & 6 & 8 \\
7 & 8 & 7 & 2 & 6 & 5 \\
8 & 4 & 9 & 4 & 6 & 2 \\
9 & 2 & 6 & 5 & 3 & 0 \\
\bottomrule
\end{tabular}

The index column is still present in the LaTeX output. What are some ways to remove the index column without using the original method?

Edit

If you come across this issue, know that the pandas developers are planning to deprecate the DataFrame.to_latex() method and are in the process of moving to the Styler.to_latex() method instead. As of now the two signatures are not the same, and the Styler route requires additional method calls for hiding the index column or escaping LaTeX syntax. See 41649 for more current updates on the process, and see 44411 for the start of the rabbit hole. They plan on having this fixed in pandas 2.0.

Answer:

You can use the hide() method of the Styler object returned by the style attribute of a Pandas DataFrame. The following code produces a LaTeX table without the index column:

import pandas as pd
import numpy as np

data = {f'Column {i + 1}': np.random.randint(0, 10, size=(10, )) for i in range(5)}
df = pd.DataFrame(data)
lat_new = df.style.hide(axis="index").to_latex(hrules=True)
print(lat_new)

The result looks like the following (the values are random, so yours will differ):

\begin{tabular}{rrrrr}
\toprule
Column 1 & Column 2 & Column 3 & Column 4 & Column 5 \\
\midrule
2 & 5 & 9 & 2 & 2 \\
2 & 6 & 0 & 9 & 5 \\
3 & 2 & 4 & 3 & 2 \\
8 & 9 & 3 & 7 & 8 \\
5 & 9 & 7 & 4 & 4 \\
0 & 3 & 2 & 2 & 6 \\
5 & 7 & 7 & 8 & 6 \\
2 & 2 & 9 & 3 & 3 \\
6 & 0 & 0 & 9 & 2 \\
4 & 8 & 7 & 5 & 9 \\
\bottomrule
\end{tabular}