Convert list to DataFrame with a specific column (has no name) as a string

Time:06-11

I have a list from which I build a DataFrame. Some values have trailing zeros, like 1.257000, so I need that column to be created as strings, because as numbers the trailing zeros disappear. How should I proceed?

My attempts to identify column 5 (where the values are):

b = [
     ['string 1', 'string 2', 'string 3', 'string 4', 1.257000, 'string 6'],
     ['string 1', 'string 2', 'string 3', 'string 4', 1.546440, 'string 6']
    ]
df = pd.DataFrame(b, dtype={5: str})
df = pd.DataFrame(b, dtype={'5': str})

Current results in Column 5 using only pd.DataFrame(b):

1.257
1.54644

Expected result in Column 5:

1.257000
1.546440

Additional comment after response generated by Zaero Divide:

In my case the number of decimal places varies: a value can be 1.230, 1.23000 or 1.2300000. So I can't fix this after creating the DataFrame by formatting every value to the same number of decimal places.

CodePudding user response:

The issue is that Python itself evaluates b to:

[
 ['string 1', 'string 2', 'string 3', 'string 4', 1.257, 'string 6'], 
 ['string 1', 'string 2', 'string 3', 'string 4', 1.54644, 'string 6']
]

Those zeros are gone as soon as Python parses the float literals, before pandas ever sees the list.
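
A quick check in plain Python shows this, no pandas involved. The variable names here are only for illustration:

```python
# Both literals produce the same float; the trailing zeros are not stored anywhere.
x = 1.257000
y = 1.257

same_value = (x == y)   # the two literals compare equal
as_text = repr(x)       # the shortest text that round-trips the float

print(same_value, as_text)
```

So by the time the list reaches pd.DataFrame, there is nothing left to tell 1.257000 apart from 1.257.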


If you instead had a string/file that looked like:

string 1,string 2,string 3,string 4,1.257000,string 6
string 1,string 2,string 3,string 4,1.546440,string 6

Then it could be read like you want:

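from io import StringIO
import pandas as pd

file = """string 1,string 2,string 3,string 4,1.257000,string 6
string 1,string 2,string 3,string 4,1.546440,string 6"""

# dtype=str reads every column as text, so the zeros are preserved
pd.read_csv(StringIO(file), dtype=str, header=None)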

Output:

          0         1         2         3         4         5
0  string 1  string 2  string 3  string 4  1.257000  string 6
1  string 1  string 2  string 3  string 4  1.546440  string 6
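
If only the numeric column should be kept as text, passing a dict to dtype should work too. This is a minimal sketch, assuming the data arrives as text as above; with header=None the columns are labelled 0 to 5, so the fifth column is label 4. The names data, df_one and col_values are made up for the example:

```python
from io import StringIO
import pandas as pd

data = """string 1,string 2,string 3,string 4,1.257000,string 6
string 1,string 2,string 3,string 4,1.546440,string 6"""

# Only column label 4 is forced to str; the other columns are inferred as usual.
df_one = pd.read_csv(StringIO(data), header=None, dtype={4: str})

col_values = df_one[4].tolist()
print(col_values)
```

Note that this dict form is for read_csv; the dtype argument of the pd.DataFrame constructor takes a single dtype, which is why the attempts in the question fail.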

CodePudding user response:

If you want to show trailing zeros, you can use a format string with a fixed number of decimal places, for instance:

>>> df[4].transform(lambda x: f"{x:0.6f}")
0    1.257000
1    1.546440
Name: 4, dtype: object

If you want the same formatting applied to every float that pandas displays:

>>> pd.options.display.float_format = '{:,.8f}'.format
>>> df[4]
0   1.25700000
1   1.54644000
Name: 4, dtype: float64

Edit: As long as the values are entered as numbers, the trailing zeros are removed automatically. They have to be entered as strings. No conversion from float to str can bring back the original number of zeros, because as far as Python is concerned they were never there.
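
To make that concrete, here is a small sketch that assumes you can quote the values wherever the list is built (that part is an assumption, the question does not say where b comes from). The names rows, df_str and values are illustrative only:

```python
import pandas as pd

# The fifth item is written as a string, so each value keeps its own
# number of trailing zeros, even when the lengths differ.
rows = [
    ['string 1', 'string 2', 'string 3', 'string 4', '1.257000', 'string 6'],
    ['string 1', 'string 2', 'string 3', 'string 4', '1.23000', 'string 6'],
]

df_str = pd.DataFrame(rows)
values = df_str[4].tolist()
print(values)
```

If you need to do arithmetic later, the column can still be converted on demand with pd.to_numeric(df_str[4]), while the original text stays in the DataFrame.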
