Home > front end >  Convert list values in pandas dataframe to values
Convert list values in pandas dataframe to values

Time:05-05

I have a dataframe df with a mix of data types (str, list values, floats):

            Property Name 2234.1.1 2234.1.2.1 2234.1.3.1
0        Flash Point (°C)     [-5]        NaN        NaN
1        Flash Point (°C)    [Max]        NaN        NaN
2        Flash Point (°C)    [nan]        NaN        NaN
3        Flash Point (°C)      NaN        NaN        NaN
4        Flash Point (°C)      NaN        NaN        NaN
5        Flash Point (°C)      NaN        NaN        NaN
6        Flash Point (°C)      NaN        NaN       [29]
7        Flash Point (°C)      NaN        NaN      [nan]
8        Flash Point (°C)      NaN        NaN      [nan]
9        Flash Point (°C)      NaN      [159]        NaN
10       Flash Point (°C)      NaN      [nan]        NaN

I want to convert all the list values to regular values.

So desired df:

            Property Name 2234.1.1 2234.1.2.1 2234.1.3.1
0        Flash Point (°C)       -5        NaN        NaN
1        Flash Point (°C)      Max        NaN        NaN
2        Flash Point (°C)      NaN        NaN        NaN
3        Flash Point (°C)      NaN        NaN        NaN
4        Flash Point (°C)      NaN        NaN        NaN
5        Flash Point (°C)      NaN        NaN        NaN
6        Flash Point (°C)      NaN        NaN        29
7        Flash Point (°C)      NaN        NaN        NaN
8        Flash Point (°C)      NaN        NaN        NaN
9        Flash Point (°C)      NaN        159        NaN
10       Flash Point (°C)      NaN        NaN        NaN

I've tried:

for i in df.columns[1:]:
    df[i]=[''.join(map(str,l)) for l in df[i]]

But I get the following error:

TypeError: 'float' object is not iterable

I assume I shouldn't try looping over the df like that anyway.

Edit: adding code to reproduce the df:

{'Property Name': {0: 'Flash Point (°C)',
  1: 'Flash Point (°C)',
  2: 'Flash Point (°C)',
  3: 'Flash Point (°C)',
  4: 'Flash Point (°C)',
  5: 'Flash Point (°C)',
  6: 'Flash Point (°C)',
  7: 'Flash Point (°C)',
  8: 'Flash Point (°C)',
  9: 'Flash Point (°C)',
  10: 'Flash Point (°C)'},
 '2234.1.1': {0: ['-5'],
  1: ['Max'],
  2: [nan],
  3: nan,
  4: nan,
  5: nan,
  6: nan,
  7: nan,
  8: nan,
  9: nan,
  10: nan},
 '2234.1.2.1': {0: nan,
  1: nan,
  2: nan,
  3: nan,
  4: nan,
  5: nan,
  6: nan,
  7: nan,
  8: nan,
  9: ['159'],
  10: [nan]},
 '2234.1.3.1': {0: nan,
  1: nan,
  2: nan,
  3: nan,
  4: nan,
  5: nan,
  6: ['29'],
  7: [nan],
  8: [nan],
  9: nan,
  10: nan}}

CodePudding user response:

The method that deals with it is explode. But just calling explode won't works since columns don't have matching element counts. One way to circumvent that problem is to stack the DataFrame, explode and unstack back:

out = df.stack().explode().unstack()

If the lists are singleton lists (as in your example), then another option is to call explode in loop:

for c in df.columns:
    df = df.explode(c)

Another option could be to use str accessor and assign in a dict comprehension:

out = df.assign(**{f'{c}':df[c].str[0] for c in df.columns.drop('Property Name')})

Output:

       Property Name 2234.1.1 2234.1.2.1 2234.1.3.1
0   Flash Point (°C)       -5        NaN        NaN
1   Flash Point (°C)      Max        NaN        NaN
2   Flash Point (°C)      NaN        NaN        NaN
3   Flash Point (°C)      NaN        NaN        NaN
4   Flash Point (°C)      NaN        NaN        NaN
5   Flash Point (°C)      NaN        NaN        NaN
6   Flash Point (°C)      NaN        NaN         29
7   Flash Point (°C)      NaN        NaN        NaN
8   Flash Point (°C)      NaN        NaN        NaN
9   Flash Point (°C)      NaN        159        NaN
10  Flash Point (°C)      NaN        NaN        NaN
  • Related