Add last value from list in column A into list in column B

I have the following data frame:

    df_test = pd.DataFrame({"f":['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
                            "d":['x', 'x', 'y', 'y', 'x', 'x', 'y', 'y'],
                            "low": [0,5,2,4,5,10,4,8],
                            "up": [5,10,4,6,10,15,8,12],
                            "z": [1,3,6,2,3,7,5,10]})

and what I first have to do is convert the columns 'low', 'up' and 'z' to lists for each group of 'f' and 'd'. So this is what I did:

    dff = df_test.groupby(['f','d'])[['low', 'up', 'z']].agg(list).reset_index()

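and this is what I get:

       f  d      low        up        z
    0  a  x   [0, 5]   [5, 10]   [1, 3]
    1  a  y   [2, 4]    [4, 6]   [6, 2]
    2  b  x  [5, 10]  [10, 15]   [3, 7]
    3  b  y   [4, 8]   [8, 12]  [5, 10]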

Now I want to extract the last value from the lists in column 'up' and add it to the lists in column 'low'. But this is unfortunately not working:

    dff['last'] = (dff['up'].apply(lambda x: x[-1])).tolist()
    dff['new'] = dff['low'].append(dff['last'])

I get an error message "ValueError: cannot reindex from a duplicate axis". The column 'new' should have these values: [0,5,10], [2,4,6], [5,10,15], [4,8,12]

Any help is very much appreciated!

CodePudding user response：

Try:

dff["new"] = dff.apply(lambda x: [*x["low"], x["up"].pop()], axis=1)
print(dff)

Prints:

   f  d      low    up        z          new
0  a  x   [0, 5]   [5]   [1, 3]   [0, 5, 10]
1  a  y   [2, 4]   [4]   [6, 2]    [2, 4, 6]
2  b  x  [5, 10]  [10]   [3, 7]  [5, 10, 15]
3  b  y   [4, 8]   [8]  [5, 10]   [4, 8, 12]

If you want to keep the last element in the 'up' column (`pop()` removes it), index it instead:

dff["new"] = dff.apply(lambda x: [*x["low"], x["up"][-1]], axis=1)
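For anyone who wants to run the non-mutating variant end to end, here is a minimal self-contained sketch. It rebuilds `dff` from the question's data; the helper names `new_values` and `up_values` are only there for inspection and are not part of the original answer.

```python
import pandas as pd

df_test = pd.DataFrame({"f": ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
                        "d": ['x', 'x', 'y', 'y', 'x', 'x', 'y', 'y'],
                        "low": [0, 5, 2, 4, 5, 10, 4, 8],
                        "up": [5, 10, 4, 6, 10, 15, 8, 12],
                        "z": [1, 3, 6, 2, 3, 7, 5, 10]})
dff = df_test.groupby(['f', 'd'])[['low', 'up', 'z']].agg(list).reset_index()

# Build a new list per row: all items of 'low' plus the last item of 'up'.
# Indexing with [-1] only reads the value, so 'up' is left unchanged.
dff["new"] = dff.apply(lambda row: [*row["low"], row["up"][-1]], axis=1)

new_values = dff["new"].tolist()  # hypothetical helper names, for inspection only
up_values = dff["up"].tolist()
print(new_values)
print(up_values)
```

Swapping `row["up"][-1]` for `row["up"].pop()` gives the same 'new' column but shortens every list in 'up' as a side effect.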

CodePudding user response：

Take advantage of the mutability of lists: a pure Python loop modifies the lists in place and should be more efficient than apply.

To copy the element:

for l, u in zip(dff['low'], dff['up']):
    l.append(u[-1])

Output:

   f  d          low        up        z
0  a  x   [0, 5, 10]   [5, 10]   [1, 3]
1  a  y    [2, 4, 6]    [4, 6]   [6, 2]
2  b  x  [5, 10, 15]  [10, 15]   [3, 7]
3  b  y   [4, 8, 12]   [8, 12]  [5, 10]

To move the element:

for l, u in zip(dff['low'], dff['up']):
    l.append(u.pop(-1))

Output:

   f  d          low    up        z
0  a  x   [0, 5, 10]   [5]   [1, 3]
1  a  y    [2, 4, 6]   [4]   [6, 2]
2  b  x  [5, 10, 15]  [10]   [3, 7]
3  b  y   [4, 8, 12]   [8]  [5, 10]
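One thing to keep in mind with the in-place loops: they change the list objects stored in the DataFrame, so running the same loop twice appends twice. A small sketch on a hypothetical two-row frame (not the question's full data) illustrates this:

```python
import pandas as pd

# Hypothetical two-row frame that mimics the shape of dff.
dff = pd.DataFrame({"low": [[0, 5], [2, 4]], "up": [[5, 10], [4, 6]]})

def copy_last(frame):
    # Iterating the columns yields the very list objects held in the frame,
    # so append() modifies the DataFrame content in place.
    for l, u in zip(frame["low"], frame["up"]):
        l.append(u[-1])

copy_last(dff)
after_first = [list(x) for x in dff["low"]]   # snapshot copies of the lists

copy_last(dff)
after_second = [list(x) for x in dff["low"]]

print(after_first)
print(after_second)
```

If the original lists are still needed, building a new column instead of mutating is the safer choice.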

For a new column use slicing:

dff['new'] = dff['low'] + dff['up'].str[-1:]

Or a list comprehension (should be slower):

dff['new'] = [l + [u[-1]] for l, u in zip(dff['low'], dff['up'])]

Output:

   f  d      low        up        z          new
0  a  x   [0, 5]   [5, 10]   [1, 3]   [0, 5, 10]
1  a  y   [2, 4]    [4, 6]   [6, 2]    [2, 4, 6]
2  b  x  [5, 10]  [10, 15]   [3, 7]  [5, 10, 15]
3  b  y   [4, 8]   [8, 12]  [5, 10]   [4, 8, 12]
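Both new-column variants rely on list concatenation: `.str[-1:]` slices each list down to a one-element list, and `+` on two object columns concatenates the lists row by row. A minimal sketch on a hypothetical two-row frame, checking that the two approaches agree:

```python
import pandas as pd

# Hypothetical two-row frame that mimics the shape of dff.
dff = pd.DataFrame({"low": [[0, 5], [2, 4]], "up": [[5, 10], [4, 6]]})

# .str[-1:] slices every list, giving [10] and [6]; + then joins list to list.
via_slicing = (dff["low"] + dff["up"].str[-1:]).tolist()

# Same result with an explicit list comprehension.
via_comprehension = [l + [u[-1]] for l, u in zip(dff["low"], dff["up"])]

print(via_slicing)
print(via_comprehension)
```

Neither variant touches the original lists, since `l + [...]` always creates a new list.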

CodePudding user response：

Another possible solution:

dff['new'] = dff['low'] + pd.Series([[x[-1]] for x in dff['up']])

Output:

   f  d      low        up        z          new
0  a  x   [0, 5]   [5, 10]   [1, 3]   [0, 5, 10]
1  a  y   [2, 4]    [4, 6]   [6, 2]    [2, 4, 6]
2  b  x  [5, 10]  [10, 15]   [3, 7]  [5, 10, 15]
3  b  y   [4, 8]   [8, 12]  [5, 10]   [4, 8, 12]
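As for the error in the question: `Series.append` stacked the rows of one Series under another; it did not extend the lists inside each cell. The result has twice as many rows and repeated index labels, which is most likely why assigning it back to a column fails. Since `Series.append` was removed in pandas 2.0, the sketch below uses `pd.concat` as a stand-in to show the same effect on hypothetical two-row data:

```python
import pandas as pd

low = pd.Series([[0, 5], [2, 4]])   # hypothetical stand-in for dff['low']
last = pd.Series([10, 6])           # hypothetical stand-in for dff['last']

# Stacks 'last' underneath 'low'; the lists themselves are not extended.
stacked = pd.concat([low, last])

stacked_index = stacked.index.tolist()
has_duplicates = stacked.index.has_duplicates
print(stacked_index, has_duplicates)
```

The duplicated labels mean pandas cannot align the stacked Series with the DataFrame's index on assignment.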