How can I insert values from a list into a pandas DataFrame column?

I have this DataFrame:

index	x	y
0	0	3
1	0.07	4
2	0.1	6
3	0.13	5

I want to insert new x values into the x column:

new_x = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]

so that the DataFrame becomes:

index	x	y
0	0	3
1	0.03	NaN
2	0.07	4
3	0.1	6
4	0.13	5
5	0.17	NaN
6	0.2	NaN

So, basically, for every new_x value that doesn't exist in column x, the y value should be NaN.

Is it possible to do this in pandas? Thank you.

CodePudding user response：

You can use NumPy's `searchsorted`.

First create a new_y array of the same length as new_x, filled with NaN. Then use searchsorted to find the positions in new_y where the old y values belong and assign them there. This assumes new_x is sorted and that every value in df.x also appears in new_x.

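import numpy as np
import pandas as pd

# Start with all-NaN y values, then fill the positions where the old x values sit in new_x
new_y = np.full(len(new_x), np.nan, np.float64)
new_y[np.searchsorted(new_x, df.x)] = df.y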

pd.DataFrame({'x': new_x, 'y': new_y})

      x    y
0  0.00  3.0
1  0.03  NaN
2  0.07  4.0
3  0.10  6.0
4  0.13  5.0
5  0.17  NaN
6  0.20  NaN
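
To make the indexing step easier to follow, here is a minimal sketch of what `searchsorted` returns for the data in the question. The names `old_x`, `old_y`, and `positions` are mine, and the sanity check is an addition that is not part of the answer above; it only guards the assumptions that new_x is sorted and contains every old x value.

```python
import numpy as np

new_x = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]  # must be sorted ascending
old_x = [0, 0.07, 0.1, 0.13]                   # the existing x column
old_y = [3, 4, 6, 5]                           # the existing y column

# For each old x value, find the index at which it sits in new_x
positions = np.searchsorted(new_x, old_x)
print(positions.tolist())  # [0, 2, 3, 4]

# Added sanity check: every old x value must actually be present in new_x,
# otherwise the y values would be written to the wrong slots
if not np.array_equal(np.asarray(new_x)[positions], old_x):
    raise ValueError("some old x values are missing from new_x")

# All-NaN array, then drop the old y values into their positions
new_y = np.full(len(new_x), np.nan)
new_y[positions] = old_y
print(new_y.tolist())
```

If an old x value is larger than everything in new_x, `searchsorted` returns `len(new_x)` and the indexing raises an IndexError, so the approach only fits when new_x is a superset of the old values.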

CodePudding user response：

This is a straightforward application of the pandas merge function, more specifically a left join.


import pandas as pd

x1 = [0, 0.07, 0.1, 0.13]
y1 = [3, 4, 6, 5]
df1 = pd.DataFrame({"x": x1, "y": y1})

print(df1)

x2 = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]
df2 = pd.DataFrame({"x": x2})

print(df2)

df3 = df2.merge(df1, how="left", on="x")
print(df3)


      x  y
0  0.00  3
1  0.07  4
2  0.10  6
3  0.13  5

      x
0  0.00
1  0.03
2  0.07
3  0.10
4  0.13
5  0.17
6  0.20

      x    y
0  0.00  3.0
1  0.03  NaN
2  0.07  4.0
3  0.10  6.0
4  0.13  5.0
5  0.17  NaN
6  0.20  NaN
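
If you also want to see which rows came only from the new list, `merge` accepts `indicator=True`, which adds a `_merge` column. This is an optional extra on top of the answer above, sketched with the same data; `added_x` is just an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [0, 0.07, 0.1, 0.13], "y": [3, 4, 6, 5]})
df2 = pd.DataFrame({"x": [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]})

# indicator=True adds a "_merge" column: "both" for x values found in df1,
# "left_only" for x values that exist only in the new list
df3 = df2.merge(df1, how="left", on="x", indicator=True)
print(df3)

# The x values that were newly inserted (their y is NaN)
added_x = df3.loc[df3["_merge"] == "left_only", "x"].tolist()
print(added_x)
```

Keep in mind that the merge matches float keys by exact equality, so this works here because both lists contain the same literals.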

CodePudding user response：

You could try the join method. Here is some sample code you could refer to:

import pandas as pd

d1 = {'x': [0, 0.07, 0.1, 0.13], 'y': [3,4,6,5]}
d2 = {'x': [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]}
df1 = pd.DataFrame(data=d1)
df2 = pd.DataFrame(data=d2)
# Both frames are indexed by x, so join aligns them index to index
df2.set_index('x').join(df1.set_index('x'), how='left').reset_index()
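
Since the join above aligns the two frames on an x index, the same result can probably be had without building a second DataFrame, by using `reindex` instead. This swaps in a different technique from the one in the answer; it is a sketch that assumes the x values in the original frame are unique.

```python
import pandas as pd

df = pd.DataFrame({"x": [0, 0.07, 0.1, 0.13], "y": [3, 4, 6, 5]})
new_x = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]

# Index by x, conform to the new x values (labels not found get NaN),
# then turn x back into a regular column
result = (
    df.set_index("x")
      .reindex(pd.Index(new_x, name="x"))
      .reset_index()
)
print(result)
```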

CodePudding user response：

Try this: compare the values already present in the column with the new ones, build a DataFrame from the values that are missing, and concatenate it to the original DataFrame. Note that the new rows are appended at the end, so the result is not sorted by x.

import pandas as pd
import numpy as np

df = pd.DataFrame({'x':[0,0.07,0.1,0.13], 'y':[3,4,6,5]})
new_list = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]
def diff(new_list, col_list):
    # Values that appear in the longer list but not in the shorter one
    if len(new_list) > len(col_list):
        diff_list = list(set(new_list) - set(col_list))
    else:
        diff_list = list(set(col_list) - set(new_list))
    return diff_list
new_df = pd.DataFrame({'x':diff(new_list,df['x'].to_list()),'y':np.nan})
fin_df = pd.concat([df,new_df]).reset_index(drop=True)
fin_df

      x    y
0  0.00  3.0
1  0.07  4.0
2  0.10  6.0
3  0.13  5.0
4  0.03  NaN
5  0.20  NaN
6  0.17  NaN
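
To get the row order asked for in the question, the concatenated frame can be sorted by x afterwards. The following is a small variation on the code above, not the original answer: it computes the missing values with a plain set difference (the name `missing` is mine) and adds a `sort_values` step.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [0, 0.07, 0.1, 0.13], "y": [3, 4, 6, 5]})
new_list = [0, 0.03, 0.07, 0.1, 0.13, 0.17, 0.2]

# x values from the new list that are not in the DataFrame yet
missing = sorted(set(new_list) - set(df["x"]))

# Rows for the missing x values, with y set to NaN
new_df = pd.DataFrame({"x": missing, "y": np.nan})

# Append the new rows, then sort by x and renumber the index
fin_df = pd.concat([df, new_df]).sort_values("x").reset_index(drop=True)
print(fin_df)
```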
