I'm confused about how to append one column to another in pandas.
Here is what I'm trying to do:
from pandas import DataFrame, concat
df1 = DataFrame({'a':[1,2], 'b':[3,4]})
concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
This does return what I want: a Series with my 4 values. What I don't understand is why I can't assign it to column 'a'.
>>> from pandas import DataFrame, concat
>>> df1 = DataFrame({'a':[1,2], 'b':[3,4]})
>>> concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
0 1
1 2
2 3
3 4
dtype: int64
>>> df1['a'] = concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
>>> df1
a b
0 1 3
1 2 4
By the way, is there any way to make it more readable? I'm confused about how it should work... Note that I don't need column 'b' afterward.
Thanks for your help :)
Sam
CodePudding user response:
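A pandas Series doesn't have columns, so rename({'b':'a'}) on df1['b'] does not rename anything.
If you want to select a column as a DataFrame, use df[['a']] instead of df['a'].
And to change a column's name, rename needs the axis or columns argument: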
pd.concat([df1[['a']], df1[['b']].rename(columns={'b':'a'})]).reset_index(drop=True)
Output:
a
0 1
1 2
2 3
3 4
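To make the Series/DataFrame distinction concrete, here is a minimal sketch that reuses df1 from the question. The variable names (as_series, as_frame, and so on) are only illustrative and do not come from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Single brackets return a Series; double brackets return a one-column DataFrame.
as_series = df1['b']
as_frame = df1[['b']]

# On a Series, a dict passed to rename() maps index labels. The labels here
# are 0 and 1, so {'b': 'a'} matches nothing and the name stays 'b'.
renamed_series = as_series.rename({'b': 'a'})

# On a DataFrame, rename(columns=...) renames the column itself.
renamed_frame = as_frame.rename(columns={'b': 'a'})

print(type(as_series).__name__, type(as_frame).__name__)
print(renamed_series.name, list(renamed_frame.columns))
```

The first print shows the two types, and the second shows that only the DataFrame version ended up with a column called 'a'.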
That is what reproducing your output with your approach looks like, but I wouldn't use the code above.
I would use the following code instead:
pd.concat([df1['a'], df1['b']]).to_frame('a').reset_index(drop=True)
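Since column 'b' is not needed afterward, one option is to rebind df1 to the new four-row frame rather than assigning into the old two-row one. This is only a sketch of that idea; the name stacked is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Stack 'a' on top of 'b', name the resulting column 'a',
# and renumber the rows 0..3.
stacked = pd.concat([df1['a'], df1['b']]).to_frame('a').reset_index(drop=True)

# Rebind df1 to the new frame; column 'b' is no longer present.
df1 = stacked
print(df1)
```

Passing ignore_index=True to pd.concat should give the same row numbering without the separate reset_index call, if you prefer that spelling.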
CodePudding user response:
When you assign, you are not creating new rows/indices (except with a single value, which is not the case here).
pd.concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
Gives you:
0 1
1 2
2 3
3 4
dtype: int64
Pandas aligns the indices before assignment, so only the values whose index labels match the existing df index (here 0 and 1) are used; the rest are discarded.
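The alignment can be seen with a small sketch. The Series longer and shifted are made-up examples, chosen so that the dropped and missing values are easy to spot.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# A four-row Series with index labels 0..3.
longer = pd.Series([10, 20, 30, 40])

# Assignment aligns on the index: df1 only has labels 0 and 1,
# so the values at labels 2 and 3 are dropped.
df1['a'] = longer

# A Series whose labels (2 and 3) match nothing in df1's index
# leaves the column filled with NaN.
shifted = pd.Series([7, 8], index=[2, 3])
df1['b'] = shifted

print(df1)
```

In both cases df1 keeps its original two rows, which is why the assignment in the question appeared to do nothing.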