This is my DataFrame, and I want to create a new column using a loop with conditions.
import pandas as pd
student_card = pd.DataFrame({'ID':[20190103, 20190222, 20190531],
                             'name':['Kim', 'Yang', 'Park'],
                             'class':['H', 'W', 'S']})
student_card['new'] = pd.Series() #1.create new column
for i, v in student_card['name'].items(): #2.set index and values
    if "Yang" in v: #3.if there's "Yang" in value
        student_card['new'].append(v) #4. append the value of name column in new column
So I tried this method and got stuck with the following error:
TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid
That does not seem right to me, by the way: the type of this column is Series.
CodePudding user response:
What append does is concatenate one Series onto another, which is not what happens in your code: v is a string, and i is the index label of that string. You can run print(type(v)) and see for yourself. As for the documentation, you can find it here:
https://pandas.pydata.org/docs/reference/api/pandas.Series.append.html
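To make the type mismatch concrete, here is a minimal sketch. The `names` Series is an illustrative stand-in for `student_card['name']`, and the extra entry `'Lee'` is made up. It uses `pd.concat`, which accepts a list of Series, rather than `Series.append`:

```python
import pandas as pd

# Illustrative stand-in for student_card['name']
names = pd.Series(['Kim', 'Yang', 'Park'])

# .items() yields (index label, value) pairs; each value here is a plain str
value_types = [type(v) for _, v in names.items()]

# Concatenation needs Series (or DataFrame) objects on every side,
# so a single string has to be wrapped in a Series first
extra = pd.Series(['Lee'], index=[3])  # 'Lee' is a made-up entry
combined = pd.concat([names, extra])

print(value_types)
print(combined.tolist())
```

Passing the bare string `v` where a Series is expected is what triggers the TypeError in the question.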
What you are looking for is to set a value at a pre-existing index of a column (or Series, as it's called in pandas). Something like this:
df.loc[index] = value
So in your code, this should do the trick:
import pandas as pd
student_card = pd.DataFrame({'ID':[20190103, 20190222, 20190531],
                             'name':['Kim', 'Yang', 'Park'],
                             'class':['H', 'W', 'S']})
student_card['new'] = pd.Series(dtype='object') #1.create new column (all NaN)
for i, v in student_card['name'].items(): #2.iterate over index labels and values
    if "Yang" in v: #3.if there's "Yang" in value
        student_card.loc[i, 'new'] = v #4. set the value on the DataFrame itself; chained student_card['new'].loc[i] may write to a copy
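As a general sketch of the `df.loc[index] = value` pattern, the example below uses a hypothetical frame with string row labels (`'a'`, `'b'`, `'c'`) to show that `.loc` addresses cells by label, not by position. Addressing the row and the column in one `.loc` call on the DataFrame avoids chained assignment.

```python
import pandas as pd

# Hypothetical frame with non-numeric row labels
df = pd.DataFrame({'name': ['Kim', 'Yang', 'Park']}, index=['a', 'b', 'c'])

# Start with an all-NaN object column that shares the frame's index
df['new'] = pd.Series(index=df.index, dtype='object')

# .items() yields (label, value); .loc takes that label directly
for label, v in df['name'].items():
    if "Yang" in v:
        df.loc[label, 'new'] = v

print(df)
```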
CodePudding user response:
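append concatenates two Series. What you want is to access a row. Use an indexer like iloc or iat to do so (both are position-based, which matches the labels here only because the index is the default 0, 1, 2):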
import pandas as pd
student_card = pd.DataFrame({'ID':[20190103, 20190222, 20190531],
                             'name':['Kim', 'Yang', 'Park'],
                             'class':['H', 'W', 'S']})
student_card['new'] = pd.Series(dtype='object') #1.create new column (all NaN)
new_pos = student_card.columns.get_loc('new') # integer position of the new column
for i, v in enumerate(student_card['name']): #2.iterate over row positions and values
    if "Yang" in v: #3.if there's "Yang" in value
        student_card.iat[i, new_pos] = v #4. set the cell at (row position, column position)
Output:
(Index) | ID | name | class | new |
---|---|---|---|---|
0 | 20190103 | Kim | H | NaN |
1 | 20190222 | Yang | W | Yang |
2 | 20190531 | Park | S | NaN |
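One caveat worth sketching: `iat` is position-based, while `.items()` yields index labels. The two coincide for the default 0, 1, 2 index but not in general. The frame below, with made-up labels 10, 20, 30, illustrates one way to stay positional by using `enumerate` and `columns.get_loc`:

```python
import pandas as pd

# Hypothetical frame whose row labels are not 0, 1, 2
df = pd.DataFrame({'name': ['Kim', 'Yang', 'Park']}, index=[10, 20, 30])
df['new'] = pd.Series(index=df.index, dtype='object')

# Integer position of the 'new' column, as required by DataFrame.iat
new_pos = df.columns.get_loc('new')

# enumerate() gives row positions; .items() would give the labels 10, 20, 30
for pos, v in enumerate(df['name']):
    if "Yang" in v:
        df.iat[pos, new_pos] = v

print(df)
```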
CodePudding user response:
You really should not use a loop to manipulate a pandas DataFrame; this is an anti-pattern.
Also, append is now deprecated (and was removed in pandas 2.0).
Use a vectorized approach with boolean indexing:
# select the rows for which name==Yang and add the same name in the new column
student_card.loc[student_card['name'].eq('Yang'), 'new'] = student_card['name']
Or, using where:
# mask all non matching values (name!=Yang) and copy the column
student_card['new'] = student_card['name'].where(student_card['name'].eq('Yang'))
Output:
         ID  name class   new
0  20190103   Kim     H   NaN
1  20190222  Yang     W  Yang
2  20190531  Park     S   NaN
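Note that `eq('Yang')` tests exact equality, whereas the loop in the question tests `"Yang" in v`, a substring check. If substring matching is what is wanted, a sketch along these lines might be closer. The name `'Yangmi'` is made up for illustration and is not in the original data:

```python
import pandas as pd

student_card = pd.DataFrame({'ID': [20190103, 20190222, 20190531],
                             'name': ['Kim', 'Yangmi', 'Park'],  # 'Yangmi' is a made-up name
                             'class': ['H', 'W', 'S']})

# True where the name contains the literal substring 'Yang'
mask = student_card['name'].str.contains('Yang', regex=False)

# Keep matching names, replace everything else with NaN
student_card['new'] = student_card['name'].where(mask)

print(student_card)
```

With `eq('Yang')` the row for `'Yangmi'` would not match; with `str.contains` it does.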