Rename a row with X unknown characters

Time:03-10

If I have the following dataframe:

ID other
219218 34
823#32 47
unknown 42
8#3#32 32
1#3#5# 97
6#3### 27

I want to obtain the following result:

ID other
219218 34
823#32 47
unknown 42
8#3#32 32
unknown 97
unknown 27

I am using the following code, which works:

# Check each row and overwrite the ID if it contains more than two '#'
for i in range(len(df)):
  ident = df.loc[i, 'ID']
  if ident.count('#') > 2:
    df.loc[i, 'ID'] = 'unknown'

Is there a more efficient way to do this, bearing in mind that I am going to apply the code to a dataframe of more than 60,000 rows?

Thank you for your help.

CodePudding user response:

For an efficient solution, use vectorized string methods to build a boolean mask and assign with loc:

df.loc[df['ID'].str.count('#').gt(2), 'ID'] = 'unknown'

output:

        ID  other
0   219218     34
1   823#32     47
2  unknown     42
3   8#3#32     32
4  unknown     97
5  unknown     27
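
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample data is copied from the question; the variable name too_many_hashes is only illustrative. Note that str.count treats its argument as a regular expression, which makes no difference for a plain '#'.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["219218", "823#32", "unknown", "8#3#32", "1#3#5#", "6#3###"],
    "other": [34, 47, 42, 32, 97, 27],
})

# Boolean mask: True where the ID contains more than two '#' characters.
# (too_many_hashes is an illustrative name, not from the original answer.)
too_many_hashes = df["ID"].str.count("#").gt(2)

# Overwrite only the masked rows, in place.
df.loc[too_many_hashes, "ID"] = "unknown"

print(df)
```

Only the rows where the mask is True are touched, so the 'other' column and the remaining IDs are left as they were.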

CodePudding user response:

Personally, I prefer applying a function to the column:

def replaceRow(value):
  if value.count("#") > 2:
    return "unknown"
  else:
    return value
df["ID"] = df["ID"].apply(replaceRow)
df

Output

ID other
219218 34
823#32 47
unknown 42
8#3#32 32
unknown 97
unknown 27
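
Since the question mentions 60,000 rows, here is a rough sketch for checking that the two approaches agree and for timing them on your own machine. The 60,000-row series is just the six sample IDs repeated, and the names replace_row, with_apply and with_str_count are illustrative assumptions, not part of either answer. No timings are quoted here because they depend on the pandas version and the data.

```python
import timeit

import pandas as pd

# Build a 60,000-row column by repeating the six sample IDs from the question.
sample_ids = ["219218", "823#32", "unknown", "8#3#32", "1#3#5#", "6#3###"]
ids = pd.Series(sample_ids * 10_000, name="ID")


def replace_row(value):
    """Return 'unknown' when the value has more than two '#', else the value."""
    return "unknown" if value.count("#") > 2 else value


def with_apply():
    # Element-wise Python function call, as in the apply-based answer.
    return ids.apply(replace_row)


def with_str_count():
    # Mask built with str.count, as in the loc-based answer;
    # Series.mask replaces values where the condition is True.
    return ids.mask(ids.str.count("#").gt(2), "unknown")


# Both approaches should produce the same values.
same = with_apply().tolist() == with_str_count().tolist()
print("results match:", same)

# Time each approach; the numbers will vary between environments.
print("apply:    ", timeit.timeit(with_apply, number=5))
print("str.count:", timeit.timeit(with_str_count, number=5))
```

Either version avoids the row-by-row df.loc assignment from the question, which is usually the most expensive part of the original loop.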