Home > OS >  Is there a way to replace an entire object from a panda dataframe?
Is there a way to replace an entire object from a panda dataframe?

Time:09-21

I have a date frame that I need to switch out values from it, like "BOX TRUCK" to just "TRUCK". I have tried to use the str.replace function but unfortunately it only changes the value of "TRUCK" in "BOX TRUCK" to "TRUCK" so I am left with the same issue. I need to be able to change multiple different values "BOX TRUck, FIRE TRUCK, Dumpster truck, etc." and so I have tried to look for "truck" and change them but it has not been working.

dft['VEHICLE_TYPE']  = dft['VEHICLE_TYPE'].str.replace(r'truck', 'TRUCK', regex=True, flags=re.IGNORECASE)

CodePudding user response:

Another method would be using a mask with str.contains instead of str.replace.

Also, rather than using an apply, use pd.Series.mask

mask = dft['VEHICLE_TYPE'].str.contains('truck', case=False)
dft['VEHICLE_TYPE'] = dft['VEHICLE_TYPE'].mask(mask, 'TRUCK')

CodePudding user response:

Considering that OP has a dataframe like the following

df = pd.DataFrame(np.random.randint(10,100,size=(10, 2)), columns=['Count', 'Vehicle_type'])
df['Vehicle_type'] = np.random.choice(['BOX TRUCK', 'FIRE TRUCK', 'Dumpster truck'], p=[0.5, 0.3, 0.2], size=10)


[Out]:
   Count    Vehicle_type
0     35       BOX TRUCK
1     30  Dumpster truck
2     34      FIRE TRUCK
3     69       BOX TRUCK
4     13       BOX TRUCK
5     70       BOX TRUCK
6     53      FIRE TRUCK
7     56  Dumpster truck
8     43  Dumpster truck
9     40       BOX TRUCK

Assuming that the goal is to change to TRUCK anytime the value is BOX TRUCK, FIRE TRUCK, or, Dumpster truck, then the following will do the work

df['Vehicle_type'] = df['Vehicle_type'].apply(lambda x: 'TRUCK' if x.lower() in ['box truck', 'fire truck', 'dumpster truck'] else x)


   Count Vehicle_type
0     79        TRUCK
1     74        TRUCK
2     25        TRUCK
3     38        TRUCK
4     24        TRUCK
5     45        TRUCK
6     83        TRUCK
7     20        TRUCK
8     63        TRUCK
9     49        TRUCK

Notes:

  • One is using x.lower() to be able to ignore case.

  • If the column has additional categories/vehicle types that one wants to consider for the change, one can just add them to the list in the lambda function (write them with lower case).

An alternative, that doesn't require one to know the types that exist, as longs as it contains the string truck is using str.contains, as follows

df['Vehicle_type'] = df['Vehicle_type'].apply(lambda x: 'TRUCK' if x.lower().str.contains('truck') else x)

Or with str.replace, such as

df['Vehicle_type'] = df['Vehicle_type'].apply(lambda x: x.lower().str.replace('truck', 'TRUCK'))

If, on another hand, the goal is to replace, from the column Vehicle_type, BOX TRUCK to Regular Truck (the same logic can be applied to other changes), the following will do the work

df['Vehicle_type'] = df['Vehicle_type'].replace('BOX TRUCK', 'Regular Truck')

[Out]:
   Count    Vehicle_type
0     35   Regular Truck
1     30  Dumpster truck
2     34      FIRE TRUCK
3     69   Regular Truck
4     13   Regular Truck
5     70   Regular Truck
6     53      FIRE TRUCK
7     56  Dumpster truck
8     43  Dumpster truck
9     40   Regular Truck
  • Related