Manipulating Values in Pandas DataFrames

Time:02-06

I am trying to write and apply a function, def change(x):, that modifies a single column, grocery, in the grocery DataFrame shown below.

I want each value reduced to just the product name in title case, as in the expected DataFrame at the end of this post.

I am a beginner in Python, but I know I can use map() or apply() for this. My main problem is using split() to get the result, because the values in the grocery column vary in length. Are there other string manipulation methods that could be used instead?

import pandas as pd

groceries = {
'grocery':["Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea", "Sainsbury's croissant", "Morrison's doughnut", "Amazon fresh's peppermint tea", "Bar becan's pizza", "Pound savers' shower gel"],
'category':['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
'price':[0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}

df = pd.DataFrame(groceries)
df

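# attempted function to modify a single column of values - grocery
# (it ignores x and splits the whole df['grocery'] column on spaces,
# keeping only the second token, so it does not give the expected result)
def change(x):
    return df['grocery'].str.split(' ').str[1]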

df = pd.DataFrame(groceries)

df['grocery'] = df['grocery'].map(change)
df


# Expected DataFrame
groceries = pd.DataFrame({
'grocery':['Wafers', 'Shortbread', 'Lemon Tea', 'Croissant', 'Doughnut', 'Peppermint Tea', 'Pizza', 'Shower Gel'],
'category':['biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery', 'hygiene'],
'price':[0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
})

CodePudding user response:

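Assuming you have a DataFrame df with your original data, the line below keeps the second whitespace-separated word in title case. That is correct when the store name and the product are one word each ("Tesco's wafers" becomes "Wafers"), but not for multi-word values ("Aldi's lemon tea" becomes "Lemon"):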

df['grocery'] = df['grocery'].apply(lambda x: x.split()[1].title())
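
Because store names and products in the sample data can span several words, a per-value function in the shape the question asks for (def change(x)) could look roughly like the sketch below. It assumes every value contains an apostrophe followed by an optional possessive "s" and a space; the name sample is only illustrative, and a value without an apostrophe would come back as an empty string.

```python
import pandas as pd


def change(x):
    """Return the product part of "<store>'s <product>" in title case."""
    # partition() splits once at the first apostrophe: (store, "'", rest)
    _, _, rest = x.partition("'")
    # rest is e.g. "s wafers", or " shower gel" after a bare apostrophe
    if rest.startswith("s "):
        rest = rest[1:]  # drop the possessive "s"
    return rest.strip().title()


# Small illustrative subset of the question's data (hypothetical name: sample)
sample = pd.DataFrame({
    "grocery": [
        "Tesco's wafers",
        "Aldi's lemon tea",
        "Amazon fresh's peppermint tea",
        "Pound savers' shower gel",
    ]
})

# map() calls change() once per value, so x is a plain string here
sample["grocery"] = sample["grocery"].map(change)
print(sample)
```

Inside change(), x is a single string rather than the whole column, which is why ordinary str methods such as partition() and title() apply directly.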

CodePudding user response:

I hope this works for you. I split each string on the apostrophe (') and keep the second piece from index 1 onward. Further conditions may be needed depending on your data.

import pandas as pd

groceries = {
    'grocery': [
        "Tesco's wafers", "Asda's shortbread", "Aldi's lemon tea",
        "Sainsbury's croissant", "Morrison's doughnut",
        "Amazon fresh's peppermint tea", "Bar becan's pizza",
        "Pound savers' shower gel"
    ],
    'category': [
        'biscuit', 'biscuit', 'tea', 'bakery', 'bakery', 'tea', 'bakery',
        'hygiene'
    ],
    'price': [0.99, 1.24, 1.89, 0.75, 0.50, 2.5, 4.99, 2]
}

df = pd.DataFrame(groceries)
# Split on the apostrophe, drop the first character of the second piece
# (the possessive "s" or a space), then strip and title-case the rest.
# If the grocery strings need several rules, use a named function instead:

# def grocery_chng(x):
#     # specify multiple conditions to replace a string
#     return x
# df['grocery'] = df['grocery'].apply(grocery_chng)

df['grocery'] = df['grocery'].apply(lambda x: x.split("'")[1][1:].strip().title())
df
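
On the question of other string manipulation methods: one possible alternative is pandas' vectorized .str accessor with a regular expression, which avoids a Python-level function call per row. The sketch below is only an illustration under the assumption that each value has the form "<store>'s <product>" or "<store>' <product>"; the names sample and cleaned are hypothetical.

```python
import pandas as pd

# Small illustrative subset of the question's data (hypothetical name: sample)
sample = pd.DataFrame({
    "grocery": [
        "Tesco's wafers",
        "Aldi's lemon tea",
        "Bar becan's pizza",
        "Pound savers' shower gel",
    ]
})

# Remove everything up to the first apostrophe, an optional possessive "s",
# and the whitespace that follows; then title-case what is left.
cleaned = (
    sample["grocery"]
    .str.replace(r"^[^']*'s?\s+", "", regex=True)
    .str.title()
)
print(cleaned.tolist())
```

Requiring whitespace after the optional "s" keeps a product that itself begins with "s" (such as "shower gel") intact.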