Is there another way to add the information to the datafame, an easier way?
I have a dataframe which looks like the following:
Product Price
2023-01-03 Apple 2.00
2023-01-04 Apple 2.10
2023-01-05 Apple 1.90
2023-01-03 Banana 1.10
2023-01-04 Banana 1.15
2023-01-05 Banana 1.30
2023-01-03 Cucumber 0.50
2023-01-04 Cucumber 0.80
2023-01-05 Cucumber 0.55
I have two additional information:
Apple Fuit
Banana Fruit
Cucumber Vegetable
and
Apple UK
Banana Columbia
Cucumber Mexico
this needs to be added in an additional column as shown below in the results.
Product Price Category Country
2023-01-03 Apple 2.00 Fuit UK
2023-01-04 Apple 2.10 Fuit UK
2023-01-05 Apple 1.90 Fuit UK
2023-01-03 Banana 1.10 Fuit Columbia
2023-01-04 Banana 1.15 Fuit Columbia
2023-01-05 Banana 1.30 Fuit Columbia
2023-01-03 Cucumber 0.50 Vegtable Mexico
2023-01-04 Cucumber 0.80 Vegtable Mexico
2023-01-05 Cucumber 0.55 Vegtable Mexico
The code lokks like the following:
import pandas as pd
df = pd.DataFrame({
'Product': ['Apple', 'Apple', 'Apple',
'Banana', 'Banana', 'Banana',
'Cucumber', 'Cucumber', 'Cucumber'],
'Price': [2.0, 2.1, 1.9,
1.1, 1.15, 1.3,
0.5, 0.8, 0.55]},
index=['2023-01-03', '2023-01-04', '2023-01-05',
'2023-01-03', '2023-01-04', '2023-01-05',
'2023-01-03', '2023-01-04', '2023-01-05'],
)
df.index = pd.to_datetime(df.index)
print(df)
def category(row):
if row['Product'] == 'Apple':
return 'Fuit'
elif row['Product'] == 'Banana':
return 'Fuit'
else:
return 'Vegtable'
def country(row):
if row['Product'] == 'Apple':
return 'UK'
elif row['Product'] == 'Banana':
return 'Columbia'
else:
return 'Mexico'
df['Category'] = df.apply(lambda row: category(row), axis=1)
df['Country'] = df.apply(lambda row: country(row), axis=1)
print(df)
CodePudding user response:
You can put the infos in a nested dict
, then make a intermediate DataFrame
for the merge
:
infos = pd.DataFrame({"Category": {"Apple": "Fuit", "Banana": "Fruit", "Cucumber": "Vegetable"},
"Country": {"Apple": "UK", "Banana": "Columbia", "Cucumber": "Mexico"}})
df = df.merge(infos, left_on="Product", right_index=True, how="left")
Output :
print(df)
Product Price Category Country
2023-01-03 Apple 2.00 Fuit UK
2023-01-04 Apple 2.10 Fuit UK
2023-01-05 Apple 1.90 Fuit UK
2023-01-03 Banana 1.10 Fruit Columbia
2023-01-04 Banana 1.15 Fruit Columbia
2023-01-05 Banana 1.30 Fruit Columbia
2023-01-03 Cucumber 0.50 Vegetable Mexico
2023-01-04 Cucumber 0.80 Vegetable Mexico
2023-01-05 Cucumber 0.55 Vegetable Mexico
CodePudding user response:
df['Category'] = df.apply(lambda row: 'Fruit' if (row['Product'] == 'Apple') or (row['Product'] == 'Banana') else 'Vegetable', axis=1)
df['Country'] = df.apply(lambda row: 'UK' if row['Product'] == 'Apple' else ('Columbia' if row['Product'] == 'Banana' else 'Mexico'), axis=1)