I want to add columns to a dataframe (df_joined) whose values are tuples taken from a second dataframe (df_tupels). The tuples are (10, 50) and (20, 60).
I tried various approaches, but I always get the same error message: "Length of values (2) does not match length of index (4)". It seems I am misunderstanding a concept here.
This is the desired end state:
Season NY Berlin
Cities
NY spring (10, 50) (20, 60)
NY summer (10, 50) (20, 60)
Berlin spring (10, 50) (20, 60)
Berlin summer (10, 50) (20, 60)
data_ = {'Cities': ['NY', 'NY', 'Berlin', 'Berlin'], 'Season': ['spring','summer', 'spring', 'summer'], 'NY': ['(10, 50)', '(10, 50)', '(10, 50)', '(10, 50)'], 'Berlin': ['(20, 60)', '(20, 60)', '(20, 60)', '(20, 60)']}
df_intended = pd.DataFrame(data_)
df_intended.set_index('Cities', inplace=True)
Code example is:
import pandas as pd
# create the df into which the tuples should be copied as new columns:
data = {'Cities': ['NY', 'NY', 'Berlin', 'Berlin'], 'Season': ['spring','summer', 'spring', 'summer']}
df_joined = pd.DataFrame(data)
df_joined.set_index('Cities', inplace=True)
# create source df (df_tupels)
df_tupels = pd.DataFrame({'Cities': ['NY', 'Berlin'], 'Lat': [10, 20], 'Long': [50, 60]})
df_tupels['tupels'] = df_tupels[['Lat', 'Long']].apply(tuple, axis=1)
df_tupels.set_index('Cities', inplace=True)
# try to create one new column per city in the index and fill it with that city's tuple from df_tupels
for city in df_tupels.index:
    df_joined[city] = df_tupels['tupels'].loc[city]
    # The following alternatives don't work here either:
    # df_joined.loc[:, city] = df_tupels['tupels'].loc[city]
    # df_joined.insert(0, city, df_tupels['tupels'].loc[city])
Error message: ValueError: Length of values (2) does not match length of index (4)
I've noticed posts with similar error messages, but I could not apply them to my problem.
Why isn't the new dataframe column simply filled with the respective tuple? What am I missing?
CodePudding user response:
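pandas treats a tuple as a sequence of column values, so a tuple of length 2 cannot fill a column of 4 rows. You need to create a list of tuples that matches the dataframe's length.
for city in df_tupels.index:
    df_joined[city] = [df_tupels['tupels'].loc[city]] * len(df_joined)
print(df_joined)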
Season NY Berlin
Cities
NY spring (10, 50) (20, 60)
NY summer (10, 50) (20, 60)
Berlin spring (10, 50) (20, 60)
Berlin summer (10, 50) (20, 60)
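To make the cause visible, here is a minimal, self-contained sketch that first reproduces the error and then applies the fix. The names `error_message` and `coords` are only illustrative, and building the tuple column with `zip` is an assumption made for brevity (the question uses `apply(tuple, axis=1)`).

```python
import pandas as pd

# Target frame: four rows, indexed by city.
df_joined = pd.DataFrame(
    {"Cities": ["NY", "NY", "Berlin", "Berlin"],
     "Season": ["spring", "summer", "spring", "summer"]}
).set_index("Cities")

# Source frame: one (Lat, Long) tuple per city.
df_tupels = pd.DataFrame(
    {"Cities": ["NY", "Berlin"], "Lat": [10, 20], "Long": [50, 60]}
).set_index("Cities")
df_tupels["tupels"] = list(zip(df_tupels["Lat"], df_tupels["Long"]))

# Assigning a bare tuple: pandas reads it as 2 row values for a 4-row column.
error_message = None
try:
    df_joined["NY"] = df_tupels.loc["NY", "tupels"]
except ValueError as exc:
    error_message = str(exc)

# Fix: wrap the tuple in a list and repeat it once per row,
# so every row receives the whole tuple as a single value.
for city in df_tupels.index:
    coords = df_tupels.loc[city, "tupels"]
    df_joined[city] = [coords] * len(df_joined)

print(error_message)
print(df_joined)
```

The list multiplication repeats references to the same tuple, which is harmless here because tuples are immutable.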