I get this error when I try to split one column into additional columns in the current DataFrame:
Columns must be same length as key
if 'FULLNAME' in dataset.columns:
    dataset[['FIRSTNAME','LASTNAME']] = dataset.FULLNAME.str.split(" ", 1)
Before that, I set static column names:
dataset.columns = ["NUM", "MMYY", "AGE", "FULLNAME", "ADDRESS", "CITY", "STATE", "ZIP", "COUNTRY", "PHONE"]
How can I fix it?
I suppose some rows have no FULLNAME to extract.
Maybe I should use something like this:
dataset["FIRSTNAME"] = None
dataset["LASTNAME"] = None
if 'FULLNAME' in dataset.columns:
    r = dataset.FULLNAME.str.split(" ", 1)
    if (r[0]):
        dataset["LASTNAME"] = r[0]
    if (r[1]):
        dataset["FIRSTNAME"] = r[1]
The dataset is:
"NUM", "MMYY", "AGE", "FULLNAME", "ADDRESS", "CITY", "STATE", "ZIP", "COUNTRY", "PHONE"
1 1010 18 OLEG Kirova Wage US 1911 US 9584345345
2 1011 19 Krina Kirova Wage US 1911 US 9584345345
3 1012 20 Marina Kirova Wage US 1911 US 9584345345
CodePudding user response:
I think you want to pass expand=True into split so that it returns a DataFrame...
if 'FULLNAME' in dataset.columns:
    # n is passed by keyword; pandas 2.0+ no longer accepts it positionally
    dataset[['FIRSTNAME','LASTNAME']] = dataset.FULLNAME.str.split(" ", n=1, expand=True)
CodePudding user response:
Assuming you have data like this:
import pandas as pd
dataset = pd.DataFrame({"FULLNAME": ["John Smith", "Jane Smith", "Joe Bloggs"]})
Your problem is that
dataset.FULLNAME.str.split(" ", 1)
returns a single Series of lists, not two columns.
print(dataset.FULLNAME.str.split().shape)
(3,)
The docs for split show that you probably want expand=True.
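To make the difference concrete, here is a small sketch comparing the two calls on the sample data above. The variable names as_lists and as_frame are only illustrative, and n=1 is passed by keyword because pandas 2.0 and later reject it as a positional argument.

```python
import pandas as pd

dataset = pd.DataFrame({"FULLNAME": ["John Smith", "Jane Smith", "Joe Bloggs"]})

# Without expand=True: one Series whose values are Python lists
as_lists = dataset["FULLNAME"].str.split(" ", n=1)

# With expand=True: a DataFrame with one column per split piece
as_frame = dataset["FULLNAME"].str.split(" ", n=1, expand=True)

print(type(as_lists).__name__, as_lists.shape)   # Series (3,)
print(type(as_frame).__name__, as_frame.shape)   # DataFrame (3, 2)
```

Only the second result has two columns, which is what assigning to ['FIRSTNAME', 'LASTNAME'] needs.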
Here's a MWE...
import pandas as pd
dataset = pd.DataFrame({"FULLNAME": ["John Smith", "Jane Smith", "Joe Bloggs"]})
if 'FULLNAME' in dataset.columns:
    dataset[['FIRSTNAME','LASTNAME']] = dataset.FULLNAME.str.split(expand=True)
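If some rows hold only a single word (like OLEG in the question's data), a variant along these lines may be more robust. This is only a sketch: the sample names are made up, it assumes you want to split on the first space only (n=1), and it uses reindex to guarantee that a second column exists even when no row contains a space.

```python
import pandas as pd

# Made-up sample: one single-word name, one two-word name, one three-word name
dataset = pd.DataFrame({"FULLNAME": ["OLEG", "Krina Kirova", "Anna Maria Lopez"]})

if "FULLNAME" in dataset.columns:
    # Split on the first space only, so at most two pieces come back
    parts = dataset["FULLNAME"].str.split(" ", n=1, expand=True)
    # Ensure columns 0 and 1 both exist; a missing one is filled with NaN
    parts = parts.reindex(columns=[0, 1])
    dataset["FIRSTNAME"] = parts[0]
    dataset["LASTNAME"] = parts[1]

print(dataset)
```

Rows without a second word end up with a missing value in LASTNAME, which you can fill afterwards with fillna("") if an empty string suits you better.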