I have the following line in my code where I'm taking a string and splitting it based on a delimiter:
task_df[['Project','Section']] = task_df.Projects.str.split(": ",expand=True)
#sample Projects = Rob's Project: Untitled Section
But I run into issues whenever someone adds a Project or Section name that also contains my delimiter, e.g. Project X: Section: Rob
error: ValueError: Columns must be same length as key
NOTE: Sometimes the extra ": " is in the Project name, but most of the time it's in the Section name.
How would I account for this in my code? Is there a clean way to avoid the error? If not, how can I just drop the rows that would cause it?
CodePudding user response:
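IIUC, you only need two parts: everything before the first separator is the Project, and everything after it is the Section. If the separator appears more than once, the remaining pieces should be joined back together with that same separator. Therefore: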
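# First piece before the first ": " is the project name
task_df['Project'] = task_df['Projects'].str.split(": ").str[0]
# Re-join every remaining piece with ": " (.str.join leaves missing values as NaN instead of raising)
task_df['Section'] = task_df['Projects'].str.split(": ").str[1:].str.join(": ")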
For this dataset:
                          Projects
0  Rob's Project: Untitled Section
1          Project X: Section: Rob
This is the output:
                          Projects        Project           Section
0  Rob's Project: Untitled Section  Rob's Project  Untitled Section
1          Project X: Section: Rob      Project X      Section: Rob
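A possibly shorter variant of the same idea, as a sketch rather than a definitive fix: str.split accepts an n argument that caps the number of splits, so n=1 splits only at the first ": " and expand=True yields at most two columns. The sample frame below is rebuilt from the two rows above, and clean_df is a hypothetical name used only for illustration. If the extra ": " is usually in the project name instead, str.rsplit with n=1 splits at the last separator. Note that if no row contains the separator at all, only one column comes back and the assignment would still fail.

```python
import pandas as pd

# Sample data taken from the two rows shown above
task_df = pd.DataFrame({
    "Projects": [
        "Rob's Project: Untitled Section",
        "Project X: Section: Rob",
    ]
})

# n=1 stops after the first ": ", so there are never more than two columns
task_df[["Project", "Section"]] = task_df["Projects"].str.split(": ", n=1, expand=True)

# Alternative asked about in the question: keep only the rows that
# contain the separator exactly once (clean_df is a hypothetical name)
clean_df = task_df[task_df["Projects"].str.count(": ") == 1]

print(task_df)
print(clean_df)
```

The first approach keeps every row and treats everything after the first separator as the section; the second one simply discards ambiguous rows, which only makes sense if losing them is acceptable.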