Home > Enterprise >  Excel, How to split cells by comma delimiter into new cells
Excel, How to split cells by comma delimiter into new cells

Time:05-12

So let's say I have data like this with some delimiter like commas that I want to split to new cells either across to columns or down into rows.

The Data Location
One Museum, Two Museum City A
3rd Park, 4th Park, 5th Park City B

How would you do it in either direction? There are lots of methods why is methods provided preferred?

CodePudding user response:

The Excel manual method: click on Data>Text to Column. Now just copy and past if you want the data in one column. This is only good when the data set is small and your are doing it once.

The Power Query method: This method you do it once for the data source then click refresh button when the data changes in the future. The data source can be almost anything like csv, enter image description here enter image description here

3 - Now choose the split method, there is delimiter and there is 6 other choices.

4 - For this data I when with custom and use ", "

5 & 6 - To split down you have to select Advanced options. Make the selection.

7 Close & Load

This is a good method because you don't have to code in Power Query unless you want to.

CodePudding user response:

The Python method Make sure you have pip installed pandas or use conda to install pandas.

The code is like so:

import pandas as pd

df = pd.read_excel('path/to/myexcelfile.xlsx')
df[['key.0','key.1','key.2']] = df['The Data'].str.split(',', expand=True)
df.drop(columns=['The Data'], inplace = True)
# stop here if you want the data to be split into new columns

The data looks like this

   Location key.0       key.1       key.2
0   City A  One Museum  Two Museum  None
1   City B  3rd park    4th park    5th park

To get the split into rows proceed with the next code part:

stacked = df.set_index('Location').stack()
# set the name of the new series created
df = stacked.reset_index(name='The Data')
# drop the 'source' level (key.*)
df.drop('level_1', axis=1, inplace=True)

Now this is done and it looks like this

    Location  The Data
0   City A    One Museum
1   City A    Two Museum
2   City B    3rd park
3   City B    4th park
4   City B    5th park

The benefit of python is that is faster for larger data sets you can split using regex in probable a 100 ways. The data source can be all types that you would use for power query and more.

  • Related