I am new to Python and am trying to split text data into columns so that each record becomes a row. Suppose I have 100 records and need to split each one by character position: positions 1-7 are the first column, 8-11 the second, 12-15 the third, 16-25 the fourth, and so on. Can anyone help me with this?
Example text format:
1. animals120 redlivinginjungle
2. worldis2021 skybluecolour
The two records above are examples of the data to be split into rows.
Expected output format:
1. animals 120 red living injungle
2. worldis 202 sky blue colour
CodePudding user response:
Use the .str accessor to slice every string in the column by position:
# Map each new column name to its [start, end) character offsets (end is exclusive)
column_splits = {'first': [0, 7], 'second': [7, 10]}
for column, limits in column_splits.items():
    start, end = limits
    df[column] = df['your_column'].str[start:end]
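As a minimal, self-contained sketch of the loop above, the following builds a small frame and slices it. The sample strings are the ones used elsewhere in this thread; the 'third' column and its offsets (11 to 14, skipping the space at position 10) are assumptions added for illustration, and 'your_column' is just the placeholder name from the snippet.

```python
import pandas as pd

# Sample records from this thread; 'your_column' is a placeholder name.
df = pd.DataFrame({"your_column": ["animals120 redlivinginjungle",
                                   "animals140 redlivinginjungle"]})

# Hypothetical column names mapped to (start, end) offsets; end is exclusive.
column_splits = {"first": (0, 7), "second": (7, 10), "third": (11, 14)}

for column, (start, end) in column_splits.items():
    # .str[start:end] applies the same slice to every string in the Series
    df[column] = df["your_column"].str[start:end]

print(df[["first", "second", "third"]])
```

Because the slices are positional, this only works when every record uses the same fixed widths.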
CodePudding user response:
You can combine itertools.tee and zip_longest to pair each split index with the one that follows it.
Function to split:
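from itertools import tee, zip_longest

def split_by_index(s):
    # Character offsets where each piece starts
    indices = [0, 7, 10, 14, 20]
    start, end = tee(indices)
    next(end)  # shift the second iterator so it yields the end of each piece
    # The last pair is (20, None), so the final slice runs to the end of the string
    return " ".join([s[i:j] for i, j in zip_longest(start, end)])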
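To see what the tee and zip_longest pairing produces, here is a small sketch that runs the same steps outside the function on one of the sample strings. The variable names pairs and pieces are only for illustration.

```python
from itertools import tee, zip_longest

indices = [0, 7, 10, 14, 20]
start, end = tee(indices)
next(end)  # advance the second iterator by one position

# Each start index is paired with the next index; the last one is paired with None
pairs = list(zip_longest(start, end))
print(pairs)   # [(0, 7), (7, 10), (10, 14), (14, 20), (20, None)]

s = "animals120 redlivinginjungle"
pieces = [s[i:j] for i, j in pairs]
print(pieces)  # ['animals', '120', ' red', 'living', 'injungle']
```

The third piece keeps the leading space from the source text; that does no harm here because the pieces are later split on whitespace.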
Your data:
import pandas as pd
df = pd.DataFrame()
df["sentence"] = ["animals120 redlivinginjungle",
                  "animals140 redlivinginjungle",
                  "animals160 redlivinginjungle"]
                       sentence
0  animals120 redlivinginjungle
1  animals140 redlivinginjungle
2  animals160 redlivinginjungle
Then apply the function and split the result on whitespace to create a new DataFrame:
new_df = df["sentence"].apply(split_by_index).str.split(expand=True)
Output
print(new_df)
         0    1    2       3         4
0  animals  120  red  living  injungle
1  animals  140  red  living  injungle
2  animals  160  red  living  injungle