I have a data frame as below.
pl.DataFrame({'combine_address':[ ["Yes|#456 Lane|Apt#4|ABC|VA|50566", "Yes|#456 Lane|Apt#4|ABC|VA|50566", "No|#456 Lane|Apt#4|ABC|VA|50566"],
["No|#8495|APT#94|SWE|WA|43593", "No|#8495|APT#94|SWE|WA|43593", "Yes|#8495|APT#94|SWE|WA|43593"]
]})
Here combine_address is a list-type column, and each element of a list is a string of six pipe-separated (|) values. I would like to split every element of each list on that separator (|).
Here is the expected output:
If a list has 3 elements, the split should produce 3*6=18 columns.
If a list has 5 elements, the split should produce 5*6=30 columns, and so on.
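To make the column arithmetic concrete, here is a minimal plain-Python sketch of the split for a single list. The helper name split_row is made up for illustration and is not part of Polars; it only shows how 3 elements with 6 fields each give 18 values.

```python
# Hypothetical helper (not a Polars function): split every element of one
# list on the separator and collect all pieces into a single flat list.
def split_row(addresses, sep="|"):
    fields = []
    for item in addresses:
        fields.extend(item.split(sep))
    return fields

row = [
    "Yes|#456 Lane|Apt#4|ABC|VA|50566",
    "Yes|#456 Lane|Apt#4|ABC|VA|50566",
    "No|#456 Lane|Apt#4|ABC|VA|50566",
]

fields = split_row(row)
print(len(fields))  # 3 elements * 6 fields each = 18
```

A list with 5 such elements would give 30 values in the same way.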
CodePudding user response:
import polars as pl
import pandas as pd
df=pd.DataFrame({'combine_address':[ ["Yes|#456 Lane|Apt#4|ABC|VA|50566", "Yes|#456 Lane|Apt#4|ABC|VA|50566", "No|#456 Lane|Apt#4|ABC|VA|50566"],
["No|#8495|APT#94|SWE|WA|43593", "No|#8495|APT#94|SWE|WA|43593", "Yes|#8495|APT#94|SWE|WA|43593"]
]})
The above reproduces your original data, here as a pandas DataFrame. Then you can try the following.
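a = []
for i in range(len(df['combine_address'])):
    # split every element of row i on '|' and collect the resulting lists
    a += [j.split('|') for j in df['combine_address'][i]]
b = []
for i in range(len(a)):
    # flatten the list of lists into one flat list
    b += a[i]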
You will then get a list b with 36 elements.
c=pd.DataFrame(b).T
pl.from_pandas(c)
This is similar to your expected output, with shape (1, 36).
I hope this will help you.
CodePudding user response:
Is this what you are looking for?
df = pl.DataFrame({"combine_address":[
["Yes|#456 Lane|Apt#4|ABC|VA|50566", "Yes|#456 Lane|Apt#4|ABC|VA|50566", "No|#456 Lane|Apt#4|ABC|VA|50566"],
["No|#8495|APT#94|SWE|WA|43593", "No|#8495|APT#94|SWE|WA|43593", "Yes|#8495|APT#94|SWE|WA|43593"]
]})
(df.select(
    pl.col("combine_address").reshape((1, -1))
    .arr.join("|").str.split("|")
    .arr.to_struct(n_field_strategy="max_width")
).unnest("combine_address"))
shape: (1, 36)
┌─────────┬───────────┬─────────┬─────────┬─────┬──────────┬──────────┬──────────┬──────────┐
│ field_0 ┆ field_1 ┆ field_2 ┆ field_3 ┆ ... ┆ field_32 ┆ field_33 ┆ field_34 ┆ field_35 │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ str ┆ str ┆ str │
╞═════════╪═══════════╪═════════╪═════════╪═════╪══════════╪══════════╪══════════╪══════════╡
│ Yes ┆ #456 Lane ┆ Apt#4 ┆ ABC ┆ ... ┆ APT#94 ┆ SWE ┆ WA ┆ 43593 │
└─────────┴───────────┴─────────┴─────────┴─────┴──────────┴──────────┴──────────┴──────────┘
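The join-then-split step above may look odd at first, so here is a small plain-Python sketch of the idea (standard library only, no Polars). It assumes the same two rows as in the question and checks that joining all strings with "|" and splitting once gives the same 36 values as splitting each string and flattening.

```python
from itertools import chain

rows = [
    ["Yes|#456 Lane|Apt#4|ABC|VA|50566",
     "Yes|#456 Lane|Apt#4|ABC|VA|50566",
     "No|#456 Lane|Apt#4|ABC|VA|50566"],
    ["No|#8495|APT#94|SWE|WA|43593",
     "No|#8495|APT#94|SWE|WA|43593",
     "Yes|#8495|APT#94|SWE|WA|43593"],
]

# Put all six strings from both rows into one flat list.
flat = list(chain.from_iterable(rows))

# Route 1: join everything with the separator, then split once.
joined_then_split = "|".join(flat).split("|")

# Route 2: split each string separately, then flatten the pieces.
split_then_flat = [field for item in flat for field in item.split("|")]

assert joined_then_split == split_then_flat
print(len(joined_then_split))  # 6 strings * 6 fields each = 36
```

Both routes agree because the strings are joined with the same separator they are later split on.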