I have the following input:
my_list = ["x d1","y d1","z d2","t d2"]
And I would like to transform it into:
Expected_result = ["d1(x,y)","d2(z,t)"]
I resorted to brute force and had to call pandas to my rescue, since I couldn't find a way to do it in plain (vanilla) Python. Is there another way to solve this?
import pandas as pd
my_list = ["x d1","y d1","z d2","t d2"]
df = pd.DataFrame(my_list,columns=["col1"])
df2 = df["col1"].str.split(" ",expand = True)
df2.columns = ["col1","col2"]
grp = df2.groupby("col2")
result = []
for grp_name, data in grp:
    # build "key(value1,value2,...)" for each group
    res = grp_name + "(" + ",".join(data["col1"]) + ")"
    result.append(res)
print(result)
CodePudding user response:
If your values are already sorted by key (d1, d2), you can use itertools.groupby:
from itertools import groupby
out = [f"{k}({','.join(x[0] for x in g)})"
       for k, g in groupby(map(str.split, my_list), lambda x: x[1])]
Output:
['d1(x,y)', 'd2(z,t)']
Otherwise you should use a dictionary as shown by @Jamiu.
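If the list is not already grouped by key, another option is to sort it by key before calling itertools.groupby. A minimal sketch, assuming every item has the form "value key"; the interleaved sample list and the name pairs are made up for illustration:

```python
from itertools import groupby

# Hypothetical unsorted input: the d1 and d2 items are interleaved.
my_list = ["x d1", "z d2", "y d1", "t d2"]

# Split each item into [value, key], then sort by key so that equal keys
# become adjacent. sorted() is stable, so values keep their original
# relative order within each group.
pairs = sorted(map(str.split, my_list), key=lambda p: p[1])

out = [f"{k}({','.join(p[0] for p in g)})"
       for k, g in groupby(pairs, key=lambda p: p[1])]
print(out)
```

Note that the keys are compared as strings here, so "d10" would sort before "d2", and the groups come out in sorted key order rather than in order of first appearance.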
A variant of your pandas solution:
out = (df['col1'].str.split(n=1, expand=True)
       .groupby(1)[0]
       .apply(lambda g: f"{g.name}({','.join(g)})")
       .tolist()
      )
CodePudding user response:
Here is one approach:
result = {}
for item in my_list:
    key, value = item.split()
    # group the first token under the second one (d1, d2, ...)
    result.setdefault(value, []).append(key)
output = [f"{k}({','.join(v)})" for k, v in result.items()]
print(output)
Output:
['d1(x,y)', 'd2(z,t)']
CodePudding user response:
my_list = ["x d1","y d1","z d2","t d2"]
res = []
for item in my_list:
    a, b, *_ = item.split()
    # extend the last group only if it has exactly the same key
    if res and res[-1].startswith(f'{b}('):
        res[-1] = res[-1].replace(')', f',{a})')
    else:
        res.append(f'{b}({a})')
print(res)
Output:
['d1(x,y)', 'd2(z,t)']
Let N be the number that follows d. This code works for any number of elements within each dN and for any value of N, as long as the items of each dN are adjacent in the list, that is, all d1 items come before the d2 items, which come before the d3 items, and so on. The values can be any strings, as long as each item has the form "val_in_dN dN", with the value first and dN second.
If you need something that works even when the dN are not in sequence, just say the word, but it will take a little more code.
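For reference, a version that does not depend on the dN being in sequence could look like the sketch below. It swaps the string-patching loop for a dictionary of lists; the function name group_by_key is hypothetical, and it assumes every item is exactly "value key" separated by whitespace:

```python
from collections import defaultdict

def group_by_key(items):
    """Group "value key" strings by key, in order of first appearance."""
    groups = defaultdict(list)
    for item in items:
        value, key = item.split()   # raises ValueError if an item is malformed
        groups[key].append(value)
    # dicts preserve insertion order (Python 3.7+), so keys come out
    # in the order they were first seen
    return [f"{key}({','.join(values)})" for key, values in groups.items()]

print(group_by_key(["x d1", "z d2", "y d1", "t d2"]))
```

The extra cost is one dictionary holding all groups in memory, in exchange for not caring about the order of the input.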
CodePudding user response:
import itertools as it
my_list = [e.split(' ') for e in ["x d1","y d1","z d2","t d2"]]
r = []
for key, group in it.groupby(my_list, lambda x: x[1]):
    values = [e[0] for e in group]
    # join all values so that groups of any size work, not just pairs
    r.append("{0}({1})".format(key, ",".join(values)))
print(r)
Output:
['d1(x,y)', 'd2(z,t)']
CodePudding user response:
Another possible solution, which is based on pandas:
import numpy as np
import pandas as pd
(pd.DataFrame(np.array([str.split(x, ' ') for x in my_list]), columns=['b', 'a'])
 .groupby('a')['b'].apply(lambda x: f"({', '.join(x)})")
 .reset_index().sum(axis=1).tolist())
Output:
['d1(x, y)', 'd2(z, t)']