I can't get my script to work. It should reorder the rows of a table by the value in the first column, following a given order list. It should keep every row for each ID, but it returns only the first occurrence of each one, and I don't know how to fix it.
EXAMPLE TABLE
A B
1211 ds
3245 ssssd
3114 dsf
3114 apple
4324 sssvdff
4324 weewr
4324 bla
4324 orange
1211 something
1211 blue
ORDER LIST EXAMPLE
4324\n
3245\n
3114\n
1211\n
ORDERED TABLE (what I get)
A B
4324 sssvdff
3245 ssssd
3114 dsf
1211 something
ORDERED TABLE (what I want)
A B
4324 sssvdff
4324 weewr
4324 bla
4324 orange
3245 ssssd
3114 dsf
3114 apple
1211 something
1211 blue
CODE:
import re

def map_ids_to_row_list(_id):
    _id = _id.strip('\n')
    for s in lines:
        if re.search(_id, s):
            # Returns on the first match, so later rows with the same ID are never reached.
            return s

file = open('Lorder.txt', 'r').readlines()
names = []
for name in file:
    names.append(name.strip('\n'))

name = 'table_data.txt'
name_split = name.rsplit('.', 1)
new_name = name_split[0] + '_sorted.csv'
table = open(name, 'r')
table_read = table.readlines()
lines = []
for s in table_read:
    lines.append(s)

ordered_table = list(map(map_ids_to_row_list, names))
with open(new_name, 'w') as fs:
    for row in ordered_table:
        fs.write(str(row))
CodePudding user response:
Code:
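# df is the example table loaded as a pandas DataFrame with columns A and B.
l = [4324, 3245, 3114, 1211]
# Turn A into a categorical whose category order follows the order list.
df.A = df.A.astype("category")
df.A = df.A.cat.set_categories(l)
# Sorting a categorical column follows the category order, not the numeric order.
df.sort_values(["A"])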
Output:
A B
4 4324 sssvdff
5 4324 weewr
6 4324 bla
7 4324 orange
1 3245 ssssd
2 3114 dsf
3 3114 apple
0 1211 ds
8 1211 something
9 1211 blue
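For reference, the same idea can be written as a self-contained sketch. The DataFrame below is rebuilt by hand from the example table, and the names order and result are only illustrative. It builds the categorical in one step with pd.Categorical and asks for a stable sort (kind="mergesort") so that rows sharing an ID keep their original relative order. Treat it as one possible variant under those assumptions, not as the only way to do this.

```python
import pandas as pd

# Example table from the question, rebuilt by hand (assumption: two columns, A and B).
df = pd.DataFrame({
    "A": [1211, 3245, 3114, 3114, 4324, 4324, 4324, 4324, 1211, 1211],
    "B": ["ds", "ssssd", "dsf", "apple", "sssvdff",
          "weewr", "bla", "orange", "something", "blue"],
})

# Desired order of the IDs in column A.
order = [4324, 3245, 3114, 1211]

# An ordered categorical makes sort_values follow the order list.
df["A"] = pd.Categorical(df["A"], categories=order, ordered=True)

# mergesort is stable, so rows with the same ID stay in their original order.
result = df.sort_values("A", kind="mergesort")
print(result.to_string(index=False))
```

Any ID that is missing from the order list becomes NaN in the categorical column and is sorted to the end by default.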
CodePudding user response:
Another option is to fall back on Python's built-in sorting:
order_list = [4324, 3245, 3114, 1211]
pd.DataFrame.from_records(sorted(df.to_records(index=False), key=lambda x: order_list.index(x[0])), columns=df.columns)
Output:
A B
0 4324 sssvdff
1 4324 weewr
2 4324 bla
3 4324 orange
4 3245 ssssd
5 3114 dsf
6 3114 apple
7 1211 ds
8 1211 something
9 1211 blue
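The built-in sorting approach also works without pandas, directly on the text lines that the original script reads. The sketch below is a minimal version of that idea. The function name order_rows and the in-memory lists are hypothetical stand-ins for the contents of table_data.txt and Lorder.txt. It relies on sorted() being stable and uses a dictionary of ranks instead of list.index, so each lookup is constant time.

```python
def order_rows(rows, order_ids):
    """Return every row whose first column is in order_ids, arranged in that order."""
    # Map each ID to its position in the order list.
    rank = {id_: pos for pos, id_ in enumerate(order_ids)}
    # Keep only rows whose first column is a known ID (this also drops the "A B" header).
    known = [row for row in rows if row.split() and row.split()[0] in rank]
    # sorted() is stable, so rows sharing an ID keep their original relative order.
    return sorted(known, key=lambda row: rank[row.split()[0]])


# Stand-ins for the lines of the two input files (assumption: whitespace-separated columns).
rows = [
    "A B",
    "1211 ds",
    "3245 ssssd",
    "3114 dsf",
    "3114 apple",
    "4324 sssvdff",
    "4324 weewr",
    "4324 bla",
    "4324 orange",
    "1211 something",
    "1211 blue",
]
order_ids = ["4324", "3245", "3114", "1211"]

ordered = order_rows(rows, order_ids)
for row in ordered:
    print(row)
```

With real files, rows and order_ids would come from reading each file and stripping the trailing newline from every line. Comparing the first column exactly, instead of using re.search, also avoids matching an ID that happens to appear inside column B.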