So I have this list:
['test.csv', 'test2.csv']
I need it to have its final form like this:
[('test.csv', 'test.csv'),('test2.csv', 'test2.csv')]
What would be the best way, performance-wise, to do this in Python?
Thanks.
CodePudding user response:
You are looking for a list comprehension.
Try this:
files = ['test.csv', 'test2.csv']
result = [(file, file) for file in files]
CodePudding user response:
I'd zip:
[*zip(lst, lst)]
Benchmark on list(map(str, range(1000))):
28 us [*zip(lst, lst)]
44 us [(file, file) for file in files]
89 us [(file,) * 2 for file in files]
259 us list(map(lambda x: tuple([x] * 2), lst))
Or for repeating each value 10 times instead of 2 (only because someone proposed generalizing like that; I don't think it's something you'd realistically do, and the duplication itself is already an odd thing to do):
67 us [*zip(*[lst] * 10)]
115 us [(file,) * 10 for file in files]
287 us list(map(lambda x: tuple([x] * 10), lst))
Code:
from timeit import repeat
setup = '''
lst = files = list(map(str, range(1000)))
'''
codes = '''
[*zip(lst, lst)]
[(file, file) for file in files]
[(file,) * 2 for file in files]
list(map(lambda x: tuple([x] * 2), lst))
'''.strip().splitlines()
for _ in range(3):
    for code in codes:
        # number=1000 runs per measurement, so t * 1e3 is microseconds per run
        t = min(repeat(code, setup, number=1000))
        print('%3d us ' % (t * 1e3), code)
    print()
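A quick sanity check on the zip variants, as a minimal sketch rather than part of the benchmark: with a list, each argument to zip gets its own iterator, so every element is paired with itself. The repeat count n and the names pairs, triples and grouped are only illustrative. Note that this would not hold for a one-shot iterator, where the same trick groups consecutive elements instead.

```python
files = ['test.csv', 'test2.csv']

# With a list, zip creates an independent iterator per argument,
# so each element ends up paired with itself.
pairs = [*zip(files, files)]
assert pairs == [(f, f) for f in files]

n = 3  # illustrative repeat count, not from the question
triples = [*zip(*[files] * n)]
assert triples == [(f,) * n for f in files]

# Caveat: a single iterator object is shared between the zip arguments,
# so consecutive elements are grouped rather than duplicated.
it = iter(files)
grouped = [*zip(it, it)]

print(pairs)    # [('test.csv', 'test.csv'), ('test2.csv', 'test2.csv')]
print(grouped)  # [('test.csv', 'test2.csv')]
```

So the zip approach is safe for the list in the question, but I'd stick to the comprehension if the input might be a generator.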
CodePudding user response:
A generic version of @grfreitas's answer:
num_times_to_duplicate = 2
files = ["test.csv", "test2.csv"]
result = [(file,) * num_times_to_duplicate for file in files]
print(result)
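If you need this in more than one place, the same comprehension could be wrapped in a small helper. This is just a sketch: the function name duplicate_each and the check for negative counts are my own additions, not something the question asks for.

```python
def duplicate_each(items, times=2):
    """Return a list of tuples, each holding one item repeated `times` times."""
    # Assumption: a negative count is treated as an error here; plain tuple
    # multiplication would silently produce empty tuples instead.
    if times < 0:
        raise ValueError("times must be non-negative")
    return [(item,) * times for item in items]


print(duplicate_each(["test.csv", "test2.csv"]))
# [('test.csv', 'test.csv'), ('test2.csv', 'test2.csv')]
print(duplicate_each(["a", "b"], times=3))
# [('a', 'a', 'a'), ('b', 'b', 'b')]
```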
CodePudding user response:
You can use map:
lst = ['test.csv', 'test2.csv']
lst = list(map(lambda x: tuple([x] * 2), lst))
print(lst) # [('test.csv', 'test.csv'), ('test2.csv', 'test2.csv')]