I have this data frame:
Metric ProcId TimeStamp Value
CPU proce_123 Mar-11-2022 11:00:00 1.4453125
CPU proce_126 Mar-11-2022 11:00:00 0.058320373
CPU proce_123 Mar-11-2022 11:00:00 0.095274389
CPU proce_000 Mar-11-2022 11:00:00 0.019654088
CPU proce_144 Mar-11-2022 11:00:00 0.019841269
CPU proce_1 Mar-11-2022 11:00:00 0.234741792
CPU proce_100 Mar-11-2022 11:00:00 5.32945776
CPU proce_57777 Mar-11-2022 11:00:00 0.25390625
CPU proce_0000 Mar-11-2022 11:00:00 0.019349845
CPU proce_123 Mar-11-2022 11:00:00 0.019500781
CPU proce_123 Mar-11-2022 11:00:00 2.32421875
CPU proce_123 Mar-11-2022 11:00:00 68.3903656
CPU proce_123 Mar-11-2022 11:00:00 0.057781201
CPU proce_123 Mar-11-2022 11:00:00 0.416666627
This is just a sample; the actual data frame has thousands of rows. I need to go through the "ProcId" column of this data frame in chunks and, for each iteration, create a string combining the ProcIds in that chunk.
For example, with a chunk size of 3, the string needs to look like this:
proce_123",%2proce_126",%2proce_123")
Please note that after the 3rd item of each chunk we need to add "")". After the first and second we need to add "",%2".
I can do something like this to print out the chunks:
n = 3  # size of chunks
chunks = []  # list of chunks
for i in range(0, len(id), n):
    chunks.append(id[i:i + n])
I am not sure how I would combine these 3 items into one string and add the other strings at the end. Can anybody help here?
CodePudding user response:
chunk_size = 3
list_of_proc_ids = []
# First, generate a list of the procIds
for proc_id in id:  # Assumes `id` is the sequence of ProcId values from the question
    list_of_proc_ids.append(proc_id)  # Not sure how you're building this, guessing you use a slice on the string line?
final_str = ''
# Then enumerate through that list, adding a unique ending at every third
for index, obj in enumerate(list_of_proc_ids):
    final_str += str(obj)
    if (index + 1) % chunk_size == 0:  # Checks if divisible by 3, accounting for the 0-based index
        final_str += '")'
    else:
        final_str += '",%2'
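The loop above can be wrapped into a small function and checked against a short list. This is only a sketch: the function name `join_proc_ids` and the sample list are made up for illustration, and the IDs are assumed to already be available as a plain Python list.

```python
def join_proc_ids(proc_ids, chunk_size=3):
    """Concatenate IDs, ending every chunk_size-th item with '")' and the rest with '",%2'."""
    parts = []
    for index, proc_id in enumerate(proc_ids):
        # (index + 1) is a multiple of chunk_size on the last item of each full chunk
        suffix = '")' if (index + 1) % chunk_size == 0 else '",%2'
        parts.append(str(proc_id) + suffix)
    # Joining once at the end avoids repeated string concatenation on long lists
    return ''.join(parts)


# Hypothetical sample taken from the first rows of the question's data
sample_ids = ['proce_123', 'proce_126', 'proce_123', 'proce_000']
result = join_proc_ids(sample_ids)
print(result)  # proce_123",%2proce_126",%2proce_123")proce_000",%2
```

Note that an incomplete last chunk ends with '",%2' here, exactly as in the loop above; whether that is wanted depends on how the string is used afterwards.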
CodePudding user response:
For efficiency, use a vectorized approach:
import numpy as np
N = 3
# choose the suffix for each row: '")' on every Nth ProcId, '",%2' otherwise
s = np.where(np.arange(len(df))%N < N-1, '",%2', '")')
# concatenate strings
out = (df['ProcId'] + '_' + s).str.cat()
Output: 'proce_123_",%2proce_126_",%2proce_123_")proce_000_",%2proce_144_",%2proce_1_")proce_100_",%2proce_57777_",%2proce_0000_")proce_123_",%2proce_123_",%2proce_123_")proce_123_",%2proce_123_",%2'
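If one string per chunk is needed instead of a single long string, as the example in the question suggests, the chunks can be joined individually. The following is a plain-Python sketch rather than a vectorized one; the name `chunk_strings` is hypothetical, the list stands in for something like `df['ProcId'].tolist()`, and it assumes that a shorter final chunk should also be closed with '")'.

```python
def chunk_strings(proc_ids, chunk_size=3):
    """Return one string per chunk: IDs separated by '",%2' and closed with '")'."""
    strings = []
    for start in range(0, len(proc_ids), chunk_size):
        chunk = proc_ids[start:start + chunk_size]
        # The separator goes between items; the closing marker is added once per chunk
        strings.append('",%2'.join(chunk) + '")')
    return strings


# Hypothetical stand-in for the ProcId column of the data frame
proc_ids = ['proce_123', 'proce_126', 'proce_123', 'proce_000', 'proce_144']
per_chunk = chunk_strings(proc_ids)
for text in per_chunk:
    print(text)
# proce_123",%2proce_126",%2proce_123")
# proce_000",%2proce_144")
```

Unlike the output above, this version does not insert the extra '_' after each ProcId.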