I've written Python code that suggests which vehicles I should use based on the volume I need to fit inside them.
Now I need to create X rows, where X is the total number of vehicles my code suggested. In this example, it suggested 4 type_1 and 1 type_3 vehicles, so I need to create 5 rows with the same information (route, date) in each of them, and the "vehicle type" column should hold the corresponding vehicle type.
I hope that was clear; please let me know if any part of my question needs more detail.
Thanks!
CodePudding user response:
This approach loops over every row and builds a new dataframe in the format you described. I added a second row to show that it works for more than one route.
import pandas as pd

df = pd.DataFrame({
    "route": ["Route ABC", "Route DEF"],
    "date": ["2022-07-11", "2022-09-10"],
    "vehicle_type_1": [4, 1],
    "vehicle_type_2": [0, 2],
    "vehicle_type_3": [1, 3]
})

# Empty lists to append data into
route = []
date = []
vType = []

for x in range(len(df)):
    for y in df.columns:
        if "vehicle_type" in y:
            count = int(df[y][x])  # Number of vehicles of this type for this row
            route.append([df['route'][x]] * count)  # Repeat the value once per vehicle
            date.append([df['date'][x]] * count)
            vType.append([y.split("_")[2]] * count)  # "vehicle_type_3" -> "3"

# "Explode" your list of lists into a list of values
route = [x for xx in route for x in xx]
date = [x for xx in date for x in xx]
vType = [x for xx in vType for x in xx]

newDF = pd.DataFrame({"route": route, "date": date, "vehicle_type": vType})
newDF
Output:
        route        date vehicle_type
0   Route ABC  2022-07-11            1
1   Route ABC  2022-07-11            1
2   Route ABC  2022-07-11            1
3   Route ABC  2022-07-11            1
4   Route ABC  2022-07-11            3
5   Route DEF  2022-09-10            1
6   Route DEF  2022-09-10            2
7   Route DEF  2022-09-10            2
8   Route DEF  2022-09-10            3
9   Route DEF  2022-09-10            3
10  Route DEF  2022-09-10            3
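If the explicit loops become slow on a larger dataframe, the same result can probably be reached without them. The following is only a sketch under the assumption that the columns are named exactly as in the sample above (`route`, `date`, `vehicle_type_N`); the names `long_df` and `expanded` are made up for illustration. It reshapes the counts into long form with `melt` and then repeats each row by its count with `Index.repeat`.

```python
import pandas as pd

df = pd.DataFrame({
    "route": ["Route ABC", "Route DEF"],
    "date": ["2022-07-11", "2022-09-10"],
    "vehicle_type_1": [4, 1],
    "vehicle_type_2": [0, 2],
    "vehicle_type_3": [1, 3],
})

# Wide -> long: one row per (route, date, vehicle type) holding its count
long_df = df.melt(id_vars=["route", "date"],
                  var_name="vehicle_type", value_name="count")

# Keep only the type number, e.g. "vehicle_type_3" -> 3 (assumes numeric suffixes)
long_df["vehicle_type"] = (
    long_df["vehicle_type"]
    .str.replace("vehicle_type_", "", regex=False)
    .astype(int)
)

# Repeat each row "count" times; rows with a count of 0 drop out
expanded = long_df.loc[long_df.index.repeat(long_df["count"])].drop(columns="count")

# melt groups rows by vehicle type, so sort to group them by route again
expanded = expanded.sort_values(["route", "date", "vehicle_type"]).reset_index(drop=True)

print(expanded)
```

Note that here `vehicle_type` ends up as an integer column rather than strings, and the final sort orders routes alphabetically, which may differ from their order in the original dataframe.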
CodePudding user response:
One approach is to build a list of rows for your output from your input. You could do something like this, adding a variable number of rows based on the count for each vehicle type.
import pandas as pd

rows = []
# input_df is the original dataframe from the question
data_dictionaries = input_df.to_dict(orient='records')
for entry in data_dictionaries:
    route = entry['route']
    date = entry['date']
    num_type1 = entry['vehicle_type_1']
    num_type2 = entry['vehicle_type_2']
    num_type3 = entry['vehicle_type_3']
    # Extend the list (rather than reassigning it) with one row per vehicle
    rows += [{'route': route, 'date': date, 'vehicle_type': 1}] * num_type1
    rows += [{'route': route, 'date': date, 'vehicle_type': 2}] * num_type2
    rows += [{'route': route, 'date': date, 'vehicle_type': 3}] * num_type3
And then convert back to a dataframe at the end!
output = pd.DataFrame(rows)
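If the number of vehicle types can change, the list-of-rows idea could be wrapped in a small function that discovers the count columns by prefix instead of hard-coding three of them. This is a rough sketch, not a definitive implementation: `expand_vehicle_rows`, its `prefix` parameter, and the `example` dataframe are hypothetical names, and it assumes the count columns all look like `vehicle_type_<number>`.

```python
import pandas as pd

def expand_vehicle_rows(frame, prefix="vehicle_type_"):
    """Return one row per suggested vehicle (hypothetical helper)."""
    # Find every count column, e.g. vehicle_type_1, vehicle_type_2, ...
    type_columns = [c for c in frame.columns if c.startswith(prefix)]
    rows = []
    for entry in frame.to_dict(orient="records"):
        for column in type_columns:
            vehicle_type = int(column[len(prefix):])  # "vehicle_type_3" -> 3
            # Add one output row per vehicle of this type
            for _ in range(int(entry[column])):
                rows.append({
                    "route": entry["route"],
                    "date": entry["date"],
                    "vehicle_type": vehicle_type,
                })
    # Passing columns keeps the layout stable even when there are no rows
    return pd.DataFrame(rows, columns=["route", "date", "vehicle_type"])

# Example input mirroring the question: 4 type_1 and 1 type_3 vehicles
example = pd.DataFrame({
    "route": ["Route ABC"],
    "date": ["2022-07-11"],
    "vehicle_type_1": [4],
    "vehicle_type_2": [0],
    "vehicle_type_3": [1],
})
result = expand_vehicle_rows(example)
print(result)
```

Appending a fresh dictionary per row also avoids having several list entries point at the same dictionary object, which can matter if the rows are modified before the dataframe is built.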