I've read a few other posts about this but the other solutions haven't worked for me. I'm trying to look at 2 different CSV files and compare data from 1 column from each file. Here's what I have so far:
import pandas as pd
import numpy as np
dataBI = pd.read_csv("U:/eu_inventory/EO BI Orders.csv")
dataOrderTrimmed = dataBI.iloc[:,1:2].values
dataVA05 = pd.read_csv("U:/eu_inventory/VA05_Export.csv")
dataVAOrder = dataVA05.iloc[:,1:2].values
dataVAList = []
ordersInBoth = []
ordersInBI = []
ordersInVA = []
for order in np.nditer(dataOrderTrimmed):
    if order in dataVAOrder:
        ordersInBoth.append(order)
    else:
        ordersInBI.append(order)
So if the order number from dataOrderTrimmed is also in dataVAOrder, I want to add it to ordersInBoth; otherwise I want to add it to ordersInBI. I think it splits the information correctly, but if I try to print ordersInBoth, each item prints as array(5555555, dtype=int64).
I want to have a list of the order numbers not as an array and not including the dtype information. Let me know if you need more information or if the way I've typed it out is confusing. Thanks!
CodePudding user response:
The way you're using .iloc is giving you a DataFrame: the slice 1:2 in .iloc[:, 1:2] selects a one-column DataFrame, which becomes a 2D array when you access .values. If you just want the values in the column at index 1, then you should just say:
dataOrderTrimmed = dataBI.iloc[:, 1].values
Then you can iterate over dataOrderTrimmed directly (i.e. you don't need nditer), and you will get regular scalar values.
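As a rough sketch of how that could look end to end: the two DataFrames below are made-up stand-ins for the CSV files (the column names and order numbers are assumptions, not taken from the question). It also swaps in two things the answer above doesn't mention: tolist(), which returns plain Python ints rather than NumPy scalars, and a set for the membership test.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files; column names and values are invented.
dataBI = pd.DataFrame({"line": [1, 2, 3], "order": [5555555, 5555556, 5555557]})
dataVA05 = pd.DataFrame({"line": [1, 2], "order": [5555556, 5555558]})

# A slice (1:2) keeps the column dimension; a single index (1) drops it.
print(dataBI.iloc[:, 1:2].values.shape)  # (3, 1)
print(dataBI.iloc[:, 1].values.shape)    # (3,)

# tolist() converts the column to a list of plain Python ints.
bi_orders = dataBI.iloc[:, 1].tolist()
va_orders = set(dataVA05.iloc[:, 1].tolist())  # a set makes "in" checks fast

ordersInBoth = [order for order in bi_orders if order in va_orders]
ordersInBI = [order for order in bi_orders if order not in va_orders]

print(ordersInBoth)  # [5555556]
print(ordersInBI)    # [5555555, 5555557]
```

With the real files you would replace the two DataFrame lines with the pd.read_csv calls from the question; the rest should work unchanged as long as the order numbers are in the column at index 1 of both files.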