I have a Pandas DataFrame df and a NumPy array ar of the same size. I can extract rows from df like this:
subdf = df[df['column'] == value]
But how can I extract the corresponding rows from ar, i.e. rows with the same indices?
In my case, df is also a subset of a bigger DataFrame, meaning that df.index is not a set of consecutive integers.
CodePudding user response:
You can index the array with the same boolean mask:
import numpy as np
import pandas as pd
df = pd.DataFrame({'value': [1,2,3,2,1]})
ar = np.array([10,20,30,20,10])
ar[df['value'] == 2]
output:
array([20, 20])
or, if the array has more dimensions, apply the mask along the axis that matches the rows of df (here the second axis):
ar = np.arange(20).reshape(4,5)
ar[:, df['value'] == 2]
input:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
output:
array([[ 1,  3],
       [ 6,  8],
       [11, 13],
       [16, 18]])
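A boolean mask selects by position, so it should also cover the case from the question where df.index is not consecutive. The following is only a minimal sketch: the index labels are made up for illustration, and it assumes ar has one entry per row of df, in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical data: df carries non-consecutive labels,
# as if it had been sliced out of a bigger DataFrame
df = pd.DataFrame({'value': [1, 2, 3, 2, 1]}, index=[10, 42, 7, 99, 3])
ar = np.array([10, 20, 30, 20, 10])

# Convert the boolean Series to a plain NumPy array so that
# only row position matters and the index labels are ignored
mask = (df['value'] == 2).to_numpy()

subdf = df[mask]  # rows of df where the condition holds
subar = ar[mask]  # entries of ar at the same positions
print(subar)
```

Because the mask is built once and applied to both objects, subdf and subar stay aligned row for row.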
CodePudding user response:
The question is not entirely clear, but let's make an attempt:
import numpy as np
import pandas as pd
df = pd.DataFrame({
'Date': ['2021-09-14','2021-09-14','2021-09-14','2021-09-13','2021-09-12','2021-09-12','2021-09-11'],
'Date_Yesterday': ['2021-09-13','2021-09-13','2021-09-13','2021-09-12','2021-09-11','2021-09-11','2021-09-10'],
'Clicks': [100,100,100,50,10,10,1]
})
df
array = df['Clicks'].values  # the column as a NumPy array
s = df.loc[3:4, 'Clicks'].index  # select rows by index label range
or
s = df['Clicks'].isin([50, 10])  # select rows by boolean mask
array[np.r_[s]]  # slice the array based on df
CodePudding user response:
There are a couple of ways to do this. Given the array and dataframe:
import numpy as np
import pandas as pd
arr = np.array(([21, 22, 23], [11, 22, 33], [21, 77, 89]))
df = pd.DataFrame(data=arr, columns=['c1', 'c2', 'c3'])
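If you extract rows from the dataframe, you can use the index from that sub-dataframe to get the corresponding rows from the numpy array. This works as long as df has the default integer index (0, 1, 2, ...), so that index labels coincide with array positions.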
df2 = df[df['c1'] == 21]
arr1 = arr[df2.index]
print(arr1)
Output:
[[21 22 23]
 [21 77 89]]
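If df was itself taken from a bigger DataFrame, as in the question, its index labels no longer match the row positions in the array, and arr[df2.index] would pick the wrong rows or raise an IndexError. One possible workaround is to translate labels into positions with Index.get_indexer. This is a sketch only: the labels below are hypothetical, and it assumes that df.index is unique and that arr is ordered like the rows of df.

```python
import numpy as np
import pandas as pd

arr = np.array([[21, 22, 23], [11, 22, 33], [21, 77, 89]])
# Hypothetical non-consecutive labels, as left over from a bigger DataFrame
df = pd.DataFrame(arr, columns=['c1', 'c2', 'c3'], index=[5, 17, 30])

df2 = df[df['c1'] == 21]

# Map the labels of df2 to their row positions within df
positions = df.index.get_indexer(df2.index)
arr1 = arr[positions]
print(arr1)
```

get_indexer returns -1 for labels it cannot find, so it is worth checking for that value if df2 might contain rows that are not in df.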
You can also index the array directly with the same boolean condition that you used to get the sub-dataframe.
arr2 = arr[df['c1'] == 21]
print(arr2)
Output:
[[21 22 23]
 [21 77 89]]