I am trying to count how many times a value, "i", falls in the range i < 0.5. The values come from a CSV file. To be clear, I only want the count to be appended to a dictionary. I will post my code and the result I get. The result I want looks like this (86 is a placeholder, not the actual count): {'0.0-0.5':[86].....
input:
import pandas as pd
df = pd.read_csv("gen_pop.csv", index_col=0)
new_d = {'0.0-0.5':[],'0.5-1':[],'1-10':[],'10-100':[],'100-450':[]}
for i in df.values:
    count = 0
    for col in df:
        if str(i) == "nan":
            continue
        if (i < 0.5).any():
            count += 1
    new_d['0.0-0.5'].append(count)
print(new_d)
output: {'0.0-0.5': [0, 32, 0, 0, 32, 0, 0, 32,(... and so on a thousand times)], '0.5-1': [], '1-10': [], '10-100': [], '100-450': []}
Thanks in advance!
So I tried counting the number of times i < 0.5 is true, but it doesn't work. To be clear, the CSV file contains data in the form of a table with 12,000 names and 32 tissues in which the names are expressed. Example of the data: my data
CodePudding user response:
Okay, let's say you have a toy data frame.
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
>>> df
   A  B
0  1  4
1  2  5
2  3  6
Your comment seemed to imply something like this, where you could cut your data into bins and count.
>>> # specific bins are arbitrary etc
>>> pd.cut(df.values.flatten(), bins=[0, 1, 4, 6]).value_counts()
(0, 1]    1
(1, 4]    3
(4, 6]    2
dtype: int64
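Applied to the bins named in the question, the same idea might look like the sketch below. The small data frame is a made-up stand-in for gen_pop.csv, and the choice of half-open bins (right=False, so that 0.5 itself goes into '0.5-1') is an assumption based on the i < 0.5 test in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for gen_pop.csv: rows are names, columns are tissues.
df = pd.DataFrame(
    {"tissue_a": [0.1, 0.7, 5.0], "tissue_b": [0.3, np.nan, 120.0]},
    index=["gene_1", "gene_2", "gene_3"],
)

# Bin edges and labels taken from the keys of new_d in the question.
edges = [0, 0.5, 1, 10, 100, 450]
labels = ["0.0-0.5", "0.5-1", "1-10", "10-100", "100-450"]

# right=False makes each bin [low, high), so the first bin matches i < 0.5.
# NaN values (and anything outside 0 <= x < 450) fall into no bin.
binned = pd.cut(df.to_numpy().ravel(), bins=edges, labels=labels, right=False)
counts = pd.Series(binned).value_counts()

# One total per bin, wrapped in a list to match the desired dictionary shape.
new_d = {label: [int(counts.get(label, 0))] for label in labels}
print(new_d)
```

This gives one total per range for the whole table, which is the shape the question asks for ({'0.0-0.5': [86], ...}).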
But your code seems to imply that you want to execute a cut action for every row separately. You could do that by applying the cut function with appropriate parameters: df.apply(lambda r: pd.cut(r, bins=[ arbitrary... bins... here... ]).value_counts(), axis=1). Note that apply works column by column by default; axis=1 makes it work row by row.
I'm not entirely sure what output you want anyway. Depending on the axis you use, getting it into the form implied by your code may just require transposing the output.
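A row-by-row version could be sketched as follows. The toy data and the helper name count_bins are hypothetical, and the bins are again assumed to be half-open to match the i < 0.5 test in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for gen_pop.csv: rows are names, columns are tissues.
df = pd.DataFrame(
    {"tissue_a": [0.1, 0.7, 5.0], "tissue_b": [0.3, np.nan, 120.0]},
    index=["gene_1", "gene_2", "gene_3"],
)

edges = [0, 0.5, 1, 10, 100, 450]
labels = ["0.0-0.5", "0.5-1", "1-10", "10-100", "100-450"]

def count_bins(row):
    """Count how many values of one row fall into each bin (NaN is skipped)."""
    counts = pd.cut(row, bins=edges, labels=labels, right=False).value_counts()
    # Return a plain Series in a fixed label order so every row lines up.
    return pd.Series([int(counts.get(label, 0)) for label in labels], index=labels)

# axis=1 applies the function to each row; the result has one column per bin.
per_row = df.apply(count_bins, axis=1)

# One list per bin, with one entry per row of the original table.
per_row_d = per_row.to_dict(orient="list")
print(per_row_d)
```

With axis=1 the result already has one column per bin, so to_dict(orient="list") yields the dictionary-of-lists shape directly. With the default axis the bins end up in the index instead, and a transpose would be needed first.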