Using a loop to compare a value in a DataFrame column

I need to keep every row where T is 1. Once T drops to 0, I need to record the next 10 time instances and then drop the remaining zeros.

I thought of combining an if statement with a while loop to create a Filter column, and then deleting rows based on it:

counter = 0
if df["T"] = 1 : 
    df["Filter"] = 1
else:
    while df["T"] =0 :
        if counter <=10:
            df["Filter"] = 1
        else:
            df["F"] = 0
        counter = counter + 1

I'm getting the following error:

    if df["T"] = 1 :
                    ^
SyntaxError: invalid syntax

Can anyone help?

CodePudding user response:

When assigning a value to a variable, you use a single =. To check whether two values are equal, you need two equals signs: ==. However, fixing the syntax alone will not make this code work: df["T"] == 1 does not return a single True or False but a pd.Series of booleans, so using it as an if condition raises a ValueError (the truth value of a Series is ambiguous).
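
To illustrate why the if still fails once == is used, here is a minimal sketch on a small made-up DataFrame (the sample values are an assumption, not the asker's data). The comparison itself works and returns one boolean per row; the error only appears when that Series is used as a condition.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"T": [1, 1, 0, 0, 1]})

mask = df["T"] == 1          # element-wise comparison, one bool per row
print(mask.tolist())         # [True, True, False, False, True]

error_message = None
try:
    if mask:                 # a Series has no single truth value
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```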

What you probably want is:

df.loc[df["T"] == 1, 'Filter'] = 1

This will set Filter to 1 when T is equal to 1.
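
As a quick illustration on made-up data (the values below are an assumption), this sketch shows what that line does. One detail worth knowing: if the Filter column does not exist yet, rows that do not match the mask end up as NaN, so it may be safer to initialise the column first.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"T": [1, 0, 1, 0]})

df["Filter"] = 0                       # start with every row unselected
df.loc[df["T"] == 1, "Filter"] = 1     # flag only the rows where T is 1

print(df["Filter"].tolist())           # [1, 0, 1, 0]
```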

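# Your code with only the syntax corrected (== instead of =).
# It still raises ValueError, because df["T"] == 1 is a Series, not a single bool.
counter = 0
if df["T"] == 1:
    df["Filter"] = 1
else:
    while df["T"] == 0:
        if counter <= 10:
            df["Filter"] = 1
        else:
            df["F"] = 0
        counter = counter + 1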
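
If an explicit loop is still preferred, it has to visit one row at a time rather than compare the whole column at once. The following is only a sketch of one reading of the question (keep every 1, plus at most 10 zeros after each run of 1s). The data is made up, and the function name build_filter and the parameter keep_zeros are hypothetical. It will likely be slower than a vectorised solution on large frames.

```python
import pandas as pd

def build_filter(values, keep_zeros=10):
    """Return 1 for rows to keep and 0 for rows to drop."""
    flags = []
    zeros_seen = 0                     # zeros seen since the last 1
    for value in values:
        if value == 1:
            zeros_seen = 0             # a 1 restarts the count
            flags.append(1)
        else:
            zeros_seen += 1
            flags.append(1 if zeros_seen <= keep_zeros else 0)
    return flags

# Hypothetical data; keep_zeros=2 so the effect is easy to see
df = pd.DataFrame({"T": [1, 0, 0, 0, 1, 0, 0, 0, 0]})
df["Filter"] = build_filter(df["T"], keep_zeros=2)
print(df["Filter"].tolist())           # [1, 1, 1, 0, 1, 1, 1, 0, 0]

filtered = df[df["Filter"] == 1]       # drop the rows flagged 0
```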

CodePudding user response:

If I understood correctly, you want to set 'Filter' to 1 if 'T' is 1, or if 'T' is 0 and the row is one of the first 10 such results.

The correct way to test equality in an if statement is with ==, as in if a == b:. In your case, though, an error would still have been raised even with the right operator, because df['T'] == 1 compares every element and returns a pd.Series, which Python cannot reduce to a single True or False for the if. Another problem is the counter variable: if you start from 0 and use the condition if counter <= 10:, you will flag 11 occurrences, not 10.
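
The off-by-one point is easy to check in plain Python. This small sketch just counts how many iterations pass the counter <= 10 test when the counter starts at 0 (the upper bound of 20 is arbitrary).

```python
counter = 0
kept = 0
while counter < 20:          # 20 is an arbitrary upper bound for the demo
    if counter <= 10:        # true for 0, 1, ..., 10
        kept += 1
    counter = counter + 1

print(kept)                  # 11, one more than intended
```

Using counter < 10, or starting the counter at 1, would keep exactly 10.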

What I would do is:

df['Filter'] = 0
# Parenthesise both conditions: | binds more tightly than ==
df.loc[(df['T'] == 1) | (df.groupby('T').cumcount() < 10), 'Filter'] = 1
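
Note that groupby('T').cumcount() numbers the zeros across the whole frame, so only the first 10 zeros overall are kept. If the intent is instead to keep 10 zeros after every run of 1s, one possible variant is to start a new group each time T is 1. This is only a sketch of that reading of the question, on made-up data, and the name keep_zeros is hypothetical.

```python
import pandas as pd

keep_zeros = 2   # hypothetical small value for the demo; the question asks for 10

df = pd.DataFrame({"T": [1, 0, 0, 0, 1, 1, 0, 0, 0]})

is_one = df["T"] == 1
# Each 1 starts a new group, so the zeros that follow share its group id
run_id = is_one.cumsum()
# Running count of zeros within each group (0 on the rows where T is 1)
zero_rank = (~is_one).astype(int).groupby(run_id).cumsum()

df["Filter"] = (is_one | (zero_rank <= keep_zeros)).astype(int)
print(df["Filter"].tolist())   # [1, 1, 1, 0, 1, 1, 1, 1, 0]
```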