Consider the following Data Frame:
Index | foo | String A | bar | String B |
---|---|---|---|---|
First | 15 | nan | 0 | 3 |
Second | 0 | 2 | 25 | nan |
I have the following logic that allows me to iterate through a single column of a data frame and access a specific cell and change its value. If the value of the cell in the column with the name String A
is NOT NaN
then the value of the cell in the column that comes before it should be changed to NaN
for idx, val in enumerate(df['String A']):
if not math.isnan(val) :
df.iloc[: , df.columns.get_loc('String A')-1].iloc[idx] = np.nan
Is there a way I could generalize the loop so I could iterate through all columns that contain the substring String
such that the loop applies for both columns String A
and String B
(or any column that contains this substring).
The table should then look like this:
Index | foo | String A | bar | String B |
---|---|---|---|---|
First | 15 | nan | nan | 3 |
Second | nan | 2 | 25 | nan |
Edit: Added more details to the question
CodePudding user response:
Try:
for idx in [i for i, c in enumerate(df.columns) if c.startswith("String")]:
df.iloc[:, idx-1] = df.iloc[:, idx-1].where(df.iloc[:,idx].isnull())
>>> df
foo String A bar String B
Index
First 15.0 NaN NaN 3.0
Second NaN 2.0 25.0 NaN
CodePudding user response:
In order to maintain use of your original for loop you could do the following:
for idx, val in enumerate(df):
if 'String' in val:
print(val)
CodePudding user response:
You could subset the columns and use apply
to iterate:
relevant = df.columns.str.contains('String')
# select the columns that come right bevor a string column
relevant = np.where(relevant)[0] - 1
# apply your function to the relevant columns
df.iloc[:, relevant] = df.iloc[:, relevant].apply(...)