delete row from numpy array based on partial string in python

Time:12-06

I have a very large NumPy array that looks similar to my example. The partial string that I'm trying to detect is "F_H", and it usually appears in column 1 (the second column) of my array.

a = np.array([['#define', 'bad_stringF_H', 'some_value'],
              ['#define', 'good_string', 'some_value2'],
              ['#define', 'good_string_2', 'some_value3'],
              ['#define', 'bad_string2F_H', 'some_value4']])

I just want to delete the whole row if that partial string is detected in it, so the desired output would look like this.

[['#define' 'good_string' 'some_value2']
 ['#define' 'good_string_2' 'some_value3']]

CodePudding user response:

You can use NumPy's Boolean indexing to create a new array that only includes the rows whose second column (index 1) does not contain the string 'F_H':

import numpy as np

a = np.array([['#define', 'bad_stringF_H', 'some_value'],
              ['#define', 'good_string', 'some_value2'],
              ['#define', 'good_string_2', 'some_value3'],
              ['#define', 'bad_string2F_H', 'some_value4']])

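# Build a Boolean mask: True for rows whose second column does not contain 'F_H'
mask = np.array(['F_H' not in x[1] for x in a])
print(mask)
# Boolean indexing keeps only the rows where the mask is True
new_a = a[mask]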

print(new_a)
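
Since the question mentions that the array is very large and that the substring is only "usually" in one column, a vectorized variant may be worth trying. The following is a minimal sketch, assuming the array holds only strings; it uses np.char.find to search every column at once instead of looping in Python. The names has_marker, keep, and filtered are illustrative and not from the original post.

```python
import numpy as np

a = np.array([['#define', 'bad_stringF_H', 'some_value'],
              ['#define', 'good_string', 'some_value2'],
              ['#define', 'good_string_2', 'some_value3'],
              ['#define', 'bad_string2F_H', 'some_value4']])

# np.char.find returns the position of the substring in each element, or -1 if absent
has_marker = np.char.find(a, 'F_H') >= 0   # Boolean array with the same shape as a

# Keep a row only if none of its columns contains the substring
keep = ~has_marker.any(axis=1)

filtered = a[keep]
print(filtered)
```

To restrict the search to a single column, np.char.find(a[:, 1], 'F_H') can be used instead, in which case the any(axis=1) step is not needed.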