How to filter Pandas Unique Result by only Non-Numeric Values?

Time:10-16

Context

I have a pandas Series and call its .unique() method to get all unique values. I would now like to filter this result so that it contains only the unique values that are non-numeric. However, I get the following error:

Error: AttributeError: 'float' object has no attribute 'isnumeric'


Code

values = data['Column Name'].unique()
non_numeric_values = [value for value in values if not value.isnumeric()] # Error

Question

  • How can I select only those values that are non-numeric?
  • Please note that the column may have an object dtype and the values may be strings.

CodePudding user response:

You can use the built-in isinstance:

from numbers import Number
import pandas as pd

elements = [1.0, "foo", 2, "bar", 3.0, "baz", 4, "qux"]
df = pd.DataFrame(elements, columns=["column name"])

uniques = df["column name"].unique()

types = [type(x).__name__ for x in uniques]
print(types)  # ['float', 'str', 'int', 'str', 'float', 'str', 'int', 'str']

uniques_non_numeric = [x for x in uniques if not isinstance(x, Number)]
print(uniques_non_numeric)  # ['foo', 'bar', 'baz', 'qux']

Or replace the list comprehension with pandas apply:

df[df["column name"].apply(lambda x : not isinstance(x, Number))]["column name"].unique()
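One caveat worth knowing: the isinstance check looks at the Python type of each value, not at its content. The sketch below uses made-up sample values (not from the question) to show that a numeric string such as "42" is still a str and is therefore kept.

```python
from numbers import Number

import pandas as pd

# Hypothetical sample data: "42" is a string that merely looks like a number.
s = pd.Series([1.0, "foo", 2, "42", "bar"], dtype=object)

# Only real int/float objects are dropped; "42" is a str, so it stays.
non_numeric = [x for x in s.unique() if not isinstance(x, Number)]
print(non_numeric)  # ['foo', '42', 'bar']
```

If numeric-looking strings should be excluded as well, the values have to be parsed rather than type-checked.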

CodePudding user response:

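Let's try this if every value in the column is a string:

data.loc[~data['Column Name'].str.isnumeric(), 'Column Name'].unique()

If you have mixed types, coerce the column to str first and then filter:

data.loc[~data['Column Name'].astype(str).str.isnumeric(), 'Column Name'].unique()

Note that str.isnumeric() is True only for strings made up entirely of numeric characters, so values such as 1.5 or -3 (as "1.5" and "-3") still count as non-numeric here.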
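If decimals and negative numbers should also be treated as numeric, one alternative is pd.to_numeric with errors="coerce". This is a different technique from the str.isnumeric() check in this answer, and the sample data is hypothetical. A minimal sketch:

```python
import pandas as pd

# Hypothetical mixed-type column; the column name follows the question.
data = pd.DataFrame(
    {"Column Name": [1.5, "foo", -3, "7", "bar", "foo"]}, dtype=object
)

col = data["Column Name"]
# Anything that cannot be parsed as a number becomes NaN.
parsed = pd.to_numeric(col, errors="coerce")
# Keep the original values at the positions where parsing failed.
non_numeric_values = col[parsed.isna()].unique()
print(list(non_numeric_values))  # ['foo', 'bar']
```

Be aware that values which are already missing (NaN or None) also end up as NaN after parsing, so with this sketch they would be reported as non-numeric too.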

CodePudding user response:

The solution below tries to convert each value to float. If the conversion succeeds, the value is numeric and is skipped; every other value is added to the output list.

output = []
for x in actors['actor'].unique():
    try:
        float(str(x).strip())
    except ValueError:  # not parseable as a number, so keep it
        output.append(x)
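For reference, here is a self-contained version of the same idea that can be run as-is. The helper name is_number_like and the sample data are made up for illustration.

```python
import pandas as pd


def is_number_like(value):
    """Return True if float() can parse the value's string form."""
    try:
        float(str(value).strip())
        return True
    except ValueError:
        return False


# Hypothetical sample data standing in for actors['actor'].
actors = pd.DataFrame(
    {"actor": ["Tom", 3, " 4.5 ", "1e3", "Ann", "Tom"]}, dtype=object
)

output = [x for x in actors["actor"].unique() if not is_number_like(x)]
print(output)  # ['Tom', 'Ann']
```

Keep in mind that float() also accepts strings such as "1e3", "inf" and "nan", so those are treated as numeric by this approach.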

CodePudding user response:

import pandas as pd
import numpy as np

# np.array converts every element to str here, so the numbers become '1.2', '3.5', '1', '5'
data = np.array(['a', 'e', 'i', 'o', 'u',1.2,3.5,1,5])
df = pd.DataFrame(data, columns=['col1'])

values = df['col1'].unique()

def isfloat(num):
    try:
        float(num)
        return True
    except ValueError:
        return False

list1 = []
for item in values:
    if not (item.isdigit() or isfloat(item)):
        list1.append(item)

print(list1)  # ['a', 'e', 'i', 'o', 'u']