Remove non-int from rows in CSV

Time:06-22

I'm trying to teach myself Python for data analysis and to practice I'm working on analyzing a CSV file containing 27k survey responses.

The responses take the form of "# - Rating" (For example: "10 - Extremely Interested")

Would anyone be able to tell me how I would go about removing everything but the numerical value so that I can plot this data with matplotlib?

Thank you :)

Edit: My apologies, here is the code for the df I'm working with:

import pandas as pd
likely_recc = pd.read_csv('test_data.csv', usecols=['How likely are you to recommend this product to a friend?'])

CodePudding user response:

This should do the trick

df['col'].str.split(' ').str[0].astype(int)

Take a look at the pandas docs on string methods for more information.
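If some rows might not start with a number (blank answers, "N/A", and so on), astype(int) would raise on them. Here is a rough sketch of a more forgiving variant using str.extract and pd.to_numeric. The sample rows are made up for illustration and stand in for the real CSV column:

```python
import pandas as pd

# Hypothetical sample data standing in for the survey column.
col = 'How likely are you to recommend this product to a friend?'
df = pd.DataFrame({col: ["10 - Extremely Interested", "9 - Very Interested", None, "N/A"]})

# Capture the leading run of digits; rows without one become NaN.
digits = df[col].str.extract(r'^\s*(\d+)', expand=False)

# Convert to numbers; errors='coerce' leaves unparseable rows as NaN instead of raising.
ratings = pd.to_numeric(digits, errors='coerce')

# Drop the missing rows and cast to int.
clean = ratings.dropna().astype(int)

# Number of responses per rating, ordered by rating.
counts = clean.value_counts().sort_index()
```

From there, something like counts.plot(kind='bar') should give a bar chart of responses per rating through matplotlib.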

CodePudding user response:

input_array = ["10 - Extremely Interested", "9 - Very Interested"]

only_my_numbers = []
for element in input_array:
    # element.split("-")[0] keeps the text before the first "-",
    # which still has a trailing space; strip() removes it.
    # int() converts the result so it can be plotted as a number.
    number = int(element.split("-")[0].strip())
    print(number)
    only_my_numbers.append(number)
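A variant of the same loop, as a sketch only: a regular expression picks out the leading digits and int() converts them, so an entry that does not start with a number gives None instead of raising an error. The helper name leading_int and the "No answer" entry are made up for illustration:

```python
import re

def leading_int(text):
    """Return the integer at the start of text, or None if there is none."""
    match = re.match(r'\s*(\d+)', text)
    return int(match.group(1)) if match else None

input_array = ["10 - Extremely Interested", "9 - Very Interested", "No answer"]

# Build the list in one pass; non-numeric entries become None.
only_my_numbers = [leading_int(element) for element in input_array]
print(only_my_numbers)
```

Filter out the None values before plotting if the data contains entries like that.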