I have a CSV file that contains gene names and sequences. I want to take input from the user and print all the rows whose sequence contains that input as a substring. For example, my data is:
Gene 1 ATGCGGTCTA
Gene 2 ACGCCCATGA
Gene 3 TCGAC
When the user enters GC, the output must be
Gene 1 ATGCGGTCTA
Gene 2 ACGCCCATGA
since both have GC in their sequences.
So far I have tried:
import pandas as pd

# Load the gene table; the 'seq' column holds the sequences.
df = pd.read_csv('DATA.csv')

z = input('What would you like to search? ').lower()
if z == 'sequence':
    s = input('Enter sequence: ').upper()
    # Join every sequence into one string and search it for the query.
    # This yields only a position in the joined text, not the matching rows.
    a = list(df['seq'])
    b = ' '.join(str(x) for x in a)
    c = b.find(s)
CodePudding user response:
Using pandas, and assuming the column of your dataframe with the sequences is called sequences, you can do:
filtered_df = df[df['sequences'].str.contains(s)]
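As a fuller illustration of the same filtering idea, here is a minimal sketch that runs on the sample data from the question. The in-memory CSV text, the `gene` and `seq` column headers, and the helper name `filter_by_subsequence` are assumptions made for this example, not part of the original file. It also passes `regex=False` so the query is matched as a literal substring, and `na=False` so rows with a missing sequence are simply skipped.

```python
import io

import pandas as pd

# Hypothetical stand-in for DATA.csv; the 'gene' and 'seq' headers are assumed.
CSV_TEXT = """gene,seq
Gene 1,ATGCGGTCTA
Gene 2,ACGCCCATGA
Gene 3,TCGAC
"""


def filter_by_subsequence(df, query, column="seq"):
    """Return the rows whose `column` contains `query` as a literal substring."""
    # Upper-case the query so that 'gc' and 'GC' behave the same way.
    mask = df[column].str.contains(query.upper(), regex=False, na=False)
    return df[mask]


df = pd.read_csv(io.StringIO(CSV_TEXT))
matches = filter_by_subsequence(df, "gc")

# Print the matching rows without the numeric index.
print(matches.to_string(index=False))
```

With the three sample rows, only Gene 1 and Gene 2 are kept, since TCGAC does not contain GC. To run this against the real file, replace `io.StringIO(CSV_TEXT)` with `'DATA.csv'` and set `column` to whatever the sequence column is actually called.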