I have a CSV file. How do I print the row that follows a row containing a particular string? I need to print all rows that contain "ixation" and then the row that follows each of them.
Here is my current code
import pandas as pd
data = pd.read_csv('locationof.csv')
df = pd.DataFrame(data, columns=['Trial', 'Code', 'Time', 'Duration'])
list1 = ['100_1to3_start','fixation','Fixation','66_1to3_start']
contain_values = df[df['Code'].str.contains('|'.join(list1), na=False)]
Here is my current output...
2 1.0 fixation_dummy 50637.0 25086.0
4 2.0 fixation_dummy 75889.0 25086.0
7 3.0 fixation_dummy 101141.0 25086.0
9 4.0 fixation_dummy 126393.0 25086.0
13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
23 9.0 100_1to3_start_2456_4_0_2054 216104.0 24587.0
33 12.0 100_1to3_start_1507_7_0_2446 283885.0 15118.0
43 15.0 Fixation 332229.0 130081.0
55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
76 23.0 66_1to3_start_1539_8_0_2518 615150.0 15285.0
82 25.0 Fixation 654357.0 130081.0
123 35.0 Fixation 996089.0 130081.0
164 45.0 Fixation 1343635.0 130081.0
174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
197 53.0 66_1to3_start_1541_8_0_2545 1621074.0 15284.0
204 55.0 Fixation 1662939.0 130080.0
213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
223 59.0 100_1to3_start_1892_4_0_2544 1859638.0 18939.0
233 62.0 100_1to3_start_2315_7_0_2537 1918282.0 23259.0
But I want...
2 1.0 fixation_dummy 50637.0 25086.0
4 2.0 fixation_dummy 75889.0 25086.0
7 3.0 fixation_dummy 101141.0 25086.0
9 4.0 fixation_dummy 126393.0 25086.0
13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
43 15.0 Fixation 332229.0 130081.0
55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
82 25.0 Fixation 654357.0 130081.0
123 35.0 Fixation 996089.0 130081.0
164 45.0 Fixation 1343635.0 130081.0
174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
204 55.0 Fixation 1662939.0 130080.0
213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
How do I print only the lines (66_1to3..., 100_1to3...) that follow a row with "ixation" in it? This code will be run over a series of CSV files in which the exact lines I need vary between files.
CodePudding user response:
To answer this description: "I need to print all rows that contain "ixation" in them and then the row that follows this row.", the solution would be:
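# identify rows with "ixation" (na=False treats missing values as non-matches)
mask = df['Code'].str.contains('ixation', na=False)
# select them and the row below each (fill_value=False keeps the shifted mask boolean)
out = df[mask | mask.shift(fill_value=False)]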
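To see what the shifted mask does, here is a minimal sketch on made-up data (the column values below are invented for illustration and are not taken from the asker's file):

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({
    "Code": ["start_a", "fixation_dummy", "start_b", "other", "Fixation", "start_c", None],
    "Time": [10, 20, 30, 40, 50, 60, 70],
})

# True where Code contains "ixation"; na=False treats missing codes as non-matches
mask = df["Code"].str.contains("ixation", na=False)

# shift() moves every True down one row, so it flags the row after a match;
# fill_value=False keeps the shifted mask boolean at the first row
out = df[mask | mask.shift(fill_value=False)]
print(out)
```

Each matching row is kept together with whatever row comes directly after it in the frame, regardless of that row's content.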
CodePudding user response:
Try boolean indexing with shift, since we only care about the rows that follow "ixation":
list1 = ['100_1to3_start', '66_1to3_start']
df[df[2].str.contains('|'.join(list1), na=False) & df[2].shift().str.contains('ixation', na=False)]
0 1 2 3 4
4 13 6.0 100_1to3_start_2034_1_0_1060 151811.0 20268.0
8 55 17.0 66_1to3_start_2369_2_0_2352 484904.0 23590.0
13 174 46.0 66_1to3_start_1884_1_0_2537 1473882.0 18773.0
16 213 56.0 100_1to3_start_2115_1_0_2528 1793186.0 21098.0
Note that df[2], based on your example, would be df['Code'].
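In the question's desired output, "follows" appears to mean the next row of the already filtered result (index 13 comes right after index 9 there), so the shift probably has to be applied after the first filter rather than on the raw frame. A rough sketch under that assumption, again on invented data, with `kept`, `only_following` and `with_fixation` being names made up here:

```python
import pandas as pd

# Toy data, invented for illustration; "Code" mimics the asker's column
df = pd.DataFrame({
    "Code": ["fixation_dummy", "noise", "100_1to3_start_1", "noise",
             "100_1to3_start_2", "Fixation", "66_1to3_start_1", "66_1to3_start_2"],
})

starts = ["100_1to3_start", "66_1to3_start"]
pattern = "|".join(starts + ["ixation"])

# Step 1: keep only the rows of interest, as in the question
kept = df[df["Code"].str.contains(pattern, na=False)]

# Step 2: within the filtered rows, flag fixation rows and
# the start rows that come directly after a fixation row
is_fix = kept["Code"].str.contains("ixation", na=False)
is_start = kept["Code"].str.contains("|".join(starts), na=False)
after_fix = is_start & is_fix.shift(fill_value=False)

only_following = kept[after_fix]          # just the start rows after a fixation row
with_fixation = kept[is_fix | after_fix]  # fixation rows plus those start rows
print(with_fixation)
```

`only_following` corresponds to the output shown in this answer, while `with_fixation` is closer to the "But I want..." listing in the question.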