Objective
Magnetic-bead probes are used to capture mRNA. Each bead carries a specific sequence, called the UMI. Because one bead captures only one transcript, reads from the same transcript can be clustered by their shared UMI for later assembly, quantification, and alternative splicing analysis.
Task requirements:
Based on the structure of the UMI construct, find the UMI in each read and record the relationship between the UMI sequence and the read's barcode.
Description:
The UMI construct has the following structure:
GGAAACAGCTATGACCATGNNNNNNNNNNNNNNNNTTTTTTTT
Fixed sequence: GGAAACAGCTATGACCATG
UMI sequence: the run of N's is a random 16 bp UMI
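To make the layout concrete, here is a minimal Python sketch of the construct as a regular expression. The names FIXED, UMI_LEN, MIN_T, UMI_PATTERN and extract_umi are made up for illustration. Requiring at least 3 T's after the UMI is an assumption about how strict the oligo(dT) check should be, and this version only looks at the forward strand.

```python
import re

FIXED = "GGAAACAGCTATGACCATG"  # fixed sequence from the description
UMI_LEN = 16                   # length of the random N run
MIN_T = 3                      # ASSUMPTION: minimum number of T's required after the UMI

# Fixed sequence, then a captured 16-base UMI, then at least MIN_T T's.
UMI_PATTERN = re.compile(f"{FIXED}([ACGTN]{{{UMI_LEN}}})T{{{MIN_T},}}")


def extract_umi(seq):
    """Return the UMI found in seq (forward strand only), or None."""
    match = UMI_PATTERN.search(seq.upper())
    return match.group(1) if match else None


if __name__ == "__main__":
    toy_read = "AA" + FIXED + "ACGTACGTACGTACGT" + "TTTTTTTT" + "CC"
    print(extract_umi(toy_read))
```

An exact match is used here for simplicity. If sequencing errors in the fixed sequence are a concern, a mismatch-tolerant search would be needed instead.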
The search strategy must satisfy the following conditions at the same time:
1) Traverse the FASTQ file and find the fixed sequence
2) After an interval of 16 bp (the UMI), find an oligo(dT) run of 3 T's
(Note: the reverse-complement sequence must also be considered)
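The two conditions and the reverse-complement note could be sketched as below. This is only one way to do it, not a tested pipeline. The names find_umi and reverse_complement are made up for illustration, and "at least 3 T's" is an assumed reading of the oligo(dT) condition. The read is searched as given first, then as its reverse complement, so the UMI is always reported in the orientation of the construct.

```python
import re

# Fixed sequence, captured 16-base UMI, then at least 3 T's (assumed threshold).
UMI_PATTERN = re.compile(r"GGAAACAGCTATGACCATG([ACGTN]{16})T{3,}")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_umi(seq):
    """Search both strands; return (umi, strand) or None if nothing matches."""
    seq = seq.upper()
    for strand, candidate in (("+", seq), ("-", reverse_complement(seq))):
        match = UMI_PATTERN.search(candidate)
        if match:
            return match.group(1), strand
    return None


if __name__ == "__main__":
    forward = "AAGGAAACAGCTATGACCATG" + "ACGTACGTACGTAAAA" + "TTTTTTTTCC"
    print(find_umi(forward))
    print(find_umi(reverse_complement(forward)))
```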
Once both conditions are met, the relationship can be established:
1. Extract the UMI sequence
2. Find the corresponding barcode number
3. Build a table mapping each UMI to its barcodes
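The three steps above could be put together roughly as follows. This is a sketch under stated assumptions, not a verified solution. In particular, the post does not say where the barcode is stored, so barcode_from_header assumes it sits in the read name after a "#" (for example "@read1#11_22_33/1"); adjust that function to the real header format. All function names are made up for illustration, and the oligo(dT) check assumes at least 3 T's.

```python
import gzip
import re
from collections import Counter, defaultdict

# Fixed sequence, captured 16-base UMI, then at least 3 T's (assumed threshold).
UMI_PATTERN = re.compile(r"GGAAACAGCTATGACCATG([ACGTN]{16})T{3,}")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_umi(seq):
    """Step 1: return the UMI from either strand of the read, or None."""
    seq = seq.upper()
    for candidate in (seq, reverse_complement(seq)):
        match = UMI_PATTERN.search(candidate)
        if match:
            return match.group(1)
    return None


def barcode_from_header(header):
    """Step 2. ASSUMPTION: the header looks like '@name#BARCODE/1'."""
    fields = header.split()
    if not fields or "#" not in fields[0]:
        return None
    return fields[0].rsplit("#", 1)[1].split("/")[0]


def build_umi_table(fastq_paths):
    """Step 3: map each UMI to a Counter of barcodes (number of reads seen)."""
    table = defaultdict(Counter)
    for path in fastq_paths:
        with gzip.open(path, "rt") as handle:
            while True:
                header = handle.readline()
                if not header:        # end of file
                    break
                seq = handle.readline().strip()
                handle.readline()     # '+' separator line
                handle.readline()     # quality line
                umi = find_umi(seq)
                barcode = barcode_from_header(header)
                if umi and barcode:
                    table[umi][barcode] += 1
    return table


def write_table(table, out_path):
    """Write the UMI/barcode table as tab-separated text."""
    with open(out_path, "w") as out:
        out.write("umi\tbarcode\tread_count\n")
        for umi in sorted(table):
            for barcode, count in sorted(table[umi].items()):
                out.write(f"{umi}\t{barcode}\t{count}\n")
```

To use it, pass the two FASTQ files listed under Data to build_umi_table and then call write_table with an output path of your choice. The parser assumes plain 4-line FASTQ records; for anything unusual, an established FASTQ parser would be safer.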
Data:
/hwfssz5/ST_BIGDATA/USER/xujunhao/project/course/result/split_read.1_rename.fq.gz
/hwfssz5/ST_BIGDATA/USER/xujunhao/project/course/result/split_read.2_rename.fq.gz
Could an expert advise on how this should be done? Thanks. Code would be ideal.