I have a Endote Export File, looking like this:
%0 Journal Article
%A Abu-Rous, M.
%A Ingolic, E.
%A Schuster, K. C.
%D 2006
%Z Cellulose
Article
CODEN: CELLE
% Christian Doppler-Laboratory of Fibre and Textile Chemistry in Cellulosics, Institute of Texile Chemistry and Textile Physics, Leopold-Franzens-University Innsbruck, Hoechsterstrasse 73, A-6850 Dornbirn, Austria
Research Institute for Electron Microscopy (FELMI), Technical University of Graz, A-8010 Graz, Austria
Textile Innovation, Lenzing AG, A-4860 Lenzing, Austria
Schuster, K.C.; Textile Innovation, , A-4860 Lenzing, Austria; email: [email protected]
%~ Scopus
%G English
%0 Journal Article
%P 5003-5011
%! Ursolic acid from Trailliaedoxa gracilis induces apoptosis in medullary thyroid carcinoma cells
%@ 17912997
%R 10.3892/mmr.2015.4053
%1 in_author_address;
%F 8
%K Apoptosis
Bioactive agents
%Z Mol. Med. Rep.
Article
Chemicals/CAS: caspase 8; ursolic acid, 77-52-1; I kappa B kinase, 209902-66-9; Antineoplastic Agents, Phytogenic; Caspase 8; I-kappa B Kinase; IKBKG protein, human; Plant Extracts; Triterpenes; ursolic acid
Tradenames: cpt, Sigma Aldrich; rotichrom, karlsruhe, Germany
% Department of Pathophysiology and Immunology, Center of Molecular Medicine, Medical University of Graz, 31a Heinrichstrasse, Graz A, 8010, Austria
Department of Biochemistry and Molecular Biology, Shanghai Medical School, Fudan University, Shanghai, 200433, China
Department of Pharmacognosy, Institute of Pharmacy, Center of Molecular Biosciences, Leopold Franzens University of Innsbruck, Innsbruck A, 6010, Austria
Core Unit of Biomedical Research, Division of Laboratory Animal Science and Genetics, Medical University of Vienna, Himberg A, 2325, Austria
Research Institute for Electron Microscopy and Fine Structure Research, University of Technology Graz, Graz A, 8010, Austria
Pfragner, R.; Department of Pathophysiology and Immunology, 31a Heinrichstrasse, Austria; email: [email protected]
%~ Scopus C2 - 26151624
%G English
the field beginning with % contains the Author Address which could consits of more lines according the author addresses (1 to n relationship) of the authors. Each authore address are separated by a newline (linebreak)
Now I would like to ask how I can sort this field by the lines where GRAZ are in the line. this lines where Graz are in it, they should be the first line in the listing.
Is there a way to do this by bash text processing tools, or need I write a Delphi`s programm to access and import the endnote Export dump.
The output of this example above should be
%0 Journal Article
%A Abu-Rous, M.
%A Ingolic, E.
%A Schuster, K. C.
%D 2006
%Z Cellulose
Article
CODEN: CELLE
% Research Institute for Electron Microscopy (FELMI), Technical University of Graz, A-8010 Graz, Austria
Christian Doppler-Laboratory of Fibre and Textile Chemistry in Cellulosics, Institute of Texile Chemistry and Textile Physics, Leopold-Franzens-University Innsbruck, Hoechsterstrasse 73, A-6850 Dornbirn, Austria
Textile Innovation, Lenzing AG, A-4860 Lenzing, Austria
Schuster, K.C.; Textile Innovation, , A-4860 Lenzing, Austria; email: [email protected]
%~ Scopus
%G English
%0 Journal Article
%P 5003-5011
%! Ursolic acid from Trailliaedoxa gracilis induces apoptosis in medullary thyroid carcinoma cells
%@ 17912997
%R 10.3892/mmr.2015.4053
%1 in_author_address;
%F 8
%K Apoptosis
Bioactive agents
%Z Mol. Med. Rep.
Article
Chemicals/CAS: caspase 8; ursolic acid, 77-52-1; I kappa B kinase, 209902-66-9; Antineoplastic Agents, Phytogenic; Caspase 8; I-kappa B Kinase; IKBKG protein, human; Plant Extracts; Triterpenes; ursolic acid
Tradenames: cpt, Sigma Aldrich; rotichrom, karlsruhe, Germany
% Department of Pathophysiology and Immunology, Center of Molecular Medicine, Medical University of Graz, 31a Heinrichstrasse, Graz A, 8010, Austria
Department of Biochemistry and Molecular Biology, Shanghai Medical School, Fudan University, Shanghai, 200433, China
Department of Pharmacognosy, Institute of Pharmacy, Center of Molecular Biosciences, Leopold Franzens University of Innsbruck, Innsbruck A, 6010, Austria
Core Unit of Biomedical Research, Division of Laboratory Animal Science and Genetics, Medical University of Vienna, Himberg A, 2325, Austria
Research Institute for Electron Microscopy and Fine Structure Research, University of Technology Graz, Graz A, 8010, Austria
Pfragner, R.; Department of Pathophysiology and Immunology, 31a Heinrichstrasse, Austria; email: [email protected]
%~ Scopus C2 - 26151624
%G English
I would be happy and thankful for any interesting advices.
CodePudding user response:
Would you please try the following:
#!/bin/bash
shopt -s nocasematch # make the match case-insensitive
# print the arrays and empty them
flush() {
printf "%s" "% "
printf "%s\n" "${out1[@]}" "${out2[@]}"
out1=(); out2=()
}
while IFS= read -r line; do # read the file line by line
if (( auth )); then # now in the "Author Address" context
if [[ $line = "%"* ]]; then
# end of the "Author Address" context
auth=0
flush # print the arrays
echo "$line" # print current line
else
[[ $line = *"GRAZ"* ]] && out1 =("$line") || out2 =("$line")
# if the line contains "GRAZ" store it in the array out1, else out2
fi
else
if [[ $line = "% "* ]]; then
auth=1 # enter in the "Author Address" context
[[ $line = *"GRAZ"* ]] && out1 =("${line:3}") || out2 =("${line:3}")
# ${line:3} removes leading 3 characters
else
echo "$line"
fi
fi
done < input_file
Output:
%1 Journal Article
%A Abu-Rous, M.
%A Ingolic, E.
%A Schuster, K. C.
%D 2006
%Z Cellulose
Article
CODEN: CELLE
% Research Institute for Electron Microscopy (FELMI), Technical University of Graz, A-8010 Graz, Austria
Christian Doppler-Laboratory of Fibre and Textile Chemistry in Cellulosics, Institute of Texile Chemistry and Textile Physics, Leopold-Franzens-University Innsbruck, Hoechsterstrasse 73, A-6850 Dornbirn, Austria
Textile Innovation, Lenzing AG, A-4860 Lenzing, Austria
Schuster, K.C.; Textile Innovation, , A-4860 Lenzing, Austria; email: [email protected]
%~ Scopus
%G English
%0 Journal Article
%P 5003-5011
%! Ursolic acid from Trailliaedoxa gracilis induces apoptosis in medullary thyroid carcinoma cells
%@ 17912997
%R 10.3892/mmr.2015.4053
%1 in_author_address;
%F 8
%K Apoptosis
Bioactive agents
%Z Mol. Med. Rep.
Article
Chemicals/CAS: caspase 8; ursolic acid, 77-52-1; I kappa B kinase, 209902-66-9; Antineoplastic Agents, Phytogenic; Caspase 8; I-kappa B Kinase; IKBKG protein, human; Plant Extracts; Triterpenes; ursolic acid
Tradenames: cpt, Sigma Aldrich; rotichrom, karlsruhe, Germany
% Department of Pathophysiology and Immunology, Center of Molecular Medicine, Medical University of Graz, 31a Heinrichstrasse, Graz A, 8010, Austria
Research Institute for Electron Microscopy and Fine Structure Research, University of Technology Graz, Graz A, 8010, Austria
Pfragner, R.; Department of Pathophysiology and Immunology, 31a Heinrichstrasse, Austria; email: [email protected]
Department of Biochemistry and Molecular Biology, Shanghai Medical School, Fudan University, Shanghai, 200433, China
Department of Pharmacognosy, Institute of Pharmacy, Center of Molecular Biosciences, Leopold Franzens University of Innsbruck, Innsbruck A, 6010, Austria
Core Unit of Biomedical Research, Division of Laboratory Animal Science and Genetics, Medical University of Vienna, Himberg A, 2325, Austria
%~ Scopus C2 - 26151624
%G English
The output slightly differs from your expected output, one is because the ... University of Technology Graz ...
line comes later than other lines which do not contain GRAZ
in your expected result, the other is because medunigraz
is considered to match GRAZ
.