i have a file with below content. i need to separate the content into 2 files o/p1 should have content everything within first braces () and ` removed and only 1&2 columns printed. o/p2 should have location with its value
$ cat dt.txt
CREATE EXTERNAL TABLE `rte.fteff_ft`(
`dt` date,
`wk_id` int,
`yq_id` int(10,00),
`te_ind` string,
`yw_dt` date,
`em_dt` date comment dfdsf sdfsdf)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\u0007'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://dfdf/data/ffff/ODE/TdddfT/'
TBLPROPERTIES (
'last_modified_by'='asdas',
'last_modified_time'='1639551681',
'numFiles'='1',
'totalSize'='2848434',
'transient_lastDdlTime'='1639551681')
i need output from the above in two files.
o/p1: a.txt
dt date,
wk_id int,
yq_id int(10,00),
te_ind string,
yw_dt date,
em_dt date
o/p2: b.txt
LOCATION
'hdfs://dfdf/data/ffff/ODE/TdddfT/'
CodePudding user response:
First, use sed
to run a couple of commands, to operate on the range of lines between 'CREATE EXTERNAL' and 'ROW DELIMITED FORMAT' where they occur at the start of the line, not including those lines. Then replace grave accent marks with nothing, then keep only the first 2 words.
sed -E '/CREATE EXTERNAL/,/ROW FORMAT DELIMITED/!d;//d;s/`//g; s/(([^ ] ){2}).*/\1/' dt.txt > a.txt
Second, use the -A option to grep
to match the word 'LOCATION' on a line by itself plus the following 1 line.
grep -A 1 '^LOCATION$' dt.txt > b.txt