Regex for string before curly brackets with specific white-spacing

Time:04-06

I very much suck at regex and have been banging my head against a wall for a few days. I'm trying to pull info out of a Nuke script (.nk) with Python. The formatting is as follows:

BackdropNode {
 inputs 0
 name BackdropNode6
 tile_color 0x555555ff
 label RDN
 note_font_size 42
 xpos -1136
 ypos -4272
 bdwidth 451
 bdheight 529
}
Write {
 file "\[value project_directory]/_output_/\[string range \[file tail \[value root.name]] 0 20].mov"
 colorspace sRGB
 raw true
 file_type mov\{(.*?)\}
 mov64_format "mov (QuickTime / MOV)"
 mov64_codec AVdh
 mov64_dnxhd_codec_profile "DNxHD 422 8-bit 145Mbit"
 mov64_dnxhr_codec_profile "SQ 4:2:2 8-bit"
 mov64_pixel_format {{0} "yuv420p\tYCbCr 4:2:0 8-bit"}
 mov64_quality High
 mov64_advanced 1
 mov64_write_timecode true
 mov64_gop_size 12
 mov64_b_frames 0
 mov64_bitrate 20000
 mov64_bitrate_tolerance 40000000
 mov64_quality_min 2
 mov64_quality_max 31
 render_order 3
 checkHashOnRead false
 version 132
 in_colorspace scene_linear
 out_colorspace scene_linear
 name Write2
 xpos -1025
 ypos 2230
}

There is a one-word string, a space, an open curly bracket, then everything until the closing curly bracket on its own line. I want to match the first string and then everything inside the curly brackets.

I've gotten here:

^([a-zA-Z]+\s)+(\{[^}]*.*\})$

which gets me to the first close curly bracket. I'd like it to go all the way until the last curly bracket.

So match 1 would be BackdropNode {...} with group 1 of BackdropNode and group 2 of {...}. Then match 2 would be Write {...} with group 1 of Write and group 2 of {...}.

CodePudding user response:

You can use

(?sm)^([a-zA-Z]+)\s*\{(.*?)}(?=\n[a-zA-Z]+\s*{\n|\Z)

See the regex demo.

Details:

  • (?sm) - the re.S / re.DOTALL and re.M / re.MULTILINE inline modifiers
  • ^ - start of a line
  • ([a-zA-Z]+) - Group 1: one or more letters
  • \s* - zero or more whitespaces
  • \{ - a { char
  • (.*?) - Group 2: any zero or more chars, as few as possible
  • } - a } char
  • (?=\n[a-zA-Z]+\s*{\n|\Z) - a positive lookahead that requires either a newline followed by one or more letters, zero or more whitespaces, { and a newline, or the end of the whole string.
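In Python, the pattern could be applied roughly as follows. This is a minimal sketch, not a full .nk parser: the names NK_SAMPLE, NODE_RE and iter_nodes are made up for illustration, and the sample text is a shortened version of the excerpt in the question. The rstrip() call is an added assumption: Python's \Z only matches at the very end of the string, so a trailing newline after the final } would otherwise keep the last node from matching.

```python
import re

# Shortened sample modeled on the .nk excerpt in the question (hypothetical data).
NK_SAMPLE = r"""BackdropNode {
 inputs 0
 name BackdropNode6
 label RDN
}
Write {
 colorspace sRGB
 mov64_pixel_format {{0} "yuv420p\tYCbCr 4:2:0 8-bit"}
 name Write2
}"""

# Group 1: node class (letters only). Group 2: everything between the outer braces.
NODE_RE = re.compile(r"(?sm)^([a-zA-Z]+)\s*\{(.*?)}(?=\n[a-zA-Z]+\s*{\n|\Z)")


def iter_nodes(text):
    """Yield (node_class, body) for each top-level 'Name { ... }' block."""
    # Strip trailing whitespace so the \Z alternative can match after the last }.
    for match in NODE_RE.finditer(text.rstrip()):
        yield match.group(1), match.group(2)


for node_class, body in iter_nodes(NK_SAMPLE):
    # Print the node class and its knob lines, one per line.
    print(node_class)
    for line in body.strip().splitlines():
        print("   ", line.strip())
```

Because the lookahead only accepts a } that is followed by a new "Name {" line or by the end of the string, the inner braces on the mov64_pixel_format line do not end the match early. Note that [a-zA-Z]+ will not match node classes that contain digits; if the script has any, \w+ in both places may be a safer choice.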