Given the following pandas DataFrame -
json_path | Reporting Group | Entity/Grouping | Entity ID | Adjusted Value (Today, No Div, USD) | Adjusted TWR (Current Quarter, No Div, USD) | Adjusted TWR (YTD, No Div, USD) | Annualized Adjusted TWR (Since Inception, No Div, USD) | Adjusted Value (No Div, USD) | TWR Audit Note |
---|---|---|---|---|---|---|---|---|---|
data.attributes.total.children.[0].children.[0].children.[0] | Barrack Family | William and Rupert Trust | 9957007 | -1.44 | -1.44 | ||||
data.attributes.total.children.[0].children.[0].children.[0].children.[0] | Barrack Family | Cash | - | -1.44 | -1.44 | ||||
data.attributes.total.children.[0].children.[0].children.[1] | Barrack Family | Gratia Holdings No. 2 LLC | 8413655 | 55491732.66 | -0.971018847 | -0.971018847 | 11.52490309 | 55491732.66 | |
data.attributes.total.children.[0].children.[0].children.[1].children.[0] | Barrack Family | Investment Grade Fixed Income | - | 18469768.6 | 18469768.6 | ||||
data.attributes.total.children.[0].children.[0].children.[1].children.[1] | Barrack Family | High Yield Fixed Income | - | 3668982.44 | -0.205356545 | -0.205356545 | 4.441190127 | 3668982.44 |
I am trying to keep only the rows whose json_path contains four occurrences of .children.[n],
using the following statement -
Code: perf_by_entity_df = df[df['json_path'].str.contains(r'(\.children\.\[\d+\]){4}')]
However, I receive the following:
Error: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
Any suggestions why this is happening?
CodePudding user response:
The warning appears because the pattern contains a capturing group, which str.contains does not use. Switch to a non-capturing group to avoid the warning:
perf_by_entity_df = df[df['json_path'].str.contains(r'(?:\.children\.\[\d+\]){4}')]
Replace:
r'(\.children\.\[\d+\]){4}'
By:
r'(?:\.children\.\[\d+\]){4}'
#  ^^-- HERE: Non-capturing group
From the Python re module documentation:
(?:...)
A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
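To see that the two patterns select the same rows and differ only in whether they define a match group, here is a minimal sketch using just the standard re module. It assumes the digit part of the pattern is meant to be \d+, and it uses a few json_path values copied from the question. The variable names are illustrative only. As far as I know, pandas raises the warning when the compiled pattern reports at least one group, so treat that as an assumption about its internals rather than a documented guarantee.

```python
import re

# Sample paths copied from the json_path column in the question.
paths = [
    "data.attributes.total.children.[0].children.[0].children.[0]",
    "data.attributes.total.children.[0].children.[0].children.[0].children.[0]",
    "data.attributes.total.children.[0].children.[0].children.[1]",
    "data.attributes.total.children.[0].children.[0].children.[1].children.[0]",
]

# Original pattern: the parentheses create a capturing group.
capturing = re.compile(r"(\.children\.\[\d+\]){4}")
# Suggested pattern: (?:...) groups without capturing.
non_capturing = re.compile(r"(?:\.children\.\[\d+\]){4}")

# Number of capturing groups each compiled pattern defines.
group_counts = (capturing.groups, non_capturing.groups)

# str.contains tests for a match anywhere in the string, like re.search.
matches_capturing = [bool(capturing.search(p)) for p in paths]
matches_non_capturing = [bool(non_capturing.search(p)) for p in paths]

print(group_counts)
print(matches_capturing)
print(matches_non_capturing)
```

Both patterns flag the same paths, so the filter result is unchanged. Only the group count differs, and that is what the warning is about.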