Hi I have a very long string that has the following structure:
"IF (
ISFILTERED ( Table1[Column_1] ),
VAR ___f = FILTERS ( Table1[Column_1] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_1])
VAR ___d = CONCATENATEX ( ___t, Table1[Column_1], ", " )
VAR ___x = "Table1[Column_1] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
& IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "Table1[Column_2] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
& IF (
...
It basically continues like that iterating through each column of each table in a schema. As you can imagine it this string will be very large for a lot of tables/columns.
I have a list of columns from specific tables like this:
['Table1[Column_1]', 'Table2[Column_4]', 'Table6[Column_22]']
These are the only columns I am interested in keeping in the string.
So I need to go through the string and remove the entire IF statement it relates too if the table/column is not in the list.
So based on the above example the expected output would be:
"IF (
ISFILTERED ( Table1[Column_1] ),
VAR ___f = FILTERS ( Table1[Column_1] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_1])
VAR ___d = CONCATENATEX ( ___t, Table1[Column_1], ", " )
VAR ___x = "Table1[Column_1] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
& IF (
...
we just got rid of the second IF because Table1[Column_2] was not in the list.
Would regex be useful for this case? Or maybe I should iterate through the list and build a new string that just keeps the relevant parts.
I know it is best practise to show what you have attempted so far but I am not sure where to start with this it seems like it should be easy but I am having trouble.
Can anyone help me please?
Python solutions would be best as I know that more, but happy to investigate other methods if easier. I know there are regex tools online maybe I can just use one of those?
CodePudding user response:
Assuming string structure is constant. You can try this, but it depends on structure.
data = YOUR_STRING_FROM_QUESTION
# This is the delimeter, which will help us to split query on parts
prefix = '& IF (\n'
# define list of allowed tables
allowed_tables = ['Table1[Column_1]', 'Table2[Column_4]', 'Table6[Column_22]']
# split full string on parts by prefix
query_by_part = data.split(prefix)
# build clean list of query parts, where tables in allowed_tables
clean_query_parts = [part for part in query_by_part
if any(table in part for table in allowed_tables)]
# finally join list to string using prefix and print
print(prefix.join(clean_query_parts))
Input:
IF (
ISFILTERED ( Table1[Column_1] ),
VAR ___f = FILTERS ( Table1[Column_1] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_1])
VAR ___d = CONCATENATEX ( ___t, Table1[Column_1], ", " )
VAR ___x = "Table1[Column_1] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
& IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "Table1[Column_2] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
Output:
IF (
ISFILTERED ( Table1[Column_1] ),
VAR ___f = FILTERS ( Table1[Column_1] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_1])
VAR ___d = CONCATENATEX ( ___t, Table1[Column_1], ", " )
VAR ___x = "Table1[Column_1] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
CodePudding user response:
You can use jinja2 to easily generate the string from scratch in this way:
- Add a new file with the double extension 'txt.jinja', e.g., example.txt.jinja2 with the following code inside (this is your template) and save it in the same path of your script:
"
{%- for column in columns_to_add -%}
IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "{{column}} = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
&
{% endfor -%}
"
- Execute this Python script:
import os
import jinja2
TEMPLATE_NAME = "example.txt.jinja2"
LIST_OF_TABLES = [
'Table1[Column_1]',
'Table2[Column_4]',
'Table6[Column_22]'
]
template = os.path.join(os.path.dirname(os.path.abspath(__file__)), "")
jinja_env = jinja2.Environment(loader=jinja2.FileSystemLoader(template))
template2= jinja_env.get_template(TEMPLATE_NAME)
string = template2.render(columns_to_add=LIST_OF_TABLES)
output = template TEMPLATE_NAME.split("\\")[-1].replace('.jinja2', '')
with open(output, "w ") as f:
f.write(string)
This will create a new file (in the same path) called example.txt with your string:
"IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "Table1[Column_1] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
&
IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "Table2[Column_4] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
&
IF (
ISFILTERED ( Table1[Column_2] ),
VAR ___f = FILTERS ( Table1[Column_2] )
VAR ___r = COUNTROWS ( ___f )
VAR ___t = TOPN ( MaxFilters, ___f, Table1[Column_2])
VAR ___d = CONCATENATEX ( ___t, Table1[Columnw_1], ", " )
VAR ___x = "Table6[Column_22] = " & ___d & IF(___r > MaxFilters, ", ... [" & ___r & " items selected]") & " "
RETURN ___x & UNICHAR(13) & UNICHAR(10)
)
&
"
This works fine no matter how big your list of desired columns is, and the best part is that you only have to keep this list updated.