How to find which currency pairs from a list have both currencies missing in all of the cells in a c-CodePudding

Suppose you have the following list called currency_trading_pairs with the following elements:

currency_trading_pairs = ['USD/CAD', 'EUR/USD', 'GBP/USD', 'NZD/USD', 'AUD/USD', 
                          'USD/JPY', 'EUR/CAD', 'EUR/AUD', 'EUR/JPY', 'EUR/GBP',
                          'AUD/CAD', 'GBP/JPY', 'CHF/JPY', 'AUD/JPY', 'AUD/NZD']

Now, suppose that you have the following dataframe called filtered_df with the following data:

    Time    Currency    Volatility expected        Event
24  04:30     GBP   Low Volatility Expected        Inflation Expectations
25  05:00     EUR   High Volatility Expected       EU Leaders Summit  
26  05:10     EUR   Low Volatility Expected        Italian 15-Year BTP Auction
27  05:10     EUR   Low Volatility Expected        Italian 3-Year BTP Auction
28  05:10     EUR   Low Volatility Expected        Italian 7-Year BTP Auction
29  06:00     EUR   Low Volatility Expected        Spanish Consumer Confidence
30  06:30     INR   Low Volatility Expected        Bank Loan Growth
31  06:30     INR   Low Volatility Expected        Deposit Growth
32  06:30     INR   Low Volatility Expected        FX Reserves, USD
33  07:00     INR   Low Volatility Expected        Cumulative Industrial Production (Jan)
34  07:00     INR   Low Volatility Expected        Industrial Production (YoY) (Jan)
35  07:00     INR   Low Volatility Expected        Manufacturing Output (MoM) (Jan)
36  07:00     BRL   Moderate Volatility Expected   CPI (YoY) (Feb)
37  07:00     BRL   Moderate Volatility Expected   CPI (MoM) (Feb)
38  08:06     BRL   Moderate Volatility Expected   Brazilian IPCA Inflation Index SA (MoM)(Feb)
39  08:30     CAD   Low Volatility Expected        Capacity Utilization Rate (Q4)
40  08:30     CAD   High Volatility Expected       Employment Change (Feb)
41  08:30     CAD   Low Volatility Expected        Full Employment Change (Feb)
42  08:30     CAD   Low Volatility Expected        Part Time Employment Change (Feb)
43  08:30     CAD   Low Volatility Expected        Participation Rate (Feb)
44  08:30     CAD   Moderate Volatility Expected   Unemployment Rate (Feb)

How could you find which currency pairs (elements) from currency_trading_pairs (list) have both currencies missing in all of the cells in the column Currency from the filtered_df so that you can get the following output in a variable called the_missing_pairs:

the_missing_pairs = ['NZD/USD', 'AUD/USD', 'USD/JPY', 'CHF/JPY', 'AUD/JPY', 'AUD/NZD']

Further explanation: Basically, assuring that all of the currency names in the_missing_pairs list do not appear in any of the cells in the column Currency from the filtered_df.

CodePudding user response：

I would actually convert currency_trading_pairs to a Series object. Then you can split by / and explode, then use isin, and finally groupby(level=0) any to produce the perfect mask:

ctp = pd.Series(currency_trading_pairs)
the_missing_pairs = ctp[~ctp.str.split('/').explode().isin(df['Currency']).groupby(level=0).any()].tolist()

Output:

>>> the_missing_pairs
['NZD/USD', 'AUD/USD', 'USD/JPY', 'CHF/JPY', 'AUD/JPY', 'AUD/NZD']