I have a table where I have scraped each PGA tournament and the count of the number of times a player finished in the top 30. Example:
Event | Marc Leishman | Patrick Rodgers | Scott Stallings |
---|---|---|---|
AT&T Pebble Beach Pro-Am | 0 | 1 | 3 |
Barbasol Championship | 1 | 1 | 1 |
The goal for me is to find how each event correlates to other events in the table, using the number of players with a non zero value below as my measurement. So the more players who have a non-zero value across the two events, the more correlation exists.
To accomplish this, It seems I have to go row by row (event AT&T Pebble 1st), find all columns > 0 (Patrick Rodgers and Scott Stallings) and sum those columns only, ignoring the marc leishman column. And then sum values in each row.
There are about 200 columns and 100 rows in this google sheet, yet I have to repeat this same process for each row, since there will be distinct results for each event. The results would look like this:
Event | Total |
---|---|
AT&T Pebble Beach Pro-Am | 6 |
Barbasol Championship | 7 |
I was researching arrayformula to accomlish this but didnt see a way to have the row by row evaluation of columns in google sheets or excel. If there is a python solution to this problem, I'd be open to it too. Thanks for any guidance you can offer- and other approaches to finding the correlation between golf events would be welcome too.
CodePudding user response:
I made a new tab on the sample sheet called MK_Idea.
First I just made a table of each pair of tournaments using:
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(IF(ROW('Build Sample Here'!A2:A)<TRANSPOSE(ROW('Build Sample Here'!A2:A)),'Build Sample Here'!A2:A&"|"&TRANSPOSE('Build Sample Here'!A2:A),)),"|",0,0),"where Col1<>'' and Col2<>''"))
Then I used an MMULT() around a two boolean VLOOKUP() statments multiplied against one another to sum up the number of columns/players) which both have values greater than 0 for that tournament.
This is that formula:
=ARRAYFORMULA(IF(B2:B="",,MMULT(IFERROR((VLOOKUP(B2:B,'Build Sample Here'!A1:10000,COLUMN('Build Sample Here'!B1:1),0)>0)*(VLOOKUP(C2:C,'Build Sample Here'!A1:10000,COLUMN('Build Sample Here'!B1:1),0)>0),0),TRANSPOSE(COLUMN('Build Sample Here'!B1:1)^0))))