Using Excel, we have a list of TV shows that have audience overlap. In the format below:
Show A | Show B | Show C |
---|---|---|
Show B | Show A | Show A |
Show C | Show C | Show B |
This data is repeated, for > 20 shows.
I'm trying to find an easy way to uncover all the "clusters". For example, Shows A, B, and C all have audience crossover, indicating a "cluster" or "network" of potentially similar shows.
Is there any easy formula for this?
I tried using Microsoft Co-Pilot. It provided me with an interesting visual, but upon closer inspection, it appeared incorrect, and was ultimately unable to accurately identify all the existing "sub-graphs". Instead, when asked, it continually outputted the same list of all 20+ shows. I clarified asking for only "reciprocal relationships", but it still provided no improvement on the answer - listing all shows as the "one subgraph".
Answer
You could do it using PowerQuery, parsing the column headers and using them as filters for each column:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnNames = Table.ColumnNames(Source),
FilteredTable = Table.SelectRows(Source, each List.AllTrue(
List.Transform(Record.ToList(_), each List.Contains(ColumnNames, _))
))
in
FilteredTable
You could also use a formula:
=LET(
data, A2:C8,
headers, A2:C2,
matchMatrix, BYROW(data, LAMBDA(row, AND(BYCOL(row, LAMBDA(cell, ISNUMBER(MATCH(cell, headers, 0))))))),
FILTER(data, matchMatrix)
)