Finding cruxes in T3C: Sort by controversy to surface solutions
Initial progress on visualizing & exploring the key differences in speakers' perspectives
Created on October 29 | Last edited on November 6
Overview
Current Talk to the City (T3C) reports sort the claims and topics in opinion summaries by frequency, prioritizing the most commonly shared ideas while preserving the breadth of perspectives expressed in freeform participant responses. Sometimes, though, we'd prefer to quickly see and understand the biggest areas of disagreement, in hopes of addressing and resolving them. We extend T3C to find crux claims: crucial, maximally controversial statements that best divide individuals' views (e.g. statements that determine which side of a debate a speaker ultimately takes). Surfacing and analyzing these crux claims can help us pinpoint why folks disagree and suggest paths towards common ground. Below I present a prototype, initial findings, & promising next steps using a toy dataset of recent AI manifestos [0].
Method: Extend T3C pipeline to find points of max controversy
Given a set of input comments—e.g. interviews or survey responses as text, audio, or video—the existing T3C LLM pipeline clusters the comments by topic, summarizes specific claims grounded in exact source quotes, and shows the results as an interactive report with two nested levels of topic clustering and associated claims (latest T3C report on 10 AI Manifesto snippets). This extension is an optional stage added to the T3C LLM pipeline, consisting of these steps:
- extract crux claims: for each subtopic, given its list of related claims tagged by speaker, prompt an LLM for a summary crux claim that most evenly splits the speakers into "agree" and "disagree" sides [1] and explains why each speaker would choose their side
- explore cruxes for shared, controversial, and missing opinions: for each crux claim and each speaker, score agreement as 0.5, disagreement as 1, and no opinion/unknown as 0
- calculate a controversy matrix: for each pair of cruxes and each speaker, score a consistent opinion (agrees with both or disagrees with both) as 0, either opinion unknown as 0.5, and a divergent opinion (agrees with one claim, disagrees with the other) as 1; summing over speakers yields a symmetric matrix whose highest entries correspond to the pairs of cruxes on which speaker opinions maximally diverge (see the sketch below)
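Here is a minimal sketch of that pairwise scoring, assuming the per-speaker crux scores (0 / 0.5 / 1 as above) have already been collected into a speakers × cruxes array. The function names are hypothetical, not part of the T3C codebase.

```python
import numpy as np

# Score key (matches the report): 0 = unknown, 0.5 = agree, 1 = disagree
UNKNOWN, AGREE, DISAGREE = 0.0, 0.5, 1.0

def pair_score(a: float, b: float) -> float:
    """One speaker's contribution for a pair of cruxes: 0 if consistent
    (agrees with both or disagrees with both), 0.5 if either opinion is
    unknown, 1 if the opinions diverge."""
    if a == UNKNOWN or b == UNKNOWN:
        return 0.5
    return 0.0 if a == b else 1.0

def controversy_matrix(scores: np.ndarray) -> np.ndarray:
    """scores: (num_speakers, num_cruxes) array of 0 / 0.5 / 1 values.
    Returns a symmetric (num_cruxes, num_cruxes) matrix; the largest
    entries mark the crux pairs on which speaker opinions diverge most."""
    _, n_cruxes = scores.shape
    matrix = np.zeros((n_cruxes, n_cruxes))
    for i in range(n_cruxes):
        for j in range(i + 1, n_cruxes):
            total = sum(pair_score(a, b)
                        for a, b in zip(scores[:, i], scores[:, j]))
            matrix[i, j] = matrix[j, i] = total
    return matrix
```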
Key visual: Controversy matrix for sampled AI manifestos
Three examples of a controversy matrix are shown below. Click on each thumbnail to see the full detail.
From left to right, we increase the number of "speakers" (distinct AI manifesto essays sampled), the total amount of text pulled per essay, the number of comment rows (i.e. distinct LLM calls), and correspondingly the number of extracted claims. The size of the matrix (the number of crux claims) depends on the total number of subtopics, which in turn varies with how the input text is distributed across LLM calls; we'll keep tuning this as we adapt the T3C pipeline for long-form essays rather than more naturally bounded "conversational turns", "survey responses", or "recorded interviews". The first run uses 7 different speakers/AI manifesto essays with 10K characters from each, sampled in 4K blocks; the second run uses 16K characters per essay; the third run uses 10 essays with 12K characters from each, sampled in 3K blocks. Click on the parenthetical links to see the full logs of each experiment.
While this can be tuned for a more consistent level of claim granularity and representation across speakers, we can already see some intuitive max-controversy pairs:
- The development and integration of AI should prioritize safety and ethical considerations to prevent potential harm to humanity
- The development of advanced AI will primarily have positive impacts on society
- Are the risks fundamentally addressable and if not, can they be accepted as worth the benefits? (full detail: Crux Finding V1)
- The development of superintelligent AI poses insurmountable control and alignment challenges that current methodologies cannot address
- The potential benefits of AI development outweigh its existential risks
- The development of AI will lead to superhuman intelligence that surpasses human control and understanding.
- The development and integration of AI should be aggressively pursued despite the potential risks it poses.
| Run | Num speakers | Chars per speaker | Block size |
|-----|--------------|-------------------|------------|
| 1   | 7            | 10K               | 4K         |
| 2   | 7            | 16K               | n/a        |
| 3   | 10           | 12K               | 3K         |
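For reference, here is a rough sketch of the kind of fixed-size block sampling described above. It assumes we simply take the first N characters of each essay and cut them into consecutive blocks, one per LLM call; the actual pipeline's chunking may differ, and the function name is hypothetical.

```python
def sample_blocks(essay: str, chars_per_speaker: int, block_size: int) -> list[str]:
    """Split the first `chars_per_speaker` characters of an essay into
    consecutive `block_size`-character blocks, one per LLM call."""
    text = essay[:chars_per_speaker]
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]

# e.g. the first run: 10K characters per essay, sampled in 4K blocks
essay_text = "..."  # placeholder for the full essay text
blocks = sample_blocks(essay_text, chars_per_speaker=10_000, block_size=4_000)
```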
Tracing and visualizing crux claims and speaker alignment
LLM extracts crux claims from topic + speaker claims & explains reasoning
Each row below shows the LLM's output in the first two columns (the crux claim and the reasoning/explanation), given the inputs in the remaining three columns: all the speakers' claims on the relevant topic, the T3C-extracted topic, and the topic description. Use the arrows at the bottom middle of the panel to scroll. Hover over individual cells to see the full text. This panel contains 64 example cruxes extracted across 6 slightly different T3C topic trees/runs of the LLM pipeline.
Explore cruxes by net agreement/disagreement
For each subtopic, starting with the most popular (the one with the most claims), we extract the crux and show the speakers who agree/disagree, along with their specific claims about the subtopic and an explanation of the LLM's reasoning. You can use the arrows at the very bottom of the panel to page through the later cruxes.
Note that rows 2 and 3 are essentially duplicates of a very intuitive core disagreement in these essays.
Sort for most controversial statements
The previous view is sorted from most popular to least popular subtopic. Instead we can sort by how many distinct speakers expressed agreement/disagreement/any opinion, as in this view. Claims/explanation details are available in the rightmost columns.
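A hypothetical sketch of that sort, assuming each crux record carries the LLM's "agree" and "disagree" speaker lists (the dataclass and field names here are illustrative, not the pipeline's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Crux:
    claim: str
    agree: list[str] = field(default_factory=list)     # speakers the LLM places on the "agree" side
    disagree: list[str] = field(default_factory=list)  # speakers on the "disagree" side

def controversy_key(crux: Crux) -> tuple[int, int]:
    """Rank cruxes: first by how many distinct speakers expressed any opinion,
    then by how evenly the agree/disagree sides are split."""
    num_opinions = len(crux.agree) + len(crux.disagree)
    balance = min(len(crux.agree), len(crux.disagree))
    return (num_opinions, balance)

# Illustrative data only; the speaker labels are placeholders.
cruxes = [
    Crux("The potential benefits of AI development outweigh its existential risks",
         agree=["Speaker A"], disagree=["Speaker B", "Speaker C"]),
    Crux("The development of superintelligent AI poses insurmountable control and alignment challenges",
         agree=["Speaker B", "Speaker C"], disagree=[]),
]
most_controversial = sorted(cruxes, key=controversy_key, reverse=True)
```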
Compare speaker perspectives across each crux
Another view shows each speaker's extracted opinion alongside the crux claim—0.5 for agree, 1 for disagree, and 0 for unknown/no opinion expressed. The second table uses column expressions to compare speaker perspectives, e.g. highlighting where they agree/disagree. For example, we can highlight the statements where our LLM pipeline predicts:
- Eliezer Yudkowsky and Katja Grace disagree (unexpected, as they are generally on the same side of the AGI debate)
- Marc Andreessen and Vitalik Buterin disagree (nowhere in this run, which is also very surprising, as the latter is more cautious)
- Sam Altman and Dario Amodei disagree (perhaps the most interesting crystallization of their takes to trace further)
- Askell, Brundage, & Hadfield disagree with the DeepMind team (also fascinating to explore)
Key: 0: unknown, 0.5: agree, 1: disagree
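A sketch of one such comparison, assuming the scores live in a pandas DataFrame with one row per crux and one column per speaker; the values below are made up for illustration, not the pipeline's actual predictions.

```python
import pandas as pd

# Illustrative table only: 0 = unknown, 0.5 = agree, 1 = disagree
scores = pd.DataFrame({
    "crux": ["Example crux claim A", "Example crux claim B"],
    "Sam Altman": [0.5, 0.0],
    "Dario Amodei": [1.0, 0.5],
})

def speakers_disagree(df: pd.DataFrame, a: str, b: str) -> pd.DataFrame:
    """Cruxes where both speakers expressed an opinion and took opposite sides."""
    both_known = (df[a] != 0) & (df[b] != 0)
    return df[both_known & (df[a] != df[b])]

print(speakers_disagree(scores, "Sam Altman", "Dario Amodei")["crux"])
```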
Example: Trace/debug LLM chains end-to-end to understand edge cases
Tracing a call across these views can help us debug and improve the crux-finding component. For example, the first crux claim below has Katja Grace disagreeing based on the extracted claim "Working on AI capabilities is crucial for learning and contributing to AI alignment." Where did this claim come from? If we search the logs mapping claims to their underlying source quotes, we can see this was extracted from the line "avoiding working on AI capabilities research is bad because it’s so helpful for learning on the path to working on alignment". If we pull up the full essay, we see that this quote is explicitly part of a list of others' opinions that Katja Grace addresses in her essay. This is a tough but necessary case for our extraction pipeline—can we reliably detect when one speaker/writer is quoting/paraphrasing/otherwise referencing a different perspective they do not hold?

Examples of opinions referenced but not necessarily matching the author's
Top left: full T3C report for this experiment, top right: LLM-extracted crux claims and explanations, bottom: finding the original quote which led to the claim
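A hypothetical sketch of that lookup, assuming the pipeline logs each extracted claim with its speaker and exact source quote (the log structure here is illustrative):

```python
# Illustrative claim log: one entry per extracted claim, with its source quote.
claim_log = [
    {
        "speaker": "Katja Grace",
        "claim": "Working on AI capabilities is crucial for learning and contributing to AI alignment.",
        "quote": "avoiding working on AI capabilities research is bad because it's so helpful "
                 "for learning on the path to working on alignment",
    },
]

def trace_claim(log: list[dict], text: str) -> list[dict]:
    """Find logged claims containing the given substring, so we can inspect the
    original quote and judge whether the speaker actually holds that view."""
    needle = text.lower()
    return [entry for entry in log if needle in entry["claim"].lower()]

for entry in trace_claim(claim_log, "AI capabilities"):
    print(entry["speaker"], "→", entry["quote"])
```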
Next steps
Coming soon: a Google Colab for running the T3C pipeline, extracting cruxes, and generating a controversy matrix, as well as easy-to-use versions of this sample dataset of AI manifestos. There are plenty of opportunities for exploring & improving this pipeline—some are listed below. Many of these apply to T3C beyond crux finding. If any of these are interesting, please reach out :)
- prompt engineering: what is a good/useful crux? what needs to be in the context of the question? how many cruxes do we need?
- parsing essays vs. interviews: how do we best handle and split up very long pieces? How do we deal with style/format, from academic research papers, to detailed literature reviews with many images/charts/captions, to personal blogs? More generally, how do we tune speaker/content balance across a larger dataset and multiple LLM calls, or specify the desired level of granularity when extracting claims and topics?
- how to deduplicate/merge/threshold/filter cruxes — some crux claims are effectively rephrasing the same idea, and some of the less-popular subtopics only have 0-2 speakers participating.
- consider different LLM APIs, costs / throughput efficiency, & helpful visualizations for analysis
Endnotes
0: Baseline toy dataset: AI Manifestos
- aa_mb_gh: Amanda Askell, Miles Brundage, and Gillian Hadfield, The Role of Cooperation in Responsible AI Development
1: Current crux extraction prompt
This early iteration matches the style of long-running T3C prompts and seems to work well—suggestions for improvement most welcome as comments on this report :)
I'm going to give you a topic with a description and a list of high-level claims about this topic made by different participants. Each claim is prefixed by the participant's name. I want you to formulate a new, specific statement called a "cruxClaim" which would best split the participants into two groups, based on all their statements on this topic: one group which would agree with the statement, and one which would disagree. Please explain your reasoning and assign participants into "agree" and "disagree" groups.
Return a JSON object of the form
{"crux" : {
  "cruxClaim" : string // the new extracted claim
  "agree" : list of strings // list of the given participant names who would agree with the cruxClaim
  "disagree" : list of strings // list of the given participant names who would disagree with the cruxClaim
  "explanation" : string // reasoning for why you synthesized this cruxClaim from the participants' perspectives
}}
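For illustration, here is a minimal sketch of how this prompt might be sent to an LLM and the JSON response parsed. The OpenAI client usage and model name are assumptions, not necessarily what T3C uses.

```python
import json
from openai import OpenAI

CRUX_PROMPT = "..."  # the crux extraction prompt above

def extract_crux(topic: str, description: str, claims_by_speaker: list[str]) -> dict:
    """claims_by_speaker: strings like 'Katja Grace: <claim text>'."""
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    user_content = (
        f"Topic: {topic}\nDescription: {description}\n"
        + "\n".join(claims_by_speaker)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system", "content": CRUX_PROMPT},
            {"role": "user", "content": user_content},
        ],
        response_format={"type": "json_object"},
    )
    # The prompt asks for {"crux": {...}}, so unwrap that key
    return json.loads(response.choices[0].message.content)["crux"]
```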