Gregory Mermoud’s Post


𝗔𝗜 + 𝗦𝘆𝘀𝘁𝗲𝗺𝘀 · Distinguished Engineer @ Cisco · Innovation & Engineering Leader · 190+ granted patents 💡

Very insightful work by Anthropic’s interpretability team. And an amazing paper, with outstanding writing and figures. The idea is very simple: interpret LLMs by leveraging sparse autoencoders as surrogate models of the MLP of transformer blocks, which allows one to disambiguate the superposition of features captured by a single neuron. A simple idea, but a very careful and complex execution, as is often the case in our line of work. The paper goes into many details and provides a large array of insights, although the gist of the implementation remains obfuscated due to the closed-source nature of Claude. Too bad, because this is the kind of work we need to better understand and eventually trust LLMs. This is demonstrated by the authors in the section ‘Influence on Behavior’, where they show that clamping some features to high or low values during inference is “remarkably effective at modifying model outputs in specific, interpretable ways”. Hopefully this kind of work will be replicated and generalized to open-weights models, so that we have new ways to steer their behavior. https://lnkd.in/eVym7f_f #interpretability #xai #explainableai #steerableai #anthropic #claude
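For readers curious what the sparse-autoencoder surrogate and the feature-clamping intervention look like in practice, here is a minimal PyTorch sketch. This is not Anthropic’s implementation; the dimensions, the L1 coefficient, the feature index, and the clamping value are illustrative assumptions.

```python
# Minimal sketch: a sparse autoencoder (SAE) over MLP activations, plus the
# "feature clamping" intervention described in the post. NOT Anthropic's code;
# sizes, hyperparameters and the clamping value are illustrative assumptions.

import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        # Encoder maps MLP activations into an overcomplete feature space.
        self.encoder = nn.Linear(d_model, d_features)
        # Decoder reconstructs the original activations from the features.
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse feature activations
        reconstruction = self.decoder(features)           # surrogate for the MLP output
        return reconstruction, features


def sae_loss(reconstruction, activations, features, l1_coeff: float = 1e-3):
    # Reconstruction error keeps the surrogate faithful to the MLP;
    # the L1 penalty pushes most features to zero (sparsity -> monosemanticity).
    mse = torch.mean((reconstruction - activations) ** 2)
    sparsity = l1_coeff * features.abs().sum(dim=-1).mean()
    return mse + sparsity


if __name__ == "__main__":
    d_model, d_features = 512, 8192        # assumed sizes; features >> model dim
    sae = SparseAutoencoder(d_model, d_features)
    acts = torch.randn(4, d_model)         # stand-in for real MLP activations

    recon, feats = sae(acts)
    loss = sae_loss(recon, acts, feats)
    loss.backward()                        # a full training loop would follow

    # Clamping: force a hypothetical feature (index 123) to a high value and
    # decode; at inference, the decoded vector would replace the MLP output
    # to steer the model toward the behavior that feature encodes.
    with torch.no_grad():
        clamped = feats.clone()
        clamped[:, 123] = 10.0             # arbitrary high value for illustration
        steered_activations = sae.decoder(clamped)
```

The same mechanics would apply to an open-weights model: train the SAE on recorded MLP activations, then patch the decoded, clamped reconstruction back into the forward pass at the corresponding layer.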

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

transformer-circuits.pub

