Mechanistic Interpretability

CLUE: Conflict-guided Localization for LLM Unlearning Framework

LLM unlearning aims to eliminate the influence of undesirable data without affecting causally unrelated information. This process typically involves using a forget set to …

Hang Chen, Jiaying Zhu, Xinyu Yang, Wenya Wang

• Jan 26, 2026 • 1 min read

Large Language Models

Skill Path: Unveiling Language Skills from Circuit Graphs

Circuit graph discovery has emerged as a fundamental approach to elucidating the skill mechanisms of language models. Despite the output faithfulness of circuit graphs, they …

Hang Chen, Xinyu Yang, Jiaying Zhu, Wenya Wang

• Jan 1, 2026 • 1 min read

Large Language Models

Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates

Circuit discovery has gradually become one of the prominent methods for mechanistic interpretability, and research on circuit completeness has also garnered increasing attention. …

Hang Chen, Jiaying Zhu, Xinyu Yang, Wenya Wang

• Dec 15, 2025 • 1 min read

