Paper-Conference

CLUE: Conflict-guided Localization for LLM Unlearning Framework

LLM unlearning aims to eliminate the influence of undesirable data without affecting causally unrelated information. This process typically involves using a forget set to …

hang-chen-jiaying-zhu-xinyu-yang-wenya-wang

Skill Path: Unveiling Language Skills from Circuit Graphs

Circuit graph discovery has emerged as a fundamental approach to elucidating the skill mechanisms of language models. Despite the output faithfulness of circuit graphs, they …

hang-chen-xinyu-yang-jiaying-zhu-wenya-wang

Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates

Circuit discovery has gradually become one of the prominent methods for mechanistic interpretability, and research on circuit completeness has also garnered increasing attention. …

hang-chen-jiaying-zhu-xinyu-yang-wenya-wang

Quantifying Semantic Emergence in Language Models

Large language models (LLMs) are widely recognized for their exceptional capacity to capture semantic meaning. Yet, there remains no established metric to quantify this …

hang-chen-xinyu-yang-jiaying-zhu-wenya-wang

Debiasing the Fine-Grained Classification Task in LLMs with Bias-Aware PEFT

Fine-grained classification via LLMs is susceptible to more complex label biases than traditional classification tasks. Existing bias mitigation strategies, such as …

daiying-zhao-xinyu-yang-hang-chen

How to enhance causal discrimination of utterances: A case on affective reasoning

Our investigation into the Affective Reasoning in Conversation (ARC) task highlights the challenge of causal discrimination. Almost all existing models, including large language …

hang-chen-xinyu-yang-jing-luo-wenjing-zhu