Abstract
The widespread use of cloud-based Large Language Models (LLMs) has heightened concerns over user privacy, as sensitive information may be inadvertently exposed during interactions with these services. To protect privacy before sensitive data is sent to those models, we suggest sanitizing sensitive text using two strategies commonly used by humans: i) deleting sensitive expressions, and ii) obscuring sensitive details by abstracting them. To explore the issues and develop a tool for text rewriting, we curate the first corpus for this task, named NAP2, through both crowdsourcing and the use of LLMs. Compared with prior work on anonymization, the human-inspired approaches result in more natural rewrites and offer an improved balance between privacy protection and data utility, as demonstrated by our extensive experiments. Our dataset is available at https://github.com/shuo956/NAP2-privacyrewrite.
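The two rewriting strategies named in the abstract can be illustrated with a minimal sketch. This is not the method or tooling from the paper: the function names, the example sentence, and the hand-specified sensitive spans are all assumptions made for illustration, whereas the work itself concerns rewrites produced by crowdworkers and LLMs.

```python
import re


def sanitize_by_deletion(text, sensitive_spans):
    """Strategy (i): delete each sensitive expression, then tidy whitespace."""
    for span in sensitive_spans:
        text = text.replace(span, "")
    # Collapse the gaps left behind by removed spans.
    text = re.sub(r"\s{2,}", " ", text)
    # Remove any space that now sits directly before punctuation.
    text = re.sub(r"\s+([.,!?])", r"\1", text)
    return text.strip()


def sanitize_by_abstraction(text, abstractions):
    """Strategy (ii): replace each sensitive detail with a more general term."""
    for span, general in abstractions.items():
        text = text.replace(span, general)
    return text


# Hypothetical input; the spans below are chosen by hand, not detected.
original = "I visited my cardiologist in Sydney last week."

deleted = sanitize_by_deletion(original, ["in Sydney"])
abstracted = sanitize_by_abstraction(
    original, {"cardiologist": "doctor", "Sydney": "a big city"}
)

print(deleted)     # I visited my cardiologist last week.
print(abstracted)  # I visited my doctor in a big city last week.
```

Deletion drops the detail entirely, while abstraction keeps a less specific version of it, which is one way to read the privacy and utility trade-off the abstract mentions.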
| Original language | English |
|---|---|
| Title of host publication | EMNLP 2025 |
| Subtitle of host publication | the 2025 Conference on Empirical Methods in Natural Language Processing : Findings of EMNLP 2025 |
| Place of Publication | Kerrville, TX |
| Publisher | Association for Computational Linguistics |
| Pages | 8954-8970 |
| Number of pages | 17 |
| ISBN (Electronic) | 9798891763357 |
| Publication status | Published - 2025 |
| Event | 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 - Suzhou, China. Duration: 4 Nov 2025 → 9 Nov 2025 |
Conference
| Conference | 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 |
|---|---|
| Country/Territory | China |
| City | Suzhou |
| Period | 4/11/25 → 9/11/25 |
Bibliographical note
Alternative title of the host publication: "Findings of the Association for Computational Linguistics: EMNLP 2025"
Title of this publication: "NAP2: a benchmark for naturalness and privacy-preserving text rewriting by learning from human"