TY - CPAPER
T1 - Follow-on question suggestion via voice hints for voice assistants
AU - Fetahu, Besnik
AU - Faustini, Pedro
AU - Fang, Anjie
AU - Castellucci, Giuseppe
AU - Rokhlenko, Oleg
AU - Malmasi, Shervin
PY - 2023
Y1 - 2023
N2 - The adoption of voice assistants like Alexa or Siri has grown rapidly, allowing users to instantly access information via voice search. Query suggestion is a standard feature of screen-based search experiences, allowing users to explore additional topics. However, this is not trivial to implement in voice-based settings. To enable this, we tackle the novel task of suggesting questions with compact and natural voice hints to allow users to ask follow-up questions. We define the task, ground it in syntactic theory, and outline linguistic desiderata for spoken hints. We propose baselines and an approach using sequence-to-sequence Transformers to generate spoken hints from a list of questions. Using a new dataset of 6681 input questions and human-written hints, we evaluated the models with automatic metrics and human evaluation. Results show that a naive approach of concatenating suggested questions creates poor voice hints. Our approach, which applies a linguistically-motivated pretraining task, was strongly preferred by humans for producing the most natural hints.
UR - http://www.scopus.com/inward/record.url?scp=85183295135&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.findings-emnlp.24
DO - 10.18653/v1/2023.findings-emnlp.24
M3 - Conference proceeding contribution
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 310
EP - 325
BT - Findings of the Association for Computational Linguistics: EMNLP 2023
PB - Association for Computational Linguistics
CY - Stroudsburg
T2 - 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -