PhD student · Speech technology · Conversational AI

Shree Harsha

I research speech interaction dynamics, SpeechLLM evaluation, and responsible conversational AI.

My current work studies enchronic speech dynamics with self-supervised learning, voice-based bias evaluation, and failure modes in speech foundation models. I am supervised by Éva Székely and Gustav Eje Henter, supported by the WASP program, and based at KTH Royal Institute of Technology.


About

Research Focus

Enchronic speech dynamics are the moment-by-moment patterns that make conversation work: when people take turns, how quickly they respond, how meaning is built together, and how local social context shapes what comes next.

My research uses self-supervised learning to study these dynamics and to ask how SpeechLLMs and conversational AI systems can better reflect the timing, variation, and responsibility that real interaction requires.

Research

Current Directions

Current focus · JSALT 2026

Evaluating full-duplex conversation as interaction

I am developing evaluation methods for full-duplex conversational systems that use human-human conversation as the reference. We plan to use distributional metrics to compare human-human, human-AI, and AI-AI conversations.

The work asks whether a few salient positive moments and longer stretches of conversational burden can explain how an interaction is remembered and rated. It connects turn-taking, responsiveness, entrainment, affect, and distributional speech metrics to participant-level judgements of conversational success.

Figure: Human interaction trace, marking salient moments and sustained burden.

Enchronic speech dynamics

Modelling the moment-by-moment timing, turn-taking, and coordination through which people build meaning together.

Responsible SpeechLLM evaluation

Revealing voice-based bias and robustness failures through controlled variation, open-ended interaction, and human validation.

Selected work

Research in view

Selected papers on voice-based bias, interactive evaluation, and robustness in speech foundation models.

Figure: Helpfulness across accent × gender profiles. Heatmap and interval plot comparing SpeechLLM helpfulness across accents, models, and gender.

Interspeech 2026 · Accepted

More publications

2026

“Walk a Mile in My Voice”: Voice Conversion Shapes Trust, Attribution, and Empathy in Human-AI Speech Interactions

SH Bokkahalli Satish, M Teleki, C Minixhofer, O Klejch, P Bell, É Székely.

IUI Companion 2026

2025

Lost in Phonation: Voice Quality Variation as an Evaluation Dimension for Speech Foundation Models

H Lameris, SH Bokkahalli Satish, J Gustafson, É Székely.

arXiv:2510.25577 · Submitted to LREC 2026

2025

When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs

SH Bokkahalli Satish, GE Henter, É Székely.

SPECOM 2025 · Published

2025

Hear Me Out: Interactive Evaluation and Bias Discovery Platform for Speech-to-Speech Conversational AI

SH Bokkahalli Satish, GE Henter, É Székely.

Interspeech 2025 Show and Tell/Demo · Published

2021

Predicting Lexical Skills from Oral Reading with Acoustic Measures

SH Bokkahalli Satish, C Vitthal, K Sabu, P Rao.

arXiv:2112.00635


Background

Education and Experience

Current

PhD student

Researching enchronic speech dynamics using self-supervised learning, and bias discovery and mitigation in conversational AI. Supervised by Éva Székely and Gustav Eje Henter at KTH Royal Institute of Technology, with support from WASP.

Pre-PhD

Speech and AI systems

Worked on speech assessment systems, voice activity detection, and LLM/RAG solutions for aerospace repair-scheme creation.

2022-2024

MSc in Data Science & AI

Deepened work across machine learning, AI, and applied data-driven systems.

2018-2021

Master's in Electrical Engineering, Signal Processing

Built ASR models for literacy assessments in India, later used in the field by an NGO.

Interests

Research Interests

Contact


Research in view

The Voice Behind the Words

From Seeing it to Experiencing it

What Counts as Real?

Speak Your Mind

Do Bias Benchmarks Generalise?
