The Kinect™ was developed to recognize gestures and voice commands through a set of cameras and microphones, respectively. This paper proposes and evaluates a low-cost Sound Source Localization (SSL) solution based on this off-the-shelf equipment. The approach employs a pair of Kinect devices as an alternative to a microphone array and runs the Steered Response Power using the PHAse Transform (SRP-PHAT) localization algorithm on the acquired sound data. A fully functional prototype has been implemented and tested under a realistic scenario. Experimental results indicate that, although our approach achieves only limited position estimation, it can accurately point towards the source’s direction. Two different high-performance versions of the algorithm have been implemented to improve overall system performance under a 3D Sound Source Localization setup.
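
For readers unfamiliar with SRP-PHAT, the sketch below illustrates its pairwise building block, the generalized cross-correlation with phase transform (GCC-PHAT), which estimates the time delay between two microphone signals. SRP-PHAT sums such phase-weighted correlations over all microphone pairs for each candidate source location and picks the location with the highest total. This is a minimal illustration, not the implementation evaluated in the paper: the function name `gcc_phat`, its parameters, and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the delay (in seconds) of `sig` relative to `ref` via GCC-PHAT.

    Hypothetical helper for illustration only; `fs` is the sampling rate in Hz.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    # Reorder so that negative lags come first and zero lag sits at index max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Usage: a noise burst delayed by 5 samples at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
delay = 5
sig = np.concatenate((np.zeros(delay), ref[:-delay]))
tau = gcc_phat(sig, ref, fs=16000)
```

The PHAT weighting whitens the cross-spectrum so that the correlation peak depends on phase alone, which is generally what makes this family of methods robust to reverberation.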

Expert Systems with Applications