In most microphone array applications, sound sources must be localized in noisy and reverberant environments. Among the many Sound Source Localization (SSL) algorithms proposed for this purpose, SRP-PHAT (Steered Response Power using the Phase Transform) is regarded as one of the state-of-the-art methods. Its original formulation admits two practical implementations: one computed in the frequency domain (FDSP) and another in the time domain (TDSP), the latter of which can be enhanced by interpolation. The main drawback of the algorithm is its high computational cost, caused by the intensive grid scan performed in search of the sound source. Given the suitability of GPUs (Graphics Processing Units) for massively parallel, compute-intensive algorithms, we present two highly scalable GPU-based versions of SRP-PHAT, one for each formulation, together with a GPU implementation of cubic spline interpolation. These approaches exploit the parallelism inherent in SRP-PHAT, allowing real-time execution for large search grids. Compared with traditional multi-threaded CPU implementations, our GPU approaches achieve speedups of 275× for the FDSP and 70× for the TDSP with interpolation, when comparing high-end GPUs to high-end CPUs.