(5min read)

A few weeks ago, I was developing a mobile application that needed offline video streaming between Android devices. I was not expecting such a journey. Despite all the information you can find on the web, it was not trivial and the solution was not obvious. The goal of this article is to share my experience of offline video streaming on Android, using Wireless Display (WiDi), to help you on your own Android development journey. I assume you have tried, are trying, or have already implemented a solution for it. Hopefully it will be useful – do not hesitate to tell me if that is the case!

First trial: basic WiDi implementation using sockets

I followed the Android developer guide to connect two phones with WiDi. I then naively thought I could stream video from one phone to the other over the local peer-to-peer network, piping MediaRecorder on one side into MediaPlayer on the other. I did not know that video streaming requires specific protocols to work correctly: after a few hours of research, you understand that it is because a live stream is not “seekable”. Between the black screen and the obscure errors of MediaPlayer, I got to know the RTP and RTSP protocols.
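To illustrate, here is a minimal sketch of that naive emitter side, assuming an already-connected Socket from the Wi-Fi Direct group (the function and variable names are mine, not from a real project). It records, but the receiver’s MediaPlayer cannot play the result:

import android.media.MediaRecorder
import android.os.ParcelFileDescriptor
import java.net.Socket

// Naive emitter: pipe the MediaRecorder output straight into the peer socket.
// This is exactly what fails: the resulting live stream is not "seekable",
// so MediaPlayer on the other device ends up showing a black screen.
fun startNaiveStreaming(socket: Socket): MediaRecorder {
    val recorder = MediaRecorder()
    recorder.setVideoSource(MediaRecorder.VideoSource.DEFAULT)
    recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4)
    recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H264)
    recorder.setOutputFile(ParcelFileDescriptor.fromSocket(socket).fileDescriptor)
    recorder.prepare()
    recorder.start()
    return recorder
}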

Second trial: RTP/RTSP protocol with libstreaming

RTSP stands for Real Time Streaming Protocol. A famous library on GitHub, “libstreaming”, seemed to do exactly what I wanted, and the implementation was very simple. There are nevertheless a few things you need to be careful about:

  • The video won’t stream if you don’t call startPreview() on the emitter’s session
  • RTSP won’t work if you call setDestinationAddress() on the session
  • You need to stream from the right local IP! When using WiDi, there is always a group owner (a kind of master/slave relationship between the devices). If device A is the group owner, device B needs to stream from the group owner’s IP (easy to get in the WifiP2pManager.ConnectionInfoListener). When device B streams to device A, however, it gets a bit more complicated: device A, as the group owner, needs to know in advance the IP of its client B, and for that you need to set up a basic handshake thread, as shown below.
// In the method setting your connected state
if (info.isGroupOwner) createServerThread(info) else createClientThread(info)

private fun createServerThread(info: WifiP2pInfo) {
    // Your logic to store info.groupOwnerAddress.hostAddress
    // Your logic to store isGroupOwner = true
    thread {
        val serverSocket = ServerSocket()
        try {
            serverSocket.reuseAddress = true
            serverSocket.bind(InetSocketAddress(PORT))
            Log.i(TAG, "ServerThread waiting for incoming client...")
            // accept() blocks until a client connects
            val client = serverSocket.accept()
            val clientIP = client.inetAddress.toString()
            Log.i(TAG, "Client connected : $clientIP")
            // toString() prefixes the address with "/", so strip it before storing
            val strippedClientIP = clientIP.replace("/", "")
            // Your logic to store strippedClientIP
            client.close()
        } catch (e: IOException) {
            Log.e(TAG, e.message, e)
        } finally {
            serverSocket.close()
        }
    }
}
private fun createClientThread(info: WifiP2pInfo) {
    // Your logic to store info.groupOwnerAddress.hostAddress
    // Your logic to store isGroupOwner = false
    thread {
        try {
            var retry = true
            Log.i(TAG, "ClientThread trying to connect...")
            while (retry) {
                val clientSocket = Socket()
                try {
                    clientSocket.connect(InetSocketAddress(info.groupOwnerAddress, PORT), 10000)
                } catch (e: IOException) {
                    // Covers ConnectException and SocketTimeoutException: the group
                    // owner may not be listening yet, so wait a bit and retry
                    clientSocket.close()
                    Thread.sleep(1000)
                    continue
                }
                // localAddress is this device's own IP inside the P2P group
                val clientIP = clientSocket.localAddress.toString()
                Log.i(TAG, "Local client : $clientIP")
                val strippedClientIP = clientIP.replace("/", "")
                // Your logic to store strippedClientIP
                var stream: DataOutputStream? = null
                try {
                    // The content does not matter: the goal is only to let the
                    // group owner accept() the connection and read our IP
                    stream = DataOutputStream(clientSocket.getOutputStream())
                    stream.writeUTF("random_string")
                } catch (e: Exception) {
                    Log.e(TAG, e.message, e)
                    continue
                } finally {
                    stream?.close()
                    clientSocket.close()
                }
                retry = false
            }
        } catch (e: Exception) {
            Log.e(TAG, e.message, e)
        }
    }
}

With this code, you are now prepared to stream in both directions, depending on whether your device is the group owner or not.
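To make the IP logic concrete, here is a hypothetical helper (the function name is mine, and 8086 is libstreaming’s default RTSP port, so adjust it to your setup): the device that plays the stream always opens the RTSP URL of the other device.

// Hypothetical helper: build the RTSP URL the receiving device should open.
// If we are the group owner, the peer is the client whose IP we learned in
// createServerThread(); otherwise the peer is the group owner itself.
fun rtspUrlForPeer(isGroupOwner: Boolean, groupOwnerIp: String, clientIp: String): String {
    val peerIp = if (isGroupOwner) clientIp else groupOwnerIp
    return "rtsp://$peerIp:8086" // 8086 is libstreaming's default RtspServer port
}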

Nevertheless, I encountered two unpleasant surprises:

  1. The quality was not good enough. If I increased the resolution, I got a black screen (maybe because there was not enough bandwidth?)
  2. The camera feed was rotated. The camera sensor natively captures in landscape, and the orientation metadata (EXIF) normally tells the receiver how to rotate the image. When streaming over RTSP, no EXIF information is sent, which means there was no way to use libstreaming without modifying it.

After a few hours of digging into it, I gave up on this approach and started a new one, inspired by iOS Multipeer Connectivity.

Third trial: Android Nearby

When I discovered Android Nearby, Google’s (not so) new connectivity framework, which greatly simplifies communication and automatically uses WiDi (or falls back to Bluetooth when WiFi is disabled), I thought:

Yeah that’s it, that’s the way!

Moreover, I was very confident when I saw, in a Google sample on GitHub, a working implementation of a walkie-talkie. Of course, I searched for someone who had done it before with video, and the only thing I could find was someone who had tried it without success. I tried to convince myself that maybe it was still “new” and not enough people had tried it, so I dove into implementing video streaming with Google Nearby.

And it was… a big mistake!

After some inconclusive trials, I realised that Google Nearby caps the payloads you can send as raw bytes at 32 KB, and that a video is not “seekable” as an InputStream either: the same mistake as in the first trial! I then thought about sending one image after another; at 30 FPS it would look like a video. But, hey, remember? 32 KB. That is not enough to send a decent-quality picture 30 times per second. I guess this limitation comes from the fact that Android Nearby is also supposed to work properly over Bluetooth, which has a much lower bandwidth than WiFi.
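For reference, here is a minimal sketch of the per-frame idea with the Nearby Connections API (connectionsClient and endpointId are assumed to be already set up; the size check is my addition): every frame has to fit into a single Bytes payload, which is exactly where the 32 KB cap hurts.

import com.google.android.gms.nearby.connection.ConnectionsClient
import com.google.android.gms.nearby.connection.Payload

// Sketch: send one already JPEG-encoded frame as a Bytes payload.
// Payload.fromBytes() is limited to roughly 32 KB, so a decent-quality
// frame simply does not fit.
fun sendFrame(connectionsClient: ConnectionsClient, endpointId: String, jpegFrame: ByteArray) {
    require(jpegFrame.size <= 32 * 1024) { "Frame too large for a single Bytes payload" }
    connectionsClient.sendPayload(endpointId, Payload.fromBytes(jpegFrame))
}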

Final (and successful) trial: WiDi and picture by picture

With all the experience gained in my previous trials, I was finally prepared for the ultimate one. At first I did not want to take this approach, because it cannot carry audio. But image quality and speed were the major prerequisites for the Android app I was building.

I used a threaded socket connection and adapted some seven-year-old code using zip compression (Deflater/Inflater) and a LinkedBlockingQueue buffering system to send pictures one by one to the other device (a sketch of the capture and compression side follows the list below). There were a few bugs to correct, for instance:

  • the unzipping (inflating the compressed data) on the receiving side;
  • the need to send the rotation along with each frame, because camera orientation differs according to the smartphone model and Android version in use.
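Here is a hedged sketch of what the sending side can look like, assuming NV21 frames coming from a camera preview callback (the Frame class, queue names and buffer sizes are my assumptions, matching the fields used in the snippets below):

import java.util.concurrent.LinkedBlockingQueue
import java.util.zip.Deflater

// Simple frame holder matching the fields used in the server snippet below
class Frame(maxSize: Int) {
    val data = ByteArray(maxSize)
    var uncompressedSize = 0
    var compressedSize = 0
    var rotation = 0
}

// Pool of reusable frame buffers and queue of frames waiting to be sent
val freeFrames = LinkedBlockingQueue<Frame>()
val filledFrames = LinkedBlockingQueue<Frame>()

// Called for each camera preview frame (e.g. from onPreviewFrame);
// assumes frame.data is large enough to hold the compressed frame
fun enqueueFrame(nv21: ByteArray, rotation: Int) {
    val frame = freeFrames.poll() ?: return // drop the frame if no buffer is free
    val deflater = Deflater(Deflater.BEST_SPEED)
    deflater.setInput(nv21)
    deflater.finish()
    frame.uncompressedSize = nv21.size
    frame.compressedSize = deflater.deflate(frame.data)
    frame.rotation = rotation
    deflater.end()
    filledFrames.add(frame)
}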

That was it: the end of this rich, instructive development journey!

In the server thread, you should have something like this:

// Inside the sending loop: "frame" typically comes from the queue of filled frames
try {
    Log.i(TAG, "Writing out uncompressed ${frame.uncompressedSize} compressed ${frame.compressedSize} of width $width and height $height")
    dataOutputStream.writeInt(frame.uncompressedSize)
    dataOutputStream.writeInt(frame.compressedSize)
    dataOutputStream.writeInt(width)
    dataOutputStream.writeInt(height)
    dataOutputStream.writeInt(frame.rotation)
    dataOutputStream.write(frame.data, 0, frame.compressedSize)
} catch (e: IOException) {
    Log.e(TAG, "Error writing to output stream", e)
    break
} finally {
    // Return the buffer to the pool so it can be reused for the next frame
    freeFrames.add(frame)
}

And in the client thread:

val uncompressed: Int
val compressed: Int
val width: Int
val height: Int
val rotation: Int
try {
    uncompressed = dataInputStream.readInt()
    compressed = dataInputStream.readInt()
    width = dataInputStream.readInt()
    height = dataInputStream.readInt()
    rotation = dataInputStream.readInt()
} catch (e: Exception) {
    Log.e(TAG, "Error reading header information", e)
    break
}

Log.i(TAG, "Receiving uncompressed $uncompressed compressed $compressed of width $width and height $height")
if (hasError(uncompressed, width, height)) break

val output = ByteArray(uncompressed)
try {
    // buffer can also be a reusable, preallocated array
    val buffer = ByteArray(compressed)
    dataInputStream.readFully(buffer, 0, compressed)
    val inflater = Inflater()
    inflater.setInput(buffer, 0, compressed)
    inflater.inflate(output, 0, uncompressed)
    inflater.end()
} catch (e: Exception) {
    // readFully throws IOException, inflate throws DataFormatException
    Log.e(TAG, "Failed reading stream!", e)
    break
}
try {
    val bitmap = processImage(output, width, height, rotation)
    EventBus.getDefault().post(LivePictureEvent(bitmap))
} catch (e: InterruptedException) {
    e.printStackTrace()
} catch (e: IllegalArgumentException) {
    Log.e(TAG, "Error processing image")
    break
}
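processImage() is left to your own implementation; here is a hedged sketch of what it can look like, assuming the frames are raw NV21 preview data (the JPEG round-trip and quality value are my choices, not necessarily the original code):

import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.graphics.ImageFormat
import android.graphics.Matrix
import android.graphics.Rect
import android.graphics.YuvImage
import java.io.ByteArrayOutputStream

// Sketch: convert an NV21 frame to a Bitmap and apply the sender's rotation
fun processImage(nv21: ByteArray, width: Int, height: Int, rotation: Int): Bitmap {
    val jpegStream = ByteArrayOutputStream()
    YuvImage(nv21, ImageFormat.NV21, width, height, null)
        .compressToJpeg(Rect(0, 0, width, height), 80, jpegStream)
    val jpegBytes = jpegStream.toByteArray()
    val bitmap = BitmapFactory.decodeByteArray(jpegBytes, 0, jpegBytes.size)
    if (rotation == 0) return bitmap
    val matrix = Matrix().apply { postRotate(rotation.toFloat()) }
    return Bitmap.createBitmap(bitmap, 0, 0, bitmap.width, bitmap.height, matrix, true)
}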

How about you? Did you manage to do it in a less verbose way?