
Tristan Elliott


Using Android to stream to Twitch. Part 2. RTMP handshake

Table of contents

  1. Warning
  2. Goal of this series
  3. Steps in this series
  4. What is RTMP?
  5. The RTMP handshake
  6. Creating the handshake code (THE ACTUAL CODE)
  7. The lengthy explanation

My app on the Google play store

My app's GitHub code

Resources

Warning

  • THIS IS NOT A BEGINNER'S TUTORIAL. This blog series falls under intermediate/advanced tutorials. I say this not to discourage anyone from reading, but simply to let you know that you may run into topics that seem complicated.

The Goal of this series

  • As the title states, this entire series will be about how to get the video from our Android device to stream on Twitch.

The steps you should take

  • 1) Get a preview working in your application
  • 2) Allow your application to capture video
  • 3) Create a secure socket to connect to the Twitch ingest servers
  • 4) Perform the RTMP handshake (what this blog post covers)
  • 5) Encode the video from the device (very hard)
  • 6) Send the encoded data to the Twitch ingest server via the socket

What is RTMP and why are we using it?

  • According to the RTMP specification, "Real Time Messaging Protocol (RTMP) provides a bidirectional message multiplex service over a reliable stream transport, such as TCP [RFC0793], intended to carry parallel streams of video, audio, and data messages, with associated timing information, between a pair of communicating peers. Implementations typically assign different priorities to different classes of messages, which can affect the order in which messages are enqueued to the underlying stream transport when transport capacity is constrained." That is really just nerd speak for: RTMP lets us send audio and video over the internet.

  • We are using RTMP because the Twitch documentation gives the ingest URL as rtmp://<ingest-server>/app/<stream-key>[?bandwidthtest=true], which uses the RTMP protocol. So once we have a secure socket (see the previous post on how to create one), we can initialize the RTMP connection.
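
  • To make the URL format concrete, here is a minimal sketch that assembles it from its parts. The function name buildIngestUrl and its parameters are my own assumptions for illustration; only the URL shape comes from the Twitch documentation quoted above.

```kotlin
// Hypothetical helper: assembles an ingest URL in the format quoted from the Twitch docs.
// ingestServer and streamKey are placeholders that you supply yourself.
fun buildIngestUrl(
    ingestServer: String,
    streamKey: String,
    bandwidthTest: Boolean = false
): String {
    require(ingestServer.isNotBlank()) { "ingestServer must not be blank" }
    require(streamKey.isNotBlank()) { "streamKey must not be blank" }
    // The bandwidth test flag is the optional part shown in square brackets.
    val suffix = if (bandwidthTest) "?bandwidthtest=true" else ""
    return "rtmp://$ingestServer/app/$streamKey$suffix"
}
```

  • Calling buildIngestUrl("ingest.example.com", "my_key") would give rtmp://ingest.example.com/app/my_key, where both arguments are made-up values.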

The RTMP handshake

  • RTMP documentation
  • Full warning, we are about to get into literal bits and bytes here. So buckle in and let's create an RTMP handshake.
  • The RTMP connection begins with a handshake, which is just an exchange of data between the client (our Android app) and the server to make sure they both understand what they are doing.

The actual code

  • First I will show you the code, and then I will try to explain it:
private suspend fun performRtmpHandshake() {
        withContext(Dispatchers.IO) {
            try {

                val timestamp = System.currentTimeMillis().toInt()
                val randomData = ByteArray(1528).apply { Random().nextBytes(this) }

                // Build C0 + C1
                val handshake = ByteArray(1537).apply {
                    //C0
                    this[0] = 3 // RTMP version

                    //C1
                    // Copy timestamp (4 bytes) directly
                    val timestampBytes = ByteBuffer.allocate(4).putInt(timestamp).array()
                    this[1] = timestampBytes[0]
                    this[2] = timestampBytes[1]
                    this[3] = timestampBytes[2]
                    this[4] = timestampBytes[3]

                    // Copy 4 zero bytes directly
                    this[5] = 0
                    this[6] = 0
                    this[7] = 0
                    this[8] = 0

                    // Copy randomData (1528 bytes) directly
                    for (i in randomData.indices) {
                        this[9 + i] = randomData[i]
                    }
                }

                // Send C0 + C1
                val outputStream = sslSocket.getOutputStream()
                outputStream.write(handshake)
                outputStream.flush()

                // Read S0 + S1. A single read() can return fewer bytes than requested,
                // so readFully is used to block until all 1537 bytes have arrived
                val inputStream = java.io.DataInputStream(sslSocket.getInputStream())
                val response = ByteArray(1537)
                inputStream.readFully(response)
                if (response[0] != 3.toByte()) {
                    throw IllegalStateException("Invalid RTMP handshake version from server")
                }

                val s1 = response.copyOfRange(1, 1537)

                // Build C2
                val c2 = ByteArray(1536).apply {
                    // Copy the first 4 bytes of S1 (timestamp)
                    this[0] = s1[0]
                    this[1] = s1[1]
                    this[2] = s1[2]
                    this[3] = s1[3]

                    // Copy the current timestamp (the time S1 was read) into the next 4 bytes
                    val currentTimestamp = ByteBuffer.allocate(4).putInt(System.currentTimeMillis().toInt()).array()
                    this[4] = currentTimestamp[0]
                    this[5] = currentTimestamp[1]
                    this[6] = currentTimestamp[2]
                    this[7] = currentTimestamp[3]

                    // Copy the random data from S1 (starting at index 8)
                    for (i in 8 until 1536) {
                        this[i] = s1[i]
                    }
                }

                // Send C2
                outputStream.write(c2)
                outputStream.flush()

                // Read S2 (again waiting for all 1536 bytes)
                val s2 = ByteArray(1536)
                inputStream.readFully(s2)

                Log.i(TAG, "RTMP handshake successful")
            } catch (e: Exception) {
                Log.e(TAG, "Handshake failed: ${e.message}", e)
            }
        }
    }



The lengthy explanation

  • The logic of the handshake goes like this: we send a chunk of data (C0 + C1), wait for a chunk of data (S0 + S1), send another chunk (C2), and then wait for a final chunk (S2). Once we have received that final chunk, the handshake is complete.
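
  • The send, wait, send, wait order can be sketched against plain streams, which makes it possible to try the flow without a real socket. This is only a sketch under my own assumptions: the names rtmpHandshake, RTMP_VERSION and HANDSHAKE_SIZE are hypothetical, and it is not the exact function from my app.

```kotlin
import java.io.DataInputStream
import java.io.InputStream
import java.io.OutputStream
import java.nio.ByteBuffer
import java.util.Random

val RTMP_VERSION: Byte = 3
val HANDSHAKE_SIZE = 1536 // size of C1, S1, C2 and S2

// Hypothetical stream-based variant of the handshake, so it can run without a socket.
fun rtmpHandshake(input: InputStream, output: OutputStream) {
    val reader = DataInputStream(input)

    // Step 1: send C0 (version) + C1 (timestamp, 4 zero bytes, random data).
    val c1 = ByteArray(HANDSHAKE_SIZE)
    Random().nextBytes(c1)
    // Overwrite the first 8 random bytes with the timestamp and the zero field.
    ByteBuffer.wrap(c1).putInt(System.currentTimeMillis().toInt()).putInt(0)
    output.write(RTMP_VERSION.toInt())
    output.write(c1)
    output.flush()

    // Step 2: wait for S0 + S1. readFully blocks until every byte has arrived.
    val s0 = reader.readByte()
    check(s0 == RTMP_VERSION) { "Unexpected RTMP version from server: $s0" }
    val s1 = ByteArray(HANDSHAKE_SIZE)
    reader.readFully(s1)

    // Step 3: send C2, an echo of S1 with our read time placed in bytes 4..7.
    val c2 = s1.copyOf()
    ByteBuffer.wrap(c2, 4, 4).putInt(System.currentTimeMillis().toInt())
    output.write(c2)
    output.flush()

    // Step 4: wait for S2. Once it has arrived the handshake is complete.
    reader.readFully(ByteArray(HANDSHAKE_SIZE))
}
```

  • With a real connection you would pass the socket's input and output streams. In a test you can pass a ByteArrayInputStream holding fake server bytes and a ByteArrayOutputStream to capture what the client sent.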

  • The ByteArray(1537) is how we transport the data over the socket. It consists of bytes (octets, for you hard-core nerds), each containing 8 bits. The size of this byte array is very specific: the documentation states that we need 1 byte for the version (C0) and 1536 bytes for all the other data (C1).

  • As you can see from the first section of the handshake:

val randomData = ByteArray(1528).apply { Random().nextBytes(this) }

                // Build C0 + C1
                val handshake = ByteArray(1537).apply {
                    //C0
                    this[0] = 3 // RTMP version

                    //C1
                    // Copy timestamp (4 bytes) directly
                    val timestampBytes = ByteBuffer.allocate(4).putInt(timestamp).array()
                    this[1] = timestampBytes[0]
                    this[2] = timestampBytes[1]
                    this[3] = timestampBytes[2]
                    this[4] = timestampBytes[3]

                    // Copy 4 zero bytes directly
                    this[5] = 0
                    this[6] = 0
                    this[7] = 0
                    this[8] = 0

                    // Copy randomData (1528 bytes) directly
                    for (i in randomData.indices) {
                        this[9 + i] = randomData[i]
                    }
                }
  • The code //C0 this[0] = 3 might seem a little strange, but the C and S simply stand for client and server. this[0] = 3 sets the first byte to 3. Remember that a byte is 8 bits, so it can represent 0 to 255 when unsigned or -128 to 127 when signed (Kotlin's Byte is signed). The value 3 tells the server which version of RTMP we want to use. You can read more about that in the RTMP specification.
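
  • Since Kotlin's Byte is signed, it can help to see the signed and unsigned views side by side. The two helper names below are hypothetical and exist only for this illustration.

```kotlin
// Kotlin's Byte is signed (-128..127); masking with 0xFF gives the unsigned value (0..255).
fun unsignedValue(b: Byte): Int = b.toInt() and 0xFF

// Hypothetical helper: checks that a C0 or S0 byte carries the RTMP version used in this post.
fun isSupportedVersion(versionByte: Byte): Boolean = unsignedValue(versionByte) == 3
```

  • For small values like 3 the signed and unsigned views are identical, which is why comparing against 3.toByte() in the handshake code works without any masking.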

  • Now we can talk about the time stamping:

// Copy timestamp (4 bytes) directly
                    val timestamp = System.currentTimeMillis().toInt()
                    val timestampBytes = ByteBuffer.allocate(4).putInt(timestamp).array()
                    this[1] = timestampBytes[0]
                    this[2] = timestampBytes[1]
                    this[3] = timestampBytes[2]
                    this[4] = timestampBytes[3]

  • According to the documentation, we are given 4 bytes (32 bits) to represent our timestamp. The specification says this value should be used as the epoch for all future chunks sent from this endpoint, which is what later lets messages (or chunks) be ordered and synchronized between streams. Technically speaking it can be 0 or any arbitrary number. ByteBuffer.allocate(4).putInt(timestamp).array() allocates 4 bytes and places our timestamp into them. Again, 4 bytes might seem like a weirdly specific number, but it is simply the size the RTMP specification fixes for this field. ByteBuffer writes in big-endian order by default, so timestampBytes[0] holds the most significant byte and timestampBytes[3] the least significant.
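
  • To see which byte lands where, here is a small sketch that converts an Int to 4 bytes and back. The function names timestampToBytes and bytesToTimestamp are made up for this example.

```kotlin
import java.nio.ByteBuffer

// Encodes an Int as 4 bytes, most significant byte first (ByteBuffer's default order).
fun timestampToBytes(timestamp: Int): ByteArray =
    ByteBuffer.allocate(4).putInt(timestamp).array()

// Decodes the 4 bytes back into the original Int.
fun bytesToTimestamp(bytes: ByteArray): Int {
    require(bytes.size == 4) { "expected exactly 4 bytes" }
    return ByteBuffer.wrap(bytes).getInt()
}
```

  • For example, timestampToBytes(0x01020304) yields the bytes 1, 2, 3, 4 in that order. Keep in mind that System.currentTimeMillis().toInt() keeps only the low 32 bits of the clock, so the value can wrap around or be negative. That is fine here because the field may hold any value.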

  • The weird zeros:

// Copy 4 zero bytes directly
                    this[5] = 0
                    this[6] = 0
                    this[7] = 0
                    this[8] = 0

  • These are four bytes that the specification requires to be all zeros. They act like padding, keeping the structure consistent and marking a boundary between the timestamp and the random data.

  • The next value is the strange one, the random data:

// Copy randomData (1528 bytes) directly
                    for (i in randomData.indices) {
                        this[9 + i] = randomData[i]
                    }

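  • Once again, the documentation tells us to fill the remaining 1528 bytes with arbitrary data. According to the specification, each endpoint uses this data to tell the response to its own handshake apart from a handshake initiated by its peer, so it should be reasonably random, but it does not need to be cryptographically secure.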
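
  • Putting the four fields together, the same packet can be built with copyInto instead of manual index loops. This is an alternative sketch, not the code from my app, and the constant and function names are assumptions made for the example.

```kotlin
import java.util.Random

// Offsets inside the combined C0 + C1 packet, as described above.
val VERSION_OFFSET = 0 // C0: 1 byte
val TIME_OFFSET = 1    // C1: 4-byte timestamp
val ZERO_OFFSET = 5    // C1: 4 zero bytes
val RANDOM_OFFSET = 9  // C1: 1528 random bytes
val C0_C1_SIZE = 1537

// Hypothetical helper that assembles C0 + C1 from an encoded timestamp and random data.
fun buildC0C1(timestampBytes: ByteArray, randomData: ByteArray): ByteArray {
    require(timestampBytes.size == 4) { "timestamp must be 4 bytes" }
    require(randomData.size == 1528) { "random data must be 1528 bytes" }
    // A new ByteArray starts out as all zeros, so the zero field is already in place.
    val packet = ByteArray(C0_C1_SIZE)
    packet[VERSION_OFFSET] = 3
    timestampBytes.copyInto(packet, destinationOffset = TIME_OFFSET)
    randomData.copyInto(packet, destinationOffset = RANDOM_OFFSET)
    return packet
}

// Fills the random field. java.util.Random is enough since no cryptographic strength is needed.
fun newRandomField(): ByteArray = ByteArray(1528).also { Random().nextBytes(it) }
```

  • Naming the offsets makes it easier to check the layout against the specification: 1 + 4 + 4 + 1528 = 1537 bytes.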

  • Now we have to send the data to the server and wait for a reply:

// Read S0 + S1 (readFully waits until all 1537 bytes have arrived)
                val inputStream = java.io.DataInputStream(sslSocket.getInputStream())
                val response = ByteArray(1537)
                inputStream.readFully(response)
                if (response[0] != 3.toByte()) {
                    throw IllegalStateException("Invalid RTMP handshake version from server")
                }

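  • One thing to watch when reading from a socket: InputStream.read(buffer) is allowed to return before the buffer is full, so a single call may give you only part of S0 + S1. A minimal sketch of a loop that keeps reading until the buffer is full (readExactly is a hypothetical name; java.io.DataInputStream.readFully does the same job):

```kotlin
import java.io.EOFException
import java.io.InputStream

// Reads exactly buffer.size bytes, looping because read() may deliver only part of them.
fun readExactly(input: InputStream, buffer: ByteArray) {
    var offset = 0
    while (offset < buffer.size) {
        // read() returns how many bytes it stored, or -1 once the stream has ended.
        val count = input.read(buffer, offset, buffer.size - offset)
        if (count < 0) {
            throw EOFException("Stream ended after $offset of ${buffer.size} bytes")
        }
        offset += count
    }
}
```

  • Throwing when the stream ends early means a dropped connection shows up as a failed handshake rather than as a half-filled buffer.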

Rinse and repeat

  • Then we just follow the documentation and do much the same thing again: build C2 by echoing S1 back (with the time we read S1 placed in bytes 4 to 7), send it, and read S2. Once S2 is returned, we know that the RTMP handshake is complete!
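
  • As a sketch of that last round trip, here is C2 built from S1, plus an optional check that S2 echoes the random data we sent in C1. The specification describes S2 as carrying that echo, but I have not verified that every server follows it exactly, so treat the check as an assumption. Both function names are hypothetical.

```kotlin
import java.nio.ByteBuffer

// Builds C2 from S1: S1's timestamp, then the time we read S1, then S1's random bytes.
fun buildC2(s1: ByteArray, readTimeMillis: Int): ByteArray {
    require(s1.size == 1536) { "S1 must be 1536 bytes" }
    val c2 = s1.copyOf() // copy, so the original S1 stays untouched
    ByteBuffer.wrap(c2, 4, 4).putInt(readTimeMillis) // overwrite bytes 4..7 only
    return c2
}

// Hypothetical check: does S2 echo the random field (bytes 8..1535) that we sent in C1?
fun s2EchoesC1(c1: ByteArray, s2: ByteArray): Boolean {
    if (c1.size != 1536 || s2.size != 1536) return false
    return c1.copyOfRange(8, 1536).contentEquals(s2.copyOfRange(8, 1536))
}
```

  • To use s2EchoesC1 you would need to keep C1 (bytes 1 to 1536 of the first packet) around until S2 arrives.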

Conclusion

  • Thank you for taking the time out of your day to read this blog post of mine. If you have any questions or concerns please comment below or reach out to me on Twitter.
