Using WebSocket on Android

This post is the sequel to this post about WebSockets.

I know what you are thinking.

But why?

Using WebSockets is a really useful way of building realtime applications, because they provide a fast, full-duplex communication channel between the server and the client. There are many libraries out there, usually providing straightforward, easy-to-use APIs for developers (and more will come…). WebSocket itself is a TCP-based protocol that starts with an HTTP/1.1 Upgrade request during the handshake. One really useful consequence is that it can communicate through networks that only allow web traffic, behind firewalls and/or deep NAT (like common WiFi networks and 3G/LTE systems…).
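
To make the handshake a bit more concrete: during the HTTP/1.1 upgrade the client sends a random Sec-WebSocket-Key header, and the server proves it understood the request by answering with a Sec-WebSocket-Accept header derived from that key. Here is a minimal sketch of that derivation as defined in RFC 6455. The class and method names are my own, made up for illustration; libraries like OkHttp do this for you.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative helper only; not part of any library mentioned in this post.
final class HandshakeSketch {
    // Fixed GUID that RFC 6455 appends to the client's key.
    private static final String WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Returns the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key:
    // Base64(SHA-1(key + GUID)).
    static String acceptFor(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                    (secWebSocketKey + WS_GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

With the sample key from the RFC, `HandshakeSketch.acceptFor("dGhlIHNhbXBsZSBub25jZQ==")` gives `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. Because the exchange looks like ordinary HTTP until this point, it usually passes through the same proxies and firewalls as normal web traffic.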

WebSocket on Android

Like I said, there are lots of libraries that can be used on Android. My two personal favourites were NV-WebSocket-Client and, later, OkHttp’s built-in WebSocket interface. The first one has tons of callbacks, which makes it easier to handle all the events that can occur during communication, and it has a really good API for setting connections up properly. OkHttp’s WebSocket has a more abstract API, but you can get almost every function of the previous one out of it. The reason I prefer OkHttp’s WS is that it’s now part of the OkHttp library, so you won’t need to include another library for sockets if you already have a REST interface in your app. Furthermore, it’s scheduled to be part of the Retrofit library in the future.

Building a WebSocket based App (OkHttp)

First of all, WebSocket communication should run on a separate background thread or in a Service, like any other IO process. Otherwise Android will complain about IO on the main thread (network calls there throw a NetworkOnMainThreadException) and the application will crash. For this example I will use a lightweight RxJava wrapper.

Building up the WebSocket connection

It’s hard to believe that the first and last lines will do everything to establish a connection. Below are the classes invoked after the constructor and connect() function call.

So let’s skip the RxJava-related parts, and see what the most important things are. We need an OkHttpClient to provide us a WebSocket instance, and we need a Request, or at least an endpoint URL, to which we want to connect. These are the parameters which are created or assigned in the constructors above. The next stop is where we get our WebSocket instance, which is the OkHttpClient.newWebSocket(Request request, WebSocketListener listener) call. This call returns a WebSocket instance, but we will use the ones passed to the listener’s functions. The WebSocketListener in this case is a class called WebSocketEventRouter, which forwards the listener’s callbacks to the Emitter provided by the subscribed Rx stream.
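
The routing idea can be sketched without any dependencies. The snippet below is not the wrapper’s real code: ListenerStandIn is a made-up, simplified stand-in for OkHttp’s WebSocketListener (the real callbacks also receive the WebSocket instance and use ByteString for binary data), and a plain Consumer plays the role of the Rx Emitter. It only shows how each callback gets turned into one event on a single stream.

```java
import java.util.function.Consumer;

// Hypothetical, simplified stand-in for the callbacks of OkHttp's WebSocketListener.
abstract class ListenerStandIn {
    void onOpen() {}
    void onMessage(String text) {}
    void onMessage(byte[] bytes) {}
    void onClosed(int code, String reason) {}
    void onFailure(Throwable t) {}
}

// Plays the role the post gives to WebSocketEventRouter: every callback
// is forwarded to one downstream consumer (the Emitter in the Rx version).
final class EventRouterSketch extends ListenerStandIn {
    private final Consumer<String> emitter;

    EventRouterSketch(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    @Override void onOpen() { emitter.accept("OPEN"); }
    @Override void onMessage(String text) { emitter.accept("TEXT:" + text); }
    @Override void onMessage(byte[] bytes) { emitter.accept("BINARY:" + bytes.length); }
    @Override void onClosed(int code, String reason) { emitter.accept("CLOSED:" + code); }
    @Override void onFailure(Throwable t) { emitter.accept("FAILURE:" + t.getMessage()); }
}
```

In a real app the events would be typed objects rather than strings, and the consumer would be the emitter of an Observable or Flowable, so subscribers can filter for the events they care about.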

Sending messages

As the callbacks above show, there are two onMessage events, one for text and one for binary messages. Following this analogy, we have two separate calls for sending our messages. Internally both calls forward an array of bytes through the socket, but before the actual transmission OkHttp’s WebSocket wraps the payload in a frame whose header contains an opcode indicating whether the payload is binary or text content. The same mechanism works on the receiver side: it first checks the frame header to decide how to handle the incoming bytes.
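
To illustrate what the opcode looks like on the wire, here is a heavily simplified sketch based on the framing rules of RFC 6455, not on OkHttp’s actual code. It only handles single, unmasked frames with payloads of up to 125 bytes (real client-to-server frames are additionally masked), and the class name is made up for this post.

```java
import java.nio.charset.StandardCharsets;

// Simplified illustration of WebSocket framing; not OkHttp's implementation.
final class FrameSketch {
    static final int OPCODE_TEXT = 0x1;    // payload is UTF-8 text
    static final int OPCODE_BINARY = 0x2;  // payload is raw bytes
    private static final int FIN_BIT = 0x80; // marks the final frame of a message

    static byte[] textFrame(String text) {
        return frame(OPCODE_TEXT, text.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] binaryFrame(byte[] payload) {
        return frame(OPCODE_BINARY, payload);
    }

    // Header byte 0: FIN bit + opcode. Header byte 1: payload length (mask bit unset).
    private static byte[] frame(int opcode, byte[] payload) {
        if (payload.length > 125) {
            throw new IllegalArgumentException("sketch only covers payloads up to 125 bytes");
        }
        byte[] out = new byte[2 + payload.length];
        out[0] = (byte) (FIN_BIT | opcode);
        out[1] = (byte) payload.length;
        System.arraycopy(payload, 0, out, 2, payload.length);
        return out;
    }

    // The receiver reads the low four bits of the first byte to pick a handler.
    static int opcodeOf(byte[] frame) {
        return frame[0] & 0x0F;
    }
}
```

For example, `FrameSketch.textFrame("Hello")` starts with the bytes 0x81 0x05, followed by the five bytes of the text. The same payload sent as a binary frame would start with 0x82 instead.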

And then here is how the sending mechanism is implemented for both types of messages:

The WebSocket instance used is the one that was returned most recently during an onOpen callback.

Receiving messages

As discussed before, there are two onMessage callbacks: one binary and one text based.

Depending on the message type (binary or text), only one attribute of the SocketMessageEvent instance will be assigned a non-null object.
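
A minimal sketch of such an event class could look like the one below. This is not the wrapper’s actual SocketMessageEvent: the class name, field names and factory methods are assumptions made for illustration, and a plain byte array stands in for OkHttp’s ByteString.

```java
import java.util.Objects;

// Hypothetical stand-in for the wrapper's SocketMessageEvent.
final class MessageEventSketch {
    private final String text;  // non-null only for text messages
    private final byte[] bytes; // non-null only for binary messages

    private MessageEventSketch(String text, byte[] bytes) {
        this.text = text;
        this.bytes = bytes;
    }

    // Called from the text onMessage callback.
    static MessageEventSketch ofText(String text) {
        return new MessageEventSketch(Objects.requireNonNull(text), null);
    }

    // Called from the binary onMessage callback; copies the array defensively.
    static MessageEventSketch ofBinary(byte[] bytes) {
        return new MessageEventSketch(null, Objects.requireNonNull(bytes).clone());
    }

    boolean isText() {
        return text != null;
    }

    String getText() {
        return text;
    }

    byte[] getBytes() {
        return bytes == null ? null : bytes.clone();
    }
}
```

Keeping the constructor private and exposing only the two factory methods guarantees that exactly one of the two attributes is set, so subscribers can branch on isText() safely.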

And this is another part where RxJava can be really helpful. At this point, we usually need to parse our messages from a JSON string or XML. We can simply apply a map operator or compose a Gson parser into the Rx stream. The next example will show how to parse the incoming text message into a Message object.

A more complex case is when you use some kind of parsing, but you don’t know the exact class type you need to parse into. I have seen some implementations where developers added a messageType enum to the base class and built custom parser logic, which decides the exact class type once this attribute has been read from the message. But if you decide to use WebSocket you usually want responses as fast as possible, and the previous method results in fairly complex parsing logic.

A simpler version is to define a framing logic similar to what OkHttp uses when determining the message type, and to use binary messages instead of text ones. You use the same messageType as in the previous case, but you can exclude the type from the parsing, and when the JSON string is ready you turn it into a byte array. The final step is to add a leading byte (or bytes, depending on the number of message types) according to the messageType enum. The receiving part works the same way: you use the first N bytes to determine the class, and parse the rest of the bytes as a JSON string into the picked class type. This method is simpler IMHO, and you can even eliminate the type attribute from the classes of each type, but then you need to define a static Map which gives you the mapping from byte (or bytes) to class and back.
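
Here is a minimal sketch of the type-prefix idea with a single leading byte. Everything in it is hypothetical: the class names (ChatMessage, PresenceUpdate), the tag values and the method names are made up for illustration, and the JSON itself is treated as an opaque string that you would hand to your parser of choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: one leading tag byte selects the class, the rest is JSON.
final class TypedFrameSketch {
    // Hypothetical message classes; note that they need no type attribute.
    static final class ChatMessage {}
    static final class PresenceUpdate {}

    // The static byte <-> class mapping mentioned in the text.
    private static final Map<Byte, Class<?>> TYPE_BY_TAG = new HashMap<>();
    private static final Map<Class<?>, Byte> TAG_BY_TYPE = new HashMap<>();

    static {
        register((byte) 0x01, ChatMessage.class);
        register((byte) 0x02, PresenceUpdate.class);
    }

    private static void register(byte tag, Class<?> type) {
        TYPE_BY_TAG.put(tag, type);
        TAG_BY_TYPE.put(type, tag);
    }

    // Sender side: prepend the tag byte to the UTF-8 encoded JSON.
    static byte[] encode(Class<?> type, String json) {
        Byte tag = TAG_BY_TYPE.get(type);
        if (tag == null) {
            throw new IllegalArgumentException("Unregistered type: " + type);
        }
        byte[] body = json.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[1 + body.length];
        out[0] = tag;
        System.arraycopy(body, 0, out, 1, body.length);
        return out;
    }

    // Receiver side, step 1: the first byte tells us which class to parse into.
    static Class<?> typeOf(byte[] frame) {
        if (frame.length == 0) {
            throw new IllegalArgumentException("Empty frame");
        }
        Class<?> type = TYPE_BY_TAG.get(frame[0]);
        if (type == null) {
            throw new IllegalArgumentException("Unknown tag: " + frame[0]);
        }
        return type;
    }

    // Receiver side, step 2: the remaining bytes are the JSON string.
    static String jsonOf(byte[] frame) {
        return new String(frame, 1, frame.length - 1, StandardCharsets.UTF_8);
    }
}
```

On the receiving end you would call typeOf(frame) and jsonOf(frame) and pass both to the parser, for example Gson’s fromJson(json, type). With more than 256 message types the same approach works with a wider prefix.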

Tear down the connection

Like any other resource, the socket also needs to be released. So if there were no errors, and the remote host didn’t close the connection first, we need to call the socket’s close method when we are finished.

And below you can find the code behind the method:

You can see that there’s a subscribe call in the close method. This is basically a subscription to the onClosed() callback, which is invoked when the close call on the currently open WebSocket instance has finished. When onClosed is triggered, we dispose of all the internal subscriptions so that the GC can clean up most of the internally held resources. The rest is kept for reopening the socket.

Final thoughts

WebSocket is a really useful protocol if you are developing anything that needs low latency, fast updates and a realtime connection with the backend. I hope you find my example understandable and usable. If you liked the wrapper I used, feel free to use it and/or contribute to it. The whole project can be found on our GitHub here.

This post is powered by a custom software development company, called Wanari. We aim to stay ahead of the curve when it comes to technologies and we hope to give back to the community with our tech blog, #wanarileaks. To learn more about us and stay updated about our new posts, follow us on facebook.

Tamás Agócs

Mobile Application Developer at Wanari
Agócs is our very own shepherd for the Android team. He's committed to push the team's technological knowledge to the limits and beyond....