I gave this presentation at the RailsIsrael 2016 conference. I covered the basics of all major algorithms for supervised and unsupervised learning, without a lot of math, just to give an idea of what's possible with them.
There is also a demo, with Ruby code, of Waze/Uber-style suggested-destination prediction using fast neural networks.
I spoke about migrating from Flux to Redux last Wednesday at the ReactJS Israel meetup.
The video and screencast should be published at ReactJS-IL shortly.
When I started working with React back in April 2015 there were many libraries for managing application flow. I decided to start with the classical FB Flux implementation to understand what was missing there. Eventually redux and react-redux solved most of the issues I had with Flux. This talk is about the practical aspects of migrating from Flux to Redux.
Rails Conference 2012, the first one in Israel, was a great deal of fun. Lots of
presenters, both local and from all over the world, or rather from all over the
world of Rails. There were talks from Github, Heroku, Engine Yard, Gogobot,
Get Taxi and lots and lots of others. Solid organization from Raphael Fogel and
the People and Computers guys. Hordes of interesting people to talk to, nice
and abundant food and coffee, lots of great content from the speakers, and to
sign off the day the Github guys invited everyone to an open bar.
We gave two talks: Vitaly’s “Performance: When, What and How” and Boris’ “Rails Missing
Features”. Check out the slides and videos of those talks.
Some time later there will also be a video of the talk on the devcon site.
Complex Web Applications
At Astrails we create web applications.
Mostly complex Web Applications. We usually use Ruby on Rails and
lately also Node.js.
Since 2005, when we started working on our first real web project, we have done
dozens of projects, and we have watched typical requirements change over time.
One of the growing trends in recent years is the demand for more and more
“real time” functionality.
By real time I don’t mean constant guaranteed response time or anything like
that. We are not talking about embedded systems etc.
On the web “real time” means something different. It usually means that an
application user gets new content in the application w/o refreshing the
browser, and in the case this new content is coming from another user it is
made available “close to real time”, which usually means fast enough so that
you can have a ‘chat’ for example. Typical delays will be under 1 sec.
GMail was one of the first widely known applications that popularized the
technology back in about 2004-2005. At the time almost no web apps did this.
Today, it seems like every other one does.
Let’s see how we could implement such a system.
No talk on the subject can escape the simplest technique of all: polling.
Its main advantage is that it is simple.
You send a request to the server every couple of seconds and get new data back
if it’s available, or an empty response otherwise.
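As a sketch (the names `makePoller` and `fetchUpdates` are illustrative, not from the talk), the client-side loop is just a function run on a timer:

```javascript
// Sketch of a client-side polling loop. fetchUpdates stands in for the
// AJAX call and returns null when the server has nothing new.
function makePoller(fetchUpdates, onData) {
  return function tick() {
    const data = fetchUpdates();
    if (data !== null) onData(data); // empty response: just wait for the next tick
  };
}

// Simulated server that only has a message ready on the 3rd poll.
let calls = 0;
const fetchUpdates = () => (++calls === 3 ? 'new message' : null);

const received = [];
const tick = makePoller(fetchUpdates, (d) => received.push(d));
for (let i = 0; i < 5; i++) tick(); // in a browser: setInterval(tick, 5000)
```

Note that four of the five requests in this simulation were wasted, which is exactly the resource cost discussed below.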
Often there is nothing wrong with polling. Campfire, the group chat
application from 37signals, was implemented with polling. The application
sent a request to check for new messages every 5 seconds or so.
Their “poller service” is written in C, so it can handle tons of connections
using either threads or an async network programming style.
Don’t discard polling right from the start. It might just do the job,
especially at the beginning when you are building your minimum viable
product. You can always rewrite this part of the application later once you
have lots of users. Remember that scaling problems mean you are growing, so
those are good problems to have ;)
On the other hand, polling is definitely not the best way to implement real-time updates.
It is resource intensive: it wastes bandwidth and CPU cycles on both client and server.
It also has quite a significant delay between the event and its delivery to the
client: about half the polling interval on average, and up to and above the
polling interval in the worst case.
What other options do we have?
Long polling is a variation on the simple polling technique.
The difference is that when no data is available the server does not
immediately return a negative response. Instead, the server keeps the
connection open and waits for new data to become available. Once the data
arrives it is sent as the response to the long-polling connection. Once the
client gets a response (or loses the connection) it immediately issues another
long-polling request.
Note that in this case there is no polling interval; we call the server again
immediately after we receive the data.
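A minimal sketch of the server-side bookkeeping this implies (in memory, single process; `longPoll` and `publish` are illustrative names): instead of answering “no data”, the server parks the pending response until something is published.

```javascript
// Park pending long-poll responses until data arrives.
const waiting = [];

function longPoll(respond) {
  // No data yet: hold the "connection" open by saving the response callback.
  waiting.push(respond);
}

function publish(message) {
  // New data: answer every parked request immediately.
  while (waiting.length) waiting.shift()(message);
}

let got;
longPoll((m) => { got = m; }); // client asks, nothing is available yet
publish('new message');        // data arrives, the parked request is answered
```

A real server would also answer parked requests with an empty response after a suitable timeout, to keep firewalls from discarding the idle connection.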
Long polling has advantages over regular polling:
it uses much less bandwidth and issues fewer requests, and it responds
faster. Most of the time there is a pending connection from the client, so
when new data comes in you are usually able to push it to the client immediately.
On the other hand it is more complex to get right.
It uses more RAM to keep all the persistent connections.
You need to detect disconnects. Since you are not “talking” to the client you
might not notice if the client disconnected. You will only get notified when
you try to write into the socket to send the data to the client.
You might also get a new connection from a client while you still think you
have the previous connection open.
It also has issues with firewalls and proxies.
Long polling is what we used at Astrails at the beginning of 2006 when
we had to implement a chat server with some custom application functionality.
One of the problems we had to deal with is that if you keep a
connection open with no data flowing through it, firewalls will usually discard
the connection state after some time. So the server still has to return a
negative answer to the client if no new information becomes available within
some suitable timeout. At the time, a safe timeout was around 25-30 seconds.
Another problem with LongPoll is that browsers limit the number of connections that you can keep open to the same server.
RFC 2616 (HTTP/1.1) section 8.1.4 “Practical Considerations” states that “A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.”
So if you keep one of them persistent for the purpose of LongPoll it means you
have only one connection left for the other things like issuing ajax requests
and downloading assets.
Note that these days most browsers set this limit much higher than 2, so this problem has become less relevant.
Another option is HTTP streaming: using chunked encoding to send data to the client.
The connection is kept open and is only re-opened if it disconnects.
The server can send empty/noop data “packets” to deal with firewall and proxy issues.
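For illustration, here is what the chunked-encoding framing itself looks like on the wire: each chunk is its payload length in hex, CRLF, the payload, CRLF, and a zero-length chunk terminates the stream.

```javascript
// Frame one chunk of a chunked-encoded HTTP body (ASCII payloads assumed,
// so string length equals byte length).
function encodeChunk(data) {
  return data.length.toString(16) + '\r\n' + data + '\r\n';
}

// A tiny streamed body: a noop keep-alive "packet", then a real message,
// then the terminating zero-length chunk.
const body = encodeChunk('noop') + encodeChunk('new message') + '0\r\n\r\n';
```

In practice an HTTP library does this framing for you; the point is that the server can keep appending chunks to one open response for as long as it likes.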
Yet another option is to use Flash ;)
You can open TCP connection to your server and do almost anything you want.
The problem, of course, is that Flash is not a web standard and is not universally
available. For example, it is not available on iOS devices and is not installed by default on many systems.
Also, with all the recent mac/ios/android Flash controversy it seems to be on its way out.
Which brings us to Web Sockets.
WebSockets is an emerging standard for a bi-directional, full-duplex TCP communication channel between a browser and a web server.
It is a new protocol: the connection starts as regular HTTP and is then ‘upgraded’ to a web socket connection.
Major browsers support WebSockets in their recent versions and there are fallback libraries available for older browsers that emulate websockets using flash.
Which one to use?
So, with all those options available, which one should you use?
A somewhat surprising answer is that you should use almost all of them ;)
The trick is to use a library that will choose the best method based on the browser support and network conditions.
Socket.IO is one such library. Its great strength is that it will choose the most capable transport at runtime, without it affecting the API.
Among the transports it can fall back to:
Adobe® Flash® Socket
AJAX long polling
AJAX multipart streaming
Browser support at the time included:
Internet Explorer 5.5+
Google Chrome 4+
This is an example of the server-side code. As you can see, we wait until we have a connection from a new client, send them a ‘news’ event with a “hello world” payload, and define a handler for the custom ‘my other event’ event which just dumps the payload to the log.
You can define any number of events, and the payload can be any JSON object.
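The code described above was essentially Socket.IO’s canonical example of the era; roughly (assuming the then-current Socket.IO 0.x API):

```javascript
var io = require('socket.io').listen(80);

io.sockets.on('connection', function (socket) {
  // Greet every new client with a 'news' event carrying a JSON payload.
  socket.emit('news', { hello: 'world' });
  // Handle a custom event coming back from the client.
  socket.on('my other event', function (data) {
    console.log(data);
  });
});
```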
This is an example of the client side that talks to the server from the previous slide.
We establish the connection, wait for the ‘news’ event, then send ‘my other event’ back.
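Again roughly, the client side from the slide looked like this (Socket.IO 0.x era API, with the server assumed to be on localhost):

```javascript
var socket = io.connect('http://localhost');

// Wait for the server's 'news' event, then answer with our own event.
socket.on('news', function (data) {
  console.log(data);
  socket.emit('my other event', { my: 'data' });
});
```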
As you can see, programming with Socket.IO seems very simple. The hard part is that all of this is completely asynchronous, and such code tends to be complex to write and maintain; on the other hand, there is no better alternative to async for server-push support on the client.
What about the server side?
OK, but how do you implement the server side of things?
There are basically 2 kinds of options.
First option is what I call a “Smart server”
This is what we implemented back in 2006 at Astrails when we needed a chat server that will also handle payment metering and some other application level things.
Such a server will usually perform authentication, and application level business logic. Might also implement persistence and other application related things.
You can implement the Smart Server in almost any kind of server side language.
You can implement your whole application inside your smart push server.
The problem is that such servers usually require an async programming style, which makes it harder than usual to do simple things.
So, for example, in the case of Ruby it is harder to implement your whole application inside EventMachine than to implement a regular Ruby on Rails application.
So this architecture is more frequently chosen when the runtime environment is async, like Node.js or Erlang.
In case of more traditional runtime environments like Ruby or Python another architecture is commonly used.
In this architecture we divide our application into ‘regular’ and ‘push’ parts.
The regular part is implemented using the regular web application framework like Ruby on Rails.
The ‘push’ part of the application is then implemented by an additional Push Server using one of the technologies we talked about.
In this case the client can talk directly to the Push Server or to the regular App Server, and the App Server can contact the Push Server to pass data to the clients.
Another alternative architecture is to use what I call a ‘Dumb Server’.
In this case instead of implementing smart Push Server we can use an existing generic push server that is driven by our application server. So this is a variation on the Architecture B, but w/o any kind of application specific code on the Push Server side.
In this architecture client usually talks to the application server which validates and if needed passes the information to the other clients through the push server.
Direct relay through the push server is also possible, but in this case much stronger validation is required on the client side.
There are many options available for the “dumb” push server.
One is CometD from Dojo Foundation.
There is an extensive documentation and tutorials.
The protocol on the wire is an implementation of Bayeux.
Bayeux is an attempt to standardize browser interaction with a comet server, with the intent of re-using client-side code across different comet server implementations. It tries to cover all the relevant use cases, which means it is quite bloated and complex.
It is based around channels that you can subscribe to and send/receive messages on, but there is much more to it than just that.
Nginx HTTP Push
NginX is a super-fast HTTP server from Russia. It took the internet by storm, going from unknown to a server market share of more than 10% within just a couple of years.
It has a module that implements HTTP Push.
Instead of Bayeux it implements a much simpler Basic HTTP Push Relay Protocol.
The common problem with solutions like CometD and Nginx HTTP Push is that you need to install, configure, and maintain them yourself. My personal preference is either a custom Node.js-based server if we really need a ‘smart server’, or an outsourced hosted solution for a ‘dumb server’.
Another kind of 3rd-party solution provides much more than just push services. These mostly target mobile application developers and enable creating a server-less mobile application.
They can provide things like persistence, user profiles, registration, and also some kind of server push technology, p2p messaging, etc.
Both Pusher and Pubnub APIs are centered around channels that you can subscribe to and send/receive events.
The server doesn’t have to be the one doing the publishing. In this example for PubNub we can both publish and subscribe to the channel on the client side.
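Roughly, with the PubNub JavaScript client of that era (the keys here are placeholders, and the exact API may have differed between SDK versions):

```javascript
var pubnub = PUBNUB.init({
  publish_key: 'demo',     // placeholder keys, not real credentials
  subscribe_key: 'demo'
});

// Any client can listen on a channel...
pubnub.subscribe({
  channel: 'chat',
  message: function (m) { console.log(m); }
});

// ...and any client can publish into it, with no app server involved.
pubnub.publish({
  channel: 'chat',
  message: { text: 'hello' }
});
```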
One of the differences between pusher and pubnub is how they deal with security.
With Pusher you have to implement an endpoint in your application that will be called by the Pusher service when a new client connects to a “private” channel.
PubNub’s way is to just give your channels long, unpredictable names; every client that knows the name is then authorized by definition.
One thing you will need to decide is whether to enable clients to publish directly into the channels.
You can choose to enable it, in which case you will need much stricter validation on the client, and if you want ‘special’ server messages you will have to either use separate channels for them or sign your messages on the server so that clients can verify the origin.
Another option is to disable client publishing altogether; in this case clients only talk to the app server, which validates and passes the messages to the other clients through the push server.
My personal choice is to go with a hosted “dumb server” solution like Pusher or
PubNub. I usually prefer hosted to roll-your-own solutions, as it saves me lots
of time that would otherwise be wasted on IT.
For those reasons exactly we chose Heroku as our deployment platform of choice
at Astrails. Since moving to Heroku we saved literally tens of hours that were
spent on IT across all the projects that we have to run or maintain.
Both Pusher and PubNub are available as Heroku addons, so adding them to your Ruby on Rails application is literally just a click away.
As you can see, the technology has advanced and matured enough that it is now really easy to implement real-time server push in your applications, to the point that including push in your web app is usually a no-brainer.
Many web applications we use every day can benefit from real time features sparing us repetitive browser refreshing.
I had a lot of things to do last Thursday, Feb-17. I met a friend from abroad
at 3am at Ben Gurion Airport and spent several hours talking before we went to sleep,
signed a contract for developing a killer web app at 1:30am, and finally gave a
presentation at The Junction at 4:30pm.