scalability - How scalable is distributed Erlang?
Part A:
Erlang has a lot of success stories about running concurrent agents, e.g. the millions of simultaneous Facebook chats. That's millions of agents, but of course it's not millions of CPUs across a network. I'm having trouble finding metrics on how well Erlang scales when scaling "horizontally" across a LAN/WAN.
Let's assume I have many (tens of thousands) physical nodes (running Erlang on Linux) that need to communicate and synchronize small, infrequent amounts of data across the LAN/WAN. At what point will I have communication bottlenecks, not between agents, but between physical nodes? (Or will this just work, assuming a stable network?)
Part B:
I understand (as an Erlang newbie, meaning I could be totally wrong) that Erlang nodes attempt to connect to and be aware of each other, resulting in an N^2-connection point-to-point network. Assuming Part A won't just work with N in the tens of thousands, can Erlang be configured (using out-of-the-box config or trivial boilerplate, not by writing a full implementation of grouping/routing algorithms myself) to cluster nodes into manageable groups and route system-wide messages through the cluster/group hierarchy?
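To put rough numbers on the N^2 concern, here is a minimal sketch of the arithmetic. It only counts point-to-point links in a full mesh, where every node holds a connection to every other node. The function name is made up for illustration, and nothing here measures real Erlang behaviour.

```python
def full_mesh_links(n: int) -> int:
    """Number of point-to-point connections in a full mesh of n nodes."""
    return n * (n - 1) // 2


# Each node in a full mesh also holds n - 1 connections of its own.
for n in (50, 150, 10_000):
    print(f"{n} nodes: {full_mesh_links(n)} links, {n - 1} connections per node")
```

For N = 10,000 that is 49,995,000 links in total and 9,999 open connections on every node, which is why a single flat mesh of that size looks implausible.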
We should specify that we are talking about horizontal scalability of physical machines: that is where the problem lies. The CPUs on one machine are handled by a single VM, no matter how many of them there are.
Node = machine.
To begin with, you can get 30-60 nodes out of the box (vanilla OTP installation) with any custom application written on top of it (in Erlang). Proof: ejabberd.
~100-150 is possible with an optimized custom application. That means it has to be good code, written with knowledge of GC, the characteristics of the data types, message passing, etc.
Beyond 150, numbers like 300 or 500 are doable, but they require optimizations and customizations of the TCP layer. Also, the app has to be aware of the cost of, e.g., sync calls across the cluster.
The other thing is the DB layer. Mnesia (built in), due to its features, is not effective at around 20 nodes (my experience, I may be wrong). Solution: use something else, such as Dynamo-style DBs, a separate cluster of MySQLs, HBase, etc.
The common technique for balancing the cost of creating a high-quality application against scalability is a federation of clusters of ~20-50 nodes each. Internally, each cluster is an efficient mesh of ~50 Erlang nodes, and it is connected via any suitable protocol to N other 50-node clusters. To sum up, such a system is a federation of N Erlang clusters.
Distributed Erlang is designed to run in one data center. If you need more than that, such as geographically distant nodes, use federations.
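A back-of-the-envelope comparison of a federation against one flat mesh, as a sketch only. It assumes each cluster is a full mesh internally and that every pair of clusters shares exactly one link over whatever inter-cluster protocol is chosen. That pairing scheme and the function names are assumptions made for illustration, not part of the setup described above.

```python
def mesh_links(n: int) -> int:
    """Point-to-point links in a full mesh of n nodes."""
    return n * (n - 1) // 2


def federation_links(total_nodes: int, cluster_size: int) -> tuple[int, int, int]:
    """Return (clusters, intra_cluster_links, inter_cluster_links).

    Assumption: one link per pair of clusters; a real federation may use
    fewer links (a tree or hub) or more (redundant gateways).
    """
    full, rest = divmod(total_nodes, cluster_size)
    clusters = full + (1 if rest else 0)
    intra = full * mesh_links(cluster_size) + mesh_links(rest)
    inter = mesh_links(clusters)
    return clusters, intra, inter


clusters, intra, inter = federation_links(10_000, 50)
print(f"{clusters} clusters, {intra} intra-cluster links, {inter} inter-cluster links")
print(f"flat mesh of 10,000 nodes: {mesh_links(10_000)} links")
```

Under these assumptions, 10,000 nodes split into 200 clusters of 50 need 245,000 links inside clusters plus 19,900 between clusters, versus 49,995,000 for a single mesh, and no node holds more than 49 distributed Erlang connections.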
There are lots of config options, e.g. ones that stop nodes from all connecting to each other. They may be helpful, but in a cluster of ~50 nodes the Erlang overhead is not significant. You can also create a graph of Erlang nodes using 'hidden' connections: a hidden node doesn't join the full mesh, but it also cannot benefit from being connected to all nodes.
The biggest problem I see in this kind of system is designing it as a master-less system. If you do not need that, you should be OK.