Skip to main content

Map Reduce Simplified

Yes it is about parallel and distributed computing, there are tonnes of web pages, books articles, diagrams etc. etc with nice buzz words to talk about Map Reduce, here is the most simplified explanation.

Lets take a real life example.

1. Company CEO called all Program Manager's "I need total effort spent this month by noon". Program Manager's no problem sir. Why are Program Manager's not worried because they are going to distribute task :-)

2. Each Program Manager called their project manager asking for effort spent so far.

3. Each Project Manager pulled up effort sheet and provided it to their Program Managers.

4. Program Managers complied received sheet into one file and sent it to CEO.

5. Company CEO collated all the sheets and calculated total effort spent.

Each individually broke its task to smaller tasks (Mapped its input task to smaller tasks), Program Manager was required to provide effort spent, he mapped his task to smaller tasks, this is MAP.

Program Manager's on receiving data from their project managers compiled it back to single output, this is REDUCE.

Now lets zoom out and summarize how Map Reduce applies to distributed and parallel computing. Each node distributes its task to smaller tasks(Maps its given task). Each node receive results, combine them(REDUCE) to generate required output.


Comments

Popular posts from this blog

Drools - An overview

For Java based applications the most challenging part has always been the business logic maintenance, and pick any applications which you find complex and if we ask ourself how complex it would be moving forward, the answer will always be nX times.

What do we do ? Drools comes for Rescue as a Rule Engine.

Drools provides mechanism:

a. To write business logic in simple english language
b. Easy to maintain and very simple to extend
c. Reusability of logic by defining keywords in a DSL file and using them in DSLR file.

But be careful nothing comes free, everything takes cost in terms of memory and time space.

Use Drools if you really have :

a. Business logic which you think is getting cluttered with multiple if conditions because of variety of scenarios
b. You will have growing demand of increase in the complexity
c. The business logic changes would be frequent (1 - 2 times a year would also be frequent)
d. Your server's have enough of memory as it is a memory hungary tool, it provi…

Java 8 streams performance on mathematical calculations

Java 8 Streams API supports many parallel operations to process the data, it abstracts low level multithreading logic. To test performance did following simple test to calculate factorial of first X number starting with N.

Following program calculates factorial of first 1000 numbers.

package com.java.examples; import java.math.BigInteger; import java.time.Duration; import java.time.Instant; import java.util.ArrayList; import java.util.List; publicclassMathCalculation { publicstaticvoid main(String[] args) { List<Runnable> runnables = generateRunnables(1, 1000); Instant start= Instant.now(); // Comment one of the lines below to test parallel or sequential streams runnables.parallelStream().forEach(r -> r.run()); // runnables.stream().forEach(r -> r.run()); Instant end = Instant.now(); System.out.println("Calculated in "+ Duration.between(start, end)); } privatestaticvoid factorial(int number) { int i; BigInteger fact = BigInteger.valueOf(1); for (i =1; i <= numb…

MQTT : Android step by step guide using Eclipse Paho

For MQTT integration, recently explored Paho Android project, very simple to use, here are the steps:

Intialize a client, set required options and connect.

    MqttAndroidClient mqttClient = new MqttAndroidClient(BaseApplication.getAppContext(), broker, MQTT_CLIENT_ID);
    //Set call back class
    mqttClient.setCallback(new MqttCallbackHandler(BaseApplication.getAppContext()));
    MqttConnectOptions connOpts = new MqttConnectOptions();
    IMqttToken token = mqttClient.connect(connOpts);


Subscribe to a topic.

    token.setActionCallback(new IMqttActionListener() {
      @Override
      public void onSuccess(IMqttToken arg0) {
           mqttClient.subscribe("TOPIC_NAME" + userId, 2, null, new IMqttActionListener() {
                @Override
                public void onSuccess(IMqttToken asyncActionToken) {
                    Log.d(LOG_TAG, "Successfully subscribed to topic.");
                }

                @Override
                public void onFailure…