Saturday, July 7, 2012

Beautiful Mergesort

Effective data structures and programming is about putting simple blocks of logic together until the whole is larger than the sum of the parts. A great example is Mergesort. Mergesort is a beautiful algorithm to implement sorting in NlogN time (N being the number of elements being sorted). This is the asymptotic lower bound on sorting.


Mergesort brings together the concepts of recursion, divide-and-conquer, arrays, and pointers (when implemented in C). So here is to Jon Von Neumann and his fantastic invention of Merge sort.


#include <stdio.h>
/*
Simple Mergesort implementation by Sachin Agarwal
sachinkagarwal@gmail.com


 */
void PrintList(int list[], int stIndex, int endIndex)
/*
  Helper function - Print the elements of array list, between stIndex and endIndex
*/
{
  for(; stIndex <= endIndex;stIndex++)
    printf("%d ",list[stIndex]);
}

void Merge (int dataArray[], int stIndexL, int endIndexL, int stIndexR, int endIndexR) 
/*
  Merge two lists into a tempArray in sorted order; this tempArray is copied back to dataArray. 
*/
{
  int tempArray[endIndexL-stIndexL+1+endIndexR-stIndexR+1]; //temp storage
  int stL = stIndexL;
  int resultCtr = 0;
  while (stIndexL <= endIndexL) {
    if (stIndexR>endIndexR) {
      tempArray[resultCtr++]=dataArray[stIndexL++];
      continue;
    }
    if (dataArray[stIndexL] <= dataArray[stIndexR]) {
      tempArray[resultCtr++] = dataArray[stIndexL++];
    }
    else {
      tempArray[resultCtr++] = dataArray[stIndexR++];
    }
  }
  while (stIndexR <= endIndexR) {
    tempArray[resultCtr++] = dataArray[stIndexR++];
  }
  int ii = stL;
  for (; ii<=endIndexR;ii++)
    dataArray[ii] = tempArray[ii-stL];
}

void Mergesort(int dataArray[], int stIndex,int endIndex) 
/*
  Mergesort routine, divide into two halves, and then merge.
*/
{
  int midIndex = stIndex + (endIndex-stIndex)/2;
  if(stIndex<endIndex) {
    Mergesort(dataArray,stIndex,midIndex);
    Mergesort(dataArray,midIndex+1,endIndex);
    Merge(dataArray,stIndex,midIndex,midIndex+1,endIndex);
  }    
}

int main() {
  int dataArray [] = {5,3,7,8,1,6,9,2,0};
  int stIndex = 0;
  int endIndex = sizeof(dataArray)/sizeof(int) - 1;
  printf("\nUNSORTED DATA ARRAY: ");
  PrintList(dataArray,stIndex,endIndex);

  Mergesort(dataArray,stIndex,endIndex);
  printf("\nFINAL RESULT: ");
  PrintList(dataArray,stIndex,endIndex);
}

Wednesday, May 23, 2012

Keringhan & Richie Nostalgia - circular right shifting in C

I picked up my old Keringhan and Richie copy last weekend and came across the C puzzle to do circular right shifts on unsigned integer bits.

So for example if
x = unsigned int 1 = binary 00000000000000000000000000000001

The function

unsigned int rightrot (x, 29) 

will right shift the bits 29 times, and making sure that the bits wrap around.

Here is the code I came up with. Can you think of clever ways to improve the  code it?


#include <stdio.h>

unsigned int rightrot(unsigned int data, unsigned int rotCount)
{
  unsigned int ctr = 0;
  for(ctr=0;ctr<rotCount;ctr++) {
    unsigned int lsb = data & 1;
    data >>= 1;
    if (lsb)
    data |= (lsb << sizeof(unsigned int)*8-1);
  }
  return (data);
}

unsigned int main() {
  unsigned int d = 1;
  unsigned int shift = 31; 
  printf("%d right shifted and rotated %d times is %d\n",d,shift,rightrot(d,shift));
}

Tuesday, May 15, 2012

How I fixed my flat LCD TV

I have a Samsung 40 inch LCD TV. A little over two years old and with all the works - full HD, USB, multiple HDMI inputs, etc.

Then 3 weeks ago my wife heard 2 pops. Both my satellite receiver (a separate box) and my Samsung LCD had been fried, thanks to a high voltage spike in the antenna cable. My power surge protector strip was powerless (no pun intended) against this, because that high voltage signal traveled into my satellite receiver through the antenna cable, and from there on through the HDMI cable into my TV.

No indicator power/standby light, non-responsive TV. Oh no, Check mate.

The satellite receiver was easy to fix - it was immediately replaced by the store where I had bought it (still under warranty). That is why, sometimes, paying a few bucks extra instead of buying things from the Internet pays off later.

Anyway, the elephant in the room was the broken TV. Since it was out of warranty, I didnt even bother calling Samsung - they'd take 100+ euros to just look at it, and their service center is too far anyway from where I live. The local TV repair shop put the bill at a minimum of 280 Euros - thats almost half the cost of the TV itself. Out of the question. What else could I do?

Down but not out, I started looking into fixing it on my own. Now that is risky. Why?

  1. I have no training in fixing TVs (although I am a computer and electrical engineer)
  2. TVs insides can be very dangerous (You have been warned. Don't mess with it unless you really know what you are doing)
  3. Where and how to start?
  4. How would I figure out what is broken?
On the other hand, what had I got to lose anyway? It was a paperweight if I couldn't do something about it.

So I got the Philips #2 screwdriver and took the plunge. Almost a dozen screws later, the plastic back was off and I peered into the insides of my beloved LCD TV. The first thought that came to me was -  this looks a lot like the insides of a modern desktop computer. Not at all like those old CRT-display based TVs full of myriad electronic components and circuits. And I do know how to build and fix desktops. Why should this be so different?

Inside the LCD TV


Back side of the LCD TV, after removing the controller board. The top part (side of the TV) is the power board. The connectors from the powerboard to the controller board, from the controller board to the LCD screen, and from the controller board to the speakers/front panel are also visible. 

There are 3 main components  - the screen itself, the power board (its got lots of big inductors and capacitors on it - you cant miss it), and the controller board. In addition, there are small pilot circuits that run the remote control, front panel lights, speakers etc. In my case, I immediately noticed the problem - a burned out and blacked HDMI port on the controller board. So that is where the disastrous pop sound came from.


Close-up on the controller board. This is the CPU of the TV. Notice the black soot due to the burn out next to the HDMI port in the center of the picture. A careful visual inspection of the power board and the controller board can often yield the source of the problem

I quickly checked if DC voltages were being delivered from the cable coming out of the power board in order to be confident of my thesis that the controller board was the most likely culprit. I couldn't see any easy way of fixing the controller board. The parts are minuscule and this is a double-sided printed circuit board - parts on both sides. Too intricate for human hands to manipulate. Besides, how would I ever figure out which of those chips had been fried?

The only option was to replace the whole board. Thankfully, other people seem to breaking their LCD TVs in creative ways that leave their controller boards intact to be sold as parts - for example, Nintendo Wii controllers are notorious for breaking LCD screens. And sure enough, there are Internet vendors who have made a flourishing business out of selling parts pulled from other broken TVs. I found a vendor in the UK who stocked the controller board of my TV.

Now the trick is getting the same exact part as the one you are replacing. For this you need the part number. Note that the same controller/power boards are used across multiple Samsung TVs so searching for parts based exactly on the TV model number is sub-optimal. In case of my controller board I found the part number printed on the board itself (from my TV the controller board was BN41-01167C-MP1). Pasting this part number in Google yielded several sellers who offer it. I chose FlatTVParts.co.uk  for the favorable customer reviews and testimonials. They ship worldwide. I was not disappointed by their service. The part is provided with the guarantee that it is in working condition, and I believe there is a small return period as well (I hope I don't have to avail of this!). Total cost, including international shipping, was 60 Euros.

So, I screwed in this controller board. Plugged in the connectors (these connectors are just like computer connectors.). I made sure that I put back the back plastic panel before testing (caution - high voltages). 


Then I turned it on and waited. The standby light lazily turned a beautiful red. I pressed the power button. Tuned my satellite receiver. My 8-month daughter let out a squeal when the Kika channel came up. I was home.


Sunday, April 1, 2012

Devstack - Treading Lightly into the Openstack World

Devstack is a simple way to test drive Openstack before committing too much time to setting up a production-ready Openstack installation. It allows you to  setup a toy Openstack installation in as little as one single virtual machine running on your PC. Then you can start and stop virtual machines (nova-compute), assign storage (Nova swift), play with Openstack networking or run your own images (Glance), and monitor your personal cloud  via the nice graphical interface using Openstack Horizon in a web browser.

Devstack works by executing a bash script which sets up all the software dependencies and Openstack software to run Openstack. Its primary use case is for Openstack development - if you are interested in tweaking Openstack's software then Devstack flattens the steep learning curve of setting up the test environment. However, it is also an excellent "try before you dive in" option. Compared to the free Amazon AWS trial, you will gain first hand experience of the entire IaaS cloud ecosystem - from setting it up to provisioning cloud resources for your applications. 

Here are the steps to run Devstack:

  • Install VMware or another hypervisor of your choice (e.g. Virtualbox) on your computer; you probably need a powerful computer with lots of RAM because what we are going to do is run a virtual machine on this computer and then run the Openstack software inside this VM; which in turn will spawn virtual machines inside the outer virtual machine! I recommend at least a dual core processor with 4GB of RAM. 
  • Download the Ubuntu 11.10 ISO from here. You probably don't need the desktop version; the server version should be fine. Off course this is assuming that you will run the Openstack Horizon client in a web browser of the host operating system (otherwise you need to run the web browser in the VM, which means you need X). So make sure you have networking connectivity between the host and the guess Ubuntu 11.10 virtual machine.
  • Use the downloaded ISO to create the Ubuntu 11.10 server virtual machine using the hypervisor you selected. Make sure to assign it adequate resources. A good start would be 2.5 GB of RAM, 40 GB of disk space and two cores.
  • Next follow the straightforward instructions from the Devstack webpage
  • The main Devstack script (stack.sh) spits out the url of the Openstack Horizon web server at the end of the run. If your virtual machine is (network) accessible from your host OS then you can now point your browser to the url, login (credentials: admin,password or demo,password) to play with your very own Openstack cloud!

Ok done! Now what? Well, spawn a VM or two using the Horizon GUI. If you are more enterprising go ahead and try the euca2ools command line tools (the same command line tools used to speak with AWS also work here because Openstack supports the EC2 API). Try to spawn a whole bunch of VMs until your cloud gives up (in this case, the VM running the cloud is giving up!). If you are interested in learning how Openstack is setup, then reading the stack.sh Devstack script is a great introduction to Openstack's internals. You probably know that Openstack is opensource, and its written in accessible Python code. So you can get right into Openstack development  and contribute to the project if that is your kind of thing.

A word of caution - the cloud "starts clean"  every time its host system (in our case the Ubuntu 11.10 VM) is rebooted. This means that any old VMs, images, or other configuration is cleared (that is the logical thing to do for a developer relaunching the cloud every time she changes the code base). If this is problem for you then you can simply suspend the virtual machine instead of shutting it down. But again, this is another reason to not use Devstack for a production system.

Thursday, March 8, 2012

Stingy LTE Data-plans will ruin the new iPad's Party

The new iPad  has 2048 x 1536 pixels screen resolution. The best way to understand this is that it has a higher resolution than your normal 1080p full HD TV. Let us contemplate on how much data is needed to drive this high resolution screen. Lets talk video first. The Verizon FIOS HD channels typically come in at 15Mbps. That is almost 2MB/second. Now let us switch to using the new iPad  as a navigation device in the car. The iPad 3 resolution is more than 5 times as much as an iPhone 4 (960 x 640 pixels). This means that Google maps will have to load about 5 times as much map tile data on the new iPad to fill up the screen. From what I read on the Internet, users report that the iPhone 4 burns through about 30MB of bandwidth per hour when Google maps is used for navigation. So we are talking about 30x5 = 150MB/hour of data usage on the new iPad when it is used as a navigation device. Another example: in order for jpeg images to completely fill up the new iPad's screen without any software interpolation, they are going to need to be over 3 mega pixels each. That is about 1MB of data transfer per JPEG image!

Ok, but why do you need full screen apps and content, you may ask. Can't you compromise on the size and save bandwidth? No, because the new iPad's screen size is only 9.7inches across and chances are that we use the new iPad while holding it in our hands. If this is the case, then interpolation (up-sampling to fill up the pixels with guessed values) will be really ugly at best (say for jpeg pictures), and unusable at worst (say, maps with minute features marked on them). So while its acceptable to see a standard definition channel on an HD TV from a distance, it will be unacceptable seeing it from close range on the new iPad in full screen mode.

Now you might say, thats alright because we have LTE to tame the iPad 3 bandwidth sucking machine. Upto 100Mbps if you 've heard all the marketing around LTE.

But do we have the data plans? To quote from Techcrunch, here are the LTE plans on offer in the US today.

"
AT&T is offering three plans: 250MB for $14.99, 3GB for $30, and 5GB for $50. On the 250MB plan, you’ll be charged an additional $15 for each 250MB allotment you go over. On the two bigger plans, it’s a $10 overage fee per each additional 1GB of data.


Verizon is offering four different plans: 1GB for $20, 2GB for $30, 5GB for $50, and 10GB for $80. Their overage fees are a little more straightforward — it’s an extra $10 for each 1GB over.
"

Lets plug some of these numbers into the applications we alluded to earlier in this post. Lets say, that you have opted to spend $30 per month on your new iPad's LTE connection. How far will your $30 take you?




AT&T's $30 3GB LTE plan

Verizon's  $30 2GB LTE plan

HD-quality  video 
@ 2MB/sec for HD video you get only 1500 seconds (25min) of HD video from your MONTHLY data plan.

@ 2MB/sec for HD video you get only 1000 seconds (17min) of HD video from your MONTHLY data plan.


Google maps

@150MB/hour you get only 20 hours of Google maps from your MONTHY data plan

@150MB/hour you get only 13 hours of Google maps from your MONTHY data plan


Simply put, these data-plans just won't work for the new iPad. What is really surprising is that LTE, all through its long standardization process in the last decade, always promised lower cost/bit and higher spectral efficiency (slated to be 2x-5x, meaning LTE can pack 5 times as many bits in the same frequency spectrum compared to 3G). Lots of new technology was put into the standard  to support very high speed wireless mobile broadband services (read more about this on the LTE Wikipedia page).

Where are all those technology savings  going? Granted that telcos are making big capital expenditures rolling out the LTE infrastructure, but consumers are paying LTE  patent royalties (10s of dollars) too, every time they buy LTE-enabled devices such as the new iPad.

Why are the data plans so anemic? 

Can we as consumers really hope to enjoy the benefits of high-resolution mobile technology and LTE when Telcos are cutting out all the joy even before the first new iPad is shipped?


UPDATE: Related WSJ article: Video Speed Trap Lurks in New iPad 

UPDATE: Another angle from CNN, why images look so bad on the new iPad: Why do magazines look so bad on the new iPad?

Saturday, July 16, 2011

Google+ Circles: Humanity's Social Router

I have been trying to wrap my thoughts around  the importance of Google+ circles. The circles idea is to let Google+ users organize their Google+ contacts into different circles. The idea itself is not new; Facebook has let its users organize contacts into different bins for a long time now. The FB avatar of the idea hasn't really been a killer feature; in fact, PC Magazine  published a Google+ circles obituary based on the idea's failure in FB.

Most other features of Google+ are powerful and well planned - like video calling (hang-outs), seamless integration with other Google products (Gmail and You-tube), and a good cross-platform HTML mobile app. Still, Google is touting circles as the key Google+ feature. Why does Google think circles is so important?

Lets look at how Google+ circles affects the users social networking experience. By gently forcing the user to select which circle a new contact should belong, Google+ amortizes the job of categorizing contacts. On the other hand new contacts usually end up in one big "friends" bin in FB. The categorization (or binning) has to be performed later (and this is a tedious task - at least I haven't bothered to do it until now).

With Google+, I've ended up with my contacts being in one of these circles:

Fig. 1: Each circle is a post-box to send messages to a specific contact category.

So now I have a bunch of post-boxes, one corresponding to every circle, where I can post information (pictures/status updates/etc.) and they will get routed to that sub-set of contacts which comprise the circle. This gives me the ability to target information to relevant parts of my social network. I look at this as a social graph routing mechanism. Circles are routing rules that users put in place so that their social message streams are routed appropriately in the social graph.

Google+ is constructing humanity's social router via circles which will be programmed via routing rules defined through the elegant circles abstraction. Yes the same thing can be done with FB, but FB never really tried to make this the center-piece of its product. By gently forcing users to separate relationships via circles, Google+ might just manage to make users feel more confident about selectively routing their social lives with different groups of contacts, rather than blasting messages to everyone they (do or do not) know in their huge FB friend lists. The result should be a more information-rich Google+ social network. With more information comes better advertisement targeting possibilities.

 Users are more concerned about privacy with respect to their contacts (my family should not see what happened in the office holiday party) rather than Google knowing every intimate detail of their lives. A functional social router will implement this wish without coming in the way of Google obtaining user information. No I don't think Google will be more discerning than FB when it comes to monetizing the private information of users, but hey, who cares about user privacy anyway?

Sunday, May 22, 2011

App Engine's Price Shock and the New Web App Equilibrium

App Engine's Price Shock

The Google App Engine blog announced that Google's App Engine is moving out of beta this year. According to the blog this means that App Engine will have to become financially attractive to Google, and therefore the corresponding upward revision in pricing. The details are intricate and affect customers (app developers) differently based on what services they use; the takeaway is that there is a reduction in the free-tier plan, a fixed monthly subscription fee, and a switch from billing based on CPU cycles to billing based on CPU-instance uptime. Now this has made several app developers unhappy because they feel slighted by Google changing the pricing structure, so much, so late. Google is trying to assuage developer anger, but I am guessing that Google App Engine stands to loose many apps and developers given these higher pricing structures.

Anyway, going back to the 3 highlights of the pricing change, lets see how it improves App Engine's business plan.
  1. Reduction of the free-tier - Google claims that the free tier was eating into App Engine resources because there are many non-paying apps that run just fine using these free resources. Google is out to trim the free tier, leaving just enough to still win over developers who want to experiment with the App Engine, but no more. If an app is seriously deployed, it needs to pay for App Engine real estate. Many apps running on the App Engine are back-ends to iOS and Facebook apps. Why should Google be bank-rolling apps that benefit those platforms? App Engine's early adopter phase is over and freebies can end. 
  2. Fixed $9 monthly subscription fee - The App Engine has been a fantastic success with more than 100K deployed apps. Even if 10% of them can be converted into  monthly subscriptions then the App Engine will have an evergreen revenue stream (many developers have complained that they would end up pay much more than $9pm given the new pricing model). 
  3. CPU-instance time billing - Google wants customers to pay for availability (keeping their web apps running 24 x 7) rather than the amount of computation used by the apps. Since web apps usually remain online throughout, it makes business sense for Google to charge for this 24x7 CPU-instance time. CPU-cycles usage can be little and far-between given the eccentric product life-cycle, usage variability, and popularity of apps. It makes perfect sense for Google to start charging for availability rather than CPU cycles. 
Off course these facts are not lost on other cloud providers such as Amazon's AWS service who have always charged per CPU-instance and offer lean free tiers as well.

The Equilibrium Shift

I don't expect developers to shun the App Engine (or other cloud services for that matter) given this and future price increases. Its much more expensive, both in terms of capex and opex, to achieve a cloud provider's level of service availability and convenience for small and mid-sized app shops. But these price changes will affect the technology and architecture decisions of app developers, here is how:
  1. Simplification of server-side logic, and richer client side logic - Clever web app design can off-load CPU cycles to web-browsers, more so with the advent of sophisticated client-side Javascript libraries and HTML5. I expect web app developers to aggressively move more processing into the user browser. This has the added benefit of more responsive web applications, for example HTML5's local data store can be used to store user data and shave-off network latency.
  2. Multi-threaded server-side architecture - Charging for instances would push app developers toward utilizing instances more efficiently, for example, by adopting multi-threaded programming approaches. There is already talk about App Engine developers shunning Python (inherently weak in multi-threaded functionality) and moving to Java.
  3. Space-CPU time trade-off - Storage space continues to be relatively cheap in the cloud. I expect developers to store a lot more application state rather than having to compute it again at a later time. For example, instead of using traditional RDBMS databases (e.g. MySQL), app developers may start looking at simple nosql alternatives like couchDB, which are instead optimized to store multiple indexes and views based on common read patterns. 
  4. QoS as a service - App developers will also move toward higher workload thresholds in their load-balancing algorithms, meaning that users may see web apps slowing down as new CPU instances are more sparingly fired up during times of greater demand. I expect app developers to start charging users for differentiated QoS in apps. After all, there is no free lunch, and lunch just got a whole lot more expensive.
  5. Availability as a service - Does it make sense for an app to keep its back-end running (CPU instances online) in the dead of night because 0.1% of its users are insomniacs? I think that some web apps  may start experimenting with the breaking the unwritten rule of 24x7 web uptime, or at least charging more for the privilege of using these services outside waking hours. Off course, different time-zones complicate this idea.
In the end, money matters. Its going to be interesting to see the market affecting cloud-deployed web app architecture decisions of the future.

Tuesday, January 25, 2011

H.264 vs. WebM

And so it begins. The battle between the H.264 and WebM video codec.

Google's On2 acquisition and the subsequent open-sourcing of the VP8 video codec has created a formidable competitor for H.264. Formidable not because WebM is technically superior to H.264 but because now there is a free alternative to the proprietary and licensed H.264. WebM is free, underwritten by Google, and a proven web-video delivery veteren -after all, Adobe Flash has used On2's codecs for web video delivery over the years.

There are several things going for H.264. First, it is entrenched in several video delivery formats and standards. For example, Bluray uses H.264 to encode video. Millions of Bluray players will become obselete if WebM is used instead of H.264. My two cents are that this wont really happen, instead, newer players will incorporate the possibility of decoding WebM video also. Even as I write this I am aware of several hardware manufacturers who are incorporating the WebM video decoders into their ASIC hardware.  But I am not assuming that things like the Bluray standard will be changed, on the contrary, there are other emerging media delivery and storage standards that have been frozen with H.264 being selected as the codec of choice. Standards take years to change or deploy and its very unlikely that they can suddenly adopt WebM instead of H.264.

In the mid-term WebM will defeat H.264 where there is a (easily replacable) software decoder and soft-media. By soft-media I mean video that is not burnt onto read-only media like Bluray disks but instead exists, say, in the form of a web-downloadable video on a server's hard-disk. The economic compulsion of having to pay the H.264 licensing body per-video download and per decoder shipped compared to the free (as in air) WebM alternative shall edge out the former. I suspect web-video delivery platforms like You-tube will lead the charge because (1) The number of videos being downloaded are huge and, (2) Their average revenue per video is miniscule, and each WebM download instead of H.264 download saves a few cents in licensing fee.

 A black-knight for the time-frame question will be the innovation in H.264 vs. that in WebM. If open-sourcing WebM has the desired effect of creating a better and more innovative codec in the future then WebM could gain on H.264 faster. But I am sure that the H.264 camp won't be sitting on their palms all this while! Video codecs use advanced algorithms and developing such concepts needs big investments (R&D). Will backers of WebM bring that kind of investment to the table in the interest of improving WebM when there is no direct revenue coming back to them?

Another thing that is going for WebM is the push toward virtualization in consumer electronics (away from the conventional ASIC approach) in the coming years. This means that future hardware (such as future Bluray players) may be capable of running multiple upgradable decoders rather than being tied to a specific ASIC implementing a specific decoding algorithm for a specific codec. That may just break the hardware dominance of H.264 over WebM. As a consumer I would prefer to hedge my bets and buy a virtualization-capable decoder rather than being tied into one video codec via an ASIC decoder.

Tuesday, November 9, 2010

Parallelizing & Multiprocessing Commands Using Python

My computer has multiple processor cores. That means I could speed up scripts by running some of their tasks in parallel. I have written up a simple Python script that uses the Multiprocessing library to take a list of jobs (each is a unix command string) and then executes them on a specified number of independent processes. These processes are created only once and act as a pool of "workers" which undertake a job, submit the result of the computation, and then undertake another job (if available in the job queue). The script ends when there are no more jobs in the job queue.
This approach is useful when (1) You have a multi-processor/multicore CPU. (2) Your tasks are CPU intensive. (3) You are reasonably sure that the jobs are not internally parallelized to take advantage of multiple CPUs. In my case, I had two directories full of numerically-named image (.ppm) files whose PSNR's had to be compared using the pnmpsnr utility. Computing PSNR is a computationally intensive task. Running the comparisons serially (single process) was significantly slower than adopting a multiprocess approach.
The code below should get you started on parallelizing your computationally intensive script. You can download the script from here.

#! /usr/bin/env python
# Sachin Agarwal, Google, Twitter: sachinkagarwal, Web: http://sites.google.com/site/sachinkagarwal/ 
# November 2010
# Using Python to execute a bunch of job strings on multiple processors and print out the results of the jobs in the order they were listed in the job list (e.g. serially).
# Partly adapted from http://jeetworks.org/node/81


#These are needed by the multiprocessing scheduler
from multiprocessing import Queue
import multiprocessing
import commands
import sys

#These are specific to my jobs requirement
import os
import re
 
def RunCommand (fullCmd):
    try:
        return commands.getoutput(fullCmd)
    except:
        return "Error executing command %s" %(fullCmd)

        
class Worker(multiprocessing.Process):
 
    def __init__(self,
            work_queue,
            result_queue,
          ):
        # base class initialization
        multiprocessing.Process.__init__(self)
        self.work_queue = work_queue
        self.result_queue = result_queue
        self.kill_received = False
 
    def run(self):
        while (not (self.kill_received)) and (self.work_queue.empty()==False):
            try:
                job = self.work_queue.get_nowait()
            except:
                break

            (jobid,runCmd) = job
            rtnVal = (jobid,RunCommand(runCmd))
            self.result_queue.put(rtnVal)

            
def execute(jobs, num_processes=2):
    # load up work queue
    work_queue = multiprocessing.Queue()
    for job in jobs:
        work_queue.put(job)
 
    # create a queue to pass to workers to store the results
    result_queue = multiprocessing.Queue()
 
    # spawn workers
    worker = []
    for i in range(num_processes):
        worker.append(Worker(work_queue, result_queue))
        worker[i].start()
    
    # collect the results from the queue
    results = []
    while len(results) < len(jobs): #Beware - if a job hangs, then the whole program will hang
        result = result_queue.get()
        results.append(result)
    results.sort() # The tuples in result are sorted according to the first element - the jobid
    return (results) 

 
#MAIN 
if __name__ == "__main__":
    
    import time #Code to measure time
    starttime = time.time() #Code to measure time
    
   
    jobs = [] #List of jobs strings to execute
    jobid = 0#Ordering of results in the results list returned

    #Code to generate my job strings. Generate your own, or load joblist into the jobs[] list from a text file
    lagFactor = 5
    ppmDir1 = sys.argv[1]
    ppmDir2 = sys.argv[2]
    decNumRe = re.compile(u"((\d+)\.(\d+))")
    ctr = 0
    for root, dirs, files in os.walk(ppmDir1):
        numFiles = len(files)
        files.sort()
        fNameLen = len(files[0].split('.')[0])
        for fName in files:
            for offset in range(0,lagFactor):
                fNumber = int(fName.split('.')[0])
                targetFile =  '%0*d' % (fNameLen, max(fNumber-offset,1))
                targetPath = ppmDir2+'/'+targetFile+'.ppm'
                origPath = ppmDir1+'/'+fName
                fullCmd = "pnmpsnr "+origPath+' '+targetPath  #Linux command to execute
                jobs.append((jobid,fullCmd)) # Append to joblist
                jobid = jobid+1

    # run
    numProcesses = 2
    if len(sys.argv) == 3:
        numProcesses = int(sys.argv[1])
    results = execute(jobs,numProcesses) #job list and number of worker processes
    
    #Code to print out results as needed by me. Change this to suit your own need
    # dump results
    ctr = 0
    for r in results:
        (jobid, cmdop) = r  
        if jobid % lagFactor == 0:
            print
            print jobid/lagFactor,
        print '\t',
         
        try:
            print cmdop.split()[10],
        except:
            print "Err",
        ctr = ctr+1
    print

    print "Time taken = %f" %(time.time()-starttime) #Code to measure time

Sunday, October 31, 2010

My Grocery Store is a Mobile Operator

My grocery store sells generic versions of bottled water, soap, breakfast cereal, butter, milk and mobile voice/Internet service. Now thats quite remarkable considering Rewe, the German grocery store chain I am alluding to, doesn't really have a history in the German telecommunications market. What they do have are 15445 stores across Europe that can stock up prepaid SIM cards branded "ja! Mobil" (the name comes from their generic in-store brand). Their physical presence and the mind space ja! occupies drives their business model. If shoppers can drink ja! branded generic cola then they could as well use ja!-branded mobile voice/Internet service.

The innovation here is the marketing possibility offered by Rewe grocery stores (instead of any technical innovation). Rewe has partnered with T-Mobile in Germany to implement its ja! branded "mobile operator". T-Mobile provides a white-label technical platform and Rewe simply brands it "ja! mobile". T-Mobile wins because it gets to sell its service at a discount to lower-paying market segments without putting off the premium T-Mobile customers, Rewe makes a neat profit by leveraging the ja! brand, and the customer wins by getting a discounted service from the best mobile operator of Germany, minus the T-Mobile brand.

I was looking at ja! mobile pricing. There are various flavors of pre-paid and flat-rate plans, although the focus seems to be on pre-paid plans that require no long-term contract and can be dispensed at Rewe's check-out counters. Depending on a customer's typical usage, s/he can can trade-off get a discounted subset of services from among the services offered - SMS, MMS, in-network calling, fixed-line calls, data etc. Interestingly, customer support is not free. Its a little like the contemporary airline business where everything from customer service to carry-on baggage can become a chargeable add-on rather than part of the product. Customers need to be mindful of what their money is buying them before assuming that things like customer service or technical support is part of the product.

Brick-and-mortar stores also sell iTunes gift cards and Facebook credit nowadays. Dell and Amazon partner with Best Buy to sell computers and Kindle e-books respectively. There are interesting business opportunities for anyone who can funnel real customers and subscribers (read: money) into the virtual/communications world. Very real profits await those brick-and-mortar outfits who can build bridges between technology companies and customers, even if they are just plain-Jane grocery stores!

Wednesday, October 13, 2010

Fancy Vertical Handover: A victim of REST?

There has been a ton of research, standardization work, and development around Vertical Handover - the ability to change the underlying network access without disturbing the overlying communication protocol (TCP or application) sessions. The simplest example is when a user moves from a Wifi zone (e.g. office) to a 3G zone (outdoors). A seamless handover hides the underlying rewiring of the access and lets the user continue using the device as if nothing changed. Vertical handovers have quickly graduated from laboratory quirk to mainstream occurance, with Wifi-enabled smart-phones switching between access technologies multiple times daily.

But the vertical handover on my smart-phone doesn't really preserve the underlying TCP session and yet works pretty well. Why? Because most of the apps on my phone use REST-ful protocols like HTTP, XML-RPC, or SOAP. That means they are, in theory, stateless. In fact, a TCP connection is created and torn down for every message exchange between the service server and the client. Sometimes TCP connections linger on to improve efficiency (carrying multiple request-response mesages between the client and service server), but a discontinuity in the TCP connection is not catastrophic.  I simply see my smart-phone negotiate a new connection with the new access (3G or Wifi) and then my app keeps working as if nothing has changed.

All that talk about preserving TCP connections across access technologies was much ado about nothing!

Wednesday, October 6, 2010

Mobile Video Calling: Can Tango Tango?

Tango is a newly launched mobile-to-mobile video calling application for iPhone and Android devices. Tango enables smart-phone owners to see each other in addition to speaking with each other during a Voice Over IP (voip) conversation. Many smart-phones come with front facing cameras, ostensibly for video calling, and Tango enables people to use these cameras during a voip call. Think of mobile video calling when you want to see your expat pet doing silly tricks on video (or for beach and boardroom voyeurism).

But, as Walter Mossberg's Tango's review in WSJ reports, the  quality of Tango's video call leaves a lot to be desired. I came across a video on Gizmodo's website showing Tango in action. The verdict is that Tango's performance is way below expectation. In fact, Tango's video frame-rate seemed to be approximately 1 frame per second in the Gizmodo video (and not the "high quality video mobile calling service" as the company's press release claims).

Make no mistake, achieving even 1 frame-per-second video+voice is no small feat. Tango's engineers have packed a real-time video+voice encoder/decoder into a smart-phone and have managed to trasmit/receive two parallel audio/video streams over Wifi (they also claim high quality video calls over 3G but lets not give Tango all the benefit of doubt :-) ). On top of this, achieving this for both the Android and iPhone platforms and for dozens of smart-phone models is admirable.

Frankly, I am not surprised by Tango's dismal video frame rate - resource bottlenecks such as smart-phone hardware, software/OS, network bandwidth and latency have to be overcome before an acceptable double digit frame-rate is achieved. But what surprised me was the poor voice quality: the Tango call sounded a lot like those cheap international calling cards I used to make international calls from the US  many years ago. Terrible sound quality. I wonder why Tango engineers didn't trade more video quality (or even cut out video entirely when resources were scarce) and spend resources on improving voice? Voice over IP for mobile phones is a solved problem - Skype and the umpteen number of mobile SIP voip clients got audio to work well even on older smart-phones. Why couldn't Tango?

Tango is an over-the-top application, meaning that it runs over the best-effort (ordinary) Internet. I mention this here because the alternative, 3G telecom-operator-supported video calling, uses a dedicated network channel to ensure call quality assurance. But a Tango call will be carried over the same pipes as plain web traffic, making the video/voice call quality dependent on what else is being transmitted during the call. Telecom-supported 3G video calling is also much more energy (battery) efficient  than Tango.Why? Because in order to remain signed-into Tango to receive calls, the smart-phone has to periodically send "I-am-alive" messages to the Tango server. This means that a TCP or UDP socket is always active (or repeatedly created and and torn-down), effectively disabling the smart-phone's built-on power-saving sleep function. Offcourse, telecom supported 3G video calling costs money, but it is technically superior to Tango or any other over-the-top mobile video calling system.

But this is not about Telecom vs. Internet applications. This is about the use-case. Video calling was touted as one of the big use-cases for 3G Telecom networks (and 4G too?). 3G standards support video calling and so there is hardware acceleration, network resource reservation, optimized audio/video codecs, and cross-phone/OS support for video calling on every modern smart-phone. But apart from the cost of making 3G video calls, is their something else that relegated video calling to its sad never-used status in phones? Yes there is. Video calling has simply not been accepted as a viable form of mass communication in our society, and remains to-date, a quirky add-on. When was the last time you  placed a video call?

When Internet telephony (voip) arrived it quickly replaced circuit-switched calling. With mobile video calling, even if Tango can eventually fix its technical/engineering limitations, there is nothing to replace! Sadly, the mobile video calling use-case was still-born from the beginning.

Saturday, July 3, 2010

Microsoft Kin: RIP == Social Networks:RIP (?)


Microsoft Kin 04/2010-06/2010

Microsoft is phasing out its social-network/cloud storage-heavy Kin smart phone just 2 months after launch. This embarrassing report from CNN claims that the Microsoft+Verizon Kin sold less than 10000 units in the two months. RIP Kin.

I never got around to using the Kin, but apparently the market didn't see the justification for the expensive data plan (>=$29 p.m.) tagged on to the Kin by Verizon. The market was supposed to be teens looking to stay connected via social networks, but they did not bite into the insanely high data-plan tariff. Social networking, it seems, is not worth that much to them. How much is it worth anyway?

Lets not belittle the effort Microsoft put into this device - as a product the Kin was fully functional and seemed to do the things you would expect from this sort of device - Internet social networking, cloud storage and syncing of users' data, a built in Zune player, sleek design, etc. And at under $100 (with a data plan) it had a low entry barrier too. It seems like all the pieces were there but the Kin machine never got off the ground.

I don't know if the lack of a credible app store spelt the end for the Kin. What I do know is that social networking apps completely failed to drive sales. Next time someone uses social networking as the use-case for a device or service that is supposed to make money, say - Kin!




Friday, April 30, 2010

Untitled Poem

Among many other things, my father taught me how to read and write English. Everything I've ever written starts with what he taught me. Now as he lies dying of cancer, I wrote this for him. Say a prayer for him.


All of my thoughts
Like river drops
Together making up me
Like a river that flows
Until it throws
Fresh into the salty sea

All rivers meet that end
No matter what they pretend
Or how many bends they make
And so it will be
With every drop inside me
No matter what path I take

So you may ask
The point of the task
To meander toward the salty end

But don't we all know

Drops become vapor and snow

From which new rivers descend