Business Scoops

Making Stuff: 3D Printing on Campus

February 23, 2012 by Campus Technology


For example, company co-founder Bre Pettis appeared on a June 2011 episode of The Colbert Report, where host Stephen Colbert had his face scanned by a 3D scanner and a model of his head composed by the printer. That plan went into the Thingiverse

Faced with Distraction, We Need Willpower

February 22, 2012 by John Coleman


Mustering willpower is a struggle for almost everyone — and it’s getting harder. We, as individuals and as a society, lack self-control at precisely the time we need it most.

Willpower is about more than resisting our bad habits. It’s the mental discipline that allows us to cultivate good habits, make better decisions, and control our own behaviors — everything from dieting effectively to powering through difficult problems at work. It’s a quality that can separate the most productive businesspeople from the least productive. And it’s a trait that many of us lack. Surveys of more than 1 million people show that self-control is the character trait modern men and women recognize least in themselves.

Our environment only exacerbates the problem. The jungle of stimuli that engulfs us each day makes it difficult to exercise restraint or focus on the important habits we need to build or tasks we need to accomplish. Nicholas Carr has argued in his book, The Shallows, that the internet is destroying our ability to concentrate and read or think deeply; and as John Tierney and Roy Baumeister point out in their book, Willpower, a typical computer user checks out more than three dozen websites per day. Focusing on an important memo is hampered by the distraction of Facebook and the incessant new email notifications blinking on our smartphones. Our ability to read a book is handicapped by the impatience of our 140-character habits. Even as I write this article, I’m tempted to snack, surf Wikipedia, check Twitter, or switch to another task.

But willpower is an essential quality for personal effectiveness at work, whether that means forcing yourself to prioritize the most important items on your to-do list, powering through an endless day of difficult decisions, or simply resisting the urge to eat that extra bag of chips in the office snack room. Want to grow your business or get that promotion at work? Cultivating willpower may be your quickest route to success.

To combat declining willpower, consider a few of the following approaches, based in part on Tierney and Baumeister’s recommendations:

  • Practice small. Did you know that by reminding yourself to sit up straight at your desk you can train the same mental muscle you need to quit smoking or sustainably shed pounds? Research has indicated (PDF) that even reminding yourself to keep good posture on a regular basis can gradually improve your ability to self-regulate, and maintaining a regular exercise routine may improve self-control. Practice small exercises in self-control, and your overall willpower will benefit.
  • Take on your greatest challenges one at a time. How long was your New Year’s resolutions list this year? How many points have you already ignored? Even my suggested list for young leaders had five separate points, but if you want to shake a particularly trying habit (or build a good one), you should only focus on one major change at a time. Start, for instance, with your resolution to check Facebook or Twitter only twice per day; then, once you’re free of that habit, move on to your new diet and exercise plan. In the short term, the amount of willpower you have is fixed, and overloading yourself with new tasks that require it may diminish your ability to accomplish any goal.
  • Monitor, monitor, monitor. Want to run a fast mile? Time every run. Want to write the next great American novel? Post the word count you’ve written every day on Facebook for all your friends to see. The more you monitor something (and ask others to help you monitor) the more likely you are to stay on task. Sites like Quantified Self offer an increasingly diverse array of ways to self-monitor, just as sites like Mint.com offer specific opportunities for self-regulation. If you’re distracted by Facebook, Twitter, or other social media at work, keep a log of every time you check those sites and force yourself to introduce small goals to reduce the number of times you visit them every day.
  • Find time to replenish. In the short term, you only have so much willpower, and once it’s depleted, your ability to exercise self-control or make sound decisions diminishes dramatically. If you’re in a stressful job, for example, your ability to make decisions is worse in the afternoon than in the morning. However, finding downtime and even eating (replenishing your body’s glucose) can help you replenish your willpower before taking on difficult decisions or tasks. Skipping or working through lunch may actually negatively impact both your ability to make decisions and your ability to work productively in the afternoon.
  • Keep it clean. A simple way to improve willpower is to operate in a neat environment. Tierney and Baumeister note that environmental cues like messy desks or unmade beds can “infect” the rest of your life and habits with disorder, whereas maintaining a neat and clean environment can help you to maintain order and self-control in the other tasks you confront. If your office or cubicle is a mess at work, make your first order of business to organize your space, and you may find your focus and productivity improving at work.

Willpower is a struggle in the modern era. Our distraction-filled lives make it innately difficult. These are just a few tips to build and maintain willpower, but starting here may help you build a critical personal discipline.

What else do you do to stay on track?

This post is part of a series of blog posts by and about the new generation of purpose-driven leaders.

Big data in the cloud

February 22, 2012 by Edd Dumbill


Big data and cloud technology go hand-in-hand. Big data needs clusters of servers for processing, which clouds can readily provide. So goes the marketing message, but what does that look like in reality? Both “cloud” and “big data” have broad definitions, obscured by considerable hype. This article breaks down the landscape as simply as possible, highlighting what’s practical, and what’s to come.

IaaS and private clouds



What is often called “cloud” amounts to virtualized servers: a computing
resource that presents itself as a regular server, rentable according
to consumption. This is generally called infrastructure as a service
(IaaS), and is offered by platforms such as Rackspace Cloud or Amazon
EC2. You buy time on these services, and install and configure your
own software, such as a Hadoop cluster or NoSQL database. Most of the
solutions I described in my Big Data Market Survey can be deployed on
IaaS services.



Using IaaS clouds doesn’t mean you must handle all deployment
manually: good news for the clusters of machines big data
requires. You can use orchestration frameworks, which handle the
management of resources, and automated infrastructure tools, which
handle server installation and configuration. RightScale offers a
commercial multi-cloud management platform that mitigates some of the
problems of managing servers in the cloud.



Frameworks such as OpenStack and Eucalyptus aim to present a uniform
interface to both private data centers and the public
cloud. Attracting a strong flow of cross-industry support, OpenStack
currently addresses computing resources (akin to Amazon’s EC2) and
storage (paralleling Amazon S3).



The race is on to make private clouds and IaaS services more usable:
over the next two years using clouds should become much more
straightforward as vendors adopt the nascent standards. There’ll be a
uniform interface, whether you’re using public or private cloud
facilities, or a hybrid of the two.



Particular to big data, several configuration tools already target
Hadoop explicitly: among them Dell’s Crowbar, which aims to make
deploying and configuring clusters simple, and Apache Whirr, which is
specialized for running Hadoop services and other clustered data processing systems.



Today, using IaaS gives you a broad choice of cloud supplier, the
option of using a private cloud, and complete control: but you’ll be
responsible for deploying, managing and maintaining your clusters.


Platform solutions

IaaS only takes you so far with big data applications: it handles the creation of computing and storage resources, but doesn’t address anything at a higher level. The setup of Hadoop and Hive or a similar solution is down to you.

Beyond IaaS, several cloud services provide application layer support for big data work. Sometimes referred to as managed solutions, or platform as a service (PaaS), these services remove the need to configure or scale things such as databases or MapReduce, reducing your workload and maintenance burden. Additionally, PaaS providers can realize great efficiencies by hosting at the application level, and pass those savings on to the customer.

The general PaaS market is burgeoning, with major players including VMware (Cloud Foundry) and Salesforce (Heroku, force.com). As big data and machine learning requirements percolate through the industry, these players are likely to add their own big-data-specific services. For the purposes of this article, though, I will be sticking to the vendors who already have implemented big data solutions.

Today’s primary providers of such big data platform services are Amazon, Google and Microsoft. You can see their offerings summarized in the table toward the end of this article. Both Amazon Web Services and Microsoft’s Azure blur the lines between infrastructure as a service and platform: you can mix and match. By contrast, Google’s philosophy is to skip the notion of a server altogether, and focus only on the concept of the application. Among these, only Amazon can lay claim to extensive experience with their product.

Amazon Web Services

Amazon has significant experience in hosting big data processing. Use of Amazon EC2 for Hadoop was a popular and natural move for many early adopters of big data, thanks to Amazon’s expandable supply of compute power. Building on this, Amazon launched Elastic Map Reduce in 2009, providing a hosted, scalable Hadoop service.

Applications on Amazon’s platform can pick from the best of both the IaaS and PaaS worlds. General purpose EC2 servers host applications that can then access the appropriate special purpose managed solutions provided by Amazon.

As well as Elastic Map Reduce, Amazon offers several other services relevant to big data, such as the Simple Queue Service for coordinating distributed computing, and a hosted relational database service. At the specialist end of big data, Amazon’s High Performance Computing solutions are tuned for low-latency cluster computing, of the sort required by scientific and engineering applications.


Elastic Map Reduce

Elastic Map Reduce (EMR) can be programmed in the usual Hadoop ways, through Pig, Hive or another programming language, and uses Amazon’s S3 storage service to get data in and out.
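For readers new to the programming model that Hadoop (and therefore EMR) exposes, here is a minimal single-process sketch of a word count expressed as map, shuffle and reduce steps. It is an illustration only: it does not use Hadoop or any Amazon API, and the function names (`map_words`, `reduce_counts`, `run_job`) are made up for this example.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step: sum all the counts emitted for a single word."""
    return word, sum(counts)


def run_job(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle (group by key) and reduce in one process."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for line in lines:
        for word, count in map_words(line):
            grouped[word].append(count)  # shuffle: collect values per key
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())
```

On a real cluster the map and reduce steps run on many machines and the framework performs the grouping between them; the logic of each step stays this small.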

Access to Elastic Map Reduce is through Amazon’s SDKs and tools, or with GUI analytical and IDE products such as those offered by Karmasphere. In conjunction with these tools, EMR represents a strong option for experimental and analytical work. Amazon’s EMR pricing makes it a much more attractive option to use EMR, rather than configure EC2 instances yourself to run Hadoop.

When integrating Hadoop with applications generating structured data, using S3 as the main data source can be unwieldy. This is because, similar to Hadoop’s HDFS, S3 works at the level of storing blobs of opaque data. Hadoop’s answer to this is HBase, a NoSQL database that integrates with the rest of the Hadoop stack. Unfortunately, Amazon does not currently offer HBase with Elastic Map Reduce.

DynamoDB

Instead of HBase, Amazon provides DynamoDB, its own managed, scalable NoSQL database. As this is a managed solution, it represents a better choice than running your own database on top of EC2, in terms of both performance and economy.

DynamoDB data can be exported to and imported from S3, providing interoperability with EMR.

Google

Google’s cloud platform stands out as distinct from its competitors. Rather than offering virtualization, it provides an application container with defined APIs and services. Developers do not need to concern themselves with the concept of machines: applications execute in the cloud, getting access to as much processing power as they need, within defined resource usage limits.

To use Google’s platform, you must work within the constraints of its APIs. However, if that fits, you can reap the benefits of the security, tuning and performance improvements inherent to the way Google develops all its services.

AppEngine, Google’s cloud application hosting service, offers a MapReduce facility for parallel computation over data, but this is more of a feature for use as part of complex applications rather than for analytical purposes. Instead, BigQuery and the Prediction API form the core of Google’s big data offering, respectively offering analysis and machine learning facilities. Both these services are available exclusively via REST APIs, consistent with Google’s vision for web-based computing.

BigQuery

BigQuery is an analytical database, suitable for interactive analysis over datasets on the order of 1 TB. It works best on a small number of tables with a large number of rows. BigQuery offers a familiar SQL interface to its data. In that, it is comparable to Apache Hive, but the typical performance is faster, making BigQuery a good choice for exploratory data analysis.

Getting data into BigQuery is a matter of directly uploading it, or importing it from Google’s Cloud Storage system. This is the aspect of BigQuery with the biggest room for improvement. Whereas Amazon’s S3 lets you mail in disks for import, Google doesn’t currently have this facility. Streaming data into BigQuery isn’t viable either, so regular imports are required for constantly updating data. Finally, as BigQuery only accepts data formatted as comma-separated value (CSV) files, you will need to use external methods to clean up the data beforehand.
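As a rough idea of what cleaning up the data beforehand might involve, the sketch below turns loosely structured records into CSV using Python's standard `csv` module. The field names and the cleaning rules (collapse whitespace, blank out missing values) are assumptions for illustration; BigQuery's exact import requirements are not described here.

```python
import csv
import io


def to_clean_csv(records: list[dict], columns: list[str]) -> str:
    """Serialize records as CSV, normalizing whitespace and missing values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        row = []
        for column in columns:
            value = record.get(column)
            # Missing values become empty fields; embedded newlines and runs
            # of whitespace are collapsed so each record stays on one line.
            text = "" if value is None else " ".join(str(value).split())
            row.append(text)
        writer.writerow(row)  # the csv module quotes fields containing commas
    return buffer.getvalue()
```

Letting the `csv` module handle quoting avoids the usual trap of hand-built CSV breaking on values that themselves contain commas or quotes.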

Rather than provide end-user interfaces itself, Google wants an ecosystem to grow around BigQuery, with vendors incorporating it into their products, in the same way Elastic Map Reduce has acquired tool integration. Currently in beta test, to which anybody can apply, BigQuery is expected to be publicly available during 2012.

Prediction API

Many uses of machine learning are well defined, such as classification, sentiment analysis, or recommendation generation. To meet these needs, Google offers its Prediction API product.

Applications using the Prediction API work by creating and training a model hosted within Google’s system. Once trained, this model can be used to make predictions, such as spam detection. Google is working on allowing these models to be shared, optionally with a fee. This will let you take advantage of previously trained models, which in many cases will save you time and expertise with training.

Though promising, Google’s offerings are in their early days. Further integration between its services is required, as well as time for ecosystem development to make their tools more approachable.

Microsoft

I have written in some detail about Microsoft’s big data strategy in Microsoft’s plan for Hadoop and big data. By offering its data platforms on Windows Azure in addition to Windows Server, Microsoft’s aim is to make either on-premise or cloud-based deployments equally viable with its technology. Azure parallels Amazon’s web service offerings in many ways, offering a mix of IaaS services with managed applications such as SQL Server.

Hadoop is the central pillar of Microsoft’s big data approach, surrounded by the ecosystem of its own database and business intelligence tools. For organizations already invested in the Microsoft platform, Azure will represent the smoothest route for integrating big data into the operation. Azure itself is pragmatic about language choice, supporting technologies such as Java, PHP and Node.js in addition to Microsoft’s own.

As with Google’s BigQuery, Microsoft’s Hadoop solution is currently in closed beta test, and is expected to be generally available sometime in the middle of 2012.

Big data cloud platforms compared

The following table summarizes the data storage and analysis capabilities of Amazon, Google and Microsoft’s cloud platforms. Intentionally excluded are IaaS solutions without dedicated big data offerings.














| | Amazon | Google | Microsoft |
|---|---|---|---|
| Product(s) | Amazon Web Services | Google Cloud Services | Windows Azure |
| Big data storage | S3 | Cloud Storage | HDFS on Azure |
| Working storage | Elastic Block Store | AppEngine (Datastore, Blobstore) | Blob, table, queues |
| NoSQL database | DynamoDB¹ | AppEngine Datastore | Table storage |
| Relational database | Relational Database Service (MySQL or Oracle) | Cloud SQL (MySQL compatible) | SQL Azure |
| Application hosting | EC2 | AppEngine | Azure Compute |
| Map/Reduce service | Elastic MapReduce (Hadoop) | AppEngine (limited capacity) | Hadoop on Azure² |
| Big data analytics | Elastic MapReduce (Hadoop interface³) | BigQuery² (TB-scale, SQL interface) | Hadoop on Azure (Hadoop interface³) |
| Machine learning | Via Hadoop + Mahout on EMR or EC2 | Prediction API | Mahout with Hadoop |
| Streaming processing | Nothing prepackaged: use custom solution on EC2 | Prospective Search API⁴ | StreamInsight² (“Project Austin”) |
| Data import | Network, physically ship drives | Network | Network |
| Data sources | Public Data Sets | A few sample datasets | Windows Azure Marketplace |
| Availability | Public production | Some services in private beta | Some services in private beta |

Conclusion

Cloud-based big data services offer considerable advantages in removing the overhead of configuring and tuning your own clusters, and in ensuring you pay only for what you use. The biggest issue is always going to be data locality, as it is slow and expensive to ship data. The most effective big data cloud solutions will be the ones where the data is also collected in the cloud. This is an incentive to investigate EC2, Azure or AppEngine as a primary application platform, and an indicator that PaaS competitors such as Cloud Foundry and Heroku will have to address big data as a priority.
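To put a number on why shipping data is slow, the following back-of-the-envelope sketch estimates transfer time from data size and sustained bandwidth. The figures used in the example (1 TB over a 100 Mbit/s link) are illustrative assumptions, not measurements from any provider.

```python
def transfer_hours(data_bytes: float, link_bits_per_second: float) -> float:
    """Hours needed to move data over a link at a sustained rate."""
    seconds = data_bytes * 8 / link_bits_per_second  # bytes -> bits
    return seconds / 3600


# 1 TB (10**12 bytes) over a sustained 100 Mbit/s link:
# 8 * 10**12 bits / 10**8 bits per second = 80,000 seconds, about 22 hours.
one_terabyte_hours = transfer_hours(1e12, 100e6)
```

At that rate a dataset of tens of terabytes takes weeks to upload, which is why physically shipping drives, or collecting the data in the cloud in the first place, becomes attractive.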

It is early days yet for big data in the cloud, with only Amazon offering battle-tested solutions at this point. Cloud services themselves are at an early stage, and we will see both increasing standardization and innovation over the next two years.

However, the twin advantages of not having to worry about infrastructure and economies of scale mean it is well worth investigating cloud services for your big data needs, especially for an experimental or green-field project. Looking to the future, there’s no doubt that big data analytical capability will form an essential component of utility computing solutions.

Notes:

1 In public beta.

2 In controlled beta test.

3 Hive and Pig compatible.

4 Experimental status.

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20


RIM and BlackBerry's Rise and Fall

February 21, 2012 by (author unknown)


Five years ago, Research in Motion, maker of the BlackBerry, was one of the most acclaimed technology companies in the world. The BlackBerry dominated the smartphone market, was a staple of the business world, and had helped make texting a mainstream practice. Terrifically profitable, the phone became a cultural touchstone—in 2006, a Webster’s dictionary made “CrackBerry” its word of the year. These days, it seems more like the SlackBerry.

Michael Porter on Why 'Best' Isn't the Right Goal

February 20, 2012 by Erika Andersen


I don’t always agree with Michael Porter about strategy.  Which, in recent years, is kind of like saying I’m not sure the sun really rises in the East.  I won’t bore you with the sources of my disagreement (if you’re interested you can read this post).

How to create a visualization

February 13, 2012 by Pete Warden


Over the last few years I’ve created a few popular visualizations and a lot of duds, and I’ve learned a few lessons along the way. For my latest analysis of where Facebook users go on vacation, I decided to document the steps I follow to build my visualizations. It’s a very rough guide, just the stages I’ve learned to follow by trial and error, but following these guidelines is a good way to start if you’re looking to create your first visualization.

Play with your data

I was lucky enough to spend a few hours with Andreas Weigend recently, head of the Stanford Social Data lab. He has nine rules of data, and the first is “Start with the problem, not the data.” What struck me about visualizations is that I actually take the opposite approach. I find the only way to begin is to explore what information is available and get a feeling for what stories it can tell.

In my case, we have a Cassandra cluster with information on more than 350 million photos shared on Facebook. I’ve been running Pig analytics jobs regularly to get a view of what we have in there. One of the reports we generate is a count of how many photos and users we have for particular places:

[Figure: Data source example]
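A report like this boils down to a group-and-count over the raw records. The sketch below shows the idea in plain Python rather than Pig; the input shape, one hypothetical `(place, user_id)` pair per photo, is an assumption made for the example and not the actual schema.

```python
from collections import defaultdict


def summarize_places(photos: list[tuple[str, str]]) -> dict[str, tuple[int, int]]:
    """Map each place to (photo count, distinct user count).

    Each input item is a hypothetical (place, user_id) pair, one per photo.
    """
    photo_counts: dict[str, int] = defaultdict(int)
    users: dict[str, set[str]] = defaultdict(set)
    for place, user_id in photos:
        photo_counts[place] += 1
        users[place].add(user_id)  # a set keeps each user once per place
    return {place: (photo_counts[place], len(users[place])) for place in photo_counts}
```

Counting distinct users separately from photos matters: a single prolific photographer can otherwise make a place look far more visited than it is.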

I was chatting with my colleague Chris Raynor about this, and he asked me if we could tell where all the visitors to those places were coming from. This was something that had been at the back of my mind for a long time. Seeing how much information we had on each destination made me realize we had enough data to produce significant and meaningful answers.

When I was learning engineering, one of my favorite case studies was an investigation into an air-traffic control system. Software engineers couldn’t understand why fully-computerized control rooms were actually less efficient and safe than more old-fashioned sites. What the researchers discovered was that the old process of passing around and arranging small cards that each represented a plane gave controllers a much stronger awareness of the situation than a screen that didn’t require their involvement for tasks, such as handing an aircraft to a colleague. I think the same is true of data. The more time you spend manipulating and examining the raw information, the more you understand it at a deep level. Knowing your data is the essential starting point for any visualization.

Pick a question

Now that I had a rough idea for what I wanted to visualize, I really needed to focus on what I would be doing. The best way to do that is to choose the exact title you want to give your visualization. I actually messed this up on one early map I created, giving the blog post the title “How to split up the US.” Everyone subsequently described it as “The Five Nations of Facebook.” Since then, I’ve tried very hard to pick the most natural title for what I’m going to be presenting, and then ensure I can deliver on the promise of the headline.

In this case I had a clear idea of the question at the start: it was going to be “Where do people go on vacation?” However, as I thought about it, I realized it needed to be a lot more specific and concrete. There are already a lot of “top travel destinations” lists out there, so what made mine different? It was the use of Facebook to gather much richer and more detailed information, so I refined it to “Where do Facebook users go on vacation?”

Sketch out your presentation

I now had the data and a question I wanted to answer. The next step was figuring out how to show the information in a visual form. I’m in love with network diagrams showing connections between thousands of objects, but so often they are completely baffling to the rest of the world. I still remember David Cohen threatening to strangle me if I showed him another one of “those damn spider webs” instead of a business plan. However, network diagrams are a good way of hinting at how much data is available for querying; they can really give an idea of the sheer scale of what’s there.

One of my favorite recent visualizations was Paul Butler’s map of friendships on Facebook, so I decided to use that as a visual reference:

[Figure: Paul Butler’s “Visualizing Friendships” visualization]

I borrowed a couple of key ideas from his work: the general color palette of the blue lines on a dark background and the use of great circles to create flowing arcs for all connections.
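For anyone who wants to draw similar arcs, one common way to get points along a great circle is to interpolate between the two endpoints on the unit sphere. The sketch below does that in Python; it is a generic illustration of the technique, not the code behind either visualization, and it does not handle exactly antipodal endpoints, where the great circle is not unique.

```python
import math


def great_circle_points(start, end, steps=32):
    """Return (lat, lon) points in degrees along the great circle from start to end."""
    def to_vector(lat, lon):
        lat_r, lon_r = math.radians(lat), math.radians(lon)
        return (math.cos(lat_r) * math.cos(lon_r),
                math.cos(lat_r) * math.sin(lon_r),
                math.sin(lat_r))

    a, b = to_vector(*start), to_vector(*end)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    angle = math.acos(dot)  # central angle between the two endpoints
    if angle < 1e-12:
        return [start] * (steps + 1)  # identical endpoints: nothing to interpolate
    points = []
    for i in range(steps + 1):
        f = i / steps
        # Spherical linear interpolation weights for fraction f of the arc.
        wa = math.sin((1 - f) * angle) / math.sin(angle)
        wb = math.sin(f * angle) / math.sin(angle)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        points.append((math.degrees(math.asin(max(-1.0, min(1.0, z)))),
                       math.degrees(math.atan2(y, x))))
    return points
```

Projecting these points onto a flat map and joining them produces the curved lines; more steps give smoother arcs at the cost of more drawing work.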

As I thought about the presentation, I realized that I had to simplify what it would be showing. With sources and destinations plotted all over the world, both the visual look and the querying interface would be overwhelming. Our user-base is primarily American thanks to our reliance on English-only natural language processing, so with that in mind I decided to make life simpler by only showing data from people who lived in the U.S. Accordingly, I changed the question in my title to “Where do American Facebook users go on vacation?”.

While I’m mostly presenting this as a linear, waterfall process, what I’ve just described is a good example of how iterative cycles drive the real workflow. It’s hard to know how well a lot of things will work until you try them. As long as you’re still making some progress, don’t worry if you find yourself going in circles.

Crunch the data

If you know your data, and you have a good idea of the question you’re trying to answer, this should be the simplest stage. You’ll hopefully have a clear set of requirements and it’s just a matter of executing the right queries over your data.

In this case I already had some Pig scripts asking similar questions, so I was able to adapt one of those. The biggest surprise was when I ran into issues with some of the joins. The hard part was running the Hadoop job to gather the raw data from our Cassandra cluster, and that worked. I was able to output smaller files containing the gathered data, and then run a local Pig job to do the joins I needed.
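Once the gathered data fits in smaller files, a join can be done comfortably on one machine. The sketch below shows a simple in-memory hash join in Python to make the operation concrete; it stands in for the Pig job rather than reproducing it, and the record fields in the usage note are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dicts on a shared key field."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)  # build a lookup table on the right side
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            # Merge the two rows; values from the left row win on conflicts.
            joined.append({**match, **row})
    return joined
```

For example, joining a list of `{"user", "place"}` visit records against a list of `{"user", "home"}` records on `"user"` yields one merged record per visit whose user has a known home town, and drops the rest.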

The next stage was turning the raw information into a form that could be displayed. For example, I needed to take all of the user locations from the unstructured text strings that Facebook gave me, and convert them into latitude-longitude coordinates for plotting on a map. For this sort of work I usually turn to a general-purpose scripting language, and most of Jetpac is already written in Ruby, so that was an easy choice. I wrote a script that walked through the data, using the Data Science Toolkit to match coordinates with names, and then output it into a file containing a JSON array of all the information.
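The shape of such a script is roughly as follows. This sketch is in Python rather than Ruby, and it replaces the geocoding service with a tiny hard-coded lookup table (`KNOWN_PLACES`, with approximate coordinates) so that it runs on its own; both the table and the output field names are assumptions for illustration.

```python
import json

# Hypothetical lookup table standing in for a real geocoding service.
KNOWN_PLACES = {
    "san francisco, ca": (37.7749, -122.4194),
    "boulder, co": (40.0150, -105.2705),
}


def geocode(name):
    """Return (lat, lon) for a free-text place name, or None if unknown."""
    # Normalize case and whitespace before looking the name up.
    return KNOWN_PLACES.get(" ".join(name.lower().split()))


def locations_to_json(names):
    """Geocode each name and serialize the matches as a JSON array."""
    rows = []
    for name in names:
        coords = geocode(name)
        if coords is None:
            continue  # skip names that cannot be resolved
        rows.append({"name": name, "lat": coords[0], "lon": coords[1]})
    return json.dumps(rows)
```

Unresolvable names are silently skipped here; in practice it is worth logging them, since they show how much of the data never makes it onto the map.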

Build an interface

A lot of the best visualizations have no interactivity. They just tell a story with a static image. That’s why it’s worth considering whether you need an interface at all. I actually had the interactive site that I used to create the “Five Nations of Facebook” visualization up for several weeks before that post, and nobody used it because it was too confusing. It was only when I boiled it down into a single picture with labels that it became a hit.

My problem is that I want other people to have as much fun exploring the data as I’ve had, so I couldn’t resist adding some interaction to the vacation visualization. I still wanted to retain the immediate visual appeal of a static image, so I decided to create a background showing the full data to introduce the visualization at a first glance, and then overlay an interactive foreground once the user started exploring it more deeply.

In most cases you’re better off using one of the excellent off-the-shelf visualization frameworks like D3. Since I needed something client-side for interaction, and was working with both geographic and network rendering, I couldn’t find anything that met my requirements. Instead I cannibalized one of my own projects, the jQuery component from OpenHeatMap, and combined it with HTML5 canvas rendering to produce a custom JavaScript renderer. I used it to pre-render a background containing all the possible connections between home towns and travel destinations, and saved that off as a static image. That’s useful to save rendering time on page load, and lets me fall back to a static visualization on older browsers that don’t support Canvas.

[Figure: Background image of Facebook vacation visualization]

I then tied in rendering the connections of any places that the user was hovering their cursor over, so that they could quickly get a feel for the relationships expressed in the data. I also wanted to display the details underlying the picture, so to drill down I added a dialog listing the raw statistics about a place. Users can bring this dialog up by clicking.

[Figure: Facebook vacation visualization dialog box]

One problem with that interaction is that a lot of different cities are in a very small area, so it becomes extremely difficult to pick the one you want with the mouse cursor. To make that a little better, I prioritized the most popular U.S. cities so that in case of a conflict, they’re chosen over their smaller neighbors. I realized I also needed to add a search box. Thankfully we’re heavy users of Twitter’s Bootstrap framework, so it was a simple matter to add a search field and tie it in with Twitter’s excellent autocomplete component.
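The conflict rule can be sketched in a few lines: among the cities close enough to the cursor, prefer the most popular one. The Python below is an illustration under assumed inputs (pixel coordinates, a `visitors` count per city, and a 10-pixel pick radius), not the renderer's actual code.

```python
import math


def pick_city(cursor, cities, radius=10.0):
    """Choose the city to select for a cursor position, in pixels.

    Among cities within `radius` of the cursor, the most popular wins;
    distance only breaks ties between equally popular cities.
    """
    candidates = []
    for city in cities:
        distance = math.hypot(city["x"] - cursor[0], city["y"] - cursor[1])
        if distance <= radius:
            # Negate visitors so that min() prefers the largest count.
            candidates.append((-city["visitors"], distance, city["name"]))
    if not candidates:
        return None
    return min(candidates)[2]
```

The trade-off is that small cities next to large ones become hard to reach with the mouse at all, which is where a search box earns its place.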

Find the surprises!

I build these visualizations so I can explore them myself, so my favorite part of the whole process is the chance to sit and play with the results. There are always unexpected stories hidden in there, and I love uncovering them. For example, who knew that the city that had the most visitors to Paris was West Hollywood? When I lived in Los Angeles I used to love popping by the wonderful patisseries. Now I know why they’re so good! These little details are the stories that catch people’s imagination and cause them to spread the word, so think about writing a few of them up to help visitors understand what the page can tell them.

You’ll never know whether one of your visualizations will become popular ahead of time, but the real reward is enjoying your own work. I hope this short guide gives you some ideas for visualizations you want to build. I look forward to seeing what you come up with.

See the full Facebook vacation visualization.



Utilities vs Networks

February 11, 2012 by Fred


It’s interesting to see a network, Instagram, starting to replace the iPhone’s native camera application in many users’ daily usage of their phones. I see this in my kids’ behavior all the time. When they want to take a photo, they open Instagram, not the camera application.

In the PC era, when applications got bundled into the operating system, they became unstoppable. All the competitive apps got left in the dust. But in the mobile era, it seems that a different dynamic is at play. The native applications are getting beaten by networks. And these networks will eventually go cross-platform, which means that the native applications will be at an even greater disadvantage.

I expect we will see this happen not only with the camera application, but also with the calendar app, the contacts app, the to-do list, etc. Clean, simple, networked, social, cross-platform mobile apps will be the winning model in the mobile ecosystem and the OS vendors will not be able to maintain dominance with the default apps they ship with the OS.

Networks beat utilities in the age when everyone is connected to everyone else. This is a big opportunity for startups. We’ve already got a few bets in this area and are looking to make more.

Website from Sherwin-Williams Analyzes Images to Create Color Palette

February 10, 2012 by (author unknown)



Here’s a truly useful idea for Sherwin-Williams from McKinney. Chip It! analyzes every pixel in an image to match it to one of the paint manufacturer’s 1,500 paint colors, and composites the top 10 into ChipCards.
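
The post doesn’t say how Chip It! does its matching, so the following is only a rough sketch of the general idea: map every pixel to its nearest catalog color, count the matches, and keep the most frequent ones. The four-color palette, the function names and the use of plain RGB distance are all assumptions made for illustration, not details of the actual tool.

```python
from collections import Counter

# Hypothetical mini-palette standing in for the roughly 1,500 real paint colors.
PALETTE = {
    "brick red": (178, 34, 34),
    "sky blue": (135, 206, 235),
    "forest green": (34, 139, 34),
    "cream": (255, 253, 208),
}

def nearest_color(pixel, palette=PALETTE):
    """Return the palette name closest to an (r, g, b) pixel by squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, palette[name])),
    )

def top_colors(pixels, n=10, palette=PALETTE):
    """Match every pixel to a palette color and return the n most common matches."""
    counts = Counter(nearest_color(px, palette) for px in pixels)
    return [name for name, _ in counts.most_common(n)]

# A tiny stand-in "image": three reddish pixels, two bluish, one cream.
pixels = [
    (180, 30, 30), (176, 40, 36), (179, 33, 35),
    (130, 200, 240), (136, 205, 236),
    (250, 250, 210),
]
print(top_colors(pixels, n=2))
```

A production tool would more likely compare colors in a perceptual color space than in raw RGB, but the match-count-rank shape would stay the same.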

Pirate file-sharing goes 3D

February 6, 2012 by New Scientist


Could The Pirate Bay’s move open the door for a new wave of piracy as people scan objects using a 3D scanner and share them online? The prospect may seem unlikely, but remember that MP3 players were a niche market until free music from the likes of

Mark Zuckerberg’s Letter to Investors: ‘The Hacker Way’

February 1, 2012 by Epicenter Staff


Mark Zuckerberg giving the keynote at the SXSW conference in 2009. Credit: Jim Merithew/Wired.com


On Wednesday, Facebook filed the prospectus for a $5 billion initial public offering. Here is CEO Mark Zuckerberg’s letter to potential investors.

Facebook was not originally created to be a company. It was built to accomplish a social mission — to make the world more open and connected.

We think it’s important that everyone who invests in Facebook understands what this mission means to us, how we make decisions and why we do the things we do. I will try to outline our approach in this letter.

At Facebook, we’re inspired by technologies that have revolutionized how people spread and consume information. We often talk about inventions like the printing press and the television — by simply making communication more efficient, they led to a complete transformation of many important parts of society. They gave more people a voice. They encouraged progress. They changed the way society was organized. They brought us closer together.

Today, our society has reached another tipping point. We live at a moment when the majority of people in the world have access to the internet or mobile phones — the raw tools necessary to start sharing what they’re thinking, feeling and doing with whomever they want. Facebook aspires to build the services that give people the power to share and help them once again transform many of our core institutions and industries.

There is a huge need and a huge opportunity to get everyone in the world connected, to give everyone a voice and to help transform society for the future. The scale of the technology and infrastructure that must be built is unprecedented, and we believe this is the most important problem we can focus on.

We hope to strengthen how people relate to each other.

Even if our mission sounds big, it starts small — with the relationship between two people.

Personal relationships are the fundamental unit of our society. Relationships are how we discover new ideas, understand our world and ultimately derive long-term happiness.

At Facebook, we build tools to help people connect with the people they want and share what they want, and by doing this we are extending people’s capacity to build and maintain relationships.

People sharing more — even if just with their close friends or families — creates a more open culture and leads to a better understanding of the lives and perspectives of others. We believe that this creates a greater number of stronger relationships between people, and that it helps people get exposed to a greater number of diverse perspectives.

By helping people form these connections, we hope to rewire the way people spread and consume information. We think the world’s information infrastructure should resemble the social graph — a network built from the bottom up or peer-to-peer, rather than the monolithic, top-down structure that has existed to date. We also believe that giving people control over what they share is a fundamental principle of this rewiring.

We have already helped more than 800 million people map out more than 100 billion connections so far, and our goal is to help this rewiring accelerate.

We hope to improve how people connect to businesses and the economy.

We think a more open and connected world will help create a stronger economy with more authentic businesses that build better products and services.

As people share more, they have access to more opinions from the people they trust about the products and services they use. This makes it easier to discover the best products and improve the quality and efficiency of their lives.

One result of making it easier to find better products is that businesses will be rewarded for building better products — ones that are personalized and designed around people. We have found that products that are “social by design” tend to be more engaging than their traditional counterparts, and we look forward to seeing more of the world’s products move in this direction.

Our developer platform has already enabled hundreds of thousands of businesses to build higher-quality and more social products. We have seen disruptive new approaches in industries like games, music and news, and we expect to see similar disruption in more industries by new approaches that are social by design.

In addition to building better products, a more open world will also encourage businesses to engage with their customers directly and authentically. More than four million businesses have Pages on Facebook that they use to have a dialogue with their customers. We expect this trend to grow as well.

We hope to change how people relate to their governments and social institutions.

We believe building tools to help people share can bring a more honest and transparent dialogue around government that could lead to more direct empowerment of people, more accountability for officials and better solutions to some of the biggest problems of our time.

By giving people the power to share, we are starting to see people make their voices heard on a different scale from what has historically been possible. These voices will increase in number and volume. They cannot be ignored. Over time, we expect governments will become more responsive to issues and concerns raised directly by all their people rather than through intermediaries controlled by a select few.

Through this process, we believe that leaders will emerge across all countries who are pro-internet and fight for the rights of their people, including the right to share what they want and the right to access all information that people want to share with them.

Finally, as more of the economy moves towards higher-quality products that are personalized, we also expect to see the emergence of new services that are social by design to address the large worldwide problems we face in job creation, education and health care. We look forward to doing what we can to help this progress.

Our Mission and Our Business

As I said above, Facebook was not originally founded to be a company. We’ve always cared primarily about our social mission, the services we’re building and the people who use them. This is a different approach for a public company to take, so I want to explain why I think it works.

I started off by writing the first version of Facebook myself because it was something I wanted to exist. Since then, most of the ideas and code that have gone into Facebook have come from the great people we’ve attracted to our team.

Most great people care primarily about building and being a part of great things, but they also want to make money. Through the process of building a team — and also building a developer community, advertising market and investor base — I’ve developed a deep appreciation for how building a strong company with a strong economic engine and strong growth can be the best way to align many people to solve important problems.

Simply put: we don’t build services to make money; we make money to build better services.

And we think this is a good way to build something. These days I think more and more people want to use services from companies that believe in something beyond simply maximizing profits.

By focusing on our mission and building great services, we believe we will create the most value for our shareholders and partners over the long term — and this in turn will enable us to keep attracting the best people and building more great services. We don’t wake up in the morning with the primary goal of making money, but we understand that the best way to achieve our mission is to build a strong and valuable company.

This is how we think about our IPO as well. We’re going public for our employees and our investors. We made a commitment to them when we gave them equity that we’d work hard to make it worth a lot and make it liquid, and this IPO is fulfilling our commitment. As we become a public company, we’re making a similar commitment to our new investors and we will work just as hard to fulfill it.

The Hacker Way

As part of building a strong company, we work hard at making Facebook the best place for great people to have a big impact on the world and learn from other great people. We have cultivated a unique culture and management approach that we call the Hacker Way.

The word “hacker” has an unfairly negative connotation from being portrayed in the media as people who break into computers. In reality, hacking just means building something quickly or testing the boundaries of what can be done. Like most things, it can be used for good or bad, but the vast majority of hackers I’ve met tend to be idealistic people who want to have a positive impact on the world.

The Hacker Way is an approach to building that involves continuous improvement and iteration. Hackers believe that something can always be better, and that nothing is ever complete. They just have to go fix it — often in the face of people who say it’s impossible or are content with the status quo.

Hackers try to build the best services over the long term by quickly releasing and learning from smaller iterations rather than trying to get everything right all at once. To support this, we have built a testing framework that at any given time can try out thousands of versions of Facebook. We have the words “Done is better than perfect” painted on our walls to remind ourselves to always keep shipping.

Hacking is also an inherently hands-on and active discipline. Instead of debating for days whether a new idea is possible or what the best way to build something is, hackers would rather just prototype something and see what works. There’s a hacker mantra that you’ll hear a lot around Facebook offices: “Code wins arguments.”

Hacker culture is also extremely open and meritocratic. Hackers believe that the best idea and implementation should always win — not the person who is best at lobbying for an idea or the person who manages the most people.

To encourage this approach, every few months we have a hackathon, where everyone builds prototypes for new ideas they have. At the end, the whole team gets together and looks at everything that has been built. Many of our most successful products came out of hackathons, including Timeline, chat, video, our mobile development framework and some of our most important infrastructure like the HipHop compiler.

To make sure all our engineers share this approach, we require all new engineers — even managers whose primary job will not be to write code — to go through a program called Bootcamp where they learn our codebase, our tools and our approach. There are a lot of folks in the industry who manage engineers and don’t want to code themselves, but the type of hands-on people we’re looking for are willing and able to go through Bootcamp.

The examples above all relate to engineering, but we have distilled these principles into five core values for how we run Facebook:

Focus on Impact

If we want to have the biggest impact, the best way to do this is to make sure we always focus on solving the most important problems. It sounds simple, but we think most companies do this poorly and waste a lot of time. We expect everyone at Facebook to be good at finding the biggest problems to work on.

Move Fast

Moving fast enables us to build more things and learn faster. However, as most companies grow, they slow down too much because they’re more afraid of making mistakes than they are of losing opportunities by moving too slowly. We have a saying: “Move fast and break things.” The idea is that if you never break anything, you’re probably not moving fast enough.

Be Bold

Building great things means taking risks. This can be scary and prevents most companies from doing the bold things they should. However, in a world that’s changing so quickly, you’re guaranteed to fail if you don’t take any risks. We have another saying: “The riskiest thing is to take no risks.” We encourage everyone to make bold decisions, even if that means being wrong some of the time.

Be Open

We believe that a more open world is a better world because people with more information can make better decisions and have a greater impact. That goes for running our company as well. We work hard to make sure everyone at Facebook has access to as much information as possible about every part of the company so they can make the best decisions and have the greatest impact.

Build Social Value

Once again, Facebook exists to make the world more open and connected, and not just to build a company. We expect everyone at Facebook to focus every day on how to build real value for the world in everything they do.

Thanks for taking the time to read this letter. We believe that we have an opportunity to have an important impact on the world and build a lasting company in the process. I look forward to building something great together.

How Amazon Could Split Netflix and iTunes to Win Streaming Video

January 28, 2012 by Tim Carmody


Amazon Video image courtesy Amazon

Everyone knows that Amazon wants to extend its digital media offerings. Any online retailer of analog media would. Its executives know the long-term trends for sales of DVDs, Blu-Rays and their players. The company that dominates e-book and e-reader sales was already beaten to digital music by Apple. Jeff Bezos never wants that to happen again.

Everything from the Kindle Fire’s video support, to cloud storage for Amazon Video On Demand, to free subscription-based video for Amazon Prime suggests that Bezos wants to make a play for digital television and movies. Media company sources speaking to Claire Atkinson at The New York Post say that a standalone, subscription-based video service separate from Amazon Prime is in the works, a direct challenge to Netflix’s streaming service. In Netflix’s letter to investors following its recent quarterly results, Netflix CEO Reed Hastings writes that he and his team fully expect Amazon to launch a competing service, challenging them for both customers and content.

I asked an Amazon spokesperson about these two stories; she said only that she had nothing to announce about a video subscription change and that Amazon didn’t comment on rumors or speculation. I also asked more broadly about Amazon’s plans for its subscription and on-demand services — specifically the company’s content acquisition strategy for each service. This was her reply:

Our focus is on our customers – and customers tell us they want lots of video content at a great price. So we’re working hard every day to continue growing both the Amazon Instant Video (currently 100,000) and Prime Instant Video (currently 13,000) title counts and we work very closely with our studio partners to add more content at a great value for customers.

One way or another, Amazon seems committed to pursuing both a Netflix-style streaming subscription library and iTunes-style digital downloads. And really, out of the major players in digital video, it’s the only one who’s substantively pursuing both approaches. How do they complement each other? What does offering both give Amazon and its customers that it doesn’t give iTunes or Netflix?

Earlier this week, I noted that audiences for broadcast and streaming television content seemed to be diverging, particularly for subscription-based streaming services like Netflix and Hulu. Broadcast television rewards big audiences who gather based on individual events or popular time slots; streaming television rewards the long tail, niche programs that are (as Hulu’s Andy Forssell says) “beloved, not beliked.” We binge on these shows, and the presence of just a few of them (plus a wide enough range of content around them) is enough for many of us to continue to pay $8 every month.

Netflix offers another illustration of this divergence in content and audience, not just in its streaming service but in streaming in tandem with DVD. But here, the picture gets a little more complicated. When Netflix’s subscriber base and digital content library were smaller, and every account included both rental discs and streaming video, the two services complemented each other in different ways.

  • Viewers could flip through new instant titles, watching older back catalog movies like they were coming on basic cable, but order whole seasons of HBO television series one or more discs at a time.
  • But as the digital library grew and became more stable, viewers began to take the digital catalog for granted; episodes of Futurama or Firefly were always there if you needed them, but you could also still rent must-have new movies like Inception that would never end up on Netflix Instant in a million years. Even if you wished that a favorite or exceptionally popular movie might be on streaming, there was always another route to seeing it.

So at a single price, Netflix satisfied both modes of video consumption. Digital streaming helped grow the subscriber base and increase the power of the service; even as streaming became more core to the Netflix experience and DVD more of an add-on, DVDs were really providing the stable profit base for the business. Now that those two modes of delivery have been uncoupled, those two complementary parts of the Netflix experience and the company’s bottom line have become uncoupled as well.

Now, the media companies offering the best mix of those two modes at a single price are arguably cable operators and channels, with a combination of on-demand through the cable box and the new digital TV Everywhere approaches. You have live TV; you have the archive; maybe you even have a cable-provided DVR that provides extra options. But that means at a minimum a hefty cable subscription, one that quickly becomes enormous once premium channels, DVR fees, and other associated costs are added in.

The only competitor to cable, then, on range of services alone, is Amazon. It gives you a streaming library at a base rate of $6.58 each month, or free, if you count the cost of Amazon Prime only as a discount on shipping Amazon.com purchases rather than as a charge for a streaming video library. Then it also has the video on demand library, from which customers can buy or rent, and which serves much the same role that Netflix DVDs once did to complement its streaming service.
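
For anyone wondering where the $6.58 figure comes from, it is simply the annual Amazon Prime fee spread over twelve months. The article itself does not state the annual price; the $79 used below is an assumption based on what Prime cost in the US in early 2012.

```python
# Assumed annual Amazon Prime fee in USD (not stated in the article).
PRIME_ANNUAL_FEE = 79.00

# Spread the annual fee evenly across twelve months.
monthly = PRIME_ANNUAL_FEE / 12
print(f"${monthly:.2f} per month")
```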

The major difference between Netflix and Amazon Prime (or Hulu and Amazon Prime) is the strength of the content in their respective streaming libraries. There’s a good deal of overlap between Netflix and Amazon, and in some cases even Hulu, as companies have licensed off a lot of their back content nonexclusively for a song. But generally speaking, it’s pretty easy to say that content that’s exclusive to Netflix or Hulu Plus is much stronger than the vast majority of Amazon Prime’s streaming catalog, which is somewhere roughly around where Netflix’s streaming options were a few years ago.

But if the Post and Netflix’s Reed Hastings are right, and Amazon were to either replace or augment the current Amazon Prime streaming library with a premium streaming library at a monthly fee, now Amazon has more money to compete with Netflix and Hulu for content. It can match their libraries, capture exclusive deals with TV or movie studios, or compete with Netflix, Hulu and YouTube for born-digital original content.

Amazon could also do things that Netflix or iTunes were never willing to do because Netflix and Apple both wanted to preserve as simple a purchasing model as possible. For instance, negotiations between Netflix and Starz broke down because Starz wanted Netflix to treat it as a premium channel, with a separate customer fee and revenue-sharing arrangement from the rest of Netflix’s streaming catalog — tied to subscriber numbers, not a lump sum. Apple, too, has tried to keep all shows more or less the same price, which in some cases (for instance, James Cameron’s Avatar) has cost it some high-value content.

Amazon, though, already has a variety of different plans for watching video: streaming or download, rental or purchase, and (in this hypothetical scenario), Amazon Prime Free or Amazon Premium subscription. Maybe adding in lots of extra channels, all at their own prices, would be too complicated. But an Amazon Premium streaming library could certainly support deals with a handful of content partners and a revenue-sharing arrangement like Starz wanted from Netflix.

In any event, Amazon’s greater flexibility than Netflix, iTunes, Hulu or anyone else offers a distinct advantage. With Amazon’s cloud storage and video on demand purchases, you can essentially build your own a la carte streaming library that you can take with you anywhere. It’s everything Netflix offered through both streaming and DVDs, all moved into the cloud.

The major disadvantage Amazon has relative to Netflix is Netflix’s support for a superior range of devices. Arguably, it was Netflix’s expansion beyond the PC to game consoles, set-top boxes, phones and tablets that helped launch its subscriber numbers through the stratosphere in 2010 and early 2011, far more than anything on the content side. And Netflix still gets to devices that Amazon and Hulu can’t. (Classic example: the Nintendo Wii, an immensely popular video game system that natively supports streaming video from Netflix and virtually no one else.)

Amazon Instant Video, meanwhile, is only on the Xbox 360; it’s not on the PS3, iPad, iPhone, many Android phones (really, almost any phones) or the Wii. There’s no real reason, as far as I know, that Amazon couldn’t make (and device makers couldn’t support) a Netflix-style app for iOS or the Wii.

Here, too, Amazon’s spokesperson wouldn’t comment directly, but responded simply that Amazon video was available on “more than 300 connected, compatible devices” and that “[w]e are always evaluating new devices and different ways to offer Amazon Instant Video.” That’s good, because platform availability is one of the only things holding Amazon’s video services back.

In fact, I have a perfect example of this that just so happens to involve Amazon, Apple and Netflix. It comes via Gregory Warner and Kai Ryssdal at American Public Media’s radio show Marketplace:

Of the 600,000 new subscribers [Netflix] says it added by the end of December, 97 percent of them were still on a free trial.

Tony Wible tracks tech stocks for Janney Capital Markets. He explains that Netflix was still losing customers in October and November because of its price hike. Then came Christmas, with all those gifts of iPads and Kindles and Nooks.

“A lot of devices were sold with free Netflix trials embedded within them,” Wible said. “All you had to do was click the Netflix app and sign up,” says Ryssdal. “And hundreds of thousands of people did.”

If a one-month free Netflix trial on iPads and Kindle Fires is the difference between Netflix gaining subscribers and losing subscribers, its stock bouncing back and it collapsing into free-fall, imagine what the sudden availability of an Amazon video app on iPads, PS3s and Nintendo Wiis could do.

And make no mistake; Amazon’s team would figure out a way not to just goose subscriber numbers, but make real money. They’re only getting started.

Report: Facebook Files for IPO Next Week

January 27, 2012 by Mike Isaac


CEO Mark Zuckerberg at an event in 2011. Photo: Jon Snyder/Wired.com

After years of speculation and anticipation on Wall Street and across Silicon Valley, Facebook may file for its initial public offering as soon as Wednesday of next week, according to a report citing people familiar with the matter.

The social giant’s IPO could raise as much as $10 billion, with investment firms Morgan Stanley and Goldman Sachs as the lead underwriters, according to The Wall Street Journal. This would put the company’s valuation at approximately $75 to $100 billion, the highest IPO of any technology company in Silicon Valley history.

Facebook did not immediately respond to a request for comment.

Facebook’s IPO is one of the most highly anticipated filings in Silicon Valley, a topic nearing obsession for the tech press and venture capital community at large. The company has swelled since its founding only seven years ago, with a user base of over 800 million people worldwide. The expansion of the social network has only fueled anticipation for an IPO.

But CEO and co-founder Mark Zuckerberg has been in no rush to take his company public, continually rebuffing questions of when Facebook would finally make its Wall Street debut. The 27-year-old CEO has had a number of exit opportunities in the company’s history, with tech titans as large as Yahoo and Microsoft offering billions of dollars to acquire the social network. Zuckerberg, then years younger, famously turned down both, instead opting to continue to grow and improve his company while continuing to accept venture funding.

If the company were to go public next week, Zuckerberg and a host of other shareholders would become multi-billionaires overnight. A filing next week would bring Facebook’s Wall Street debut in the summer, the pièce de résistance in a string of major 2011 tech IPOs, including Zynga, LinkedIn, and social deals web site Groupon.

Listen to What Innovators Don't Talk About

January 25, 2012 by Michael Schrage


While working away on my laptop at a hotel breakfast, I couldn’t help but overhear the four gentlemen poring over an iPad two tables away. Their intense discussion revolved around rolling out their high-tech prototypes in a medical care complex. Since I’ve written about prototypes and prototyping, I couldn’t help but eavesdrop.

Forgive me.

The foursome represented a mix of medical care complex personnel and what was clearly an entrepreneurial innovator with a potentially high-impact idea. I’ll skip the technical details, but this was clearly a sophisticated group who were both smart and ambitious. The prototypes were their gateways to success. Their debates included whether it made more sense to field one or two more “finished” prototypes or whether they could get more information more quickly by fielding “roughs.” Were “staggered roll-outs” more cost-effective than “staggered builds”? They talked about the need to be able to “patch” quickly and whether their prototypes should optimize particular subsystems or overall system performance. They argued timelines and sequencing for test.

These questions are classic and it’s always fascinating to hear how — and what — decides them. Getting great value and insight from prototypes and pilots is more an art and craft than a science. Successful tech prototyping in health care contexts is particularly demanding.

That’s why the more passionately they spoke, the more nervous I got. Something was missing. Whenever innovators gather, I always listen for what’s not discussed. In almost 50 minutes of detailed discussion (yes, I am that kind of eavesdropper), I heard not a single mention, reference or allusion to the challenge of training the people onsite on how best to use or learn from the prototype. Details of prototype design and roll out were discussed as if the medical care personnel were irrelevant to the process. It reeked of “over the wall” technology transfer. OMG.

When something isn’t explicitly discussed, that doesn’t mean it’s not important or being ignored. Usually, it indicates a topic that’s either taken for granted — i.e., we know that already so there’s no point in discussing it now — or that it’s someone else’s job so it’s really their problem to discuss. While this behavior is typical, it practically defines innovation dysfunction.

Any innovator deploying any prototypes in the field can’t possibly assess the economics and costs of staggered roll-outs, staggered builds and optimization trade-offs independent of the people who will actually be using those prototypes. Their level of training, their abilities to observe and report, their mistakes and misunderstandings, the natural variability they individually introduce are costs and risk factors that invariably influence design decisions around the prototype. What’s more, as new and improved iterations of the prototypes emerge, so do new demands for training, observation, interaction and error management. If your conversations don’t reflect and respect that reality, you’re not doing planning or design, you’re simply indulging in speculation.

In other words, you can’t meaningfully budget, schedule and iteratively improve prototypes without literally and figuratively accounting for their users. You certainly can’t do this in the high-tech, capital-intensive, information-rich and litigation-crazy health care environment. If anything, medical systems prototype development and testing has to be even more user-sensitive and aware.

This design/prototyping conversation wasn’t. They talked about virtually every system-level element except their people in the field.

After they fixed on their sequencing and schedules, they stood up to shake hands. They had agreed to a prototyping protocol that would be obsolete moments after real humans had real interactions with their real prototypes. The great German General von Moltke once observed that, “All plans evaporate on contact with the enemy.” For serious innovators, that aphorism becomes, “All prototypes evolve on contact with the user.”

Are you talking too much about your prototypes and not enough about their users? Or do you (honestly) think that because your prototypes have incorporated your users’ requirements, you don’t need to talk about them anymore? Pay close attention to what you don’t talk about.

Big data market survey: Hadoop solutions

January 19, 2012 by Edd Dumbill



The big data ecosystem can be confusing. The popularity of “big data” as an industry buzzword has created a broad category. As Hadoop steamrolls through the industry, solutions from the business intelligence and data warehousing fields are also attracting the big data label. To confuse matters, Hadoop-based solutions such as Hive are at the same time evolving toward being a competitive data warehousing solution.


Understanding the nature of your big data problem is a helpful first step in evaluating potential solutions. Let’s remind ourselves of the definition of big data (http://radar.oreilly.com/2012/01/what-is-big-data.html):


“Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”


Big data problems vary in how heavily they weigh in on the axes
of volume, velocity and variability. Predominantly structured yet
large data, for example, may be most suited to an analytical
database approach.


This survey makes the assumption that a data warehousing
solution alone is not the answer to your problems, and concentrates on
analyzing the commercial Hadoop ecosystem. We’ll focus on the
solutions that incorporate storage and data processing,
excluding those products which only sit above those layers, such
as the visualization or analytical workbench software.

Getting started with Hadoop doesn’t require a large
investment as the software is open source, and is also available
instantly through the Amazon Web Services cloud. But for
production environments, support, professional services and
training are often required.

Just Hadoop?




Apache Hadoop is unquestionably the center of the latest
iteration of big data solutions. At its heart, Hadoop is a
system for distributing computation among commodity servers. It
is often used with the Hadoop Hive project, which layers data
warehouse technology on top of Hadoop, enabling ad-hoc
analytical queries.
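
To make the idea of distributing computation more concrete, here is a minimal single-process sketch of the map, shuffle and reduce pattern associated with Hadoop MapReduce. It is an illustration only: it uses plain Python rather than any Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical. On a real cluster, each phase would run in parallel across many commodity servers.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different server; here we just loop.
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data moves fast"]))
```

Because the map step looks at one document at a time and the reduce step looks at one key at a time, both can be spread over many machines, which is the property the paragraph above describes.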


Big data platforms divide along the lines of their approach to
Hadoop. The big data offerings from familiar enterprise vendors
incorporate a Hadoop distribution, while other platforms
offer Hadoop connectors to their existing analytical database
systems. This latter category tends to comprise massively
parallel processing (MPP) databases that made their name in big
data before Hadoop matured: Vertica and Aster Data. Hadoop’s
strength in these cases is in processing unstructured data in
tandem with the analytical capabilities of the existing database
on structured data.



Practical big data implementations don’t in general fall neatly
into either structured or unstructured data
categories. You will invariably find Hadoop working as part of a
system with a relational or MPP database.



Much as with Linux before it, no Hadoop solution incorporates
the raw Apache Hadoop code. Instead, it’s packaged into
distributions. At a minimum, these distributions have been
through a testing process, and often include additional
components such as management and monitoring tools. The most
widely used distributions now come from Cloudera, Hortonworks and
MapR. Not every distribution will be commercial, however: the
BigTop project aims to create a Hadoop distribution under the
Apache umbrella.



Integrated Hadoop systems


The leading Hadoop enterprise software vendors have aligned their
Hadoop products with the rest of their database and analytical
offerings. These vendors don’t require you to source Hadoop from
another party, and offer it as a core part of their big data
solutions. Their offerings integrate Hadoop into a
broader enterprise setting, augmented by analytical and workflow
tools.

EMC Greenplum


Database: Greenplum Database
Deployment options: Appliance (Modular Data Computing Appliance); Software (Enterprise Linux)
Hadoop: Bundled distribution (Greenplum HD); Hive, Pig, Zookeeper, HBase
NoSQL component: HBase

Acquired by EMC, and rapidly taken to the heart of the
company’s strategy, Greenplum is a relative newcomer to the
enterprise, compared
to other companies in this section. They have turned that to
their advantage in creating an analytic platform, positioned as
taking analytics “beyond BI” with agile data science teams.

Greenplum’s Unified Analytics Platform (UAP) comprises three
elements: the Greenplum MPP database, for structured data; a
Hadoop distribution, Greenplum HD; and Chorus
(http://www.greenplum.com/products/chorus), a
productivity and groupware layer for data science teams.

The HD Hadoop layer builds on MapR’s Hadoop-compatible
distribution, which replaces the file system with a faster
implementation and provides other features for
robustness. Interoperability between HD and Greenplum Database
means that a single query can access both database and Hadoop data.

Chorus is a unique feature, and is indicative of Greenplum’s commitment
to the idea of data science and the importance of the agile team
element to effectively exploiting big data. It supports
organizational roles from analysts, data scientists and DBAs
through to executive business stakeholders.



As befits EMC’s role in the data center market, Greenplum’s UAP is
available in a modular appliance configuration.

IBM


IBM InfoSphere

Database: DB2
Deployment options: Software (Enterprise Linux); Cloud
Hadoop: Bundled distribution (InfoSphere BigInsights); Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Lucene
NoSQL component: HBase


IBM’s InfoSphere BigInsights
(http://www-01.ibm.com/software/data/infosphere/biginsights/)
is their Hadoop distribution, and part of a suite
of products offered under the “InfoSphere” information management
brand. Everything big data at IBM is helpfully labeled
Big, appropriately enough for a company affectionately known as “Big
Blue.”

BigInsights augments Hadoop with a variety of features,
including
management and administration tools. It also offers textual analysis tools
that aid with entity resolution — identifying people, addresses,
phone numbers and so on.

IBM’s Jaql query language provides a point of integration
between Hadoop and other IBM products, such as relational databases
or Netezza data warehouses.

InfoSphere BigInsights is interoperable with IBM’s other
database and warehouse products, including DB2, Netezza and its
InfoSphere warehouse and analytics lines. To aid analytical
exploration, BigInsights ships with BigSheets, a spreadsheet
interface onto big data.

IBM addresses streaming big data separately through its InfoSphere
Streams product (http://www-01.ibm.com/software/data/infosphere/streams/). BigInsights is not currently offered in an
appliance form, but can be used in the cloud via Rightscale, Amazon, Rackspace, and IBM Smart Enterprise Cloud.

Microsoft


Database: SQL Server
Deployment options: Software (Windows Server); Cloud (Windows Azure Cloud)
Hadoop: Bundled distribution (Big Data Solution); Hive, Pig

Microsoft have adopted Hadoop as the center of their big data
offering, and are pursuing an integrated approach aimed at making
big data available through their analytical tool suite, including
the familiar tools of Excel and PowerPivot.

Microsoft’s Big Data Solution
(http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data-solution.aspx)
brings Hadoop to the Windows Server platform,
and in elastic form to their cloud platform Windows
Azure. Microsoft have packaged their own distribution of Hadoop,
integrated with System Center and Active Directory.
They intend to contribute back changes to Apache Hadoop to
ensure that an open source version of Hadoop will run on Windows.

On the server side, Microsoft offer integrations to their SQL
Server database and their data warehouse product. Using their
warehouse solutions isn’t mandated, however. The Hadoop Hive data
warehouse is part of the Big Data Solution, including
connectors from Hive to ODBC and Excel.

Microsoft’s focus on the developer is evident in their creation
of a JavaScript API for Hadoop. Using JavaScript, developers can
create Hadoop jobs for MapReduce, Pig or Hive, even from a
browser-based environment. Visual Studio and .NET integration
with Hadoop is also provided.

Deployment is possible either on the server or in the cloud, or
as a hybrid combination. Jobs written against the Apache Hadoop
distribution should migrate with minimal changes to Microsoft’s
environment.


Oracle


Oracle Big Data

Deployment options: Appliance (Oracle Big Data Appliance)
Hadoop: Bundled distribution (Cloudera’s Distribution including Apache Hadoop); Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Sqoop, Mahout, Whirr
NoSQL component: Oracle NoSQL Database


Announcing their entry into the big data market at the end of
2011, Oracle is taking an appliance-based approach. Their Big
Data Appliance
(http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html)
integrates Hadoop, R for analytics, a new
Oracle NoSQL database, and connectors to Oracle’s
database and Exadata data warehousing product line.

Oracle’s approach caters to the high-end enterprise market, and
particularly leans to the rapid-deployment, high-performance end
of the spectrum. Oracle is the only vendor to include the popular R
analytical language integrated with Hadoop, and to ship a NoSQL
database of its own design as opposed to Hadoop HBase.



Rather than developing their own Hadoop distribution, Oracle
have partnered with Cloudera for Hadoop support, which brings them
a mature and established Hadoop solution. Database connectors
again promote the integration of structured Oracle data with the
unstructured data stored in Hadoop HDFS.

Oracle’s NoSQL Database
(http://www.oracle.com/us/products/database/nosql/overview/index.html)
is a scalable key-value database, built on the
Berkeley DB technology. In that, Oracle owes double gratitude to
Cloudera CEO Mike Olson, as he was previously the CEO of
Sleepycat, the creators of Berkeley DB. Oracle are positioning
their NoSQL database as a means of acquiring big data prior to
analysis.

The Oracle R Enterprise product offers direct integration into
the Oracle database, as well as Hadoop, enabling R scripts to run
on data without having to round-trip it out of the data stores.


Availability

While IBM and Greenplum’s offerings are available at the time
of writing, the Microsoft and Oracle solutions are expected to be
fully available early in 2012.

Analytical databases with Hadoop connectivity



MPP (massively parallel processing) databases are specialized
for processing structured big data, as distinct from the
unstructured data that is Hadoop’s specialty. Along with Greenplum,
Aster Data and Vertica were early pioneers of big data
products before the mainstream emergence of Hadoop.

These MPP solutions are databases specialized for analytical
workloads and data integration, and provide connectors to
Hadoop and data warehouses. A
recent spate of acquisitions has seen these products become the
analytical play by data warehouse and storage vendors: Teradata
acquired Aster Data, EMC acquired Greenplum, and HP acquired
Vertica.

Quick facts




Aster Data

Database: MPP analytical database
Hadoop: Hadoop connector available

ParAccel

Database: MPP analytical database
Deployment options: Software (Enterprise Linux); Cloud (Cloud Edition)
Hadoop: Hadoop integration available

Vertica

Database: MPP analytical database
Deployment options: Appliance (HP Vertica Appliance); Software (Enterprise Linux); Cloud (Cloud and Virtualized)
Hadoop: Hadoop and Pig connectors available


Hadoop-centered companies

Directly employing Hadoop is another route to creating a big
data solution, especially where your infrastructure doesn’t fall
neatly into the product line of major vendors. Practically every
database now features Hadoop connectivity, and there are multiple
Hadoop distributions to choose from.

Reflecting the developer-driven ethos of the big data world,
Hadoop distributions are frequently offered in a community edition.
Such editions lack enterprise management features, but contain all
the functionality needed for evaluation and development.

The first iterations of Hadoop distributions, from Cloudera and
IBM, focused on usability and administration. We are now seeing the
addition of performance-oriented improvements to Hadoop, such as
those from MapR and Platform Computing. While maintaining API
compatibility, these vendors replace slow or fragile parts of the
Apache distribution with better performing or more robust components.
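
As a rough illustration of what maintaining API compatibility means in practice, the sketch below shows a job written against an interface and run unchanged on two interchangeable storage back ends. Everything here is hypothetical: the FileSystem interface, both classes and run_job are invented for the example and do not correspond to any real Hadoop, MapR or Platform Computing API.

```python
from typing import Protocol


class FileSystem(Protocol):
    """Hypothetical interface that jobs are written against."""

    def write(self, path: str, data: bytes) -> None: ...

    def read(self, path: str) -> bytes: ...


class ReferenceFileSystem:
    """Stands in for the stock implementation: one blob per path."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = data

    def read(self, path):
        return self._files[path]


class ReplacementFileSystem:
    """Stands in for a vendor replacement: same methods, different internals."""

    CHUNK = 4  # store data in small chunks to show the internals can differ

    def __init__(self):
        self._chunks = {}

    def write(self, path, data):
        self._chunks[path] = [
            data[i:i + self.CHUNK] for i in range(0, len(data), self.CHUNK)
        ]

    def read(self, path):
        return b"".join(self._chunks[path])


def run_job(fs: FileSystem) -> int:
    """A job that only uses the shared interface, so either back end works."""
    fs.write("/logs/day1", b"alpha beta gamma")
    return len(fs.read("/logs/day1").split())


if __name__ == "__main__":
    print(run_job(ReferenceFileSystem()), run_job(ReplacementFileSystem()))
```

The job code never names a concrete class, which is why a vendor can swap the component underneath without existing jobs needing to change.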

Cloudera

The longest-established provider of Hadoop distributions,
Cloudera provides an
enterprise Hadoop solution, alongside
services, training and support options. Along with
Yahoo, Cloudera have made deep open source contributions to Hadoop, and
through hosting industry conferences have done much to establish
Hadoop in its current position.

Hortonworks

Though a recent entrant to the market, Hortonworks
(http://www.hortonworks.com/) have a long
history with Hadoop. Spun off from Yahoo, where Hadoop
originated, Hortonworks aims to stick close to and promote the
core Apache Hadoop technology. Hortonworks also have a partnership
with Microsoft to assist and accelerate their Hadoop
integration.

Hortonworks Data
Platform (http://hortonworks.com/technology/hortonworksdataplatform/) is currently in a limited preview phase, with a
public preview expected in early 2012. The company also provides
support and training.


An overview of Hadoop distributions







Vendors compared: Cloudera, EMC Greenplum, Hortonworks, IBM, MapR, Microsoft and Platform Computing. A vendor is listed under a feature only where an entry is given for it.

Product Name
  Cloudera: Cloudera’s Distribution including Apache Hadoop
  EMC Greenplum: Greenplum HD
  Hortonworks: Hortonworks Data Platform
  IBM: InfoSphere BigInsights
  MapR: MapR
  Microsoft: Big Data Solution
  Platform Computing: Platform MapReduce

Free Edition
  Cloudera: CDH. Integrated, tested distribution of Apache Hadoop.
  EMC Greenplum: Community Edition. 100% open source certified and supported version of the Apache Hadoop stack.
  IBM: Basic Edition. An integrated Hadoop distribution.
  MapR: MapR M3 Edition. Free community edition incorporating MapR’s performance increases.
  Platform Computing: Platform MapReduce Developer Edition. Evaluation edition, excludes resource management features of regular edition.

Enterprise Edition
  Cloudera: Cloudera Enterprise. Adds management software layer over CDH.
  EMC Greenplum: Enterprise Edition. Integrates MapR’s M5 Hadoop-compatible distribution, replaces HDFS with MapR’s C++-based file system. Includes MapR management tools.
  IBM: Enterprise Edition. Hadoop distribution, plus BigSheets spreadsheet interface, scheduler, text analytics, indexer, JDBC connector, security support.
  MapR: MapR M5 Edition. Augments M3 Edition with high availability and data protection features.
  Microsoft: Big Data Solution. Windows Hadoop distribution, integrated with Microsoft’s database and analytical products.
  Platform Computing: Platform MapReduce. Enhanced runtime for Hadoop MapReduce, API-compatible with Apache Hadoop.

Hadoop Components
  Cloudera: Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Sqoop, Mahout, Whirr
  EMC Greenplum: Hive, Pig, Zookeeper, HBase
  Hortonworks: Hive, Pig, Zookeeper, HBase, Ambari
  IBM: Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Lucene
  MapR: Hive, Pig, Flume, HBase, Sqoop, Mahout, Oozie
  Microsoft: Hive, Pig

Security
  Cloudera: Cloudera Manager. Kerberos, role-based administration and audit trails.
  IBM: Security features. LDAP authentication, role-based authorization, reverse proxy.
  Microsoft: Active Directory integration.

Admin Interface
  Cloudera: Cloudera Manager. Centralized management and alerting.
  EMC Greenplum: Administrative interfaces. MapR Heatmap cluster administrative tools.
  Hortonworks: Apache Ambari. Monitoring, administration and lifecycle management for Hadoop clusters.
  IBM: Administrative interfaces. Administrative features including Hadoop HDFS and MapReduce administration, cluster and server management, view HDFS file content.
  MapR: Administrative interfaces. MapR Heatmap cluster administrative tools.
  Microsoft: System Center integration.
  Platform Computing: Administrative interfaces. Platform MapReduce Workload Manager.

Job Management
  Cloudera: Cloudera Manager. Job analytics, monitoring and log search.
  EMC Greenplum: High-availability job management. JobTracker HA and Distributed NameNode HA prevent lost jobs, restarts and failover incidents.
  Hortonworks: Apache Ambari. Monitoring, administration and lifecycle management for Hadoop clusters.
  IBM: Job management features. Job creation, submission, cancellation, status, logging.
  MapR: High-availability job management. JobTracker HA and Distributed NameNode HA prevent lost jobs, restarts and failover incidents.

Database connectors
  EMC Greenplum: Greenplum Database
  IBM: DB2, Netezza, InfoSphere Warehouse
  Microsoft: SQL Server, SQL Server Parallel Data Warehouse

Interop features
  Microsoft: Hive ODBC Driver, Excel Hive Add-in

HDFS Access
  Cloudera: Fuse-DFS. Mount HDFS as a traditional filesystem.
  EMC Greenplum: NFS. Access HDFS as a conventional network file system.
  Hortonworks: WebHDFS. REST API to HDFS.
  MapR: NFS. Access HDFS as a conventional network file system.

Installation
  Cloudera: Cloudera Manager. Wizard-based deployment.
  IBM: Quick installation. GUI-driven installation tool.

Additional APIs
  IBM: Jaql. Jaql is a functional, declarative query language designed to process large data sets.
  MapR: REST API
  Microsoft: JavaScript API. JavaScript Map/Reduce jobs, Pig-Latin, and Hive queries.
  Platform Computing: Includes R, C/C++, C#, Java, Python.

Volume Management
  MapR: Mirroring, snapshots


Notes





  • Pure cloud solutions: Both Amazon Web Services
    and Google offer cloud-based big data solutions. These will be
    reviewed separately.

  • HPCC:
    Though dominant, Hadoop is not the only big data solution. LexisNexis’ HPCC (http://hpccsystems.com/) offers an alternative
    approach.


  • Hadapt: not yet featured in this survey.
    Taking a different approach from
    both Hadoop-centered and MPP solutions, Hadapt (http://hadapt.squarespace.com/product-overview/)
    integrates unstructured and structured data into one
    product: wrapping rather than exposing Hadoop. It is currently in “early access” stage.


  • NoSQL: Solutions built on databases such as
    Cassandra, MongoDB and Couchbase are not in the scope of this
    survey, though these databases do offer Hadoop integration.


  • Errors and omissions:
    given the fast-evolving nature of the market and variable
    quality of public information, any feedback about errors and
    omissions from this survey is most welcome. Please send it to
    edd+bigdata@oreilly.com.




    The Top 12 Social Trends in ‘12

    January 7, 2012 by Geoffrey Colon


    With the new year upon us and 2011 in the rear view mirror, it’s time to pay attention to where social media will go this year. In December, the Ogilvy Digital Influence New York City team hosted its year end 2011 Social Trends Lab. The team predicted 12 trends we think will shape and influence 2012. Is there a prediction you don’t see on this list? Let us know!

    And now without further ado, here is the Ogilvy Digital Influence crowdsourced Top 12 in ‘12 list of predictions in social media trends (in no particular order).

    1. Social Television goes mainstream. David A. Brooks and Chris Heydt said the trend to watch  is the popularization of Social Television.  New technology will have an effect on the TV industry much like Apple iOS and Android had on the  smartphone mobile market. David noted, “It has gone beyond simply dual screen. Just like you had the advent of smartphones you will see TV technology being revolutionized which will alter viewing habits.” An area Chris said to pay attention to  will be advertising and engagement. “New ads in this forum using engagement technology will be able to garner instant reaction.”
    2. Social ROI must become real. Maya Swedowsky said the debate about ROI must be clarified and made concrete in 2012. Facebook analytics need to show better tracking in order to appease the 1 in 3 CMOs demanding to see how their marketing budget invested in social actually has a direct effect on consumer purchase. “Facebook metrics need to go deeper to show if marketing expenditures are actually worth the investment. Currently, the analytics are unable to show this but will need to in order to get CMOs to continue to invest in the platform.”
    3. Mobile apps will be the main communication tools between brands and consumers. In the week between Christmas and New Year’s users downloaded 1 billion apps. Yes, you read that correctly, 1 billion. Rose Reid thinks certain apps will become just as popular as the platforms that help amplify them. Case in point is Instagram. Instagram as a community platform will grow in 2012. “A brand will eventually tap this platform to carry out an initiative. They may use other platforms to amplify the program, but more app-specific social marketing programs beyond Facebook and Twitter will expand as a result of more people owning mobile smartphone technology which prompts them to use apps as daily utilities.”
    4. News sources that harness the power of social will be credible alternatives to traditional media outlets. Layla Revis thinks one platform to watch in this area is newsmotion.org. She thinks it will emerge as an alternative to mainstream media news. “As new technologies collide with world-changing events, the independent voice (or tweet) has emerged as a major player in news media. Like Alternet and Huffington Post, a citizen news platform like Newsmotion.org is a growing trend to watch.” The innovative platform where civic media and citizen reporting converge to connect new audiences to valuable stories, resources, and — most importantly — each other is definitely a new journalism model for 2012
    5. Social media functionality will further integrate into digital website properties. Geoffrey Colon thinks the website of the future will be templated in 2012 with new design and user experiences. “The website as we’ve known it is evolving. Brands that have a website and a Facebook and Twitter presence and a mobile application need to merge all of these experiences together to create a unique and friendly digital experience. People are spending 8.5 hours per month on Facebook based on 2011 Nielsen research. Why would a consumer want to visit a brand website when they have community on the Facebook page or get updates on their Twitter feed? If the company website had Facebook or Twitter plug-ins integrated with the open graph this is one way brands can help customers get closer to their owned property. This actually helps brands get more from advocates by linking their web properties to real time conversations.” As a result, Colon thinks social aggregators will be popular on company websites and CMOs can use the data to create targeted marketing messages in real time.
    6. Apps on Facebook will become the main amplifier of brand messaging. Max Kelerstein and Stephen Cooper both think there will be an influx of apps on Facebook due to the new open graph. The ownership of verbs to track engagement and personalize a brand on one’s newsfeed will only take place with a good app that ties back to the brand message. Kelerstein states, “Making verbs personal and tying them into a unique app with great functionality is the only way many brands will ever end up in one’s social feed.” Although users tweet and post about brands, it’s more likely they will show brand love through interaction with a seamless and frictionless experience. Case in point is the popularity of such apps as Spotify and Nike+. So brands can reach their advocates through all the clutter, a paid/earned model will become an essential part of all successful app launches.
    7. Brands will become publishers and not simply curators.  Sophia Aladenoye thinks that content production and strategy will continue to grow tremendously in 2012. Brands will be expected to be content publishers, producers & planners – pushing them to become more flexible, imaginative and proactive in the creation of great content to sustain interaction with the millions of people who follow them. People, across various social networks, will expect brands to provide them with consistent & fun content that spans across platforms. Brands that embrace this challenge and focus on developing an online personality that is easily recognizable via first, second and third-party brand content will win (in the space of online attention & relationships) moving forward.
    8. A content sharing strategy will be just as important as a content production strategy. It’s not enough to simply create good content in 2012. All brands are slowly becoming publishers and there is a lot of clutter. Brands will have to plan accordingly in order to create good shareable content. It’s the end of the “produce content, push it out and make it go viral” flowchart. A brand’s social content can have as much of an effect, if not more on consumers in social channels as it does on traditional avenues.
    9. Privacy issues will lead to rewriting personal history in the social realm. In the advent of social media, many users were carefree to share everything. This included controversial tweets and photos. Like people, brands may have disharmony in their social history. And many (unfortunately) will go about the practice of editing that history where it may seem harmful to the brand. Geoffrey Colon noted that the new timeline on Facebook will probably lead many to edit their social history for fear of losing out on a job or to not embarrass themselves to friends or family. “Brands most likely will also have a similar timeline user interface on Facebook and many will be quick to edit any negativity to present themselves as pristine to potential consumers. Tools empowering the deletion of photos, omissions from their timeline and other actions to help protect their reputation will be the norm.”
    10. Healthcare and B2B will adopt what B2C brands have been doing for the last five years in social. Most consumers are used to engaging with CPGs like a favorite soda or fashion brand. But many will finally be able to communicate and do business with companies that were more conservative in getting involved in the social space. Digital health specialist Priya Kapoor says the space to watch in 2012 is in the healthcare social media space. “In 2011, we saw a lot of activity in the space from pharmaceutical companies, physicians, hospitals and patients alike. We’ve also seen how platforms such as Facebook and YouTube, working with risk-averse pharma companies, can reach key audiences. But all of this was done at a sluggish pace as the industry awaited guidance from the FDA.” With the category ending the year by acknowledging the space with newly drafted guidance, we can expect to see healthcare and B2B finally go 2.0 and beyond in 2012.
    11. Social commerce is adopted across the board. Remember all those crazy crowds and long lines on Black Friday 2011? That will be an image of the past as retailers will adopt location-based apps to help consumers with purchase in the real world in real time. Imagine this: you check in on Foursquare. Automatically, you are asked if you want to pay using the brand app. You download the app, scan the mouse bar code, get a subtotal, and you hit OK to purchase. The app takes the amount right out of your checking account, bills your credit card or debits a pre-existing gift card. Your receipt is emailed and you’re on your way out of the store to your next destination. Starbucks already does this using smartphones with a register scanner. More retailers want to make noise in this realm as customers will amplify how lovely the experience is and thus, we’ll see more of it in 2012.
    12. Brands will become more nimble along with their agency partners. Social is a 24/7/365 business. And consumers are always talking. While brands are planning with their agencies, they’re talking. Smarter agency partners will tell their clients that they must use social to actually adopt real time messaging. For a long period of time, social sat in a silo and didn’t reflect the larger brand message. That is now history just like 2011. Smarter CMOs and agency execs know that social conversation needs to be utilized not to simply see what people were saying in the past, but to communicate in real time based on the current social conversation. While conversations are happening about a brand across a variety of platforms, whether it’s positive or negative, a brand can shape what it wants to talk about immediately. Not ten months from now via a television advertisement.
    13. Trends from 2011 we enjoyed: What would a forecast list of the future be without what we enjoyed this past year? Here are some of our favorite social items from 2011: Spotify, Instagram, Socialcam, Pinterest, Path, Foursquare, Twitter reboot, YouTube reboot, Facebook timeline, Google+ launch, tablet app creation and strategy, social CRM, social business, Kim Kardashian.

    Barnes & Noble Aims to Separate Nook Biz With an Eye for Global Markets

    January 5, 2012 by Tim Carmody


    Barnes & Noble Union Square in November. Photo by Tim Carmody/Wired.com

    The e-reader business may be moving faster in the last six months than it did in the previous six years. Even Barnes & Noble, the brick-and-mortar book retailer that’s best managed the transition to digital reading, has been taken by surprise.

    Now the company has to reread, restock and re-sort its own future — possibly one where the B&N and the Nook go separate ways.

    A separate Nook business may be able to attract new investment and partnerships and innovate more quickly

    In a press release Thursday, CEO William Lynch announced that the company is beginning “strategic exploratory work to separate the Nook business.”

    “We see substantial value in what we’ve built with our NOOK business in only two years,” Lynch says, “and we believe it’s the right time to investigate our options to unlock that value.”

    The brand-new Nook Tablet, for example, has been a great success. It’s already the company’s fastest-selling device and outperformed expectations over the holidays, even though it faced stiff competition from Amazon’s Kindle Fire. Driven by the $250 Nook Tablet and its older sister device, the $200 Nook Color, Lynch says Nook will do $1.5 billion in sales this fiscal year and continue growing into 2012 and beyond.

    So what’s the problem? Even though Nook set records, and overall retail sales for the holidays were up, sales of the Nook Simple Touch e-reader were disappointing. A price cut from $140 to $100 to match the new Kindle Touch couldn’t give the six-month-old Touch enough of a boost going into the holidays. B&N doesn’t release raw sales numbers of individual units, but whatever they were, they weren’t enough to meet the company’s expectations.

    The good news, as Lynch notes, is that consumers appear to prefer color e-readers, validating the strategy B&N introduced in 2010 with the Nook Color.

    The bad news is that the company has had to radically revise its earnings projections. Initially, B&N projected earnings before interest, taxes, depreciation and amortization (EBITDA) of $210 million to $250 million. In December, the guidance offered was at “the lower end” of that figure. Now the company has revised its expectations again, to just $150 million to $180 million. So after taxes and other non-operating expenses, B&N will most likely lose quite a bit of money, somewhere between $1.10 and $1.40 per share.
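
    To put the size of that revision in perspective, the short sketch below compares the two guidance ranges quoted above. The figures are the ones in the article, in millions of dollars; the helper name percent_change is just for illustration, and the calculation is a back-of-the-envelope comparison rather than anything B&N published.

```python
def percent_change(old, new):
    """Percentage change from old to new (negative means a cut)."""
    return (new - old) / old * 100.0


# EBITDA guidance in millions of dollars, as quoted in the article.
initial_low, initial_high = 210.0, 250.0
revised_low, revised_high = 150.0, 180.0

low_end_cut = percent_change(initial_low, revised_low)
high_end_cut = percent_change(initial_high, revised_high)

if __name__ == "__main__":
    print(f"Low end of guidance:  {low_end_cut:.1f}%")
    print(f"High end of guidance: {high_end_cut:.1f}%")
```

    Both ends of the range fall by a little over a quarter, roughly 28 percent, which is why the article calls the revision radical.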

    The drop in demand for the Nook Simple Touch and the cost of advertising the other Nook devices takes most of the blame for the drop in expected profit.

    The timing couldn’t be worse. The company’s already looking to sell Sterling Publishing, the in-house book publisher that it acquired in 2003 and consolidated in 2009. The growth of the brick-and-mortar retail business is slowing, even though the economy has picked up a bit and the stores now sell more high-profit items like toys and electronics. BN.com (which includes both e-books and web sales of books and other goods) is still the smallest segment of B&N’s business. It’s much smaller than the retail or college divisions, and it still loses money — almost $59 million in Q2 [PDF].

    Meanwhile, the e-reader business is growing, but is also undergoing tectonic shifts in pricing, form factors, content types and reader expectations. It doesn’t matter that Nook anticipated and helped precipitate many of these changes. They make the business unpredictable.

    For the most part, bookstores and their investors do not like unpredictability. If they did, they’d be venture capitalists. They also don’t have years to wait. Nobody wants to be propping up the next Borders while the next big thing is always just around the corner.

    Barnes & Noble has scooped itself out of Borders’ fate before. How do they do it again? And what does all of this mean for the still-developing e-book and e-reader industry?

    Continue reading ‘Barnes & Noble Aims to Separate Nook Biz With An Eye for Global Markets‘ …

    Five technologies to watch

    January 4, 2012 by (author unknown)


    Innovation in energy technology is taking place rapidly. Five technologies you may not have heard of could be ready to change the energy landscape by 2020.
    Read more on the McKinsey Quarterly >
      Topics:
    Climate Change
    Energy, Resources, Materials
    Public Sector


    The feedback economy

    January 4, 2012 by Alistair Croll


    Military strategist John Boyd spent a lot of time understanding how to win battles. Building on his experience as a fighter pilot, he broke down the process of observing and reacting into something called an Observe, Orient, Decide, and Act (OODA) loop. Combat, he realized, consisted of observing your circumstances, orienting yourself to your enemy’s way of thinking and your environment, deciding on a course of action, and then acting on it.

    The Observe, Orient, Decide, and Act (OODA) loop.

    The most important part of this loop isn’t included in the OODA acronym, however. It’s the fact that it’s a loop. The results of earlier actions feed back into later, hopefully wiser, ones. Over time, the fighter “gets inside” their opponent’s loop, outsmarting and outmaneuvering them. The system learns.

    Boyd’s genius was to realize that winning requires two things: being able to collect and analyze information better, and being able to act on that information faster, incorporating what’s learned into the next iteration. Today, what Boyd learned in a cockpit applies to nearly everything we do.

    Data-obese, digital-fast

    In our always-on lives we’re flooded with cheap, abundant information. We need to capture and analyze it well, separating digital wheat from digital chaff, identifying meaningful undercurrents while ignoring meaningless social flotsam. Clay Johnson argues that we need to go on an information diet, and makes a good case for conscious consumption. In an era of information obesity, we need to eat better. There’s a reason they call it a feed, after all.

    It’s not just an overabundance of data that makes Boyd’s insights vital. In the last 20 years, much of human interaction has shifted from atoms to bits. When interactions become digital, they become instantaneous, interactive, and easily copied. It’s as easy to tell the world as to tell a friend, and a day’s shopping is reduced to a few clicks.

    The move from atoms to bits reduces the coefficient of friction of entire industries to zero. Teenagers shun e-mail as too slow, opting for instant messages. The digitization of our world means that trips around the OODA loop happen faster than ever, and continue to accelerate.

    We’re drowning in data. Bits are faster than atoms. Our jungle-surplus wetware can’t keep up. At least, not without Boyd’s help. In a society where every person, tethered to their smartphone, is both a sensor and an end node, we need better ways to observe and orient, whether we’re at home or at work, solving the world’s problems or planning a play date. And we need to be constantly deciding, acting, and experimenting, feeding what we learn back into future behavior.

    We’re entering a feedback economy.

    The big data supply chain

    Consider how a company collects, analyzes, and acts on data.

    The big data supply chain.

    Let’s look at these components in order.

    Data collection

    The first step in a data supply chain is to get the data in the first place.

    Information comes in from a variety of sources, both public and private. We’re a promiscuous society online, and with the advent of low-cost data marketplaces, it’s possible to get nearly any nugget of data relatively affordably. From social network sentiment, to weather reports, to economic indicators, public information is grist for the big data mill. Alongside this, we have organization-specific data such as retail traffic, call center volumes, product recalls, or customer loyalty indicators.

    The legality of collection is perhaps a tighter constraint than the difficulty of getting the data in the first place. Some data is heavily regulated: HIPAA governs healthcare, while PCI restricts financial transactions. In other cases, the act of combining data may be illegal because it generates personally identifiable information (PII). For example, courts have ruled differently on whether IP addresses are PII, and the California Supreme Court ruled that zip codes are. Navigating these regulations imposes some serious constraints on what can be collected and how it can be combined.

    The era of ubiquitous computing means that everyone is a potential source of data, too. A modern smartphone can sense light, sound, motion, location, nearby networks and devices, and more, making it a perfect data collector. As consumers opt into loyalty programs and install applications, they become sensors that can feed the data supply chain.

    In big data, the collection is often challenging because of the sheer volume of information, or the speed with which it arrives, both of which demand new approaches and architectures.

    Ingesting and cleaning

    Once the data is collected, it must be ingested. In traditional business intelligence (BI) parlance, this is known as Extract, Transform, and Load (ETL): the act of putting the right information into the correct tables of a database schema and manipulating certain fields to make them easier to work with.

    One of the distinguishing characteristics of big data, however, is that the data is often unstructured. That means we don’t know the inherent schema of the information before we start to analyze it. We may still transform the information — replacing an IP address with the name of a city, for example, or anonymizing certain fields with a one-way hash function — but we may hold onto the original data and only define its structure as we analyze it.
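
    The two transformations mentioned above can be sketched in a few lines of Python. This is only an illustration, not a description of any particular product: the lookup table, the field names, and the secret key are all invented for the example, and the sketch uses a keyed hash (HMAC-SHA256) as one possible choice of one-way function, since a plain hash of a guessable value such as an e-mail address can be reversed by brute force. The raw record is copied rather than modified, so the original data survives for later analysis.

```python
import hashlib
import hmac

# Hypothetical lookup table; a real system would query a GeoIP database.
IP_TO_CITY = {"203.0.113.7": "Springfield"}

# Assumption: a secret kept separately from the dataset.
SECRET_KEY = b"example-key-change-me"


def anonymize(value: str) -> str:
    """Return a keyed one-way hash (HMAC-SHA256) of a field value."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def transform(record: dict) -> dict:
    """Return a cleaned copy of the record; the raw record is left untouched."""
    cleaned = dict(record)
    # Replace the IP address with the name of a city.
    cleaned["city"] = IP_TO_CITY.get(record["ip"], "unknown")
    del cleaned["ip"]
    # Anonymize the user identifier with the one-way hash.
    cleaned["user"] = anonymize(record["user"])
    return cleaned


raw = {"ip": "203.0.113.7", "user": "alice@example.com", "amount": 42}
clean = transform(raw)
```

    Because the same input always produces the same hash, records from the same user can still be grouped after anonymization, even though the identifier itself is no longer readable.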

    Hardware

    The information we’ve ingested needs to be analyzed by people and machines. That means hardware, in the form of computing, storage, and networks. Big data doesn’t change this, but it does change how it’s used. Virtualization, for example, allows operators to spin up many machines temporarily, then destroy them once the processing is over.

    Cloud computing is also a boon to big data. Paying by consumption destroys the barriers to entry that would prohibit many organizations from playing with large datasets, because there’s no up-front investment. In many ways, big data gives clouds something to do.

    Platforms

    Where big data is new is in the platforms and frameworks we create to crunch large amounts of information quickly. One way to speed up data analysis is to break the data into chunks that can be analyzed in parallel. Another is to build a pipeline of processing steps, each optimized for a particular task.
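
    As a rough sketch of the first approach, the Python below splits a list into chunks, summarizes each chunk in a worker, and combines the partial results. The function names and the sample data are invented for illustration. Threads are used here only to keep the example short and portable; for CPU-heavy work a process pool or a cluster framework would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def summarize(chunk):
    """Analyze one chunk independently: return (count, sum)."""
    return len(chunk), sum(chunk)


def parallel_mean(values, chunk_size=1000, workers=4):
    """Compute a mean by combining per-chunk partial results."""
    if not values:
        raise ValueError("values must not be empty")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize, chunked(values, chunk_size)))
    count = sum(n for n, _ in partials)
    total = sum(s for _, s in partials)
    return total / count


# Hypothetical data: purchase amounts 1 through 10,000.
purchases = list(range(1, 10001))
mean_purchase = parallel_mean(purchases)
```

    The key design point is that each chunk's summary can be computed without knowing anything about the other chunks, so the work can be spread across as many workers as are available.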

    Big data is often about fast results, rather than simply crunching a large amount of information. That’s important for two reasons:

    1. Much of the big data work going on today is related to user interfaces and the web. Suggesting what books someone will enjoy, or delivering search results, or finding the best flight, requires an answer in the time it takes a page to load. The only way to accomplish this is to spread out the task, which is one of the reasons why Google has nearly a million servers.
    2. We analyze unstructured data iteratively. As we first explore a dataset, we don’t know which dimensions matter. What if we segment by age? Filter by country? Sort by purchase price? Split the results by gender? This kind of “what if” analysis is exploratory in nature, and analysts are only as productive as their ability to explore freely. Big data may be big. But if it’s not fast, it’s unintelligible.
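
    The “what if” questions in the second point map onto a handful of simple operations. The sketch below runs them over a tiny, made-up list of purchase records in plain Python; the field names and values are assumptions for illustration only, and a real analyst would run the same kinds of questions against far larger datasets with dedicated tools.

```python
from collections import defaultdict

# Hypothetical purchase records, invented for illustration.
purchases = [
    {"age": 24, "country": "US", "gender": "F", "price": 30.0},
    {"age": 37, "country": "CA", "gender": "M", "price": 12.5},
    {"age": 41, "country": "US", "gender": "M", "price": 80.0},
    {"age": 29, "country": "US", "gender": "F", "price": 45.0},
]


def segment(rows, key):
    """Group rows by the value that `key` computes for each row."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)


# What if we segment by age (here, by decade)?
by_decade = segment(purchases, lambda r: r["age"] // 10 * 10)

# Filter by country?
us_only = [r for r in purchases if r["country"] == "US"]

# Sort by purchase price?
by_price = sorted(purchases, key=lambda r: r["price"], reverse=True)

# Split the results by gender?
spend_by_gender = {
    gender: sum(r["price"] for r in rows)
    for gender, rows in segment(purchases, lambda r: r["gender"]).items()
}
```

    Each question is a one-line change, which is the point: exploration stays productive only if every such change comes back quickly.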

    Much of the hype around big data companies today is a result of the retooling of enterprise BI. For decades, companies have relied on structured relational databases and data warehouses — many of them can’t handle the exploration, lack of structure, speed, and massive sizes of big data applications.

    Machine learning

    One way to think about big data is that it’s “more data than you can go through by hand.” For much of the data we want to analyze today, we need a machine’s help.

    Part of that help happens at ingestion. For example, natural language processing tries to read unstructured text and deduce what it means: Was this Twitter user happy or sad? Is this call center recording good, or was the customer angry?

    Machine learning is important elsewhere in the data supply chain. When we analyze information, we’re trying to find signal within the noise, to discern patterns. Humans can’t find signal well by themselves. Just as astronomers use algorithms to scan the night sky for signals, then verify any promising anomalies themselves, so too can data analysts use machines to find interesting dimensions, groupings, or patterns within the data. Machines can work at a lower signal-to-noise ratio than people.

    Human exploration

    While machine learning is an important tool to the data analyst, there’s no substitute for human eyes and ears. Displaying the data in human-readable form is hard work, stretching the limits of multi-dimensional visualization. While most analysts work with spreadsheets or simple query languages today, that’s changing.

    Creve Maples, an early advocate of better computer interaction, designs systems that take dozens of independent data sources and display them in navigable 3D environments, complete with sound and other cues. Maples’ studies show that when we feed an analyst data in this way, they can often find answers in minutes instead of months.

    This kind of interactivity requires the speed and parallelism explained above, as well as new interfaces and multi-sensory environments that allow an analyst to work alongside the machine, immersed in the data.

    Storage

    Big data takes a lot of storage. In addition to the actual information in its raw form, there’s the transformed information; the virtual machines used to crunch it; the schemas and tables resulting from analysis; and the many formats that legacy tools require so they can work alongside new technology. Often, storage is a combination of cloud and on-premise storage, using traditional flat-file and relational databases alongside more recent, post-SQL storage systems.

    During and after analysis, the big data supply chain needs a warehouse. Comparing year-on-year progress or changes over time means we have to keep copies of everything, along with the algorithms and queries with which we analyzed it.

    Sharing and acting

    All of this analysis isn’t much good if we can’t act on it. As with collection, this isn’t simply a technical matter — it involves legislation, organizational politics, and a willingness to experiment. The data might be shared openly with the world, or closely guarded.

    The best companies tie big data results into everything from hiring and firing decisions, to strategic planning, to market positioning. While it’s easy to buy into big data technology, it’s far harder to shift an organization’s culture. In many ways, big data adoption isn’t a hardware retirement issue, it’s an employee retirement one.

    We’ve seen similar resistance to change each time there’s a big change in information technology. Mainframes, client-server computing, packet-based networks, and the web all had their detractors. A NASA study into the failure of Ada concluded that proponents had over-promised, and there was a lack of a supporting ecosystem to help the new language flourish. Big data, and its close cousin, cloud computing, are likely to encounter similar obstacles.

    A big data mindset is one of experimentation, of taking measured risks and assessing their impact quickly. It’s similar to the Lean Startup movement, which advocates fast, iterative learning and tight links to customers. But while a small startup can be lean because it’s nascent and close to its market, a big organization needs big data and an OODA loop to react well and iterate fast.

    The big data supply chain is the organizational OODA loop. It’s the big business answer to the lean startup.

    Measuring and collecting feedback

    Just as John Boyd’s OODA loop is mostly about the loop, so big data is mostly about feedback. Simply analyzing information isn’t particularly useful. To work, the organization has to choose a course of action from the results, then observe what happens and use that information to collect new data or analyze things in a different way. It’s a process of continuous optimization that affects every facet of a business.

    Replacing everything with data

    Software is eating the world. Verticals like publishing, music, real estate and banking once had strong barriers to entry. Now they’ve been entirely disrupted by the elimination of middlemen. The last film projector rolled off the line in 2011: movies are now digital from camera to projector. The Post Office stumbles because nobody writes letters, even as Federal Express becomes the planet’s supply chain.

    Companies that get themselves on a feedback footing will dominate their industries, building better things faster for less money. Those that don’t are already the walking dead, and will soon be little more than case studies and colorful anecdotes. Big data, new interfaces, and ubiquitous computing are tectonic shifts in the way we live and work.

    A feedback economy

    Big data, continuous optimization, and replacing everything with data pave the way for something far larger, and far more important, than simple business efficiency. They usher in a new era for humanity, with all its warts and glory. They herald the arrival of the feedback economy.

    The efficiencies and optimizations that come from constant, iterative feedback will soon become the norm for businesses and governments. We’re moving beyond an information economy. Information on its own isn’t an advantage, anyway. Instead, this is the era of the feedback economy, and Boyd is, in many ways, the first feedback economist.



    Strategy Essentials You Ignore at Your Peril

    December 22, 2011 by Joan Magretta


    Michael Porter, the world’s leading authority on competition and strategy, is sometimes the victim of his own success. We use his terminology every day — competitive advantage, the value chain, differentiation, value creation. We think, therefore, that we “know” his work. But in fact, most managers don’t. They talk the talk, but they have turned his powerful ideas into business buzzwords. Competitive advantage, for example, is often used to mean “anything we think we’re good at.” Any plan or program is called a strategy. Managers confuse differentiation with being different.

    That’s more than just too bad. I’ve had the rare opportunity to see Porter with fresh eyes — rare because when I approached him some time ago with the idea of writing a concise, practice-oriented guide to his work on competition and strategy, he agreed to give me complete access to his most current work as well as the original classics. My premise in writing Understanding Michael Porter was very simply that clear strategic thinking is essential for any manager in any setting, and Porter’s work lays out the basic principles and frameworks you need to master.

    My goal was to present the essential Porter in a form that could be more easily digested and put to work than the original. Having worked directly with Porter for almost two decades, and having applied his ideas during my years as a strategy consultant, I was arrogant enough to believe I wasn’t going to learn anything new. Wrong. When you put that body of work all together, when you integrate the new with the old, you tap into a rich vein of practical and often surprising insights. Not least among them is that most companies think they have a strategy when they don’t.

    So as I worked on this book, I kept a list of those insights. Here it is.

    1. Competitive advantage is not about beating rivals; it’s about creating unique value for customers. If you have a competitive advantage, it will show up on your P&L.
    2. No strategy is meaningful unless it makes clear what the organization will not do. Making trade-offs is the linchpin that makes competitive advantage possible and sustainable.
    3. There is no honor in size or growth if those are profit-less. Competition is about profits, not market share.
    4. Don’t overestimate or underestimate the importance of good execution. It’s unlikely to be a source of a sustainable advantage, but without it even the most brilliant strategy will fail to produce superior performance.
    5. Good strategies depend on many choices, not one, and on the connections among them. A core competence alone will rarely produce a sustainable competitive advantage.
    6. Flexibility in the face of uncertainty may sound like a good idea, but it means that your organization will never stand for anything or become good at anything. Too much change can be just as disastrous for strategy as too little.
    7. Committing to a strategy does not require heroic predictions about the future. Making that commitment actually improves your ability to innovate and to adapt to turbulence.
    8. Vying to be the best is an intuitive but self-destructive approach to competition.
    9. A distinctive value proposition is essential for strategy. But strategy is more than marketing. If your value proposition doesn’t require a specifically tailored value chain to deliver it, it will have no strategic relevance.
    10. Don’t feel you have to “delight” every possible customer out there. The sign of a good strategy is that it deliberately makes some customers unhappy.

    Do these seem self-evident when you stop to think about them? Or do you find, as I do, that they run counter to the way most managers think and behave? That’s why I’d argue that Porter’s work, while never trendy, has never been as timely for so many people working in both the private and public sectors as it is today. Amidst the enormous economic upheaval in many industries and countries around the world, strategy itself has come under fire. Porter’s fundamentals keep you grounded. They explain not only how companies sustain competitive advantages for decades, but also why strategy is even more important — not less so — in turbulent and uncertain times.

    The End of the Web? Don’t Bet on It. Here’s Why

    December 20, 2011 by Mark Suster


    Fred Wilson recently posted a great video on his blog with the CEO of Forrester Research, George Colony. The money slide is the graphic below.

    The chart shows three scarce resources and their improvements over time. The top line is available storage (S), the middle line is processing power (P), which follows Moore’s law, and the bottom line is the network (N).

    In it he asserts that the web is dying and that from its ashes we will see the rise of the “App Internet.” The App Internet is different than the HTML Internet (aka The Web, WWW and in the mobile arena “The Mobile Internet” or short-hand HTML5) because the “presentation layer” and “client side” functionality are defined by applications that run on your mobile device and connect into the open Internet back-end to exchange information with other web services.

    He’s right about this. But only temporarily in my view. And while the App Internet is currently more powerful than the Mobile Internet it has fundamental flaws. It isn’t open in either its standards or in the way that applications are marketed and distributed. I will cover this in my post.

    Colony’s presentation is intriguing (and worth a watch if you have a few minutes) because I love to see when informed people make arguments that are different than you ordinarily hear (and different from my own views). In the end, my bet is that George’s bets will largely prove wrong. This blog post lays out my case. If anybody from Forrester reads this I hope they won’t see it as an attack on George’s presentation, which I found enlightening, well argued and interesting. My views are just a data point in the debate.

    In the end, Seth Godin’s comments on Fred’s blog post said it best:

    “His black swan is showing.

    The problem with just about every prediction made by industry firms like Forrester (all the way back to 1985 when these firms said that the Commodore 64 was going to change the world–until the VCR interrupted to become the next big thing) is that they are based on sophisticated analysis of what’s in the rear-view mirror. 

    A tough way to drive.

    The trends are legit, but we have no idea what unexpected breakthrough in human interaction is going to change everything.”

    In other words, nobody can really assert authoritatively what the future of tech or the Internet will hold. I have some educated guesses.

    George’s Arguments

    1. The web is dying and will be replaced by “the App Internet.” He says that since storage & processing are growing at a much more rapid rate than the network, we’ll reach a point where not having apps on devices will greatly underutilize the power of the devices in our hands. In other words, our mobile devices are all-powerful and the network that they connect into sucks.

    2. Social networking is peaking. He argues that we are reaching a saturation point in social networking, in which nearly everybody is already using social networks (85+% in most developed countries and in urban environments in the developing world) and the amount of time dedicated to social activities already exceeds that spent on many other important tasks such as exercise and is even approaching the amount of time we dedicate to child care. He argues for a world he calls POSO (post social) in which we will only use social applications which drastically cut down our time involvement and/or increase our productivity.

    3. Social media will be pervasive in the enterprise and is primarily driven by customer interactions. He shows data that the overwhelming majority of major enterprises in the US are currently adopting or looking to adopt social networking technology. When asked what their objectives are, they cite some form of “improving customer communications” by a wide margin.

    A (Very) Brief (and Selective) History in Computing

    To understand my perspective you have to rewind to the late ’80s / early ’90s in business computing. As a software developer I wrote code on what was called a “dumb terminal” because it literally had no processing capability. It is the opposite of the world that Colony describes. The local computer had no processing capability, the network did its job and the central computer was the master.

    We wrote programs that existed solely on a centralized computer (a mainframe), all of our data was stored centrally and all processing was centralized. When we wanted to compile our programs (turning human programming language into an executable file that the computer can read) we had to submit them to the mainframe and wait for them to be processed in sequence along with everybody else’s code.

    In busy times compiling a program could take more than an hour, so we obviously didn’t submit often and if our program had errors and was unable to compile it was devastating. Things got so bad on one project that we ended up doing split shifts with teams of people programming from 8pm-6am and the next team arriving at 8am.

    Throughout the ’90s the PC became much more popular in corporate environments, so companies began to replace dumb terminals with PCs. We ran software on the PCs called “terminal emulation” that allowed us to act like a dumb terminal to interact with mainframes and to act like a PC (with word processing, spreadsheets, etc.) the rest of the time.

    In this era the computing model known as “client / server computing” was popularized. What this model said was that since we now had really powerful processing on our desktops we should split the computing responsibilities between the PC (the client) and the mainframe (the server). Initially the PC did basic functions like “screen validation” (making sure that you didn’t enter nonsensical data into fields, for example) and could take over functions like compiling your software code so you could check for errors before submitting it to the mainframe.

    Over time the PCs began to do more and more. They took over the “presentation layer” of computing. As a society we got used to the windows metaphor of computing. So suddenly we had “drop down” boxes that gave us multiple choice selection of data, we had dialog boxes that would prompt us with “Are you sure you want to proceed? Y/N.” This initially took the form of what were called “thin clients” because the server still did most of the work.

    The more the processors on our PC improved, the more we expected our PCs to do and everybody gushed about this new era where we had much better user interfaces and we had way more individual device power. Centralized computing was giving way to smart, distributed devices.

    Sound familiar?

    It was wonderful. For 5 minutes. Then the unintended consequences started cropping up.

    • How much data was acceptable to sit on local devices? Few had considered what happened in a world in which the data was distributed. Suddenly you had security risks, confidentiality problems, privacy concerns (think about your medical records being distributed), etc.
    • What happened when you submitted a processing request to a central server (think, I’d like to transfer money from my bank to yours) but the transaction didn’t complete? You could be in a situation where your local computer had assumed the money was transferred and it wasn’t. We had to develop whole frameworks of “middleware” to deal with this problem. We had to come up with “two phase commits” and “rollbacks” and other data tricks to keep our devices in sync.
    • We started to realize that the most expensive part of computing was actually manpower. Manpower to develop all of these applications, manpower to maintain them, and manpower to deal with all of these devices, which added great complexity to our IT environments. For example, on any software upgrade for a typical client/server enterprise package it would take up to 50% of the overall development budget to deal with testing the software in heterogeneous environments.
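
    The “two phase commit” and “rollback” idea from the second bullet can be illustrated with a toy, in-process sketch in Python. The class and function names are invented for the example. Real middleware also has to cope with crashes, timeouts, and lost network messages, typically with durable logs, none of which is modeled here; the sketch only shows the voting step followed by an all-or-nothing outcome.

```python
class Account:
    """A participant that can tentatively accept a change, then commit or roll back."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.pending = 0

    def prepare(self, delta):
        """Phase 1: vote yes only if the change would not overdraw the account."""
        if self.balance + delta < 0:
            return False
        self.pending = delta
        return True

    def commit(self):
        """Phase 2: make the prepared change permanent."""
        self.balance += self.pending
        self.pending = 0

    def rollback(self):
        """Discard any prepared change."""
        self.pending = 0


def transfer(source, target, amount):
    """Coordinator: commit on both accounts only if both vote yes."""
    participants = [(source, -amount), (target, amount)]
    votes = [account.prepare(delta) for account, delta in participants]
    if all(votes):
        for account, _ in participants:
            account.commit()
        return True
    for account, _ in participants:
        account.rollback()
    return False


mine = Account("mine", 100)
yours = Account("yours", 20)
ok = transfer(mine, yours, 70)       # both accounts vote yes, so both commit
failed = transfer(mine, yours, 500)  # the source votes no, so nothing changes
```

    Either both balances change or neither does, which is exactly the guarantee that was missing when a local machine assumed a transfer had gone through.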

    So having powerful devices with decentralized computing is not always a panacea.

    Enter the World Wide Web (WWW).

    As George appropriately describes in his video, the Internet and the Web are two different but related things. The Internet represents what you might think of as “plumbing.” It defines how data gets moved around on networks, how files get located, how files get transferred between devices, how packets of data get sent via routers, etc. The WWW is the presentation layer. Its central standard was HTML (hyper text markup language) that described how we would show data on computer screens.

    When web browsers (the programs that can read and interpret HTML) were popularized they were “dumb.” It was literally like returning to the old days of computing. On the Web almost all of the processing was centralized and your browser was your input / output device. As an example of how dumb they were (for those that don’t remember) whenever you changed one field in a browser-designed program the entire screen had to refresh. It was a terrible user experience.

    But for software developers like my company the web was a blessing. We were able to crank out software code at a much greater pace than was ever possible before. We designed our code and tested it in a Firefox browser and once we had working code we then had to figure out how to make the clunky Internet Explorer work. But the heterogeneous environment was practically eliminated. We didn’t have to worry about which computer you were on. We didn’t have to support 3 database types, worry about network configurations, etc.

    We flirted for a brief period with building some client-side applications (mostly for offline use) but abandoned those efforts when we realized how much overhead it took to maintain – especially as we released new versions of our code and had to always keep the local, offline software in sync with our releases.

    We adopted an ethos that all of our development would be web only and that eventually browsers would become more powerful and make the user experience much better. And that’s what happened. A series of standards emerged known as “AJAX” (asynchronous javascript and xml) that gave the web-based designer much more control over the browser. Suddenly you could update small portions of the browser without refreshing the whole screen.

    AJAX was one of the major drivers of the “dot com renaissance” that became known as Web 2.0. As people realized streamlining client-side development really matters to cost-effectively build software, new tool sets emerged to streamline the process. Libraries like jQuery have emerged that lower the effort to build front-end code.

    Web & Social Change the Landscape of the Web

    Prior to the popularization of smart phones and Facebook we were in a pretty good place on the Web. The one big concern many people had was how to constrain the total dominance of Google. Every startup (every company, really) was beholden to the traffic god that was Google search. One change in Google’s algorithm and whole businesses could be wiped out as chronicled in this excellent book by John Battelle called The Search.

    The growth of social networking (er, the growth of Facebook) along with the growth of the iPhone have changed the landscape dramatically.

    In this chart from Silicon Alley Insider you can see the first major trend to affect the open Web – the growth of Facebook. And Facebook’s popularity has only increased in the past year.

    Why does the rise of Facebook affect the web? Because it isn’t a part of the open WWW. Facebook exists behind a walled garden. You need to log in to use it. Content or software developers who want to build products that work in Facebook have got to develop inside of Facebook’s framework rather than working on open, Internet standards.

    Brands, celebrities and even individuals like you who produce information inside this walled garden are subject to the rules & conditions set upon you by a private company – Facebook. This isn’t a case against Facebook, it’s just a statement of fact.

    As more people consume Facebook pages, fewer people are consuming open Web pages.

    I wrote about this previously here and spoke about it on YouTube with Howard Lindzon here & here. (If you’re not that familiar with the topic, it’s worth a 20-minute watch.)

    Is the App Emerging as the Winner?
    The App Internet had a clear advantage in the past few years. Why? Because the mobile devices had a series of new features for which mobile browsers were not optimized. Examples include the camera, GPS, the accelerometer and the small screen sizes.

    And importantly when developing games that require high-end graphics to handle game play you need to make use of the iPhone’s PowerVR GPU (graphics processing unit).

    So Apps were inherently more powerful than browser-based applications.

    The App Internet also had two other huge advantages.

    1. Apple had a mechanism for charging users for apps and because most people already had an iTunes account it was simply 1-click to purchase an item. This meant that small teams could create games and make real revenue whereas on the Web this was much harder because you either had to build (or license) your own billing infrastructure, convince consumers to get out their credit cards (which they don’t like to do) or you had to sell enough advertising to make it worth offering your product.

    2. Apple had a store. For early game developers this made it easier for your application to be found on the limited “shelves” in the iPhone App Store. Now that there are MANY more apps out there – this isn’t such an easy game. But in the early days the App Store was very appealing to new entrants.

    Round 1 clearly goes to the App Internet.

    Will the App Metaphor Hold for Mobile?
    This is where my disagreement with many starts. I think the allure of Round 1 has convinced people that in mobile, apps are better. I’m not so sure.

    1. Workarounds are developed. The surest sign of a market inefficiency is when solutions emerge to help developers get around the bottlenecks of platform development. This is what is happening in mobile. Developers are now able to build apps in web languages such as JavaScript and HTML5 that can run on multiple platforms.

    There are companies that develop “wrappers” that, in essence, handle all of the functionality needed to control each individual device, which “abstracts” away the device-specific code the programmer would otherwise have to write. Some of the companies that do this include PhoneGap, Appcelerator, Strobe and RhoMobile.
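
    To make the wrapper idea concrete, here is a minimal sketch in Python. It is not the API of PhoneGap, Appcelerator or any other product named above. The `Device` interface, the two adapter classes, the placeholder return values and the `check_in` function are all hypothetical names invented for illustration. The point is only that the app logic is written once against the wrapper, while the platform-specific code lives in the adapters.

```python
from typing import Protocol, Tuple


class Device(Protocol):
    """Hypothetical interface a wrapper exposes to app code."""

    def take_photo(self) -> str: ...

    def current_location(self) -> Tuple[float, float]: ...


class IPhoneDevice:
    """Stand-in for an adapter that would call iOS-specific APIs."""

    def take_photo(self) -> str:
        return "photo captured via iOS camera API"

    def current_location(self) -> Tuple[float, float]:
        return (37.33, -122.03)  # placeholder coordinates, not real GPS data


class AndroidDevice:
    """Stand-in for an adapter that would call Android-specific APIs."""

    def take_photo(self) -> str:
        return "photo captured via Android camera API"

    def current_location(self) -> Tuple[float, float]:
        return (37.42, -122.08)  # placeholder coordinates, not real GPS data


def check_in(device: Device) -> str:
    """App logic written once against the wrapper, not against a platform."""
    lat, lon = device.current_location()
    return f"{device.take_photo()} at ({lat:.2f}, {lon:.2f})"


if __name__ == "__main__":
    # The same function runs unchanged on either platform adapter.
    for device in (IPhoneDevice(), AndroidDevice()):
        print(check_in(device))
```

    Supporting a new platform in this sketch means writing one more adapter class. The `check_in` function, and everything else built on the `Device` interface, would not need to change.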

    2. Browsers will catch up. Just as in the first round of the web when everybody complained that web browsers weren’t powerful enough to build applications on, many of us believed that open systems would win. Eventually standards will emerge that will make it easier to build natively into browsers. Effectively either the wrapper developers become browsers or the browsers build wrappers or the two groups merge.

    Also note that AJAX finally took off when Google open-sourced a bunch of its internally built AJAX frameworks. I wouldn’t be surprised if big innovations from Facebook and others in the mobile web eventually find their way into open-source mobile initiatives.

    3. Multi-platform development is too expensive. The cost for developers to build for multiple platforms is too great, the gatekeepers are too powerful and the outcomes ultimately limit innovation, as happens in any system when a few players have a choke hold on distribution.

    If you want to do a deeper dive on why I believe this is bad overall for the system despite the short-term allure of iPhone’s beautiful products please see my post “App is Crap, why Apple is bad for your health.” And before Fanboys slam me, please note that I own 3 Mac laptops, 2 iPads, 5 iPod devices & Apple TV. I love the products. That doesn’t mean I think it’s great for our future as an industry to have a closed distribution system.

    4. Distribution becomes a stranglehold. Fred Wilson talks about this in his “mobile gatekeepers” post. The early allure of empty shelves in the App Store is giving way to overcrowded shelves (currently tallied at more than 500,000 SKUs). This leads to all sorts of games by developers to get into the rankings, most of which favor companies with more cash.

    Also, whenever we see distribution strangleholds we tend to see slower innovation and more resistance by the distributor to change. Think about the following examples:

    • mobile phone companies who controlled our crappy phones prior to the iPhone breaking that hegemony
    • cable & satellite companies who have controlled our paid TV through set-top boxes that make it impossible for innovation on the TV set
    • radio stations that controlled the music we listened to until music could achieve wider distribution on the Internet

    Choke points are never good for innovation.

    5. Data, data, data. Just as when we first went from mainframe computing to client-server computing, we forgot that data leakage and data management across multiple devices are big issues. The App Internet creates the potential for many more data issues. That doesn’t mean they can’t be solved, but it’s not as easy as saying, “powerful apps on our mobile devices are the best answer.” More power, more distribution = more data problems.

    6. TCO. There is an acronym we use in computing called TCO or Total Cost of Ownership. It is often used in ROI calculations on projects to inform a build vs. buy decision. Often people who build apps internally at their company calculate only the costs of the initial build rather than the total costs of maintaining the project. Maintenance often greatly exceeds the development costs when you consider both the human costs of maintenance and the productivity lost by not having an app that innovates as fast as the market solutions do.

    I think there’s a TCO argument to be made against the proliferation of the App Internet. The more companies build their own apps, the more maintenance work they’ll need to do, the more employees they’ll need to maintain their apps and the greater the innovation drain. I know this is a harder concept to quantify and intellectualize but I’ve seen it first hand in 20 years of working with large corporations on “legacy” IT projects. The App Internet opens the door to many more legacy apps.

    This argument never figures into any young developer’s mind because it takes years to see the decaying effect of legacy infrastructure in corporations (plus, many app developers prefer the sexy world of consumer apps).
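
    The arithmetic behind the TCO point can be sketched in a few lines of Python. Every figure below is invented purely for illustration, and the function and parameter names are hypothetical. It simply adds the one-time build cost to the recurring maintenance and lost-productivity costs over the life of the app, the two components described above.

```python
def total_cost_of_ownership(
    build_cost: float,
    annual_maintenance: float,
    annual_productivity_loss: float,
    years: int,
) -> float:
    """One-time build cost plus recurring costs over the app's lifetime."""
    recurring = (annual_maintenance + annual_productivity_loss) * years
    return build_cost + recurring


if __name__ == "__main__":
    # Illustrative numbers only, not data from any real project.
    build = 100_000
    tco = total_cost_of_ownership(
        build_cost=build,
        annual_maintenance=40_000,
        annual_productivity_loss=10_000,
        years=5,
    )
    print(f"Initial build: {build:,}")
    print(f"Five-year TCO: {tco:,.0f}")
    print(f"Share of TCO spent after launch: {(tco - build) / tco:.0%}")
```

    With these made-up inputs the initial build is well under half of the five-year total, which is the gap that a build-only estimate misses.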

    To be clear … I think that the App Internet won’t disappear overnight. I also think certain apps will always be more effective built natively. But the same is true of today’s non-mobile computing. Still, most apps need not exist. Long live the Mobile Web.

    And What about Colony’s Assertions about Social?
    I’m going to save that for a future post. Coming soon.

    Postscript:

    “If I had more time, I would have written a shorter letter.”

    Marcus T Cicero

    Sorry for the uber long post. Given more time I could make it concise. And I’d have fewer typos. But I valued getting my ideas out there. If you think there are any inaccuracies I’d be glad to meet you in the comments section, and I’ll gladly amend any mistakes (as opposed to differences of opinion).
