RACONTEUR

Entries in machine learning (2)

Saturday, July 30, 2011

Recommender System + OTT or STB = Personalized TV

What To Watch

There is so much good programmed content available today that sifting through it to find something interesting, compelling or even stimulating has become a staggering proposition.  This is both a good and a bad thing for the end consumer.  The proliferation of content choices has made being a consumer an arduous task, and search does not solve the challenge.  “We want to be entertained and don’t really want to work for it.”  It is this divide that has consumer businesses scrambling for a better mousetrap, and more often than not these content megaliths are turning toward recommendation systems.

Understand Me Please

Consumers today are engaging with content and should be rewarded for it.  They are providing valuable insight into their interests simply by participating within the content ecosystem.  Each time a consumer sees a content element, they have the distinct opportunity to interact with it.  What they do and what they don’t do can shape the ongoing model of what that consumer should be presented with.  Time of day and device (e.g. tablet, mobile phone or set top box) add detail about the context of each exposure and help estimate future interest probabilities.  Armed with user behavior, each content item’s performance in a Darwinian ecosystem, and meta-data, recsys solutions have a strong foundation from which to take on the burgeoning content discovery battle.

What will make this personalized TV movement fail, however, is forcing the user to contribute meaningful information to the personalization model at every turn.  There is no way to realize the ‘lean back’ experience, where content is magically presented in increasingly accurate interest categories, if the user has to proactively engage with the system.  This is where behavioral observation can play a lead role, allowing consumers to simply ‘be’ in the system.  The consumer can unassumingly go about her day, interacting with the things she finds appealing, while a recsys pays close attention and learns about her interests and preferences in the background.
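To make the idea of behavioral observation concrete, here is a minimal sketch of how passively observed events might be folded into an interest profile without asking the viewer for anything.  The event names, the weights and the use of genre as the interest category are all assumptions made for illustration; a real system would tune these against its own data.

```python
from collections import defaultdict

# Hypothetical event types and weights (illustrative assumptions, not tuned values).
EVENT_WEIGHTS = {
    "watched_full": 3.0,     # strong positive signal
    "watched_partial": 1.0,  # weak positive signal
    "skipped": -1.0,         # mild negative signal
}

def build_interest_profile(events):
    """Accumulate per-genre interest scores from observed (genre, event_type) pairs."""
    profile = defaultdict(float)
    for genre, event_type in events:
        # Unknown event types contribute nothing rather than raising an error.
        profile[genre] += EVENT_WEIGHTS.get(event_type, 0.0)
    return dict(profile)

def top_genres(profile, n=2):
    """Return the n genres with the highest accumulated score."""
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return [genre for genre, _ in ranked[:n]]

# Example: the viewer never rates anything; the profile is built from what she does.
observed = [
    ("drama", "watched_full"),
    ("drama", "watched_partial"),
    ("sports", "skipped"),
    ("comedy", "watched_partial"),
]
profile = build_interest_profile(observed)
favorites = top_genres(profile)
```

Every signal here is a side effect of normal viewing, so the viewer stays in ‘lean back’ mode while the profile keeps improving.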

What about systems that allow consumers to rate things?  Cool, this is helpful information.  Even so, I would like to suggest that the star rating system and other incarnations like it be banned from use.  Consumers just don’t know what to do with the stars between one and five.  The best rating approach is really ternary (i.e. thumbs-up, thumbs-down and neutral or no thumb).  It’s the clearest and most elegant way to allow consumers to express interest.  You can see this KISS principle in action at sites like YouTube.
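As a rough illustration of how little machinery a ternary scheme needs, the sketch below encodes the three states as -1, 0 and +1 and derives two simple summaries from them.  The names `Thumb`, `net_score` and `approval_ratio` are hypothetical, and the numeric encoding is one reasonable assumption among several.

```python
from enum import IntEnum

class Thumb(IntEnum):
    """Ternary rating: thumbs-down, neutral (no thumb) or thumbs-up."""
    DOWN = -1
    NEUTRAL = 0
    UP = 1

def net_score(ratings):
    """Thumbs-up minus thumbs-down; neutral ratings count as zero."""
    return sum(int(r) for r in ratings)

def approval_ratio(ratings):
    """Share of expressed opinions that are thumbs-up, or None if nobody voted either way."""
    ups = sum(1 for r in ratings if r == Thumb.UP)
    downs = sum(1 for r in ratings if r == Thumb.DOWN)
    voted = ups + downs
    return ups / voted if voted else None

# Example ratings for a single item.
ratings = [Thumb.UP, Thumb.UP, Thumb.DOWN, Thumb.NEUTRAL]
score = net_score(ratings)
ratio = approval_ratio(ratings)
```

Each rating carries an unambiguous meaning, which is exactly what the middle of a five-star scale lacks.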

“What about my privacy?”  This is one of the top questions that arise from discussions I have around recommendation system implementations.  My approach is always to let the consumer know up front what such an effort is trying to accomplish and to get the user excited about and engaged with the experience.  Surreptitious gathering of user data has gotten some very large and well-known public companies into trouble.  There is no excuse for ‘spying’.  It’s not needed as long as the target goals have the consumer in mind.  Systems that do not require Personally Identifiable Information (PII) can be designed and implemented very successfully.
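One way to picture a system that does not require PII is to key all behavior to a random, opaque identifier rather than to a name, email address or account number.  The sketch below is an assumption about how that might look, with the hypothetical names `new_profile_id` and `ProfileStore`; it is not a complete privacy design, and an opaque key on its own does not guarantee anonymity.

```python
import secrets

def new_profile_id():
    """Create a random opaque identifier that is not derived from any personal data."""
    return secrets.token_hex(16)  # 16 random bytes rendered as 32 hex characters

class ProfileStore:
    """In-memory store of behavior keyed only by opaque profile identifiers."""

    def __init__(self):
        self._events = {}

    def record(self, profile_id, item_id, action):
        """Append one observed (item, action) pair to the profile's history."""
        self._events.setdefault(profile_id, []).append((item_id, action))

    def history(self, profile_id):
        """Return a copy of the history so callers cannot mutate the store."""
        return list(self._events.get(profile_id, []))

# Example: the store learns what was watched, but not who watched it.
store = ProfileStore()
pid = new_profile_id()
store.record(pid, "show-123", "watched_full")
store.record(pid, "show-456", "skipped")
```

The recommender only ever needs the history attached to the identifier, which fits the goal of being open with the consumer about what is collected.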

The Road To Personalization

Keeping the aforementioned points in mind, the ideal of providing a compelling content experience is readily achievable, especially as a growing number of consumers employ the tablet as a second screen or, in some cases, even as the primary screen.  Terrific amounts of quality data can be gathered from these devices.  In turn, this data will be digested by the recsys platforms, which will produce more targeted, relevant personalization experiences.

Personalization solutions will become table stakes in the new world of multi-device entertainment.  Those that realize this early will have a great advantage in the competitive marketplace of content and consumers.

Tuesday, May 17, 2011

Content, Content Everywhere and not an Item to Consume

I Feel Bloated

You have spent a great deal of time building up the catalog of things that you want to make available for consumers to purchase or consume.  In fact you have done such a great job that now you have (gasp) too much stuff and folks are not getting the value of the deep and rich service you provide.  You might even have multiple, complementary platforms through which you offer your service or catalog, complicating the landscape of what to select even further.

Let’s face it: users are confounded by the myriad choices they have today.  “I have 999 channels to surf”, “There are 350,000 apps to look at”, “What other kinds of music tracks or artists might I like?”  Users are frustrated because search isn’t the answer.

The Wilson Confounded Search Conjecture:  You don’t know what to search for if you don’t know what you can search for.

Search doesn’t solve the challenge of Discovery.  In order for Discovery to be a part of the user experience, the stuff they might like necessarily needs to find them!  It is precisely this challenge that we in the recommender system (recsys) space are aiming to solve.

I Think I Need A Recsys

The first step to solving an issue is to recognize that one exists.  If you are reading this post then you may have come to the realization that your stuff just isn’t performing and your users aren’t engaging. Don’t worry; there is a way through.

Some recsys questions to consider:

  • What are your goals, and which KPIs will measure success for a recsys approach?
  • How broad is the catalog of items you express in your service?
  • Is that catalog of items subject to frequent change or is it static (long tail) in nature?
  • Do you believe in social and what does social mean to you?
  • Do you want content-based recommendations (how content items relate to each other) or are you going for a more personal approach (recommendations based on users and their behavior)?
  • Do you have good event data to work with (click path histories, purchases, downloads, viewership)?
  • Is speed important?  How quickly do you need a request for a recommendation returned to you?
  • Is scale important?  Do you have “web scale” usage, meaning many millions of users and thousands or millions of items to recommend?
  • Are you comfortable with Big Data?
  • What are your time to market needs?
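The content-based versus behavior-based question in the list above can be illustrated with a small sketch.  It scores the similarity of two items twice: once from their descriptive tags and once from the overlap in who consumed them.  Jaccard overlap is used here only because it is simple; it is an assumption for illustration, and the item names, tags and viewer identifiers are made up.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections: |intersection| / |union|, or 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def content_similarity(item_tags, x, y):
    """Content-based view: items are similar when their meta-data tags overlap."""
    return jaccard(item_tags[x], item_tags[y])

def behavior_similarity(item_viewers, x, y):
    """Behavior-based view: items are similar when the same users consumed both."""
    return jaccard(item_viewers[x], item_viewers[y])

# Hypothetical catalog meta-data and viewing events.
item_tags = {
    "A": {"drama", "crime"},
    "B": {"drama", "crime", "uk"},
    "C": {"cooking"},
}
item_viewers = {
    "A": {"u1", "u2"},
    "B": {"u4"},
    "C": {"u1", "u2", "u3"},
}

tags_a_b = content_similarity(item_tags, "A", "B")         # similar by description
viewers_a_b = behavior_similarity(item_viewers, "A", "B")  # no shared audience
viewers_a_c = behavior_similarity(item_viewers, "A", "C")  # shared audience, unrelated tags
```

In this toy data the two views disagree: A and B look alike on paper but share no audience, while A and C share most of their audience despite unrelated tags.  Which answer you want is what that question in the list is really asking.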

Build vs. Buy?

This is an interesting and often challenging question to answer honestly.  Sure, you have some sharp engineers who are invigorated by the proposition of building this kind of tech.  It’s cool.  Super geeky, but cool.  The tough question is: do they really have the experience and horsepower needed to get it done?

Any amount of research into the recsys area will certainly reveal several Open Source Software (OSS) projects that aim to bring some general-purpose solutions for this heady problem down to earth.  Here are a few of the more promising projects:

Mahout: http://mahout.apache.org/

Apache Mahout is a scalable machine learning library that supports large data sets.

Lucene: http://lucene.apache.org/java/docs/index.html

Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Solr: http://lucene.apache.org/solr/

Solr’s major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search.

The availability of these kinds of OSS options can mask how serious and complicated the engineering behind a successful recsys solution really is.  Be aware that there is much work to be done before the above projects become practical for your endeavors.

Also, make certain to answer the recsys questions listed above before embarking on your recsys project.  The answers will shape the success criteria for a buy option or the product definition for a build decision.