Lessons from Amazon

Coverage by Bhat Dittakavi of Variance.AI of “Lessons from Amazon”, a talk by Rajeev Rastogi of Amazon at the NASSCOM AI Summit @Hyderabad on 23rd June 2017
Rajeev Rastogi, Director ML @Amazon
8 areas of focus: 
Retail, Customer, Seller and Catalog
Text, Images, AWS and Speech
Let me tackle them one by one.
Problem: Product Demand Forecasting
Forecast demand based on past sales in a region (over-forecasting means additional warehousing costs; under-forecasting means failing to deliver customer delight)
Challenges
-Scale (100s of millions of products in our catalog)
-New Products (No past demand!)
-Sparsity: Huge skew (many products sell very few items; the long-tail phenomenon)
-Seasonal: Demand for some products exhibits seasonal patterns
-Demand Spikes: Variation due to external events
-Distributions: Future is uncertain (predictions are average demand; we need probability distributions of the demand)
Products are stocked at the 90th percentile of the demand, not at the average demand, as we are customer obsessed.
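To make the percentile point concrete, here is a minimal sketch of stocking at the 90th percentile of a demand distribution rather than at its mean. The demand samples and the function name `nearest_rank_percentile` are made up for illustration; the talk did not describe how Amazon actually computes this.

```python
def nearest_rank_percentile(samples, percentile):
    """Smallest sample value with at least `percentile` percent of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank method with integer arithmetic: rank = ceil(percentile * n / 100)
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical samples drawn from a forecast demand distribution (units per week)
demand_samples = [3, 5, 4, 6, 12, 5, 4, 9, 5, 6]

mean_demand = sum(demand_samples) / len(demand_samples)
stock_level = nearest_rank_percentile(demand_samples, 90)

print(f"mean demand: {mean_demand}, stock at 90th percentile: {stock_level}")
```

With these samples the mean is 5.9 units but the 90th percentile is 9 units, so stocking at the percentile covers the demand in roughly nine weeks out of ten at the price of holding more inventory.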
Problem: Given a partial user query, find the relevant products to display
Challenges
-Scale: (100s of millions of products in our catalog)
-Real-time prediction: Low latency
-Query analysis: Understand semantics (Apple or apple), identify phrases, classify into product category
-Intent Detection: Window shopping or buying?
Knowledge graph
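As a toy illustration of the partial-query problem, the sketch below completes a prefix against a sorted list of past queries using binary search. The query list and the function name `complete` are assumptions for illustration only; a production system would also need ranking, semantics and intent detection, none of which this covers.

```python
from bisect import bisect_left

def complete(prefix, sorted_queries, limit=5):
    """Return up to `limit` queries from a sorted list that start with `prefix`."""
    prefix = prefix.lower()
    # Binary search finds the first query that is >= prefix in O(log n)
    start = bisect_left(sorted_queries, prefix)
    matches = []
    for query in sorted_queries[start:]:
        if not query.startswith(prefix):
            break  # sorted order means no later query can match
        matches.append(query)
        if len(matches) == limit:
            break
    return matches

# Hypothetical past queries, lower-cased and sorted once up front
queries = sorted(["apple iphone 7", "apple juice", "apple watch", "apron", "usb cable"])

print(complete("app", queries))
print(complete("apr", queries))
```

Note that "apple iphone 7" and "apple juice" share a prefix but belong to different categories, which is exactly the Apple-or-apple ambiguity mentioned above.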
Problem: Given a product description from a seller, map it to the appropriate leaf node in product taxonomy (17000 nodes)
Challenges:
-Scale (multi-class classification problem with 1000s of classes)
-Products vs Accessories: Hard to distinguish
-Noisy Data
-Training Data Quality (Wrongly labeled, some classes may have very few examples)
Problem: Given product information (title, description, price), find duplicate product listings in the Amazon catalog
Challenges
-High Precision Requirement: Incorrect matching leads to poor user experience (Pricing: Matching an iPhone with its adapter’s price is dangerous)
-Variations: Some variations (e.g. color) are insignificant while others are not
-Diverse Formats
-Noisy Data
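The high-precision requirement can be illustrated with a deliberately conservative matcher. The sketch below is an assumption-laden toy, not the method described in the talk: it declares two listings duplicates only if their title tokens overlap strongly (Jaccard similarity) and their prices are close. The thresholds, field names and sample listings are all hypothetical.

```python
import re

def tokens(text):
    """Lower-case a title and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def likely_duplicates(p1, p2, min_title_sim=0.8, max_price_gap=0.2):
    """Both conditions must hold, which favours precision over recall."""
    title_sim = jaccard(tokens(p1["title"]), tokens(p2["title"]))
    higher = max(p1["price"], p2["price"])
    price_gap = abs(p1["price"] - p2["price"]) / higher if higher else 0.0
    return title_sim >= min_title_sim and price_gap <= max_price_gap

# Hypothetical listings
phone_a = {"title": "Apple iPhone 7 32GB Black", "price": 649.0}
phone_b = {"title": "apple iphone 7 (32GB, black)", "price": 639.0}
adapter = {"title": "Apple iPhone 7 Power Adapter", "price": 19.0}

print(likely_duplicates(phone_a, phone_b))  # same product, different formatting
print(likely_duplicates(phone_a, adapter))  # product vs accessory
```

Requiring both signals means the adapter is rejected even though it shares three title tokens with the phone. A real system would also have to decide which variations (color, storage size) matter, which this sketch ignores.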
Problem: Generate review summaries and extract product attribute ratings from reviews
Abstractive summarization is not easy. Huge amounts of training data are needed. Reviews can be coarse rather than fine-grained.
Challenges 
-Diverse attributes
-Synonyms (Sound, audio)
-Informal style
-Stylistic variation (Linguistic style varies between users)
-Sentiment analysis (Gauging sentiment may require deep parsing of sentences)
Problem: Recommendation of right product at the right time at the right place to the right customer
Problem: Predict good competitor prices
Problem: Identify substitute products for a given product in Amazon catalog (Co-viewed products). When we are out of stock, we show substitutes.
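One simple way to read the co-viewed idea is to count how often two products are viewed in the same browsing session and rank candidates by that count. The sketch below does only that, with made-up session data and function names; co-viewing alone does not prove two products are substitutes, and the talk did not specify how Amazon does this.

```python
from collections import Counter
from itertools import combinations

def co_view_counts(sessions):
    """Count, for every unordered product pair, the sessions in which both were viewed."""
    counts = Counter()
    for viewed in sessions:
        # set() removes repeat views within a session; sorted() gives a canonical pair order
        for a, b in combinations(sorted(set(viewed)), 2):
            counts[(a, b)] += 1
    return counts

def substitutes(product, counts, top_n=3):
    """Rank other products by how often they were co-viewed with `product`."""
    scored = Counter()
    for (a, b), n in counts.items():
        if a == product:
            scored[b] += n
        elif b == product:
            scored[a] += n
    return [p for p, _ in scored.most_common(top_n)]

# Hypothetical browsing sessions (lists of viewed product IDs)
sessions = [
    ["kettle_a", "kettle_b", "kettle_c"],
    ["kettle_a", "kettle_b"],
    ["kettle_a", "kettle_b"],
    ["kettle_a", "kettle_c"],
    ["kettle_a", "toaster_x"],
]

counts = co_view_counts(sessions)
print(substitutes("kettle_a", counts))
```

If `kettle_a` were out of stock, the top-ranked candidates from this list could be shown in its place.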
Problem: Retrieve images from the Amazon catalog for visual similarity
Problem: Provide a voice interface to shop for products, perform tasks, answer questions
Q) Your confidence on deep learning?
The ImageNet image classification error rate came down from 25% to 5% with deep learning. Voice recognition is deep learning. Machine translation is deep learning. NLP and translation too. Deep learning is good in specific use cases.
Q) How do you handle fake product reviews?
We look for signals that detect fake reviews. We look for patterns and IP addresses.
