The Awesomeness Of Data Interpretation: Manu Sharma, Principal Data Scientist, LinkedIn

When LinkedIn launched in 2003, it took 500 days to reach its first million customers; the most recent million took just six days. Today the site sees two new registrations every second and 4.2 billion user searches a year, and the data analysis team looks at 200 TB of data each day to understand its users better.

Five years ago, in What Is Web 2.0, Tim O’Reilly said that “data is the next Intel Inside.” Why do we suddenly care about statistics and about data? Why is data science the Sexiest Job of the 21st Century? This lesson focuses on getting some of the answers from Manu Sharma.

by TiE Mumbai
5 months, 1 week ago

LinkedIn is your professional digital real estate. When people look for you and don’t find you, it’s like losing a potential opportunity. It is therefore important to keep your profile up to date. LinkedIn uses data to build products and generate insights to drive the business. To achieve this, LinkedIn has developed proprietary algorithms such as Metropolis. The company processes over 10 billion rows of data every day in real time using solutions it built itself, such as Voldemort, Kafka and Zoie. These have now been made open source.

A data scientist is the right combination of curiosity and intuition: I wonder what I can do with this data? What questions can I ask? What can this data tell me? It’s about having the right intuition to know the limitations of your approaches. The work involves gathering data, standardizing it, doing the right modeling, running stats on it and having the ability to code it. A data scientist needs all these skills, and that’s what startups should look for when setting up their data science teams.
TiE Entrepreneurial Summit 2012 Data Science @ LinkedIn Manu Sharma

Key Applications of Data Science @ LinkedIn

  • Build Innovative Data Products
  • Draw Insights
  • Drive the Business 

Inference algorithms help predict information from users’ network data. This can be extremely critical in future product development. In particular, they helped build the “People you may know” feature, which is key to LinkedIn’s user engagement and viral growth. The feature was invented at LinkedIn and is now used in every social product.
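
The idea of inferring likely connections from network data can be illustrated with a small sketch. The talk summary does not describe how LinkedIn actually computes “People you may know”, so the code below assumes a simple shared-connections heuristic: rank people the user is not yet connected to by how many connections they have in common. The function name, the sample network and the member names are all hypothetical.

```python
from collections import Counter


def suggest_connections(graph, user, top_n=3):
    """Rank non-connections of `user` by number of shared connections.

    `graph` maps each member to the set of members they are connected to.
    This is an illustrative heuristic, not LinkedIn's actual algorithm.
    """
    direct = graph.get(user, set())
    shared = Counter()
    for connection in direct:
        for candidate in graph.get(connection, set()):
            # Skip the user and anyone they already know.
            if candidate != user and candidate not in direct:
                shared[candidate] += 1
    # Highest shared count first; ties broken by name for stable output.
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical toy network.
network = {
    "asha": {"bala", "chitra"},
    "bala": {"asha", "dev", "esha"},
    "chitra": {"asha", "dev"},
    "dev": {"bala", "chitra"},
    "esha": {"bala"},
}

suggestions = suggest_connections(network, "asha")
print(suggestions)
```

Here "dev" ranks above "esha" for "asha" because two of her connections know him, while only one knows "esha". A production system would likely weigh many more signals than a raw count.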

Similarly, LinkedIn built “Skills” by extracting and analyzing the free-form text that users wrote in the specialties section and creating a standardized dictionary of skill keywords. Applying clustering algorithms to these standardized skills can then lead to a lot of interesting insights.
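
The standardization step can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary entries, the function name and the sample profile texts are hypothetical, and whole-phrase matching stands in for whatever extraction LinkedIn really uses. The co-occurrence counts at the end are one possible input a clustering algorithm could work from; the clustering itself is not shown.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical dictionary: lowercase variant phrase -> canonical skill name.
SKILL_DICTIONARY = {
    "machine learning": "Machine Learning",
    "ml": "Machine Learning",
    "java": "Java",
    "hadoop": "Hadoop",
    "map reduce": "MapReduce",
    "mapreduce": "MapReduce",
}


def extract_skills(free_text):
    """Return the set of canonical skills mentioned in free-form text."""
    text = free_text.lower()
    found = set()
    for variant, canonical in SKILL_DICTIONARY.items():
        # Word boundaries stop "java" from matching inside "javascript".
        if re.search(r"\b" + re.escape(variant) + r"\b", text):
            found.add(canonical)
    return found


# Hypothetical free-form "specialties" entries.
profiles = [
    "ML, Java and Hadoop",
    "Machine learning; map reduce",
    "Hadoop / MapReduce / java",
]

# Count how often two standardized skills appear on the same profile.
cooccurrence = Counter()
for profile_text in profiles:
    skills = sorted(extract_skills(profile_text))
    cooccurrence.update(combinations(skills, 2))

print(cooccurrence.most_common(3))
```

Mapping "ML" and "Machine learning" to one canonical name is what makes the later counts meaningful; without it, the same skill would be split across several spellings.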

The data also leads to meaningful insights, such as predicting the future by identifying trends in sectors and the economy. The good thing is that this is not survey data; it’s the real data that users provide through their activities. It’s not surprising that this data became part of the US presidential economic report as policy input. The same data is equally vital in driving business growth.

Best Practices   

  • More data is better than less data
  • Raw data is better than processed data
  • Data Standards and Data Quality are vital
  • Simple Models are better than Complex Models
  • Fail Fast, Iterate, Test, Test and Test