From challenging projects to cutting-edge technology, Twitter uses everything it can to give its users a one-of-a-kind experience. Behind it is the back-end work of some smart people (you can also call them Data Scientists) who make the elite, influencers, bloggers, and even the people around you look cool. From Mann Ki Baat with the PM to the First Lady of the U.S. to hilarious tweets about the World Cup, the world no longer has to wait 24 hours for the news to be aired.
Speaking of Data Scientists at Twitter, this article will help you understand how Data Science is used at Twitter and how its data scientists help maintain the product's competitive edge. You will also get an insight into the types of Data Scientists at Twitter and their roles and responsibilities.
How is Data Science Used at Twitter?
Data Science is used at Twitter in two different ways, because two types of data scientists work there, each with a different focus:
Type A Data Scientists
The A here stands for Analysis. This is a more static approach to analyzing data and gaining insights from it. The work of a Type A data scientist is closely related to that of a statistician. A Type A Data Scientist is well versed in data cleaning, working with large datasets, data visualization, domain knowledge, and so on.
Type B Data Scientists
The B here stands for Building. While Type B Data Scientists share a background in statistics with Type A Data Scientists, they are also well versed in coding and the fundamentals of software engineering. They are responsible for building data products that interact directly with the user, such as products that provide recommendations and other forms of interactive results.
Data Platform at Twitter
The scale of a data platform depends on the type of company, and there are three types:
- Early-Stage Startups
- Mid-Stage Growing Startups
- Enterprises and Large-Scale Companies
An early-stage startup does not need a data-intensive platform like Hadoop, largely because it has little data and faces a cold-start problem. A mid-stage startup focuses mostly on gaining insights from its data. An established company like Twitter, however, already has a well-developed data platform. A large-scale enterprise like Twitter also has further requirements, such as maintaining its competitive edge, efficiency in logistics, and optimization, all of which call for Data Scientists who are skilled at Machine Learning. At Twitter, hundreds of MapReduce jobs are processed daily, supported by efficient and reliable ETL processes.
Responsibilities of a Data Scientist at Twitter
The duties of a Data Scientist at Twitter fall into four categories:
1. Developing Insights from the Product
Using data to discover insights and implementing those ideas to improve the product is one of the main responsibilities of the data scientist. This data is gathered whenever a user interacts with the device, and it is ultimately stored in log files or as metadata for further use.
There are various ways of analyzing this data. One is push notification analysis, a straightforward way of understanding how many users are eligible for push notifications. Another is the analysis of SMS delivery rates across different carriers, and a third is the analysis of users who hold multiple accounts.
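To make the carrier comparison concrete, here is a minimal sketch of how delivery rates could be computed from logged events. The event format, the carrier names, and the function name `delivery_rate_by_carrier` are assumptions made for illustration only, not Twitter's actual schema or code.

```python
from collections import defaultdict


def delivery_rate_by_carrier(events):
    """Return the share of delivered SMS messages per carrier.

    `events` is an iterable of (carrier, delivered) pairs, where
    `delivered` is True when the message reached the user.
    This log format is hypothetical.
    """
    sent = defaultdict(int)
    delivered = defaultdict(int)
    for carrier, was_delivered in events:
        sent[carrier] += 1
        if was_delivered:
            delivered[carrier] += 1
    # Every carrier in `sent` has at least one event, so no division by zero.
    return {carrier: delivered[carrier] / sent[carrier] for carrier in sent}


# Made-up sample data for two hypothetical carriers.
sample_events = [
    ("carrier_a", True),
    ("carrier_a", False),
    ("carrier_b", True),
    ("carrier_b", True),
]
rates = delivery_rate_by_carrier(sample_events)
```

A carrier whose rate sits well below the others would be a natural candidate for a closer look.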
2. Building Data Pipelines
At Twitter, data pipelines are used extensively. A data pipeline aggregates data from various sources and makes it easier for the data scientist to perform operations on it.
The analysis carried out at Twitter runs through these data pipelines. They allow jobs to be executed automatically and power the dashboards through which users consume the data.
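The idea of a pipeline can be sketched in a few lines: pull records from several sources, clean them, and aggregate them into a summary that a dashboard could display. This is only a toy illustration using in-memory lists. The record fields (`user_id`, `action`) and all function names are assumptions, and a production pipeline would run as scheduled jobs on a platform such as Hadoop rather than in a single script.

```python
from collections import Counter
from itertools import chain


def extract(*sources):
    """Combine records from several sources into one stream."""
    return chain.from_iterable(sources)


def transform(records):
    """Drop incomplete records and normalize the action name."""
    for record in records:
        user_id = record.get("user_id")
        action = record.get("action")
        if user_id is None or not action:
            continue  # skip records that cannot be attributed or classified
        yield {"user_id": user_id, "action": action.strip().lower()}


def load(records):
    """Aggregate into per-action counts, e.g. to feed a dashboard."""
    return Counter(record["action"] for record in records)


def run_pipeline(*sources):
    return load(transform(extract(*sources)))


# Made-up logs from two hypothetical clients.
web_logs = [{"user_id": 1, "action": "Tweet"}, {"user_id": 2, "action": " like "}]
mobile_logs = [{"user_id": 3, "action": "tweet"}, {"user_id": None, "action": "like"}]
summary = run_pipeline(web_logs, mobile_logs)
```

Because each stage is a generator or a plain function, stages can be tested on their own and swapped out as the data sources change.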
3. Performing Experimentation (A/B Testing)
Another vital role of a Data Scientist at Twitter is to carry out A/B testing. A/B testing is a randomized experiment with two variants. It is a form of hypothesis testing through which the company can determine which option draws the most users. At Twitter, experiments such as A/B tests are carried out with the company's experimentation tool, Duck Duck Goose (DDG). It works with the big data that the system accumulates: millions of tweets, changes in the social graph, server logs, and records of user interactions through web and mobile clients.
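As an illustration of the hypothesis test behind a two-variant experiment, the sketch below runs a two-proportion z-test on conversion counts. It is a generic textbook test, not a description of how DDG computes its results, and the function name and the sample numbers are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / n_a: conversions and users in variant A (control)
    conv_b / n_b: conversions and users in variant B (treatment)
    Assumes both groups are non-empty and the pooled rate is
    strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 10% vs 15% conversion with 1,000 users per variant.
z_score, p_value = two_proportion_z_test(100, 1000, 150, 1000)
significant = p_value < 0.05
```

A small p-value suggests that the difference between the variants is unlikely to be due to chance alone. In practice, the sample size and significance threshold should be fixed before the experiment starts.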
4. Predictive Modeling
Predictive Modeling and Machine Learning are two of the most critical responsibilities of a Data Scientist. Twitter is a data playground, and its colossal amount of data can be harnessed through various predictive modeling and machine learning techniques.
With the help of machine learning, data scientists at Twitter can reduce the number of spam messages that reach users. Twitter has also applied advanced deep learning techniques to provide relevant notifications.
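To give a feel for how a model can learn to flag spam, here is a toy Naive Bayes text classifier. It is a classic teaching example and an assumption on our part. The article does not say which models Twitter uses, and the class name and training messages below are invented.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy multinomial Naive Bayes classifier with Laplace smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        # Requires at least one training example for each label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            # Start from the log prior, then add log likelihoods per word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize click now", "spam")
spam_filter.train("meeting at noon tomorrow", "ham")
spam_filter.train("lunch with the team", "ham")

label_one = spam_filter.predict("free money")
label_two = spam_filter.predict("team meeting tomorrow")
```

Real spam detection works with far more data and richer signals than word counts, but the principle of learning from labeled examples is the same.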
Summary
In this article, we went through the daily responsibilities of a Data Scientist at Twitter. As a company operating at a very large scale, Twitter relies on these Data Scientists to gain insights about its users and to provide them with relevant content. We also learned about the types of Data Scientists that are hired at Twitter.
We hope that after reading this article you feel motivated to start your career as a Data Scientist at Twitter. If you want more articles like this or more data science case studies, let us know in the comments.
Happy learning