Staff Data Scientist at Google (Head of The Data Guild at Waze, which includes the Data Science Group, Product Analytics Group and Data Infra),
Lecturer at TAU (School of Economics) and IDC (Executive Education),
Former Chief Data Scientist at Outbrain,
PhD, Econometrics - Tel Aviv University
I am a hands-on Data Scientist and Group Lead. I specialize in Economic applications of machine learning. My work combines academic research with industry data-driven products. You may ask yourself - what do theoretical Economic models have to do with the machine learning that Data Scientists practice? Well, in my experience: everything. Economic applications of Machine Learning range from marketplace management and auction optimization in the field of recommendation systems, to efficient transportation and behaviorally nudging people to carpool, and up to optimizing the decision-making processes of NBA coaches and CEOs. A sample of my work in these various domains is detailed below.
SELECTED RESEARCH & PRODUCTS
(PRESENTED IN CONFERENCES + ACADEMIC JOURNALS)
THE ECONOMICS AND DATA SCIENCE OF ELIMINATING TRAFFIC ALTOGETHER
This short lecture outlines the work my Data Science team does at Waze - helping people find great matches for Carpooling on their daily commute. The lecture highlights how our work is influenced by classic Economic models (specifically William Vickrey's), combined with modern Economic models (such as Nudging theories) and Machine Learning modeling.
URBAN PULL: THE ROLES OF AMENITIES & EMPLOYMENT
This paper leverages new measurement of neighborhood amenities to demonstrate that housing prices and rents in U.S. cities are determined nearly as much by proximity to amenities as they are by proximity to employment. We develop a revealed preference measure of amenities using navigations data indicating the locations in which people consume leisure. Consumption amenity centers overlap substantially with employment centers but are distinct and have distinct effects on prices. Using the Alonso-Muth-Mills within-city spatial equilibrium framework, we estimate the relative importance of amenities and employment in demand for neighborhoods. The navigations-based amenity measure strongly and positively predicts local and nearby prices with spatial decay. It adds substantial explanatory value relative to observable-venues-based amenity measures as well as to several strictly localized amenities, such as school quality or crime. We show that constraining neighborhood amenities to be consumed only by locals, when in fact people may travel within city to consume amenities, misses a key feature of cities and biases estimates of both commute costs and the value of amenities. These improvements in amenity measurement increase the estimated importance of amenities relative to employment in location demand and suggest the potential robustness of cities to changes in employment locations.
THE IMPACT OF HIGH-OCCUPANCY VEHICLE LANES ON COMMUTERS: FIELD EVIDENCE
Governments are actively investing to alleviate traffic congestion. Starting from the 1970s, a common policy has been to introduce high-occupancy vehicle (HOV) lanes, that is, traffic lanes exclusively reserved for vehicles with more than one occupant. Following the introduction of HOV lanes in several countries, their effectiveness has been the topic of heated debates. In this paper, we have a unique opportunity to empirically examine the impact of introducing HOV lanes on carpooling. In October 2019, the Israeli government decided to introduce three HOV lanes. In this context, we have access to traffic and carpool data from Waze (the free GPS navigation app owned by Google), both before and after the HOV lanes were introduced. We can thus rigorously quantify the impact of introducing an HOV lane on carpool intent and adoption. We also study refined questions around the effectiveness of different types of HOV lanes (2+ versus 3+), the impact at different times of the day, and the behavioral effect on commuters (e.g., strategically adapting commute times). Our study shows that the introduction of the HOV lanes led to a median time saving during rush hours of 5.7–15.7 minutes and increased the carpool rates by hundreds of percent for some routes. Interestingly, we find that the new HOV lanes have a global impact, as they also raised the carpool rates for routes unaffected by the HOV lanes. This effect can be explained by the increased public awareness of the opportunity to carpool.
NUDGING COMMUTERS TO CARPOOL: A LARGE FIELD EXPERIMENT WITH WAZE
Traffic congestion is a serious global issue. A potential solution, which requires zero investment in infrastructure, is to convince solo car users to carpool. In this paper, we leverage the Waze Carpool service and run the largest ever digital field experiment to nudge commuters to carpool. We find a strong relationship between the affinity to carpool and the potential time saving through a high-occupancy vehicle (HOV) lane. Specifically, we estimate that mentioning the HOV lane increases the click-through rate and conversion rate by 133-185% and 64-141%, respectively, relative to sending a generic message.
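The percentages above are relative lifts over the generic-message baseline. As a minimal sketch of that arithmetic, the snippet below computes a relative lift from two rates. The function name `relative_lift` and the example rates are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def relative_lift(treatment_rate: float, control_rate: float) -> float:
    """Return the relative increase of treatment over control, in percent."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate * 100.0


# Hypothetical click-through rates (illustration only, not data from the paper).
generic_ctr = 0.020  # generic message
hov_ctr = 0.050      # message that mentions the HOV lane

lift = relative_lift(hov_ctr, generic_ctr)
print(f"Relative CTR lift: {lift:.0f}%")
```

With these made-up rates the lift is roughly 150%, which means the treated rate is about 2.5 times the baseline, not 1.5 times.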
MIND THE DATA CONFERENCE
This lecture lays out my manifesto about the current role that Economists have in the industry, and how they should change their practice if they want to keep the Science of Economics in their hands. The most important lesson from this lecture - "Economists should have their skin in the game", meaning - they should build products instead of consulting, and stand behind their failures.
WHICH INCENTIVES GET PEOPLE TO CARPOOL?
(WAZE LATAM SUMMIT, MEXICO CITY 2019)
This lecture outlines the Analytical work being done at Waze on Carpool Incentives: Subsidies, Matching Algorithms, Lock-In Supply and many more.
"FOR YOUR EYES ONLY": CONSUMING VS. SHARING CONTENT ON FACEBOOK
The most comprehensive work ever done to compare what people read online vs. what they share on Facebook. The paper analyzes two types of user interactions with online content: (1) private engagement with content, measured by page-views and click-through rate; and (2) social engagement, measured by the number of shares on Facebook as well as share-rate. Based on more than a billion data points across hundreds of publishers worldwide and two time periods, it is shown that the correlation between these signals is generally low. Potential reasons for the low correlation are discussed, and the notion of private-social dissonance is defined. A more in-depth analysis shows that the dissonance between private engagement and social engagement consistently depends on content category. Categories such as Sex, Crime and Celebrities have higher private engagement than social engagement. On the other hand, categories such as Books, Careers and Music have higher social engagement than private engagement. In addition to the offline analysis, a model which utilizes the different signals was trained and deployed on a live recommendation system. The resulting weights ranked the social signal lower than clickthrough rate. The results are relevant for publishers, content marketers, architects of recommendation systems and researchers who wish to use social signals in order to measure and predict user engagement.
Joint work with Ram Meshulam.
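To make the idea of a "low correlation between the signals" concrete, here is a minimal sketch that computes a Pearson correlation between a private signal (click-through rate) and a social signal (share rate) across articles. The per-article numbers are hypothetical, and Pearson is used only as one common choice; the paper's exact correlation measure is not stated here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-article signals (illustration only, not data from the paper).
click_through_rate = [0.031, 0.012, 0.045, 0.020, 0.008]  # private engagement
share_rate = [0.002, 0.004, 0.001, 0.003, 0.002]          # social engagement

r = pearson(click_through_rate, share_rate)
print(f"Correlation between private and social engagement: {r:.2f}")
```

A value near zero (or negative) would indicate that what people click on says little about what they share, which is the kind of dissonance the paper describes.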
INTRODUCING OUTBRAIN LOOKALIKE AUDIENCES
This is a product that my team at Outbrain developed. A marketer (for example, an online retailer) gives Outbrain a list of valuable users, such as users who have made a purchase, not necessarily through Outbrain. We use machine learning models, such as logistic regression, decision trees and matrix factorization, to characterise these valuable users' content interests. Such interests (we call them 'features', and there are thousands of them) may include the main content categories they are likely or unlikely to read, the publishers they are likely or unlikely to visit, the personas and companies they're interested in, etc. Using these models, we identify in real time a user who is not included in the marketer's list but is similar to those users, and recommend that marketer's campaigns to them.
Research led by Moran Gavish.
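As a rough illustration of the lookalike idea, the toy sketch below trains a tiny logistic regression on binary interest features and scores users who are not on the marketer's list. Everything here is an assumption made for illustration: the three feature names, the data, and the plain gradient-descent trainer. It is not Outbrain's implementation, and it covers only one of the model families mentioned above.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def lookalike_score(user, w, b):
    """Estimated probability that a user resembles the marketer's seed list."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, user)) + b)


# Hypothetical features: [reads_sports, reads_fashion, visits_news_site]
seed_users = [[1, 0, 1], [1, 0, 0], [1, 1, 1]]   # on the marketer's list
other_users = [[0, 1, 0], [0, 1, 1], [0, 0, 1]]  # background sample

X = seed_users + other_users
y = [1] * len(seed_users) + [0] * len(other_users)
w, b = train_logistic(X, y)

# Users who are not on the list are scored at serving time.
similar_user = [1, 1, 0]
dissimilar_user = [0, 0, 0]
print(lookalike_score(similar_user, w, b), lookalike_score(dissimilar_user, w, b))
```

In this toy data the sports feature separates the two groups, so a new user who reads sports scores higher than one who does not. A production system would use thousands of features and a held-out evaluation.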
USER ENGAGEMENT - BEYOND CLICKS
Outbrain serves over 150 billion content recommendations to more than 500 million users every month. Masses of data tell us what’s driving the mindset of the crowd at each point in time. But how do you analyze whether the individual user finds real value in recommendations? And why is being satisfied with click-focused metrics dangerous for long-term growth?
This lecture outlines a Data Scientist’s experience and challenges when analyzing post-click engagement in the context of content discovery. It shows examples of how relying on click-focused metrics can mislead you in the long run. We will share data on how the crowd's content-consumption preferences differ from individual user preferences. Finally, we suggest a 3-layer framework for Data Scientists to measure and analyze post-click engagement, while considering the perspectives of host publishers, marketers and recommendation providers.
TERMINATION RISK AND AGENCY PROBLEMS: EVIDENCE FROM THE NBA
When organizational structures and contractual arrangements face agents with a significant risk of termination in the short term, such agents may under-invest in projects whose results would be realized only in the long term. We use NBA data to study how risk of termination in the short term affects the decision of coaches. Because letting a rookie play produces long-term benefits on which coaches with a shorter investment horizon might place lower weight, we hypothesize that higher termination risk might lead to lower rookie participation. Consistent with this hypothesis, we find that, during the period of the NBA’s 1999 collective bargaining agreement (CBA) and controlling for the characteristics of rookies and their teams, higher termination risk was associated with lower rookie participation and that this association was driven by important games. We also find that the association does not exist for second-year players and that the identified association disappeared when the 2005 CBA gave team owners stronger incentives to monitor the performance of rookies and preclude their underuse.
INTRODUCTION TO ECONOMETRICS
Tel Aviv University,
The Eitan Berglas School of Economics,
Executive Education Program
BIG DATA FOR ECONOMISTS
Arison School of Business Administration
(Also taught at TAU School of Economics)
PROMOTING DATA SCIENCE EDUCATION
A/B testing systems have become a mandatory tool for Data Scientists and Product Managers for getting insights and learning which features work and drive user engagement. In this lecture (in Hebrew), I outline the 4 fundamental hazards that rapid-growth startups face in using A/B tests for key learnings, especially in today's marketplace-oriented products.
HEBREW UNIVERSITY'S PPE CONFERENCE
LEARN DATA SCIENCE ONLINE FOR FREE
Even if you can't go to college, you can still become a proficient data scientist, almost for free. This is my "greatest hits" list of online classes. It makes up a fairly complete survival kit for aspiring data scientists.
CLICK PREDICTION CONTEST ON KAGGLE
Our “Outbrain Challenge” was a call out to the research community to analyze our data and model user reading patterns, in order to predict individuals’ future content choices. The best models were rewarded with cash prizes totaling $25,000. The sheer size of the data we’ve released (100 GBs) was unprecedented on Kaggle, the competition’s platform, and was considered extraordinary for such competitions in general. Crunching all of the data may be challenging to some participants—though Outbrain does it on a daily basis.
THE TECHNION'S NEW DATA SCIENCE PROGRAM - A REVIEW
BIG DATA ON THE BAR
(A light lecture for potential undergraduate students at Dizzy Frishdon, Tel Aviv)