Saturday, June 22, 2024

The Fault in Our Recommendations

Omar Besbes, Akshit Kumar and I just posted the initial version of a new paper, "The Fault in Our Recommendations: On the Perils of Optimizing the Measurable", to the arXiv. The main idea is simple, and yet, we think, powerful. I'll give a highly informal version of it here.

Online platforms for content (or goods and services) typically optimize recommendations for engagement (or purchases). This is natural: engagement is easily and immediately measurable, and the longer you can extend each user session, the more ads and sponsored content you can show, right? Not quite, we argue. Not all clicks are created equal. Some generate far more value for the user than others, and facilitating that kind of discovery can boost user retention and lifetime value. For some users, discovering "niche" content they like generates much more value than consuming only mass-market content; other users prefer sticking to mass-market content. Crucially, modern platforms can serve both kinds of users well simultaneously, even without knowing a priori which users are interested in exploring niche content, and even without being able to precisely measure user utility. The way to do this, we argue, is to recommend to each user a mix of niche and mass-market content. Such an approach reduces engagement only minimally while boosting user welfare significantly, relative to an engagement-maximizing approach that recommends only mass-market content.

I should mention that Akshit is a rising final-year PhD student in Columbia DRO, co-advised by Omar and me, and he'll be on the 2024-25 faculty job market. He's great; keep an eye out for him! Ping me if you want to know more about him.

P.S. There is some energy now around utility-aware recommendations. Here's an interesting workshop being organized by my colleague Dan Russo and others at RecSys 2024: https://sites.google.com/view/sure-2024/home

Saturday, October 14, 2023

Dynamic Spatial Matching (newly revised)

Today I'll share a bit about a paper which is very close to my heart. I recently revised it and posted the new version to the arXiv: https://arxiv.org/abs/2105.07329


Title: Dynamic Spatial Matching

Abstract: Motivated by a variety of online matching platforms, we consider demand and supply units which are located i.i.d. in [0,1]^d, and each demand unit needs to be matched with a supply unit. The goal is to minimize the expected average distance between matched pairs (the "cost"). We model dynamic arrivals of one or both of demand and supply with uncertain locations of future arrivals, and characterize the scaling behavior of the achievable cost in terms of system size (number of supply units), as a function of the dimension d. Our achievability results are backed by concrete matching algorithms. Across cases, we find that the platform can achieve cost (nearly) as low as that achievable if the locations of future arrivals had been known beforehand. Furthermore, in all cases except one, cost nearly as low in terms of scaling as the expected distance to the nearest neighboring supply unit is achievable, i.e., the matching constraint does not cause an increase in cost either. The aberrant case is where only demand arrivals are dynamic, and d=1; excess supply significantly reduces cost in this case.


The goal of the paper is to establish technical foundations for dynamic matching and resource allocation where demand and supply live in d-dimensional space and there are many types. My motivation for building such a foundation comes loosely from ridehailing, matching platforms, and network revenue management with many types. And of course, it was an opportunity to introduce a new, fundamental model and say something about it :) Motivating literatures include static random spatial matching (including Ajtai, Komlos and Tusnady '84, Peter Shor's thesis work, and Talagrand's work) and stochastic bin packing.
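
To give a flavor of the setup, here's a toy greedy baseline of my own (emphatically not the paper's Hierarchical Greedy algorithm): demand points in [0,1]^d arrive one at a time, and each is matched to the nearest still-unmatched supply unit. It assumes at most as many demand arrivals as supply units; all names are illustrative.

```python
import math

def greedy_dynamic_match(supply, demand_stream):
    """Greedy baseline for dynamic spatial matching: each arriving demand
    unit (a point in [0,1]^d) is matched to the nearest remaining supply
    unit. Returns the list of matched distances (whose average is the cost)."""
    remaining = list(supply)
    costs = []
    for point in demand_stream:
        # index of the nearest remaining supply unit (Euclidean distance)
        j = min(range(len(remaining)),
                key=lambda i: math.dist(point, remaining[i]))
        costs.append(math.dist(point, remaining[j]))
        remaining.pop(j)
    return costs
```

The interesting question the paper studies is how much better one can do than naive greedy, and how the achievable cost scales with the number of supply units and the dimension d.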

From a personal perspective, there are a few things which have been special about this project for me. For one, this is my first single-author paper in more than 10 years. Why? Well, I tend to write papers with others, since I like collaborating with good and friendly people and learning from them. It's nice to have somebody to talk to about the research, and also about the publication process, disseminating the work, and whatnot. Even with this particular project, I interacted with many folks about it over months and years, including multiple PhD students, several colleagues, and the renowned mathematician Michel Talagrand (which was a special experience in itself!). In fact, the idea for the key Hierarchical Greedy algorithm, which facilitates my analysis and hence the main results in the paper, came out of a conversation with Michel, in the context of his epic latest book: https://link.springer.com/book/10.1007/978-3-030-82595-9. Nevertheless, none of these interactions developed into a full-blown collaboration, and I ended up doing this paper myself. Writing it alone has been a great reminder that writing one of these papers is hard, and takes a lot out of you. And yet, producing research like this is exactly why I'm in the profession in the first place, so no complaints :)

A second, related, special thing about this project is that it connects up conceptually with an exploratory project I was a part of way back in 2011, on Distributed Storage for Intermittent Energy Sources, with Andrea Montanari, David Tse and Baosen Zhang. There's something truly beautiful about wandering off in one's research for years, while something "old" continues to gestate at the fringes of one's interests, and eventually coming back to a topic after a decade or more with a new angle to move the needle. That's exactly what happened here. 

A third special thing about this project has been my experience with the referee who reviewed the paper for the Annals of Applied Probability; it has perhaps been my best referee experience to date. The referee found an intricate bug in the proof of the easier of my results, and went on to construct a detailed fix, which would have taken me multiple days (at least) to come up with myself! So the bug was short-lived. In addition, the referee suggested a very simple and elegant way to obtain my results for all d >= 3; I've included the insight in the paper. Many thanks to the anonymous referee, and here's hoping that I have the fortune to encounter more such brilliant, dedicated and helpful referees in the future.

Thursday, September 14, 2023

What I got done in Amazon's supply chain so far

I figured I'd give occasional blogging a go, and this will be my first post. While I don't have a detailed plan for this space, I intend to use it to share professional updates that are top of mind for me, such as career updates, blurbs about new papers (especially my own!), and possibly general musings about where things seem to be headed with market design, supply chain optimization, AI, RL, and operations. This post is a brief one about what I managed to get done so far at Amazon, where I spent the last year doing supply chain optimization full time. The sabbatical gave me the chance both to learn and to make a substantial impact.

At Amazon, I led the creation of a general-purpose primal-dual algorithm for solving very large resource allocation problems (millions of decisions, with hundreds of thousands of constrained resources), e.g., for assigning customer shipments to "paths" in the supply chain. The new general solver works an order of magnitude better and faster than the previously used custom algorithms, and is already being used by two key supply chain decision systems, with more in the pipeline. This work won the Operations Research Best Paper award at Amazon's Consumer Science Summit 2023 (1/108 submissions). A primal-dual algorithm, informally speaking, solves a resource allocation problem via a "negotiation" between millions of orders/shipments on how to share scarce resources (e.g., transportation capacities, sortation capacities, last mile capacities). My brilliant former PhD student, Pengyu Qian, taught me primal-dual algorithms. I leveraged his mirror backpressure concept in my algorithm design:
Y. Kanoria and P. Qian, "Blind Dynamic Resource Allocation in Closed Networks via Mirror Backpressure," ACM EC 2020; to appear in Management Science.
A shout out also to my colleague Santiago Balseiro, who has successfully used similar ideas in a different application, and shared some of his experience with me. 
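
To give a flavor of that "negotiation", here's a toy sketch of my own construction (a plain dual-price / subgradient scheme, not Amazon's actual solver and not mirror backpressure): orders repeatedly pick their cheapest path at the current resource prices, and prices rise on overloaded resources and relax toward zero on underused ones. All names and parameters are illustrative.

```python
def primal_dual_allocate(orders, capacity, step=0.1, rounds=200):
    """Toy primal-dual 'negotiation'. `orders` maps an order id to a list of
    feasible paths, each a (base_cost, resources_used) pair; `capacity` maps
    each resource to its capacity. Returns the final path assignment and the
    final resource prices."""
    prices = {r: 0.0 for r in capacity}
    assignment = {}
    for _ in range(rounds):
        load = {r: 0 for r in capacity}
        for order, paths in orders.items():
            # each order selects the path minimizing base cost + resource prices
            best = min(paths, key=lambda p: p[0] + sum(prices[r] for r in p[1]))
            assignment[order] = best
            for r in best[1]:
                load[r] += 1
        for r in capacity:
            # subgradient step: raise the price of an over-capacity resource,
            # lower it (never below zero) when the resource is underused
            prices[r] = max(0.0, prices[r] + step * (load[r] - capacity[r]))
    return assignment, prices
```

For example, if two orders both prefer a path through a resource with capacity one, its price rises until the order with the cheaper outside option switches away, leaving a feasible allocation. Real instances need much more care (capacitated flows, convergence guarantees, averaging of iterates), which is where ideas like mirror backpressure come in.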

One other big thing seems to have landed at Amazon. I discovered a major inefficiency in how customer orders were being assigned to fulfillment centers. The new decision framework I proposed will be simpler, more transparent, and more efficient, and is expected to save nearly half a billion dollars per year. My proposed system will make tradeoffs consistently, allowing it to operate at the system-level Pareto frontier.

I'll be continuing my work at Amazon part time; there are billions of dollars (and proportional amounts of carbon) still to be saved, and lots more for me to learn! The two projects I described above were perhaps among the lowest-hanging fruit that I identified, and I was able to move them forward less than a year after I formulated the underlying ideas. I have made a number of additional proposals, which are being considered (or not, depending on the weather that day); they are more ambitious but also more complex to assess and roll out. Having achieved some success, I sense more trust and backing from leadership, and I'm hopeful that some of them will move forward in time. If I get a chance, I'll share my experience this past year with organizational inertia and the trust-building process in industry in a different post. Overall, I'd say that identifying and "solving" scientific problems has been the quick and easy part of my journey at Amazon, which is obviously very different from my experience as an academic (though peer reviewers can certainly be a tough bunch!).

I'm super excited to be back full time at Columbia (since Sep 1), with my students and colleagues, focused on research and teaching. The AI wave seems to be rapidly engulfing all things in my professional sphere, and I'm curious to see where it will go, and how I can contribute.