Longxiu Tian

Welcome! I'm an Assistant Professor of Marketing at the UNC Kenan-Flagler Business School, where I teach Customer Relationship Management (CRM). I received my PhD in Marketing and Scientific Computing from the University of Michigan.

My research focuses on using statistical models, particularly Bayesian econometrics and causal inference, to understand how consumers respond to marketing activities and make choices in uncertain environments.




Manuscripts and Publications

Optimizing Price Menus for Duration Discounts: A Subscription Selectivity Field Experiment

Longxiu Tian and Fred M. Feinberg

Marketing Science (2020)


Subscription services typically offer duration discounts, rewarding longer commitments with lower per-period costs. The “menu” of contract plan prices must be balanced to encourage potential customers to select into subscription overall and to nudge those that do to more profitable contracts. Because subscription menu prices change infrequently, they are difficult to optimize using historical pricing data alone.

We propose that firms solve this problem via an experiment and a model that jointly accounts for whether to opt in and, conditionally, which plan to choose. To that end, we conduct a randomized online pricing experiment that orthogonalizes the “elevation” and “steepness” of price menus for a major dating pay site. Users’ opt-in and plan choice decisions are analyzed using a novel model for correlated binary selection and multinomial conditional choice, estimated via Hamiltonian Monte Carlo. Benchmark comparisons suggest that common models of consumer choice may systematically misestimate price substitution patterns, and that a key consideration is the distinctiveness of the opt-out (e.g., non-subscription) option relative to others available.

Our model confirms several anticipated pricing effects (e.g., on the margin, raising prices discourages both opt-in overall and choice of any higher-priced plans), but also some that alternative models fail to capture, most notably that across-the-board pricing increases have a far lower negative impact than standard random-utility models would imply. Joint optimization of the price menu’s component prices suggests the firm has set them too low overall, particularly for its longest-duration plan, and adjusting the price menu accordingly should lead to roughly 10% greater revenue.


Broadening the Horizon: Augmenting One-Shot Field Experiments with Longitudinal Customer Data

Longxiu Tian and Fred M. Feinberg


Managers of online services are interested in both short- and long-term profitability, and in how each is affected by customer-facing service attributes. This is especially true in subscription settings, where different contracts, characterized by length, price, and other attributes, can affect long-run profits. A commonplace customer acquisition strategy used by these services involves offering price promotions targeted at new customers to induce initial conversion, while reverting to standard unpromoted prices for acquired customers. As customers’ initial contract choice can have a long-run impact on their subsequent renewal choices, setting new-member pricing represents a tradeoff between running promotions steep enough to convert new customers and attracting customers into contract options that maximize customer lifetime value (CLV). However, conventional methods for measuring the long-run effects of initial price promotions, e.g., longitudinal A/B tests, are time-consuming, as they require extended trial periods for long-run resubscription behaviors to play out.

To address this shortcoming, we introduce a Bayesian nonparametric data fusion framework that enables inference on the long-run effects of initial price promotions using only a parsimonious ‘one-shot’ experiment on initial conversions, augmented with resubscription patterns found in longitudinal customer (CRM) databases. We apply our proposed framework to impute the long-run trajectory of resubscription choices for subjects from an A/B price test whose observations are limited solely to the initial conversions. We develop a class of Gaussian process (GP) prior data fusion models that use Bayesian regularization as the mechanism for sharing information across the datasets at the customer-choice-occasion level. The degree to which the longitudinal data regularize, or inform, the experimental subjects’ likely renewal trajectories is given by the GP’s Automatic Relevance Determination (ARD) kernel, which allows for differential degrees of regularization based on the distances between observations in the joint space of the customers’ characteristics, price offers, and time trends. Beyond the specific application to estimating the long-run CLV of the ‘one-shot’ experimental subjects, the data fusion framework can be generalized to any customer-base analytics employing a discrete-choice hazard function.

As GP computation scales cubically with the number of observations, conventional gradient- and MCMC-based estimation strategies are intractable given the large-scale nature of the fused data sources, both in terms of runtime and memory. To overcome these hurdles, we leverage the sparse ‘inducing point’ GP approach (Titsias 2009) for stochastic variational inference (SVI), in a first-of-its-kind application of this highly scalable Bayesian estimation strategy in marketing, with applicability to a broad class of choice, response, and latent class models.


Impact of Rhinitis on Work Productivity: A Systematic Review

Vandenplas, Olivier, [et al. including Tian, L.] 

The Journal of Allergy and Clinical Immunology: In Practice (2017)


Allergic rhinitis (AR) is increasingly acknowledged as having a substantial socioeconomic impact associated with impaired work productivity, although available information remains fragmented. This systematic review summarizes recently available information to provide a quantitative estimate of the burden of AR on work productivity including lost work time (i.e., absenteeism) and reduced performance while working (i.e., presenteeism). A Medline search retrieved original studies from 2005 to 2015 pertaining to the impact of AR on work productivity. A pooled analysis of results was carried out with studies reporting data collected through the validated Work Productivity and Activity Impairment (WPAI) questionnaire. The search identified 19 observational surveys and 9 interventional studies. Six studies reported economic evaluations. Pooled analysis of WPAI-based studies found an estimated 3.6% (95% confidence interval [CI], 2.4; 4.8%) missed work time and 35.9% (95% CI, 29.7; 42.1%) had impairment in at-work performance due to AR. Economic evaluations indicated that indirect costs associated with lost work productivity are the principal contributor to the total AR costs and result mainly from impaired presenteeism. The severity of AR symptoms was the most consistent disease-related factor associated with a greater impact of AR on work productivity, although ocular symptoms and sleep disturbances may independently affect work productivity. Overall, the pharmacologic treatment of AR showed a beneficial effect on work productivity. This systematic review provides summary estimates of the magnitude of work productivity impairment due to AR and identifies its main determinant factors. This information may help guide both clinicians and health policy makers.


Improving Credit Score Forecasts when Data are Sparse: A Dynamic Hierarchical Gaussian Process Model

Longxiu Tian, Linda Salisbury, and Fred M. Feinberg


Credit scores play a vital role in reducing the risk of lending, insuring, and renting to consumers. Accurate scoring aids financial and other institutions, which rely on a portfolio of interrelated credit scores to vet prospective customers and set attractive risk premiums. Like all forecasts, credit scores are vulnerable to data missingness, such as unbalanced, incomplete, or thin credit files. These can in turn lead to non-random gaps in the score history that reduce the reliability of the scores as metrics to target profitable borrowers and identify cross-selling opportunities for financial products and services. To address this problem, we develop a Bayesian nonparametric data fusion model to impute credible intervals for gaps in individual score histories within a portfolio of dynamically and contemporaneously interrelated scores. We apply this model to novel data from a leading credit bureau on scores for a segment of the U.S. population from 2011 to 2015, along with the credit file decision attributes used to generate them. To address the high dimensionality of both the model and feature spaces, we apply the “Gaussian Integral Trick” to decorrelate prior distributions, enabling scalable and efficient model estimation using stochastic mean-field variational inference. We find that credit score portfolios with imputed missingness are more accurate in predicting consumer delinquencies and bankruptcies than existing scores.


Privacy Preserving Data Fusion

Dana Turjeman and Longxiu Tian


Many firms collect data from online and offline sources. These data often include behavioral information that sheds light on users’ habits. Despite the vast amount of data that can be obtained in these forms, user-base surveys and questionnaires remain essential for uncovering a deeper and more thorough understanding of customers’ attitudes and preferences. One challenge surrounding user surveys is the reliability of responses to sensitive questions if respondents suspect that their personal identity may be exposed. While anonymous surveys can alleviate this source of unreliable responses to sensitive questions, they hamper the ability of researchers to combine user surveys with existing behavioral data. In this study, we develop a variational auto-encoder (VAE) framework that allows for inference on the joint distributions of data across the anonymous surveys and existing data, while ensuring that the survey respondents will not be identified. Given that missingness can occur across multiple data features, a key limitation of conventional data imputation methods is that model complexity scales with the number of missing values. VAEs overcome this limitation by treating missingness as arising from a single generative model, and instead seek to encode the joint data generating process as a nonparametric random function that may in turn be used to decode where missingness may arise. The method allows for data, whether individual values or entire covariates, to be missing at random, and for truncation into categories. The method can also be extended to combining several data sets, while maintaining them separately for privacy purposes.


Ann Arbor, MI


Stephen M. Ross School of Business
Ph.D. in Marketing and Scientific Computing

Dissertation: Bayesian Nonparametrics for Marketing Response Models (Chair: Fred M. Feinberg)


Cambridge, MA


Sloan School of Management
Master in Finance

Evanston, IL


Weinberg College of Arts and Sciences
B.A. in Economics, M.S. in Information Systems