Adding Multinomial and Nested Logit Models for Consumer Choice #1654

NathanielF · 2025-04-27T16:06:26Z

Description

I'm adding two new model classes for discrete choice style models that I intend to be part of the consumer choice module.
As it stands, I'm opening the PR as a draft for discussion around the implementation choices and API design I have for these models.

Related to this issue: #1653. There is a lot of potential in discrete choice style models for Bayesian modelling in particular: the state-of-the-art models in this domain involve a mixed logit parameterisation, for which "vanilla" implementations are pretty straightforward using Bayesian hierarchical parameterisations.

Two New Models

Main things to flag: there are now two new model files in the consumer choice folder (pymc_marketing/customer_choice): the simple Multinomial Logit and the Nested Logit. As outlined in the issue, I've restricted the nested logit to no more than two layers of nesting. I believe this will bring us beyond parity with packages like mlogit in R and pylogit in Python, which allow for only one-level-deep nesting structures.

API Discussion

The API I'm suggesting for these models differs from the typical X, y inputs used by models in PyMC-Marketing in general. Mostly this is because I feel the use of Wilkinson-style notation is important here. For instance, this is how you specify the Nested Logit Model currently:

We assume a wide-data input as well:
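As a rough illustration of what "wide" means here: one row per decision-maker, with one column per alternative for each alternative-specific attribute. The sketch below is plain Python, not the PR's data loader. The column naming convention (`<attribute>_<alternative>`), the alternatives `gc` and `er`, the `income` column, and all the values are made up for illustration, loosely echoing the `gc ~ ic_gc + oc_gc |` formula fragment quoted later in this thread.

```python
# Hypothetical wide-format rows: one row per decision-maker.
# Assumed naming convention: "<attribute>_<alternative>". All values are made up.
wide_rows = [
    {"id": 1, "depvar": "gc", "ic_gc": 866.0, "ic_er": 995.8,
     "oc_gc": 199.7, "oc_er": 553.3, "income": 7},
    {"id": 2, "depvar": "er", "ic_gc": 727.9, "ic_er": 694.6,
     "oc_gc": 168.7, "oc_er": 439.1, "income": 5},
]
alternatives = ["gc", "er"]
alt_attributes = ["ic", "oc"]   # vary across alternatives
fixed_attributes = ["income"]   # constant for a given decision-maker


def wide_to_long(rows, alternatives, alt_attributes, fixed_attributes):
    """Expand each wide row into one long row per alternative."""
    long_rows = []
    for row in rows:
        for alt in alternatives:
            long_row = {"id": row["id"], "alt": alt, "chosen": row["depvar"] == alt}
            for attr in alt_attributes:
                long_row[attr] = row[f"{attr}_{alt}"]
            for attr in fixed_attributes:
                long_row[attr] = row[attr]
            long_rows.append(long_row)
    return long_rows


long_rows = wide_to_long(wide_rows, alternatives, alt_attributes, fixed_attributes)
```

The long form makes the distinction visible: `ic` and `oc` change from row to row within a decision-maker, while `income` is simply repeated.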

Causal Inference and Counterfactuals

The value that these models bring is their focus on causal inference. The history of discrete choice models effectively stems from the observation that multinomial logit models cannot support plausible counterfactuals around market interventions, due to the independence of irrelevant alternatives (IIA) property, and that more sophisticated discrete choice models like the nested logit are able to address this. For instance, a pricing intervention in a multinomial logit results in proportional re-allocation of market share to the rest of the market.
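The proportional re-allocation can be checked numerically. This is a minimal standard-library sketch, not code from the PR: the three products and their utilities are invented, and `mnl_shares` is a hypothetical helper that just applies a softmax.

```python
import math


def mnl_shares(utilities):
    """Multinomial logit choice probabilities (softmax over utilities)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(exp_u.values())
    return {k: e / total for k, e in exp_u.items()}


# Made-up utilities for three hypothetical products.
before = mnl_shares({"a": 1.0, "b": 0.5, "c": 0.0})
# A price increase on "a" lowers its utility; "b" and "c" are untouched.
after = mnl_shares({"a": 0.2, "b": 0.5, "c": 0.0})

# IIA: the ratio of shares between "b" and "c" ignores what happened to "a".
ratio_before = before["b"] / before["c"]
ratio_after = after["b"] / after["c"]
```

Both ratios equal `exp(0.5)`, so the share lost by "a" is split between "b" and "c" in proportion to their existing shares, however similar or dissimilar they are to "a".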

We demonstrate this problem and its solution by adding two new notebooks to the gallery.

In the second notebook, for the nested logit, we show how the IIA problem is addressed by this extra nesting structure.
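For readers who want to see the mechanism outside the notebook, here is a minimal sketch of the textbook two-level nested logit probabilities using only the standard library. It is not the PR's implementation: the function name, the nest assignment, the dissimilarity parameters (`lambdas`), and the utilities are all assumptions made for illustration.

```python
import math


def nested_logit_shares(utilities, nests, lambdas):
    """Two-level nested logit choice probabilities.

    utilities: {alternative: utility}
    nests:     {nest: [alternatives in that nest]}
    lambdas:   {nest: dissimilarity parameter, 1.0 means no within-nest correlation}
    """
    # Inclusive value of each nest: log-sum-exp of the scaled utilities.
    inclusive = {
        nest: math.log(sum(math.exp(utilities[j] / lambdas[nest]) for j in alts))
        for nest, alts in nests.items()
    }
    denom = sum(math.exp(lambdas[n] * inclusive[n]) for n in nests)
    shares = {}
    for nest, alts in nests.items():
        p_nest = math.exp(lambdas[nest] * inclusive[nest]) / denom
        for j in alts:
            p_within = math.exp(utilities[j] / lambdas[nest] - inclusive[nest])
            shares[j] = p_nest * p_within
    return shares


# Hypothetical setup: "a" and "b" are close substitutes, "c" sits alone.
nests = {"n1": ["a", "b"], "n2": ["c"]}
lambdas = {"n1": 0.5, "n2": 1.0}

before = nested_logit_shares({"a": 1.0, "b": 0.5, "c": 0.0}, nests, lambdas)
after = nested_logit_shares({"a": 0.2, "b": 0.5, "c": 0.0}, nests, lambdas)

ratio_before = before["b"] / before["c"]
ratio_after = after["b"] / after["c"]
```

With these numbers the b-to-c ratio rises after the intervention on "a": the nest-mate "b" picks up more than its proportional share. Setting every dissimilarity parameter to 1.0 collapses the formula back to the plain multinomial logit.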

Fixed Attributes and Alternative Specific Attributes

One thing I've done is to ensure that the models can identify parameters for both the alternative-specific attributes (e.g. price) and the individually fixed attributes (e.g. income). I've done my best to benchmark the parameter identification and recovery against R's mlogit package.

How to Proceed?

I have not done an extensive write-up of the math behind these types of models, and some of the functions need more documentation and tests. But I wanted to share what I have so far to generate discussion and maybe decide on how to proceed. One immediate improvement I can think of would be to remove duplication from the nested logit and multinomial logit model classes, making them instances of a more general "DISCRETE CHOICE" class where we could re-use e.g. the formula parsing functions. Additionally, I'd like to benchmark the parameter identification with a second data set and example.

Longer term, I think there is room for adding a vanilla mixed-logit example too.

Anyway, open to feedback. Adding a draft PR now to check which linting and testing failures I have.

Related Issue

Closes #
Related to Adding Discrete Choice Models to Consumer Choice Module #1653

Checklist

Checked that the pre-commit linting/style checks pass. Feel free to comment pre-commit.ci autofix to auto-fix.
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks) using numpydoc format.
If you are a pro: each commit corresponds to a relevant logical change

📚 Documentation preview 📚: https://pymc-marketing--1654.org.readthedocs.build/en/1654/

review-notebook-app · 2025-04-27T16:06:31Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

codecov · 2025-04-27T16:09:10Z

Codecov Report

❌ Patch coverage is 93.99293% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.01%. Comparing base (900926b) to head (a755b57).
⚠️ Report is 98 commits behind head on main.

Files with missing lines	Patch %	Lines
pymc_marketing/customer_choice/nested_logit.py	93.71%	22 Missing ⚠️
pymc_marketing/customer_choice/mnl_logit.py	94.44%	12 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1654      +/-   ##
==========================================
+ Coverage   90.08%   92.01%   +1.92%     
==========================================
  Files          60       62       +2     
  Lines        6819     7385     +566     
==========================================
+ Hits         6143     6795     +652     
+ Misses        676      590      -86

williambdean · 2025-06-15T21:27:08Z

pymc_marketing/customer_choice/mnl_logit.py

+            "alphas_": alphas,
+            "betas": betas,
+            "betas_fixed_": betas_fixed,


Why do some have suffix of _ and others do not?

I need to zero out one of the alternatives (alphas_ --> alphas), and I was doing the same for betas_fixed_, for parameter identification.

e.g. alphas = pm.Deterministic("alphas", pt.set_subtensor(alphas_[-1], 0)) ...
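The reason one alternative has to be pinned can be shown without PyMC. In this standard-library sketch (the intercept values are invented, and `softmax` is a local helper rather than a library call), shifting every intercept by the same constant leaves the choice probabilities unchanged, so the raw parameters are not identified until one of them is fixed at zero.

```python
import math


def softmax(values):
    """Choice probabilities from a list of utilities."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


alphas_raw = [0.8, -0.3, 1.5]               # hypothetical unconstrained intercepts
shifted = [a + 10.0 for a in alphas_raw]    # different parameters, same model

probs = softmax(alphas_raw)
probs_shifted = softmax(shifted)            # identical to probs up to rounding

# With the last alternative as the reference (intercept fixed at 0), each
# probability vector maps to exactly one intercept vector.
pinned = [math.log(p / probs[-1]) for p in probs]
probs_pinned = softmax(pinned)
```

Here `pinned[-1]` is exactly `0.0` and `probs_pinned` reproduces `probs`, which is the role the zeroed entry plays in the suffixed and unsuffixed variables discussed above.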

williambdean · 2025-06-15T21:31:43Z

pymc_marketing/customer_choice/mnl_logit.py

+
+        return result
+
+    def parse_formula(self, df, formula, depvar):


Can we pull this out into a function instead of a method? A good separation might be:

parse_formula(formula) -> tuple[target, alt_covariates, fixed_covariates]

check_columns(covariates, df)

check_dependent_variable(target, ser)

Yeah, can do... probably not tonight though. Wrecked. Will try and pick it up during the week.

no worries. take your time

I've pulled out the checks and made them staticmethods that are then invoked inside parse_formula. Not sure if that's exactly what you meant?
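For reference, the three-function split suggested above could look roughly like this. It is a sketch under assumptions, not the code that was merged: the `target ~ alt_covariates | fixed_covariates` layout is inferred from the `"gc ~ ic_gc + oc_gc | "` example in the nested logit notebook, the helpers take plain lists instead of a DataFrame or Series so the example stays dependency-free, and the meaning given to `check_dependent_variable` (the target must appear among the observed choices) is a guess.

```python
def parse_formula(formula: str) -> tuple[str, list[str], list[str]]:
    """Split 'target ~ alt_covs | fixed_covs' into its three parts."""
    if "~" not in formula:
        raise ValueError("formula must contain '~'")
    target, rhs = (part.strip() for part in formula.split("~", 1))
    # Everything after an optional "|" is treated as fixed covariates.
    alt_part, _, fixed_part = rhs.partition("|")
    alt_covariates = [c.strip() for c in alt_part.split("+") if c.strip()]
    fixed_covariates = [c.strip() for c in fixed_part.split("+") if c.strip()]
    return target, alt_covariates, fixed_covariates


def check_columns(covariates: list[str], columns: list[str]) -> None:
    """Raise if any covariate named in the formula is missing from the data."""
    missing = [c for c in covariates if c not in columns]
    if missing:
        raise ValueError(f"columns not found in data: {missing}")


def check_dependent_variable(target: str, observed: list[str]) -> None:
    """Raise if the target alternative never appears among observed choices."""
    if target not in set(observed):
        raise ValueError(f"target '{target}' not found in dependent variable")


target, alt_covs, fixed_covs = parse_formula("gc ~ ic_gc + oc_gc | income")
check_columns(alt_covs + fixed_covs, ["ic_gc", "oc_gc", "income", "depvar"])
check_dependent_variable(target, ["gc", "er", "gc"])
```

Keeping the parsing free of any DataFrame argument means it can be unit-tested with strings alone, which seems to be the point of the suggested separation.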

NathanielF · 2025-06-15T21:52:34Z

Ok folks. Making progress.

Added a bunch of type-hints and improved doc-strings to the nested logit class. Also added a light write up to both notebooks for multinomial and nested logit on the whys and hows.

Still missing a few doc-strings and will address @williambdean 's request soon too.

Thanks for the feedback @juanitorduz , @williambdean .

juanitorduz · 2025-06-16T05:38:03Z

great! (also, the notebooks are missing the watermarks at the end :) )

NathanielF · 2025-06-16T09:44:29Z

Added the watermarks too @juanitorduz :

NathanielF · 2025-06-16T17:20:20Z

Pretty happy with this now @juanitorduz , @williambdean . I think I addressed the concerns above.

docs/source/notebooks/customer_choice/mnl_logit.ipynb

williambdean · 2025-06-17T23:26:16Z

docs/source/notebooks/customer_choice/nested_logit.ipynb

@@ -0,0 +1,2488 @@
+{


Line #3. "gc ~ ic_gc + oc_gc | ",
Is the "|" required here?

No, these should be handled here:

pymc-marketing/pymc_marketing/customer_choice/nested_logit.py

Line 224 in b13b67a

if "|" in covariates:

williambdean · 2025-06-17T23:26:16Z

docs/source/notebooks/customer_choice/nested_logit.ipynb

@@ -0,0 +1,2488 @@
+{


Can we use dims in the various places here? I'm assuming that the 900 for P_central, P_room, and denom_top is the same as obs (900).

It's straightforward to add dims to the marginal P_central etc., but it's not as straightforward to dynamically create the coords for every potential nest at each layer. It can be done, it's just a large headache. I'm trying not to overcomplicate this function if I can avoid it:

pymc-marketing/pymc_marketing/customer_choice/nested_logit.py

Line 353 in b13b67a

def _prepare_coords(df, alternatives, covariates, f_covariates, nests):

Alright. Then let's leave it out for now.

williambdean · 2025-06-17T23:27:17Z

Really cool stuff. I liked those videos that were linked

williambdean · 2025-06-17T23:29:35Z

pymc_marketing/customer_choice/mnl_logit.py

+        """Do not use, required by parent class. Prefer make_model()."""
+        return super().build_model(X, y, **kwargs)
+
+    def make_model(self, X, F, y) -> None:


Is it possible to put X and F in a single xr.Dataset and pass it as "X"?

Or would that be unintuitive?

I think that wouldn't be in the spirit of the model. The decomposition into fixed covariates and alternative-specific covariates is pretty foundational to the literature and the write-up. I would avoid bundling them if I could.
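To make the decomposition concrete, here is a small sketch of how the two inputs typically enter the utility in the textbook formulation. It is not taken from the PR's `make_model`: the shapes, the `utilities` helper, the coefficient layout, and the numbers are all assumptions. `X` is indexed by observation, alternative, and covariate, and shares one coefficient vector across alternatives; `F` is indexed by observation and covariate only, so it needs one coefficient row per alternative, with the reference alternative's row set to zero.

```python
def utilities(X, F, alphas, betas, betas_fixed):
    """U[n][j] = alphas[j] + dot(X[n][j], betas) + dot(F[n], betas_fixed[j])."""
    U = []
    for x_n, f_n in zip(X, F):
        row = []
        for j, x_nj in enumerate(x_n):
            u = alphas[j]
            u += sum(x * b for x, b in zip(x_nj, betas))          # alternative-specific part
            u += sum(f * b for f, b in zip(f_n, betas_fixed[j]))  # fixed-attribute part
            row.append(u)
        U.append(row)
    return U


# One observation, two alternatives, two alternative-specific covariates,
# one fixed covariate. All numbers are made up.
X = [[[1.0, 2.0], [3.0, 4.0]]]
F = [[5.0]]
alphas = [0.5, 0.0]            # last alternative is the reference
betas = [-1.0, 0.5]
betas_fixed = [[0.1], [0.0]]   # reference row pinned at zero

U = utilities(X, F, alphas, betas, betas_fixed)
```

The two inputs differ in both shape and coefficient structure, which is one way to read the argument for keeping them as separate arguments.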

pymc_marketing/customer_choice/mnl_logit.py

pymc_marketing/customer_choice/nested_logit.py

juanitorduz · 2025-06-18T21:43:19Z

Thanks @NathanielF ! Once @williambdean gives the last ✅ we can merge and release to get feedback from the community! 🤘

NathanielF · 2025-06-18T21:45:41Z

Exciting @juanitorduz ! Really curious to see if these models are useful for the community. There are quite a few directions we could go with them.

juanitorduz · 2025-06-20T08:20:39Z

Actually, this PR looks great, and I am sure we can improve in future iterations. So let's merge and iterate :)

NathanielF · 2025-06-20T10:10:02Z

Awesome! Thanks @juanitorduz

NathanielF added 16 commits April 18, 2025 22:37

ede6186 working on parameter identification multinomial logit and formula interface
262cc74 allowing for the incorporation of fixed covariates
1a6a3ac tidying formula parser
e59de36 adding intervention functionality and plotting
374e698 add skeleton mnl notebook to gallery
0397226 working on the nested logit
3212cd3 updating nested logit notebook
2ae4b0f fixed nested logit with fixed covariates
57f0159 generalising pre-processing for 3 level nesting
978b6ef identified three level nesting
2eb4d60 working three level model
82912b2 defining w_nest within each nest
34b8308 working 2 and 3 level nesting
1c096c4 update gallery
ad00eb5 Adding some tests for nested logit
2e41156 tidying notebook

Signed-off-by: Nathaniel <[email protected]>

github-actions bot added the docs (Improvements or additions to documentation), tests, and customer choice (Related to customer choice module) labels Apr 27, 2025

NathanielF added 9 commits April 27, 2025 19:02

b222e1e fix majority of linting errors and update tests
025626b fix linting and formatting
21c0521 run ruff on test files
7b352fb run ruff on notebooks
5bf96fb run ruff format
e2b93ab improve test coverage
6db033b fix key error
d343faf update multinomial notebook and test coverage
7d9fddb format test

d177d5c remove az.compare from nested logit notebook

williambdean reviewed Jun 15, 2025

f37519c Adding some more docstrings and breaking up parse function

NathanielF added 2 commits June 16, 2025 11:17

c55298f fixing some typos
ed47832 added extra example to nested logit

NathanielF requested a review from juanitorduz June 16, 2025 17:20

williambdean reviewed Jun 17, 2025

pymc_marketing/customer_choice/mnl_logit.py (outdated, resolved)

williambdean reviewed Jun 17, 2025

pymc_marketing/customer_choice/nested_logit.py (outdated, resolved)

williambdean reviewed Jun 17, 2025

pymc_marketing/customer_choice/nested_logit.py (outdated, resolved)

NathanielF added 4 commits June 18, 2025 11:44

cacd7b4 Addressing some of Will's comments
6399e13 Merge branch 'main' into discrete_choice_module
b13b67a adding df["depvar"].value_counts() to mnl logit notebook
b6e4099 tidying notebook structure

NathanielF requested a review from williambdean June 18, 2025 20:40

juanitorduz approved these changes Jun 18, 2025

a755b57 re-run notebooks with interaction method

juanitorduz merged commit 1d21506 into main Jun 20, 2025
33 checks passed

juanitorduz deleted the discrete_choice_module branch June 20, 2025 08:20
