
Add Module vipurpca #44

Open
nikhilbhavikatti wants to merge 2 commits into main from vipurpca

Conversation

@nikhilbhavikatti
Collaborator

  • Provides convenience functions built on top of the VIPurPCA library for uncertainty propagation in PCA.
  • Supports computation of both the deterministic eigenvectors and their uncertainty.
  • _prepare_pca_inputs(dists): Builds stacked mean vectors, a block-diagonal covariance matrix, and labels from a list of distributions.
  • Each distribution's covariance matrix is kept as its own block, and all blocks are combined into a single block-diagonal covariance matrix.
  • _effective_rank_from_X(X): Computes the effective numerical rank of the input matrix via SVD.
  • _fit_pca_with_uncertainty(Y, cov_Y, n_components): Fits PCA with uncertainty propagation and adjusts for rank deficiency automatically.
  • compute_distribution_eigenvectors(dists, n_components=3): Returns eigenvectors of distributions under uncertainty-aware PCA.
  • plot_distribution_trajectories(...): Plots static trajectories of distributions in PCA space, with uncertainty-aware sampling of eigenvectors.
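
For readers who have not opened the diff, the two input-preparation helpers could look roughly like the sketch below. This is not the code from the PR: the function names with the `_sketch` suffix are hypothetical, the inputs are plain arrays rather than distribution objects, and the row-major ordering of the blocks in `cov_Y` as well as the tolerance rule for the rank are assumptions.

```python
import numpy as np


def prepare_pca_inputs_sketch(means, covs, names):
    """Stack means and build one block-diagonal covariance (hypothetical helper)."""
    Y = np.vstack(means)                      # shape (n, p): one row per distribution
    n, p = Y.shape
    cov_Y = np.zeros((n * p, n * p))
    for i, cov in enumerate(covs):
        # Each distribution keeps its own p x p block. Off-diagonal blocks stay
        # zero, so distributions are treated as mutually uncorrelated.
        cov_Y[i * p:(i + 1) * p, i * p:(i + 1) * p] = cov
    return Y, cov_Y, list(names)


def effective_rank_sketch(X, rtol=None):
    """Count singular values above a relative tolerance (assumed rule)."""
    X = np.asarray(X, dtype=float)
    s = np.linalg.svd(X, compute_uv=False)    # singular values, descending
    if s.size == 0 or s[0] == 0:
        return 0
    if rtol is None:
        rtol = max(X.shape) * np.finfo(float).eps
    return int(np.sum(s > rtol * s[0]))
```

With three distributions in three dimensions this yields a 3 x 3 matrix `Y` and a 9 x 9 matrix `cov_Y`. The effective rank can then be compared against `n_components` before fitting.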

@hageldave
Collaborator

Thank you for providing this integration of vipurpca. A few things need to be resolved or clarified before merging:

  • Please put the usage example in a jupyter notebook similar to the examples for UAPCA and UAMDS (https://github.com/UniStuttgart-VISUS/uadapy/tree/main/examples)
  • def plot_distribution_trajectories(dists,
    n_components=3,
    pcx=1,
    pcy=2,
    n_frames=10,
    seed=55,
    fig=None,
    axs=None,
    distrib_colors=None,
    colorblind_safe=False,
    show_plot=False):

    The parameters pcx and pcy are indices but start at 1 (and are later decremented by 1). This is not in line with Python conventions. Please make pcx default to 0 and pcy default to 1, remove the index subtraction, and adapt the corresponding sanity checks.
  • plot_distribution_trajectories(...) is a plotting function. I would suggest changing it to compute_distribution_trajectories(...), which would return the coordinates of the trajectories for each distribution. The plotting functionality can then be removed. I would provide a simple plotting example in the notebook instead, as plotting trajectories is rather straightforward.

[x] Plotting function changed to compute_distribution_trajectories
[x] Add Notebook to provide sample plotting example
[x] Remove parameters pcx, pcy and labels
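
To make the request concrete, a coordinate-returning function could be sketched as follows. The names `compute_trajectories_sketch` and `select_axes`, the `eigvec_frames` input, and the centering of the means are all assumptions for illustration. The actual signature of compute_distribution_trajectories in the PR may differ.

```python
import numpy as np


def compute_trajectories_sketch(means, eigvec_frames):
    """Project distribution means onto a sequence of eigenvector samples.

    means         : (n_dists, p) array of distribution means
    eigvec_frames : (n_frames, p, n_components) array, one eigenvector matrix
                    per frame (hypothetical input, e.g. sampled from a model)
    returns       : (n_dists, n_frames, n_components) trajectory coordinates
    """
    means = np.asarray(means, dtype=float)
    frames = np.asarray(eigvec_frames, dtype=float)
    if frames.ndim != 3 or frames.shape[1] != means.shape[1]:
        raise ValueError("eigvec_frames must have shape (n_frames, p, n_components)")
    # Assumption: means are centered before projection, as in ordinary PCA.
    centered = means - means.mean(axis=0, keepdims=True)
    # coords[d, f, :] = centered[d, :] @ frames[f]
    return np.einsum("dp,fpk->dfk", centered, frames)


def select_axes(coords, pcx=0, pcy=1):
    """Pick two zero-based component indices, e.g. for a 2D plot in a notebook."""
    n_components = coords.shape[-1]
    for name, idx in (("pcx", pcx), ("pcy", pcy)):
        if not 0 <= idx < n_components:
            raise ValueError(f"{name} must be in [0, {n_components - 1}], got {idx}")
    return coords[..., pcx], coords[..., pcy]
```

Keeping the axis selection zero-based and outside the compute function means the notebook can plot any pair of components from the returned array without an index subtraction.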
@nikhilbhavikatti
Collaborator Author

nikhilbhavikatti commented Jan 20, 2026

The comments are addressed in commit 5afe320.

@hageldave
Collaborator

We discussed in our meeting that compute_distribution_eigenvectors should return the full model instead of only the expected eigenvectors.

def compute_distribution_eigenvectors(dists, n_components=3):
    """
    Compute eigenvectors of PCA fitted on distributions with uncertainty.

    Parameters
    ----------
    dists : list of Distribution
        List of distribution objects with `.mean()`, `.cov()`, and `.name`.
    n_components : int, default=3
        Number of principal components to retain.

    Returns
    -------
    eigenvectors : ndarray of shape (p, n_components)
        Principal component eigenvectors.
    """
    Y, cov_Y, _ = _prepare_pca_inputs(dists)
    model = _fit_pca_with_uncertainty(Y, cov_Y, n_components)
    return model.eigenvectors

The reason for this is that we want to be able to create distribution plots as shown in the original paper.
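
A minimal sketch of the requested change, so the intent is unambiguous. The two `_stub` helpers below are hypothetical stand-ins that only make the snippet run on its own: the fit stub performs plain PCA and does not propagate uncertainty, and the `cov_Y` attribute on its return value is invented for illustration. In the module, the existing _prepare_pca_inputs and _fit_pca_with_uncertainty would be called instead.

```python
from types import SimpleNamespace

import numpy as np


def _prepare_pca_inputs_stub(dists):
    # Stand-in for _prepare_pca_inputs: stacked means, block-diagonal covariance, labels.
    Y = np.vstack([d.mean() for d in dists])
    n, p = Y.shape
    cov_Y = np.zeros((n * p, n * p))
    for i, d in enumerate(dists):
        cov_Y[i * p:(i + 1) * p, i * p:(i + 1) * p] = d.cov()
    return Y, cov_Y, [d.name for d in dists]


def _fit_stub(Y, cov_Y, n_components):
    # Stand-in for _fit_pca_with_uncertainty: plain PCA only, cov_Y is stored, not propagated.
    Yc = Y - Y.mean(axis=0)
    _, _, vt = np.linalg.svd(Yc, full_matrices=False)
    k = min(n_components, vt.shape[0])
    return SimpleNamespace(eigenvectors=vt[:k].T, cov_Y=cov_Y)


def compute_distribution_eigenvectors(dists, n_components=3):
    """Return the fitted model; callers read `model.eigenvectors` themselves."""
    Y, cov_Y, _ = _prepare_pca_inputs_stub(dists)
    model = _fit_stub(Y, cov_Y, n_components)
    return model  # previously: return model.eigenvectors
```

Existing callers would change from `E = compute_distribution_eigenvectors(dists)` to `E = compute_distribution_eigenvectors(dists).eigenvectors`. Whether the function should also be renamed once it returns a model is a separate decision.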
