[r] add iterative lsi design docs #167

immanuelazn · 2024-12-12T00:11:50Z

Pretty much as we discussed during the call!

I think the biggest point of contention is the normalization structure. Normalizations like tf-idf and Z-score are fit to a dataset, so they carry fitted statistics. The main problem is that we want something that interoperates with BPCells operations while also returning the calculated information (mean, variance, idf). Should it follow the same styling as the S3 class for LSI that we are creating, with cell.embeddings/feature.loadings?

I propose just having a boolean param, with the default returning an IterableMatrix, and the other being an option to return a class that we can project with.
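To make the proposal concrete, here is a minimal sketch of the boolean-parameter idea, written in plain Python on nested lists rather than as BPCells R code. The names `ZScoreFit`, `normalize_zscore`, `return_fit`, and `project` are hypothetical and chosen only for illustration. The default call returns the normalized matrix, and the flag switches to returning an object that holds the fitted statistics and can project new data.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class ZScoreFit:
    """Hypothetical fitted normalization that keeps the per-feature statistics."""
    feature_means: list
    feature_sds: list

    def project(self, mat):
        """Apply the stored statistics to a features x cells matrix."""
        return [
            # Features with zero spread are mapped to 0.0 to avoid dividing by zero.
            [(x - m) / s if s > 0 else 0.0 for x in row]
            for row, m, s in zip(mat, self.feature_means, self.feature_sds)
        ]


def normalize_zscore(mat, return_fit=False):
    """Z-score each feature (row); optionally return the fitted object instead."""
    fit = ZScoreFit(
        feature_means=[mean(row) for row in mat],
        feature_sds=[pstdev(row) for row in mat],
    )
    return fit if return_fit else fit.project(mat)
```

With this shape, `normalize_zscore(mat)` stays usable inside a pipeline of matrix operations, while `normalize_zscore(mat, return_fit=True).project(new_mat)` reuses the original means and standard deviations on new cells.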

bnprks · 2024-12-12T01:11:11Z

Thanks for this update, very clearly written

Discussion points:

New idea for the base name, maybe select_features_[method] instead of variable_features_[method]? I think this might side-step some confusion that some selection methods won't be based on variance, and has the benefit of actually using a verb. Curious what you think

Comments:

- I think mean might be a better general name than _by_top_accessibility; I think the calculation would be basically the same.
- For _by_cluster_variance, I thought we discussed that we might not make this helper method layer, and just do the pseudobulking in the iterative lsi function.
  - That said, we could probably use functions for _variance and _dispersion (variance divided by mean).
- As discussed on Slack, I think normalize_tfidf should have a feature_means or feature_sums argument. We won't return any extra data, and will rely on the clarity of the argument name for people to know what to pass.
- I think we can ditch feature.loadings for now from the Seurat interface, and maybe rename cell.embeddings to cell_embeddings for better stylistic consistency within BPCells. We can add feature.loadings back in later if we decide it would be useful.
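As a rough illustration of the feature_means argument, the sketch below is plain Python on a features x cells list of lists, not the BPCells implementation. The formula log1p(scale_factor * tf * idf) with idf = 1 / feature mean, and the `scale_factor` parameter, are assumptions made for this example. It shows that the function returns only the normalized matrix, and that projection works by passing in the means computed on the original data.

```python
import math


def feature_means(mat):
    """Mean count of each feature (row) of a features x cells matrix."""
    return [sum(row) / len(row) for row in mat]


def normalize_tfidf(mat, feature_means=None, scale_factor=1e4):
    """TF-IDF sketch (assumed formula): log1p(scale_factor * tf * idf)."""
    if feature_means is None:
        # No statistics supplied: fit them on the matrix being normalized.
        feature_means = [sum(row) / len(row) for row in mat]
    n_cells = len(mat[0])
    # Per-cell sequencing depth (column sums) gives the term frequency.
    depth = [sum(row[j] for row in mat) for j in range(n_cells)]
    out = []
    for row, m in zip(mat, feature_means):
        idf = 1.0 / m if m > 0 else 0.0
        out.append([
            math.log1p(scale_factor * (x / depth[j] if depth[j] else 0.0) * idf)
            for j, x in enumerate(row)
        ])
    return out
```

Calling `normalize_tfidf(new_mat, feature_means=feature_means(ref_mat))` normalizes new cells with the reference statistics, which is the projection case the argument name is meant to signal.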

immanuelazn · 2024-12-12T01:27:28Z

select_features is good!

For your second comment point, my interpretation from that conversation was that the clustering would be separate, but we would still pseudobulk within the function. By all means, we can take that out and put it into the wrapper iterative lsi function though! Do you think it would make sense to handle variance and dispersion as a parameter of the same function, rather than as separate functions?

Overall I agree with your other points. Will reflect here soon!

bnprks · 2024-12-12T02:31:57Z

I was originally thinking clustering + pseudobulk calculation could happen in iterative_lsi. Then we still have a parameter in iterative_lsi to let the user configure how feature selection happens (which could be e.g. select_features_variance(normalize=normalize_log) by default). Then the two functions I mentioned would just be general options for variable feature selection, implemented outside of iterative lsi. I think having at least a select_features_variance function separate is a good idea; still OK to have a select_features_by_pseudobulk method if you prefer to implement it that way -- not a big deal either way
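The layering described here can be sketched as follows, in plain Python rather than BPCells R code. The signatures, the `num_feats` argument, and the log-normalization formula are assumptions for illustration only. The selectors are standalone functions that take an optional `normalize` callable, dispersion is variance divided by mean as mentioned earlier in the thread, and `functools.partial` stands in for a pre-configured call such as select_features_variance(normalize=normalize_log).

```python
import math
from functools import partial
from statistics import mean, variance


def normalize_log(mat, scale_factor=1e4):
    """Assumed log normalization: log1p(scale_factor * count / cell depth)."""
    n_cells = len(mat[0])
    depth = [sum(row[j] for row in mat) for j in range(n_cells)]
    return [
        [math.log1p(scale_factor * x / depth[j]) if depth[j] else 0.0
         for j, x in enumerate(row)]
        for row in mat
    ]


def _top_features(scores, num_feats):
    """Indices of the num_feats highest-scoring features, in ascending order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:num_feats])


def select_features_variance(mat, num_feats, normalize=None):
    """Rank features (rows) by sample variance after optional normalization."""
    if normalize is not None:
        mat = normalize(mat)
    return _top_features([variance(row) for row in mat], num_feats)


def select_features_dispersion(mat, num_feats, normalize=None):
    """Rank features by dispersion, i.e. variance divided by mean."""
    if normalize is not None:
        mat = normalize(mat)
    scores = []
    for row in mat:
        m = mean(row)
        scores.append(variance(row) / m if m > 0 else 0.0)
    return _top_features(scores, num_feats)


# A pre-configured selector that a wrapper such as iterative_lsi could take as its default.
default_selector = partial(select_features_variance, normalize=normalize_log)
```

A wrapper would then call `default_selector(mat, num_feats=...)` on either the raw counts or its own pseudobulk matrix, so the clustering and pseudobulking logic stays inside the wrapper and the selectors remain general-purpose.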

[r] add iterative lsi design docs

082aa4a

immanuelazn requested a review from bnprks December 12, 2024 00:11

[r] remove redundant information

7e34180

immanuelazn added 3 commits December 12, 2024 12:08

[r] add sugested changes to iterative lsi

b5cd3dd

[r] reorder iterative lsi sections

4aa75a2

[r] fix grammar mistakes, inconsistencies

11b8486

immanuelazn mentioned this pull request Dec 12, 2024

[r] add tf-idf and log normalization functions #168

Closed

[r] fix naming of select_features_by_dispersion()

d6c0773

immanuelazn mentioned this pull request Jan 27, 2025

Normalizations, Feature Selections, DimReduction S3 class, (Iterative) Lsi Implementation, Partial function framework, Clustering changes and Framework #189

Open
