@bos1988 bos1988 commented Apr 16, 2023

When `train_only_size > 0.0` is passed to the `leave_k_out_split` function,
the list of unique users is thinned before it reaches the `_take_tails` function.

Inside `_take_tails`, the `bincount` method then produces zeros in the middle of the counts array (one for each missing user id), and the `cumsum` method repeats the same end index for each of those zeros.

For example (inside the `_take_tails` function):
sorted_arr = [0 0 0 4 4 4]
np.bincount(sorted_arr) = [3 0 0 0 3]
end = [2 2 2 2 5]

As a result, `leave_k_out_split` returns a test set in which the ratings of user 0 appear four times.

Expected behavior:
arr = [0 0 0 2 2 2]
counts = [3 3]
end = [2 5]
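
The difference can be reproduced with plain NumPy. This is a minimal sketch, not the library's actual code: it assumes `end` is computed as `cumsum(counts) - 1`, which matches the numbers above, and it uses `np.unique(..., return_counts=True)` as one possible way to count only the user ids that are present. The variable names are illustrative.

```python
import numpy as np

# Sorted user ids after thinning: ids 1, 2 and 3 are no longer present.
sorted_arr = np.array([0, 0, 0, 4, 4, 4])

# bincount reserves a slot for every id from 0 to max(id),
# so the missing ids get a count of 0.
counts_bincount = np.bincount(sorted_arr)
# cumsum does not advance over the zeros, so the end index of
# user 0 is repeated once per missing id.
ends_bincount = np.cumsum(counts_bincount) - 1

# Counting only the ids that actually occur avoids the zero slots.
# (Assumption: this is an illustrative alternative, not necessarily
# the change made in this pull request.)
_, counts_unique = np.unique(sorted_arr, return_counts=True)
ends_unique = np.cumsum(counts_unique) - 1

print(counts_bincount, ends_bincount)
print(counts_unique, ends_unique)
```

With the `bincount` version, index 2 (the last rating of user 0) is selected four times; with the `np.unique` version, each user has exactly one end index.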
