Skip to content

Refactor token index#354

Closed
Aleridia wants to merge 11 commits intoocaml-sf:masterfrom
pfitaxel:refactor-token-index
Closed

Refactor token index#354
Aleridia wants to merge 11 commits intoocaml-sf:masterfrom
pfitaxel:refactor-token-index

Conversation

@Aleridia
Copy link

@Aleridia Aleridia commented Jun 9, 2020

Hello,

As a result of #348 I changed the way of getting all the tokens.

Now, we have an index structure. It means that all the tokens are stored in a json file (sync/token.json). The function Token.Index.get will return all the tokens stored in the json file.

If a new token is created, it will be automatically added to the index.

If you don't have any index (updating per example), it will create itself by using the old Token.Index.get's function.

@erikmd
Copy link
Collaborator

erikmd commented Jun 9, 2020

@Aleridia can you fix these errors/warnings as reported by the CI?

# File "src/state/token_index.ml", line 13, characters 53-67:
# Error: Unbound type constructor Yojson.Basic.t
#       ocamlc src/state/.learnocaml_api.objs/learnocaml_api.{cmo,cmt}
# File "src/state/learnocaml_api.ml", line 342, characters 30-34:
# Warning 27: unused variable path.
#       ocamlc src/state/.learnocaml_store.objs/learnocaml_store.{cmo,cmt}
# File "src/state/learnocaml_store.ml", line 11, characters 0-16:
# Warning 33: unused open Token_index.

https://travis-ci.org/github/ocaml-sf/learn-ocaml/jobs/696405269

Copy link
Collaborator

@erikmd erikmd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Aleridia; I left a few comments; feel free to rebase and force-push (after you'll have solved the compilation error raised by the CI)

@Aleridia Aleridia force-pushed the refactor-token-index branch 6 times, most recently from 737e5ab to ed673ff Compare June 10, 2020 11:52
let create_index (sync_dir : string) =
let l = get sync_dir () in
let data = l >|= List.map cast_string >|= cast_list in
data >>= write_file (sync_dir ^ "/" ^ token_file) mutex_token
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems the four occurrences of
sync_dir ^ "/" ^ token_file
are not portable (under Windows, the directory separator is a \)
but one could replace this with sync_dir / token_file
by taking advantage of the definition above
let ( / ) dir f = if dir = "" then f else Filename.concat dir f in
(which could be promoted to a file-local definition, instead of a function-local definition)

@Aleridia I can push a commit applying this change, unless you prefer to do that yourself.

Copy link
Author

@Aleridia Aleridia Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be thankfull if you can push it @erikmd

I can do it but it won't be before tomorrow.

@erikmd
Copy link
Collaborator

erikmd commented Jun 29, 2020

Dear @yurug, can you have a look on the code proposed in this PR?
feel free to comment if you believe some further refactoring is needed.

BTW the CI failure − only the macOS job − seems to be spurious:

[ERROR] The sources of the following couldn't be obtained, aborting:
          - odoc.1.3.0: Download command failed
          - omd.1.3.1: Download command failed
          - optint.0.0.2: Download command failed
          - pprint.20180528: Download command failed
          - ppx_ast.v0.9.1: Download command failed
          - ppx_core.v0.9.3: Download command failed
          - ppx_derivers.1.2.1: Download command failed
          - ppx_driver.v0.9.2: Download command failed
          - ppx_fields_conv.v0.9.0: Download command failed
          - ppx_metaquot.v0.9.0: Download command failed
          - ppx_optcomp.v0.9.0: Download command failed
          - ppx_sexp_conv.v0.9.0: Download command failed
          - ppx_tools.5.0+4.05.0: Download command failed
          - ppx_tools_versioned.5.2.2: Download command failed

"ppx_tools"
"ppx_sexp_conv" {= "v0.9.0"}
"ppx_fields_conv" {= "v0.9.0"}
"yojson" {>= "1.4.0" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add a dependency on yojson instead of using ezjsonm? FWIW yojson is part of learn-ocaml’s dependencies but before this commit it looks like no binaries are even linked to it…

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agrn good catch! Indeed it seems learn-ocaml currently depends on these two JSON libraries…

because even if this PR added yojson in learn-ocaml-client.opam, it was already part of learn-ocaml.opam:

"yojson" {>= "1.4.0" }

but it indeed seems yojson was not actually used.

@yurug do you agree with the proposed choice? (removing yojson from learn-ocaml direct dependencies, and rewrite #354 using ezjsonm)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

Lwt_mutex.unlock mutex

let create_index sync_dir =
(* Note: we may want to write some line in the standard output telling that
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this comment.

@agrn
Copy link
Contributor

agrn commented Jul 11, 2020

So, I pushed two commits: one to use Ezjsonm instead of yojson (and remove the dependency in the *.opam files), and a second one to use modules to decouple the underlying storage from the index operations, so I can easily remove it if we elect not to do that. Feel free to comment on both of them.

@agrn
Copy link
Contributor

agrn commented Jul 13, 2020

The failure is once again osx, failing to build conf-libssl.

Copy link
Collaborator

@erikmd erikmd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! Two comments:

You could maybe put these two declarations in the .mli (as they are generic, beyond tokens…):

module type IndexRW = sig
  type t
  val init : unit -> t
  val read : string -> (string -> 'a) -> 'a Lwt.t
  val write : t -> string -> ('a -> string) -> 'a -> unit Lwt.t
end

module IndexFile: IndexRW

Finally, you could implement that comment as suggested by Yann:

    (* Note: we may want to write some line in the standard output telling that
       the token index is being generated. *) 

@agrn
Copy link
Contributor

agrn commented Jul 15, 2020

So, I force pushed a fixed up commit but it doesn't appear here.

@agrn agrn force-pushed the refactor-token-index branch from 953adb4 to 29ce231 Compare July 15, 2020 16:24
@agrn
Copy link
Contributor

agrn commented Jul 16, 2020

Ah no, it's all good now.

@agrn agrn force-pushed the refactor-token-index branch from 29ce231 to a1b8444 Compare July 20, 2020 15:14
@agrn
Copy link
Contributor

agrn commented Jul 20, 2020

I edited the last commit to move token_file inside of BaseTokenIndex, and to use Token.enc instead of rolling my own codec for this type.

edit: Travis once again failed because of macOS: Error: Library "cohttp-lwt-unix" not found.

@agrn agrn force-pushed the refactor-token-index branch from a1b8444 to 30f3921 Compare July 27, 2020 09:50
@agrn
Copy link
Contributor

agrn commented Jul 27, 2020

Rebased this branch on master as requested by @erikmd.

Aleridia and others added 8 commits August 7, 2020 17:38
- Rename variable
- Remove some useles brackets
- Change sync_dir system for token_index.ml
- Change token_index.mli
- Little refactoring

Co-authored-by: Erik Martin-Dorel <erik@martin-dorel.org>
Rename string_to_json : it's now cast_string

Delete useless function

Delete token_to_string

Refactor string_to_token

Remove parens on write_file

Update way to use sync_dir

Delete the 'sync/' from the token_file, change it on functions
erikmd and others added 3 commits August 7, 2020 17:41
We already use Ezjsonm, so there is no need to also depend on Yojson.

Signed-off-by: Alban Gruin <alban.gruin@univ-tlse3.fr>
This transforms token_index to use modules, to allow to change the
underlying storage (currently they are stored in flat files).

Signed-off-by: Alban Gruin <alban.gruin@univ-tlse3.fr>
@erikmd erikmd force-pushed the refactor-token-index branch from 30f3921 to fee06ee Compare August 7, 2020 16:08
@erikmd
Copy link
Collaborator

erikmd commented Aug 7, 2020

@yurug thanks for having merged #356 !

I've just rebased this one to fix the conflict, to sum up:

@erikmd erikmd mentioned this pull request Aug 12, 2020
4 tasks
@agrn
Copy link
Contributor

agrn commented Aug 14, 2020

Hi, I’ve successfully rebased oauth-moodle onto this new version, and I’m going to push it soon.

@yurug
Copy link
Collaborator

yurug commented Dec 28, 2020

@erikmd @Aleridia What is the status of this PR?

@erikmd
Copy link
Collaborator

erikmd commented Dec 28, 2020

@yurug this PR should not be merged and can be closed: all its commits were integrated/rebased/extended in the on-going PR #362

@yurug yurug closed this Dec 30, 2020
let serialise = Json_codec.encode ?minify:(Some(false)) enc

let create_index sync_dir =
let found_indexes =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a guarantee that new tokens can't be created while this function is running ?
I probably overlooked something, but here is a race-condition to be considered:

  • the index is absent
  • two clients A and B trigger add_token at the same time
  • both lwthreads scan existing tokens concurrently, getting the same result
  • the thread for A writes the index file, adds its token ta, and writes the index again
  • the thread for B overwrites the index file with a version that doesn't contain ta

There may be constraints avoiding this at upper level; otherwise I would advise getting a write-lock (blocking concurrent scanning, reads and writes) before starting the index read/scan for the cases where you intend to update and write it back (i.e. add_token).

| e -> Lwt.fail e
in
aux ()
aux () >>= fun t -> TokenIndex.add_token !sync_dir t >|= fun _ -> t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the index doesn't exist yet, add_token will scan, including the just created token, then add it again, resulting in a duplicate.


let add_token sync_dir token =
get_tokens sync_dir >>= fun tokens ->
RW.write rw (sync_dir / file) serialise (token :: tokens)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible race condition if called concurrently ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants