Edits for Put with Notify #20

certik opened this issue Oct 18, 2019 · 8 comments
Labels
Clause 11 Standard Clause 11: Execution control Fortran 2023 Proposal targeting the next Fortran standard F2023 (previously called F202X) in progress J3 is moving forward

Comments

@certik
Member

certik commented Oct 18, 2019

Latest paper: https://j3-fortran.org/doc/year/19/19-259.txt

@certik certik added in progress J3 is moving forward Fortran 2023 Proposal targeting the next Fortran standard F2023 (previously called F202X) labels Oct 18, 2019
@certik
Member Author

certik commented Oct 18, 2019

Committee voted to proceed.

@gareth-d-ga

While the notify_type is quite similar to the event_type, I notice the proposal doesn't have anything equivalent to the 'event_query' subroutine (say 'notify_query').

'event_query' allows you to check whether the event has been received without necessarily waiting. If it hasn't been received, you can do other work in the meantime, which facilitates the overlap of communication and computation.

As far as I can see, this should be equally useful in the 'notify_type' case. So my question: Should something like 'notify_query' be added to this proposal?

@zjibben
Member

zjibben commented Apr 28, 2020

That's a good idea. I recall there was lots of discussion, including adding to the capabilities of event_type instead of adding a new notify_type type, and the result was that it should be a distinct type with very limited uses, because the synchronization patterns are different (syncing threads between transfers rather than a sync all). But I don't recall why a notify_query subroutine was left out. Probably it had something to do with notify_type having a much narrower use case than event_type, and all the uses we imagined for a notify_query ended up having better alternative approaches. Or it could have just been neglected. I wish that conversation was documented somewhere. @longb and @JonSteidel had lots of input here; do either of you recall?

Aside: here are the other papers on the feature. They also don't give the reasoning behind leaving out notify_query.

@MichaelSiehl

Thanks for posting. This is the first time I have seen this, so my understanding could be wrong. If it is correct, ‘put with notify’ is very different from events. My focus is currently not on the question of whether a ‘notify_query' could make sense, but on understanding the basics of this ‘put with notify’ feature.

From my current understanding, I would say the following (the quotes in parentheses are statements from the J3 papers that seem to match my point of view):
https://j3-fortran.org/doc/year/18/18-277r1.txt
https://j3-fortran.org/doc/year/19/19-259.txt

  1. DATA TRANSFER AND SYNCHRONIZATION ARE THE SAME SINGLE OPERATION:
    The notify-specifier makes the data transfer and its synchronization semantically a single operation.
    (“...the notify can be incorporated into the same data packet as the value of y.”)

  2. THAT OPERATION HAS NO IMPACT ON SEGMENT ORDERING:
    Internally, this may not require use of SYNC MEMORY, thus the operation has no impact on segment ordering.
    (“...the synchronization is only on this particular transfer, and not all outstanding memory operations on this image. The put with notify operation does not constitute segment ordering, ...”)

  3. TO CONTROL THE EXECUTION FLOW OF A PARALLEL APP:
    From my own experiences with customized synchronizations: if we can synchronize and do data transfer within the same single operation, we can use that operation not only for efficient data transfer, but also to easily control the execution flow of a parallel app. This is important to implement distributed objects without the need for Remote Procedure Calls (RPC).
    (“… the NOTIFY WAIT statement is not an image control statement.”, “...the NOTIFY WAIT execution control statement.”)

@MichaelSiehl

I agree with you that a NOTIFY_QUERY could be an important enhancement to the ‘put with notify’ proposal. See the following example programs.

First, I took an EVENTS example program from gcc.gnu.org and made some modifications:

! original program: https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gfortran/EVENT_005fQUERY.html
program events
! working Fortran 2018 example program using EVENTS
use iso_fortran_env
implicit none
type(event_type) :: event_value_has_been_set[*]
integer :: cnt
!
if (this_image() /= 1) then
  event post (event_value_has_been_set[1])
end if
!
sync all ! comment this out to see that event_query is non-blocking
!
if (this_image() == 1) then
  call event_query (event_value_has_been_set, cnt)
  if (cnt > 0) write(*,*) "Value has been set", cnt
end if
!
end program events

The following shows what a notify_query could look like in practice:

program notify
! what notify_query could look like;
! not a working Fortran 2018 program
use iso_fortran_env
implicit none
type(NOTIFY_TYPE) :: notify_value_has_been_set[*]
integer :: x [*]
integer :: cnt, y, z
!
if (this_image() == 2) then
  y = 123
  x[1, NOTIFY=notify_value_has_been_set] = y
end if
!
sync all ! comment this out to see that notify_query is non-blocking
!
if (this_image() == 1) then
  call NOTIFY_QUERY (notify_value_has_been_set, cnt) ! cnt can have values greater than 1 if x is an array;
                                                     ! Else, if cnt is greater than 1 and x is a scalar,
                                                     ! this could indicate a programmer's mistake: multiple
                                                     ! images, other than image 1, do the same 'put with notify'
                                                     ! operation and thus may overwrite each other's value on image 1.
  if (cnt > 0) then
    write(*,*) "Value has been set", cnt
    WAIT NOTIFY (notify_value_has_been_set)
    z = x
    write(*,*) z
  end if
end if
!
end program notify

Another important point: if we used the ‘put with notify’ feature to control the execution flow of a parallel app, the programmer could use NOTIFY_QUERY to implement an abort timer for the WAIT NOTIFY synchronization and data transfer. In practice, this could be used to stop slow-running parallel algorithms from executing any further.
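
The abort-timer idea can be tried today with events, because EVENT_QUERY already exists in Fortran 2018. The sketch below is only a stand-in for the proposed feature: it assumes event_type and EVENT_QUERY in place of the hypothetical notify_type and NOTIFY_QUERY, and the program name, variable names, and the 0.2 second limit are made up for illustration. No image posts the event, so image 1 is expected to give up once the limit has passed.

```fortran
program query_with_timeout
  ! Sketch only: EVENT_QUERY stands in for the hypothetical NOTIFY_QUERY.
  use iso_fortran_env, only: event_type, int64
  implicit none
  type(event_type) :: value_has_been_set[*]
  integer :: cnt
  integer(int64) :: t_start, t_now, ticks_per_second
  real, parameter :: time_limit = 0.2   ! seconds; arbitrary illustrative value
  logical :: timed_out

  timed_out = .false.
  call system_clock(t_start, ticks_per_second)

  if (this_image() == 1) then
    poll: do
      ! Non-blocking check of the event count on this image
      call event_query(value_has_been_set, cnt)
      if (cnt > 0) exit poll

      ! Abort timer: stop polling once the time limit is exceeded
      call system_clock(t_now)
      if (real(t_now - t_start) / real(ticks_per_second) > time_limit) then
        timed_out = .true.
        exit poll
      end if

      ! Other useful work would go here before the next query
    end do poll

    if (timed_out) then
      write(*,'(a)') "timed out"
    else
      ! Consume the post; this also orders the segments
      event wait (value_has_been_set)
      write(*,'(a)') "value has been set"
    end if
  end if
end program query_with_timeout
```

If a NOTIFY_QUERY were added, the loop would presumably keep the same shape, with the event variable replaced by the notify variable and EVENT WAIT replaced by NOTIFY WAIT.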

@longb

longb commented May 4, 2020 via email

@gareth-d-ga

!! Normally you would do the check the other way (and the statement is NOTIFY WAIT), as in:
!! If (cnt == 0) then
!! ! Do something else during the waiting time, then loop back to the call to NOTIFY_QUERY
!! else
!! notify wait (notify_value_has_been_set)
!! z = x
!! write (*,*) z
!! end if

I agree the main point of notify_query is to do useful work if the notification hasn't occurred. Although there are differences between event_type and notify_type, a common use case for both is to confirm that a communication has finished. In this case, the notify_type should be more efficient, which I understand is the main justification for adding it to the standard.

One situation where notify_query would be useful is if we receive data from more than one image. Below is a sketch of how this might be important in a fairly typical halo-exchange type situation. With only notify_wait, we'd get stuck when checking the first halo buffer (left in the program below) -- and thus lose the opportunity to check/unpack the right halo buffer immediately.

program notify
! Concept around notify_query -- important bits have comment "KEY STEP"
!
! Suppose we have a 'halo-exchange' type program
! For simplicity consider a 1-dimensional PDE domain decomposition. 
!
! With notify_query, we can check for data arrival from the left/right
! neighbours, and unpack as soon as it arrives, without any enforced wait.
! This could help overlap communication with computation (in this case, unpack of halo buffers).

use iso_fortran_env

implicit none

integer, parameter :: N = 100, halo_width = 5, max_time = 1000 
real :: solution(N), data_send_to_left(halo_width), data_send_to_right(halo_width)
real :: data_received_from_left(halo_width)[*], data_received_from_right(halo_width)[*]
type(NOTIFY_TYPE) :: right_buffer_received[*], left_buffer_received[*]

! Local variables 
integer :: time_loop, left_nbr_image, right_nbr_image, cnt
logical :: left_received, right_received

! Find images to the left/right
left_nbr_image  = this_image() - 1; if(left_nbr_image  == 0             ) left_nbr_image  = num_images()
right_nbr_image = this_image() + 1; if(right_nbr_image == num_images()+1) right_nbr_image = 1

! Initial conditions
call initialise(solution)

do time_loop = 1, max_time

    ! Main update
    call update_solution_locally(solution)

    ! Create data that we should send to the left/right images
    call pack_halos_to_send(solution, data_send_to_left, data_send_to_right)

    ! KEY STEP: Initiate two put-with-notify communications, which send data to left/right images
    data_received_from_right[left_nbr_image, NOTIFY=right_buffer_received] = data_send_to_left
    data_received_from_left[right_nbr_image, NOTIFY=left_buffer_received] = data_send_to_right
   
    left_received = .FALSE.; right_received = .FALSE.
    ! KEY STEP: Keep checking for the arrival of the left/right halo data. Unpack it immediately on
    ! arrival. Notice we can do useful unpacking work as soon as a single halo is received. I do not 
    ! think this would be possible using only NOTIFY_WAIT, because we would get stuck on the first check. 
    ! So NOTIFY_QUERY seems important (?)
    do while ( (.not. left_received) .or. (.not. right_received) )
        if(.not. left_received) then
            ! Check the left buffer - unpack if received
            call notify_query(left_buffer_received, cnt)
            if(cnt > 0) then
                call unpack_left_buffer(data_received_from_left, solution)
                left_received = .TRUE.
            end if
        end if
        if(.not. right_received) then
            ! Check the right buffer - unpack if received
            call notify_query(right_buffer_received, cnt)
            if(cnt > 0) then
                call unpack_right_buffer(data_received_from_right, solution)
                right_received = .TRUE.
            end if
        end if
    end do 

end do

!

end program notify
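
For comparison, the same polling pattern can be written with Fortran 2018 events, which already provide EVENT_QUERY. This is a minimal sketch under stated assumptions, not the proposed feature: it exchanges a single halo once instead of looping over time steps, the program and variable names are invented, and each image sends its own image index as dummy halo data so the result can be checked. Each put is followed by a separate EVENT POST, which is the extra step that put with notify is meant to fold into the transfer.

```fortran
program halo_poll_events
  ! Sketch only: event_type/EVENT_QUERY stand in for notify_type/NOTIFY_QUERY.
  use iso_fortran_env, only: event_type
  implicit none
  integer, parameter :: halo_width = 5
  real :: from_left(halo_width)[*], from_right(halo_width)[*]
  type(event_type) :: left_arrived[*], right_arrived[*]
  integer :: me, n, left, right, cnt
  logical :: left_done, right_done

  me = this_image()
  n = num_images()
  ! Periodic neighbours, as in the sketch above
  left  = merge(n, me - 1, me == 1)
  right = merge(1, me + 1, me == n)

  ! Put the data, then post an event so the neighbour knows it has arrived
  from_right(:)[left] = real(me)
  event post (right_arrived[left])
  from_left(:)[right] = real(me)
  event post (left_arrived[right])

  left_done = .false.
  right_done = .false.
  do while (.not. (left_done .and. right_done))
    if (.not. left_done) then
      call event_query(left_arrived, cnt)   ! non-blocking check
      if (cnt > 0) then
        event wait (left_arrived)           ! consume the post, order segments
        if (any(from_left /= real(left))) error stop "bad left halo"
        left_done = .true.
      end if
    end if
    if (.not. right_done) then
      call event_query(right_arrived, cnt)  ! non-blocking check
      if (cnt > 0) then
        event wait (right_arrived)          ! consume the post, order segments
        if (any(from_right /= real(right))) error stop "bad right halo"
        right_done = .true.
      end if
    end if
  end do

  if (me == 1) write(*,'(a)') "halos received"
end program halo_poll_events
```

Note that EVENT WAIT is what decrements the event count here. Without it, a query in the next time step would still see the old count, so a NOTIFY_QUERY version inside a time loop would likely need a matching NOTIFY WAIT after each successful query; the thread does not settle this.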

@MichaelSiehl

notify.txt

@longb Thanks for the comments. Following them, I modified my example above and made two versions of it, which I have also attached as notify.txt. I hope this helps. Feel free to use them or modify them further.

program notify
! what notify_query could look like (version 1: polling loop);
! not a working Fortran 202X program
use iso_fortran_env
implicit none
type(NOTIFY_TYPE) :: notify_value_has_been_set[*]
integer :: x [*]
integer :: cnt, y, z
!
if (this_image() == 2) then
  y = 123
  x[1, NOTIFY=notify_value_has_been_set] = y
end if
!
if (this_image() == 1) then
  spin_wait: do
    call NOTIFY_QUERY (notify_value_has_been_set, cnt)
    If (cnt == 0) then
      ! Do something else during the waiting time, then loop back to the call to NOTIFY_QUERY
      ! or exit (abort) the spin_wait loop if a time limit has been exceeded
    else
      write(*,*) "Value has been set", cnt
      NOTIFY WAIT (notify_value_has_been_set)
      z = x
      write(*,*) z
      exit spin_wait
    end if
  end do spin_wait
end if
!
end program notify

program notify
! what notify_query could look like (version 2: single query after sync all);
! not a working Fortran 202X program
use iso_fortran_env
implicit none
type(NOTIFY_TYPE) :: notify_value_has_been_set[*]
integer :: x [*]
integer :: cnt, y, z
!
if (this_image() == 2) then
  y = 123
  x[1, NOTIFY=notify_value_has_been_set] = y
end if
!
sync all
!
if (this_image() == 1) then
  call NOTIFY_QUERY (notify_value_has_been_set, cnt)
  If (cnt == 0) then
    ! Do something else during the waiting time, then loop back to the call to NOTIFY_QUERY
  else
    write(*,*) "Value has been set", cnt
    NOTIFY WAIT (notify_value_has_been_set)
    z = x
    write(*,*) z
  end if
end if
!
end program notify

In case further motivation is required:

We can follow Robert Numrich's 'Parallel Programming with Co-Arrays' and go even a step further:

He makes extensive use of abstract classes for parallel programming and sums it up on page 79: "After Fortran became an object-oriented language, the co-array model fit well into the design of distributed classes:"

I can confirm that the coarray model in Fortran 2018 does allow one to implement (and extend) a fragmented objects model, a truly distributed object model, through the use of abstract classes. The 'put with notify' feature could be an important addition for easy control of the execution flow in such distributed object programming.

The coarray model together with Fortran's OOP is already forming a new kind of programming language that may be a perfect fit for upcoming EDGE. My current developments are still experimental, but you should expect a first example program in a couple of months.

cheers,
Michael

@milancurcic milancurcic added the Clause 11 Standard Clause 11: Execution control label Oct 27, 2021