Fall 2005 One-on-One Meeting
Week 3 (Sept 19 - Sept 23)
- Plan for this week
- Two critical issues
- Semantics of Tmk_read_ranges(...)
- What is the algorithm? Is it even possible without left hand side analysis at compile time?
- Detecting write ranges that were overwritten with the same value as before
- Floating point: Is there a magic number?
- NaN: On the RISC System/6000, a NaN does not compare equal to anything, including itself. Thus, for a real variable X, use:
- IF (X .NE. X) THEN ! This will be true if X is NaN
- Integer and other types: ???
- Implementation of write range acquire (2-3 days)
- Wait a minute...
- Is this new analysis technique not meant for irregular reads?
- Yeah, this technique doesn't do any better than the others for irregular reads.
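The NaN self-comparison test from this week's plan carries over directly to C. The sketch below is only an illustration: the function name is_nan_by_compare is hypothetical, and it assumes IEEE 754 floating point and a build without aggressive floating-point optimizations (flags such as -ffast-math may let the compiler fold x != x to false).

```c
#include <math.h>

/* Hypothetical helper: returns 1 if x is NaN, 0 otherwise.
 * Relies only on the property that a NaN never compares equal
 * to anything, including itself (same idea as X .NE. X in Fortran). */
static int is_nan_by_compare(double x)
{
    return x != x;
}
```

The standard isnan() macro from <math.h> performs the same check and is the safer choice when it is available. No comparable self-identifying value exists for integers, which is the open question noted above.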
Week 2 (Sept 12 - Sept 16)
- FORTRAN COMMON block mechanism for shared data declaration is not working
- Reason: Not sure, but it looks like the ld version update caused this
- Workaround: Use F90 and call Tmk_malloc and Tmk_distribute
- Write range detection
- Previously, every process broadcast its write ranges, and then every process calculated the intersection of all the received write ranges with its own read range and requested a prefetch.
- Big overhead: it is all-to-all communication, and the write range information may not be small
- Complicated implementation
- Sending write range information is not trivial
- Handling of unused write range information
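The receiver-side step of the previous scheme (intersecting every received write range with the local read range) can be sketched as follows. This is a minimal illustration, not the actual implementation: range_t and intersect_with_read_range are hypothetical names, ranges are assumed to be half-open byte ranges [start, end), and the caller is assumed to supply an output buffer with room for n entries.

```c
#include <stddef.h>

/* Hypothetical half-open address range [start, end). */
typedef struct {
    size_t start;
    size_t end;
} range_t;

/* Intersects each of the n received write ranges with the local read
 * range. Non-empty intersections are written to out (capacity >= n).
 * Returns the number of ranges that would need to be prefetched. */
static size_t intersect_with_read_range(const range_t *writes, size_t n,
                                        range_t read, range_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t lo = writes[i].start > read.start ? writes[i].start : read.start;
        size_t hi = writes[i].end < read.end ? writes[i].end : read.end;
        if (lo < hi) {              /* keep only non-empty overlaps */
            out[count].start = lo;
            out[count].end = hi;
            count++;
        }
    }
    return count;
}
```

Every process would run this over the write ranges of all other processes, which is where the all-to-all cost comes from; ranges that produce no overlap are the "unused write range information" mentioned above.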
- The new plan is to let the compiler API specify, at the sender side, which processor will read which ranges.
- This will remove unnecessary write range broadcast and prefetch request.
- Problems
- Will compiler analysis be complicated?
- Yes, but it's not impossible. MPI compilers should've done that.
- Should we invalidate all pages that will be written? Isn't it a big overhead?
- OK, we need a better idea.
- Up to date plan (a better idea?)
- Instead of invalidating write ranges, let's invalidate all the address ranges (pages) that will be read by other processors in future intervals.
- New API
- Tmk_send()/Tmk_recv()
- Similar to MPI_Send and MPI_Recv, except that no arguments are passed to these functions, since the send and receive operations find their operands and destinations at runtime.
- Tmk_read_ranges(read_range_list)
- The argument read_range_list is a data structure that contains a list of (processor-id, read-ranges) pairs, where read-ranges are the memory address regions that the processor identified by processor-id will reference in the near future.
- This function calculates the remote read ranges, which are the union of the read ranges of remote processors in read_range_list, and invalidates (write-protects) all the pages corresponding to those remote read ranges.
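The page-selection part of Tmk_read_ranges described above can be sketched as below. Everything here is an assumption made for illustration: the read_range_t layout, the 4096-byte PAGE_SIZE, the use of byte offsets into the shared region, and the helper name mark_remote_read_pages. The sketch only computes which pages belong to the union of remote read ranges; the actual write protection (for example through mprotect) is left out.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical entry of read_range_list: one range per entry. */
typedef struct {
    int    proc_id;   /* processor that will read the range */
    size_t start;     /* byte offset into the shared region */
    size_t end;       /* exclusive end offset */
} read_range_t;

/* Marks every page touched by a range whose reader is not self_id.
 * page_marked has num_pages entries and must be zeroed by the caller.
 * Returns the number of distinct pages that would be write-protected. */
static size_t mark_remote_read_pages(const read_range_t *list, size_t n,
                                     int self_id,
                                     unsigned char *page_marked,
                                     size_t num_pages)
{
    size_t marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (list[i].proc_id == self_id || list[i].start >= list[i].end)
            continue;                       /* local or empty range */
        size_t first = list[i].start / PAGE_SIZE;
        size_t last  = (list[i].end - 1) / PAGE_SIZE;
        for (size_t p = first; p <= last && p < num_pages; p++) {
            if (!page_marked[p]) {          /* union: count each page once */
                page_marked[p] = 1;
                marked++;
            }
        }
    }
    return marked;
}
```

Because the granularity is a page, two remote ranges that fall on the same page cost only one protection change, but a small range still protects the whole page around it.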
- Compiler API usage
- Tmk_barrier(0) // beginning of the interval
- Tmk_send()
- Tmk_recv()
- Tmk_read_ranges(read_range_list);
-
- original code in this interval
-
- Tmk_barrier(0) // beginning of the next interval
- Will this cause any problems? (Now, we are not relying on TreadMarks' consistency mechanism any more)
- Yes, we need to check whether this invalidation causes a large performance degradation.
- Implementation
- On a segv fault, this scheme will not request pages and diffs.
- These page and diff requests are replaced by Tmk_send and Tmk_recv.
- Instead, the modified DSM will incur only write segv faults; on a write segv fault, it will create a twin and record the number of the page that faulted.
- At barrier, don't do anything other than synchronization.
- Later, we may be able to eliminate barriers.
- On a call to Tmk_send(), the modified DSM will find the write ranges by diffing the pages that were modified during the previous interval against their corresponding twins.
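The twin-diffing step can be sketched as a byte-wise comparison that emits one range per run of changed bytes. This is a simplified illustration under stated assumptions: diff_range_t and diff_page are hypothetical names, the comparison is per byte rather than per word, and the output buffer size max_out is chosen by the caller.

```c
#include <stddef.h>

/* Hypothetical half-open byte range [start, end) within a page. */
typedef struct {
    size_t start;
    size_t end;
} diff_range_t;

/* Compares a modified page against its twin and records each maximal
 * run of differing bytes in out (at most max_out runs).
 * Returns the number of write ranges found. */
static size_t diff_page(const unsigned char *page, const unsigned char *twin,
                        size_t len, diff_range_t *out, size_t max_out)
{
    size_t count = 0;
    size_t i = 0;
    while (i < len && count < max_out) {
        if (page[i] == twin[i]) {   /* unchanged byte: skip */
            i++;
            continue;
        }
        size_t start = i;
        while (i < len && page[i] != twin[i])
            i++;                    /* extend the run of changed bytes */
        out[count].start = start;
        out[count].end = i;
        count++;
    }
    return count;
}
```

A location that was written with the value it already held produces no difference against the twin, so it never appears in the result; this is the same-value detection problem listed under Week 3.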
Week 1 (Sept 5 - Sept 9)
- Q: Does the new analysis technique require remote diff requests for write references? In other words, does the future reference analysis check only read references, not write references?
- A: Future reference analysis analyzes only the read references.