© Distribution of this video is restricted by its owner
00:00 Yes. Mhm. Yes. Oh. Mhm. So it is

00:23 o'clock, so. Oh. Yeah, I'll talk a little bit

00:38 more about MPI and then, yes, the demo in what time

00:45 is left. So yeah, as part of it,

00:58 some more slides on process mapping. Right, and

01:13 I'll talk a little bit about what is known as hybrid programming, where you combine

01:20 OpenMP with MPI. Right, so first just some comments

01:31 on some of the MPI implementations, as well as the Open MPI implementation. There is

01:38 room for implementation variations while still meeting the standard. Not all

01:45 features are necessarily implemented by every vendor or compiler vendor, and

01:55 there are features that are not part of the standard, but that's okay,

01:59 as long as whatever minimum set the standard requires is included. And this slide

02:08 is just trying to point out some of the aspects that I think

02:16 may be important. So, for example, basically the

02:24 corner photograph so it's about the 78 weeks. All right. And

02:33 amount of data traffic between the different , the numbers on the back and

02:43 starting with that it's just the night at seven. Uh for processing note

02:52 which to master processes. The main it's best to go with a human

03:00 01. Yes. Uh First uh and the next two lives that you

03:12 to the next hole just taking this it is uh hurried up into camps

03:22 uh you get that order so I to do that. Then the edges

03:32 in this example and number honest. there's a lot of messages uh so

03:40 we take the different levels efficient, possible the mistake. No terra I

03:48 too that things are so The process on one so zero and what so

04:07 find a way but so There are or 55 years old. They were

04:18 70 91 and 1-2 is uh 7 40 ft. So uh in

04:31 okay. Well don't oh onesie, sorry. Long too 46, 6

04:42 take them off what? Uh to this time to stand up the edges

04:51 disaster than just. Uh huh. nos another one of these things.

05:03 six. Well, you know, 44. Didn't learn about that,

05:16 what the market traffic. Uh And some of them then there is

05:24 I support that ways of doing the sentence even progress those in order to

05:33 numbering and numbers is trying to do for a number. So family optimized

05:44 . So you can obviously see that numbers going on that most of the

05:53 , references or data at least, processes are local memory references and things

06:01 on point. It is zero five is 71 - one. Uh

06:15 right. Are six Not for So that's like Uh huh. We

06:31 covered. Thanks. Still in This is what you get is now

06:38 feel safe for me. Yeah. be it Don't be scared significant reduction

06:46 in the load on the communication network. So that's part of the problem or challenge

06:53 in terms of node mapping is to since an implication that protects the people with

07:01 part. So to get good performance out of your code, you want to

07:07 have a good mapping. Um And so name of uh system operates.
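
As a hedged illustration of why the mapping matters, the sketch below counts how much traffic has to cross the network for a given assignment of processes to nodes. The `comm_edge` type, the `internode_traffic` function, and the numbers in the usage note are made up for illustration; they are not the values from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* One edge of a hypothetical communication graph: ranks a and b
 * exchange `volume` units of data per step. */
typedef struct {
    int a;
    int b;
    long volume;
} comm_edge;

/* Sum the volume of every edge whose two endpoint ranks are placed on
 * different nodes. node_of[r] is the node that rank r is mapped to. */
long internode_traffic(const comm_edge *edges, size_t n_edges,
                       const int *node_of, int n_ranks)
{
    long total = 0;
    for (size_t i = 0; i < n_edges; ++i) {
        assert(edges[i].a >= 0 && edges[i].a < n_ranks);
        assert(edges[i].b >= 0 && edges[i].b < n_ranks);
        if (node_of[edges[i].a] != node_of[edges[i].b]) {
            total += edges[i].volume;  /* this message crosses the network */
        }
    }
    return total;
}
```

For example, with four ranks exchanging volumes 10, 5, 10 and 1 on the edges 0-1, 1-2, 2-3 and 0-3, placing ranks {0, 2} on one node and {1, 3} on the other puts all 26 units on the network, while placing {0, 1} and {2, 3} together leaves only 6.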

07:23 System uh if you use a measure the craft and Amanda describing process

07:34 then the system they may have a way easier way of thinking of the

07:42 processes to note I guess in this is another trigger one. But it's

07:48 same example example. Yes, that's doesn't mean that the water very

08:05 Okay, so then the graph describes biological on the there are references between

08:13 . So, in the name of uh traditional waters and mr riddick.

08:22 . Uh huh. Sorry. School . So yeah. Mm hmm.

08:29 bastard. The first of the year I was talking to That's good.

08:42 So funding uh right, supports the winter cold. And our processes please

08:55 up from each other. Mhm. know, they're just feeling kind of

09:03 . So Okay, so this is application. That is correct. Process

09:18 . Now this interactive process street. when it comes to finish, that's

09:35 in the name of So here is , that's the they got the representational

09:42 had from both. Well, two on the lecture. So the medication

09:50 for I mean these are you they got traffic between these two

09:56 Uh, anything about it doesn't look that and your master onto note that

10:10 way. So the process terrible across fun of the transformer compositional And then

10:19 four and 5. Get smart awkward far through three. So this is

10:27 and that's lightning. Hello the nodes in the machine and the cluster. And

10:37 you usually don't have any control over that. So that's how the nodes get addressed and

10:43 so on. So it usually starts at zero and goes up to whatever number it takes to

10:49 cover all the nodes. But you bet on an idea to get an

10:55 . But the street let me know there's a sign and I when you

11:02 the system so this, this is device uh, arrangements get back on

11:09 investigation. Uh, the pastor It depends on oh, what the

11:18 of different data references that have Where are they? Nothing is what

11:27 are. Okay. All the processes the notes and the machine and all

11:34 difference. The president minimize the overall for all the basic broken up trafficking

11:46 . That's one way of trying to enough of that. But you know

11:51 about the connectivity between these four walls mean to something that actually is because

11:59 you also once you know the fact the nose and then directly after that

12:07 can also see contention in the network. At this point there is no notion of what

12:12 the network is and what the congestion is. huh. So in order to do

12:20 think you need to know? So know they want the catholic state of

12:27 study involved and they all activities in to actually buy something attention. That's

12:35 issue. Or if it's uh regarding network our capability to make not necessarily

12:43 that's me just walk to the university we lost the track. Show me

12:50 became they need to process in fact grown up, correct. So that

12:59 that differences are demanding all the the non blocking. Yes. Oh

13:09 . So if the diarrhea historians point but it also done depends on how

13:17 the expertise the principal, But it's in constant. So it depends on

13:26 six the internet you have to uh know, you're very tired.

13:37 That's quick. Mhm. So so here's just an illustration tribute

13:55 Uh I think properties we get along other greens, both clean and prepare

14:08 about 13 certain market potential. Whether know, in this case there are

14:18 things that mhm 10 points in the of the Children. Oh thank

14:27 That is something to the complexity, on the running out by itself that's

14:37 take care of the establishment. FBI generally does not. But the

14:45 person so. Right, wow. uh so the switches uh speak softer

15:00 for some network censors and his Okay, look at the right pattern

15:06 they know about that's multiple. Important. They're trying to minimize context

15:13 it's usually an opportunity that my implementation machine that died the NBA has the

15:21 to the source and destination and then it over a bit mr simmons into

15:29 on something and there are others internet , you know how the messages come

15:35 point A to for people. Uh . The other source this and potentially

15:45 priority what kind of features but it know about the past few days in

15:52 and how to do that. That's network itself. Yeah. So then

16:08 Uh huh additions about cigar stem. telling her it's also something that's

16:19 It's coming in terms of how to the mapping. But as you will

16:25 on the side matter of fact. huh. The math intensity quite

16:34 So there are lots of features of where things ends up because you

16:40 as I mentioned moving there is the critical part of performance uh you know

16:49 started in the urban actors. Of the stress of memory bandits and to

16:54 a problem. Many applications and then a final entity use and there's another

17:01 that is weaker than the memory buses, which is PCI Express, and that enables

17:08 to then there's another piece that is even weaker than PCI Express,

17:13 and that is the network itself. And that's why all of this mapping is important, and that's why

17:21 there are so many features provided. So I don't think this, you

17:30 want to comment on some of the that have forget to mention that.

17:36 uh and and those expenses, remember was saying uh because it's kind of

17:46 all the options. So that's you know, where else to and

17:51 is opening Yeah. Standards and most supported the standards or even if it's

17:58 internet, you are somebody else's uh a good starting point to save both

18:05 the standard and they're not gonna find tweets, various companies have done but

18:10 particular implementation um mm one having lots things. So there is that's

18:20 I'll come back to this notion of in the next five to but this

18:25 even more yes, move there is mapping, I don't know that of

18:33 processes elevated and with examples general and it's not affecting the binding. Stop

18:48 the difference of and all that not about it. It's not the

18:57 So it's different two nights ago. there's uh as far as pretty

19:07 I would say uh of what the of these things are and I think

19:12 through examples, house gets a little better understanding of, of course I'm

19:18 a different, well ways of controlling things sends up. Um and I

19:28 from that, think about it because the NP Iran refused to, the

19:36 was sort of an FBI program, then we also had this notion of

19:43 went to ask for money knows that controls I'm ahead processes. We want

19:50 know what it's saying. So generally mpirun depends on a scheduler or resource

19:59 manager, life store. So it's and . So in terms of getting information

20:05 about the cluster and the nodes and all that. Yeah. So the,

20:10 on what they have told someone that want and uh Cool. Yes,

20:20 . Uh well a little bit about , but then absolutely not, but

20:33 person and that was fine and the , but this that's immediately what

20:40 Uh miss scenarios where what the defaults all fine. So I think I

20:51 some slides later on. So one is to specify hosts, and there are

20:59 many different ways to specify hosts. you go. I'm directing or but

21:04 least we cannot define, I'm not it's not serious. Uh the people

21:14 tactics and is that french no Venezuela for what happens happens. Um and

21:34 principle the standard last year or not a long, I would say all

21:42 the different caches um, and they all day support that level. But

21:50 things from our reference in court uh . Uh, some of them also

21:58 cool. The one that is but uh, I mean in some

22:12 that's got to be back four harbor the best stuff. The reputation,

22:22 stopped me and different than uh, B I know and the administration's there

22:37 the default. You don't name anything start trying to figure out what

22:43 It's what is a good thing to the target on that machine. Whether

22:51 a court makes sense over the subject sense of kind of you don't need

22:58 . But the good meaning or so conduct. So no. And I

23:14 want to miss something by default. are okay. And mpirun allows you

23:26 to have more processes than there are cores or sockets, and so some of

23:34 them, whatever the entity is, will be oversubscribed. But if you don't

23:39 that, wow. Yeah. There's oversubscription then so much that I'm sure

23:50 they're not type of targets to use them acting there is plentiful enough.

23:55 there's no, it can be higher than the number of processes, but we cannot have

24:00 more processes than targets with no oversubscription. The other one that is

24:08 to what the last thing was that was all right. Like this.

24:18 vocabulary wants to contact spread that way will pack things or stupid things as

24:26 as possible. So bad answer fred Otherwise, if you don't specify diverse

24:37 version other than magic. Uh The number of processes to wow The

24:53 Yeah. Mhm comments is in My comments I think uh over should

25:05 mount everyone that for oversubscription. Generally it's useful when you have processes that

25:14 may stall for some reason, and in that time the other processes can do their

25:20 work on the functional units. So let's say if you have a process that's waiting

25:25 on I/O or a file or whatever, a network request or something, in

25:30 that case you may want to have oversubscription so that some other process can utilize

25:34 the functional units to get better efficiency. So that's one of the use cases where you

25:38 may want oversubscription. It depends on your application, how you have

25:46 written it. Another way of thinking about it is that if you have multiple functional

25:54 units and each process does not have enough work to utilize the entire set of functional

26:00 units, then in that case you may want to have oversubscription as well.

26:03 That's similar to the reason why we sometimes have hyperthreading enabled on some of

26:08 the systems. So that would be one reason to do that. So it's not

26:14 necessarily that every time you will benefit from oversubscribing; it depends on your application.
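
A minimal sketch of what "oversubscribed" means in numbers, assuming a "target" is whatever unit the processes are placed on (hardware thread, core, socket, or node). The function names are hypothetical and are not part of any MPI launcher.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the job asks for more processes than there are targets,
 * so at least one target has to host more than one process. */
bool is_oversubscribed(int n_processes, int n_targets)
{
    assert(n_processes >= 0 && n_targets > 0);
    return n_processes > n_targets;
}

/* Largest number of processes that end up sharing one target when the
 * processes are dealt out as evenly as possible. */
int max_processes_per_target(int n_processes, int n_targets)
{
    assert(n_processes >= 0 && n_targets > 0);
    return (n_processes + n_targets - 1) / n_targets;  /* ceiling division */
}
```

With 8 targets, 8 processes fit one per target, while 9 processes force two of them to share a target. Whether that sharing helps or hurts depends on how often the processes stall, as described above.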

26:20 Yeah. All right. Always Um We're telling him to walk on

26:33 . So again, oversubscription is an option available to you:

26:44 you can use it or you can disable it. So if oversubscription is disabled and

26:51 you just ask for more processes than there are targets, then mpirun will

26:56 give you an error and will not run your program. It will

27:00 report it, but it will not run your program if that would give you more processes than

27:06 whatever the target is, cores or sockets or nodes. So it is an error only if

27:13 there are insufficient resources for the processes with no oversubscription. Yeah. Because

27:30 at compile time the compiler doesn't know how many cores and nodes there are. That

27:38 is decided only at the end. It is only when you do

27:49 mpirun that you specify how many processes you want. Mm So there's younger sister kim

28:04 that. Go off man. No, it is only when you ask for more resources

28:11 than what you have available. That's I have my resources. But uh

28:20 need uh Yeah. Then you need to cancel your job and then request

28:34 more resources. And then, yeah, at run time you cannot change

28:41 the number of processes. Once you have started some number of processes in

28:45 MPI, during runtime you cannot decrease or increase the processes. Some processes

28:51 may end up having no work and they will sit idle, but they will still be

28:56 in the runtime environment; they will not get deleted or whatever. So uh

29:08 . Yeah. So this is sort this is so fun. Well defined

29:18 . Oh, also uh No. uh to they're supposed to one each

29:26 of the books. Uh huh. number five. Not my process,

29:39 know. Uh whatever I think some about on Yeah. Oh yeah,

30:00 fun. Lovin breakfast. Oh, are process experiment offered. Uh We

30:15 to know 72 sockets indefensible and 30,000 targets for so that to have the

30:28 instead of depending on the person. , no. People talk about four

30:36 them. Thanks. And the other told you look. Uh huh.

30:50 me, you know the earth for . Okay. The contract allocation

31:03 Oh, And the next 4? , the only six years after.

31:12 . And I'll see you get So it doesn't try to those

31:17 But there's something that because Back in 60s, both B- two C um

31:31 here is kind of the example, one word for green. Right?

31:42 there are always subject to pretend. the first thing he does in a

31:48 From four of the time for the responsible for processing the first book holidays

31:56 order agency. Uh and then I'll more processes and always decide market.

32:05 Yes. You uh almost from the here, that's why you may get

32:16 heritage that still to this day. is the work over. No more

32:27 4 41. Okay. But with this home because that was located

32:37 the suspect. Cc Yeah. So the next thing was sort of

32:49 and blind attributes and one comments this . So this central finance actually job

33:03 , the Hamburg scientist and he budget interest that are particularly good for their

33:12 architectures, but many of them. the first few sides here for instance

33:21 the opposite. So let's see what sites here. So it was met

33:26 court to the extent that so in case and so sick offices for so

33:39 words no meat. Uh those by core uh, take them in

33:47 to transport their ranks. Don't say moves up beef. I don't know

33:56 socket. Uh Oh, it's all course. Uh First soccer.

34:05 reserved for the first process. yes, ma'am with that. And

34:13 the next one that report next Okay. Buy the product. This

34:20 from dropping version of the school. he's uh do something before processes all

34:28 . Just a quick comment on the B's and dots in that diagram: where there

34:38 are two B's one after the other, that means that there is hyperthreading enabled and

34:43 each core has two hardware threads. So that's why there are two B's. If

34:48 you don't have hyperthreading, you will see just one B or whatever symbol

34:53 the implementation uses. So yes, that's for the cores. Yeah, and if you

35:03 map by hardware thread then you'll see just one of the B's rather than two

35:08 . I don't know if you have an example. That's what happened. So

35:12 , the other one. Oh come 55 right now And still not but

35:30 right. four. That's quite the . Um Okay. Different order.

35:38 smooth. All right. The first one of them, the second

35:49 It's very tough. Find the ground quarter This is And the findings from

36:02 place. Water four. So that's finding explain the process. They're always

36:16 the process between the chords in Osaka succeeded. I want to find it

36:23 more. So then. Oh even . Uh huh. What? I'm

36:39 ? Mhm. Uh The most fire . So and that is uh words

36:51 first. Okay. And then vaccine for the expert has been all levels

36:59 suppose the top position. So it's chose you know all kinds of ways

37:08 trying to Allocate things and depending upon application one or another maybe better.

37:15 I think they have some others. like discussing that a little bit uh

37:23 on this one and uh well, for four. Uh huh.

37:42 And Oh this is B. Found the the other one. But it's

37:54 many of the resources used for across person? This is Excuse report.

38:10 Fun. Oh awesome. Based on . So typical for what?

38:27 mhm. Who is this? Uh prophecies. I'm so hopeful. Stop

38:40 . Uh huh. It's all kinds options and and I think business

38:48 I wanted to common time. So is the way to think about began

38:56 the architecture it looks like So remember they on some processor from protectors?

39:08 is so thin stock it. So started the uniform member channels. I'm

39:18 if you want the tax might stand and you want to to lose both

39:25 because you make more money. And the number of members that compared

39:33 using one socket. Yeah, three sockets so try to But it's

39:43 seven and it's optimized. Mm. anyone work here? When you say

39:56 guns? Yes. Okay. 2000 approach that somehow we mentioned,

40:08 I'm gonna usually mentioned that the domestic in an implement reverse tickle too when

40:18 was that similar to those on the sir. Yeah. Okay. Uh

40:26 agreement. Well, in the top one you're mapping by socket, but

40:33 in the bottom one you're mapping by slot, right? And I think the comment

40:40 is that you don't want your alternating processes to go to the next socket, but rather

40:45 to stay in the same socket. So in that case you can use the slot as

40:50 the mapping. Uh We don't switch. is large. If you have only

41:05 field we say different. That stuff the meaning of stop it's something doctor

41:17 an FBI content system son it could our reference or you can stop

41:27 So all the thing this case the before by what the end result is

41:41 you start to learn the meaning of . So many things that happened.

41:48 is uh it's not so that there huge of loss believes process. Yeah

42:04 of course of course this the So I mean but when he was

42:17 in the demo we got the same in the same fashion. I don't

42:22 know what that discussion was, so what I'm trying to say is that the definition

42:31 of a slot is implementation dependent. So in this case slot meant a core. So that's

42:38 why it's looking similar to what you get when doing map-by core. Right.

42:46 And that's actually the right way to put it, that it's an imaginary concept, so it can

42:51 mean anything: it can mean hardware thread, socket, or anything, and that depends on the implementation
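
The difference between the two mapping orders can be sketched as plain index arithmetic. This is a simplified model under stated assumptions (identical sockets, one process per core, no oversubscription); it is not Open MPI's actual placement code, and the `placement` type and both function names are hypothetical.

```c
#include <assert.h>

/* Where a rank lands: which socket, and which core within that socket. */
typedef struct {
    int socket;
    int core;
} placement;

/* "By core": fill all cores of socket 0 first, then socket 1, and so on,
 * so consecutive ranks stay on the same socket as long as possible. */
placement map_by_core(int rank, int n_sockets, int cores_per_socket)
{
    assert(n_sockets > 0 && cores_per_socket > 0);
    assert(rank >= 0 && rank < n_sockets * cores_per_socket);
    placement p = { rank / cores_per_socket, rank % cores_per_socket };
    return p;
}

/* "By socket": deal ranks out round-robin over the sockets, so
 * consecutive ranks alternate between sockets. */
placement map_by_socket(int rank, int n_sockets, int cores_per_socket)
{
    assert(n_sockets > 0 && cores_per_socket > 0);
    assert(rank >= 0 && rank < n_sockets * cores_per_socket);
    placement p = { rank % n_sockets, rank / n_sockets };
    return p;
}
```

On a node with 2 sockets of 4 cores, ranks 0 and 1 share socket 0 under the by-core order but land on sockets 0 and 1 under the by-socket order. When a slot happens to mean a core, mapping by slot behaves like the first function, which matches the observation above.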

43:00 . So you can see in this for the world but everything has a

43:08 of the one the one suffering acute you kind of only about half of

43:15 family. Uh people want to work the next slide but this concept we

43:21 on the other hand sometimes these things anyone they they optimized because not everything

43:32 on the same subject. So Have good day that we use the cash

43:37 up 90 spouses. So it depends the application, whether you kind of

43:44 to stick things from every standard is a mystery but they may want

43:50 nothing. But it is that the is that we want to. Mhm

43:58 huh. So that's why they have those knobs to manipulate, because it's not

44:03 clear that, from whatever the runtime knows about what's happening, it can always

44:11 figure it out. Oh , so for example and see what

44:22 not over here. Oh yeah. , oh I see that. So

44:34 has additional options but before mother component on the defense, how they built

44:42 own processors. That's kind of very And the modular even in this blue

44:49 itself. Yeah, so I'm not going to dive too much into

44:56 this unless you see something you want to look at. So I just have these examples for

45:02 those who are interested to take a look and see all kinds of options that are

45:07 available for specifying basically what gets allocated to each process and where

45:16 they end up and if you find them I want to, you know,

45:22 for Open MPI I had a bunch of slides that show what you can get, in this

45:27 case either binding to the core or binding to the socket, so that, you know,

45:32 different options for the always simple things that could be good. Uh If

45:38 not the only one running on the because Mr may allocate other jobs to

45:44 of the courts of europe. so that they can dependent and it

45:49 move it around except the human benefits could also be affectionate. So mm

45:59 there's, you know, different happens uh yes. Okay, so next

46:05 I would do so anything more about ? Nothing was done. Something

46:15 Just a general comment. It's like what you've been doing in your assignments

46:21 with OpenMP or single-threaded code: it was necessary to understand the memory access

46:27 pattern. That's kind of similar to what we're trying to achieve in Open MPI

46:31 as well, but where you send data and where you allocate your processes has

46:38 a significant effect on how much data communication happens between the sockets or across the

46:44 nodes. So whenever you're trying to do that, say in your project, it's important

46:47 to understand how the memory is accessed in the processes and what the data communication

46:54 pattern is, and that would really help you to get good performance if you can

46:58 bind and map your processes accordingly. Like in the node mapping examples that were shown

47:03 initially. Right. So next, a few comments about the combination of MPI and

47:17 OpenMP. In principle you can have MPI processes running on the same node

47:25 . But then process-to-process communication is much more heavyweight than with the threads

47:32 in OpenMP. That's why hybrid programming has developed, the story being

47:42 that MPI handles data between processes on different processors or nodes. And

47:50 within the node you use the shared memory and OpenMP for that. I

47:57 have any but it's but they both once again the process is I know

48:03 we just talked about but some of point of view is standing right

48:08 right 61 I really like to know then there's empty. I there is

48:16 easy that's the notion program him on decided really want to silence place,

48:28 know, secrets. Yeah. Uh then there's a few different ways on

48:39 . NPR those we are used to . I can open and eat um

48:50 available. So here is against one the pictures. Again, the combination

49:00 this process that can be friends. uh no. Uh the Beatles you're

49:12 between. No, then there's all of this uh victory. The extent

49:22 which Japan I told Uh one French allowed to sleep. FBI Cool.

49:38 one safer. Mhm. Oh, the main threat to have an

49:49 So that's a friendship and it's the one that please for a second time

50:03 uh please for all the possible but the giants for different the city itself

50:14 with it. And that's why we have these options. So here is

50:19 single: that means just one thread. No. Mhm. One

50:26 at a time, or you can do funneled: in this case the master thread is

50:31 exclusive, and then it can spawn a bunch of threads and come back to the master

50:40 again. And this is serialized, the one-at-a-time version in

50:46 terms of the MPI calls. And then, with multiple, it's just as I said: everyone can do something, but it's

50:53 more of a challenge to keep track. So which one you want to

50:59 use depends on your application and so on. But it just shows the

51:05 . Uh huh options that from us of course developed tickets. It's more

51:10 of a challenge both for the programmer, to make sure that you can always get the

51:15 correct answer regardless of what you do, as well as for the MPI runtime system, to

51:23 make sure the rules of the standard are kept. Yeah, questions and

51:35 comments and this is just pictures in coma and we'll soon stop online.
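
The four thread support levels discussed here (in the MPI standard: `MPI_THREAD_SINGLE`, `MPI_THREAD_FUNNELED`, `MPI_THREAD_SERIALIZED`, `MPI_THREAD_MULTIPLE`, requested through `MPI_Init_thread`) can be summarized as a small rule table. The sketch below is only a model of those rules: the `thread_level` enum and `mpi_call_allowed` are local, hypothetical stand-ins so the snippet compiles without an MPI installation.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the four MPI thread support levels.
 * These are NOT the MPI constants themselves. */
typedef enum {
    LEVEL_SINGLE,      /* only one thread exists in the process        */
    LEVEL_FUNNELED,    /* many threads, only the main one calls MPI    */
    LEVEL_SERIALIZED,  /* any thread may call MPI, one at a time       */
    LEVEL_MULTIPLE     /* any thread may call MPI at any time          */
} thread_level;

/* May this thread make an MPI call right now under the given level? */
bool mpi_call_allowed(thread_level level, int n_threads,
                      bool is_main_thread, bool other_thread_in_mpi)
{
    assert(n_threads >= 1);
    switch (level) {
    case LEVEL_SINGLE:     return n_threads == 1;
    case LEVEL_FUNNELED:   return is_main_thread;
    case LEVEL_SERIALIZED: return !other_thread_in_mpi;
    case LEVEL_MULTIPLE:   return true;
    }
    return false;  /* unreachable for valid enum values */
}
```

The higher the level, the less the programmer has to coordinate by hand, and the more bookkeeping the MPI library has to do internally to stay correct.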

51:44 of them are some versions of Uh How did it? I'm not

51:50 we have the every emotion. So is just and this I think this

52:03 and damp black is on that Yeah, it's so you if you

52:08 to play with it on your Um And this is another version of

52:16 trade and whatever and then because you also pointing out that when different groups

52:22 issue things, things are very critical mature. You get the correct

52:25 So because mp doesn't necessarily which of in this space um broadcasting on the

52:37 to end the broadcast operation. And so on. You don't want the threads

52:46 to be confused about who's doing what, or MPI to be

52:50 confused. That's still what you want to . Some of the exposed an incorrect

52:54 and take a look at. Then also they finally incorrect example of trying

53:01 see her in order to And this definitely NPR so let's freedom for multiple

53:12 , history material pulse. So this the government, the demo.

53:25 So national too. And that's I a few more slides here. But

53:35 just so you take a look at think there's something specifically has chosen this

53:42 bit performance. All data but now professors system data structure. Okay,

53:56 me just take over from now Yeah. Sorry. I didn't prepare

54:01 the uh everyone's but the cool examples like okay, if you guys can

54:08 with. Yeah. Yeah. You awesome. Uh huh. Yeah.

54:33 that size good enough? Yeah. just show so these all these examples

54:41 you see here in this directory they're on, I think under the content

54:46 on blackboard uh holder for your That's why it doesn't make fun.

54:53 just to download them to make Uh huh. All the developments not

55:02 mm hmm. I'm sorry, but very simple. Uh huh program.

55:13 this lecture. So basically it's an MPI program for the five-point stencil. So you

55:21 decompose your entire grid into smaller grids and distribute them across the

55:29 processes. So, in terms of processes, you also have a 2D grid of

55:32 processes. And each process in that grid gets one block of the matrix that it's

55:38 supposed to work on. And the stencil has what's called the nearest-neighbor communication

55:44 pattern. So when you decompose it, all the processes need to communicate with their

55:50 neighbors, which, according to what we call them, are north, south, east and west,

55:56 for the data that the processes need to perform their own computations.
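
One way to picture the decomposition is to compute which slice of one grid dimension each process owns. This is a hedged sketch: the demo may simply assume the grid size divides evenly, whereas the hypothetical `block_of` helper below also spreads a remainder over the first few processes.

```c
#include <assert.h>

/* A contiguous slice of one grid dimension: first index and length. */
typedef struct {
    int start;
    int count;
} block_range;

/* Split n points into `parts` nearly equal blocks and return the block
 * with the given index. The first (n % parts) blocks get one extra point. */
block_range block_of(int n, int parts, int index)
{
    assert(n >= 0 && parts > 0);
    assert(index >= 0 && index < parts);
    int base = n / parts;
    int extra = n % parts;
    block_range r;
    r.count = base + (index < extra ? 1 : 0);
    r.start = index * base + (index < extra ? index : extra);
    return r;
}
```

Applying the same function once per dimension, with the process's column index for x and its row index for y, gives the rectangular block of the matrix that the process works on.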

56:04 So initially nothing special here, it's mostly all the MPI initialization calls and doing

56:11 parameter parsing here. The main thing is from here: here you determine what

56:20 your four neighbors are based on your rank. And so your px

56:27 is the number of processes in the x direction. There is a py as

56:32 well somewhere; that's the number of processes in the y direction. This is

56:37 the grid size. So it's a simple expression that allows you to compute

56:44 if you have a neighbor to the north or south or east or west, whether you have one

56:49 or not. And then based on that you communicate with that particular neighbor if it

56:54 exists. These are your two grids, old and new: new has

57:02 your computed results, and old is then later on used to hold the

57:07 values for the next iteration. We run like 1000 iterations, which I will

57:11 show in one sec. Oh yeah the there as well. Mhm.
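
The neighbor lookup described above can be sketched as follows. It assumes ranks are laid out row by row in the `px` by `py` process grid (rank = row * px + col) and that north means the previous row; the demo's actual layout and names may differ. `NO_NEIGHBOR`, `neighbors`, and `find_neighbors` are hypothetical.

```c
#include <assert.h>

#define NO_NEIGHBOR (-1)  /* marker for "no process on that side" */

typedef struct {
    int north;
    int south;
    int east;
    int west;
} neighbors;

/* Ranks of the four nearest neighbors of `rank` in a px-by-py process
 * grid, or NO_NEIGHBOR where the process sits on the edge of the grid. */
neighbors find_neighbors(int rank, int px, int py)
{
    assert(px > 0 && py > 0);
    assert(rank >= 0 && rank < px * py);
    int col = rank % px;
    int row = rank / px;
    neighbors n;
    n.north = (row > 0)      ? rank - px : NO_NEIGHBOR;
    n.south = (row < py - 1) ? rank + px : NO_NEIGHBOR;
    n.west  = (col > 0)      ? rank - 1  : NO_NEIGHBOR;
    n.east  = (col < px - 1) ? rank + 1  : NO_NEIGHBOR;
    return n;
}
```

In a real MPI program the missing sides are often set to `MPI_PROC_NULL` instead, since sends and receives addressed to it complete immediately and the edge processes then need no special-case code.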

57:19 So in this example, it's a very simple one. You

57:24 define these buffers, which are contiguous, for each side. So that

57:33 means for each of the sides, you pack your data for each direction into

57:39 these particular buffers and then send it to the neighbor that actually requires it. And the

57:45 r prefix is the receive buffer, and that's the buffer where your neighbor sends you

57:52 the data that you require for your own computations. So in this case those are

57:58 manually managed buffers; we're not using the derived data types from MPI.
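
A minimal sketch of the manual packing step, assuming the local grid is a row-major array of `double`. North and south rows are already contiguous in that layout, so only the east and west columns need a copy loop. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy column `col` of a row-major grid (rows x cols) into a contiguous
 * send buffer that holds at least `rows` elements. */
void pack_column(const double *grid, size_t rows, size_t cols,
                 size_t col, double *buf)
{
    assert(grid && buf && col < cols);
    for (size_t i = 0; i < rows; ++i) {
        buf[i] = grid[i * cols + col];  /* consecutive elements are cols apart */
    }
}

/* Copy a contiguous receive buffer back into column `col` of the grid. */
void unpack_column(double *grid, size_t rows, size_t cols,
                   size_t col, const double *buf)
{
    assert(grid && buf && col < cols);
    for (size_t i = 0; i < rows; ++i) {
        grid[i * cols + col] = buf[i];
    }
}
```

The send side calls `pack_column` before the send, and the receive side calls `unpack_column` once the data has arrived.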

58:04 Then simply here you start your iterations of the Jacobi: you pack your data

58:12 for each direction into the send buffers. And then what is done here is that we

58:18 use these non-blocking send and receive calls, MPI_Isend and MPI_Irecv.

58:23 So first you do an Isend for each direction, and then you also do an Irecv

58:29 for all the directions where you wait to get the data from your neighbors as

58:33 well. And you call MPI_Waitall, which is basically a synchronization point to make

58:37 sure that you got the data, because you don't know for these non-blocking calls if you

58:41 got the data or not until the wait point. And once you get your

58:47 data, you perform the updates on your grid. That's a very simple implementation for this:

58:53 first do the communication and then do your computations. So I'll just

59:02 go through all the other examples, and I have some numbers to show

59:06 you after that. So rather than using the contiguous buffers, you

59:17 can also use these derived data types from MPI. So in this case,

59:22 if you remember from the last lecture, we have the one called MPI_Type_contiguous, which

59:27 allows you to define a data type that is a contiguous buffer. In this case

59:32 X is the number of elements that we want to send north

59:38 and south, and MPI_DOUBLE is your data type. And then you just give your

59:41 type a name, in this case the north-south type, which allows you to do

59:46 contiguous communications for north and south. For the east and west sides, the data is strided, if you remember.

59:54 So that's because we work with row-major format in C, so the east and west data is strided

60:00 . In that case we use MPI_Type_vector for the data. Again you have the

60:07 count, the number of elements in one block, and then the stride between each

60:14 block, and that is again based on the size of your local

60:21 grid. And then you commit the type and give a

60:29 name for that data type. And then the rest of this example is pretty much the same.
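
To make the three numbers concrete, the sketch below walks memory the way a vector type describes it. `MPI_Type_vector` takes a count, a block length, and a stride (in elements of the base type); the local `vector_layout` struct and `gather_vector` function are hypothetical stand-ins that only mimic that addressing so the snippet compiles without MPI.

```c
#include <assert.h>
#include <stddef.h>

/* The same three numbers that describe a vector type:
 * `count` blocks of `blocklength` elements, whose starting
 * positions are `stride` elements apart. */
typedef struct {
    size_t count;
    size_t blocklength;
    size_t stride;
} vector_layout;

/* Collect the elements selected by the layout, starting at `base`,
 * into `out`. Returns how many elements were copied. */
size_t gather_vector(const double *base, vector_layout v, double *out)
{
    assert(base && out);
    size_t k = 0;
    for (size_t b = 0; b < v.count; ++b) {
        for (size_t j = 0; j < v.blocklength; ++j) {
            out[k++] = base[b * v.stride + j];
        }
    }
    return k;
}
```

For a row-major grid with `rows` rows and `cols` columns, one column is the layout {rows, 1, cols} starting at the column's first element, while one row is a single block of `cols` elements, which is why the north and south sides can use a contiguous type.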

60:33 Uh, you need to make sure that you commit the derived type, otherwise you can't

60:39 use it, and that's pretty much the idea. So now rather than doing the packing and

60:45 unpacking of data into those contiguous buffers as in the previous example, you can simply do

60:52 an Isend with your derived data types, rather than doing that packing and

60:57 unpacking into the contiguous buffers. And you perform your computations on the

61:03 data that you have. So again we're using the non-blocking sends and receives, the Isend

61:10 and Irecv, here. And, uh, can anyone think of one simple optimization

61:23 that you can perform? For this stencil example, we're doing the communication first.

61:34 We do that, and then we do the computation on the whole grid. A simple, very simple

61:39 optimization that you can think of? Well, the simplest

61:58 optimization you can do is, imagine this for a moment: why are you performing

62:04 these communication operations? The only reason is because for the edge elements you need

62:10 data from your neighbor, because that neighbor has the elements that you require

62:16 to do the five-point stencil. But for the inner part of the grid, you really don't

62:21 need anything from your neighbors. So in this case you can simply post your communication

62:26 operations using the Isend and Irecv, and while the communication is happening you can do

62:32 your computations on the inner part of the grid, so that you don't

62:36 need to wait until you get the data for the outer part of the grid. That's

62:41 where you can overlap the communication and computation. And once you have done your inner

62:46 computations, you can check whether you got the data by having a Waitall

62:52 at the end, and then perform your computations on the outer part of

62:56 the grid. That's supposed to give you a speedup, but that also depends on what

63:01 your network speed is and how you map your processes. So you may

63:05 get some speedup, or you may not get it; it is application as well as implementation

63:12 dependent. But in general I guess you would probably get some speedup. In the runs

63:16 I did, I did not get a speedup. That's why I'm making this comment.

63:22 Yes. So even in this version we were using the derived data

63:28 types. Right? So again, the derived data types, as I've noticed from my

63:34 measurements, are mainly a way to give you easier programmability of

63:39 your program, because when you have only numbers you need to figure out the strides,

63:46 and then you have to figure out the indexes of the elements to copy the

63:49 data to, first to the buffers and then to the send. You have to go

63:52 through all that. With the derived data types you only need to do that once,

63:57 and then you have the derived data type and you can basically just refer

64:01 to it. The derived data types can be used to handle all that conversion for you.

64:06 So again, with my measurements, I did not see much speedup using the derived

64:11 data types, but I feel like it's more of a programmability feature for the

64:24 to come from the next project. is this I mean so the internet

64:30 uh going up quickly so you wouldn't to do. It has more number

64:34 things but double overlooking only become time most of the, depends on how

64:46 you have distributed your work and how large your problem is. If you have

64:55 a balance between the work per process and the communication time that happens between the

65:02 processes, then you may probably see a speedup from overlapping communication and computation. But if communication

65:09 is the bottleneck, then whatever you do in terms of computation, that communication time

65:14 is still going to be a part of your total execution time.
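
The point about balance can be put into a small timing model. This is only a sketch under an idealized assumption that interior computation and communication overlap perfectly; the function names and the sample numbers are made up for illustration and are not measurements from the lecture.

```python
def step_time(t_comm, t_interior, t_boundary, overlap):
    """Estimated time of one iteration (all arguments in the same time unit)."""
    if overlap:
        # Interior work runs while messages are in flight (idealized);
        # boundary work can only start once both have finished.
        return max(t_comm, t_interior) + t_boundary
    # Without overlap everything happens one after the other.
    return t_comm + t_interior + t_boundary


def speedup(t_comm, t_interior, t_boundary):
    """Ratio of non-overlapped time to overlapped time."""
    no_overlap = step_time(t_comm, t_interior, t_boundary, overlap=False)
    with_overlap = step_time(t_comm, t_interior, t_boundary, overlap=True)
    return no_overlap / with_overlap


# Hypothetical cases: interior work fixed at 10 units, boundary work at 1 unit.
for t_comm in (1.0, 10.0, 100.0):
    print(f"t_comm={t_comm:6.1f}  speedup={speedup(t_comm, 10.0, 1.0):.2f}")
```

In this model the gain is largest when communication and interior work take about the same time. When communication dominates, the overlapped time is still at least the communication time, so the gain shrinks toward nothing.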

65:21 But how about the runtime? In the runtime it

65:29 is doing the same thing that people do manually. In the runtime there should be

65:33 a function that's actually going and getting the values, whether they are strided or contiguous,

65:41 and packing them, and that's probably what's going on. But it's likely to do

65:45 it in some more optimized fashion than doing it entirely by hand like you would in your own

65:51 program. More optimized. So, I mean, comments from you?
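
What a vector-style derived type describes can be illustrated with plain Python lists. This is a sketch of the gather that a runtime, or a hand-written packing loop, has to perform; it does not call MPI, and `pack_vector` and the small 4 by 5 grid are hypothetical names and sizes chosen for the example. The three parameters mirror the block count, block length and stride mentioned above.

```python
def pack_vector(buf, offset, count, blocklength, stride):
    """Gather `count` blocks of `blocklength` elements from a flat buffer.

    Consecutive blocks start `stride` elements apart, beginning at `offset`.
    """
    packed = []
    for b in range(count):
        start = offset + b * stride
        packed.extend(buf[start:start + blocklength])
    return packed


rows, cols = 4, 5
# Row-major local grid stored as one flat list, as a C array would be.
grid = [float(r * cols + c) for r in range(rows) for c in range(cols)]

# North edge: one contiguous block of `cols` elements (the first row).
north = pack_vector(grid, 0, 1, cols, cols)

# East edge: `rows` blocks of one element, one row length apart (the last column).
east = pack_vector(grid, cols - 1, rows, 1, cols)

print(north)
print(east)
```

The north edge needs a single block because a row is contiguous in row-major storage, while the east edge needs one block per row with a stride equal to the row length.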

66:07 . And they didn't. Right. huh. I would think in

66:18 Okay. You know what time it . First of all the ladies,

66:28 one night. Mm hmm. And the bible is not on the let's

66:36 . How has on the genes that up copy is picking up?

66:50 I agree with you. Yeah. the 19 it's never enough. Uh

67:02 . Just do one thing to remember to change it somewhere. Okay.

67:15 I hope to perform different. Here's problem with the First nations, things

67:22 that. Yeah. On the news the giants. Yes. Yeah,

67:32 the one that I showed you initially first example that I shall initially.

67:43 . Oh no, that that uh you do a scent for each,

67:50 other. Yeah. You don't generally . No, I did not do

68:03 problem. Yes. So rather than copying the data first to a buffer and

68:09 sending it, we do a send for each of the elements that you want to

68:13 send. In that case each send operation has an overhead of doing it.

68:17 So it needs to talk to the network interface, or whatever networking layer there is, and

68:22 then send that one element. So in the case of copying the data into a contiguous

68:27 buffer and then sending it, in that case you only pay the overhead of

68:32 a send only once for that buffer. When you're doing one element at a time,

68:38 sending them one by one, you pay that overhead of doing a send once

68:44 for each element. So that overhead adds to the total execution time and then you

68:49 see a slowdown. Yeah. And then the
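
The per-send overhead can be made concrete with a simple latency plus bandwidth cost model. This is a sketch only: the latency and bandwidth constants below are hypothetical round numbers, not values measured on any system, and real networks behave in more complicated ways.

```python
LATENCY_S = 1e-6          # assumed fixed cost per message, in seconds
BANDWIDTH_BPS = 1e10      # assumed bandwidth, in bytes per second


def message_time(n_elements, element_bytes, per_element):
    """Estimated time to move n_elements, as one message or one message each."""
    n_messages = n_elements if per_element else 1
    payload = n_elements * element_bytes
    # Every message pays the fixed latency; the payload cost is the same either way.
    return n_messages * LATENCY_S + payload / BANDWIDTH_BPS


n = 1000  # number of doubles on one edge (hypothetical)
one_by_one = message_time(n, 8, per_element=True)
buffered = message_time(n, 8, per_element=False)

print(f"one send per element: {one_by_one:.3e} s")
print(f"one buffered send:    {buffered:.3e} s")
```

Under these assumptions the payload term is identical in both cases, so the whole difference comes from paying the fixed per-message cost a thousand times instead of once.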

69:01 example I have here is showing you the remote memory access, that's the

69:10 one-sided communication. So rather than doing send and

69:17 receive, you do a put. So, where is it?

69:27 Yeah, so this is basically from the lecture. You have to set up a window

69:31 in MPI for doing this one-sided communication, and then you can do a

69:38 put. In this case you don't necessarily need a synchronization point, but since our

69:45 application depends on getting the data and computing on it, you need to have a

69:49 synchronization point in this case. So in this case also I did not see

69:54 a speedup; I'll show you the numbers in a moment. The other thing, in terms

70:01 of using these one-sided communications: remember you need to define these windows,

70:08 which are remotely accessible memory spaces on each process, so that you can do this

70:13 one-sided communication, because in one-sided communication, remember, the receiving process does not post a

70:20 receive operation. Right? The sender simply puts the data into the

70:25 remotely accessible memory of the receiver. So in this case each process

70:30 needs to create a window that the others can access. Right? And here are

70:40 some of the numbers that I got pretty much last night.

70:46 These are on the Bridges nodes. I got access to two nodes;

70:51 there are 128 processes on each node, which is equal to the number of cores

70:57 on the nodes as well, so I could go up to 256 processes.

71:06 These numbers that you see are for mapping by core. So each next process

71:12 goes to the next physical core in the system. And

71:17 it's also bound to that core, so it cannot move to any other

71:23 core. So one very simple observation

71:32 you can make here is that the scalability of this run

71:39 is good as long as you stay within 128 processes, where we are

71:46 only on a single node, and it does seem to have a good balance between the work

71:52 per process here. So up to 128 you pretty much see the execution time

72:00 continue to decrease. But as soon as you move to

72:08 256 processes, where you're using two nodes and there's a communication link between those

72:14 nodes, the scalability starts to suffer, because you have introduced one more component

72:21 into your run time. What else? Yeah, that's

72:34 Right? Yeah. And the other thing that I want to mention

72:43 is the importance of mapping. So now in this right-hand

72:49 table I'm mapping by node, meaning that each consecutive process goes to the next

72:58 node in your allocation. So you've got two nodes. Process

73:01 zero stays on node zero, process one goes to the second node, and so on.

73:07 Right? So now, because this is a nearest-neighbor communication application,

73:13 process zero has to go across the network to get its data or send the

73:18 data. Now it has to travel a longer distance for the data communication, and it has to

73:23 get all the data before it can continue here. Now you can see

73:28 the scalability now drastically degrades as you move towards more processes. And

73:35 if you compare here, let's say four processes here, which is the first row

73:42 here, and then the four processes with this kind of mapping, by node, you can

73:48 see there's almost a 4x decrease in performance. That's just one example to

73:54 show you the performance impact. Yes, these are execution times. Yeah. Large.

74:05 . Mhm. Talk to you. . Exactly. four point they

74:14 Exactly. Yeah, that's mhm. showing you the importance of mapping.
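
The effect of the two mappings can be sketched by counting how many neighbouring ranks end up on different nodes. This is a simplified illustration, assuming a one-dimensional chain in which rank r only talks to ranks r-1 and r+1; the real application is a two-dimensional grid, and the helper names are hypothetical. The node and core counts are the ones quoted above (two nodes, 128 cores each).

```python
def map_by_core(rank, cores_per_node):
    """Fill one node completely before moving to the next."""
    return rank // cores_per_node


def map_by_node(rank, n_nodes):
    """Place consecutive ranks on consecutive nodes, round robin."""
    return rank % n_nodes


def cross_node_pairs(n_ranks, node_of):
    """Count neighbouring rank pairs (r, r+1) that sit on different nodes."""
    return sum(1 for r in range(n_ranks - 1) if node_of(r) != node_of(r + 1))


n_nodes, cores_per_node = 2, 128
n_ranks = n_nodes * cores_per_node

by_core = cross_node_pairs(n_ranks, lambda r: map_by_core(r, cores_per_node))
by_node = cross_node_pairs(n_ranks, lambda r: map_by_node(r, n_nodes))

print("neighbour pairs crossing the network, by core:", by_core)
print("neighbour pairs crossing the network, by node:", by_node)
```

In this simplified chain, mapping by core leaves a single neighbour pair communicating across the network, while mapping by node makes every neighbour pair cross it.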

74:24 has to be careful very well, much even mapping in open MPI or

74:29 uh Both only that some right now everyone. Uh huh. All this

74:44 know, although spoken on network. right, well that's fine,

74:52 Data references of memories prominent. So the point seen in this case.

75:04 . Oh obviously everyone else was No, okay. On the right

75:15 that time. Rare point of Oh right six. The association and

75:31 that's why I'm uh both models are used uh basically involvement, processes,

75:41 on the outside. On the right is all the time. He this

75:48 tough. Uh huh. Yeah. . But still it will be larger

76:00 that. I don't have the Yeah. Yeah. So called liberal

76:07 Oh yeah. Okay, a window is just a logical term for

76:34 referencing a remote memory location. It's just a logical construct. So that

76:39 memory location could be on another node, it could be on the same

76:42 node, it could be on another socket. The window is just a logical term.

76:48 It could be on the same node, it could be on the same socket,

76:53 and in that case it really doesn't make sense. Yeah. Yeah, but

77:02 that's the flexibility it allows you. You can still do it if your

77:06 application is such that it makes sense. I'm not saying

77:10 it will never make sense. I'm saying that it depends on your application

77:13 whether it makes sense to do it or not. A window

77:21 is a remotely accessible memory location. That's right. And those

77:30 windows are known to every process in your runtime environment.

77:35 That environment can be a single socket, one node, multiple nodes,

77:47 it can be anything. So all processes know about that remote memory

77:54 location that is bound to one particular process. All right. And then

78:00 the other processes can come and basically say, "I want to put

78:03 data in your remote memory location that you have presented to me." Because

78:13 there's a concept of having remote memory and having a section which is

78:18 private memory. Private memory can only be used by the process that owns it; the remote

78:24 memory location is the location that that process exposes to other processes

78:29 so that they can communicate with that process. Oh, all right, sorry,

78:35 just have two minutes because this will Yeah, because this will be

78:42 What? Yeah, but I just to show one thing that's something that's

78:51 you're thinking. So because at some point you might use MPI, I just want

78:58 to show how you can use TAU, but now with MPI,

79:03 as that will be useful for you. So again, after loading

79:07 that module, you need to set

79:16 your TAU makefile, and in this case that would be the makefile that has the

79:24 MPI option, because that's the one that's been set up. So you

79:31 have this. And then these are C++ programs that I have

79:38 here for the stencil, so I'll use tau_cxx.sh,

79:47 which is the C++ compiler wrapper from TAU.

79:57 Yes, tau_cxx.sh. And basically that's the same command that is

80:03 there for any of the programs here. Then you give the stencil MPI source

80:11 file, and then your output file name, whatever you want to

80:15 call it. And then while running your

80:20 program, what you need to do is give all your mpirun

80:25 parameters first. So let's say the number of processes for your program, that is

80:30 specified first. And then it's similar to what you did for your earlier

80:37 programs. So now tau_exec, and then whatever TAU options you want to

80:42 give it here. Since you mentioned mpirun, you don't need to set

80:47 the flag to specify MPI, but there's a way to do

80:53 that as well. And then it completes your program there. The only difference

80:58 you'll see is when TAU generates these metrics for you. So, again, if

81:09 , if I Surely one of Yeah. So I ran this uh

81:20 example, four processes. And until until open MPI you've been running only

81:26 process. You've been doing multi credit , but not multi process program.

81:31 multi process program drumming. So that's after now you've been only seeing one

81:36 that was generated by the data for the threats that were involved in your

81:41 In this case, because I have processes here. You it generates four

81:46 files. Which process. All But reading it. Uh still the

81:51 . You just read the graph. that opens the uh all the profiles

81:57 each other in the process. I think there is laid up for

82:00 all the processes here. So if you start from here, this is node 0,

82:07 and as you go down, node 1. Basically the node number is the process

82:13 number. So you get node 1 and then node 2,

82:19 and then at the end there is node 3. We had those four processes, which

82:25 correspond to these nodes. And then towards the end you have the summary over

82:31 all the processes. So you get the total summary for

82:37 each function that was involved. And then you also get the mean summary.

82:42 This is for the one metric that you're running, right. It

82:48 gives you details about all the MPI functions, and it would show any other functions you

82:52 are calling in there, just like before. So I just wanted to

82:57 show you that you can do this. Now if you have

83:02 questions go ahead. I just want to finish it up before we move to hybrid

83:12 . Yeah steve right. Yeah. that case you need to do dash

83:17 uh okay, A and B. also you need to do the FBI

83:22 first The 500 won. And then guess each of the four profiles that

83:28 showed you each profile rather than having one section, it will have multiple

83:32 for the uh Okay, that's pretty implemented. Thanks to you. It's

83:51 lot, you can be sure that kind of thinking through. Sorry.

83:57 happens when we think about this. well if you do it then it

84:04 you uh all the other processes, how much work? It depends how

84:17 No, no, no, it depends on how much work each process

84:22 does. So if you have four processes like I have, process one does,

84:26 let's say, 1000, process two does some other number,

84:32 and then the results will be different. Right? You're not going to

84:36 get the same every time unless each process did the same amount of work.
