© Distribution of this video is restricted by its owner
00:00 before. So today is essentially focused on what it is like

00:17 to work in the way you will be working for the

00:26 assignments and most likely the project as well. And for that, there is

00:36 a lot of background that, I would say, is good

00:41 to get used to. What is critical is how

00:49 these environments are used, including for the assignments you are going to do.

01:00 Many of you may have used these shared environments, but

01:07 most of the time students are not familiar with them. So I will

01:13 talk quite a bit about resource managers, in particular one known as Slurm. And then I

01:23 will go through some of it slowly and some of it quickly, and then

01:29 Josh will do a demo that will, I guess, hopefully give you some

01:40 feeling for how things actually work in a shared environment. Then

01:47 either I or Josh will talk about module commands, which are

01:54 commonly used in shared environments. It is a command that basically allows

02:00 you to load a software environment, or a set of software that is useful for you

02:08 but is not necessarily accessible unless you use this module command. So it sets

02:15 the paths to the various software you need. And I will talk about timers, because

02:22 in the assignments you will do, I need you to figure out what

02:30 happens in the code, because this class is about performance. So for performance, you

02:35 need to understand how resources are used, and one of the most basic tools

02:41 is timers. And it turns out not to be all that simple to do a

02:45 good job in timing codes. Then there are simple hints about how to collect data

02:54 and how to report it for the assignments. And again, for this

03:00 course, the focus all the time is on developing an understanding of how a code

03:07 behaves on the particular platforms, so that when you code for

03:13 the platform, you know what to do to get

03:18 good performance. All right. So some of you may

03:33 have gone ahead, as suggested last lecture, to get yourselves accounts

03:43 on the platforms. I think I may have omitted TACC on the course

03:48 page, but it is on this slide. So three of you, before class, had already

03:58 requested accounts on both, and I have confirmed that

04:03 you have account access. The TACC account for the class was approved,

04:10 so the software that adds users to the account may not

04:18 have run yet. But tomorrow I am sure you will have access to the

04:23 account. I mentioned last time that the

04:35 Pittsburgh Supercomputing Center has its computer, or cluster, known as Bridges. And

04:40 that is one where you will have an account. The kind of portal

04:48 through which you get access is known as XSEDE, the Extreme

04:53 Science and Engineering Discovery Environment. It is an NSF-funded entity.

05:00 TACC is, in fact, also part of XSEDE. So XSEDE is

05:06 an umbrella organization with a number of centers participating in it. And when it

05:16 comes to TACC, that is the Texas Advanced Computing Center. We kind of

05:21 have a little bit of a privilege in Texas, so we can get access to

05:31 the Texas institution separately from XSEDE. So part of the reason why

05:37 it is not listed as an XSEDE center is that we use the Texas privilege to get

05:42 accounts, not through the XSEDE organization. So this is

05:53 something that works pretty much the same at all centers: you just need

06:03 to have your own account, and then users get linked to possibly

06:11 different projects. In this case, I will link you to the class

06:15 accounts. Some students may already have accounts at these various institutions, and that is

06:21 all fine. That should not be a problem. And the one thing I do ask

06:27 for is honesty: use the class accounts for class assignments and projects,

06:35 not for whatever science you normally do at your institution. There is no problem in getting accounts

06:41 for that as well. But we get these for educational purposes, and in fairness you

06:50 should not abuse it. For anyone who has not seen clusters, I put in a

07:00 kind of picture of both homegrown and professionally built clusters. And I will talk a

07:06 little bit more about that. This is just for anybody who has never seen one, to see what

07:12 they look like, since you just log in remotely and so do not

07:17 know what the physical thing may look like.

07:24 So again, on the left side is a kind of homegrown thing, whereas

07:28 on the right side you see one that is professionally put

07:34 together. And we will talk more about how they are actually put together. This is just to give you

07:39 a notion of what it might look like in real life. So now to some

07:47 nomenclature that is important for using the resource manager, as well as for

07:57 understanding your codes. Some of the nomenclature is on the figures.

08:06 Unfortunately, some of it is somewhat ambiguous, so I try to be consistent in

08:14 class. And this slide, going from the top to the

08:24 bottom, is a little bit of a bottom-up view of what you are

08:30 going to use. So what you see in the upper left-hand corner is

08:35 a photo of what is known as a chip. This is a piece of silicon that has all the

08:44 logic and caches and whatnot on it. That is what

08:56 I will refer to as a processor, or what used to be, in the

09:03 old notion, a CPU, Central Processing Unit, back when there was only one processing unit

09:09 at most in the beginning, that not in abundance portrait that was put

09:16 on the circuit board. Nowadays, processors have many cores. Even things in your

09:23 phone or laptop tend to have a few cores, even if they may not have

09:28 that many. So this piece of silicon is what companies like

09:39 Intel and IBM design and produce. They package it,

09:47 and that packaged processor is what you see in the

09:52 picture. It then gets plugged into what is known as a socket on the circuit board.

10:01 So on the right-hand side in the top row, you see a sort of

10:06 instance of the circuit boards, and somewhere under the red and green areas is

10:12 basically the processor. When it comes to clusters, one tends to refer

10:23 to these individual PCs or servers, just the normal things that, when you

10:30 go to a website and try to buy something, you can order as a PC or

10:33 server, as nodes. A node is usually then something on the circuit board, possibly packaged in a

10:44 couple of different ways, as so-called rack units or blades. Rack units and blades are

10:53 really just for your information; it is not something that you will actually need to know

10:59 for the class. But understanding cores and nodes is essential. Then,

11:10 when it comes to the things that are being put together to form clusters,

11:15 one uses either rack units or blades. Rack units get

11:23 mounted into racks, as in the lower left corner, whereas blades first get put

11:32 into what is known as a chassis, and then the chassis goes into the rack.

11:39 We won't go into it too much; we will talk a little bit more about it

11:45 many lectures further into the class. The reason for blades is that

11:52 they tend to be more energy efficient than just using rack units, even

11:59 though the same processors are being used. It has to do with the

12:03 packaging, and part of the reason they are more efficient, from an

12:10 energy point of view, is what the chassis allows,

12:18 because then the cooling infrastructure is shared by all blades at once, and large

12:24 fans tend to be more efficient than the small fans that fit in

12:29 rack units. Anyway, the take-home message is that you need to understand

12:35 processors, cores and nodes; the rest of it is just general information

12:41 about how things are put together. This is more what, functionally, a cluster

12:49 looks like. So, as a kind of schematic, I

12:58 put this on the right-hand side of the slide. Clusters are not

13:04 completely homogeneous; they tend to have different kinds of nodes. So there is a login

13:11 node, which is the one you connect to when you try to use a cluster, and

13:19 it is not just a single login node. It may be a single one if it is a

13:23 cluster for a few users, but for a cluster that serves thousands

13:28 of users, like Stampede2 at TACC, there are usually a few of them,

13:35 but not that many. What the login nodes are supposed to

13:41 do is basically to allow you to connect to the cluster, so you are not

13:50 supposed to do anything else on them. The mistake, I would say, of new

14:01 users, which is the case for many students: the login node tends to be a

14:04 general-purpose computer, and in principle you can do anything you want on it,

14:10 but you shouldn't, because it is not configured for that and there are not enough of

14:18 them to actually support anything other than the administration of users from a login

14:26 node. So, one thing: never compile or run codes on login nodes. Then clusters have

14:36 nodes that you are not really going to be exposed to. That is something

14:41 the system admins use to manage the cluster. And there are compute nodes. Those are

14:49 the ones you are going to use for, among other kinds of things, running

14:54 your codes. And compute nodes tend not to be all of one

15:02 kind, either. There are usually a few kinds in a cluster, and they are

15:10 different in terms of the amount of memory on the node. As you will

15:14 see on the Bridges cluster, they have three or four classes of nodes

15:22 nowadays, with respect to how much memory is on the nodes. So they have

15:27 regular memory nodes, which is what you have for the class, then there are what they

15:32 call large memory nodes and extreme memory nodes, and then they have

15:40 also nodes that have GPUs on them, when it comes to

15:48 Bridges. Other sites may have FPGAs in their nodes as well, and then there are

15:54 separate I/O nodes. So it is important to understand that login nodes

16:05 are just for login purposes. There are other nodes you should use for

16:11 compiling and running codes. I/O nodes are something, again, that is part of how

16:18 you actually build or configure your cluster. But usually users, at least for this

16:24 class, do not need to worry about I/O nodes.

16:30 Yes. This is something you can see in particular: something of the flavor of

16:36 what you will get on Bridges. It is the regular memory nodes I am

16:43 focusing on. There are 752 of them, and each of them has

16:54 the somewhat older Intel Haswell processors, which have

17:00 14 cores on them. Sorry, I got confused.

17:09 I will backtrack for a second, because these are

17:13 things that I brought up on another slide. So let me

17:18 go back and try to get the slide. Ah, so for home

17:41 from this business. Very. You see in this lineup?

18:57 Okay. Sorry about that. So this is the one that is more specific:

19:04 the regular memory nodes have 14-core CPUs here. And each is

19:13 what is known as a dual-socket type of node. So there are two CPUs

19:18 in each node, and each CPU has 14 cores. Then there are 128

19:25 gigabytes of memory in each node. That means that the two CPUs, and by

19:34 implication the 28 cores in the node, share the same 128 gigabytes

19:40 among them within the node. Then there are the GPU nodes, which

19:49 you will use for the exercise on GPUs, and potentially in projects on

19:56 GPUs, if that is what you do. They have two sets of nodes with

20:04 GPUs. They have 16 nodes with an older version of NVIDIA GPUs, known as

20:12 the K80 model, and there are 32 nodes with the more

20:18 recent GPU, the P100 nodes. It is still not the most

20:25 recent GPU from NVIDIA, but they are fairly recent. On the Stampede2

20:35 system there is a more recent version of Intel processor than is available on Bridges;

20:46 it is known as Skylake. That is what they called it, Skylake,

20:52 as is typical for Intel, which uses all kinds of names for their processors. If people look

20:59 at the TACC site, they basically have, I would say, numbers. Those are good to

21:09 know for exactly what it is, but the name, Skylake or

21:15 Haswell, tells you a little bit more about how old it is and what generation of technology is

21:19 used. Anyway, these are also dual-socket nodes. In this

21:26 case, there are 24 cores per socket. There is a little

21:33 more memory on each one of them: in this case 192 instead of

21:37 128. So that was a little bit about just the hardware.

21:48 Next I will talk about the management software. Any questions on

21:57 clusters in general, or the particular processors and nodes you will be using?
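
As a rough arithmetic check of the node figures quoted above, the following Python sketch computes cores per node and memory per core. The `NodeSpec` class and its field names are hypothetical, introduced only for illustration, and the Stampede2 entry assumes those Skylake nodes are dual-socket as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical description of one compute node (illustrative names only)."""
    name: str
    sockets: int
    cores_per_socket: int
    memory_gb: int

    @property
    def cores(self) -> int:
        # All sockets in a node hold the same processor model.
        return self.sockets * self.cores_per_socket

    @property
    def memory_per_core_gb(self) -> float:
        # The node's memory is shared by every core in the node.
        return self.memory_gb / self.cores


# Figures as quoted in the lecture; the dual-socket Skylake layout is an assumption.
bridges_rm = NodeSpec("Bridges regular memory", sockets=2, cores_per_socket=14, memory_gb=128)
stampede2_skx = NodeSpec("Stampede2 Skylake", sockets=2, cores_per_socket=24, memory_gb=192)

for node in (bridges_rm, stampede2_skx):
    print(f"{node.name}: {node.cores} cores, {node.memory_per_core_gb:.2f} GB per core")
```

The point of the calculation is that memory is a per-node resource: a job that uses every core of a node has only a few gigabytes per core to work with.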

22:13 So what is commonly used on lots of clusters,

22:24 not just academic ones of this type but also in industry, is something called Slurm,

22:33 which stands for Simple Linux Utility for Resource Management. It is an

22:39 open source piece of software. I will say a little bit about how it works, and

22:44 then I will fairly quickly go through a few slides about how to use Slurm.

22:55 These are basically a way for you to remember the demo that Josh will

23:03 do, so I will try not to spend too much time on the slides, because they are really

23:10 documentation of the demo more than anything else. So this is how these resource manager

23:19 types of things work, if you have not used one before.

23:26 actually, and the one that uses puncher or Maxwell August. Your visual

23:33 Irma's forests and remember, but in the way you worked is you

23:40 connect to the login node over the Internet, when it comes to these centers, and

23:48 from the login nodes you then submit your jobs to the resource

23:56 manager. Of course, in submitting the job, you need to

24:04 tell the resource manager what your job needs. That means you need to

24:12 tell it how many nodes you want, how many cores you

24:16 want, how much memory, and for how long. All of that information is then

24:23 handed to the resource manager, which then manages the system and

24:34 dispatches the job to the actual cluster. Typically, things are not sitting idle,

24:43 so that means jobs end up in the queue, getting queued for some time. And in

24:50 most situations, and certainly that is the case for both Bridges and Stampede2, there is

25:00 more than one queue, because the institutions set them up so that

25:09 there may be queues, as I think on Bridges, for nodes that have regular

25:16 memory, large memory, and another queue for GPUs. There may be queues

25:22 for short-running jobs and separate queues for very long-running jobs, and

25:28 yet another queue for jobs using a very large number of nodes, et cetera.

25:34 All of that is handled by the resource manager. And

25:44 never run jobs on the login nodes. Sorry for harping on that so

25:50 much, but it turns out to be a common mistake, so both of

25:56 us tend to stress it. And now, very quickly, a few

26:03 bits here about Slurm. It is a fairly substantial piece of software.

26:10 Last time I looked, it was over half a million

26:16 lines of code that manages the cluster. And on the next slide, I am getting a little

26:24 bit ahead of myself, but it is designed to run on, and has been used

26:32 on, very large systems, with more than 100,000 nodes and,

26:42 you know, millions of threads. If you are not familiar with threads, we will talk more about them

26:46 later, but they are basically execution streams, and parallel codes tend to have a potentially very

26:53 large number of them. It manages a large number of jobs as

26:59 well. It is open source, it is for Linux, and it is available

27:06 in most versions of Linux. So, a little bit

27:17 that is critical to know when you request resources. Yes, sir.

27:31 You have a way of requesting that you are the only user of all the

27:40 resources you request. That is not necessarily the case otherwise. The

27:49 system may choose to, for instance, have other jobs working on some cores of

28:00 the node you are using, because, remember, there are a number of cores. So, in

28:07 terms of Bridges, there are 14 cores on each socket and a total of 28 cores in

28:14 a node, and the system may decide to share some of these cores with

28:21 other jobs. So when you do timing of codes and you want reasonably good,

28:32 stable timings, then the timer will

28:41 count time that is used by jobs. But there is still a lot of

28:45 resource management that goes on if several jobs are running on the same node,

28:51 and potentially even on the same socket. So, in my opinion, it is not an issue

28:58 when you are doing code development or something of that flavor. But if you do

29:02 timing, something you should use is what is known as exclusive mode.

29:15 The rest of the things on a slide like this are, I guess, pretty apparent:

29:19 it is resource management, and it manages shared resources. This is a little bit about

29:27 Slurm. There are various ways of requesting things from the

29:35 resource manager, and Josh will demo them, or some of them. One

29:44 useful thing is to request information. The "s" stands for Slurm, in front of

29:51 "info", so you can get info with sinfo, the information about queues. You can get

29:55 information about the accounts, and that kind of thing.

30:01 What you will definitely use is the srun command. That is basically the way to submit

30:07 a job to Slurm. To manage things, there is a kind of control daemon that runs

30:17 on some management node. And on each one of the nodes that have

30:21 been allocated for the job, there is a local, or compute node, daemon

30:26 that tells the control daemon about what is going on with the job. Then there is,

30:36 coming back to this notion of processors and nodes, actually one

30:46 more thing in Slurm: there is something known as partitions. So, starting with that,

30:53 a partition is a grouping of nodes. As I mentioned, there are groups,

31:01 for instance, for the regular memory nodes and the GPU nodes, so nodes are

31:07 grouped into partitions. So when you submit a job, you submit the job to a given

31:16 partition. There is a fair amount of flexibility in how to partition, or configure partitions, but that is

31:24 not something you will be doing in class. But it is going to be useful to

31:29 know that it is so. Partitions need not necessarily be distinct, so some nodes

31:35 may be part of more than one partition. That matters a bit in terms of, again,

31:41 timing and queuing times. And on the left-hand side of the slide, you

31:50 have these notions of threads, cores, and sockets or processors. A thread is the

32:01 common unit of execution that is managed by the operating system. So it is a

32:09 sequence of instructions that has its own dedicated registers, for instance.

32:21 So Intel sometimes refers to threads in this context as hardware threads. It's not

32:31 me. It's an is normal because no particular harbor, um, connection

32:40 gets kind of partitioned up to some extent among threads, since it has

32:47 resources. Some of them are unique, but not all resources are unique to

32:57 threads. So threads do actually execute on the cores, which are the physical

33:04 entities. And the core has the resources it has, regardless

33:10 of how many threads you want to run on a particular core. When it comes to

33:18 Intel, I think that still, for all of their current products, they

33:28 support up to two threads per core. Sites configure whether they allow what is known as

33:37 hyperthreading. That means running more than one thread on a core,

33:43 but the maximum is two when it comes to Intel. The chip before,

33:48 Knights Landing, I think allowed four threads. Among other silicon vendors,

33:56 AMD also uses a maximum of two threads per core, and IBM, for their Power

34:04 processors, allows more. This was on the microphotograph that I showed

34:12 early on today. Several cores are put on a

34:19 common piece of silicon; that is, processors have many cores:

34:23 14 on the ones for Bridges, and whatever the number was, 24 or something, on Stampede2. The

34:35 socket is, again, the thing on the circuit board into which the

34:44 processor package gets plugged. Now, things are somewhat confusing when it comes

34:53 to the meaning of CPU. When it comes to Slurm, a CPU

35:00 is kind of the least schedulable unit. So when it comes to Slurm,

35:10 a CPU is effectively a thread, whereas typically people refer to a

35:20 processor or CPU as the physical entity that houses the large number

35:29 of cores. But when it comes to Slurm, a CPU is a thread.
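
Taking the statement above at face value (a Slurm CPU is a thread, the least schedulable unit), the number of schedulable CPUs on a node can be sketched as the product of sockets, cores per socket, and threads per core. The function name below is hypothetical, and whether a given site enables more than one thread per core is a configuration choice not stated here.

```python
def schedulable_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Count CPUs in the sense described above: one per hardware thread."""
    if min(sockets, cores_per_socket, threads_per_core) < 1:
        raise ValueError("all counts must be at least 1")
    return sockets * cores_per_socket * threads_per_core


# Dual-socket node with 14 cores per socket, as quoted for the Bridges regular memory nodes.
print(schedulable_cpus(2, 14, 1))  # one thread per core
print(schedulable_cpus(2, 14, 2))  # two threads per core (assumed hyperthreading setting)
```

The same physical node therefore shows up with a different CPU count depending on the threads-per-core setting, which is why the terminology needs care when requesting resources.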

35:40 So again, one needs to keep track of these notions of cores, processors,

35:52 nodes and partitions to deal with the submission of jobs. This picture more

35:57 or less just says that, perhaps a little better, and Josh will

36:07 demo this. So one specifies how many nodes one wants,

36:15 one specifies how many tasks one has in the job, and one may want to specify how

36:27 one wants the tasks allocated to cores, sockets and nodes. Not today, but

36:36 much later in the course, we will talk about why you may want to control how

36:41 the various threads are allocated to sockets and nodes, because there are many shared resources

36:51 that affect the performance of your code, and as I already said, there are

37:02 shared resources starting with the chip, the processor. The cores, as we will

37:12 talk about later on too, always have their own

37:19 execution units. They always have their own registers. And then, depending on the processor,

37:31 some caches are private to the core, but not all of them tend to be

37:37 private to cores. Some of the caches are shared by all the cores

37:43 on the processor. Now, when you run single-processor jobs, then it

37:52 may not matter. Exactly. The capped. I can't but normal it

38:00 But when you run jobs that use many nodes, most schedulers today do

38:11 not really care about how the nodes are connected. But the connection between nodes, the

38:19 network that is used, has an impact on the performance when you use multiple nodes.

38:29 In that case, if your job uses a large number of nodes, or even

38:34 a fairly small number of nodes, the performance may vary depending upon where in

38:40 the network the nodes are located. For that purpose, you can specify the

38:47 nodes you want, if you know which nodes have very good communication capability between them in

38:59 the network. Also, if your code has significant dependence on communication

39:09 between nodes, it may be affected by other jobs running on totally different nodes,

39:14 because the packets they use for communication may interfere with this job.

39:24 then this pointed out, really course notes election there is. And the

39:30 day that shouldn't be. But it turns out that it is not at all uncommon that,

39:36 even if the nodes are supposed to be identical, with the same processors,

39:43 the same amount of memory, the same systems and the same everything, it happens that they

39:51 still run at different clock rates. So you will get different performance, even though

39:57 you would expect it to be the same. So one always has to be,

40:01 ah, conscious that if there is odd behavior, it may not necessarily be

40:07 your code; it may be something else. So this is just

40:17 a little bit about commands, and I will go through a few slides quickly,

40:23 and then I will hand over to Josh to do the demo.

40:29 I already talked about policies and all these other kinds of things,

40:37 so this has been talked about. Here are a few of the Slurm commands you will

40:42 use in particular: maybe the scancel command, in case you

40:50 messed up a little bit and want to kill the job. There is

40:54 info on the to command is strong . Um, there's an issue.

41:00 , we will not. Then we'll of them. But against that,

41:03 on this run and in full definitely. Um IHS again. Information

41:14 can get out of the infocomm man it tends to have the number all

41:20 you got and this ice off the . Physical memory. This Andi,

41:29 of these things have thank you with also. I will not just flip

41:35 through the slides here, because you will see these in real life in the demo. But just a

41:43 few points: on the left-hand side, in the dark screenshot that you see in the

41:49 middle of this slide, it tells you what the partition name is, then the

41:55 availability of that partition, which in this case was up for every one. Then

42:00 the time limit that has been given to jobs, in terms of hours, minutes and

42:04 seconds. It also tells you the node list that has been allocated or reserved for

42:12 the job, and the state more or less tells you whether it is

42:23 running or sitting in the queue. Here is another one from another site,

42:32 command that tells you that's pretty much same thing notices. The different frustrates

42:38 . The acronym is no reflecting in cluster and zero um, there are

42:48 ways of running things interactively, and that is useful for code development. But

42:56 for other things, I personally would encourage you to do batch submission: write the

43:04 script and let the resource manager handle it, rather than having to sit and wait until

43:11 the job runs. Things get done, and then you can go and look

43:16 at the output. Josh will comment on that as necessary.

43:26 The slides are best seen as documentation. The slides, voice and video should also be captured.

43:37 The demo is fairly substantial, so to make sure there is time, I will stop here

43:47 and let Josh do the demo, and then I can resume when suitable.

43:55 I'll figure out where Bristled wants to us just down the devil,

44:03 Should I just go ahead and start sharing my screen? Yeah, I will

44:07 stop sharing. Okay. Okay, can everyone see my screen?

44:18 For the moment, it is a blank screen. Now I can see something.

44:26 So, in order to connect to the clusters, you will need an SSH

44:33 client. On Windows, you can use a client such as PuTTY, or another

44:41 client. On Mac, you pretty much have the SSH

44:48 terminal already in the console. Now, to connect to the Bridges

44:56 cluster, all that you need to put in is your username at

45:05 bridges.psc.edu. This will directly connect you to a

45:17 login node on the Bridges cluster. Or, if you are connecting to the login node

45:25 for the Stampede2 cluster, in that case your command will

45:36 look something like this: your username at

45:40 stampede2.tacc.utexas.edu. So these are the two ways you can get connected to a login node on

45:48 the clusters. For this demo, I will be connecting to the Bridges cluster.

45:55 So I will just go ahead and connect to bridges.psc.edu. Now, when you

46:17 are connecting, just put in your password and you will be connected to a login node. Let me fix the

46:29 font size. Right. So, as you all know, these clusters are Linux

46:42 based. So on the cluster, all the commands that you would generally run on a

46:47 UNIX operating system, you can run from here. The first thing to notice is

46:55 the prompt: you can see that you are on a

47:00 login node, so make note of that. Now, the simplest command that you can use

47:07 to get details of the hardware, for instance what the processors are on a

47:15 particular node, is lscpu. That will give you details of the processors

47:24 available. So it says it is an Intel CPU that has 14 cores

47:33 per socket, and there are two sockets, as we just saw on the slides for this

47:40 cluster. And the CPU itself is from the Haswell microarchitecture family from

47:48 Intel. And then you can see details like the cache sizes and

47:55 how the cores are arranged on this particular node. So that will

48:00 be part of one of your first assignments, to query the CPU, so you get

48:07 to know what kind of hardware you are running on. Great. There are a few

48:14 more commands that you can use for the amount of memory that is available. You

48:21 can use cat /proc/meminfo, and that will give you the amount of memory

48:28 available as DDR memory. So as you can see, there is almost 128

48:35 gigabytes of memory available on the node. And then there are the partitions.
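
The memory query described above can also be done programmatically. The Python sketch below extracts the `MemTotal` field from `/proc/meminfo`-style text; the function name and the sample text are hypothetical, and the sample is not output captured from Bridges.

```python
def mem_total_gib(meminfo_text: str) -> float:
    """Return MemTotal from /proc/meminfo-style text, converted from kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line has the form "MemTotal:  <number> kB".
            kib = int(line.split()[1])
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found")


# Hypothetical sample text for illustration only.
sample = "MemTotal:       131000000 kB\nMemFree:         90000000 kB\n"
print(f"{mem_total_gib(sample):.1f} GiB")

# On a Linux node, the real file can be read the same way:
# with open("/proc/meminfo") as f:
#     print(f"{mem_total_gib(f.read()):.1f} GiB")
```

The reported total is typically a little below the nominal installed memory, which matches the remark that there is "almost" 128 gigabytes available.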

48:41 mentioned earlier. You can also use the command lsb_release -a,

48:51 which will give you information about what operating system the node is running. So those are

49:02 pretty much all the commands that you would usually use here, apart from the commands that

49:10 require administrator rights. You won't be able to run those because it is a shared

49:18 system. Okay, so this comment is not just for this particular thing you see on

49:25 the screen, but in general: when you are putting together a report or a

49:33 paper for publication, you should always be clear on exactly what processor model

49:43 produced the data, as well as the environment in terms of operating system and

49:48 so on that was used for your experiment or for your project, because things behave

49:56 differently also on different operating systems. So this type of information should really be in

50:04 every paper that tells us anything about performance. Unfortunately, that is not

50:11 the case, but keep it in mind. Always find out what

50:19 versions you have and put it into your reports, and papers as well, of course.

50:28 Yes. Right. So those were general commands that you can run on

50:38 the machine. Now, coming to the Slurm commands, there are quite

50:44 a few of them. I will show a few of them that you will likely be using

50:49 for most assignments, and obviously the first assignment as well. So the first

50:55 one is the sinfo command, which is the Slurm command for getting information about

51:01 the partitions. So when you run it, this is the sort of output you get.

51:10 So now on the left side, you will see all the partitions.
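
To make the layout of such a listing concrete, the sketch below parses text in the default `sinfo` column layout (PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST) and sums idle nodes per partition. The helper names, partition names, node names, and counts in the sample are all hypothetical; this is not real Bridges output.

```python
def parse_sinfo(text):
    """Parse whitespace-separated sinfo-style output into a list of dicts keyed by header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def idle_nodes_by_partition(rows):
    """Sum the NODES column per partition for rows whose STATE is exactly 'idle'."""
    totals = {}
    for row in rows:
        if row["STATE"] == "idle":
            name = row["PARTITION"].rstrip("*")  # sinfo marks the default partition with '*'
            totals[name] = totals.get(name, 0) + int(row["NODES"])
    return totals


# Hypothetical listing for illustration only.
SAMPLE = """\
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
RM*          up 2-00:00:00     10   idle r[001-010]
RM*          up 2-00:00:00      5  alloc r[011-015]
GPU          up 2-00:00:00      2   idle gpu[001-002]
"""

rows = parse_sinfo(SAMPLE)
print(idle_nodes_by_partition(rows))
```

On a cluster, the same parser could be fed the text returned by running `sinfo` through `subprocess.run(["sinfo"], capture_output=True, text=True).stdout`, assuming the site has not changed the default output format.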

51:14 So these are regular memory nodes, some are regular memory nodes which have more

51:21 memory than those, then the GPU nodes, and some nodes which

51:28 are Did you use that also support . A. Here is a

51:37 There are large memory nodes, but those are mostly allocated for more scientific,

51:43 computationally intensive jobs that need lots of memory and lots of processing. Following this,

51:51 you can also see the time limit, again as we saw in the screenshot on the

51:56 slides, as well as what the state is of each of these nodes: whether it is

52:01 or if it's training or if it's guest. One of these notes is

52:06 as well as a foul, so can get all the information about the

52:12 that are available on this cluster. The next command you can use

52:19 is the squeue command. So if I run this, it is going to

52:23 be over. This is a Answer yes to Stressful said earlier.

52:29 don't think on there. Uh Our fourth particular on the 700 knows

52:40 can see that the node number ranges are quite different, so they are not likely to

52:45 be next to each other. The resource manager tries to find enough nodes, wherever

52:50 they might be free, when the job begins. Unless you steer it, they

52:56 may be, quote unquote, far apart in the network. To the question: Slurm

53:08 is probably not running on the login node. It is running on a separate set of

53:14 nodes, so it does not interfere with user interactions directly, because it is kind of

53:25 monitoring and scheduling and handling the queue of potentially hundreds of users at the same

53:34 time. So it is not usually configured to run on a login node;

53:39 it runs on separate nodes. That was the question in the chat,

53:50 right? Yes. If this is not answered, just speak up,

53:54 unmute yourself and speak. For some reason I can't open the chat.

54:03 Okay, I am trying to see. So the question was also

54:20 how we communicate with it, more precisely. You communicate using Slurm commands,

54:37 so, well, Josh may want to expand on it, but the srun

54:44 command is what you use to submit a job to Slurm, and then you

54:50 can use the sinfo and squeue commands. But Josh may want to talk more to the

54:55 point of how you interact with Slurm. Yeah, I

55:03 think you pretty much answered it. There are dedicated nodes that run

55:09 Slurm, continuously monitoring the state of each and all of the other compute

55:16 nodes. And then these Slurm commands are submitted to those,

55:22 which provide you with all this information and how to submit the jobs.

55:27 I will get to that in a minute, in the next part. Okay, that was a

55:36 good question. Hopefully that is enough of an answer at this point, and feel free to come back

55:41 with more questions as Josh continues. Yes, just to give everyone an idea of

55:50 idea of how many jobs there are currently: as soon as I enter

55:55 the squeue command, you will see quite a long list of the jobs that are

56:01 running on the cluster right now. Obviously, when you run squeue,

56:06 you don't want to see all of the jobs that are running on

56:13 it; you may want to filter them. So there are

56:17 a few flags that squeue has that you may want to

56:22 use. The first one is the -p flag, which stands for the partition, in

56:29 case you want to see which jobs are running on a particular partition that

56:34 you saw using sinfo. So you can just provide the

56:39 name of that partition and see what jobs are running. You can see the user

56:45 name, uh, what nodes are being used for that particular job, and how

56:49 long those, uh, jobs have been running. You can also

57:00 see jobs from a particular user as well. As of now, I don't have

57:04 any jobs, so I'll just pick any of these user names and, uh,

57:12 provide that with the -u flag to squeue, and that will give

57:16 you all the jobs that are running for that particular user. This is useful when

57:21 you submit a bunch of jobs, like for your assignments, to, basically,

57:29 check the progress of all your jobs and see if any of those have failed

57:33 or not, or what state they are in: whether they have been allocated resources, whether they

57:38 have completed or not. So this command will be very useful

57:45 later on. Uh, right. So when you have to run jobs, in

57:53 the most general terms, you have three ways to submit your jobs to

57:59 the compute nodes. So the first thing to notice is that I'm still on the login node.

58:04 I have a simple hello world program that just prints hello world, in C;

58:15 uh, I already have compiled it. I used the GCC compiler

58:22 that's available here. You can also use the Intel compiler. If you happen

58:27 to choose to use the Intel compiler, you just need to change gcc to

58:32 icc, and you're done. All right. As of now,

58:42 I don't quite foot. You're So right. So, as I

58:53 said, there are three ways of running your jobs. Uh, the first is

58:58 using the srun command. What, when you are on a

59:04 login node, the srun command does is submit your job to the compute nodes, based

59:10 on what parameters you pass to it. So in this case, what I

59:14 do is pass the ntasks parameter, which stands for the number of tasks, or the number

59:23 of instances of the job that you want to run, and give it the executable.

59:31 Now when I do that, as you can see, Slurm

59:35 has given it a job ID and it's waiting for the

59:39 resources to be allocated. Now, as you can see, when you use

59:43 srun, you have to wait until the job has been

59:49 allocated the resources and the job has finished executing, which is not very useful

59:56 if you're trying to do a lot of debugging, or making sure your code is

60:01 working fine, or doing a bunch of testing and that sort of thing.

60:10 Hopefully it gets done quickly. And this is also a good example

60:23 to show you that the resources are shared and you may not get access to the

60:30 resources that you want instantly, so start working on your stuff early. Don't

60:36 wait until the deadline, because it's a shared resource. We're not controlling what's going on on

60:42 it. So it may take a while to get your resources. Uh, no, I

60:49 think I'll just skip it, because, uh, as soon as

60:56 the resources get allocated, you will see another message that your job has been

61:00 allocated the resources, and you will see the output of your program right here on

61:05 the console. So there, it's completed. Uh, so that was the

61:16 first way you can run your jobs. The second is to run your jobs

61:21 by getting interactive access to the compute nodes. That you can do by running

61:28 the interact command, which is what I'm trying to do now. I

61:37 think there was a screenshot of this in the slides as well, so you

61:40 can take a look at that later as well. But the most important

61:46 parameters that you would want to pass to this command are the number of nodes that

61:51 you want, which goes by the capital -N parameter. So, if

61:57 you remember, a node is one full entity, with its two CPUs and a motherboard

62:04 on it. So that's the node. Now, as we also saw on

62:10 the slides, each of these nodes has two CPUs, which have

62:16 14 cores each. So the number of cores that you want to get access to

62:23 you will pass using the -n parameter. And for one

62:28 node you can at most give it 28, because there are 28 cores on each node

62:36 of 26 years or four cores, , uh, on stampede. If

62:41 you try to do that there, because there is hyper-threading enabled on Stampede nodes, you

62:48 can give double the number of physical cores that are available. So that way you

62:54 get access to the cores. For now, I'll just get access to one node

63:04 and one core of it. I don't believe it's busy, so as you saw,

63:11 as soon as I did that, it was waiting for resources, and then it

63:16 allocated the resources. But the important thing to notice is that now our console has

63:21 changed from the login node to a compute node, which is one of the nodes

63:27 in the RM-small partition that was assigned for this session. So that's

63:36 the simplest way of checking whether you're on a login node or

63:40 on a compute node. Now, one thing to remember is, when you use the interact

63:47 command, let's say you gave the number of nodes as two and the number

63:56 of cores as eight, let's say. So in total, you will be

64:01 getting access to eight cores across the two nodes, not 16 cores. It will

64:09 be a total of eight cores. So, if you happen to

64:14 do something like this, then you will only get access to that many of the

64:23 cores. Still, you will get access to only those cores, even if you get two

64:26 nodes. Remember that this is the total number of cores that you're getting

64:32 across the nodes here. Once you're on a compute node, you can

64:39 use srun again. If I do it, you will get

64:49 the output. Now let's see what happens when I increase the number of tasks to more than

64:55 the number of cores that I asked for. You get an error

65:00 from Slurm that you're requesting more resources than you have an allocation for, so srun

65:06 or Slurm will allow you to run jobs only on the resources

65:11 you have been allocated. You cannot run jobs on more than that. Now,

65:17 interact is very useful when you're trying to debug your code and

65:23 do a bunch of testing. What happens when you are done with all the

65:29 debugging and you're comfortable with your code, that everything is working fine? Now

65:34 you have to get a bunch of performance data or any kind of measurements from your

65:39 code. So in that case, you use the third way of submitting your

65:45 jobs, using the sbatch command. So let me just exit out of the

65:53 compute node. So again, as you see, I'm back on a login

65:58 node. So again, the third way of submitting your jobs is using a batch

66:05 script, which you will submit using the sbatch command. Now, these batch

66:09 scripts have all the parameters that you require for the

66:16 execution of your job, and the script will also have the command, or the

66:21 commands, or the sequence of commands that you want to run. Now, when you

66:29 want to submit this script, you can simply use the sbatch command and

66:34 give the name of your batch script. That doesn't necessarily have to be batch.sh;

66:38 you can name it whatever you want, and just simply do

66:44 that. If you go and check using squeue, you will see that this is

66:51 my job. Its state is set as PD, which stands for pending.

66:57 I requested only one node, as you can see in the batch script;

67:02 this was the name for the job, which we provided here,

67:06 and so on, and which partition we used. So let me check again whether

67:11 it's still pending. Uh, it's still running. But when

67:22 it is done, what you will see in the same directory that you

67:27 submitted your job from is another file, which will be named something like slurm,

67:35 most likely with the job number or something. It will be named something

67:40 like that. So when you open that file, you will have the output of

67:44 your program. So what that means is, as I said, after interact, once

67:48 you're done debugging your code, you submit your jobs using sbatch. You

67:52 don't have to wait for your jobs to finish. You just submit it and go away

67:57 and do something else. And when you're back, hopefully your job will be

68:00 done, and the output will be in one of the files. So, uh, I'll

68:07 pause for a second and see if there are questions. Okay, now, that

68:15 for me. Okay? A few is everyone wants last depression.

68:36 So, uh, here are the contents of the batch file.

68:42 Someone else asked, can you show the contents of the batch file. So yes,

68:48 also, I will post some samples on Blackboard

68:55 as well. You don't have to worry about noting everything; everything will be

69:08 posted. Okay, right. So, the final thing that I want to show.

69:16 Oh, there we go. That job finished. So as you can see

69:20 here, this file here is what came out, and if you just open it,

69:26 you will have hello world four times, because we asked for four tasks and just

69:32 one job to run. It was using the srun command. That's the third way.

69:37 Now let's say your job has been running for too long, and now you

69:42 want to make some changes to your code and don't want that job to

69:45 run any more. You can use the scancel command and just provide the job

69:53 ID that you get using the squeue command. Provide that, and your job will be

70:03 cancelled. Right, so I believe those are most of the commands that everyone will

70:09 be using in pretty much all the assignments. Now, coming to the module

70:16 command: module is a package manager, in some sense. So when

70:24 you have, let's say, a library that's installed on the cluster that you want to

70:29 use, you need the module package manager to load it

70:36 for your use. So first, to see what modules or packages are available on

70:42 the cluster, you can simply use module avail. When you do that, you can

70:49 see all of the packages that are available on the cluster. You can see Python

70:53 packages. You can also see these compilers over here, and also

71:02 all different kinds of packages that we can see here.

71:09 If you want to check what modules your account already has loaded by

71:17 default, you can use the command module

71:25 list, and that will show you all the currently loaded modules for your account. For

71:32 some of the assignments, you may need to load some new modules. Let's say

71:38 one that we'll use, VTune, which is a profiler from

71:44 Intel. If you want to load that module, you just go ahead and type

71:48 module load vtune. If you want to be more

71:52 specific, you can also give the version. But if you just give the name of

72:00 the module, the most up-to-date version is loaded; here it was 2019 06, I think. In the

72:06 same way, you can also unload a module with module remove. Now

72:15 note that many times, if you happen to unload a module, uh, on which there

72:26 are dependencies, the other modules that are the dependent modules will also be

72:31 unloaded. So make sure you don't unload the modules that you actually want

72:37 to use. Because if you happen to unload modules that you want

72:43 to use and go ahead and try to run your code, they would most

72:46 likely not be found. Now, this is kind of a common mistake: someone

72:57 logs in and forgets to make sure that the modules are loaded, so things don't work

73:05 properly. So I will try to provide the correct versions for most things

73:11 again, to run the codes for the assignments. But yeah, if

73:16 you see any weird error while running, make sure you have the right modules

73:21 loaded. That's pretty much it, I guess. Any questions from anyone?

73:42 . So stop shooting now. yes. So no guns in freighting

74:10 questions. So, there are some more slides that are not covered by

74:20 the demo. So these slides... There's one question in the chat.

74:27 Okay, now it disappeared. Do we have access to the cluster after creating the account

74:34 on XSEDE? No. So you have to create your account on XSEDE, and

74:39 send your user name to the professor and me, and we will add you to

74:45 the class allocation on the clusters. You can't run jobs until you have been added to

74:51 the allocation for the class. So there's one more step, beyond getting the account,

75:05 that needs to be taken before you can start to run codes. All

75:20 right, so I think we're just about out of time, which is fine,

75:26 because most of this, like scancel, was already demoed. And I

75:34 don't know if you wanted to say something about the scontrol command, but,

75:45 as, you know, was demoed, try to use scripts once you've

75:51 got the debugging kind of done, and use the sbatch command to submit things,

75:57 as I think Joshua touched on. This is just if you want

76:01 to learn more about Slurm; it's not something you need for, um,

76:07 the assignments. Um, and here there is a slide that has

76:17 a number of useful URLs on Slurm. It's used on many clusters and is open-source

76:23 software. Slurm, um, has been around for a while. But there are also

76:30 a lot of large data centers with very nice pages that provide kind of good, useful

76:39 information about Slurm. So there is a slide that gives you links, uh,

76:49 that talk about the module command, and a few slides again that support what

76:55 Joshua talked about already. And there's also a slide showing a little bit that would

77:01 be useful for assignment one or two, if I remember. It

77:08 shows how you get processor and software information, which Joshua demoed,

77:17 so I think these are all things that were captured by what you talked about.

77:27 So that was it. Yeah, maybe I want to make some comments about timers.

77:33 As I said, that's a tricky thing. Usually, for your assignments, the best

77:45 option is to use a timer of clock cycles, sometimes called ticks.

77:55 That's the most precise measurement you can get, whereas time of day, or wall clock time, is

78:03 not quite as exact. I don't remember whether there was one on this slide

78:12 or not. But one of the things that is here about timers is this:

78:20 clock ticks should be fine. If you do something else, you need

78:27 to be aware of the resolution of your timer. Many

78:40 of the examples you will do are sufficiently small that the execution time may be

78:50 less than the resolution of the clock. Then all the time information that the timer

78:56 you use reports is effectively nonsense. So the resolution of the timer compared to the

79:06 run time of the code is something you really need to have a good understanding

79:13 of. If you use a clock tick count, then you don't need to

79:19 worry about it. But anything else is likely to give you a lot

79:26 of, kind of, misinformation. So, Joshua, I don't know;

79:30 you may want to comment more on it. Right. There are a few timing libraries

79:43 in C; I'll likely provide some samples on Blackboard as well. But make sure

79:54 you use, as was just said, a timer that has enough resolution. Another thing

80:03 to keep in mind is, as was being said, that the wall clock time

80:10 may include more than the time your job is actually running; it

80:18 may also include the time that the job was waiting for resources, because you will

80:24 be running jobs in a shared environment. So you should use timers that report

80:31 the actual CPU time for your job. I have seen a couple of

80:40 cases, I believe last year, where some students' jobs had

80:46 really small execution time, so they didn't get the correct timing measurements. The simplest

80:54 way to get around that is to have your program run for quite a few

81:01 iterations. At least that gets the total time for multiple iterations of your job

81:10 within the clock's resolution, and then you try to calculate the average time per iteration of

81:18 your job. Right. So I guess that was a couple of slides down.
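
A minimal sketch of the timing advice above, written in C because that is the language used in the demo. It is an illustration under stated assumptions, not the course's sample code: it assumes a POSIX system that provides `clock_getres` and `clock_gettime`, it uses operating system clocks rather than the cycle counter recommended in the lecture, and the names `clock_resolution_seconds`, `clock_now_seconds`, `average_seconds` and the placeholder `kernel` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime/clock_getres */
#endif
#include <assert.h>
#include <time.h>

/* Resolution of the chosen clock in seconds, as reported by the OS. */
double clock_resolution_seconds(clockid_t id)
{
    struct timespec res;
    int rc = clock_getres(id, &res);
    assert(rc == 0);
    (void)rc;
    return (double)res.tv_sec + (double)res.tv_nsec * 1e-9;
}

/* Current reading of the chosen clock in seconds. */
double clock_now_seconds(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical stand-in for the code being measured. */
static volatile double sink;
void kernel(void)
{
    double s = 0.0;
    for (int i = 0; i < 1000; i++)
        s += (double)i * 0.5;
    sink = s;   /* volatile store keeps the call from being optimized away */
}

/* Run work() iters times and return the average seconds per iteration.
 * Timing many iterations together keeps the measured interval well above
 * the clock resolution even when one iteration is very short. */
double average_seconds(void (*work)(void), long iters, clockid_t id)
{
    assert(iters > 0);
    double start = clock_now_seconds(id);
    for (long i = 0; i < iters; i++)
        work();
    double stop = clock_now_seconds(id);
    return (stop - start) / (double)iters;
}
```

Calling `average_seconds(kernel, 10000, CLOCK_MONOTONIC)` measures wall clock time, while passing `CLOCK_PROCESS_CPUTIME_ID` measures CPU time consumed by the process, which is closer to what is wanted on a shared system. If the total interval (average times iterations) is not much larger than `clock_resolution_seconds` for the same clock, the iteration count should be raised.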

81:24 And hopefully you will see this slide; I'll point this out again throughout the

81:29 course. It is to stress that you need to figure out how to get an

81:34 estimate of what the running time should be if you have a good code. So

81:40 for that you need two things. You need to understand what the workload is,

81:46 that is, how much work there is to execute, whether it is memory references

81:51 or logic operations or arithmetic operations, so some notion of the workload. And then

81:59 you need to have a good understanding of the capabilities of the platform you're using.
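
The two ingredients just named, workload and platform capability, can be combined into a rough expected run time. The sketch below assumes a dense matrix multiply as the workload (about 2n^3 floating-point operations for n by n matrices) and takes the platform's peak rate as a parameter; the function names and any peak value passed in are hypothetical placeholders, not figures from the lecture or for any particular cluster.

```c
#include <assert.h>

/* Approximate floating-point operation count of a dense n x n matrix
 * multiply: n^3 multiplications plus about n^3 additions. */
double matmul_flops(double n)
{
    return 2.0 * n * n * n;
}

/* Optimistic run-time estimate: work divided by peak rate.
 * Real code is usually slower, so this is a lower bound to compare
 * measurements against, not a prediction. */
double expected_seconds(double flops, double peak_flops_per_second)
{
    assert(peak_flops_per_second > 0.0);
    return flops / peak_flops_per_second;
}
```

For example, `expected_seconds(matmul_flops(1000.0), 1.0e10)` uses an assumed peak of 10 GFLOP/s and gives about 0.2 seconds. A measured time far above such an estimate suggests the code is inefficient, and a measured time far below it suggests the timer is not resolving the run.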

82:06 So that's what you need in order to have a reasonable expectation for how long it is

82:11 going to take. And that also helps avoid some of the pitfalls we talked

82:16 about. You know, you may take a matrix that is, say, a 1000

82:22 by 1000 matrix and multiply two of them, and that may be not enough

82:31 work to be resolved by a clock if you don't use, again, something that counts cycles

82:38 or clock ticks. Current processors are pretty fast, so something like a 1000 by 1000

82:46 problem is a very small problem. So this is what I'm trying to do

82:52 on this slide too: again, to stress that you should know what to expect when measuring your

83:03 execution time. But I will point out what to use again when

83:07 it comes to the assignments. And the time is up, but the rest of this

83:13 deck today has more. It's not technical; it's basically advice on what you should

83:19 do to put together a report. Since you do timing, your work will probably produce

83:24 quite a few numbers, and you should set up some tool, have some script

83:32 that will run, gather the numbers, and organize them. So try to make sure that

83:38 you can use tools to process the data. There are some tips in these slides that

83:48 follow the discussion of the clusters and how to use the clusters in this deck for

83:57 today. So, Joshua has been exposed to it both in class

84:04 and in his own projects: you are going to need tools to process the timings, just because of the amount

84:09 of the data. So I don't know if you have suggestions. Well,

84:16 as I said, when you're submitting for the first time, to make sure that everything

84:22 is running correctly, just do some simple runs. Once you're confident, figure

84:30 out how to use scripts to automate data collection, because you don't want to sit for

84:37 multiple hours just collecting data manually. Write a script that gets all of the
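
One way to make timing output easy for scripts and tools to process is to have the program emit one machine-readable record per run. The sketch below formats a comma-separated row; the column layout and the names `timing_csv_header` and `format_timing_row` are assumptions made for illustration, not a format required by the course.

```c
#include <assert.h>
#include <stdio.h>

/* Column names matching the rows produced by format_timing_row(). */
const char *timing_csv_header(void)
{
    return "n,iterations,total_seconds,seconds_per_iteration";
}

/* Write one CSV row into buf. Returns the number of characters written,
 * or -1 if the row did not fit in the buffer. */
int format_timing_row(char *buf, size_t size, long n, long iters,
                      double total_seconds)
{
    assert(buf != NULL && size > 0 && iters > 0);
    int len = snprintf(buf, size, "%ld,%ld,%.9e,%.9e",
                       n, iters, total_seconds,
                       total_seconds / (double)iters);
    return (len >= 0 && (size_t)len < size) ? len : -1;
}
```

A program would print `timing_csv_header()` once and then one row per problem size, so that the output file written by each batch job can be concatenated and loaded directly into a spreadsheet or plotting tool.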

84:43 that you, I think when my disappeared. So the various unused yourself

84:59 as questions also, and some shrapnel that was your time is up.

85:09 closed the session himself. Shares That's what. Okay, now,

85:35 stuff. According so
