© Distribution of this video is restricted by its owner
00:02 Professor Guan yin chang from computer science , give us a guest lecture on

00:09 and why that is important for researchers like us. His research area is

00:16 So um in this presentation we will only get to learn about visualization,

00:20 if you have any questions about PhD towards the end, I think there

00:25 be a few minutes to ask these as well. Professor Groningen will not

00:30 staying for the full An hour and half hour, 20. So we

00:34 to wrap up from 1:50 or so. So with that, let's welcome Professor

00:40 . Thank you. All right, you. Thank you for the

00:45 I'm glad to be here today to you a few things about visualizations.

00:51 insurance, I try to condense four , spreads around four lectures of my

00:57 into this one concise presentation, so hopefully I get it done properly. So I

01:06 try my best to introduce what is , what is important and then I'll

01:11 about some cognitive properties of our perception that will be important to know

01:19 we work on visualization and a few that you may find handy when you

01:25 to create effective plots and charts for tasks, and if I have time,

01:32 will briefly talk about how to properly colors in visualization. So for

01:38 so please bear with me if the are slightly less organized because you know

01:45 try my best to put them into . Mhm. But so we start

01:50 the question why visualization is important and give you a simple a couple of

01:58 . Right. Many times we do is try to convey a message story

02:05 my useful information from data. Here's one example I have. This

02:10 data block contains some integer numbers. can imagine this could be some temperature

02:17 in space. Right. If I you, can you see any patterns

02:22 this representation? Would you be able tell me that anyone may be?

02:39 right down corner is a little bit . Alright. So yes you can

02:44 from left to right and actually upper to lower right. The values increase

02:52 . Right. But in order to out these patterns, what you actually

02:57 , what did you actually do in brain? You march through those numbers

03:02 by one and compare them in order to figure out those values, those trends.

03:07 doable. But if I give you huge data block with billions of

03:14 do you think you still can do in a finite amount of time?

03:18 not. Right. Okay. Let's it in a different way rather than

03:23 just compare the numbers. How about do some change. Alright. I

03:28 colors to the individual cells where values are measured, and then I create

03:37 new representation like this. Do you see the patterns immediately? I guess the

03:43 will be yes. Right. You the patterns immediately. But you don't

03:48 know where it's low wages, But you see there's something and you

03:53 even need to read the actual values those numbers. Right. Okay.

03:58 that's one example demonstrating how much more effectively visualization conveys information than the other channels.
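
To illustrate the point of this first example, here is a minimal Python sketch of the idea of assigning a shade to each cell instead of showing its number. Everything in it is an assumption made for illustration: the function name `shade_grid`, the ten-character ramp, and the made-up grid of integers are not from the lecture, and the lecturer's slide used colors rather than text characters.

```python
# Hypothetical helper: map a grid of numbers to shade characters so that
# low-to-high trends are visible at a glance, without reading each value.
SHADES = " .:-=+*#%@"  # light -> dark; an arbitrary 10-step ramp


def shade_grid(grid):
    """Return one string per row, with each value replaced by a shade."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a constant grid
    lines = []
    for row in grid:
        # Normalize each value to an index into the shade ramp.
        chars = [SHADES[int((v - lo) * (len(SHADES) - 1) / span)] for v in row]
        lines.append("".join(chars))
    return lines


# Made-up integer readings that increase from upper left to lower right.
grid = [[r + c for c in range(8)] for r in range(4)]
for line in shade_grid(grid):
    print(line)
```

The printed rows get darker toward the lower right, so the trend can be seen without comparing the numbers one by one.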

04:08 . Here is another example, I have this table with some information

04:12 the percentage of fat within a few of participants. Alright. My first

04:20 to you is how many groups are studying? Can you tell from the

04:38 ? Okay. Mhm. Mhm. know I mean to female, any

04:50 any other answers? Mm hmm. . Or eight. Alright. So

04:59 see how confusing this table is for that question. Right? There were actually

05:04 groups, two groups of males with income, Two groups of females with

05:10 income. And then we try to how the percentage of fat of those

05:15 of people change over time. So first measured them when they're younger,

05:21 they were younger, right? 65 or younger. And then after several

05:26 , measure that again. So that's this. 100%. Right now,

05:30 question will move to, Do you any outliers among these four groups or

05:39 actually change the same over the years these four groups of people? I

06:00 . No, for example, no . I mean like if I tell

06:08 there's one group that behaves differently than other three, which is true.

06:16 that be Female, Right? So is the same group for each

06:31 It's just We measure them in different of their lives. So you're

06:39 Alright, So this group behave So basically all the other groups of

06:44 of fat drop over the time, this particular group of people the percentage

06:51 fat increase over the year. Again, like the previous example,

06:56 can get this information by just spending time reading those numbers and

07:02 But is it effective enough when you try to tell the story to your

07:06 ? Probably not. But if I into visual representation like this.

07:13 so, first the number of it's very clear Before four lines.

07:18 . And the change of that particular that is different from the other

07:22 It's also easily perceivable. All Rather than just going over the

07:27 you can just easily get this information from the chart. All right. All

07:32 . So, these two examples demonstrate how visualization can help convey information
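
As a rough sketch of what the line chart makes visible in this second example, the Python snippet below computes the change for each group between the two measurements and picks out the group that moves against the others. The group names and all numbers are invented for illustration, since the lecture's table is not in the transcript, and `odd_one_out` is a hypothetical helper.

```python
# Invented numbers for illustration only: (first measurement, later measurement)
# of fat percentage for four groups. These are NOT the lecture's values.
groups = {
    "group A": (30.0, 27.0),
    "group B": (28.0, 26.5),
    "group C": (31.0, 34.0),
    "group D": (29.0, 25.0),
}


def changes(data):
    """Change between the two measurements for each group."""
    return {name: after - before for name, (before, after) in data.items()}


def odd_one_out(data):
    """Groups whose direction of change is in the minority."""
    deltas = changes(data)
    rising = [name for name, d in deltas.items() if d > 0]
    falling = [name for name, d in deltas.items() if d < 0]
    return min((rising, falling), key=len)


for name, delta in changes(groups).items():
    print(f"{name}: {delta:+.1f}")
print("behaves differently:", odd_one_out(groups))
```

On a line chart the same information shows up as one line sloping the opposite way from the other three.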

07:39 Of course visualization can do more things we'll see. Right, this is

07:43 famous saying. Alright, visualization is about external cognition. What is the

07:53 ? All right, let me finish . I'll go back to this

07:55 Visualization is really about external cognition. That is how resources outside the mind can

08:03 used to boost the cognitive capabilities of the mind. Alright, I have also

08:13 same. So what visualization actually does is try to reveal changes and patterns

08:19 in the raw data made invisible we have different things around us.

08:26 can feel it, we know they're but we cannot see it without seeing

08:30 . It's really hard to understand. . Visualization can also make abstract concepts

08:35 intuitive to understand like some mathematical concepts: limits. What does limit mean

08:42 Oh sorry, no chance um means and probably tensions of a curve or

08:49 . What does that mean? without a visual presentation. Those concepts

08:53 not be easy to understand but let go back to this. What does

08:58 mean? It's that it's really external cognition: how resources outside the mind

09:05 be used to boost the cognitive capabilities of the mind. So what cognition

09:11 what does it do anybody? So means that we try to understand

09:27 We try to learn something this cognition . All right, okay, so

09:33 means that visualization is an external boost help us understand things more effectively.

09:42 , and here is another way to what it's visualization. It corresponds to

09:50 means to enable a user insights into data while visual representation right inside into

09:59 data can mean the understanding of the behind the data might useful information or

10:06 things. Okay, Alright, so what visualization is trying to do.

10:13 goal is trying to boost our cognition so we can see things that previously

10:18 couldn't. All right. That said saying probably should emphasize more about visual

10:27 resources. There are different resources that can help us boost our cognitive process. Speaking

10:32 that another way to boost our cognition . So allows us to get into

10:37 data to know more about the data data mining, right? That's another

10:44 of techniques that allows us to get the data and help us understand the

10:49 . Data mining. Right. Can tell me the difference between data mining

10:55 visualization? You can get some hints the two logos I put here I

11:01 a machine on the data mining side, a human on the visualization side.

11:13 I guess visualization makes the data more to humans with the mining we are

11:23 large amounts of data that the computer easily process but it's harder for us

11:28 make sense out of it. Thank you very much. Alright so

11:33 mining mostly uses machine power to process large amounts of data that are beyond our

11:39 to process. Remember the first example show you if I have billions of

11:43 points, how can you ask a human to process it effectively? Alright, machine

11:49 do it quickly, in parallel, right? Data is about automatic algorithms to help us

11:56 useful information from the data. In contrast, visualization actually utilizes human expertise and

12:05 effective representation of the data to help make decisions. This decision can be

12:11 simply extract patterns or change or it be bigger like Okay, based on

12:16 I know and what the data showed and make a critical decision.

12:21 Is it that for instance in the diagnosis, is it a tumor or

12:26 ? Right? This you cannot completely on machine, right? You don't

12:30 to put your life on the You want experts, Okay to be

12:34 your side. Right. So these different sets of techniques for camps are

12:40 competing each other. They are actually each other like that. Students

12:44 data mining can help us process large of data where we cannot, but

12:49 extracted information may not be easily So how can we actually see that

12:55 understand it? We use visualization to the information extracted from data mining.

13:02 . On the other hand, visualization need data mining to pre process some

13:07 . So we don't show anything. don't show anything. Okay. We

13:11 show those important things to help experts narrow down what should be focused

13:18 Right? So these two are not each other. They complement each

13:23 Alright. We're important. Alright, go to here: what visualization can

13:29 do. There are three major functionalities or tasks that visualization can help us

13:37 It can help us present information or a story in the most effective and intuitive

13:43 to the target audience, but that's when we actually write our reports or

13:50 a presentation to our boss or We always include charts and graphs that

13:56 easily understandable, not just tables or . Right. Present information in an

14:01 way. Second, analyze data to verify or falsify a hypothesis. Right? This is

14:09 . This task can also be done data mining techniques depending on the

14:14 but visualization can also help with Right? If your verification and falsification

14:21 purely based on some geometric setting in right then this will be easy to

14:29 and process based on the knowledge of experts. Okay, analyze the data

14:35 the biggest task that visualization can help is to explore the data where we

14:41 not know well to look for useful . Okay. We do not know

14:46 is in the data, what can useful or what cannot. So,

14:50 typically is the first choice you probably try to discover some useful

14:57 or change. So you probably form an initial hypothesis. So this goes backward

15:03 before that you probably don't even have hypothesis. Now with visual representation,

15:08 may have some hypothesis to say, there's some interesting patterns. Maybe

15:12 Alright. These patterns may be similar something that I have seen and be

15:19 using some established machine learning or data techniques and then I can try some

15:25 mining and machine learning techniques. They express those parts. All right.

15:28 then I verify that. Alright, explore something that's not known to discover

15:35 knowledge and findings. Right. Those the three main tasks that visualization can

15:40 us. Right? So those are basic stuff about visualization. Why visualization

15:45 important because it allows us to get the data effectively using our knowledge and

15:52 the effectiveness of our visual perception channel we'll see next. Right? What

15:58 can do? It can help us , can help us analyze and can

16:02 us explore. Right? So those the basics about visualization. Any

16:12 No. All right, now let's move to the second topic. The cognition, perception

16:18 things about ourselves. Right? We to know some of those unique properties

16:25 with our visual perception channel because visualization all relies on our visual perception to

16:33 the generated visual representation and understand what is going on. So we need to

16:39 some properties important properties of our visual channels in order to be able to

16:44 effective visual presentation to present our Right? This shows a very very

16:53 , high level pipeline of visual perception to cognition. So first we have

17:01 visual stimulus. Light colors, right? Those are visual stimulus but

17:09 starts with light without light we see . Okay, those visual perception visual

17:15 will be perceived by eyes. eyes is a very very complex

17:19 So we have lens. We have . So the visual stimulus signal will

17:24 through the lens and form some upside down at the back of the eyeball.

17:28 which will be the retina, and the retina will be connected to some of the

17:34 throughput channel fibers. The neural network passes the signal to the back

17:42 our brain, where the cortex handles the signals. Okay. When we when

17:48 signal reaches the back of our the cortex, the part of cortex we

17:53 seeing things. We see shapes, , sizes, texture, orientation and

17:59 Okay then we add onto those geometry objects. So this is the visual

18:06 process. Right? It's a signal signal capture and processing process. After

18:13 perceive things we see things now we thinking what we are seeing this.

18:19 enter into the cognition process. It's a process of trying to understand things that

18:26 perceive. Alright, so this cognition will be applicable to not just the visual

18:32 channel, It can be applied to sense channel of us. Right?

18:37 have hearing, we have taste, have feelings. So anything we feel

18:42 go to the brain. We starting was the thing? I'm what is

18:47 thing I'm touching? What is the that I hear? Well it's from

18:52 . Once we start thinking of questions like this we enter into the

18:57 process. We process information we So the same thing happened to the

19:01 perception once we perceive it we start what they are. What are they

19:06 now. All right. In order understand or interpret those things. We

19:11 to access our long term memory. . So I post here so what

19:17 stored in our long term memory that need to access to help us understand

19:24 we perceive like visually or through hearing, through taste or touching. What is

19:31 ? What are those things that are in our long term memory? Do

19:43 mean senses long term memory since it's senses shortened? We sense it.

19:50 it. Right? When we touch his surface with the extra with withdraw

19:56 finger. Right? That's it. how why do we know it hurts

20:05 ? Of course basic instinct. But understand the things that we see,

20:12 need to access long term memory. instance if we see an animal we

20:19 thinking what animal it is. Then know okay, it's a cat or

20:23 dog. So we need to access term memory. What is that?

20:29 would be like experience. Thank Right. It's right here, it's

20:35 knowledge experience we have learned. If you never know what is the

20:42 , what is a cat? Even animal is right there. We will

20:45 be able to recognize it. So the cognition process actually needs to

20:52 long term memory. It's a rich of all those entries that register in

20:57 brain. Okay. Alright, so is the entire process of how we

21:02 things and how we process what we . Alright. How we understand what

21:06 see. But to summarize visual perception only these two stages, right?

21:12 then later this if it's effective then can quickly trigger the long term memory

21:19 then help us understand it quickly. ? This is related to the effectiveness

21:23 the visual representation. But again, is the visual perceptions. Alright.

21:29 we perceive things visually. After understanding we perceive things, we need to

21:38 some important properties that may be useful generating visual representation. So next time

21:45 you try to generate some visual you may need to go back some

21:49 these properties to think. Should I this or should I should not.

21:55 , Let's start with two short Okay, let's just follow the instruction

22:03 complete the task. Try to count many times the players wearing white.

22:10 the past people. Only the Alright. How many times did you

22:44 ? 13. 15. Alright. is the correct answer. All

22:52 Next question will be surprising. Did see the gorilla in the video?

23:01 , I did not know. Let's watch the video again. Now

23:06 have the hints. Ah Now you it right? But if you're focus

23:19 fully occupied by the task that is to you, there's certain details in

23:27 visual, you may ignore. All . And let's see another video.

23:40 , let's watch this. Yeah. you see that? Yes, but

24:03 is not the same person. Somebody pay attention. Now after the

24:07 video it's not the same person. not the same person. They post

24:25 classes but and the long hair but have different shirts on. Okay.

24:37 . So that's the property that want to highlight: change blindness.

24:43 this can be significant. Right? so they did this. We need

24:50 pay attention to this unique property. huh. That says human needs a

24:57 of attention in order to capture the changes and the details, right, through

25:02 above example. Especially when the changes over time. Alright. We watched

25:07 video frame by frame but sometimes we short-term memory. Alright? And there

25:12 a lot of information to process, ? We process it sequentially. But

25:17 perceive the information in a parallel We'll see that right? We perceive

25:21 those pixels at the same time. problem at all. But whether they

25:27 make an ink in your mind, a different story. So sometimes they

25:32 okay if you know what things you to you pay attention to right?

25:36 the first video I asked you to attention to how many times the people

25:41 white pass the ball and then you attention to those things and other things

25:46 it's in the same frame you Right? So those are the property

25:50 need to pay attention that said if the data or the story of the

25:55 you want to convey. Need to changes, or differences, or anomalies in the

26:02 that you need to find a way actually highlight it visually. Okay.

26:07 let the viewer to look through your presentation to find out things that are

26:14 . Alright. Make those differences pop .

26:17 There is some important property that we to make things pop up. So

26:22 following three already showed up right. another property that you should utilize is that our

26:29 perception system is good at observing relative differences in space, relative difference in

26:37 not over time because over time things I said we look at frame by

26:42 , this is short term memory when time passed the framework changed.

26:47 But the relative difference in space we scale it long enough so we notice

26:52 and it's easy to be drawn to boundaries of different regions and objects.

26:58 , so this property people already use to show clusters. Alright, regions

27:04 have different meanings that we actually use colors or different geometric highlight or shading

27:11 highlight them because the boundary between these even they are small. Okay,

27:16 can see that right? And another that this property can lead to or

27:22 should utilize can be showing this example say I have two values. I

27:27 to compare their values which one is

27:33 . And you can just compare the of values based on your knowledge.

27:39 if I had many many values and want to compare them effectively

27:47 So I tried to use visual So here I use two bars the

27:51 of the bars to represent their actual values. So this is the first

28:00 . Okay. First visual representation. attempt. Can you actually tell which

28:05 is bigger to me? It seems are. Yeah they're the same.

28:20 difference is so subtle. All right now do a better job. Okay let's

28:24 okay if A and B correspond to percentage values. So we use 100%

28:30 the silhouette, the outline and then try to fill up their corresponding

28:36 Now it was slightly better right? on the amount of the white

28:42 the size of the white region we tell. Okay. Be it

28:47 Alright so this is an improvement. the most effective way that if I

28:53 the property that I just mentioned, humans are really good at figuring out the relative

28:58 . I would do this: properly align them with respect to a common baseline or

29:06 line. Alright so this is we Weber's law. Okay there is another
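
A small numeric sketch may help show why the framed bars are easier to compare. Weber's law says that whether a difference is noticeable depends on its size relative to the magnitude being compared, not on its absolute size. The Python snippet below applies that idea to the filled and unfilled parts of two bars drawn inside a 100% outline. The values 92 and 94 are hypothetical, and the helper `weber_fraction` (which divides by the smaller of the two magnitudes) is an assumption made for illustration, not something taken from the lecture.

```python
def weber_fraction(a, b):
    """Relative difference between two magnitudes: |a - b| / min(a, b)."""
    return abs(a - b) / min(a, b)


# Hypothetical percentage values for bars A and B (not from the lecture).
a, b = 92.0, 94.0
frame = 100.0  # the 100% outline drawn around each bar

filled = weber_fraction(a, b)                    # compare the filled parts
unfilled = weber_fraction(frame - a, frame - b)  # compare the white remainders

print(f"filled parts differ by   {filled:.1%}")
print(f"unfilled parts differ by {unfilled:.1%}")
```

The absolute difference is the same 2 points in both comparisons, but relative to the small white remainders it is far larger, which is consistent with the white regions being easier to tell apart.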

29:14 property of our human perception that you be aware of. So this to

29:19 which one seems longer to you. left one. The left one.

29:27 vertical vertically aligned one. But you what I'm going to tell you.

29:35 ? They actually have the same But you feel this vertically aligned or

29:42 out table longer. This is we call vertical dominance in our visual

29:50 . So how can we utilize it our visualization or how should we avoid

29:58 kind of situation when we visualize So I go back to this situation

30:05 ? If I want to compare to right? And the two numbers,

30:09 very, very similar. Alright. I put a vertically and I still

30:15 the bar if I put A vertically, B horizontally. Would you still be

30:20 to effectively tell their values difference which is bigger? Probably not.

30:25 Because of the vertical dominance. So said if you want to compare the

30:31 right? And you use certain visual aligns your visual representation in the same

30:39 are either horizontally or vertically or whatever angle but they have to be oriented

30:47 . Do not a random differently. , that's one important thing.

30:53 so here is another thing. You pay attention when you create the visualization

30:58 of those things. I show you far it's really high level. It

31:02 talk about specific visualization techniques, it's properties that you should be aware of

31:08 when you produce any type of visual . Alright, let's go to the

31:13 what you see when you see a , depends on what the thing

31:17 So, this is the first what you see the sink earth depends

31:23 what you know about what you are . All right. Any interpretation,

31:29 tried to help us understand these two . It echoed back to the visual

31:41 pipeline that I mentioned in the Yes. So what you see when

31:48 see a thing depends on what the is. So if I have to

31:55 a word for this sentence, the correspond to what fact fact.

32:06 this is fact. What you see sink us depends on what you know

32:11 what you are seeing. This is , opinion. All right. Now

32:19 , can anybody tell me why this important in visual visualization, anyone.

32:34 basically if it is an alien who and looked at these things, they

32:39 basically not find why we would find particular images so unusual. Right.

32:47 huh. Yeah. Yeah. Thank . So that said when we create

32:55 , we have to think about the background of your target audience.

33:02 so your visual presentation should be tailored to what your audience knows. Do

33:09 include things they are not familiar If you have to provide sufficient

33:15 Legends or captions to help the audience through your visualization because if you leave

33:23 room for guessing, you know what happen. Right. People have all

33:28 of interpretations. Right. One simple example is you know, have you

33:34 gone to the art and looked at some paintings?

33:41 Right. And then, you the the visitors looked at the same

33:47 . We will read different things. right. How many times when you

33:50 look at the painting? And then of the people next to you

33:54 oh yeah, I can see wow, this is cool. And

33:58 you see em I see nothing. ? So, this is really purely

34:03 on the knowledge, right? If never seen this kind of painting style

34:07 you say, oh, this is mess up. Right? So be

34:11 . All right, This is a , very important things. You have

34:14 keep in mind when you prepare your presentation. Right? Now, we

34:20 to the single most important property of visual perception that we utilize a lot

34:26 racial science. Okay, now you back to this as the summary.

34:30 let's see what is pre Attentive. attentive means that we process things very

34:37 in a parallel fashion, right? a large scale task, this pre

34:43 property allows us to process it. is a number that's based on some

34:48 study perception. Study Alright, Alright. I didn't decide the

34:53 All right. So, if you process it within 500 milliseconds, we

34:59 , okay, this is pre We can process it right? If

35:03 individual project approach object, we can it in less than 10 milliseconds attended

35:09 that we need longer time to process and the process is typically sequential.

35:17 so our visual perception actually it's a attentive process Alright because we perceive things

35:24 at once. We see the image at once. All the pixels get

35:29 our brain. You pass through of all those channels, through the lens, the retina, and

35:35 visual uh communication channel in our Until we reached the the back of

35:45 break. Right? This is purely parallel process. Pre attentive.

35:50 Alright. And but when we start what we see, this gets into

35:55 let me give you an example right look at this vision right? Without

36:01 too much time. What we can . Is there a lot of

36:04 And then there is some dots that to each other than the others.

36:09 ? This you perceive in less than 500 milliseconds, assume that somebody really time

36:16 ? All right. Best pre Now we start thinking right. How

36:25 points in this particular cluster? Those close to each other close to each

36:32 . Now in order to answer this we start counting, counting is a

36:38 process. We count things one by one; it is sequential. Alright so this

36:43 attentive. Alright and then we'll probably thinking the other points are evenly

36:50 Right to measure distance, We have measure them pair by pair right?

36:56 is sequential. Alright, in our mentally. So now you see the

37:01 . Pre attentive that you barely need to process it. You just perceive it

37:06 You notice something right? And then means that you try to go deeper

37:12 understand what you see, what what information there. Right?

37:18 Okay. So visualization mostly utilizes the pre attentive property

37:28 our visual perception channel. Okay, short, so despite all those things

37:33 you can read through it yourself. . So in short, what we

37:37 about is when we create visual we try to make things that we

37:43 the audience to pay attention to pop . Okay, We try to make

37:49 problem. Let's start with an Let's count how many threes in

37:54 series of numbers. How many? right. In order to find all

38:04 three, you have to go through one by one. Right? This

38:07 attentive or pre attentive process. Now is pre attentive attentive right? Because

38:18 present those information sequentially. Alright, how to make things pop up.

38:27 column. Alright, you immediately see immediately? The street. Of course

38:33 next step will be counting. But see their locations. Alright, three

38:38 up. All right, So there different way to make things pop

38:45 So this just list a few situations few visual properties. Sorry, visual

38:53 that people can utilize to make things up. Right? So this comes

38:57 some visual signs. Okay. Not visualization communities but visualization community utilize some

39:03 these conclusions to help make visualization more . Right. Make the important information

39:09 up. All right. Of course need to know what is important.
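
The counting exercise above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the digit string is made up, `find_targets` and `highlight` are hypothetical helpers, and the ANSI escape codes used for bold red text only render as color on terminals that support them.

```python
def find_targets(digits, target="3"):
    """Sequential scan: positions of the target, found one by one."""
    return [i for i, ch in enumerate(digits) if ch == target]


def highlight(digits, target="3", start="\033[1;31m", end="\033[0m"):
    """Wrap every target character in markers (ANSI bold red by default)."""
    return "".join(f"{start}{ch}{end}" if ch == target else ch for ch in digits)


# Made-up digit string for illustration.
digits = "9328174635012873649"
print(digits)             # plain: the reader has to scan for the 3s
print(highlight(digits))  # colored: the 3s stand out on ANSI terminals
print(len(find_targets(digits)), "threes")
```

The plain line forces the one-by-one search that `find_targets` performs, while the highlighted line lets the eye land on the targets first; counting them is still a separate, sequential step.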

39:14 ? So that's a separate issue. by case. Alright. Any questions

39:21 attentive and pre attentive properties of our perception. Alright, nope. All

39:34 . In addition to those uh visual that people can utilize to encode information

39:40 make things pop up. We have other things as shown here that people

39:45 use to encode different types of information data. Right. Data is not

39:52 . Ah singular, Right? We different types of data that describe different

40:00 . Right? We have numeric data probably describe describe quantity information. And

40:06 we have labeled data that describe categorical groups, classes. Right? So

40:14 can we encode this information visually So, here are the options.

40:19 ? There are many of them. the question will be are they all

40:23 effective in different situations? Like if information I want to visually encoded describing

40:31 of information, Which one would be effective? Right, are they or

40:36 effective to answer that? Let's look another example. Alright. So if

40:44 two bars represent two values which one bigger and how big it is.

40:47 one is bigger is? So But which one? So that the

40:51 1? Okay, so how big much bigger it is compared to the

40:57 bar I guess about five times. . Five. Any option if you

41:11 at my mask cursor 4 4. . four. You can read

41:18 So now let's use the two All right. The right one is

41:26 bigger. But how much bigger? guess. Probably areas four times if

41:37 diameter is double. All right, me quickly show you this. It's

41:41 five times bigger. Alright. It's very effective right? Compared to the

41:46 representation. Alright. So that's because bar representation we use the length

41:53 of the geometry to encode numeric value . We use the areas of the

41:59 to encode numeric values. Apparently length is more effective than areas. Okay. This

42:06 be very very important when you at some later things of charts and

42:15 something. Some types of charts you should avoid using. Alright. That
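
The arithmetic behind the circle example can be written out as a short Python sketch. It assumes the value is encoded as the circle's area, as the lecturer describes; the function names are hypothetical.

```python
import math


def area_ratio_from_diameters(diameter_ratio):
    """Area grows with the square of the diameter."""
    return diameter_ratio ** 2


def diameter_ratio_from_areas(area_ratio):
    """Diameter ratio needed to show a given area (value) ratio."""
    return math.sqrt(area_ratio)


def bar_length_ratio(value_ratio):
    """With bars, the visible length ratio equals the value ratio."""
    return value_ratio


print(area_ratio_from_diameters(2))            # doubling the diameter quadruples the area
print(round(diameter_ratio_from_areas(5), 2))  # diameter ratio for a 5x value
print(bar_length_ratio(5))                     # a bar simply gets 5x longer
```

A value that is five times bigger makes the circle only a little more than twice as wide, which may be part of why the audience's guesses were less accurate for the circles than for the bars.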

42:21 I do want to ask one quick going. So if we're just trying

42:26 say one thing is larger than the thing by the certain amount. Most

42:31 the time we would prefer to just numbers, right? Unless it's like

42:35 more complex scenario. Yeah. If just two numbers right? But if

42:39 have many numbers then you still need use so if you just care about

42:43 one is bigger then yes. The can also work right? If you

42:46 want to read the precise values So that says you know the among

42:53 effectiveness of encoding the quantity of data different visual properties, they are not

42:59 effective. Right? So this is trust that I want to share with

43:06 depending on the types of information you your visual primitives to encode right?

43:11 have different ranking of those attributes to but things are not always absolutes.

43:18 are always exceptions that some of those ranking primitives may not be as effective

43:27 it should be in some situations so have to use it based on your

43:32 situation. But this is the overall based on some studies. Alright,

43:38 that is the visual perceptions and cognitive for our visualization. Any questions before

43:45 move to the next topic, I we're running out of time. I

43:49 my best. Alright. The next will be about generating effective charts and

43:57 but there is a set of principles gestalt principles that people typically exercise when

44:03 decide whether the plot is effective or . So I will not be able

44:07 go over all of them in So some of them related to our

44:12 perception channels. Some of the property our visual perception channel that we didn't

44:18 . So like the enclosure. So , our eyes tend to connect things

44:24 on our knowledge of a shape. we only provide dash line, straight

44:28 line, we know this is a line. Right? We don't need

44:32 full solid line to realize this is straight line if it's dash like so

44:37 some of those properties still related to uh perception channel but I would jump

44:44 some criteria of determining whether your visual is effective or not. So here

44:50 one famous criterion. It's called graphical . So what is that? It

44:56 that the generated visualization can the viewer the greatest number of ideas

45:04 the shortest amount of time, with the least ink in the smallest space.

45:09 lot of criteria come into here. ? And people can roughly reformulate

45:20 criteria into something like this, That's easily ah check about how to

45:29 easily usable. Okay, expressiveness and . Okay. Effect expressiveness in other

45:38 , means that tell the truth. requires the visual representation accurately encodes the

45:45 of the data that needs to be . Try not to distort the

45:50 Try not to include bias into your . Tell the truth. Alright,

45:56 one criterion, graphical integrity and if I can guarantee you encode the data

46:05 do it effectively, effectiveness with precision and emphasis. Okay. That

46:12 people read the information effectively make things up if needed. Alright, so

46:19 are the two make criterias people utilize they check whether their visual presentation is

46:26 or not out down to specific graphical representation. Right? This is

46:33 set of visual representation, plots and . This is some set of principles

46:40 people practice when they generate plots and . Like like Tre hissed a gram

46:49 and scatter plots, et cetera. . And despite the many principles,

46:56 of the of them, I would guidelines and use them as much as

47:00 can. And sometimes those principles may be useful depending on the situations,

47:06 among them, the highlighted ones are applicable. The first one reduce clutter

47:15 data stand out. Use visually prominent elements to present the data all the

47:21 secondary information line, scale lines, line should be put in the

47:28 All right, owning the data should the full attention right. And here

47:35 an important thing you should keep in for visualization less is more Less is

47:42 . If the simple visual presentation can sufficiently conveyed information, do not use

47:48 things. Simple is good. And remove all the unnecessary elements.

47:55 Unnecessary elements. Okay. And to aid understanding, provide explanations. Each chart or

48:04 graph should provide a caption to explain what this chart or graph is showing and what

48:12 conclusion can be drawn from the visual representation. If the figure

48:20 contains multiple charts, for instance for comparison purposes, and you cannot put them

48:25 all in one plot because of the overlap, align those plots properly along their common

48:32 axis, like juxtaposed plots. So those highlighted principles are

48:37 applicable in most situations. The others depend on the data and the

48:43 task and the information you want to highlight. They may or may not be applicable.

48:47 All right. But one thing you should pay attention to: generating effective plots is always an

48:54 iterative process. Many times the first-iteration plot is not the best.

49:01 Right? I think you already have experience with that. Also, many default settings of the plotting

49:06 tools cannot generate the most effective plots for you. With the default settings provided by those

49:11 tools, many times you need to tune the parameters. Alright. Change the

49:16 colors, change to different visual elements to generate the most effective plots. So plotting is an

49:22 iterative process. Make sure to keep this in mind next time. All right.

49:28 So, those are the things we should do. Next, I'll spend a couple of

49:32 minutes to talk about things you should not do. Things we should avoid. This

49:36 is a little surprising: try not to use pie charts. There are different elementary

49:41 plot types, right? The pie chart is one of the most popular plot types people use

49:46 to plot information. Alright, what's the problem here? Alright, I have four

49:54 suppliers. I want to show their market share in terms of percentage, and I hope

50:00 it can intuitively show me which supplier has the most market share. But can

50:09 you tell which one has the most market share? Seems like B or A.

50:21 A or B. But you cannot tell for sure until I... oh, I think

50:25 I didn't have the numbers, but B looks bigger. Right? But the actual largest

50:36 one, the supplier who has the largest market share, is A. I will show you

50:40 in the next plot. But visually supplier B looks bigger. Why? There's one property

50:53 I mentioned. Because the pie is tilted like that, right? 3D

51:02 is always a killer. Don't use 3D effects. This is basically

51:05 distorting the information. Why do you need 3D? This is one thing to avoid.

51:09 I will emphasize that again. The reason: does anybody still remember vertical dominance? This

51:24 sector corresponding to supplier B, unfortunately, is oriented vertically, while A is oriented horizontally.

51:33 You see that? Alright. So the pie chart actually uses the area and the

51:43 angle of each sector to encode quantity. Right? We know these two are

51:48 less effective. If your goal is to show which one has the largest market

51:54 share, use this, right? We know this is the most effective way to encode quantity

51:58 information if you want to compare which is larger. Alright. And there are

52:04 situations where people prefer a pie chart, right? A pie chart gives you a whole. If your

52:11 data is some percentage within 100%, right, split into different sectors, then a pie chart

52:19 can give you a feeling of the whole and of each one taking a chunk of

52:24 it. But outside of that, don't use pie charts. Okay? Don't use pie charts.

52:29 And if you have to use a pie chart, do not use 3D effects. This is

52:36 another example of how 3D effects get in the way of how we can

52:41 read the information. It doesn't add anything other than visual

52:47 appeal. Right? We don't need visual appeal. The goal of

52:50 visualization is not to generate beautiful pictures. No, no, no, that's not the goal

52:54 of visualization. We're trying to generate effective representations so the underlying information can be

53:03 conveyed to the target audience intuitively, precisely, and accurately. Alright. So

53:13 do not use 3D effects. A secondary y-axis should be avoided or

53:19 minimized as much as you can. Sometimes you need to compare two sets

53:24 of data that have different meanings and data ranges, to associate them, for

53:30 instance to see whether they have some correlation, but they are defined in different ranges and have

53:35 different units, like these two. One is revenue, with units of millions in

53:40 terms of money. The other is the size of the sales force: how many salespeople. It's

53:45 integers. Right? If we use two axes, one for each data set, people get confused,

53:52 because the tendency when people read a value is to find the nearest reference

53:57 axis to read the value for the bars. Right? And then it

54:01 causes confusion. So try not to use it. The better way is to directly add

54:08 the values at the data points, if there are not many data points. Or, if

54:13 there are too many data points, which prevents you from adding those values onto it, to

54:18 avoid occlusion, you put them into two plots and align them properly.
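The advice above (label values directly when there are few points, otherwise use two aligned plots instead of a secondary y-axis) can be sketched roughly as follows. This is only an illustration in Python with Matplotlib: the function names, the threshold of 12 points, and the axis labels are assumptions, not something given in the lecture. The sketch always draws two stacked panels that share the x-axis and adds direct value labels only when the series are short.

```python
def choose_layout(n_points, max_direct_labels=12):
    """Pick how to show two series that have different units.

    max_direct_labels is an illustrative threshold (an assumption),
    not a number stated in the lecture.
    """
    if n_points <= max_direct_labels:
        return "direct_labels"
    return "aligned_panels"


def plot_two_series(x, revenue, headcount):
    """Draw both series without a secondary y-axis; x is assumed numeric."""
    # Imported here so choose_layout can be used without a plotting library.
    import matplotlib.pyplot as plt

    layout = choose_layout(len(x))
    # Two stacked panels sharing the x-axis, so changes line up vertically.
    fig, (top, bottom) = plt.subplots(2, 1, sharex=True, figsize=(6, 5))
    top.plot(x, revenue, color="black", marker="o")
    top.set_ylabel("Revenue (millions)")
    bottom.plot(x, headcount, color="black", marker="o")
    bottom.set_ylabel("Sales force (people)")

    if layout == "direct_labels":
        # Few points: write each value just above its marker.
        for axis, series in ((top, revenue), (bottom, headcount)):
            for xi, yi in zip(x, series):
                axis.annotate(str(yi), (xi, yi), textcoords="offset points",
                              xytext=(0, 6), ha="center")
    return fig
```

Calling `plot_two_series(years, revenue, headcount)` with three equally long lists returns a Matplotlib figure that can be saved or shown; each panel keeps its own y-axis, so no reader has to guess which axis a mark belongs to.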

54:24 So you can still compare their changes. A secondary y-axis should be

54:30 avoided. Okay. So I think I can still spare a couple of minutes

54:38 to do an exercise with you. Alright, let's do an exercise to practice what we just

54:45 talked about, the principles, to improve some charts. So, this is the first

54:50 one. Alright, so this is survey data. What survey? A survey of the

54:55 change in music preference among students, right, over the past two

55:02 decades. This survey was done at the University of Miami. So, students

55:07 are the survey subjects. They did the first survey in 1994. And then

55:13 they did it again in 2014, two decades later. Right. So they

55:18 surveyed the students' favorite music genre. Right? And the question they

55:30 want to answer is how this preference changed over time. How did this change over

55:37 time? So, the change is the focus here. Alright, so this is the

55:42 original visual representation. It uses pie charts. All right. And we can clearly see

55:49 some changes. Hard rock music got a lot of love after two decades.

55:57 Right? Apparently it has quite a few more fans than the others. But how about the

56:02 other genres, other music genres like samba, reggae, classic? Did anything change? It's

56:16 not very clear. Of course we can read the numbers, right? But this

56:20 requires you to attentively compare the numbers to notice the change. Right?

56:29 Any better ideas for plotting this information than using pie charts? Remember, the

56:37 focus here is the change of the preference over time. We don't care about the

56:47 percentage each music genre takes, right? We just care about the

56:54 change over time. Bar charts? Close. A pie chart can only show one year.

57:06 We need to show the change over time. Remember the second example I showed

57:11 you, about the change in percentage? That will be the chart I

57:17 use. All right. I don't care about the percentage, the precise percentage of

57:27 a particular music genre that is preferred by the students, right? I don't care

57:34 about the exact percentage. I just care about the change over time. So,

57:39 this plot will be more effective, right? It clearly shows we got

57:42 a lot of new fans for hard rock. Samba dropped a little bit.

57:49 For hip hop, we gained a couple of new fans. Sorry,

57:55 fans, but not a lot. But country and classic dropped,

58:00 right, over the years. So fewer and fewer people like to listen to

58:05 country and classic. This information can be conveyed more effectively using this chart than

58:12 the pie chart. Right. Okay. So one more exercise next. Right? This
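The transcript does not name the chart used in place of the two pie charts. One common choice for showing change between two points in time is a slope chart, and the sketch below assumes that choice. It is written in Python with Matplotlib; the function names and colors are illustrative, and the sample percentages are made up, because the survey's real numbers are not in the transcript.

```python
# Made-up percentages for illustration only; not the survey's real data.
EXAMPLE_1994 = {"hard rock": 10, "samba": 25, "classic": 30}
EXAMPLE_2014 = {"hard rock": 35, "samba": 20, "classic": 15}


def slope_rows(before, after):
    """Return (label, start, end, change) tuples, largest gain first."""
    rows = [(name, before[name], after[name], after[name] - before[name])
            for name in before if name in after]
    return sorted(rows, key=lambda row: row[3], reverse=True)


def plot_slope_chart(before, after, years=("1994", "2014")):
    """Draw one line per category from its first value to its second."""
    # Imported here so slope_rows can be used without a plotting library.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 6))
    for name, start, end, change in slope_rows(before, after):
        # Gains get a prominent color, losses a muted one.
        color = "tab:blue" if change > 0 else "gray"
        ax.plot([0, 1], [start, end], color=color, marker="o")
        ax.text(-0.05, start, f"{name} {start}", ha="right", va="center")
        ax.text(1.05, end, str(end), ha="left", va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels(years)
    ax.set_xlim(-0.6, 1.4)
    # The labels carry the values, so the y-axis and frame are removed.
    ax.yaxis.set_visible(False)
    for side in ("top", "right", "left"):
        ax.spines[side].set_visible(False)
    return fig
```

With this encoding the direction and steepness of each line carries the change, so a reader no longer has to compare numbers across two separate pies.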

58:19 is another survey data set we try to visualize. So the question is: in

58:26 general, what attributes are the most important to you in selecting a service provider?

58:33 So this is a service company, and there are a couple of competitors. This company tries to

58:40 find out which things they should improve in order to attract more

58:45 customers. Right? So there were seven attributes they are looking at. They try

58:51 to decide which one they should focus on to improve their business. This is the

58:56 survey result, and they conclude that demonstrating effectiveness is the most important consideration for customers to

59:03 select a provider. Right? So they want to visualize their results. Alright.

59:10 Any issues you see, and how would you improve it? Anyone? Okay, so

59:41 because of the time limit, let me show you the improved visual

59:46 representation. All right, so this is much better. All right, tell

59:52 me what has been improved. We changed the color of the most important one, notice this, and

60:02 bolded the text. Alright. We exercised adding emphasis, to emphasize the information that

60:11 you want the reader to pay attention to. Alright, previously, they all had

60:17 the same emphasis. That means there's no focus in your visualization. All right.

60:23 But your story always has a theme you want to convey. The theme here

60:30 is: this is the most important consideration. You want to highlight that conclusion.

60:35 Right? This one highlights it. Anything else? How about the alignment of the

60:46 text? Which one is easier to read? I have to say the new one,

60:52 right? Why? Because it is left-aligned. Left-justified alignment. That's

61:01 our human reading habit. We tend to read from left to right, not from the center

61:07 out. Try not to use center alignment. Okay. If you want people to

61:11 read your information, use left-justified alignment. Of course, this one is different: it uses

61:18 right-justified alignment because the labels are close to the bars, right? You want to

61:21 align them closer to the bars, so that's the exception. But most of the

61:27 time you should align them to the left. Another change: you see all those

61:33 slanted elements? They're gone. That's also due to our human reading habits.

61:40 We tend to read things that are oriented horizontally more effectively than informational elements

61:47 in a non-horizontal orientation. Right? For instance, in the previous representation, this

61:53 is oriented at forty-five degrees. You have to tilt your head to read it, right?

61:58 The worst situation: in some cases the default setting of some plotting

62:04 tools will vertically orient those labels. That's bad. That's really bad.

62:09 Try not to do that. All right. Always orient label text horizontally as

62:16 much as you can. And also, we get rid of unnecessary visual elements to

62:21 reduce clutter. So we got rid of the vertical axis. This exercises one

62:26 of the visual properties: we can connect those elements even if they are not physically connected.

62:32 Our human perception can align them following the pattern. Right? And we also

62:37 got rid of these reference lines. We don't need them because we have the

62:43 values. All right. So those are the improvements. And of course we keep

62:48 white space. Sometimes white space is not wasted in your visual representation. White space is just like

62:55 punctuation in sentences: without it, it's really, really exhausting to read or listen,

63:04 right? The same thing happens for visualization. So leave some white space

63:09 where it's needed. Okay. All right. So any questions while I play this

63:14 animation? I just wanted to make a comment about both of them: they do

63:20 a very good job of sorting by the length of the bars. Because if you

63:26 don't have a lot of experience, you might just let the default graphing software

63:30 decide, depending on the order of the data points, how to display them. I think sorting is

63:34 much better. Do you agree with that? Great. So it's much better

63:39 to sort the bars based on their values. Otherwise, you know, it's really

63:44 hard to read, especially when two values are too close and they are located

63:48 far apart. Right? So this is another example to demonstrate how you can

63:58 declutter your visual representation to make the representation of the data cleaner. Any

64:07 other questions? And then de-emphasize something that's not critical. Right? Those

64:20 lines, those numbers, you know, you can just de-emphasize them, or even just put

64:25 the actual values on the bars, if you don't need to show all of

64:29 them. One question I had: if we just had those bars and we

64:35 didn't want to emphasize any particular bar, would you want us to use black on

64:41 white, or different colors, or gray? Do you have any suggestions there? If

64:45 we don't want to emphasize anything and use the same color, which color would you like us to

64:51 use? It depends. If everything is equally important, every bar is equally important, you

64:56 still use a visually prominent color, depending on the background. So most of the time

65:01 the background is white, so you should use a darker color for your charts, for your

65:07 bars. But if you do want to emphasize things, like in this particular example, all

65:12 the other bars that do not need to be paid attention to should use a very

65:17 light color, like gray. All right? Like this one: because you emphasize this

65:28 one only, only this one got color; the others don't. But if they are equally

65:33 important, which means that there's no emphasis, then all of them should be colored

65:37 using the same color. Try not to use many colors. Try not to

65:46 use many colors in the plot. It doesn't help unless they belong to different groups.
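The bar chart advice in this exercise (sort the bars by value, keep labels horizontal, mute everything except the bar being emphasized, and use a single prominent color when nothing is emphasized) might look like this in Python with Matplotlib. The helper names and the specific color names are assumptions chosen for illustration, not the settings used on the lecture slides.

```python
def sort_bars(values):
    """Sort (label, value) pairs so the largest bar comes first."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


def bar_colors(labels, highlight=None, base="dimgray",
               muted="lightgray", accent="tab:orange"):
    """One dark color if nothing is emphasized; otherwise mute the rest."""
    if highlight is None:
        return [base] * len(labels)
    return [accent if label == highlight else muted for label in labels]


def plot_ranked_bars(values, highlight=None):
    """Horizontal bars, sorted, with values written on the bars."""
    # Imported here so the two helpers work without a plotting library.
    import matplotlib.pyplot as plt

    pairs = sort_bars(values)
    labels = [label for label, _ in pairs]
    lengths = [value for _, value in pairs]

    fig, ax = plt.subplots(figsize=(6, 3))
    # Horizontal bars keep the category labels horizontal and readable.
    ax.barh(labels, lengths, color=bar_colors(labels, highlight))
    ax.invert_yaxis()  # largest bar at the top
    # Values are written next to the bars, so the scale axis is dropped.
    for position, value in enumerate(lengths):
        ax.text(value, position, f" {value}", va="center")
    ax.xaxis.set_visible(False)
    for side in ("top", "right", "bottom"):
        ax.spines[side].set_visible(False)
    return fig
```

Passing `highlight="some label"` colors only that bar and grays out the rest; leaving it as `None` gives every bar the same dark color, which matches the case where all bars are equally important.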

65:51 If the bars belong to different groups or clusters, then each cluster should get

65:58 a unique color. That's a good question. Any other questions? Dr. Chen, I have two

66:12 questions. Yes, go ahead. The first: what course is this from? You

66:15 said you were teaching this in a course on visualization. And my second question is: given a

66:23 data set, how do we decide which type of visualization to use? Good question.

66:31 Very good question. It depends on what you want to show and, of course,

66:36 the nature of the data. So if you have tabular data, right, the

66:43 data entries are stored in a table and each entry has multiple attributes, then depending on

66:50 the attributes and what type of values this particular attribute

66:56 has, you choose the proper plot type for it. If you want to plot numeric values and

67:03 the data entries are organized based on some intrinsic order, for instance time, you measure

67:10 something over time, then you can use line charts, right? Like the temperature change over

67:16 time. So use line charts. But if the data entries do not have an intrinsic

67:23 order, are not ordered by time, and are simply labeled, like this French fries and potato example,

67:28 then use bar charts, if the values are numeric. Alright. And that

67:36 is only true if you only want to plot one attribute. If you

67:40 want to plot two attributes to show their correlation, then you can either do superimposed

67:48 line plots, or you do juxtaposed plots like the example I showed. And

67:53 I said try to avoid two sets of axes, right? You can superimpose the

68:01 line plots together to see whether they have similar changes, or you can do scatter

68:07 plots. So it all depends on what you want to reveal from the

68:12 data. But in the beginning, most of the time you start with some basic, common

68:19 plot types. Okay. To check those attributes, to find out whether this attribute is

68:26 interesting or not. Should I explore it further or not? Sometimes you quickly figure out

68:30 this is not interesting and move on to the other attributes. So you only

68:35 devote time to the attributes that, after the initial exploration, you decide are important.
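The rule of thumb in this answer can be written down as a small lookup. The sketch below is a hypothetical helper in Python, not something presented in the lecture; it assumes the attribute values are numeric and only covers the one-attribute and two-attribute cases that the answer mentions.

```python
def suggest_charts(n_attributes, has_intrinsic_order=False):
    """Return starting-point chart types for numeric tabular data.

    Encodes only the cases mentioned in the answer above; anything
    else raises, because no guidance was given for it.
    """
    if n_attributes == 1:
        # Ordered entries (for example, measurements over time) suit a
        # line chart; merely labeled entries suit a bar chart.
        return ["line chart"] if has_intrinsic_order else ["bar chart"]
    if n_attributes == 2:
        # Options named for showing correlation between two attributes.
        return ["superimposed line plots", "juxtaposed plots", "scatter plot"]
    raise ValueError("only one or two attributes are covered here")
```

For example, `suggest_charts(1, has_intrinsic_order=True)` returns `["line chart"]`. The result is meant as a first plot for exploration, in line with the point that plotting is iterative.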

68:44 Thank you. Alright, let's thank Professor Chen for spending his time with us.

68:49 And if you want to learn more about visualization, he teaches an excellent course.

68:54 Thank you. So I think everyone is encouraged to take it, because you learn a

68:58 lot, and because visualization is really important regardless of what you do in your

69:03 research. Thank you, Professor. Yeah, I probably have more slides about colors, but

69:08 I just shared them with dr normally, and then he will share them

69:13 with you. Color is also very important. Thank you. Thank you. So we're gonna

69:20 spend a few more minutes. If you guys have anything you

69:26 want to discuss from what you learned today, you can ask those questions.

69:32 Or if you want to discuss anything, you know, we can stick around for a

69:35 few more minutes. Mm hmm. the anomaly. What are some of

69:45 the libraries that you use for plotting? Yeah, that's a good question. And

69:48 we should have also asked Dr. Chen that question. These

69:56 days, a lot of the libraries have become very good. Even, let's say, Microsoft

70:00 Excel allows you to produce very good graphs these days. I personally use Matplotlib.

70:06 It's a Python library. And I know, you know, you guys

70:11 are doing a lot of data science, which already has a lot of visualization on top

70:16 of it. But remember, we're mostly talking about research papers, and there, all

70:21 you need is basic graphs, and a way for you to customize basically the various elements

70:28 of that graph. So, for example, if we were having this conversation

70:32 some years ago, you know, Excel would probably have been a

70:37 poor choice, especially the defaults. But, you know, things have gotten very

70:40 good now. So you can use pretty much any tool;

70:46 they are reasonably good. I personally use Matplotlib, and gnuplot.

70:51 That's another tool that I've used over time. Let's hear from

70:55 you, you know, what tools you might be using beyond Excel, Matplotlib, and so on. Let's

71:09 just do a quick survey. So, many of you have written research papers

71:15 or reports in the last, let's say, one year. You know, what tools

71:18 do you use? Maybe, you know, everyone can basically mention your tool

71:23 in the chat. Probably it's more efficient to do this kind of survey that way.

71:26 So if you could just type the tool that you used for plotting,

71:33 because, you know, it's kind of good to know what other people are using,

71:37 just in case, you know, we want to check them out. And we're not talking

71:40 about diagrams, we're talking about plots. We got one answer. Anything else?

71:51 Has everyone used Matplotlib? Okay. The crazy part for using

72:13 Excel: did anyone actually use Excel? Some have used Excel. Yeah.

72:22 So, you know, one of the issues with something like Excel is this: with

72:26 Matplotlib, et cetera, you can

72:30 very easily export to PDF and include it in the LaTeX document. And most of you, I'm guessing, will use

72:36 LaTeX or have used LaTeX for research publications. It becomes very easy:

72:43 if you change the data, it becomes very easy to rerun the script,

72:46 generate the PDF, it goes to the right file, and you just compile your,

72:49 you know, paper when you're done. If you use something like Excel, there's

72:54 a little bit more clicking around, opening, exporting. You know, there's a little

72:58 bit of overhead there. So that's something to keep in mind. But you can

73:02 export even Excel charts as PDF, which is what you should be doing for

73:08 your paper. You should not be exporting to JPEG or something like that.

73:11 You want a vector format; that's nice. And Excel does allow you

73:16 to export that, but it's a little clunky, especially if you're changing the data.

73:21 If the data changes, you have to somehow import the

73:24 data, replot, and then maybe, you know, manually export it, move the file

73:30 around. It's, you know, quite involved. But if you're

73:33 using Python Matplotlib or gnuplot, it's a lot

73:39 easier, because you can have the program read the data, plot the graph,

73:43 export it, and put it in the folder of your paper. You kind of want an

73:48 efficient pipeline when you're close to writing the paper. But when you're in an exploratory

73:55 phase, something like that is fine. Yeah. And, you know, Seaborn is, I believe,

74:00 something that's built on top of Matplotlib. Is that correct? My understanding

74:05 is, I think, it uses Matplotlib. Yeah, but it allows a

74:10 higher-level control of, you know, the whole charting process.
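The scripted workflow described a moment ago (read the data, plot it, export a vector PDF straight into the paper's folder, then recompile) could look roughly like the sketch below. It uses Python's standard csv module and Matplotlib. The folder layout (`figures/` inside the paper directory), the function names, and the column arguments are assumptions for illustration.

```python
import csv
from pathlib import Path


def figure_path(paper_dir, name):
    """Where the exported figure goes inside the paper folder (assumed layout)."""
    return Path(paper_dir) / "figures" / f"{name}.pdf"


def read_columns(csv_path, x_column, y_column):
    """Read two numeric columns from a CSV file that has a header row."""
    xs, ys = [], []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            xs.append(float(row[x_column]))
            ys.append(float(row[y_column]))
    return xs, ys


def export_figure(csv_path, paper_dir, name, x_column, y_column):
    """Replot from the data file and overwrite the PDF used by the paper."""
    # Imported here so the helpers above work without a plotting library.
    import matplotlib
    matplotlib.use("Agg")  # render to files without opening a window
    import matplotlib.pyplot as plt

    xs, ys = read_columns(csv_path, x_column, y_column)
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot(xs, ys, color="black")
    ax.set_xlabel(x_column)
    ax.set_ylabel(y_column)

    out = figure_path(paper_dir, name)
    out.parent.mkdir(parents=True, exist_ok=True)
    # The .pdf suffix makes savefig write vector output rather than a bitmap.
    fig.savefig(out, bbox_inches="tight")
    plt.close(fig)
    return out
```

When the data file changes, rerunning `export_figure(...)` regenerates the PDF in place, so the LaTeX document picks up the new figure on the next compile without any manual exporting.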

74:16 And if you're trying to build some complex, especially multi-chart, interactive visualization,

74:26 you would probably use something like It's also very popular, which is

74:32 more like a data-science-type tool. Alright, so let's wrap up today's

74:40 I'm gonna discuss. Stop
