© Distribution of this video is restricted by its owner
00:04 Hello everybody. The topic for this lecture is strings, and hopefully everybody has gone through

00:12 the reading assignment, so this will be a review of what's covered on strings in that

00:18 assignment, and of the important things about strings that are relevant to this course.

00:24 So let's get started. A string is a so-called collection data type.

00:33 This is a data type where the variable, or one data object, is composed of

00:40 smaller pieces. There are other collection types coming up. Strings are the first

00:47 data type that is a collection data type that we're seeing. We're going to see

00:52 lists, tuples, and dictionaries, at least some of them, in this course. So to contrast

00:59 with what we've seen before: the data types we've seen before include integer, float, and

01:04 Boolean. These are the so-called primitive data types. We never used that

01:10 term because that was the only type of data that was around. So the

01:17 basic idea is that there are primitive data types and collections, which are composed of multiple primitive

01:24 objects, and strings are one of those collections. And that's what we're going to talk

01:29 about today. So now back to strings. It's very simple. It's a

01:35 collection of characters. "Hello World" is a string. This is the same string.

01:41 You can enclose the string in single quotes or double quotes. It doesn't

01:45 matter, except that you cannot start with a single quote and end with a double quote or

01:51 vice versa; otherwise they are identical ways of expressing the same thing. So it is

01:58 sequential in the sense that this is not just a bag of characters that happen

02:04 to be H, E, L, L, O and so on; they are

02:07 ordered. It goes in order: H, E, L, L, O and so on.

02:12 And you'll see a blank here. That is also a character, a completely legal

02:17 and valid character of a string. The only exception to a sequential collection of characters

02:26 is when there is nothing, there is no character. That's also a valid string,

02:32 a so-called empty string, which has nothing inside it. And the way you

02:39 express it is like this: just single quotes with nothing in between, or double quotes with nothing

02:45 in between. Note that if you write this, then that would be a string

02:53 with several blank characters. You can have blank characters in a string, but an empty string

02:59 has nothing in it, not even a blank character. Okay, hopefully that's not

03:04 so confusing. A string is just a collection of characters, or nothing. So
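The points above about quotes, blanks, and the empty string can be checked with a short sketch. The variable names are illustrative and are not taken from the slides; `len` is Python's built-in length function.

```python
# Single and double quotes produce the same string.
a = 'Hello World'
b = "Hello World"
same = (a == b)

# The blank between the two words is a character like any other,
# so it is counted in the length: 5 letters + 1 blank + 5 letters.
length_with_blank = len(a)

# An empty string has no characters at all.
empty = ''
# A string made of blanks is NOT empty: it holds three blank characters.
blanks = '   '

print(same, length_with_blank, len(empty), len(blanks))  # True 11 0 3
```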

03:13 let's keep going. Over the next, I don't know, 10 to 15 minutes we will basically talk about

03:21 what you can, and a little bit about what you cannot, do with strings.

03:29 So this is a review of a few things about strings that we've already seen in

03:36 this class. Of course we've touched on strings before, but not in a serious way

03:40 so far. So we've seen this before. Addition of strings is allowed. It's

03:49 legal, but it has an interesting interpretation, an interesting meaning. When you combine two strings,

03:58 let's say the first is "John" and the second is "Doe", the addition just becomes the

04:04 concatenation of those two strings, so just "JohnDoe". Similarly, multiplication of a string is

04:11 legal. You can multiply a string any number of times, and then the

04:17 string that is the result of that is just the original string repeated that many

04:24 times. So just to be clear, you cannot just do any arbitrary operation with

04:31 strings and numbers and arbitrary mathematical operators. These are sort of exceptions that have been

04:38 defined because they have sort of a meaning in the context of a string.

04:45 But not everything is going to work, and you have to see if something makes sense.

04:53 If it does, it might work; if it doesn't, it won't. For example, if I have

04:58 this expression, first name minus one. First name is "John", that's the string.

05:03 "John" minus one doesn't make any sense. You have a bunch of characters. What are

05:09 you going to do, subtract an integer from it? So this would be illegal.

05:13 Similarly, "34" plus two, except "34" is a string. So here three and four are

05:21 a string, just characters; they could have been X and Y. It makes no

05:25 difference. And so you cannot simply add an integer to a string.

05:32 You can add a string to a string, you can add an integer to an integer, but you

05:38 cannot just add an integer to a string. So you'll get another illegal operation.

05:43 If you really wanted to do what it appears someone is wanting to do, you

05:49 will have to do something like convert this string to an integer with int,

05:56 and then, if you do plus two, it should be a valid operation. Okay,
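Here is a minimal sketch of the legal and illegal operations just described. The exact capitalization of the names is an assumption, since the slide itself is not visible in the transcript.

```python
first_name = "John"
last_name = "Doe"

# Addition of two strings is concatenation (no blank is inserted).
full_name = first_name + last_name    # "JohnDoe"

# Multiplying a string by an integer repeats it.
repeated = first_name * 3             # "JohnJohnJohn"

# Adding an integer to a string is an error.
try:
    result = "34" + 2
except TypeError as err:
    result = None
    print("TypeError:", err)

# Converting the string to an integer first makes the addition valid.
total = int("34") + 2                 # 36

# Subtraction is not defined for strings at all.
try:
    first_name - 1
except TypeError as err:
    print("TypeError:", err)

print(full_name, repeated, total)
```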

06:03 in summary, there is a way you can add and multiply: add

06:09 strings, or multiply strings by an integer. But in general you cannot have mathematical operations

06:15 between strings and integers, or floats for that matter. Okay. So we already said

06:24 that a string is a bunch of characters in a

06:32 specific order. So here "Python" is a string and it goes like

06:38 this from left to right: P, y, then t, h, o, n.

06:45 So here we're going to introduce the concept of indexing. You can

06:53 get access to individual characters in a string. So over here the first character is

07:01 P, but we start counting from zero, as we often do, as you've seen in this

07:06 course. So string zero is the P. If you wanted to get to it,

07:12 you just use this syntax: the name of the string followed by square brackets with the

07:20 number zero. Okay. Similarly, one is the next element; that would be y, and

07:27 so on. The element at index five is n, and this is how you

07:35 access it. Okay. So I guess this is pretty straightforward. So

07:52 let's move on to a special way of indexing. So the indices, or the numbers

08:04 to access elements, that we talked about so far are positive numbers: 0, 1, 2, 3, 4 and so on.

08:13 Python allows you to also access the elements with negative indices. So you can write

08:21 s[-1], s[-2] and so on. And what it does is, it

08:30 starts counting from the other side. So when we were initially counting 0, 1, 2, 3 and so

08:37 on, we would say that we were counting from left to right, and you can

08:43 equally well count from right to left. In that case this would be minus

08:52 one. We don't say zero because if we say zero then this and this, the n

08:57 and the P, would be the same. Therefore the convention is that the last

09:02 element in the string is minus one, the second to last is minus two, and so

09:07 on. So this can be accessed as four as well as negative two.

09:14 Okay, so that's the idea of negative indices. So we just covered this: negative

09:23 indexes specify a position with respect to the end, which is over here somewhere. Hopefully

09:31 that's clear enough. And that's illustrating what we talked about: the positive numbers

09:38 were 0, 1, 2, 3, 4, 5; the negatives are the same in the other direction. Pretty straightforward.
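Positive and negative indexing on the six-character string "Python" can be sketched as follows. The variable names are illustrative.

```python
s = "Python"

first = s[0]     # 'P': counting starts at zero
second = s[1]    # 'y'
last = s[5]      # 'n': index five is the sixth and final character

# Negative indices count from the right-hand end.
last_again = s[-1]       # 'n'
second_to_last = s[-2]   # 'o', the same element as s[4]

# For a string of length n, s[-k] refers to the same element as s[n - k].
n = len(s)
same_element = (s[-2] == s[n - 2])

print(first, second, last, last_again, second_to_last, same_element)
```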

09:50 Okay. So what does it mean to be a string? And what does it mean

09:56 if something is a string, something is a list, something is a dictionary,

09:59 some of which we've not seen, or something is an integer or a float? It's

10:05 what they look like, which we've seen. A string looks like an

10:11 ordered group of characters. And the other thing is, what can you do with

10:19 it? These are the attributes and methods. What can I do if I have

10:22 a string in a variable? What are the possibilities? Now, the Python

10:29 library itself provides a lot of things you can do with a string. We're not going to

10:33 go through all of them, but to give you a flavor of that, let's

10:39 look at some examples. So you say string upper, that is, s

10:46 dot upper. And for now let's not worry about the fact that there's a parenthesis. You

10:52 can, eventually we'll see, put things in there, but right now

10:56 just say the string variable name, dot upper. And for this illustration the

11:06 string is assigned this value, "PythonGood". Just a string of characters,

11:15 of letters in this case. So if you say s.upper(), then the result is

11:24 "PYTHONGOOD". You do not need an assignment to see this result, but note that the variable

11:29 s itself is not changed: upper returns a new string, and s only takes the new value if you assign the result back in your

11:37 program. And so you can easily convert any string that is likely a mix

11:43 of lowercase and uppercase to all uppercase with just this call. Of course there is

11:51 an analogous call, lower, and if you did that, your string would look

11:57 like this: everything the same, except everything is in lower case. Then we have

12:10 another method called count. You can say the string variable name, dot count,

12:18 and then the parameter or argument you pass is a character or a substring.

12:28 In this case we are passing "o", so it's going to count how many times that

12:33 character occurs in the string. And I see it's here, it's here, and it's

12:38 here. And so the answer would be three. So you can

12:46 count a particular character. We'll see that you can also count a substring within a string.
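A short sketch of `upper`, `lower`, and `count` follows. The value "PythonGood" (ten characters, no blank) is an assumption inferred from the counts and indices quoted in the lecture; the slide itself is not visible in the transcript.

```python
s = "PythonGood"   # assumed slide value, inferred from the quoted results

upper_s = s.upper()        # "PYTHONGOOD"
lower_s = s.lower()        # "pythongood"

# count accepts a single character or a longer substring.
o_count = s.count("o")     # 3: one in "Python", two in "Good"
oo_count = s.count("oo")   # 1: the "oo" in "Good"

print(upper_s, lower_s, o_count, oo_count)

# The methods return new strings; s itself is unchanged
# unless the result is assigned back, e.g. s = s.upper().
print(s)                   # PythonGood
```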

12:52 And along the same lines, another call is find. So the call find

13:03 will look for the location of whatever is passed as a parameter. So in this

13:11 case it's going to start this way, and this is not an o, this is not

13:19 an o, this is not an o, this is

13:22 not an o; this here is an o. And it wants to

13:27 report at what location the o is. So this will return, it'll return

13:44 four. Where was it? 0, 1, 2, 3, 4. It will return four. And of course

13:51 o can occur more than once, and find is returning the first value it finds

13:55 from the left. You can also look for the first value you find from

14:01 the right, and that is going to go this way, and here is the o.

14:08 So that should be eight. So you see, these are all the numbers in

14:17 one place. They're typed up, so you will see the correct values if you

14:26 need to, but try to do it on your own and see if you agree with

14:29 these numbers. Okay, so those are some basic string methods: convert to uppercase, convert

14:36 to lowercase, count how many times a character or substring occurs, look for the

14:42 first location from the left, look for the first location from the right. A

14:53 common function is the length of a string. And this is pretty simple, it's just the len

14:59 function. In this case you would say x equals len of s.
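The search methods and `len` can be sketched as below, again assuming the slide's string is "PythonGood". The transcript does not name the right-hand search call; `rfind` is the standard `str` method that searches from the right and is assumed to be the one meant.

```python
s = "PythonGood"   # assumed slide value

first_o = s.find("o")    # 4: index of the first "o" from the left
last_o = s.rfind("o")    # 8: index of the first "o" found from the right
missing = s.find("z")    # -1: find returns -1 when nothing matches

x = len(s)               # 10 characters

print(first_o, last_o, missing, x)
```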

15:06 Whatever the size of the string is, you get that as a value in x.

15:11 You can count: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. So it should return 10, if I've counted right.

15:19 Okay, so you can find the length of a string, which comes in handy all the

15:24 time. Okay. Another property of the string is that you can

15:41 access parts of the string. A part of the string is called a slice. It

15:48 is a consecutive substring, and the slice n to m essentially means you start the

15:58 substring from index n and keep going until m, but you do

16:03 not include the element m. So you include up to element m minus one.

16:07 So let's go through this: 0 to 6. The first element is P. First

16:15 is P, and the last element would be at index six.

16:25 I've been switching back and forth, so let's just try to look at it with a

16:30 clean slate. The first is here, which is P. And the element at index six

16:37 would be this one, G. But you don't go to the index m;

16:44 you stop at m minus one. So the last one would be n. So

16:49 this slice from 0 to 6 would be "Python". Okay, so similarly you can

16:57 see the slice from 6 to 10 starts with six, which is G, and goes up

17:04 to 10. 10 doesn't exist, but it doesn't matter because the last actual value we're

17:09 interested in would be nine. So that would be "Good". Okay, so you can

17:17 just have a slice as short as 2 to 3. So the first element would

17:22 be t, and the last element would also be t, because

17:31 three minus one is two again. It goes up to three, but

17:36 three is not included. So this would be just "t". Now this is an

17:43 interesting case. You want to go from 3 to 2. Okay, so think for a second.

17:50 What should happen when you are asked to go from 3 to 2? Essentially there's an

18:00 order from left to right when you're thinking of a slice; it doesn't

18:05 work the other way around. So 3 to 2 is empty, because starting from three, if you

18:12 go this way you'll never reach two. You can only go right, so

18:17 it should be an empty string. Okay, look at this for a second: -4 to

18:25 -2. Let's try to remember what these negative indices are.

18:37 So one more time: this is minus one, minus two, minus three, minus four, minus five

18:43 and so on. So this substring from minus four to minus two is the

18:50 same as the substring from, -4 corresponds to 6, -2 corresponds to 8, so the

19:01 substring from 6 to 8, and that would be six and seven. Not

19:09 eight. Okay, so that should be "Go". So basically there's a one-to-one correspondence

19:17 between negative indices and positive indices. So if any of these indices is negative,

19:23 you just map it to the positive equivalent, and what you get

19:29 will give you the slice you're looking for. Okay, so this

19:36 is completely legal. You can say colon two, and saying nothing before the colon is a shorthand

19:46 for saying that you're going to start from the beginning of the string. So colon

19:50 two is: start from the beginning, up to two but not counting two as usual,

19:55 so this would be "Py", zero and one. Okay, whatever the beginning is,

20:01 you go from there up to index two. You can also do it the other

20:06 way, which is saying I'm going to start with four and then go to the

20:12 end of the string. So here four would be this element. So you

20:20 say, starting from this o, all the way to the end of the string,

20:28 which is "onGood". Okay. So no value before the colon:

20:39 start from the beginning. No value after the colon: go to the end.

20:44 And that's actually a nice way, especially to be able to go to the end.

20:48 As you do your programming, you may not know sometimes how long the string

20:53 is. So in that case you can just say, you know, go to the

20:56 end, wherever the end happens to be. Alright, so I think we arrived

21:08 at the same values over here. Don't take my word for it. Run it

21:13 on your computer with these slices and see if you get the same thing. So

21:19 slicing is a very important part of strings. You can access any part of

21:25 the string, any substring in the string, with the right indices.
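The slices walked through above can be reproduced with this sketch. It assumes the slide's string is "PythonGood", which is consistent with every result quoted in the lecture.

```python
s = "PythonGood"   # assumed slide value; indices 0..9

slices = {
    "s[0:6]": s[0:6],      # "Python": indices 0 through 5
    "s[6:10]": s[6:10],    # "Good": indices 6 through 9
    "s[2:3]": s[2:3],      # "t": a one-character slice
    "s[3:2]": s[3:2],      # "": start is to the right of stop, so empty
    "s[-4:-2]": s[-4:-2],  # "Go": same as s[6:8]
    "s[:2]": s[:2],        # "Py": from the beginning up to index 1
    "s[4:]": s[4:],        # "onGood": from index 4 to the end
}

for expression, value in slices.items():
    print(expression, "->", repr(value))
```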

21:33 Next thing is comparison. We saw you can compare integers. You can compare

21:43 numbers, you can say one is less than, equal and so on, and you can

21:47 do that with strings too. Let's go through this step by step.

21:52 The first operator is equal equal. And this one is essentially, all it's

22:04 saying is: what's in this variable, is it exactly what this string is? It turns

22:09 out it is, so that's True. If you compare this variable,

22:18 which has the value over here, with any other string than that, you

22:23 get a False. It's just an exact equality. Even if you try to

22:29 compare this variable, which has the string "PythonGood", with a string that differs only in the uppercase

22:36 P and the uppercase G, the case matters. So if the cases don't match

22:43 then it's not an equality. And similar to this, you can also see you can

22:50 have not equal to. So for example, if you said not equal to, then that

22:58 would then be True. Just like everywhere else, not equal to is the opposite of

23:03 equal. Okay, now that's straightforward. Two strings are the same or

23:11 not the same. If they're the same, they're the same; if they're not, they're not. And what about

23:16 inequality? Say "Python" is less than "Java". The string "Python" is less than

23:23 the string "Java". Okay, so here we use our common sense. The question is,

23:30 what is higher and what is lower? It is the order of the characters in the alphabet,

23:38 the lexicographic order. And so here you'll find that J

23:48 comes before P in the alphabet. So therefore it's not true.

23:55 Anything starting with J should be less than anything starting with P, and

24:00 therefore it should be False. And what about "Python" and this second string? So for

24:08 the first character there's a tie; for the second character there's a tie.

24:15 And then we look at the third character: here we have c, and here we

24:21 have a t. c comes first in the alphabet, so c

24:27 is the lesser value. So essentially we say that the P and the y

24:34 don't count anymore, they got canceled, and we're essentially saying t is less than

24:41 c. And that's not true. So this would be False

24:50 again. Next is "Python" is less than "Scala". These both start with

25:01 upper case, but still the rule is the same. Does the letter P

25:08 as in Python come before the letter S as in Scala? Yes. They're both upper case,

25:14 so the case is not an issue. So in this case this is true.

25:18 So this will be True. And now we have an interesting case:

25:27 the strings "Py" and "Python". So of course the first character cancels out, the second

25:35 character cancels out, and after that, having no character is less than having anything.

25:42 Okay, so this is True. A blank or nothing is less than anything that comes

25:47 after. So this would be True. So basically everything is in lexicographic,

25:54 alphabetical order. So that's pretty straightforward, I hope. And I guess I have

26:07 the same answers here, which should be in the PowerPoint also. And you can

26:13 take a look. Okay, now on to something interesting that may appear irrational. What if

26:22 we compare "Python" with "python"? The difference is that this is a lower case

26:28 and this is, sorry, the first was an uppercase and the second is

26:33 lower case. And there is no rhyme or reason for this, but it turns

26:39 out that this would be True. All the upper case characters are supposed to

26:46 come before the lower case characters. It's just a plain old convention.

26:51 There's no reasoning behind it. So upper case comes before lower case, and

26:56 that's the way it is. Going a little deeper, there are actually integer

27:04 values, by convention, that are associated with each character. Okay, and you can

27:16 get to that value by using the ord function. So if you say ord of

27:22 lowercase a, you get 97. If you say ord of uppercase A, you get

27:28 65, and this is why uppercase A is less than lowercase a. The character A is

27:36 less than the character a: uppercase A is considered less than lowercase a.
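The comparisons discussed in this part of the lecture can be sketched as follows. "PythonGood" is the assumed slide value, and "Pycharm" is a hypothetical stand-in for the second "Py..." string, whose actual value is not recoverable from the transcript. The built-in `chr` is included as the inverse of `ord`.

```python
s = "PythonGood"   # assumed slide value

equal_exact = (s == "PythonGood")      # True: exactly the same characters
equal_other_case = (s == "pythongood") # False: case matters
not_equal = (s != "pythongood")        # True: the opposite of ==

# Ordering is lexicographic, character by character.
python_lt_java = ("Python" < "Java")        # False: "J" comes before "P"
python_lt_pycharm = ("Python" < "Pycharm")  # False: third characters, "t" vs "c" (hypothetical string)
python_lt_scala = ("Python" < "Scala")      # True: "P" comes before "S"
py_lt_python = ("Py" < "Python")            # True: a proper prefix is smaller
upper_lt_lower = ("Python" < "python")      # True: uppercase sorts before lowercase

# Each character has an integer code; ord and chr convert both ways.
code_lower_a = ord("a")   # 97
code_upper_a = ord("A")   # 65
back_to_char = chr(97)    # "a"

print(equal_exact, equal_other_case, not_equal)
print(python_lt_java, python_lt_pycharm, python_lt_scala, py_lt_python, upper_lt_lower)
print(code_lower_a, code_upper_a, back_to_char)
```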

27:46 Just for fun, if you want to convert an integer to its equivalent character, which would

27:53 be the opposite of what the ord function does, you can use the

27:57 function chr, for character, and you get the character back. So the character "a" can be mapped

28:05 to the integer 97. That's its fixed value. Anytime you have an integer and

28:12 you want to find out what the corresponding character is, then you can use

28:18 the chr function. So basically upper case comes before lower case, and it is what

28:27 it is. Okay. So much for ordering. Again, this is just a

28:36 wrap-up of what's covered. But basically you get the idea that two strings can

28:42 be compared, and you can figure out other rules from other sources.

28:51 The comparison can be done, and there are some standard rules that govern the comparison.

28:56 Okay? Next topic. Something you will hear a lot moving forward

29:07 with this course is what's mutable and immutable. Okay, so strings are

29:18 immutable. What does it mean? It means that you cannot change an individual

29:26 element of a string. Okay, let's say you have s equals "PythonGood"

29:32 in one statement. It's perfectly legal after that to have a next statement that says s

29:38 equals, you know, another string. Okay. There's nothing wrong with

29:46 it. It's just another variable. Just like you can reassign a variable name

29:51 to a different integer, you can reassign the variable name to a different string,

29:55 or for that matter, you could just say s equals four and that's perfectly

30:02 legal, even following a statement like s equals "PythonGood". Okay. But what you

30:07 cannot do is something like s[0] equals lowercase p. So it

30:17 appears what you're trying to do is, you have a string "PythonGood", and

30:24 you just decided that you want to keep that value, but you would like to

30:29 convert the starting character from the uppercase P to a lowercase p.

30:36 You cannot do that; that's not the way to do it.

30:41 And that's what makes a string immutable. If you've got a bunch of characters,

30:50 a bunch of characters here, you can access individual elements. Like here, you could access them

30:56 as s[0], s[1], s[2], s[3]. We've seen it many times. But you can't go in

31:01 and say, you know what, I don't like this one, replace it with

31:03 something else. And hopefully down the line you'll understand more of the rationale for

31:11 that. That's what makes it immutable. And we'll see down the line that other data

31:18 structures that we will learn in this class, including lists, are mutable.

31:23 You can change elements of a list, but not of a string. So now

31:28 think for a second. What if you wanted to convert this thing

31:38 to the same thing but with a lowercase p? And for now we'll just leave this

31:46 uppercase G. We just want to change the first letter to something else.

31:51 I told you that you can't just take the first element, which is the zeroth

31:55 element, and assign a different value, the lowercase p. So you have to use a

32:00 more heavy-duty method of doing something like that, and that's what we will see here.

32:08 So you create a new variable ss that is s[1:], s from 1 to the

32:15 end. Okay. So what will this have? It will have "ythonGood",

32:24 since you started with one and not zero. The first element will not be

32:30 the original first element, which is the uppercase P, which has the index zero.

32:36 And now you can have another variable where you say lowercase p as a character, then plus

32:44 ss. And that will give you ss2, which is exactly what you're

32:50 looking for: the same thing but with the first character being the lowercase p

32:58 and not the uppercase P. And if you really wanted it back in the original

33:02 variable, you can say s equals ss2. And now

33:09 essentially the same variable name has everything as before, except the first character has

33:16 changed from uppercase P to lowercase p. Okay, so the main point

33:20 here is you cannot directly change elements of a string, and that's what makes it

33:29 immutable. Okay, now we've seen loops. We've seen for loops and while loops.
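The immutability rule and the slice-and-concatenate workaround described above can be sketched as below. The names `ss` and `ss2` follow the transcript; the value "PythonGood" is the assumed slide value.

```python
s = "PythonGood"   # assumed slide value

# Assigning to a single element of a string raises an error.
try:
    s[0] = "p"
except TypeError as err:
    print("TypeError:", err)

# Workaround: build a NEW string from the pieces you want to keep.
ss = s[1:]        # "ythonGood": everything except the first character
ss2 = "p" + ss    # "pythonGood"

# Rebinding the name s to the new string is perfectly legal.
s = ss2
print(s)          # pythonGood
```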

33:37 Often you want to go through a string and do certain operations, and there are two basic ways

33:45 you can iterate over a string with a for loop, and then we'll see

33:52 a while loop too. So you say for char in city, where city is

33:59 "Houston". Okay, and then let's say the loop body says print char. So the

34:05 loop will go through the variable that is here, one character at a time.

34:15 The first time it'll be uppercase H, then it is o, then u, then s, then

34:20 t, then o, then n, so it'll pick each element of the string. So city is a

34:25 string; it will just go through the first element, second element, third element

34:29 and so on. And every time, the variable char will have that value, and

34:35 you can go and do anything with each character in the string. In this particular case

34:40 you would see that when you start for char in city, the first time it will

34:45 have the value H. So it will just print H, and that's what it's printing.

34:51 Next time it will have o, so it prints o. The third time it'll have u, and

34:56 you'll get something like this. Okay, so you can just travel through a string with

35:04 a loop, going through one element at a time. So this is an alternate way

35:12 of doing the same thing. And depending on the exact problem you're

35:19 solving, one or the other way of iterating over a string will make more sense.

35:27 So here is the same variable city as before. And now you say for index in

35:33 range of len of city. Now len of city is what? 1, 2, 3, 4, 5, 6, 7. That's

35:44 range of seven. So essentially this comes down to for index in range of

35:56 seven, and you know how that works, right? The first time you have zero, the

36:01 second time one, the third time two, and the iterations will have 0, 1, 2, 3, 4, 5, 6, right? Not

36:10 seven; you stop one before. And then you say print city of index.

36:17 That means when index is zero, the first time you come and index has the value zero,

36:24 then this is going to say print city of zero. Okay, so city is still the same

36:32 variable, and this is going to be H. And the next time you come, index will

36:38 be one, and then it says print city of index, print city of one,

36:45 and that will be o. So this loop is doing exactly the same thing as

36:52 what we saw in the previous one, but in a different way, going through the

36:57 indices. And one more time: depending on what exactly you're trying to do, one or the

37:03 other way of iterating would make more sense, would be

37:12 easier to solve the problem at hand. Now, the second way

37:20 that we used for iterating, you can do something similar with a while loop.

37:28 So you want to do the same with a while loop. It's the same

37:31 as before, except the index you have to actually manage yourself; in a for loop it is automatically initialized

37:39 and moves. In a while loop you set the index to zero, and then

37:44 you have to increment the index; those things are built in to the for loop but not the

37:50 while loop, and here you're going to do them yourself. So let's just look at these lines:

37:55 you're going to go zero the first time, then index equals index plus one, so you get

38:00 1, 2, 3, 4, and you want to stop at seven, right, or six or seven depending

38:13 on how you're using it. So the way to enforce that is you put a

38:18 condition saying that the index starts from zero and it goes until the length of the

38:26 variable. So in here the length of the city is, one more time, seven.

38:31 And remember, it has to be less than that. So the index will be

38:35 0, 1, 2, 3, 4, 5, 6 and then stop. And inside, this code is identical to what

38:45 was seen before in the previous for loop. So essentially the first time it'll be

38:52 city of index; index will be zero, which is H, and then o, u as index

39:00 goes to 1, 2, 3 and so on. So anyway, those are the

39:06 ways. We saw two ways you iterate with a for loop, and the second form

39:13 of the for loop also can be written as a while loop, or that kind

39:20 of a pattern, and that's that. I'm not going to solve this problem, but you

39:29 may want to try to solve it yourself to see if you understand.

39:36 Remember, index inside the loop is just like any other variable. You can do

39:43 anything with it that you can do with any variable.

39:51 So here it's trying to multiply index by whatever is in this element at that

39:57 index, to see what happens. So give it a try, and I think if you get

40:03 it, that means you're getting the hang of it. If not, just type it

40:06 up in any Python window and you'll see the answer. Okay,
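The three traversal patterns described in this part of the lecture can be sketched side by side. The loop variable name `char` is an assumption based on the transcript; this is not a solution to the exercise on the slide.

```python
city = "Houston"

# 1. for loop over the characters themselves.
for char in city:
    print(char)

# 2. for loop over the indices 0 .. len(city) - 1.
for index in range(len(city)):
    print(index, city[index])

# 3. while loop: the index is initialized and incremented by hand.
index = 0
while index < len(city):
    print(index, city[index])
    index = index + 1
```

Pattern 1 is the shortest when only the characters matter; patterns 2 and 3 are useful when the position of each character is also needed.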

40:17 that's as much as we're going to talk about loops here. We're going to talk

40:23 about one more thing, not a function but an operator, and that's in and not in.

40:32 So, in and not in. This is straightforward, very intuitive. You can say

40:41 "n" in "Python". It is a Boolean expression which is True if there is an

40:47 n somewhere in the string "Python", and it's False otherwise. So this is True,

40:52 and "n" not in "Python" is False, because there is an n in "Python". Next, "yt" in

40:58 "Python". So this element that you're looking for, as in A in B, the first operand,

41:04 doesn't have to be a single character. It can be a string itself.

41:08 Here you see, oh yeah, there is a "yt". So this is

41:13 True. And now you have this empty string in "Python". So that is a

41:20 little confusing. What does it mean? So here we just go with the convention that

41:24 an empty string is in every possible string. Okay, so empty string in anything,

41:30 whether it's "Python" or "Java" or "Houston", will always be True. So this is

41:36 True. And "Python" in "Python", the same string again, that is True. Okay, so in

41:47 and not in are really important, and handy anytime you want to check if something is

41:52 somewhere in the string. If we ask you, you know, you may have

41:56 a long string. Is there a "hello" in the string? Instead of going through

42:01 the string from one end to the other looking for an h, e, l, l,

42:05 o, you can just say "hello" in string. Tell me, you know,

42:11 print "hello" in string, and if there is a "hello" in the string, it'll print True.
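The membership tests discussed here can be sketched as below. The sample sentence is hypothetical and only illustrates the "hello" search.

```python
word = "Python"

has_n = "n" in word              # True: there is an "n" in "Python"
lacks_n = "n" not in word        # False: the opposite of the line above
has_yt = "yt" in word            # True: the left operand can be a substring
has_empty = "" in word           # True: the empty string is in every string
has_itself = "Python" in word    # True: a string contains itself

sentence = "Well, hello there"   # hypothetical long string
print("hello" in sentence)       # True
print("goodbye" in sentence)     # False

print(has_n, lacks_n, has_yt, has_empty, has_itself)
```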

42:16 If there's not, it'll print False. So you will be using in and not in in

42:22 your assignment. So hopefully these are the same as what I mentioned a second

42:33 ago. Again, changing topics: a few things that are quite useful when you try

42:40 to solve problems, which are not conceptually important but come in handy when you're trying to

42:46 do your work with strings in terms of programming. So there are some string

42:53 constants. The string constants that we'll look at here are provided by the string

43:01 module. So before using anything you have to say import string. So string dot digits

43:06 is a string, "0123456789". Okay, this comes in very handy anytime you have

43:14 a character and we ask you, is this a number or not? You just say,

43:20 if you have, say, a character x and you want to say if it's

43:26 a digit or not, just say x in string dot digits,

43:39 and if x was indeed a digit it will return True. If it wasn't,

43:44 it'll return False. So, you know if whatever you have in that variable

43:48 is a digit or not. And similarly you can say string dot

43:58 ascii_lowercase, which is, as you can imagine, all the lowercase letters. So again, you can check a

44:05 character to see if it is a lowercase letter. string.ascii_uppercase is the same thing, and then there is

44:17 string.punctuation, whatever the punctuation characters are that are legal here; I don't know what all of them

44:21 are. So if you're given a bunch of text and someone asks you to go

44:26 through every character and tell me how many punctuation characters there are, how many lower

44:31 case, how many upper case, it's straightforward to do that with these constants

44:38 and using the in operator. So that's pretty much all we're going to cover.
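The character-counting task mentioned above can be sketched with the `string` module constants. The sample text is hypothetical. In Python 3 the letter constants are named `string.ascii_lowercase` and `string.ascii_uppercase`.

```python
import string

text = "Hello, World! It is 2024."   # hypothetical sample text

counts = {"digits": 0, "lower": 0, "upper": 0, "punctuation": 0}

for ch in text:
    if ch in string.digits:
        counts["digits"] += 1
    elif ch in string.ascii_lowercase:
        counts["lower"] += 1
    elif ch in string.ascii_uppercase:
        counts["upper"] += 1
    elif ch in string.punctuation:
        counts["punctuation"] += 1
    # Blanks fall through all four tests and are not counted.

print(counts)
```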

44:48 In a nutshell, what's important, what's most relevant in strings:

44:56 You can access any element of a string through this indexing operator.

45:03 The methods we looked at were upper, lower, count, find. There are a few more.

45:08 There are a lot of simple things you can do with a built-in method, or you can

45:13 use these methods to build more complex programs, or do more complex things,

45:22 and not have to start from scratch every time. The len function you use all the

45:28 time; it comes in handy. One more thing: strings are immutable. So if you have

45:37 a string variable, variable[0] equals something, it doesn't matter what is here, this

45:44 is going to cause an error. Every time you try to assign to an element

45:48 inside a string, it cannot be done that way. So you'll get an

45:54 error. We looked at two ways of traversing with a loop. What's unique is to be

46:01 able to say for char in "Python", do stuff; the same as before, you

46:06 can just travel through every element of the string, and that's a unique and sometimes

46:11 really easy and efficient way of traversing a string. With a string, you can take

46:18 slices. They come in handy all the time. So, you know,

46:23 s[2:4] is characters 2 and 3, "th". You need to get familiar with that,

46:30 with string comparison, and then in and not in. And if you understand these

46:38 things well, you should be fine with strings. Thank you. That's the end

46:46 of this lecture.
