Comp 112
Lecture 4
Strings and Methods
2018.02.20
We know that a string is a sequence of characters.
So far we can:
write (print) and read (input) strings,
concatenate (_+_) and repeat (_*_) them,
compare them for equality (_==_) and difference (_!=_),
and determine their length (len).
Now, we’ll learn how to do more sophisticated string manipulations.
The index operator, “_[_]”, looks up a character in a string by its position.
The index is an integer representing an offset from the beginning of the string.
The character at an index is the one that “begins” at that offset.
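A minimal sketch of indexing, using an arbitrary sample string (the string and variable names are illustrative, not from the lecture):

```python
s = "hello"

first = s[0]   # offset 0 from the beginning: 'h'
second = s[1]  # offset 1: 'e'
last = s[4]    # offset 4, the final character of a 5-character string: 'o'

print(first, second, last)
```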
For convenience, you may specify a negative index.
The index [-n] is just a shorthand for [len(<the_string>) - n].
This has the effect of counting backward from the end of the string.
Any position in a string can be specified by either a forward (i.e. non-negative) or a backward (i.e. negative) index.
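As a quick check of the shorthand, the sketch below looks up the same position with a backward and a forward index (the sample string and the value of n are arbitrary choices):

```python
s = "hello"
n = 2

backward = s[-n]           # counts back from the end: 'l'
forward = s[len(s) - n]    # the same position written as a forward index: 'l'
final = s[-1]              # the last character: 'o'

print(backward, forward, final)
```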
In order to be a valid index for a string s,
a non-negative integer i must satisfy: 0 <= i < len(s)
a negative integer j must satisfy: -len(s) <= j < 0
It is an error to provide an out-of-bounds index:
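A small sketch of what happens with an out-of-bounds index. Python raises an IndexError; the try/except here is only an assumption to keep the script running and may not have been covered in the course yet:

```python
s = "hello"  # len(s) is 5, so valid indices run from -5 to 4

try:
    s[5]                  # one past the last valid forward index
    error_raised = False
except IndexError:
    error_raised = True   # Python reports "string index out of range"

print(error_raised)
```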
Combining looping with indexing to process a string:
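One possible example, assuming a while loop with a counter variable i as the looping construct. It counts how many times a chosen letter occurs (the string and the letter are illustrative):

```python
s = "hello"

count = 0
i = 0
while i < len(s):       # i walks through every valid forward index
    if s[i] == "l":     # look up the character at position i
        count = count + 1
    i = i + 1

print(count)
```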
The slice operator, “_[_:_]”, returns the substring of a string between two indices.
The start index is the index of the first character included in the substring.
The stop index is one greater than the index of the last character included in the substring.
If the start index is equal to the stop index then the substring will be empty.
For convenience, you can omit either index:
if the start index is omitted, then “0” is assumed.
if the stop index is omitted, then “len(<string>)” is assumed.
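The slicing rules above can be sketched as follows, using an arbitrary sample string:

```python
s = "hello world"

word1 = s[0:5]     # indices 0 through 4: 'hello'
word2 = s[6:11]    # indices 6 through 10: 'world'
empty = s[3:3]     # start equals stop, so the substring is empty: ''

prefix = s[:5]     # start omitted, 0 is assumed: 'hello'
suffix = s[6:]     # stop omitted, len(s) is assumed: 'world'

print(word1, word2, prefix, suffix)
```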
To keep functions logically grouped and avoid name clashes, they are organized into namespaces.
To refer to a function from a namespace, we use dot notation:
We have already seen a few examples of this:
Namespaces can hold global variables as well as functions:
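The lecture's own examples are not reproduced here. As an illustration, the sketch below assumes the standard math module as the namespace; it holds both functions and global variables:

```python
import math

root = math.sqrt(16)   # the function sqrt from the math namespace: 4.0
circle = math.pi       # the global variable pi from the same namespace

print(root, circle)
```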
Many useful functions for strings are located in the “str
” namespace, including:
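The original list of functions is not reproduced here. The sketch below shows a few functions that do exist in the str namespace, called with dot notation; which ones the lecture listed is an assumption:

```python
shout = str.upper("Hello")                 # 'HELLO'
quiet = str.lower("Hello")                 # 'hello'
position = str.find("Hello", "l")          # index of the first 'l': 2
swapped = str.replace("Hello", "l", "L")   # 'HeLLo'
trimmed = str.strip("  Hello  ")           # surrounding whitespace removed: 'Hello'

print(shout, quiet, position, swapped, trimmed)
```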
Method syntax is a different notation for calling a function.
It is often used when one argument to a function is considered “principal”.
Instead of writing:
we write:
Since Python knows the type of the principal argument, it can look up the right namespace automatically.
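A minimal sketch of the two notations side by side, assuming str.upper as the function and s as the principal argument:

```python
s = "Hello"

as_function = str.upper(s)   # namespace notation: the string is the first argument
as_method = s.upper()        # method syntax: Python finds the str namespace from the type of s

print(as_function, as_method)
```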
Method syntax is more concise but rather restrictive. It works only for functions that:
live in a namespace corresponding to a type,
take as first argument a value of that type.
All of the functions from the “str
” namespace that we just saw can be written as methods:
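For instance, assuming find, replace, and strip are among those functions, each call can be written either way and gives the same result:

```python
s = "  Hello  "

# Each pair calls the same function; only the notation differs.
same_strip = str.strip(s) == s.strip()
same_find = str.find(s, "l") == s.find("l")
same_replace = str.replace(s, "l", "L") == s.replace("l", "L")

# Method calls can also be chained, since each one returns a new string.
chained = s.strip().lower()   # 'hello'

print(same_strip, same_find, same_replace, chained)
```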
The comparison operators (_<_, _>_, _<=_, _>=_)
are overloaded to compare strings in dictionary order,
except that all uppercase letters come before any lowercase letters:
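A few illustrative comparisons (the words are arbitrary choices):

```python
a = "apple" < "banana"   # True: 'a' comes before 'b'
b = "apple" < "apply"    # True: the first difference is 'e' versus 'y'
c = "app" < "apple"      # True: a prefix comes before the longer word
d = "Zebra" < "apple"    # True: uppercase 'Z' comes before lowercase 'a'

print(a, b, c, d)
```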
To do case-insensitive string comparisons, case-normalize before comparing:
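One way to case-normalize is the lower method, sketched here on two sample words (upper would work equally well):

```python
x = "Zebra"
y = "apple"

raw = x < y                          # True: 'Z' is uppercase, so it sorts first
normalized = x.lower() < y.lower()   # False: 'zebra' comes after 'apple'
same_word = "HELLO".lower() == "Hello".lower()   # True: both become 'hello'

print(raw, normalized, same_word)
```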
in Operator
The in operator (_in_) takes two strings and returns a boolean.
It returns True exactly when the left operand occurs as a substring (i.e. a slice) of the right operand.
Notice that the empty string ('') is in every string, even itself:
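A short sketch of these cases, with arbitrary sample strings:

```python
found = "ell" in "hello"      # True: 'ell' is the slice "hello"[1:4]
missing = "hex" in "hello"    # False: no slice of 'hello' equals 'hex'
scattered = "hlo" in "hello"  # False: the letters occur, but not contiguously

empty_in_word = "" in "hello"   # True
empty_in_empty = "" in ""       # True

print(found, missing, scattered, empty_in_word, empty_in_empty)
```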
for Loops
It is common to write a loop over a string that processes each character in turn.
For this task, there is a simplified looping construct, the for loop:
A for loop automatically assigns each character of the string, in turn, to the loop variable before running the loop body.
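A minimal sketch of a for loop over a string. The loop variable ch and the task (joining the characters with dashes) are illustrative choices:

```python
s = "hello"

spaced = ""
for ch in s:                    # ch is 'h', then 'e', then 'l', 'l', 'o'
    spaced = spaced + ch + "-"

print(spaced)
```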
An equivalent while loop would be:
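A sketch of such a while loop, assuming the task is to join the characters of a sample string with dashes. The index i has to be created, tested, and advanced by hand:

```python
s = "hello"

spaced = ""
i = 0
while i < len(s):
    ch = s[i]                   # the assignment a for loop would do automatically
    spaced = spaced + ch + "-"
    i = i + 1

print(spaced)
```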
A for loop relieves you of having to keep track of the index.
But for loops can iterate in only one way: by moving forward through the string and processing each letter in turn,
whereas while loops can iterate in any order and based on any boolean condition.
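As one illustration of that flexibility, the while loop sketched below walks backward through a sample string to build its reverse (the task is an assumption chosen for the example):

```python
s = "hello"

backward = ""
i = len(s) - 1          # start at the last valid index
while i >= 0:           # stop after index 0 has been processed
    backward = backward + s[i]
    i = i - 1

print(backward)
```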
Reading:
Come to lab on Thursday.
Complete homework 4.