Comp 112

Lecture 12
Files and Exceptions

2018.04.17

What Could Possibly Go Wrong?

There are many things that can go wrong in a program:

unassigned_variable += 1           #  NameError: name 'unassigned_variable' is not defined

42 [3]                             #  TypeError: 'int' object is not subscriptable

answer = 42 / 0                    #  ZeroDivisionError: division by zero

number = int ('forty two')         #  ValueError: invalid literal for int() with base 10: 'forty two'

Some of these errors could be caught by static analysis, that is, by analyzing the structure of a program without running it.

But others depend on how the program interacts with the outside world (user input, files on disk) and can’t be predicted in advance.

Proceeding with Caution

One way to avoid errors is to use many conditional statements and only try to do things that we know will succeed.

def is_int_string (text) :
    # a predicate for strings parseable as ints
    return str.isdigit (text) or (len (text) > 0 and text [0] == '-' and str.isdigit (text [1 : ]))

def cautious_interactive_division () :
    x_str = input ('please enter the numerator: ')
    y_str = input ('please enter the denominator: ')
    if is_int_string (x_str) and is_int_string (y_str) :
        # it's safe to parse ints from them
        x_int = int (x_str)
        y_int = int (y_str)
        if y_int != 0 :
            # it's safe to divide them
            result = x_int / y_int
            print (x_str + ' / ' + y_str + ' = ' + str (result))
        else :
            print ("you can't divide by zero")
    else :
        print ('in order to divide you must enter two numbers')

Ploughing Ahead

But sometimes it’s easier to ask forgiveness than permission.

def brash_interactive_division () :
    x_str = input ('please enter the numerator: ')
    y_str = input ('please enter the denominator: ')
    try :
        x_int = int (x_str)        # may raise a ValueError
        y_int = int (y_str)        # may raise a ValueError
        result = x_int / y_int     # may raise a ZeroDivisionError
        print (x_str + ' / ' + y_str + ' = ' + str (result))
    except ZeroDivisionError :
        print ("you can't divide by zero")
    except ValueError :
        print ('in order to divide you must enter two numbers')

This style can make programs clearer and more concise by focusing on the main execution path.

Try Statements

Exceptions introduce a new flow-control construct, the try statement:

try :
    <any_block_of_code_that_could_result_in_an_exception>
except <exception_type> :
    <the_block_of_code_to_run_if_the_specified_exception_occurred>
⋮
except :
    <the_block_of_code_to_run_if_any_other_exception_occurred>

The try statement has two parts: a try clause and one or more except clauses.
The try clause is run unconditionally.
If the try clause ends without encountering any exceptions then the except clauses are all skipped.
If an exception does occur within the try clause then execution jumps to the first except clause whose guard matches the exception type (like an elif clause).
The last except clause may omit the guard, in which case it handles any exceptions not matching the guard of an earlier except (like an else clause).
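Only the first matching except clause will get executed.
If no except clause matches, the exception is not handled by this try statement and keeps propagating outward.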
If an exception is handled by an except clause, program execution continues on from the end of the try statement.
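
The rules above can be traced with a small sketch. The function name trace_division and the idea of recording steps in a list are hypothetical, introduced only for illustration; the list shows which clauses actually ran.

```python
def trace_division (x_str , y_str) :
    # hypothetical helper: returns a list naming the steps that actually ran
    steps = []
    try :
        steps.append ('try started')
        x_int = int (x_str)          # may raise a ValueError
        y_int = int (y_str)          # may raise a ValueError
        result = x_int / y_int       # may raise a ZeroDivisionError
        steps.append ('try finished')
    except ValueError :
        steps.append ('handled ValueError')
    except ZeroDivisionError :
        steps.append ('handled ZeroDivisionError')
    except :
        steps.append ('handled something else')
    # execution continues here whether or not an exception was handled
    steps.append ('after the try statement')
    return steps
```

Calling trace_division ('6' , '0') skips the rest of the try clause as soon as the division fails, runs only the ZeroDivisionError clause, and then continues after the try statement.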

Raising Exceptions

You can intentionally cause an exception to occur in your program with a raise statement:
```
raise Exception ('uh oh, something went wrong')
```
You can use try statements to handle your own exceptions just like the built-in ones.
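
As a minimal sketch of handling your own exception, the two function names below are hypothetical. The "as error" form, which gives the handler a name for the exception object, is an addition not shown elsewhere in these slides.

```python
def risky () :
    # hypothetical function that always fails
    raise Exception ('uh oh, something went wrong')

def run_safely () :
    try :
        risky ()
        return 'no problem'
    except Exception as error :
        # str (error) gives back the message passed to Exception
        return 'caught: ' + str (error)
```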

A common pattern is to conditionally raise an exception.

if some_list == [] :
    raise Exception ('the list is not allowed to be empty')

There is a built-in shorthand for this, the assert statement, which raises an AssertionError (carrying the given message) when its condition is false:

assert some_list != [] , 'the list is not allowed to be empty'

Assertions can be used to enforce function specifications:

def quot_rem (x , y) :
    # signature:  int , int -> tuple (int , int)
    # precondition:  y != 0
    assert y != 0 , "can't divide by zero"
    return (x // y , x % y)
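
A failed assertion can be handled like any other exception. The sketch below repeats quot_rem so that it is self-contained; the wrapper try_quot_rem is a hypothetical name, and it assumes Python is running normally (assertions are skipped when Python is started with the -O option).

```python
def quot_rem (x , y) :
    # signature:  int , int -> tuple (int , int)
    # precondition:  y != 0
    assert y != 0 , "can't divide by zero"
    return (x // y , x % y)

def try_quot_rem (x , y) :
    # hypothetical wrapper: reports a violated precondition instead of crashing
    try :
        return quot_rem (x , y)
    except AssertionError as error :
        return 'precondition violated: ' + str (error)
```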

Files and Paths

In order to access a file on your computer, you must first locate it.
Files are organized in a hierarchy of directories, which are typically presented to the user as “folders”.
The path to a file is a sequence of nested directory traversals, ending in the desired file:
```
outer_dir/middle_dir/inner_dir/my_file.txt
```

There are two different ways of specifying a path:

relative paths specify the location of a file starting from the current directory. This is the default.
absolute paths specify the location of a file starting from the top of the computer’s filesystem.

Absolute paths are indicated with a leading “/” (in Windows, a drive letter may also be needed, e.g. “C:”).
There is a shorthand for the current directory, written “.” (think, “here”).
There is a shorthand for the parent directory of a given directory, written “..” (think, “up”).
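
These ideas can be tried out with functions from Python's standard os.path module. This is only a sketch: the directory names are the made-up ones from the example path above, and describe_path is a hypothetical helper.

```python
import os

def describe_path (path) :
    # hypothetical helper: says which of the two kinds of path this is
    return 'absolute' if os.path.isabs (path) else 'relative'

# build a nested path using the separator of the current operating system
nested = os.path.join ('outer_dir' , 'middle_dir' , 'inner_dir' , 'my_file.txt')

# '..' goes up one level, so this simplifies to outer_dir/my_file.txt
up_one = os.path.normpath (os.path.join ('outer_dir' , 'middle_dir' , '..' , 'my_file.txt'))

# '.' means "here", so this simplifies to my_file.txt
here = os.path.normpath (os.path.join ('.' , 'my_file.txt'))

# os.path.abspath turns a relative path into an absolute one
absolute = os.path.abspath (nested)
```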

Files and Paths in Python

To work with paths in Python, we use the “os” library:
```
import os
```
To test whether a string represents a valid path to a file or directory, use the os.path.exists function.
```
os.path.exists ('sample')
```
To test whether the object represented by a path is a file (rather than a directory), use the os.path.isfile function.
```
os.path.isfile ('sample/file_1.txt')
```
To test whether the object represented by a path is a directory (rather than a file), use the os.path.isdir function.
```
os.path.isdir ('sample/dir_1')
```
If a path represents a directory, you can list its contents with the os.listdir function.
```
os.listdir ('sample')  #  raises NotADirectoryError if the path refers to a file
```
You can get the current directory with the os.getcwd function.
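
The functions above can be combined into one small sketch. It assumes nothing about your own files: it builds a throwaway directory with the standard tempfile module, and the helper name classify is hypothetical.

```python
import os
import tempfile

def classify (path) :
    # hypothetical helper: describes what, if anything, a path refers to
    if not os.path.exists (path) :
        return 'missing'
    elif os.path.isfile (path) :
        return 'file'
    elif os.path.isdir (path) :
        return 'directory'
    else :
        return 'other'

# build a throwaway directory containing one subdirectory and one empty file
sample = tempfile.mkdtemp ()
os.mkdir (os.path.join (sample , 'dir_1'))
file_1 = open (os.path.join (sample , 'file_1.txt') , 'w')
file_1.close ()

# os.listdir returns names in no particular order, so sort them
contents = sorted (os.listdir (sample))

# the current directory is always an existing directory
current = os.getcwd ()
```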

Opening Files

To open a file in Python, use the “open” function:

open (<path> , <mode>)

The open function takes two arguments:

a string describing the path to the file you want to open,
a string describing the mode for opening the file.

The modes we care about for now are:

read mode ('r'): opens an existing file for reading input, starting at the beginning.
create mode ('x'): creates a new file and opens it for writing output. The file must not already exist.
append mode ('a'): opens a file for writing output, with new content appended to the end of any existing content.
overwrite mode ('w'): opens a file for writing output, overwriting any previous file contents.

If a call to open is successful then it will return a filehandle object.

This is a token representing your right to read from or write to the file.
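
The difference between the four modes can be seen in a short sketch. The file name modes_demo.txt is made up, and the file lives in a throwaway directory created with the standard tempfile module. It uses the write, readlines and close methods of the filehandle.

```python
import os
import tempfile

path = os.path.join (tempfile.mkdtemp () , 'modes_demo.txt')

file = open (path , 'x')        # create: the file must not exist yet
file.write ('first\n')
file.close ()

file = open (path , 'a')        # append: existing content is kept
file.write ('second\n')
file.close ()

file = open (path , 'r')        # read: start at the beginning
after_append = file.readlines ()
file.close ()

file = open (path , 'w')        # overwrite: existing content is discarded
file.write ('third\n')
file.close ()

file = open (path , 'r')
after_overwrite = file.readlines ()
file.close ()
```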

File Handling Exceptions

If you open a path in read mode ('r') and no file exists there then open will raise a FileNotFoundError.
If you open a path in create mode ('x') and a file already exists there then open will raise a FileExistsError.
If you give open a path to a directory rather than a file then it will raise an IsADirectoryError.

Rather than using conditionals for all the things that could possibly go wrong, it’s easier to use a try statement:

try :
    file = open (path , 'r')  #  possible FileNotFoundError , IsADirectoryError
except FileNotFoundError :
    print ("There's no such file")
except IsADirectoryError :
    print ("That's a directory, not a file")
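
The same idea can be packaged as a function that reports what happened. This is a sketch: try_open and the file name new_file.txt are hypothetical, and the demonstration uses a throwaway directory from the standard tempfile module.

```python
import os
import tempfile

def try_open (path , mode) :
    # hypothetical helper: tries to open a file and reports the outcome
    try :
        file = open (path , mode)
        file.close ()
        return 'opened'
    except FileNotFoundError :
        return 'no such file'
    except FileExistsError :
        return 'already exists'
    except IsADirectoryError :
        return 'is a directory'

new_path = os.path.join (tempfile.mkdtemp () , 'new_file.txt')

read_missing = try_open (new_path , 'r')    # nothing has been created yet
create_new = try_open (new_path , 'x')      # create mode makes the file
create_again = try_open (new_path , 'x')    # now the file is already there
```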

File Input

To read a single line of text from a file opened in read mode ('r') as a string, call the “readline” method on the filehandle object:
```
file = open (path , 'r')
first_line = file.readline ()
```
To read all remaining lines of text as a list of strings, call the “readlines” method on the filehandle object:
```
rest_lines = file.readlines ()
file.close ()
```
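
Here is a self-contained sketch of both methods. It first writes a small three-line file (the name lines_demo.txt is made up) into a throwaway directory so that there is something to read.

```python
import os
import tempfile

path = os.path.join (tempfile.mkdtemp () , 'lines_demo.txt')

# set up a small file to read from
file = open (path , 'w')
file.write ('one\ntwo\nthree\n')
file.close ()

file = open (path , 'r')
first_line = file.readline ()     # reads up to and including the first newline
rest_lines = file.readlines ()    # reads every line that is left
at_end = file.readline ()         # nothing is left, so this is the empty string
file.close ()
```

Note that each line keeps its trailing newline character, and that readlines only returns the lines that have not been read yet.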

Or you can iterate over the lines in a file with a for loop:

def print_file (path) :
    """
    signature: str -> NoneType
    opens the file at the given path and prints its contents to the terminal
    """
    try :
        file = open (path , 'r')
        for line in file :
            print (line [ : -1] if line [-1] == '\n' else line)  #  trim trailing newlines
        file.close ()
    except :
        print ('no can do')

Once you are done with a filehandle, you should always close it using the “close” method.

File Output

To write a string to a file opened in create ('x'), append ('a') or overwrite ('w') mode, use the “write” method.

def copy_shouty (in_path , out_path) :
    try :
        in_file = open (in_path , 'r')    # possible FileNotFoundError , IsADirectoryError
        out_file = open (out_path , 'x')  # possible FileExistsError
        for line in in_file :
            out_file.write (line.upper ())
        in_file.close ()
        out_file.close ()
    except FileExistsError :
        in_file.close ()                  # the input file was opened successfully, so close it
        print ('the file ' + out_path + ' already exists.')
    except FileNotFoundError :
        print ('the file ' + in_path + ' does not exist.')
    except IsADirectoryError :
        print (in_path + ' is a directory, not a file.')

The Read-Modify-Write Pattern

A common pattern of working with files is to:

open a file, read some data from it, close the file,
process the data somehow,
open a file (either the same one or a new one), write the processed data to it, close the file.

def alphabetize (in_path , out_path) :
    # alphabetizes the contents of a file containing one word per line
    try :
        # read:
        in_file = open (in_path , 'r')      # possible FileNotFoundError , IsADirectoryError
        words = in_file.readlines ()
        in_file.close ()
        # modify:
        words.sort ()
        # write:
        out_file = open (out_path , 'x')    # possible FileExistsError
        for line in words :
            out_file.write (line)
        out_file.close ()
    except FileNotFoundError :
        print ('the file ' + in_path + ' does not exist.')
    except IsADirectoryError :
        print (in_path + ' is a directory, not a file.')
    except FileExistsError :
        print ('the file ' + out_path + ' already exists.')

To Do This Week:

Reading:
- section 9.1: file reading (1 page)
- section 14.2: file writing (1 page)
- section 14.4: paths (1 page)
- section 14.5: exceptions (1 page)
Come to lab on Thursday.
Complete homework 12.
