CS340 - MP12

Set and Map*


Introduction

In this assignment you will use the set and map classes of the Standard Template Library to accomplish a fairly sophisticated task.

The Problem

To begin, get a copy of the program words.cpp, which you can find here. This program reads an input file and lists all the words from each line in the file. If you run it with no command line arguments, it will read from standard input (until you press CONTROL-Z, which generates an end-of-file) and write to standard output. You can provide one or two file names on the command line. The first file is used for input and the second, if given, is used for output. The output from the program is simple: It lists the line number then lists all the words found on that line.

Your job is to modify the program so that it does the following:

  1. It should read the entire input file.
  2. Then it should output a list of all the words that occur in the file. This list should be in alphabetical order.
  3. Next to each word, you should put a list of the line numbers on which the word occurs.

For this problem, each word in the file is associated with a set of line numbers. You can represent this data as a map in which the keys are strings and the values are sets of integers. You can declare an object of this type as follows:

         map< string, set<int> >  words;

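(Note that with compilers that predate C++11, the space between the two >'s is essential: if the space is omitted, the compiler will interpret >> as a single token, the right-shift operator that streams also use for input. C++11 and later compilers accept >> here without the space.)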

Using this notation, if str is a word, then words[str] is the set of integers that is associated with the word. (Important note: This is valid even if str has not previously been used as a key. In that case, a new (key,value) pair is inserted into the map. The value in this pair is the empty set, and words[str] refers to this empty set.)
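
As an illustration of that note, here is a minimal sketch of recording one occurrence of a word. The helper name record_occurrence and the typedef WordIndex are hypothetical and are not part of words.cpp. The sketch relies only on the behavior described above: words[str] yields an empty set the first time str is used as a key.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical shorthand for the map type described above.
typedef std::map< std::string, std::set<int> > WordIndex;

// Record that `word` occurs on line `line_number`.
// If `word` is not yet a key, words[word] first inserts an empty set,
// so no separate "is this word new?" check is needed.
void record_occurrence(WordIndex& words, const std::string& word, int line_number)
{
    // set::insert ignores duplicates, so a word that appears twice
    // on the same line is still recorded only once for that line.
    words[word].insert(line_number);
}
```

Calling record_occurrence(words, "apple", 3) twice and then record_occurrence(words, "apple", 7) leaves words["apple"] holding just the two line numbers 3 and 7.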

To traverse the elements of the map, you will need an iterator for this map. The type of this iterator is:

        map< string, set<int> >::iterator

If p is such an iterator, then p->first is a word and p->second is the set associated with that word.

What You'll Turn In

Your submitted programs should adhere to the CS340 coding standards.

Submit a file called mp12.zip. The zipped file should contain the project as well as the source file so that your code may be easily recompiled. Do not include either the Debug or Release directories generated by .NET.


* based on an assignment from here.