

Fall 2018 CS24 Project 2

Due: December 07, 2018

Objectives: This is the last project you will work on this quarter. Its main objective is to give you experience developing larger programs in C++.

Project specification: In this project, you will build upon the functionality you implemented in Project 1, with a few modifications. You will create 2 executables with different functionalities.

1. The first executable takes as input a word to search for and a threshold value for the count, and must output the list of files in which the word appears at least (threshold) times. For example, given the word ‘cat’ and threshold 2, it must output the list of files in which the word ‘cat’ appears 2 or more times.

2. The second executable takes 2 words as input and must output the union of the 2 lists of files in which the 2 words appear.

Each of the 2 executables takes exactly one input during execution, so inputs do not need to be read in a while loop. It suffices for the program to exit after printing the output for that single input.

Input specification: The program takes as a command line argument a directory containing a set of files, each of which contains multiple words. A word can appear multiple times in a single file or in different files. The input files are assumed to be stored in a dedicated directory (e.g. /cs/class/cs24/project1/input) and are loaded into memory when the program starts. Use the given driver program to get a list of (word count, file) from the list of input files. After the words have been loaded into memory, each of the 2 executables takes a single input from the user, prints the corresponding output, and exits.

Program functionality:

Below you can find the format specification:

First executable:

$./wordsearchcount <path_to_input_files>

Examples for execution format:

$./wordsearchcount input

Enter word: cat

Enter threshold: 2

File: file1.txt; Count: 3

File: file4.txt; Count: 3

File: file2.txt; Count: 2

---Program exits

If executed as shown above, the program should print the names of the files that contain the word “cat” 2 or more times, together with their counts, and then exit.
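The threshold filter can be sketched as below. This is only an illustration, not the required implementation: `FileNode` and `formatMatches` are hypothetical names, and the early exit relies on the file list being sorted by decreasing count, as the implementation requirements later in this handout ask for.

```cpp
#include <sstream>
#include <string>

// Hypothetical node of a word's file list (names are assumptions, not from the handout).
struct FileNode {
    std::string filename;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Builds one "File: <name>; Count: <n>" line for every file whose count is
// at least `threshold`. Assumes the list is sorted by decreasing count.
std::string formatMatches(const FileNode* head, int threshold) {
    std::ostringstream out;
    for (const FileNode* cur = head; cur != nullptr; cur = cur->next) {
        if (cur->count < threshold) {
            break;  // every later node has an equal or smaller count
        }
        out << "File: " << cur->filename << "; Count: " << cur->count << "\n";
    }
    return out.str();
}
```

In wordsearchcount.cpp, the caller could read the word and threshold with std::cin and then write the returned string to std::cout.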

Second executable:

$./wordsearchunion <path_to_input_files>

Examples for execution format:

$./wordsearchunion input

Enter word1:cat

Enter word2:dog

file1.txt

file4.txt

file6.txt

file8.txt

---Program exits

If executed as shown above, the program should print the union of the two lists of files that contain the word “cat” and the word “dog”, i.e. every file that contains at least one of the two words.

Implementation requirements:

You will retain the same functionality as Project 1, but with some key changes:

1. Instead of using an array to store the list of words, you will create a doubly linked list of words, sorted in alphabetical order.

2. Instead of using a bag to store the list of files, you will again create a doubly linked list, this time of filenames with their respective counts. This list must be sorted in decreasing order of the number of occurrences of the word in each file. For example, if file A contains the word 3 times and file B contains the word 1 time, the list will have file A first, followed by file B.
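One possible way to keep the file list ordered is to insert each (filename, count) pair at its sorted position. The sketch below assumes the final count for a file is known when it is inserted; the names `FileNode`, `FileList`, and `insertSorted` are hypothetical, and the list implementation discussed in class may look different.

```cpp
#include <cassert>
#include <string>

// Hypothetical node: one file name and how often the word occurs in it.
struct FileNode {
    std::string filename;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Hypothetical doubly linked list kept in decreasing order of count.
class FileList {
public:
    FileList() : head_(nullptr), tail_(nullptr) {}
    ~FileList() {
        while (head_ != nullptr) {
            FileNode* doomed = head_;
            head_ = head_->next;
            delete doomed;
        }
    }
    FileList(const FileList&) = delete;
    FileList& operator=(const FileList&) = delete;

    // Inserts before the first node with a smaller count, so equal counts
    // keep their insertion order.
    void insertSorted(const std::string& filename, int count) {
        assert(count > 0);  // a listed file contains the word at least once
        FileNode* node = new FileNode{filename, count, nullptr, nullptr};
        FileNode* cur = head_;
        while (cur != nullptr && cur->count >= count) {
            cur = cur->next;
        }
        if (cur == nullptr) {  // empty list, or smallest count so far: append
            node->prev = tail_;
            if (tail_ != nullptr) { tail_->next = node; } else { head_ = node; }
            tail_ = node;
        } else {               // splice in just before cur
            node->next = cur;
            node->prev = cur->prev;
            if (cur->prev != nullptr) { cur->prev->next = node; } else { head_ = node; }
            cur->prev = node;
        }
    }

    const FileNode* head() const { return head_; }
    const FileNode* tail() const { return tail_; }

private:
    FileNode* head_;
    FileNode* tail_;
};
```

The same insert-at-sorted-position idea would apply to the word list, comparing the words alphabetically instead of comparing counts.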

Key implementation detail:

In Project 1, the file name and the respective count of the word were displayed from within the bag.cpp file (i.e. the print method was inside bag.cpp). In this project, for each of the 2 executables, the reference to the linked list must be returned to the calling function in the wordsearchcount.cpp / wordsearchunion.cpp file, and the output must be printed from within that file.
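A minimal sketch of this separation follows, assuming hypothetical `WordNode` and `FileNode` structs and a hypothetical `findFiles` lookup. The lookup only hands back a pointer to the word's file list and prints nothing itself. A pointer is used here so that "word not found" can be signalled with nullptr; the course code may return a reference or a list object instead.

```cpp
#include <string>

// Hypothetical node of a word's file list.
struct FileNode {
    std::string filename;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Hypothetical node of the alphabetically sorted word list.
struct WordNode {
    std::string word;
    FileNode* files;  // head of this word's file list
    WordNode* prev;
    WordNode* next;
};

// Returns the head of the file list for `target`, or nullptr if the word is
// not stored. Uses std::string's case-sensitive ordering, which is assumed
// to match the order in which the word list was built.
const FileNode* findFiles(const WordNode* head, const std::string& target) {
    for (const WordNode* cur = head; cur != nullptr; cur = cur->next) {
        if (cur->word == target) {
            return cur->files;
        }
        if (cur->word > target) {
            break;  // list is sorted, so target cannot appear later
        }
    }
    return nullptr;
}
```

The calling function in wordsearchcount.cpp or wordsearchunion.cpp would then walk the returned list and do all of the printing.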

A basic layout of the data structure you need to implement is illustrated in the figure below:

Pointers for code changes in the implementation:

1. Word.cpp must now contain a reference to a doubly linked list of files.

2. Bag.cpp must now be replaced by list.cpp, which must take care of the functionality for the doubly linked list of files.

3. Itemtype.cpp may be reused as before.

4. To find the union of the 2 linked lists of files, use the pseudocode below as a reference:

Create a new list and add the files from the first list (the list of files of word 1) to it.

Take each element of the second list (the list of files of word 2) and check whether it is present in the newly created list. If it is not, add it to the list; if it is already present, do not add it.

At the end, the new list will contain the union of the 2 lists.

Print out the list of files in this new list.

5. You are advised to use the doubly linked list implementation discussed in class and modify it to suit the needs of the project.
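The union pseudocode in item 4 could be translated roughly as follows. This is a sketch under stated assumptions: `FileNode`, `FileList`, `append`, `contains`, `clear`, and `unionOf` are hypothetical names, and counts are left out because the union output only shows file names.

```cpp
#include <cassert>
#include <string>

// Hypothetical node holding only the file name.
struct FileNode {
    std::string filename;
    FileNode* prev;
    FileNode* next;
};

// Hypothetical minimal doubly linked list; the caller releases it with clear().
struct FileList {
    FileNode* head = nullptr;
    FileNode* tail = nullptr;
};

// Adds a new node at the end of the list.
void append(FileList& list, const std::string& name) {
    FileNode* node = new FileNode{name, list.tail, nullptr};
    if (list.tail != nullptr) { list.tail->next = node; } else { list.head = node; }
    list.tail = node;
}

// Linear search for a file name.
bool contains(const FileList& list, const std::string& name) {
    for (const FileNode* cur = list.head; cur != nullptr; cur = cur->next) {
        if (cur->filename == name) { return true; }
    }
    return false;
}

// Frees every node and leaves the list empty.
void clear(FileList& list) {
    while (list.head != nullptr) {
        FileNode* doomed = list.head;
        list.head = list.head->next;
        delete doomed;
    }
    list.tail = nullptr;
}

// Fills `result` (which must start out empty) with the union of `first` and `second`.
void unionOf(const FileList& first, const FileList& second, FileList& result) {
    assert(result.head == nullptr);
    // Step 1: copy every file of the first list.
    for (const FileNode* cur = first.head; cur != nullptr; cur = cur->next) {
        append(result, cur->filename);
    }
    // Step 2: add files of the second list that are not present yet.
    for (const FileNode* cur = second.head; cur != nullptr; cur = cur->next) {
        if (!contains(result, cur->filename)) {
            append(result, cur->filename);
        }
    }
}
```

With at most 100 files, the repeated linear `contains` check stays cheap. Printing the names in `result` would happen in wordsearchunion.cpp.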

As before, you may assume the maximum number of files to be 100 and the maximum

number of words to be 1000.

You will need 10 files: wordsearchcount.h, wordsearchcount.cpp, wordsearchunion.h,

wordsearchunion.cpp, itemtype.h, itemtype.cpp, list.h, list.cpp, word.h, word.cpp.

Instructions for compilation: Your program should compile on a CSIL machine with the

following command without any errors or warnings.

$ g++ -o wordsearchcount wordsearchcount.cpp itemtype.cpp list.cpp word.cpp

Turn-in procedure:

Upload your files to GauchoSpace under the corresponding project submission folder.

