Fall 2018 CS24 Project 2
Due: December 07, 2018
Objectives: This project is the last project you will be working on this quarter. The main
objective of this project is to gain experience developing larger programs in C++.
Project specification: In this project, you will be building upon the functionality you built in
Project 1, with a few modifications. You will be creating 2 executables with different
functionalities for this project.
1. The first executable takes as input a word to search for and a threshold value for the
count, and must output the list of files in which the word appears at least (threshold)
times. For example, given the word ‘cat’ and threshold 2, it must output the list of files
in which ‘cat’ appears 2 or more times.
2. In the second executable, you will be taking as input 2 words, and you must output
the union of the 2 lists of files in which the 2 words appear.
Each of the 2 executables takes exactly one input during execution, so there is no need to
read inputs in a loop. The program may exit after printing the output for that single
input.
Input specification: The program takes as a command line argument a directory that
contains a set of files, each containing multiple words. Words can appear multiple times in
a single file or in different files. The input files are assumed to be stored in a dedicated
directory (e.g. /cs/class/cs24/project1/input) and are loaded into memory when the program
starts. Use the given driver program to get a list of (word count, file) from the list of
input files. After the words have been loaded into memory, each of the 2 executables takes a
single input from the user, prints the corresponding output, and exits.
Program functionality:
Below you can find the format specification:
First executable:
$./wordsearchcount <path_to_input_files>
Examples for execution format:
$./wordsearchcount input
Enter word: cat
Enter threshold: 2
File: file1.txt; Count: 3
File: file4.txt; Count: 3
File: file2.txt; Count: 2
---Program exits
If executed as specified above, the program should print the names of the files that contain
the word “cat” 2 or more times, together with their counts, and exit.
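The threshold query can take advantage of the required sort order of the file list. The following is a minimal sketch, assuming a hypothetical FileNode type and a hypothetical printAtLeast helper (neither name comes from the course code); it prints in the format shown in the example above and stops at the first file whose count falls below the threshold.

```cpp
#include <iostream>
#include <string>

// Hypothetical node of the per-word file list (assumed sorted by decreasing count).
struct FileNode {
    std::string name;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Print every file whose count is at least `threshold` and return how many
// lines were printed. Because the list is sorted by decreasing count, the
// walk can stop at the first node that falls below the threshold.
int printAtLeast(const FileNode* head, int threshold, std::ostream& out) {
    int printed = 0;
    for (const FileNode* cur = head;
         cur != nullptr && cur->count >= threshold;
         cur = cur->next) {
        out << "File: " << cur->name << "; Count: " << cur->count << '\n';
        ++printed;
    }
    return printed;
}
```

A caller in wordsearchcount.cpp could pass std::cout as the stream. The early exit is only valid if the list really is kept in decreasing order of count.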
Second executable:
$./wordsearchunion <path_to_input_files>
Examples for execution format:
$./wordsearchunion input
Enter word1:cat
Enter word2:dog
file1.txt
file4.txt
file6.txt
file8.txt
---Program exits
If executed as specified above, the program should print the union of the lists of files that
contain the words cat or dog.
Implementation requirements:
You will be retaining the same functionality as Project 1, but with some key changes:
1. Instead of using an array to store the list of words, you will create a doubly linked
list of words, kept sorted in alphabetical order.
2. Instead of using a bag to store the list of files, you will create another doubly linked
list, this one of filenames with their respective counts. This list must be sorted by the
number of occurrences of the word in each file, in decreasing order. For example, if
file A contains the word 3 times and file B contains the word 1 time, the list will have
file A first, followed by file B.
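One way to keep the file list in decreasing order of count is to insert each entry at its sorted position. The sketch below assumes hypothetical FileNode and FileList types and a hypothetical insertSorted method; the list implementation discussed in class may be organized differently.

```cpp
#include <string>

// Hypothetical node type: one file name and how often the word occurs in it.
struct FileNode {
    std::string name;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Hypothetical list type; the names are illustrative, not from the course code.
struct FileList {
    FileNode* head = nullptr;
    FileNode* tail = nullptr;

    FileList() = default;
    FileList(const FileList&) = delete;             // no copying in this sketch
    FileList& operator=(const FileList&) = delete;
    ~FileList() {
        while (head != nullptr) {
            FileNode* next = head->next;
            delete head;
            head = next;
        }
    }

    // Insert so that counts stay in decreasing order from head to tail.
    // Entries with equal counts keep their insertion order.
    void insertSorted(const std::string& name, int count) {
        FileNode* node = new FileNode{name, count, nullptr, nullptr};
        // Find the first node whose count is smaller than the new count.
        FileNode* cur = head;
        while (cur != nullptr && cur->count >= count) {
            cur = cur->next;
        }
        if (cur == nullptr) {            // append at the tail (or into an empty list)
            node->prev = tail;
            if (tail != nullptr) tail->next = node; else head = node;
            tail = node;
        } else {                         // splice in just before cur
            node->next = cur;
            node->prev = cur->prev;
            if (cur->prev != nullptr) cur->prev->next = node; else head = node;
            cur->prev = node;
        }
    }
};
```

The same splice logic could serve the alphabetical word list by comparing the word strings with < instead of comparing counts.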
Key implementation detail:
In Project 1, the file name and the respective count of the word were displayed from within
the bag.cpp file (i.e., the print method was inside bag.cpp). In this project, for each of
the 2 executables, a reference to the linked list must be returned to the calling function
in wordsearchcount.cpp / wordsearchunion.cpp, and the output must be printed from within
that file.
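The separation described above, where the list is returned by reference and printed by the caller, might look like the following sketch. The Word and FileList classes and the files(), append(), and printNames() names are assumptions made for illustration; the real interfaces depend on your Project 1 code.

```cpp
#include <iostream>
#include <string>

// Hypothetical node and list types; the real ones would live in list.h / list.cpp.
struct FileNode {
    std::string name;
    int count;
    FileNode* prev;
    FileNode* next;
};

class FileList {
public:
    FileList() : head_(nullptr), tail_(nullptr) {}
    FileList(const FileList&) = delete;             // keep ownership simple here
    FileList& operator=(const FileList&) = delete;
    ~FileList() {
        while (head_ != nullptr) {
            FileNode* next = head_->next;
            delete head_;
            head_ = next;
        }
    }
    void append(const std::string& name, int count) {
        FileNode* node = new FileNode{name, count, tail_, nullptr};
        if (tail_ != nullptr) tail_->next = node; else head_ = node;
        tail_ = node;
    }
    const FileNode* head() const { return head_; }
private:
    FileNode* head_;
    FileNode* tail_;
};

// Hypothetical Word class: it owns the file list but has no print method.
class Word {
public:
    explicit Word(const std::string& text) : text_(text) {}
    const std::string& text() const { return text_; }
    FileList& files() { return files_; }              // used while loading
    const FileList& files() const { return files_; }  // handed to the caller
private:
    std::string text_;
    FileList files_;
};

// This function would sit in the caller (e.g. wordsearchunion.cpp), not in list.cpp.
int printNames(const FileList& list, std::ostream& out) {
    int printed = 0;
    for (const FileNode* cur = list.head(); cur != nullptr; cur = cur->next) {
        out << cur->name << '\n';
        ++printed;
    }
    return printed;
}
```

Returning a const reference lets the caller walk the nodes without copying the list and without being able to modify it.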
A basic layout of the data structure you would need to implement is illustrated in the figure
below:
Pointers for code changes in the implementation:
1. word.cpp must now contain a reference to a doubly linked list of files.
2. bag.cpp must now be replaced by list.cpp, which must take care of the functionality
of the doubly linked list of files.
3. itemtype.cpp may be reused as before.
4. To find the union of the 2 linked lists of files, use the pseudocode below as a
reference:
   Create a new list and add the files of the first list (the list of files of word 1) to it.
   Take each element of the second list (the list of files of word 2) and check whether it
   is already present in the new list. If it is not, add it; otherwise skip it.
   At the end, the new list will contain the union of the 2 lists.
   Print out the list of files in this new list.
5. You are advised to use the doubly linked list implementation discussed in class, and
modify it to suit the needs of the project.
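The union pseudocode in item 4 can be sketched in C++ roughly as follows. FileNode, FileList, contains(), append(), and unionInto() are hypothetical names chosen for this illustration, and the result list is assumed to be empty when the function is called.

```cpp
#include <string>

// Hypothetical node and list types for the per-word file lists.
struct FileNode {
    std::string name;
    int count;
    FileNode* prev;
    FileNode* next;
};

struct FileList {
    FileNode* head = nullptr;
    FileNode* tail = nullptr;

    FileList() = default;
    FileList(const FileList&) = delete;             // no copying in this sketch
    FileList& operator=(const FileList&) = delete;
    ~FileList() {
        while (head != nullptr) {
            FileNode* next = head->next;
            delete head;
            head = next;
        }
    }

    // Linear search by file name.
    bool contains(const std::string& name) const {
        for (const FileNode* cur = head; cur != nullptr; cur = cur->next) {
            if (cur->name == name) return true;
        }
        return false;
    }

    // Add a new node at the tail.
    void append(const std::string& name, int count) {
        FileNode* node = new FileNode{name, count, tail, nullptr};
        if (tail != nullptr) tail->next = node; else head = node;
        tail = node;
    }
};

// Fill `result` (assumed empty) with the union of `first` and `second`.
// A file present in both lists keeps the count from `first`; the union
// output only prints file names, so the count is carried along unused.
void unionInto(const FileList& first, const FileList& second, FileList& result) {
    // Step 1: copy every file of the first list.
    for (const FileNode* cur = first.head; cur != nullptr; cur = cur->next) {
        result.append(cur->name, cur->count);
    }
    // Step 2: add files of the second list that are not already present.
    for (const FileNode* cur = second.head; cur != nullptr; cur = cur->next) {
        if (!result.contains(cur->name)) {
            result.append(cur->name, cur->count);
        }
    }
}
```

Each membership check is a linear scan, so the whole union is quadratic in the list lengths. With at most 100 files that cost is negligible.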
As before, you may assume the maximum number of files to be 100 and the maximum
number of words to be 1000.
You will need 10 files: wordsearchcount.h, wordsearchcount.cpp, wordsearchunion.h,
wordsearchunion.cpp, itemtype.h, itemtype.cpp, list.h, list.cpp, word.h, word.cpp.
Instructions for compilation: Your program should compile on a CSIL machine with the
following command without any errors or warnings.
$ g++ -o wordsearchcount wordsearchcount.cpp itemtype.cpp list.cpp word.cpp
Turn-in procedure:
Upload your files to GauchoSpace under the corresponding project submission folder.