Python Tutoring: CMPUT201 Parse Instance


Use Python to parse a text file and extract the content of its non-comment lines.

Requirement

The following content is from a file named “instance10_001.txt” (which is posted in eClass under Week 1):

    #instance10_001.txt
    #area [0, MAX_X] x [0, MAX_Y]
    100 100
    #number of points NUM_PT
    10
    #coordinates
    0 0
    0 90
    70 100
    100 50
    30 30
    30 70
    70 70
    70 30
    50 50
    45 0
    #end of instance

This file describes 10 points in the two-dimensional plane, within the rectangular area [0, MAX_X] x [0, MAX_Y]. Every line starting with the symbol # is a comment. The first non-comment line contains the integer values for MAX_X and MAX_Y, which are 100 and 100 in this file (the range for each of them is [1, 1000]). The second non-comment line contains the number of points NUM_PT, which is 10 in this file (the range is [1, 1000]). The remaining non-comment lines give the integer x- and y-coordinates of the points, one point per line, in no particular order.
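The format above can be read with a short routine. The following is only a sketch under stated assumptions: the function name `parse_instance` is hypothetical, blank lines are skipped, a comment line may have leading whitespace before the `#`, and the duplicate-point check anticipates the correctness rules listed for functionality #3 in the specification.

```python
def parse_instance(text):
    """Parse instance text; return (max_x, max_y, points) or raise ValueError."""
    # Keep only non-blank lines that do not start with '#' (assumed tolerance:
    # leading whitespace before '#' still counts as a comment).
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]
    # int() raises ValueError on any non-integer token.
    values = [[int(tok) for tok in row] for row in rows]
    if len(values) < 2 or len(values[0]) != 2 or len(values[1]) != 1:
        raise ValueError("malformed header lines")
    (max_x, max_y), (num_pt,) = values[0], values[1]
    if not (1 <= max_x <= 1000 and 1 <= max_y <= 1000 and 1 <= num_pt <= 1000):
        raise ValueError("header value out of range [1, 1000]")
    coords = values[2:]
    if len(coords) != num_pt or any(len(c) != 2 for c in coords):
        raise ValueError("expected exactly NUM_PT lines of two integers")
    points = [tuple(c) for c in coords]
    if any(not (0 <= x <= max_x and 0 <= y <= max_y) for x, y in points):
        raise ValueError("point outside the rectangular area")
    if len(set(points)) != len(points):
        raise ValueError("duplicate point")
    return max_x, max_y, points


# The sample instance from this page, one record per line.
SAMPLE = """\
#instance10_001.txt
#area [0, MAX_X] x [0, MAX_Y]
100 100
#number of points NUM_PT
10
#coordinates
0 0
0 90
70 100
100 50
30 30
30 70
70 70
70 30
50 50
45 0
#end of instance
"""

max_x, max_y, points = parse_instance(SAMPLE)
print(max_x, max_y, len(points))  # prints: 100 100 10
```

Raising `ValueError` for every kind of failure keeps the caller simple: it can catch that one exception, report the reading error, and quit.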

In all three assignments, the input files follow this file format, except that the values of the variables can differ and the comment lines can be missing. Each file is called an instance. By convention, a file name starts with “instance”, followed by the number of points in the instance, then an underscore, then the index of the instance among those having the same number of points, and lastly the file suffix “.txt”. That is, “instanceXXX_YYY.txt” is the YYY-th (may or may not be left-padded with 0’s) instance having XXX points; for example, the above “instance10_001.txt” is the first instance having 10 points.
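As an illustration of the naming convention, a file name can be split into its two numbers with a regular expression. This is a sketch only: `split_instance_name` and the pattern are hypothetical, inferred from the description above. The specification says the program should always use the point count read from the file contents, so a helper like this is informational at most.

```python
import re

# Assumed pattern: "instance", digits, underscore, digits, ".txt".
_NAME = re.compile(r"instance(\d+)_(\d+)\.txt")


def split_instance_name(filename):
    """Return (num_points, index) for a conforming name, else None."""
    m = _NAME.fullmatch(filename)
    if m is None:
        return None
    # int() drops any left padding with zeros, so "001" becomes 1.
    return int(m.group(1)), int(m.group(2))


print(split_instance_name("instance10_001.txt"))  # prints: (10, 1)
```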

The following list contains the specifications for Assignment #1 (10 marks in total):

  1. Write a single program with multiple functionalities (i.e. objectives), selected through command-line options. Suppose your program name is “myprogram”. If a command for running your program is incorrect (for example, it uses invalid options), your program prints out the following and then quits (functionality #1):
    >myprogram [-i inputfile [-o outputfile]]
  2. One functionality (functionality #2) of your program is to read in the content of an instance file. To read in the file “instance10_001.txt” you will execute the command:
    >myprogram -i instance10_001.txt
    Here “-i” is the command-line option that indicates the succeeding argument is the input filename.
  3. Your program will check the correctness of the file content (functionality #3) during reading. That is, it checks that the first non-comment line contains two integer values for MAX_X and MAX_Y, that the second non-comment line contains an integer value for the number of points NUM_PT (this number also appears in the filename; your program does not need to validate this, but must always use the number read in from the file), and that there are exactly NUM_PT more lines, each containing two integer values for the x- and y-coordinates of a point. Your program also makes sure that no coordinate lies outside the specified rectangular area and that there are no duplicate points in the instance.
    If your program encounters an error, it reports “Error in reading the instance file!” and quits; otherwise, it continues to the next item.
  4. Once correctness has been checked, your program will print (functionality #4) the non-comment lines of the input file to the screen, when using the command:
    >myprogram -i instance10_001.txt
    If an output filename is specified, using either of the following commands:
    >myprogram -i instance10_001.txt -o output.txt
    >myprogram -o output.txt -i instance10_001.txt
    where “-o” is the command-line option that indicates the succeeding argument is the output filename, then, instead of being printed to the screen, all the non-comment lines of the input file are written into the file “output.txt”.
  5. If your program is not given an input file, that is, it is run with the command
    >myprogram
    then it will generate several instances (functionality #5) through a user interface as follows:

Your program will generate 7 instances in total, written into 7 separate files named “instance10_j.txt”, for j = 1, 2, …, 7, respectively. Each instance has the rectangular area [0, 100] x [0, 200] and has 10 points. The coordinates of each point are generated uniformly at random within the rectangular area, and your program makes sure there are no duplicate points within each instance. If it is impossible for your program to generate these files, it prints out “Error in generating instances!” and quits. All these files are saved in the current directory from which the command is executed, and your program prints the following to the screen:

    instance10_1.txt generated
    instance10_2.txt generated
    instance10_3.txt generated
    instance10_4.txt generated
    instance10_5.txt generated
    instance10_6.txt generated
    instance10_7.txt ... done!
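The option handling and the generation mode described above could be sketched as follows. Everything here is an assumption made for illustration, not the assignment's reference solution: the function names are hypothetical, “-o” without “-i” is treated as invalid because the usage line nests it inside “-i”, and distinct points are obtained by sampling grid cells without replacement. The closing “... done!” message is left out because its exact layout is not clear from this page.

```python
import random


def parse_options(args):
    """Return (input_file, output_file); either may be None.

    Raises ValueError for an invalid command line. The caller is expected
    to print the usage line from item 1 and quit in that case.
    """
    if len(args) % 2 != 0:
        raise ValueError("each option needs a value")
    opts = {}
    for flag, value in zip(args[0::2], args[1::2]):
        if flag not in ("-i", "-o") or flag in opts:
            raise ValueError("unknown or repeated option")
        opts[flag] = value
    if "-o" in opts and "-i" not in opts:
        raise ValueError("-o is only accepted together with -i")
    return opts.get("-i"), opts.get("-o")


def generate_points(max_x, max_y, num_pt, rng=random):
    """Draw num_pt distinct integer points uniformly from [0, max_x] x [0, max_y]."""
    width = max_x + 1
    total = width * (max_y + 1)
    if num_pt > total:
        raise ValueError("not enough distinct grid points")
    # Sampling cell indices without replacement guarantees distinct points.
    cells = rng.sample(range(total), num_pt)
    return [(c % width, c // width) for c in cells]


def format_instance(max_x, max_y, points):
    """Render the non-comment lines of an instance file."""
    lines = [f"{max_x} {max_y}", str(len(points))]
    lines += [f"{x} {y}" for x, y in points]
    return "\n".join(lines) + "\n"


def generate_instances(count=7, max_x=100, max_y=200, num_pt=10):
    """Write instance files into the current directory; return True on success."""
    for j in range(1, count + 1):
        name = f"instance{num_pt}_{j}.txt"
        try:
            points = generate_points(max_x, max_y, num_pt)
            with open(name, "w") as f:
                f.write(format_instance(max_x, max_y, points))
        except (OSError, ValueError):
            print("Error in generating instances!")
            return False
        print(f"{name} generated")
    return True
```

With this split, a driver would call `parse_options(sys.argv[1:])` and switch to `generate_instances()` when the returned input file is `None`. Passing a seeded `random.Random` instance as `rng` makes `generate_points` reproducible for testing.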