

CMPSCI 383 – Fall 2018

Homework 5 Extra Credit

Due Wednesday 12/12/2018 at 11:59pm

If too much is never enough, you can implement the following for extra credit. Each extra credit task will be worth 25% of a homework assignment.

EC Task 1: Using the training data file, implement the Apriori association rules algorithm for use with the congressional voting data. The original paper is probably the most helpful source for implementation details; note that it outlines other algorithms as well.

Your program should be executable from the command line, with a corresponding file named “Apriori.java” or “apriori.py”, and should take a filename and a minimum support parameter on the command line. For example:

%> python apriori.py /path/to/congress_train.csv 19

For this exercise, consider only positive “Yea” votes as the items that make up a frequent item set. Your program should output a series of rules, one per line, like so:

3 -> 29

19, 28 -> 16

24, 31, 40 -> 33

The numbers are the column indices in the training file (starting with 0).
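The level-wise search at the heart of Apriori can be sketched roughly as follows. This is a minimal illustration, not a reference solution. It assumes each row of the training file has already been converted into a set of the column indices where the vote was “Yea” (the file's encoding is not specified here), and that the minimum support is an absolute row count, as the value 19 in the example suggests. The names frequent_itemsets and format_rule are hypothetical. Choosing which rules to emit from each frequent item set is left out, since the assignment does not state a confidence threshold.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): count} for every item set whose
    support count is at least min_support (assumed to be absolute)."""
    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-item sets into k-item sets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one pass over the data for this level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def format_rule(antecedent, consequent):
    """Format a rule in the 'a, b -> c' style shown above."""
    return ", ".join(str(i) for i in sorted(antecedent)) + " -> " + str(consequent)
```

For instance, format_rule({28, 19}, 16) produces the line "19, 28 -> 16".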

EC Task 2: Using the training data file, implement the PC structure learning algorithm (the notes from lecture 10 include some background on slides 37-47 that we didn’t cover in class; it’s also described on slide 45 here). Your program should conduct hypothesis tests using the chi-square statistic to determine conditional independence (you are welcome to use a stats library such as scipy for these calculations).
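One possible shape for the conditional independence test is sketched below, under several assumptions: rows is a list of tuples of discrete values, the data are split into one contingency table per combination of conditioning values, and the per-table chi-square statistics and degrees of freedom are summed. The function names are hypothetical. The tail probability is computed with the standard library only so the sketch is self-contained; a call such as scipy.stats.chi2.sf(stat, df) could replace chi2_sf.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5  # closed form for df = 1
    else:
        q, a = math.exp(-half), 1.0             # closed form for df = 2
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def cond_independent(rows, x, y, cond, alpha):
    """True if independence of columns x and y given columns cond
    is NOT rejected at significance level alpha."""
    # Build one contingency table per value combination of the cond columns.
    strata = {}
    for row in rows:
        key = tuple(row[c] for c in cond)
        strata.setdefault(key, Counter())[(row[x], row[y])] += 1

    stat, df = 0.0, 0
    for table in strata.values():
        xs = {a for a, _ in table}
        ys = {b for _, b in table}
        if len(xs) < 2 or len(ys) < 2:
            continue  # a degenerate stratum carries no information
        n = sum(table.values())
        row_tot = {a: sum(table[(a, b)] for b in ys) for a in xs}
        col_tot = {b: sum(table[(a, b)] for a in xs) for b in ys}
        for a in xs:
            for b in ys:
                expected = row_tot[a] * col_tot[b] / n
                stat += (table[(a, b)] - expected) ** 2 / expected
        df += (len(xs) - 1) * (len(ys) - 1)

    if df == 0:
        return True  # no evidence against independence
    return chi2_sf(stat, df) > alpha
```

Skipping degenerate strata and treating df == 0 as "independent" are design choices of this sketch; other conventions are possible when a stratum has too few observations.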

Your program should be executable from the command line, with a corresponding file named “Pc.java” or “pc.py”, and should take a filename and an alpha parameter value on the command line. For example:

%> java Pc /path/to/congress_train.csv 0.01

You should output the “skeleton” of the system as a series of edges, one per line (you do not need to orient the edges):

19 - 42

22 - 25

22 - 28

Again, the numbers here should correspond to the column indices found in the training file (starting with 0 for the first column).
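The skeleton phase of PC can be sketched as below, as an illustration rather than the required implementation. It starts from a complete undirected graph and removes the edge between x and y whenever some conditioning set, drawn from the current neighbors of x, makes them independent. The conditioning-set size grows by one per pass. The function pc_skeleton and the callable indep(x, y, cond) are hypothetical names; indep is assumed to wrap whatever chi-square test is used, at the chosen alpha.

```python
from itertools import combinations


def pc_skeleton(n_vars, indep):
    """Return the undirected skeleton as a sorted list of (x, y) pairs, x < y.

    indep(x, y, cond) is an assumed callable that returns True when
    columns x and y are judged independent given the tuple cond.
    """
    # Start fully connected: every variable is adjacent to every other one.
    adj = {v: set(range(n_vars)) - {v} for v in range(n_vars)}
    size = 0
    # Continue while some node has enough other neighbors to form a set of this size.
    while any(len(adj[v]) - 1 >= size for v in adj):
        for x in range(n_vars):
            for y in sorted(adj[x]):  # sorted() copies, so removal below is safe
                others = sorted(adj[x] - {y})
                for cond in combinations(others, size):
                    if indep(x, y, cond):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        size += 1
    return sorted((x, y) for x in adj for y in adj[x] if x < y)


def format_edge(x, y):
    """Format an edge in the 'x - y' style shown above."""
    return f"{x} - {y}"
```

With a toy oracle for the chain 0 - 1 - 2, where 0 and 2 are independent only given 1, the sketch keeps the edges (0, 1) and (1, 2) and drops (0, 2).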

