辅导JavaData Structures CS 辅导java数据结构binary search trees

- 首页 >> Java编程

CS2230 Computer Science II: Data Structures

Homework 7

Implementing Sets with

binary search trees

30 points

Goals for this assignment

• Learn about the implementation of Sets using binary search trees, both unbalanced and

balanced

• Implement methods for a NavigableSet, including contains and remove

• Get more practice writing JUnit tests

• Get more practice with version control

Purpose

Binary search trees can be used to build efficient Sets that perform lookups and inserts in �(��� �)

time, which is fairly efficient. As we've seen in class, Sets are useful for applications where you need to

look things up by a "key" like checking airline tickets or looking up the existence of a word in a

dictionary. In this homework, you will study the implementation of Sets using binary search trees. We

have provided Java files that contain code for a Set implemented with an unbalanced binary search tree

(BSTSet.java) and a Set implemented using an AVLTree (AVLTreeSet.java), and your job is to finish them.

Submission Checklist

By April 17, 11:59pm: answers to the PROGRESS_REPORT.txt in your GitHub repository. No slip days

accepted.

By April 20, 11:59 pm: You should have changes in GitHub to the following files:

• BSTSet.java

• AVLTreeSet.java

• BSTSetTest.java

• AVLTreeTest.java (if you added optional tests)

• answers.pdf

Slip days: Your submission time is the date of the last commit we see in your GitHub repository. As

usual, we will process slip days automatically, so you do not need to tell us you are using them.

You will submit them via GitHub. Follow the directions in setup_hw7.pdf on "Setup your own private

repository to push your commits to". Before you are done submitting, you must check the following.

ü Do the tests pass?

ü Did I write the required tests?

ü Does my github.uiowa.edu repository reflect the code I intend to turn in? (You must view this in

your web browser, not in NetBeans. If still in doubt, you can also clone your completed code

into a second NetBeans project and check it by running all the tests).

Getting HW7

Follow all the directions in setup_hw7.pdf

Part 0

Examine the TreeNode class. In particular, answer the following questions for yourself (you do not need

to submit the answers, but this part will help you with the rest of the assignment).

• How do you test if this is a leaf?

• How does isBST work?

• What is the result of toString() when called on the root TreeNode of the following tree?

Part 1: Contains method

The Set.contains method returns true if and only if the element exists in the Set.

a) The BSTSetTest.java file has test cases for BSTSet. Devise three different tests for the contains

method (put them in testContainsA,B, and C) so that you are confident that contains() works.

Your tests for this part should be "black box", that is, they don't depend on the implementation:

they only call public methods of BSTSet (in this case, the constructor, add(), and contains()). Your

3 tests need to be different: that is, your add methods should be such that they cause different

underlying tree structures and there should be cases where contains() returns each true and

false.

b) Implement the BSTSet.contains method.

Part 2: A method for NavigableSet

Take a look at the NavigableSet interface. It adds a new method subSet to the methods already

provided by Set. The subSet method requires that keys stored in the NavigableSet have a total order.

6

3

2 4

Therefore, you'll see that the generic type is constrained to be a Comparable. The Comparable interface

provides the compareTo method.

public interface NavigableSet<T extends Comparable<T>> extends Set<T>

Read the Java 8 API on Comparable

https://docs.oracle.com/javase/8/docs/api/java/lang/Comparable.html to see what compareTo does.

a) BSTSet implements the NavigableSet interface, so you need to provide an implementation of

subSet.

NavigableSet<T> subSet(T fromKey, T toKey) that returns a new NavigableSet that contains

all elements �, such that ������� ≤ � < �����.

Your method must be efficient. That is, your algorithm should find fromKey quickly and once it finds

toKey (or larger key) it should not look at any other nodes. For a balanced tree, your implementation

should be faster than linear in N, where N is the size of the original Set, assuming that S << N, where S is

the size of the subset. An implementation that visits entire subtrees that are not in range [start, end) is

probably O(N). An example of an O(N) algorithm would be: 1) perform an inorder traversal of the whole

tree, inserting nodes into a List, then 2) iterate over the list to remove the elements not between start

and end.

The following tests in BSTSetTest.java are relevant to this method: testSubSet1,2,3.

b) You must write at least two more tests.

• testSubSet4: It must test a different tree structure and range than the other 3 tests.

• testSubSetOutOfBounds: Test a situation where the original Set has elements but the subset

is the empty Set.

Tips

• The subMap method in Goodrich chapter 11.3 is a similar method to subSet.

• You'll probably need one or more private helper methods and/or inner classes.

• There are three basic approaches we can think of:

o Build a whole List<T> of keys that are in-range using recursion then add the elements to

a new BSTSet.

o Same as above but skip the List<T>; add directly to the BSTSet as you find them

o Define an inner class that implements Iterator<T>. It keeps a frontier (e.g., a Stack) to do

the traversal. Advance the traversal after each call to next(). HINT: a preorder traversal

requires seeing each node twice (once before visiting left and once after) so may want

to mark nodes as "visited" or not. Call this iterator's next() in a loop, adding elements to

a new BSTSet.

Part 3: remove elements from an unbalanced BST

The Set.remove method removes the element if it exists in the Set and returns true if the element was

found.

Part 3a: deleteMin

The BSTSet.remove method will depend on the BSTSet.deleteMin method. This method takes a

TreeNode n and removes the minimum element in the subtree rooted at n.

Tips:

• The two if-statement lines of deleteMin are "pre conditions". Leave them there! They will help

with debugging

• To perform the deletion you should use the method updateParent. It will greatly simplify your

implementation because it works whether you are changing the left or right child.

We've provided tests in BSTSetTest (called testDeleteMin*) to help you ensure deleteMin is correct

before proceeding to the remove method.

Part 3b: remove

Implement the BSTSet.remove method. Recall that there are four cases for removal in a BST:

a) removed node is a leaf

b) removed node has only a left child

c) removed node has only a right child

d) removed node has two children

Case d is the tricky one. Use the following algorithm, adapted from your textbook:

1) use deleteMin to delete the smallest element in right subtree

2) use the data of the node returned by deleteMin to overwrite the data in the "removed" node.

Take the time with some examples on paper (the remove test cases in BSTSetTest.java are one good

source of examples) to convince yourself why the above algorithm works.

There are several tests called testRemove* in BSTSetTest to help you debug.

Part 4: Balanced tree

The BSTSet does not keep the tree balanced, and we know that an unbalanced tree can make

contains/add/remove operations take O(N) in the worst case instead of O(log N). To improve your Set

implementation, you will complete a second class that uses a balanced binary search tree, AVLTreeSet.

Notice that this class extends BSTSet, borrowing most of the functionality.

We've provided implementations of add and remove that call a balancing method to fix the balance

property after an insertion or removal. Your job will be to complete the balancing functionality by

implementing two methods

• AVLTreeSet.rotateLeft

• AVLTreeSet.rotateRight

See the comments above these methods to see a detailed description of what each method should do.

Notice that the AVLTreeSet.balance method uses rotateLeft and rotateRight in combination to do

the double rotation cases, so you don't have to directly implement double rotations.

Tips:

• as in the remove implementation, you should use the updateParent method to change what

node is at the top of the subtree. It will greatly simplify your code.

• Note that the rotation changes the height of the tree, so make sure to call updateHeight at

the end. (Do not call updateHeightWholeTree)

All of the tests in AVLTreeSetTest.java should pass when the rotate methods work.

Part 5: Stress test

You've put a lot of effort into these Set implementations, so try it out on bigger data than the tiny test

cases! We've provided an example file StressTest.java. It times how long add() takes for 3

implementations of Set

• BSTSet

• AVLTreeSet

• java.util.TreeSet

The program tries a uniform random distribution of inputs, as well as an increasing order of inputs.

Answer the following in a file called answers.pdf (put it in the base of your repository). Visuals such as

bar graphs are encouraged. Just make sure to explain them.

a) What observations of heights and running times can you make from running StressTest?

b) Give an explanation of the heights and running times you observed, based on what you know

about the 3 implementations. In other words, say "why" the heights and running times are what

they are. (You can read about java.util.TreeSet at

https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html )

This part will be graded on these 2 criteria:

• Did you answer the questions completely and specifically? Are your explanations accurate?

• Are your answers concise and clear?

Tips for Testing and Debugging

• We've provided a method BSTSet.bulkInsert() that is helpful for creating a tree.

• We have also provided an implementation of TreeNode.equals() so that trees can be

compared properly in JUnit assert* methods.

• In your tests, do not build your test trees manually. Use add() or bulkInsert().

• IMPORTANT: the methods checkIsBST (for BSTSet or AVLTreeSet) and checkIsBalanced (for

AVLTreeSet) are very helpful for debugging. These methods throw an exception if the invariants

of the data structure are violated. Some of the test cases call them. You should use them in your

own test cases and even within your code at places where you know the invariants should hold

(for example, at the end of the remove method).

Extra credit (up to 5 points)

Submission: put your work in the base of your GitHub repository in a file named extracredit.pdf.

You may attempt this extra credit no matter how much of the rest of the assignment you complete.

However, the effort-to-points ratio may not be as high. The assignment is more open ended and allows

for exploration and problem solving.

Read the code in StressTest.java. You'll notice that it prints the height of the trees inside of the BSTSet

and the AVLTreeSet. The height is not part of the Set/NavigableSet interfaces, yet we exposed it for our

testing and debugging purposes.

In contrast, we could not print out the height of the underlying tree in the java.util.TreeSet. The

particulars of the tree implementation are not exposed to the user. Take a look at the public methods of

https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html .

Discover interesting features of TreeSet's implementation. Here are some ideas:

• Option 1a: What algorithm is it? Look at the source code for TreeSet. Find out what

algorithm/data structure TreeSet uses. Your explanation MUST include primary evidence of your

findings (e.g., snippets of code that you refer to). We suggest starting in StressTest.java and

right-clicking on "TreeSet" and choosing an option under Navigate. In general the options under

Navigate will help you explore the source code.

• Option 1b: Break the interface! Devise a way to approximate the height of an instance of

TreeSet with elements in it without calling any private methods. For example, you might run

experiments to figure out the asymptotic run time of the public methods. You should attempt to

isolate constant factors with control experiments. Your explanation MUST include plots, which

you clearly describe and refer to.


站长地图