Data Structures and Algorithms

Tuesday, March 17, 2015

Ternary Search Trees

Ternary search trees are specialized structures for storing and retrieving strings.

At each level, just like a binary search tree, each left node is smaller

Searching for a word:

Just like searching a binary search tree, you start at the root and follow links left and right as appropriate

Thursday, October 2, 2014

Using Iterators to Overcome Array-based Problems

Using simple array-based iteration not only ties algorithms to using arrays, but also requires that the
logic for determining which elements stay, which go, and in which order to process them, is known in
advance.

An iterator (also known as an enumerator) solves these problems by providing a generic interface for
looping over a set of data so that the underlying data structure or storage mechanism—such as an array,
database, and so on—is hidden

Whereas simple iteration generally requires you to write specific code
to handle where the data is sourced from or even what kind of ordering or preprocessing is required, an
iterator enables you to write simpler, more generic algorithms

Iterator Operations

public interface Iterator {
public void first();

public void last();

public boolean isDone();

public void next();

public void previous();

public Object current() throws IteratorOutOfBoundsException;
}

Not all data structures allow traversing the data in both directions, nor does it always make sense.

For this reason, it is acceptable for any of the traversal methods—first(), last(), next(), and previous()—to throw an UnsupportedOperationException to indicate this missing or unimplemented behavior

public class IteratorOutOfBoundsException extends RuntimeException {

}

The Iterable Interface

Iterable is an interface that provides a generic way to obtain an iterator from any data structure that supports it.

public interface Iterable {

public Iterator iterator();

}

Using an Iterator

Once an iterator has been obtained—

either by explicit construction or as an argument to a method—position to the beginning or end as

appropriate

Iterator iterator = ...;
iterator.first();
while (!iterator.isDone()) {
Object object = iterator.current();
...
iterator.next();
}

Iterator iterator = ...;

for (iterator.first();!iterator.isDone(); iterator.next()) {

Object object = iterator.current();

...

}

Standard Iterators

Array Iterator

The most obvious iterator implementation is one that wraps an array.

By encapsulating an array within an iterator, you can start writing applications that operate on arrays now and still extend easily to other data structures in the future.

Reverse Iterator

With a reverse iterator, however, the same behavior can be achieved without re-sorting and without duplicated code.

When the application calls first(), the reverse iterator actually calls last() on the underlying iterator. When the application calls next(), the underlying iterator’s previous() method is invoked, and so on. In this way, the behavior of the iterator can be reversed without changing the client
code that displays the results, and without re-sorting the array

Filtering Iterator

One of the more interesting and useful advantages of using iterators is the capability to wrap or decorate
(see the Decorator pattern [Gamma, 1995]) another iterator to filter the return values

The filter iterator works by wrapping another iterator and only returning values that satisfy some condition,
known as a predicate

Each time the underlying iterator is called, the returned value is passed to the
predicate to determine whether it should be kept or discarded

public interface Predicate {
public boolean evaluate(Object object);
}

evaluate(), that is called for each value, and returning a Boolean to indicate whether the value meets the selection criteria or not

Introduction

Deterministic Algorithms:

the result of any algorithm can be determined exactly based on the inputs

Heuristic

problem is so difficult that finding a precise solution can be too costly in terms of time or resources
this case, a heuristic may be a more practical approach. Rather than try to find a perfect solution, a heuristic uses certain well-known characteristics of a problem to produce an approximate solution

Heuristics are often used to sift through data, removing or ignoring values that are irrelevant so that the
more computationally expensive parts of an algorithm can operate on a smaller set of data.

the major drawback with using a heuristic is that it is usually not possible to determine how it
will perform all of the time

This leads to a level of uncertainty in the overall
algorithm that may or may not be acceptable depending on the application.

Understanding Complexity in Relation to Algorithms

performance (the amount of CPU/memory/disk usage)
complexity (how well the algorithm scales)

It’s more important, therefore, to ascertain how a given algorithm behaves as the size of the problem
increases

greater interest is how the number of operations performed varies with the size of the problem

Algorithm Complexity

if the problem size were to increase by an order of magnitude, how does that affect
the number of operations required to perform a simple function? Does the number remain the same?
Does it double? Does it increase linearly with the size of the problem? Does it increase exponentially?

By measuring the complexity of an algorithm,
we hope to predict how it will perform: Complexity affects performance, but not vice versa

it is important to remember that complexity doesn’t provide a precise
measure of expected performance, but rather places certain bounds or limits on the achievable performance.

Understanding Big-O Notation

O(1): Pronounced “order 1” and denoting a function that runs in constant time
O(N): Pronounced “order N” and denoting a function that runs in linear time
O(N2): Pronounced “order N squared” and denoting a function that runs in quadratic time
O(log N): Pronounced “order log N” and denoting a function that runs in logarithmic time
O(N log N): Pronounced “order N log N” and denoting a function that runs in time proportional
to the size of the problem and the logarithmic time
O(N!): Pronounced “order N factorial” and denoting a function that runs in factorial time

Constant Time: O(1)

O(1) actually means that an algorithm takes constant time to run; in other words, performance isn’t affected by the size of the problem.

Example: performance is addressing main memory in a computer (array lookup)

It still doesn’t guarantee that the algorithm will be very fast.

Linear Time: O(N)

An algorithm runs in O(N) if the number of operations required to perform a function is directly proportional to the number of items being processed.

Even if you doubled or even tripled the number of registers in operation at any one time, the
processing time is still officially O(N).

Quadratic Time: O(N2)

The number of handshakes required for a group of size N turns out to be (N2 – N) / 2.

big-O notation disregards any constants—in this case, the 2—we’re left with N2 – N.

as N becomes larger, subtracting N from N2 will have less and less of an overall effect, so we can safely ignore the subtraction, leaving us with a complexity of O(N2)

Any algorithm with a complexity of O(N2) is pretty much guaranteed to be useless for solving all but the smallest of problems

Logarithmic Time: O(log N) and O(N log N)

This means that even when the size of the input data set increases by a factor of a million,
the run time will only increase by some factor of log(1,000,000) = 20

Achieving logarithmic running times usually requires your algorithm to somehow discard large portions of the input data set

Eg. Binary search, BST

O(N log N) is still better than O(N2) but not quite as good as O(N).

Factorial Time: O(N!)

Some algorithms can perform even more poorly than O(N2)— compare the O(N2) and O(N!)

Example: Calculating factorial of a number

TDD Structure

Monday, September 29, 2014

Comparing basic sorting algorithms

Basic Sorting

Sorting Fundamentals

All of the sorting algorithms are built upon two fundamental operations:

1) Comparing items to determine whether they are our of order
2) Moving items into sorted position

The advantages and disadvantages of each sorting algorithm are based on how many times these fundamental operations need to be performed and how expensive these operations are in performance terms.

Comparators

It is important to support different orderings without having to write a whole new algorithm.

A comparator is responsible for imposing a specific ordering on objects.

Separation of concerns Design Principle

Separate the concern of how to compare two individual objects (the comparator) from the concern of how to efficiently sort a large list of objects (the algorithm)

The Comparator Interface

public interface Comparator
{
public int compare(Object left, Object right);
}

The compare method throws a ClassCastException if the type of either of the object prevents them from being compared.

The Comparable Interface

Many built-in java objects implement this interface.

public interface Comparable
{
public int compareTo(Object other);
}

The Comparator compares two objects with each other, whereas a Comparable object compares another object with itself.

Natural Comparator

A natural comparator is a comparator that supports a natural ordering of objects.
Eg. A comes before B, B comes before C and so on.

It is designed to compare two objects that implement the Comparable interface.

Java has the concept of Comparable - an interface that can be implemented by any class you are using to provide a natural sort order.

Example

public final class NaturalComparator implements Comparator
{
public static final NaturalComparator INSTANCE = new NaturalComparator();

private NaturalComparator() { ... }

public int compare(Object left, Object right)
{
assert left != null : "left cannot be null";
return ((Comparable) left).compareTo(right));
}
}

Because the NaturalComparator has no state, you only need one instance of it.
Make sure the class is marked as final to prevent it from being extended erroneously.

public void testLessThan() {
assertTrue(NaturalComparator.INSTANCE.compare(“A”, “B”) < 0);
}

public void testGreaterThan() {
assertTrue(NaturalComparator.INSTANCE.compare(“B”, “A”) > 0);

}

public void testEqualTo() {
assertTrue(NaturalComparator.INSTANCE.compare(“A”, “A”) == 0);

}

Reverse Comparator

public final class ReverseComparator implements Comparator
{
public static final NaturalComparator INSTANCE = new NaturalComparator();

private NaturalComparator() { ... }

public int compare(Object left, Object right)
{
assert left != null : "left cannot be null";
return ((Comparable) right).compareTo(left));
}
}

A better approach is to create a generic comparator that wraps (or “decorates”) another comparator and reverse the result.

public class ReverseComparator implements Comparator {
private final Comparator _comparator;

public ReverseComparator(Comparator comparator) {
assert comparator != null : “comparator can’t be null”;
_comparator = comparator;
}

public int compare(Object left, Object right) {
//Delegate to _comparator
return _comparator.compare(right, left);
}
}

Testing Reverse Comparator

public void testLessThanBecomesGreaterThan()
{
ReverseComparator comparator =
new ReverseComparator(NaturalComparator.INSTANCE);
assertTrue(comparator.compare(“A”, “B”) > 0);
}

public void testGreaterThanBecomesLessThan() {

ReverseComparator comparator =

new ReverseComparator(NaturalComparator.INSTANCE);

assertTrue(comparator.compare(“B”, “A”) < 0);

}

public void testEqualsRemainsUnchanged() {

ReverseComparator comparator =

new ReverseComparator(NaturalComparator.INSTANCE);

assertTrue(comparator.compare(“A”, “A”) == 0);

}

Bubble Sort

With each pass resulting in one item being moved into its final sorted position.

It is a stable algorithm (i.e. the algorithm maintains the relative order of items with a common sort key)

Number of comparisons is independent of the state of the input data (i.e. arrangement of the input)

public interface ListSorter

{

public List sort(List list);

}

public class BubblesortListSorter implements ListSorter

{

private final Comparator _comparator;

public BubblesortListSorter(Comparator comparator) {

assert comparator != null : "comparator cannot be null";

_comparator = comparator;

}

public List sort(List list) {

assert list != null : "list cannot be null";

int size = list.size();

for (int pass = 1; pass < size; ++pass) {

for (int left = 0; left < (size - pass); ++left) {

int right = left + 1;

if (_comparator.compare(list.get(left), list.get(right)) > 0) {

swap(list, left, right);

}

return list;

}

private void swap(List list, int left, int right) {

Object temp = list.get(left);

list.set(left, list.get(right));

list.set(right, temp);

}

Test Case

public abstract class AbstractListSorterTestCase extends TestCase {

private List _unsortedList;

private List _sortedList;

protected void setUp() throws Exception {

_unsortedList = new LinkedList();

_unsortedList.add("test");

_unsortedList.add("driven");

_unsortedList.add("development");

_unsortedList.add("is");

_unsortedList.add("one");

_unsortedList.add("small");

_unsortedList.add("step");

_unsortedList.add("for");

_unsortedList.add("a");

_unsortedList.add("programmer");

_unsortedList.add("but");

_unsortedList.add("it's");

_unsortedList.add("one");

_unsortedList.add("giant");

_unsortedList.add("leap");

_unsortedList.add("for");

_unsortedList.add("programming");

_sortedList = new LinkedList();

_sortedList.add("a");

_sortedList.add("but");

_sortedList.add("development");

_sortedList.add("driven");

_sortedList.add("for");

_sortedList.add("giant");

_sortedList.add("is");

_sortedList.add("it's");

_sortedList.add("leap");

_sortedList.add("one");

_sortedList.add("programmer");

_sortedList.add("programming");

_sortedList.add("small");

_sortedList.add("step");

_sortedList.add("test");

}

protected void tearDown() throws Exception {

_sortedList = null;

_unsortedList = null;

}

protected abstract ListSorter createListSorter(Comparator comparator);

public void testListSorterCanSortSampleList()

{

ListSorter sorter = createListSorter(NaturalComparator.INSTANCE);

List result = sorter.sort(_unsortedList);

assertEquals(result.size(), _sortedList.size());

Iterator actual = result.iterator();

actual.first();

Iterator expected = _sortedList.iterator();

expected.first();

while (!expected.isDone()) {

assertEquals(expected.current(), actual.current());

expected.next();

actual.next();

}

public class BubblesortListSorterTest extends AbstractListSorterTest {

protected ListSorter createListSorter(Comparator comparator) {

return new BubblesortListSorter(comparator);

}

Selection Sort

Each element is moved directly to its final sorted position rather than taking small steps toward its final position.

It could be made a stable algorithm.

Number of comparisons is independent of the state of the input data (i.e. arrangement of the input)

public class SelectionSortListSorter implements ListSorter {

private final Comparator _comparator;

public SelectionSortListSorter(Comparator comparator) {

assert comparator != null : "comparator cannot be null";

_comparator = comparator;

}

public List sort(List list) {

assert list != null : "list cannot be null";

int size = list.size();

for (int slot = 0; slot < size - 1; ++slot) {

int smallest = slot;

for (int check = slot + 1; check < size; ++check) {

if (_comparator.compare(list.get(check), list.get(smallest)) < 0) {

smallest = check;

}

swap(list, smallest, slot);

}

return list;

}

private void swap(List list, int left, int right) {

if (left == right) {

return;

}

Object temp = list.get(left);

list.set(left, list.get(right));

list.set(right, temp);

}

Test Case

public class SelectionSortListSorterTest extends AbstractListSorterTestCase {

protected ListSorter createListSorter(Comparator comparator) {

return new SelectionSortListSorter(comparator);

}

Insertion Sort

An insertion sort works by dividing the data into two groups: already sorted items and unsorted items.

One by one, an item is taken from the unsorted group and inserted at the appropriate position in the growing group of sorted items

This algorithm does not sort the objects in place by rearranging the order of the list it is given; rather, this algorithm creates a new, empty list and inserts each item from the original list into the result list in sorted order.

It is a stable algorithm (i.e. the algorithm maintains the relative order of items with a common sort key)

Number of comparisons is highly sensitive to the state of the input data.

At worst, it requires as many comparisons are the other algorithms.

At best, it requires fewer comparisons than the number of items in the input data.

public class InsertionSortListSorter implements ListSorter {

private final Comparator _comparator;

public InsertionSortListSorter(Comparator comparator) {

assert comparator != null : "comparator cannot be null";

_comparator = comparator;

}

public List sort(List list) {

assert list != null : "list cannot be null";

final List result = new LinkedList();

sort(list, result);

return result;

}

void sort(List list, final List result) {

assert list != null : "list can't be null";

assert result != null : "result can't be null";

Iterator it = list.iterator();

for (it.first(); !it.isDone(); it.next()) {

int slot = result.size();

while (slot > 0) {

if (_comparator.compare(it.current(), result.get(slot - 1)) >= 0) {

break;

}

--slot;

}

result.insert(slot, it.current());

}

Test Case

public class InsertionSortListSorterTest extends AbstractListSorterTestCase {

protected ListSorter createListSorter(Comparator comparator) {

return new InsertionSortListSorter(comparator);

}

Data Structures and Algorithms

Tuesday, March 17, 2015

Ternary Search Trees

Thursday, October 2, 2014

Iteration/Enumerator

Introduction

Monday, September 29, 2014

Comparing basic sorting algorithms

Basic Sorting

Friday, August 22, 2014

Links