# The Probability/Statistics Object Library

## Introduction

Kyle Siegrist is a Professor of Mathematical Sciences at the University of Alabama in Huntsville.

My main purpose in this article is to describe the Probability/Statistics Object Library, abbreviated in this paper as PSOL. This project is supported by a grant from the Course and Curriculum Development Program of the National Science Foundation (award number DUE-0089377). The library consists of complete applets and applet components for use in probability and statistics instructional materials. All objects in the library (both executable files and source files) are freely available for use, modification, and redistribution under a Creative Commons license.

I place great emphasis on the reusability of the objects in the library. In particular, the applets can be used by teachers in a variety of different courses at different mathematical levels. This versatility comes at a cost, however, since a certain amount of work is required on your part to adapt the materials for your courses.

My central premise is that general-purpose, reusable applets and objects can be very useful to teachers and students. Thus, my second goal is to make the case for this premise and encourage the development of object libraries in other areas of mathematics.

## Applets

The applets in our library are intended to illustrate concepts and techniques in probability and statistics in an interactive, dynamic way. You can download an applet, "drop" it into a web page (details in the section Installing the Dice Experiment Applet), and then add other elements of your choice, such as expository text, data sets, and graphics. The PSOL currently contains approximately 60 applets that fall into two basic categories:

• Many of these applets are simulations of random processes, with the data displayed in custom tables and graphs. Typically, the student can vary the parameters of the process and choose among basic probability distributions that drive the simulation. The main goal of an applet of this type is to show the agreement between simulated behavior of the random process and predictions from the mathematical theory.
• In some PSOL applets, the student generates data by making choices in a game or by clicking on a number line or scatter plot. The main goal of an applet of this type is usually to increase the student's understanding of some statistical definition or concept.

In the next three sections we will consider a particular representative applet, the Dice Experiment, in some detail. My purpose is not to discuss the underlying mathematics, but rather the issues of reuse and adaptability.

## The Dice Experiment

Figure 1 shows a screen shot of the Dice Experiment, a typical applet in the library.

• If your browser supports Java (version 1.4 or later), you can click on Figure 1 to open a new web page with the live applet.
• If you need to install the Java plug-in for your browser, visit Java.com.

### Figure 1. A typical view of the Dice Experiment

The Dice Experiment applet is a virtual random experiment that rolls dice, collects data, and displays the data in tables and a graph. Specifically, the basic experiment is to roll n dice and record the values of the following random variables:

• Y: the sum of the n dice scores
• M: the average of the n dice scores
• U: the minimum of the n dice scores
• V: the maximum of the n dice scores
• Z: the number of aces (1's) among the n dice scores
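
As a rough illustration of these five definitions, the following Java sketch computes them from an array of dice scores. It is not the PSOL source code; the class name `DiceRollSketch` and its method names are assumptions made for this example, and the dice here are fair.

```java
import java.util.Random;

// Illustrative sketch only; the names are hypothetical and are not taken from the PSOL.
final class DiceRollSketch {
    private final int[] scores;

    DiceRollSketch(int[] scores) {
        if (scores.length == 0) throw new IllegalArgumentException("need at least one die");
        this.scores = scores.clone();
    }

    // Roll n fair dice with the supplied random number generator.
    static DiceRollSketch rollFair(int n, Random rng) {
        int[] s = new int[n];
        for (int i = 0; i < n; i++) s[i] = 1 + rng.nextInt(6);
        return new DiceRollSketch(s);
    }

    // Y: the sum of the scores
    int sum() { int y = 0; for (int s : scores) y += s; return y; }

    // M: the average of the scores
    double average() { return (double) sum() / scores.length; }

    // U: the minimum score
    int min() { int u = scores[0]; for (int s : scores) u = Math.min(u, s); return u; }

    // V: the maximum score
    int max() { int v = scores[0]; for (int s : scores) v = Math.max(v, s); return v; }

    // Z: the number of aces (1's)
    int aces() { int z = 0; for (int s : scores) if (s == 1) z++; return z; }
}
```

For the scores 1, 4, 6, 1, 3, for instance, the sketch gives Y = 15, M = 3, U = 1, V = 6, and Z = 2.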

The simulation is controlled by the buttons and selection boxes in the main toolbar (Figure 2) at the top.

### Figure 2. The main toolbar

• The first button runs the experiment one time -- the dice are rolled one at a time, with audible feedback, and then the tables and graphs are updated.
• The second button runs the experiment repeatedly. In this mode,
    • the dice are rolled all at once on each run of the experiment,
    • the tables and graphs are updated periodically, according to the number in the Update setting, and
    • the simulation stops after the number of runs specified in the Stop setting.
• The third button in the main toolbar can be used to stop the simulation at any time -- the data in the tables and graphs are preserved.
• The fourth button in the main toolbar clears the data in the tables and graphs and restores the applet to its initial state.

The table on the left records the values of the five random variables on each update. The table on the right gives the probability mass function, mean, and standard deviation of a selected random variable in the distribution column, and the relative frequency function, empirical mean, and empirical standard deviation in the data column. The graph on the right gives exactly the same information as the table on the right, but in graphical form rather than numerical form. Information about the theoretical distribution is displayed in blue, and information about the empirical data is displayed in red. The choice of the random variable to display in the table and graph is made with the drop-down box in the second toolbar (Figure 3).
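
The quantities in the two columns can be sketched in a few lines of Java. This is only an illustration of the standard definitions, not the library's implementation; the class and method names are hypothetical, and the use of the divisor n - 1 in the empirical standard deviation is an assumption, since the convention used by the applet is not stated here.

```java
// Illustrative sketch only; names are hypothetical and not taken from the PSOL.
final class SummarySketch {
    // Distribution column: mean of a distribution with the given values and probabilities.
    static double distributionMean(int[] values, double[] pmf) {
        double m = 0;
        for (int i = 0; i < values.length; i++) m += values[i] * pmf[i];
        return m;
    }

    // Distribution column: standard deviation of the same distribution.
    static double distributionSd(int[] values, double[] pmf) {
        double m = distributionMean(values, pmf), v = 0;
        for (int i = 0; i < values.length; i++) v += (values[i] - m) * (values[i] - m) * pmf[i];
        return Math.sqrt(v);
    }

    // Data column: proportion of observations equal to the given value.
    static double relativeFrequency(int[] data, int value) {
        int count = 0;
        for (int d : data) if (d == value) count++;
        return (double) count / data.length;
    }

    // Data column: empirical mean of the observations.
    static double empiricalMean(int[] data) {
        double sum = 0;
        for (int d : data) sum += d;
        return sum / data.length;
    }

    // Data column: empirical standard deviation (divisor n - 1 is an assumption; needs n >= 2).
    static double empiricalSd(int[] data) {
        double m = empiricalMean(data), ss = 0;
        for (int d : data) ss += (d - m) * (d - m);
        return Math.sqrt(ss / (data.length - 1));
    }
}
```

As the number of runs grows, the data-column values computed this way should settle near the distribution-column values, which is the agreement the applet is designed to show.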

### Figure 3. Selection of random variable

The number of dice can be varied from 1 to 30 with the scroll bar. Clicking on the die icon brings up a dialog box (Figure 4) for specifying the probabilities that govern each die:

### Figure 4. The die probabilities dialog box

The buttons along the top of this dialog box specify six "pre-packaged" die distributions. The first gives uniform probabilities (corresponding to a fair die), and the others give various non-uniform distributions for a crooked die. Additionally, the student can specify the probabilities directly in the text boxes.
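
One common way to simulate a die with arbitrary face probabilities is to invert the cumulative distribution, as in the sketch below. Whether the library's die works this way is not stated in this article, so treat the technique, the class name `LoadedDieSketch`, and the validation tolerance as assumptions.

```java
import java.util.Random;

// Illustrative sketch only; not the PSOL die. Names and tolerance are hypothetical.
final class LoadedDieSketch {
    private final double[] cumulative = new double[6];

    // probs[i] is the probability of face i + 1.
    LoadedDieSketch(double[] probs) {
        if (probs.length != 6) throw new IllegalArgumentException("a die has six faces");
        double total = 0;
        for (int i = 0; i < 6; i++) {
            if (probs[i] < 0) throw new IllegalArgumentException("probabilities must be non-negative");
            total += probs[i];
            cumulative[i] = total;
        }
        if (Math.abs(total - 1.0) > 1e-9) {
            throw new IllegalArgumentException("probabilities must sum to 1");
        }
    }

    // Map a uniform number u in [0, 1) to a face by inverting the cumulative distribution.
    int faceFor(double u) {
        for (int i = 0; i < 6; i++) {
            if (u < cumulative[i]) return i + 1;
        }
        return 6; // reached only if rounding leaves the last cumulative value slightly below 1
    }

    int roll(Random rng) {
        return faceFor(rng.nextDouble());
    }
}
```

With probabilities 0.5, 0, 0, 0, 0, 0.5, for example, every roll is either a 1 or a 6, and probabilities that do not sum to 1 are rejected when the die is constructed.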

Information about each component in the applet is given in a tool tip that appears when the student rests the cursor on the component. Basic information about the applet is also given in a help box that pops up when the student clicks on the information button in the main toolbar.

In the next section I will cover the nuts and bolts of how you could include this applet in your own Web-based course materials. Then I will return to the more interesting discussion of pedagogical issues.

## Installing the Dice Experiment Applet

The procedure for installing an applet in a Web page is relatively simple -- I illustrate with the Dice Experiment:

1. At the library web page for the Dice Experiment applet, click on the link to download and save DiceExperiment.jar, the Java Archive (JAR) file for the applet. This file contains all of the Java class and resource files (such as image files and sound files) needed for the applet.
2. Insert the following in the HTML (Web) page at the point where the applet is to appear:

<applet code="edu.uah.math.experiments.DiceExperiment.class" archive="DiceExperiment.jar" width="500" height="400"></applet>

The snippet of code in the second step is raw HTML and is based on the assumption that the HTML page and the JAR file are in the same folder. If they are in different folders, only the archive property needs to be changed to give the appropriate address of the JAR file. A Web-authoring tool such as FrontPage or Dreamweaver will have a simple point-and-click method for inserting an applet, sparing you from writing any HTML code.

Since the applets in the PSOL are freely available for use and redistribution, you can publish your materials on your school's web server or a local intranet, or you can install the materials on individual PCs. This freedom of distribution is a significant factor if your school has unreliable or incomplete Internet connections.

## Discussion of the Dice Experiment Applet

The Dice Experiment applet (like the other applets in the library) contains no explicit mathematical exposition and thus, in principle, can be used by teachers and students at various levels. The applets in the library are intended to be small "micro-worlds" in which students can run virtual versions of random experiments and play virtual versions of statistical games.

With appropriate exposition, you could use the Dice Experiment applet as part of a discussion of any of the following topics:

• Random experiments and random variables. Rolling dice is a simple and conceptually clear example of a random experiment, and it is one that every student has actually performed. Random variables such as the sum of the dice scores and the largest of the dice scores are also easy to understand, and they are variables that most students have computed playing real dice games.
• A random sample from a distribution. The dice are all governed by the same underlying probability distribution, so rolling n dice generates a random sample of size n from this distribution. The distribution that governs the dice can be specified arbitrarily, so the die in this experiment is really just a simple metaphor for a random measurement that is repeated n times independently.
• The sample mean and the law of large numbers. The law of large numbers shows up in several places: the convergence of the empirical mean and standard deviation to the distribution mean and standard deviation, respectively, and the convergence of the relative frequencies to the corresponding probabilities.
• The central limit theorem. No matter what probability distribution is given to the individual dice (as long as it is not a point mass at a single value), the distribution of the sum and the distribution of the average become more "normal" as the number of dice increases.
• Order statistics. The minimum and maximum scores are the extreme order statistics for the random sample. Their distributions converge, respectively, to point mass at the smallest and largest scores with positive probability.
• Bernoulli trials and the binomial distribution. In terms of rolling an ace or not, the dice form a sequence of Bernoulli trials, and random variable Z has a binomial distribution.
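
To make the sum Y concrete for a class, the exact distribution of the sum of n dice can be computed by repeated convolution of the single-die distribution. The sketch below is one way to do this; it is not taken from the library, and the class and method names are hypothetical.

```java
// Illustrative sketch only; names are hypothetical and not taken from the PSOL.
final class SumDistributionSketch {
    // die[i] is the probability of face i + 1 (length 6).
    // Returns an array pmf of length 6n + 1 with pmf[k] = P(sum of n dice = k).
    static double[] pmfOfSum(double[] die, int n) {
        double[] pmf = {1.0}; // the sum of zero dice is 0 with probability 1
        for (int roll = 0; roll < n; roll++) {
            double[] next = new double[pmf.length + 6];
            for (int s = 0; s < pmf.length; s++) {
                for (int face = 1; face <= 6; face++) {
                    // Condition on the previous partial sum s and the new face.
                    next[s + face] += pmf[s] * die[face - 1];
                }
            }
            pmf = next;
        }
        return pmf;
    }
}
```

Printing or plotting the returned array for n = 1, 2, 5, and 20 with a non-uniform die is a quick way to watch the bell shape described in the central limit theorem item emerge.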

The important point -- and the basic assumption of the PSOL -- is that instructors must provide appropriate expositions of the topics that are suitable for their classes. The Dice Experiment applet and the other applets in the library are of little value without such guidance from instructors. In the language of reusability (see the Reusable Learning project), the applets are adaptable, but not adoptable.

For our discussion, it might be useful to use the term module to refer to a collection of elements (typically including mathlets, exposition, and exercises) that is focused on a relatively small mathematical topic and is pedagogically complete. Most interactive materials on the web, including those in this journal, the MathForum collection, and other portal sites, are modules or even larger learning environments. In many cases, the elements of a module are tightly coupled and cannot be used independently -- such modules were never intended to be broken apart and adapted to other settings. In short, they are adoptable but not adaptable. Clearly, a well-designed module has two main advantages:

• Very little work is required by the instructor to use the module.
• The module can be sharply focused on particular learning objectives.

The main disadvantage of modules is that instructors must find ones that precisely fit their needs in terms of content, level, and learning objectives. Moreover, combining modules from different sources is likely to result in a confusion of conflicting styles, notation, and user interfaces.

One of my goals for the PSOL is to provide general-purpose applets in probability and statistics that can be used as elements of high quality modules. I believe that both types of resources are useful: complete modules and libraries of reusable elements. However, there are not as many collections of the second type. Moreover, developers must understand the inevitable tradeoffs between the two approaches. The Reusable Learning project has extensive information about reusable learning resources, including guidelines for developers.

## Reusable Components

The applets in the PSOL are constructed from programmatic components that are themselves intended to be reusable. Play with the Dice Experiment applet again (or look at the picture above) and note the basic elements. The visible components of this applet are

• The basic shell with the main toolbar
• The dice
• The table that displays the values of the random variables
• The table that displays the true and empirical distributions of the selected random variable
• The graph that displays the true and empirical distributions of the selected random variable
• The scroll bar for changing the number of dice
• The dialog box for specifying the probabilities that govern the dice

In addition, the applet includes a number of components that are not visible, but rather correspond to mathematical elements:

• The probability distribution that governs the dice
• The probability distribution of each of the random variables
• A general data structure for collecting and processing empirical data from the random variables

All of these components, and many more, are available in the library. Each component is available in two forms:

• A Java Archive (JAR) file that includes all Java class and resource files for the object, in most cases packaged as a Java "bean". The JAR file is all that is needed to use the object as is.
• A compressed ZIP file that includes all source and resource files for the object.

You can import a Java bean into a "builder tool" to expose the properties and methods of the object. A builder tool allows you to include objects in another project in a point-and-click fashion, with relatively little coding. The free integrated development environment (IDE) from NetBeans.org is an example of such a builder tool. The source code, on the other hand, allows you to modify the object or study its programming.

Thus, if you have programming experience, you could modify an applet from the library, or you could construct a custom applet using components from the library. In both cases, you could do this in a fraction of the time needed to build the applet from scratch. For example, modifying the Dice Experiment applet to explore other random variables (such as the range of the dice scores or the number of two's) would be relatively simple -- only a few objects would need slight modification. Assembling a custom applet from components would require significantly more work. However, the objects in the library spare you much of the tedious, low-level programming that has little pedagogical value. For example, programming a die so that the spots are in the proper location and scale with the size of the die is clearly not a good use of an instructor's time. The PSOL provides a virtual die that has all of the functionality needed for educational projects.

I do not want to suggest, however, that programming a Java applet is easy, even with the help of the components in the PSOL. All programming is difficult, and learning any programming language, Java included, requires a significant investment in time. On the other hand, I will argue in the section Object-Oriented Programming & Abstract Mathematics that higher-level programming can be a valuable educational experience. First, however, I will guide you through the construction of a very simple applet, the ubiquitous Hello World! example.

## Hello World!

Our goal in this section is the construction of a very simple applet from components in the library. To work through this example yourself, you will need the Java Software Development Kit (available at the Sun Java site), a text editor (such as Windows Notepad), and your Java-enabled browser.

The Hello World! applet requires two objects from the library: the Experiment object, which provides a basic shell for a random experiment, and the Coin object. We will need to modify the objects, so we will need the source files. First create a new empty folder. Click on the two links to go to the appropriate pages of the library, and download the source ZIP files into your folder.

Next extract the files in these two ZIP archives into your folder, but do not preserve the library folder names when you do the extraction. This will simplify the naming of our objects and consequently the folder structure that we must use. At this point, you should have the following essential files in your folder (as well as some other files that we will not need):

• Experiment.java
• Coin.java
• step.gif
• run.gif
• stop.gif
• reset.gif

The first two are the Java source files for our Experiment and Coin objects, and the next four are tiny GIF files for the buttons on the main toolbar.

Now open the file Experiment.java with your text editor, and remove the first programming line:

  package edu.uah.math.experiments;

Again, this step merely simplifies the names of the objects and allows us to put all of our files in one folder. Similarly, open the file Coin.java and remove the first programming line:

  package edu.uah.math.devices;


Next, create a new file called HelloWorld.java, type (or copy and paste) the following lines, and save the file to your folder:

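  public class HelloWorld extends Experiment {
    private Coin coin = new Coin();

    // Set up the applet: call the parent shell first, then label the coin
    public void init() {
      super.init();
      coin.setHeadLabel("Hello"); // inferred from the text below; absent from this listing as extracted
      coin.setTailLabel("World");
      // The text below mentions a final line here that adds the coin
      // to the applet; that line is missing from this listing.
    }

    // One run of the random experiment: toss the coin
    public void doExperiment() {
      super.doExperiment();
      coin.toss();
    }

    // Refresh the display: show the label of the tossed coin
    public void update() {
      super.update();
      coin.setTossed(true);
    }

    // Return to the initial state: hide the coin label
    public void reset() {
      super.reset();
      coin.setTossed(false);
    }
  }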


Let's try to understand what we have just done.

The first line creates a new applet object, called HelloWorld, which is a subclass of the Experiment object. Thus, HelloWorld will inherit all of the methods of Experiment.

The second line creates a new Coin object called, appropriately enough, coin.

The next group of lines is in a method called init that initializes the applet. The first line of this method calls the corresponding method of the parent Experiment object, while the next two lines change the default labels on the coin from H and T to Hello and World, respectively. The final line of the method adds the coin to the applet.

The next group of lines is in a method called doExperiment that defines our random experiment. The first line calls the corresponding method in the Experiment object, and the second line tosses our coin.

The next group of lines is in a method called update that defines how the information in the applet will be displayed. The first line calls the corresponding method in the Experiment object, and the second line sets the "tossed" state of the coin to true (so that the coin label will be displayed).

Finally, the last group of lines is in a method called reset, which specifies the actions that occur when the user presses the reset button. Again, the first line calls the corresponding method in the Experiment object, while the second line sets the "tossed" state of the coin to false (so that the coin label will not be displayed).

Note that much of the structure and functionality of HelloWorld are inherited from the parent Experiment object by invoking the super- methods. This structure and functionality include the basic shell with the toolbar and buttons and the default actions performed when the user clicks on the buttons. All that we had to do was add the special functions that are appropriate for our applet.
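
The inheritance pattern can be tried out without a browser or the library by using small stand-in classes. In the sketch below, `ExperimentSketch` and `HelloWorldSketch` are hypothetical, headless substitutes written for this illustration; they do not reproduce the library's Experiment or Coin classes, and the `run` method is only a guess at what a run button might do with the Update and Stop settings.

```java
import java.util.Random;

// Hypothetical, headless stand-in for a parent shell; not the PSOL Experiment class.
class ExperimentSketch {
    protected int runs = 0;
    protected int updates = 0;

    public void doExperiment() { runs++; }
    public void update() { updates++; }
    public void reset() { runs = 0; updates = 0; }

    // Repeat the experiment, refreshing the display every updateEvery runs.
    public final void run(int stop, int updateEvery) {
        for (int i = 1; i <= stop; i++) {
            doExperiment();
            if (i % updateEvery == 0) update();
        }
    }
}

// The subclass adds only what is special about this experiment.
class HelloWorldSketch extends ExperimentSketch {
    private final Random rng;
    private boolean heads;
    String display = "";

    HelloWorldSketch(long seed) { rng = new Random(seed); }

    @Override public void doExperiment() {
        super.doExperiment();      // the parent keeps the run count
        heads = rng.nextBoolean(); // toss the coin
    }

    @Override public void update() {
        super.update();
        display = heads ? "Hello" : "World";
    }

    @Override public void reset() {
        super.reset();
        display = "";
    }
}
```

Calling `new HelloWorldSketch(42L).run(10, 5)` performs ten tosses and two display updates; the loop that does this lives entirely in the parent class, which is the point of the design.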

At this point, you should be able to compile HelloWorld.java without errors using the Java compiler in the Java Software Development Kit (or a more sophisticated Java development environment, if you have one).

Our final task is to create a stub HTML file so that we can view the applet in a browser. With your text editor, create a new file called HelloWorld.html, type (or copy and paste) the lines below, and save the file to your folder.

  <html>
    <head>
      <title>HelloWorld</title>
    </head>
    <body>
      <p><applet code="HelloWorld.class" width="450" height="300">HelloWorld</applet></p>
    </body>
  </html>


The important tag is the applet tag, which merely has a reference to the HelloWorld class file and specifies the width and height of the applet.

That's it! You should be able to open HelloWorld.html with your browser and play with your applet. Click on the step button to toss the coin and see either "Hello" or "World" depending on whether your coin landed heads or tails. Click on the run button and practice changing the update and stop settings.

Figure 5 shows a screen shot of the applet. If you did not construct your own applet (or even if you did), you can click on the image to see the finished product.

## Object-Oriented Programming & Abstract Mathematics

Java, like most modern programming languages, has an object-oriented paradigm, as opposed to the procedural paradigm of older languages. In object-oriented programming (OOP), the basic programmatic elements are classes of objects that are defined by their properties and methods. A class of objects can be sub-classed by modifying the properties and methods or by adding additional properties and methods. An object can be passed, as a parameter, to another object.

The object-oriented paradigm, if not the particular terms in the jargon, should be clear to any teacher of mathematics, for it is the same paradigm as in abstract mathematical structures. My thesis in this section is that object-oriented programming can be pedagogically valuable to students of mathematics, particularly when the programming is centered on mathematical objects. I will use examples from the PSOL to make the case for this thesis.

An abstract probability distribution on a set S of real numbers is implemented in the PSOL as an abstract Java class. The probability mass function (or probability density function) f, and the domain S, are left unspecified, but then other quantities of interest (cumulative distribution function, quantile function, mean, variance, simulated value, etc.) can be computed from this function. These computations form the methods of the object and are simply the Java implementations of standard definitions in probability theory. For example, the mean of a discrete distribution on a countable set S with probability mass function f is given by

(1)       $\mu = \sum_{x \in S} x \, f(x)$.

On the other hand, the binomial distribution with parameters n and p governs the number of successes in n Bernoulli trials with success probability p on each trial. For example, the number of aces in the Dice Experiment has this distribution, where n is the number of dice and p is the probability of rolling an ace with a single die. The binomial distribution is implemented in the PSOL as a subclass of the abstract distribution class, by specifying the set of possible values S = {0, 1, ..., n} and the probability mass function f:

(2)       $f(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x \in \{0, 1, \ldots, n\}$.

Many of the generic methods of the abstract distribution class are then overridden (replaced) in the binomial distribution class with the appropriate special closed formulas. For example, as every student of probability knows, the mean of the binomial distribution is simply

(3)       $\mu = n p$.

Thus, the method for computing the mean in the abstract distribution class, which implements (1), is overridden in the binomial distribution class by implementing (3).
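
The override described here can be sketched in a few lines of Java. The classes below are written for this illustration and are not the library's classes; the names `DistributionSketch`, `BinomialSketch`, `getDomain`, `getDensity`, and `getMean` are assumptions, and the domain is restricted to a finite set of integers for simplicity.

```java
// Illustrative sketch only; class and method names are hypothetical, not the PSOL API.
abstract class DistributionSketch {
    // The domain S, left unspecified here (finite and integer-valued for simplicity).
    abstract int[] getDomain();

    // The probability mass function f, left unspecified here.
    abstract double getDensity(int x);

    // Generic mean: the sum of x * f(x) over x in S, as in (1).
    double getMean() {
        double sum = 0;
        for (int x : getDomain()) sum += x * getDensity(x);
        return sum;
    }
}

class BinomialSketch extends DistributionSketch {
    private final int n;
    private final double p;

    BinomialSketch(int n, double p) {
        if (n < 0 || p < 0 || p > 1) throw new IllegalArgumentException("invalid parameters");
        this.n = n;
        this.p = p;
    }

    @Override int[] getDomain() {
        int[] s = new int[n + 1];
        for (int x = 0; x <= n; x++) s[x] = x;
        return s;
    }

    // Binomial probability mass function, as in (2).
    @Override double getDensity(int x) {
        if (x < 0 || x > n) return 0;
        double c = 1; // binomial coefficient, built up multiplicatively
        for (int i = 1; i <= x; i++) c = c * (n - x + i) / i;
        return c * Math.pow(p, x) * Math.pow(1 - p, n - x);
    }

    // Closed formula n * p, as in (3), replacing the generic sum.
    @Override double getMean() { return n * p; }

    // Exposes the inherited generic computation so the two can be compared.
    double genericMean() { return super.getMean(); }
}
```

Asking students to check that `genericMean()` and `getMean()` agree (up to rounding) for several choices of n and p is a small exercise in exactly the parallel between code and theory that this section describes.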

A random variable is implemented in the PSOL as an object that contains both a distribution object and a data object. A random variable can be passed to a graph or table to display information about the distribution or empirical data. In particular, note that the graph and table in the Dice Experiment are not "hard-wired" for this particular applet -- the graph and table are general components that can be used with any random variable.
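
A minimal sketch of this composition follows, assuming integer-valued variables. It is not the library's design in detail: `RandomVariableSketch` and `TableSketch` are hypothetical names, a plain function stands in for the distribution object, and a map of counts stands in for the data object.

```java
import java.util.Locale;
import java.util.TreeMap;
import java.util.function.IntToDoubleFunction;

// Illustrative sketch only; names are hypothetical and not taken from the PSOL.
final class RandomVariableSketch {
    final String name;
    private final IntToDoubleFunction pmf;                            // stands in for the distribution object
    private final TreeMap<Integer, Integer> counts = new TreeMap<>(); // stands in for the data object
    private int size = 0;

    RandomVariableSketch(String name, IntToDoubleFunction pmf) {
        this.name = name;
        this.pmf = pmf;
    }

    // Record one observed value of the variable.
    void record(int value) {
        counts.merge(value, 1, Integer::sum);
        size++;
    }

    double probability(int value) {
        return pmf.applyAsDouble(value);
    }

    double relativeFrequency(int value) {
        return size == 0 ? 0.0 : counts.getOrDefault(value, 0) / (double) size;
    }
}

final class TableSketch {
    // Not tied to any one experiment: it only asks the variable for
    // probabilities and relative frequencies over a range of values.
    static String render(RandomVariableSketch v, int from, int to) {
        StringBuilder sb = new StringBuilder(v.name + "  distribution  data\n");
        for (int x = from; x <= to; x++) {
            sb.append(String.format(Locale.ROOT, "%d  %.4f  %.4f\n",
                    x, v.probability(x), v.relativeFrequency(x)));
        }
        return sb.toString();
    }
}
```

Because `TableSketch.render` depends only on the two query methods, the same table code would serve the sum, the maximum, or the number of aces without modification.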

The main point I want to make is that the object-oriented structure of the components parallels the underlying mathematical theory. I believe that designing an object model for an area of mathematics and programming the objects in this model lead to deep understanding of the mathematics, just as rigor and proof lead to deep understanding.

## Concluding Remarks

First, I hope you will visit the PSOL, use the resources in the library, and provide feedback on ways that these resources can be improved. Although there are many excellent projects in probability and statistics available on the web, I think that the PSOL project is special because of two attributes:

• Scale. The PSOL contains approximately 60 applets in probability and statistics, all with a common interface and with no explicit mathematical exposition. The applets are built out of approximately 100 distinct components.
• Reusability. The objects in the library, at both the applet level and the component level, are designed to be adapted and implemented in other projects. The objects are completely free and open source.

Second, I want to encourage authors and developers of educational resources in mathematics to give more attention to issues of reuse and adaptation. Currently, I believe, most authors tend to develop materials from scratch, without considering that high quality components may already be available. These developers also tend to think of their work only in terms of a final product to be adopted, without much thought of how their materials might be broken into components and reused (adapted). The Reusable Learning project is an excellent resource for authors and developers on such issues.

Finally, I want to encourage teachers and students not to think of themselves only as consumers of educational resources, but also as partners in the development process. To an increasing extent, teachers and students will assemble and adapt resources from disparate sources to create a customized learning environment. By putting these resources together, if only by providing the expository glue that connects them, teachers and students become developers.

## Online Resources

The following Web sites were linked to in the body of this article. We collect them here (in order of appearance) for your convenience. All links have been verified as of October 8, 2004. Other resources may be found elsewhere in this journal and in the MathDL collections.

Probability/Statistics Object Library (http://www.math.uah.edu/stat/objects/)

Creative Commons (http://creativecommons.org/)

Java.com (http://www.java.com)

Reusable Learning (http://www.reusablelearning.org/)

Math Forum @ Drexel (http://www.mathforum.org)

NetBeans.org (http://www.netbeans.org)

Java at Sun (http://java.sun.com)