I am writing this with a view to making headway in wrapping my head around Oracle's Java docs. Simply reading these dry documents is for me a daunting task, to say nothing of comprehending them! But I have found that the information becomes more interesting, and the process of reading more tolerable, when I put in the effort to conceptualize the info in a way that makes sense to me—which makes sense, since info can hardly interest you if you don't understand it.
Fortunately I have some idea as to how I can break this task down into subtasks. At present I am primarily interested in the core Java interfaces. These are the interfaces in the core Java SE APIs, defined in Oracle's API specification for Java SE.
Let us begin with what Oracle identifies as "[t]he root interface in the collection hierarchy": the Collection
interface. There's a lot to learn about Collection
. In this post we'll focus on what collections and objects of type Collection
are.
Collections
According to Oracle, collections are groups of objects called elements. Some collections can contain duplicates of their elements. We call these bags. Other collections cannot. Call them boxes.
(Edit: I forgot Oracle technically says a collection "represents" a group of objects, not that a collection is one. I'll have to think about how that affects what I say below.)
Boxes are like sets in set theory. Just as, in set theory, you can't have more than one of a single object in a set, you can't put more than one of the same object in a box.
Every once in a while you might see multiple instances of the same numeral in notation representing a set, such as But this is just a notational variant of So is or These are all the same set. Why? Because they have the same elements. There are not actually two in the set There's only one, because the number is just a single object. Only one such number exists. No matter how many times you write the numeral it always stands for the same number.
Here's a way to see my point. Let be Then is But obviously denotes the same thing wherever it appears. That is, So can only have three things in it: and So by substituting for we conclude that can only have three things in it: and
All this to say, sets never have duplicate elements. Boxes are the same way. Consider the paradigmatic box: an object of type Set
(a subinterface of Collection
). (From here on out, unless I indicate otherwise, "set" means a Set
object.) For any two elements e1
and e2
of a set, !e1.equals(e2)
. It follows, without loss of generality, that e1
and e2
are not duplicates.
You might doubt these things are true of all sets. Some sets have elements, the thought might go, that lack an equals
method. If e1
were one of those elements, the method call e1.equals(e2)
would be illegal. So it could not return false
as needed for the negation of its return value to be true
.
But you would be wrong! Any element of a collection is an object, i.e., an instance of a class. And all classes are subclasses of Object
. So they all inherit the equals
method from Object
. For this reason, no matter the type of e1
, e1
has an equals
method. How can we be sure though that e2
is a valid argument to pass? We can because it's an instance of Object
, as is e1
, and the lone parameter of Object
's equals
method is of type Object
.
Null Elements
So it looks like no set can contain multiple copies of an element. But hang on... Can't you put null
in a collection? That's not an object. It's a value indicating the absence of an object reference. I'm the one who's wrong then! Not just any element of any collection is an object.
So, what if e1
happens to be null
? Then e1
is not an instance of any class, and does not inherit an equals
method from Object
. In that case it's not strictly true that !e1.equals(e2)
. But there still are no duplicates of null
in the set, according to the documentation for Set
. Also, if e1
is null
, it is true that e1
and e2
are not such that e1.equals(e2)
. Why? For those elements to be such that e1.equals(e2)
, that method call would have to return true
. But the invocation is not allowed, so it cannot return true
. So the generalization in the documentation still holds true. That is, every set contains no pair of elements e1
and e2
such that e1.equals(e2)
.
Now we see why Oracle chose to deny any element pair is such that e1.equals(e2)
, instead of affirming, like I did, that all pairs are such that !e1.equals(e2)
. Wording matters. But in my defense, what I said was such truthlike, much truthy. Not all pairs of elements make the negation true, but all pairs of objects in any collection do.
There being collections with null elements doesn't square with defining "collections" as groups of objects and "elements" as objects in a collection. Some collections have none but null elements. No null element is an object, though each is in a collection. So, what are collections and elements? To answer this question, we have to consider two things.
First, what is it that all elements of Collection
objects have in common, be those elements objects or not? If there is something they all have in common, we may keep calling all Collection
objects collections, if we please. If that's what we want, we just have to revise our definition of "collection."
Second, is it true that all Collection
objects are collections? Perhaps objects of that type only truly qualify as collections when they contain objects. Then, if a Collection
object has only null elements, it's not a collection (group of objects). Assuming that's right, there is no need to revise our definition of "collection," just our notion of which Collection
objects the less formal term "collection" applies to. (That's if you have the same notion I had—that all Collection
objects are collections.)
I'll take these questions in turn. There is a common feature of all elements of Collection
objects. And it's not just that they're elements of Collection
objects, of course. Pointing that out doesn't help. We want to know what makes them elements!
The pertinent common feature is their being values. That is, all elements are pieces of data in the form of variables or literals.
Null elements are values known as reference literals. These differ from primitive literals, not with respect to whether they're values, but with respect to whether they are directly assignable—so, assignable without autoboxing—to variables of reference types. These are types like Object
and HashSet
. We declare variables of these types if we want variables capable of storing objects' memory addresses.
Though primitive literals are values, they cannot be elements. Why not? Because they're neither references nor reference literals. Collection
objects can only hold either variables of reference types or values directly assignable to variables of those types. This is because Collection
is a blueprint for data structures that are designed to group objects together. Primitives are not objects, nor are they the types of values that point to objects. So there is no need for Collection
to accommodate the inclusion of primitives as elements.
But in describing a collection as a group of objects known as its elements, Oracle seemed to be saying a collection's elements are the objects themselves. Not values serving as references to the objects. But then Oracle must mean something different by "element" when it refers to null elements in the documentation for Set
.
Suppose we want our usage of "element" to be maximally consistent. Then we're at a crossroads, semantics-wise. Either we stop calling null values elements, or we stop calling objects elements, opting to call values elements instead.
Say we stop calling objects elements. Then we can alternatively define a collection as a group of non-primitive values, and define its elements as those values. This way we can keep calling all Collection
objects collections, and keep calling occurrences of null
in a collection elements.
If on the other hand we reserve the term "element" for objects, then we have to say a Collection
object
has no elements if all
"contains" is null
. And if
not only "contains" null
, but also contains objects, we must say that those objects
contains are
only elements—that null
is not one of
elements. And we must deny that any variable or literal is itself an element of
since no variable or literal is itself an object.
As observed previously, if there are such things as null elements, some Collection
objects are not groups of objects—specifically Collection
objects with only null elements. So, by Oracle's definition, such objects are not collections. Neither are Collection
objects that "contain" null
instead of elements, if such objects are possible. They're not groups of objects.
The only way a Collection
object
containing no other objects can be a group of objects is if it's an element of itself.
would then be like a singleton set (in set theory) in that
has only one object as an element. But
would differ from a singleton in that (i) if we count null
as an element and
"contains" null
,
has at least two elements, and (ii)
is in
(In Zermelo-Fraenkel set theory, no set is a member of itself.)
We might establish a linguistic convention according to which every Collection
object is automatically an element of itself. That would make the relation between a Collection
object and its elements a bit more like the relation between a set and its subsets in set theory. Every set is a(n improper) subset of itself, because it contains all the elements it contains. It's an idea, but I imagine it would make matters more confusing rather than less.
I say we classify values, not objects, as elements. What do you think?
Top comments (0)