Path: alphanet.ch!imp.ch!news.imp.ch!sunqbc.risq.qc.ca!nntp.cs.ubc.ca!torn!watserv3.uwaterloo.ca!undergrad.math.uwaterloo.ca!neumann.uwaterloo.ca!alopez-o
From: alopez-o@neumann.uwaterloo.ca (Alex Lopez-Ortiz)
Newsgroups: sci.math,news.answers,sci.answers
Subject: sci.math FAQ: Fundamentals
Followup-To: sci.math
Date: 17 Feb 2000 22:51:57 GMT
Organization: University of Waterloo
Lines: 505
Approved: news-answers-request@MIT.Edu
Expires: Sun, 1 Mar 1998 09:55:55
Message-ID: <88hu2d$qtk$1@watserv3.uwaterloo.ca>
Reply-To: alopez-o@neumann.uwaterloo.ca
NNTP-Posting-Host: daisy.uwaterloo.ca
Summary: Part 1 of 31, New version
Keywords: Algebraic structures/What are numbers
Originator: alopez-o@neumann.uwaterloo.ca
Originator: alopez-o@daisy.uwaterloo.ca
Xref: alphanet.ch news.answers:69771 sci.answers:4514
Archive-name: sci-math-faq/numbers
Last-modified: February 20, 1998
Version: 7.5
FUNDAMENTALS
_________________________________________________________________
Algebraic structures
We will attempt to give a brief explanation of the following concepts:
* N is a monoid
* Z is an integral domain
* Q is a field
* in the field R the order is complete
* the field C is algebraically complete
If you have been asked by a child to give them arithmetic problems, so
they could show off their newly learned skills in addition and
subtraction I'm sure that after a few problems such as: 2 + 3, 9 - 5,
10 + 2 and 6 - 4, you tried tossing them something a little more
difficult: 4 - 7 only to be told `` That's not allowed.''
What you may not have realized is that you and the child did not just
have different objects in mind (negative numbers) but entirely
different algebraic systems. In other words a set of objects (they
could be natural numbers, integers or reals) and a set of operations,
or rules regarding how the numbers can be combined.
We will take a very informal tour of some algebraic systems, but
before we define some of the terms, let us build a structure which
will have some necessary properties for examples and counterexamples
that will help us clarify some of the definitions.
We know that any number that is divided by six will either leave a
remainder, or will be divided exactly (which is after all the
remainder 0). Let us write any number by the remainder n it leaves
after division by six, denoting it as [ n ]. This means that, 7, 55
and 1 will all be written [1], which we call the class to which they
all belong: i.e. 7 in [1], 55 in [1], or, a bit more technically, they
are all equivalent to 1 modulo 6. The complete set of class will
contain six elements, and this is called partitioning numbers into
equivalent classes because it separates (or partitions) all of our
numbers into these classes, and any one number in a class is
equivalent to any other in the same class.
One interesting thing we can do with these classes is to try to add or
to multiply them. What can [1] + [3] mean? We can, rather naively try
out what they mean in ``normal'' arithmetic: [1] + [3] = [1 + 3] =
[4]. So far so good, let us try a second example 25 in [1] and 45 in
[3], their sum is 70 which certainly belongs to [4]. Here we see what
we meant above by equivalence, 25 is equivalent to 1 as far as this
addition is concerned. Of course this is just one example, but
fortunately it can be proven that the sum of two classes is always the
class of the sums.
Now this is the kind of thing we all do when we add hours for example,
7 (o' clock) plus 6 hours is 1 (o' clock), and all we are really doing
is adding hours (modulo 12).
The neat part comes with multiplication, as we will see later on. But
for now just remember, it can be proven that something like [4] x [5]
= [2] will work: the product of two classes is the class of the
product.
Now for some of the necessary terminology.
Monoids and Groups
We need to define a group.
Let us take a set of objects and a rule (called a binary operation)
which allows us to combine any two elements of this set. Addition is
an example from math, or ANDing in some computer language.
The set must be closed under the operation. That means that when two
elements are combined the result must also be in the set. For example
the set containing even numbers will always give us an even number
when two elements are added together. But if we restrict ourselves to
odd numbers, their sum is not an odd number and so we know right off
the bat that the set of odd numbers and addition cannot constitute a
group. Some books will consider closure in the definition of binary
operation, and others add it as one of the requirements for a group
along with the ones that follow below.
The set and the operation is called a group if the binary operation
satisfies the following criteria:
* the operation is associative, which means it doesn't matter how
you group the elements you are operating on, for example in our
set of remainders: [1] + ([3] + [4]) = ([1] + [3]) + [4]
* there is an identity element, meaning: one of the elements
combined with the others in the set doesn't change them in the
least. For example the zero in addition, or the one in
multiplication.
* every element has an inverse with respect to that operation. If
you combine an element and its inverse you get the identity (of
that operation) back.
(Be careful with this last one, -3 is the inverse of 3 in addition,
since they give us 0 when added, but 1/3 is the inverse of 3 with
respect to multiplication, since 3 x 1/3 = 1 the identity under
multiplication.)
So we can see that the set of natural numbers N (with the operation of
addition) is not even a group, since there is no inverse for 5, for
example. (In other words there is no natural number which added to 5
will give us zero.) And so the third rule for our operation is
violated. But it still has some structure, even if it is not as rich
as the ones we'll see later on.
Sets with an associative operation (the first condition above) are
called semigroups, and if they also have an identity element (the
second condition) then they are called monoids.
Our set of natural numbers under addition is then an example of a
monoid, a structure that is not quite a group because it is missing
the requirement that every element have an inverse under the operation
(Which is why in elementary school 4 - 7 is not allowed.)
What about the set of integers, is it a group?
By itself this question is nonsensical. Why? Well, we have not
mentioned under what operation. OK, let us say: the set of integers
with addition.
Now, addition is associative, the zero does not change any number when
added to it, and for every number n we can add -n and get zero. So
it's a group all right.
In fact it is a special kind of group. When we can perform the
operation on the two elements in any order (e.g a + b = b + a) then
the group is called commutative, or Abelian in honor of Abel. Not
every operation is commutative, for example three minus two is
certainly not the same as two minus three. Our set of integers under
addition is then an Abelian group.
Rings
If we take an Abelian group (remember: a set with a binary operation)
and we define a second operation on it we get a bit more of a
structure than we had with just a group.
If the second operation is associative, and it is distributive over
the first then we have a ring. Note that the second operation may not
have an identity element, nor do we need to find an inverse for every
element with respect to this second operation. As for what
distributive means, intuitively it is what we do in math when perform
the following change: a x (b + c) = (a x b) + (a xc).
If the second operation is also commutative then we have what is
called a commutative ring. The set of integers (with addition and
multiplication) is a commutative ring (with even an identity - called
unit element - for multiplication).
Now let us go back to our set of remainders. What happens if we
multiply [5] x [1]? We see that we get [5], in fact we can see a
number of things according to our definitions above, [5] is its own
inverse, and [1] is the multiplicative element. We can also show
easily enough (by creating a complete multiplication table) that it is
commutative. But notice that if we take [3] and [2], neither of which
are equal to the class that the zero belongs to [0], and we multiply
them, we get [3]x[2] = [0]. This bring us to the next definition. In a
commutative ring, let us take an element which is not equal to zero
and call it a. If we can find a non-zero element, say b that combined
with a equals zero ( a x b = 0) then a is called a zero divisor.
A commutative ring is called an integral domain if it has no zero
divisors. Well the set Z with addition and multiplication fullfills
all the necessary requirements, and so it is an integral domain.
Notice that our set of remainders is not an integral domain, but we
can build a similar set with remainders of division by five, for
example, and voil`, we have an integral domain.
Let us take, for example, the set Q of rational numbers with addition
and multiplication - I'll leave out the proof that it is a ring, but I
think you should be able to verify it easily enough with the above
definitions. But to give you a head start, notice the addition of
rationals follow all the requirements for an abelian group. If we
remove the zero we will have another abelian group, and that implies
that we have something more than a ring, in fact, as we will see in
the next section.
Fields
Now we can make one step further. If the elements of a ring, excluding
the zero, form an abelian group (with the second operation) then it is
a field. For example, write the multiplication table of the remainders
of division by 5, and you will see that it satisfies all the
requirements for a group: (You will probably have noticed that the
group does not contain the number five itself since [5] = [0].)
(tabular)(c | c c c c) 1 2 3 4 ; 1 2 3 4 ; 2 2 4 1 3 ; 3 3 1 4 2 ; 4 4
3 2 1(tabular)
(Why isn't the set of divisors of six - excluding the zero and under
multiplication - a group? That's easy enough, since we have excluded
the zero we do not have the result of [2] x [3] in our set, so it
isn't closed.)
Ordering
Given a ring, we can say that it is ordered when you have a special
subset of that ring behaves in a very special way. If any two elements
of that special subset are added or multiplied their sum and their
product are again in the special subset. Take the negative numbers in
R , can they be that special subset? Well the sum seems to be
allright, it is also a negative number. But things don't work with the
product: it is positive. What about the positive numbers? Yep, and in
fact we call that special subset, the set of positive elements. Now,
we gave the definition for an ordered ring, we can also define an
ordered field the same way.
But what does a complete ordered field mean? Well the definition looks
rather nasty: it is complete if every non-empty subset which posesses
an upper bound has a least upper bound.
Let's translate some of that, trying to lose as little information on
the way. A bound is something that guarantees that all of the elements
of your set are on one side of it (reasonably enough). For example,
certainly all negative reals are less than 100, so 100 is a bound (it
is in fact an upper bound 'cause all negatives are ``below'' it). But
there are lots of other bounds, 1, 5, 26 will all do nicely. The
question now is, of all of these (upper bounds) which is the smallest,
that is which one is ``the border'' so to speak? Does it always
exists?
Let's take the rationals, and look at the following numbers:
1.4, \; 1.41, \; 1.414, \; 1.4142, \; 1.41421, ...
Now each of these is a rational number (it can be written as a
fraction), and they are getting closer and closer to a number we've
probably seen before (just take out your calculator and find the
square root of two). So we can write the shorthand for this series as
sqrt(2). Certainly we can find an upper bound for this series, 3 will
do nicely, but so can 1.5, or 1.42. But what is the smallest. Well
there isn't any. Not among the rational at least, because no matter
what fraction you give I can give you one closer to the square root of
two. What about the square root of two itself? Well it's not a
rational number (I'll skip the proof, but it is really rather easy) so
you can't use it. If you want another series which is really neat look
at the section on ``Euler's formula'' in the FAQ.
And that is where the reals come in. Any set or reals that is bounded
you can certainly find the smallest of these bounds. (By the way this
``least upper bound'' is abbreviated ``l.u.b.'', or ``sup'' for
supremum.) We can also turn things around and talk of lower bounds,
and of the largest of these etc. but most of that will be just a
mirror image of what we have dealt with so far.
So that should be it. And for years that did seem to be it, we seemed
to have all the numbers we'd ever care to have.
There was just one small stick in the works, but most people just sort
of pretended not to notice, and that was that not all polynomials had
solutions. One simple polynomial of this kind is x^2 + 1 = 0. It's so
simple, yet there's no self respecting number that would solve this
polynomial. There were these funny answers which seemed like they
should be solutions but no one could make any sense out of them, so
they were considered imaginary solutions. Which was really too bad
because they were given the name of imaginary numbers and now that the
name stuck we realize that they are numbers just as good as any of the
ones we have been using for centuries. And in fact that takes us to
the last great pinnacle in this short excursion. The field of complex
numbers.
We can define an algebraically closed field as a field where every
nonconstant polynomial (i.e. one with an x in it from high school
days) has a zero in the field. Whew! This in short means that as long
as the polynomial is not a constant number (which is no fun anyways)
but something which looks like it wants a solution, like 5 x^3 - 2 x^2
+ 6 = 0 it will always have one, if you are working with complex
numbers and not just reals.
There is another definition which is probably just as good, but may or
may not be easier: A field is algebraically closed if every polynomial
splits into linear factors. Linear factors are briefly factors not
containing x to any power of two or higher, in other words in the
form: ax + b. For example x^2 + x - 6 can be factored as (x + 3)(x -
2), but if we are in the field of reals we cannot factor x^2 + 1, but
we can in the field of complex numbers: x^2 + 1 = (x - i)(x + i),
where, you may recall, i^2 = -1.
_________________________________________________________________
What are numbers?
Introduction
Informally:
* N = { 0,1,... } or N = { 1,2,... }
Wether 0 is in N depends on where you live and what is your field
of interest. At the informal level it is a religious topic.
* Z = { ..., - 1,0,1,... }
* Q = { p/q | p, q in Z and q != 0 }
* R = { d_0.d_1d_2... | d_0 in Z and 0 <= d_i <= 9 for i > 0 }
* C = { a + b o i | a, b in R and i^2 = -1 }
Construction of the Number System
Formally (following the mainstream in math) the numbers are
constructed from scratch out of the axioms of Zermelo Fraenkel set
theory (a.k.a. ZF set theory) [Enderton77, Henle86, Hrbacek84]. The
only things that can be derived from the axioms are sets with the
empty set at the bottom of the hierarchy. This will mean that any
number is a set (it is the only thing you can derive from the axioms).
It doesn't mean that you always have to use set notation when you use
numbers: just introduce the numerals as an abbreviation of the formal
counterparts.
The construction starts with N and algebraically speaking, N with its
operations and order is quite a weak structure. In the following
constructions the structures will be strengthen one step at the time:
Z will be an integral domain, Q will be a field, for the field R the
order will be made complete, and field C will be made algebraically
complete.
Before we start, first some notational stuff:
* a pair (a,b) = { { a } , { a,b } } ,
* an equivalence class [a] = { b | a == b } ,
* the successor of a is s(a) = a U { a } .
Although the previous notations and the constructions that follow are
the de facto standard ones, there are different definitions possible.
Construction of N
* { } in N
* if a in N then s(a) in N
* N is the smallest possible set such that the preceding rules hold.
Informally n = { 0,...,n - 1 } (thus 0 = { } , 1 = { 0 } , 2 = { 0,1 }
, 3 = { 0,1,2 } ). We will refer to the elements of N by giving them a
subscript _n. The relation <_n on N is defined as: a_n <_n b_n iff a_n
in b_n. We can define +_n as follows:
* a_n +_n 0_n = a_n
* a_n +_n s(b_n) = s(a_n +_n b_n)
Define *_n as:
* a_n *_n 0_n = 0_n
* a_n *_n s(b_n) = (a_n *_n b_n) +_n a_n
Construction of Z
We define an equivalence relation on N x N: (a_n,b_n) ==_z(c_n,d_n)
iff a_n +_n d_n = c_n +_n b_n. Note that ==_z ``simulates'' a
subtraction in N . Z = { [(a_n,b_n)]_z | a_n, b_n in N } . We will
refer to the elements of Z by giving them a subscript _z. The elements
of N can be embedded as follows: embed_n : N --> Z such that
embed_n(a_n) = [(a_n,0_n)]_z. Furthermore we can define:
* [(a_n,b_n)]_z <_z [(c_n,d_n)]_z iff a_n +_n d_n <_n c_n +_n b_n
* [(a_n,b_n)]_z +_z [(c_n,d_n)]_z = [(a_n +_n c_n, b_n +_n d_n)]_z
* [(a_n,b_n)]_z *_z [(c_n,d_n)]_z =
[((a_n *_n c_n) +_n (b_n *_n d_n), (a_n *_n d_n) +_n (c_n *_n
b_n))]_z
Construction of Q
We define an equivalence relation on Z x (Z { 0_z }): (a_z,b_z) ==_q
(c_z,d_z) iff a_z *_z d_z = c_z *_z b_z. Note that ==_q ``simulates''
a division in Z . Q = { [(a_z,b_z)]_q | a_z in Z and b_z in Z { 0_z }
} . We will refer to the elements of Q by giving them a subscript _q.
The elements of Z can be embedded as follows: embed_z : Z --> Q such
that embed_z(a_z) = [(a_z,1_z)]_q. Furthermore we can define:
* [(a_z,b_z)]_q <_q [(c_z,d_z)]_q iff a_z *_z d_z <_z c_z *_z b_z
when 0_z <_z b_z and 0_z <_z d_z
* [(a_z,b_z)]_q +_q [(c_z,d_z)]_q = [((a_z *_z d_z) +_z (c_z *_z
b_z), b_z *_z d_z)]_q
* [(a_z,b_z)]_q *_q [(c_z,d_z)]_q = [(a_z *_z c_z, b_z *_z d_z)]_q
Construction of R
The construction of R is different (and more awkward to understand)
because we must ensure that the cardinality of R is greater than that
of Q .
Set c is a Dedekind cut iff
* { } subset c subset Q (strict inclusions!)
* c is closed downward:
if a_q in c and b_q <_q a_q then b_q in c
* c has no largest element:
there isn't an element a_q in c such that b_q <_q a_q for all b_q
!= a_q in c
You can think of a cut as taking a pair of scissors and cutting Q in
two parts such that one part contains all the small numbers and the
other part contains all large numbers. If the part with the small
numbers was cut in such a way that it doesn't have a largest element,
it is called a Dedekind cut. R = { c | c is a Dedekind cut } . We will
refer to the elements of R by giving them a subscript _r. The elements
of Q can be embedded as follows: embed_q : Q --> R such that
embed_q(a_q) = { b_q | b_q <_q a_q } . Furthermore we can define:
* a_r <_r b_r iff a_r subset b_r (strict inclusion!)
* a_r +_r b_r = { c_q +_q d_q | c_q in a_r and d_q in b_r }
* -_r a_r = ; { b_q | there exists an c_q in Q such that b_q <_q c_q
and (-1)_q *_q c_q in a_r }
* |a_r|_r = a_r U -_r a_r
* *_r is defined as:
+ if not a_r <_r 0_r and not b_r <_r 0_r
then a_r *_r b_r = 0_r U { c_q *_q d_q | c_q in a_r and d_q
in b_r }
+ if a_r <_r 0_r and b_r <_r 0_r then a_r *_r b_r = |a_r|_r *_r
|b_r|_r
+ otherwise a_r *_r b_r = -_r (|a_r|_r *_r |b_r|_r)
There exists an alternative definition of R using Cauchy sequences: a
Cauchy sequence is a s : N --> Q such that s(i_n) +_q((-1)_q *_q
s(j_n)) can be made arbitrary near to 0_q for all sufficiently large
i_n and j_n. We will define an equivalence relation ==_r on the set of
Cauchy sequences as: r ==_r s iff r(m_n) +_q((-1)_q *_q s(m_n)) can be
made arbitrary close to 0_q for all sufficiently large m_n. R = {
[s]_r | s is a Cauchy sequence } . Note that this definition is close
to ``decimal'' expansions.
Construction of C
C = R x R. We will refer to the elements of C by giving them a
subscript _c. The elements of R can be embedded as follows: embed_r :
R --> C such that embed_r(a_r) = (a_r,0_r). Furthermore we can define:
* (a_r,b_r) +_c (c_r,d_r) = (a_r +_r c_r, b_r +_r d_r)
* (a_r,b_r) *_c (c_r,d_r) = ((a_r *_r c_r) +_r -_r (b_r * d_r), (a_r
*_r d_r) +_r (b_r *_r c_r))
There exists an elegant alternative definition using ideals. To be a
bit sloppy: C = R [x]/< (x *_r x) +_r 1_r > , i.e. C is the resulting
quotient ring of factoring ideal < (x *_r x) +_r 1_r > out of the ring
R [x] of polynomials over R . The sloppy part is that we need to
define concepts like quotient ring, ideal, and ring of polynomials.
Note that this definition is close to working with i^2 = -1: (x *_r x)
+_r 1_r = 0_r can be rewritten as (x *_r x) = (-1)_r.
Rounding things up
At this moment we don't have that N is a subset of Z , Z of Q , etc.
But we can get the inclusions if we look at the embedded copies of N ,
Z , etc. Let
* N' = ran embed_r o embed_q o embed_z o embed_n
* Z' = ran embed_r o embed_q o embed_z
* Q' = ran embed_r o embed_q
* R' = ran embed_r
For these sets we have N' subseteq Z' subseteq Q' subseteq R' subseteq
C. Furthermore these sets have all the properties that the
``informal'' numbers have.
What's next?
Well, for some of the more alien parts of math we can extend this
standard number system with some exotic types of numbers. To name a
few:
* Cardinals and ordinals
Both are numbers in ZF set theory [Enderton77, Henle86, Hrbacek84]
and so they are sets as well. Cardinals are numbers that represent
the sizes of sets, and ordinals are numbers that represent well
ordered sets. Finite cardinals and ordinals are the same as the
natural numbers. Cardinals, ordinals, and their arithmetic get
interesting and ``tricky'' in the case of infinite sets.
* Hyperreals
These numbers are constructed by means of ultrafilters [Henle86]
and they are used in non-standard analysis. With hyperreals you
can treat numbers like Leibnitz and Newton did by using
infinitesimals.
* Quaternions and octonions
Normally these are constructed by algebraic means (like the
alternative C definition that uses ideals) [Shapiro75, Dixon94].
Quaternions are used to model rotations in 3 dimensions.
Octonions, a.k.a. Cayley numbers, are just esoteric artifacts :-).
Well, if you know where they are used for, feel free to contribute
to the FAQ.
* Miscellaneous
Just to name some others: algebraic numbers [Shapiro75], p-adic
numbers [Shapiro75], and surreal numbers (a.k.a. Conway numbers)
[Conway76].
Cardinals and ordinals are commonly used in math. Most mortals won't
encounter (let alone use) hyperreals, quaternions, and octonions.
References
J.H. Conway. On Numbers and Games, L.M.S. Monographs, vol. 6. Academic
Press, 1976.
H.B. Enderton. Elements of Set Theory. Academic Press, 1977.
G.M. Dixon. Division Algebras; Octonions, Quaternions, Complex Numbers
and the Algebraic Design of Physics. Kluwer Academic, 1994.
J.M. Henle. An Outline of Set Theory. Springer Verlag, 1986.
K. Hrbacek and T. Jech. Introduction to Set Theory. M. Dekker Inc.,
1984.
L. Shapiro. Introduction to Abstract Algebra. McGraw-Hill, 1975.
This subsection of the FAQ is Copyright (c) 1994, 1995 Hans de
Vreught. Send comments and or corrections relating to this part to
J.P.M.deVreught@cs.tudelft.nl
_________________________________________________________________
--
Alex Lopez-Ortiz alopez-o@unb.ca
http://www.cs.unb.ca/~alopez-o Assistant Professor
Faculty of Computer Science University of New Brunswick