Here's the next installment of our series on algebraic number theory. In the
last installment we had a quick look at groups and rings. Now it's time to look at field theory, with special emphasis on what is known as Galois theory. The latter is all about developing a concise description of the relations among the roots of an irreducible polynomial equation using group theory. Some of this theory was famously sketched out in 1832 by Évariste Galois on the night before a duel in which he died.
Galois theory makes it possible to prove several well-known results, such as the impossibility of expressing the solution of some fifth degree polynomial equations in terms of radicals and the impossibility of trisecting some angles with straightedge and compass. We won't go into that, but instead we will eventually see Galois theory used frequently in algebraic number theory.
A field is simply a ring whose multiplication is commutative, has an identity element, and has multiplicative inverses for all elements except the additive identity element. We've already mentioned several examples of fields, specifically number fields, which are algebraic extensions of finite degree of the rationals Q. (I. e., each element of such a field is an algebraic number in some finite extension of Q.) More exotic examples of fields certainly exist, though, such as finite fields, fields of functions of various kinds, p-adic number fields, and certain other types of local fields. If you go far enough in algebraic number theory, you'll encounter all of these.
The most important set of facts about fields for our purposes lies in what is known as Galois theory. This is the theory developed originally by Évariste Galois to deal (among other things) with the solvability or non-solvability, using radicals, of algebraic equations. It tells us a lot about the structure of field extensions in terms of certain groups, called Galois groups, which are constructed using permutations of roots of a polynomial which determines the extension. (Permutations are 1-to-1 mappings of a set onto itself.) A little more precisely, a Galois group consists of automorphisms of a field, i. e. 1-to-1 maps (functions) of the field onto itself which preserve the field structure. All such automorphisms, it turns out, can be derived from permutations of the roots of a polynomial, under the right conditions.
The importance of Galois theory is that it sketches out some of the "easy" background facts about a given field extension, into which some of the more difficult facts about the algebraic integers of the extension must fit.
Before we proceed, let's review some notations and definitions that will be used frequently. Suppose F is a field. For now, we will assume F is a subset of the complex numbers C, but not necessarily a subset of the real numbers R. If x is an indeterminate (an "unknown"), then F[x] is the set of polynomials in powers of x with coefficients in F. F[x] is obviously a ring. If f(x)∈F[x] is a nonzero polynomial, it has degree n if n is the highest power of x that appears with a nonzero coefficient. f(x) is monic if the coefficient of its highest power of x is 1. If f(x) has degree n≥1, it is said to be irreducible over F if it is not the product of two (or more) nonconstant polynomials in F[x] having degree less than n.
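As a small illustration of irreducibility over Q, here is a minimal sketch that relies on two standard facts not spelled out above: a polynomial of degree 2 or 3 is reducible over a field exactly when it has a root in that field, and any rational root p/q of an integer polynomial has p dividing the constant term and q dividing the leading coefficient. The function names are hypothetical, and the test is only valid for degrees 2 and 3.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial.

    coeffs lists the coefficients from the highest power down,
    so x^3 - 2 is [1, 0, 0, -2]. The leading coefficient and the
    constant term are assumed to be nonzero.
    """
    lead, const = coeffs[0], coeffs[-1]
    if lead == 0 or const == 0:
        raise ValueError("leading and constant coefficients must be nonzero")
    roots = set()
    for p in divisors(const):
        for q in divisors(lead):
            for candidate in (Fraction(p, q), Fraction(-p, q)):
                # Horner evaluation in exact rational arithmetic
                value = Fraction(0)
                for c in coeffs:
                    value = value * candidate + c
                if value == 0:
                    roots.add(candidate)
    return roots


def is_irreducible_low_degree(coeffs):
    """Irreducibility over Q, valid only for degree 2 or 3."""
    if len(coeffs) - 1 not in (2, 3):
        raise ValueError("this test only applies to degree 2 or 3")
    return not rational_roots(coeffs)


print(is_irreducible_low_degree([1, 0, 0, -2]))  # x^3 - 2
print(is_irreducible_low_degree([1, 1, 1]))      # x^2 + x + 1
print(rational_roots([1, -3, 2]))                # x^2 - 3x + 2 = (x-1)(x-2)
```

For degree 4 and higher the absence of rational roots is no longer enough, since a quartic can factor into two irreducible quadratics.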
A complex number α, which is not in F, is algebraic over F if f(α)=0 for some nonzero f(x)∈F[x]. f(x) is said to be a minimal polynomial for α over F if f(x) is monic, f(α)=0, and no nonzero polynomial g(x)∈F[x] whose degree is less than that of f(x) has g(α)=0. (Note that any nonzero polynomial such that f(α)=0 can be made monic without changing its degree, by dividing by its leading coefficient.) A minimal polynomial is therefore irreducible over F: if it factored into two polynomials of lower degree, α would be a root of one of the factors. F(α) is defined to be the set of all quotients g(α)/h(α) where g(x) and h(x) are in F[x] and h(α)≠0. F(α) is a field, and it is referred to as the field obtained by adjoining α to F.
If E is any field that contains F, such as F(α), the degree of E over F, written [E:F], is the dimension of E as a vector space over F. (Usually this is assumed to be finite, but there are infinite dimensional extensions also.) It is relatively easily proven that if α is algebraic over F and if the minimal polynomial of α has degree n, then [F(α):F]=n. Of course, more than one element can be adjoined to form an extension. For instance, with two elements α and β we write F(α,β), which means (F(α))(β). (Or (F(β))(α) – the order doesn't matter.)
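To make the statement [F(α):F]=n concrete, take F=Q and α=√2, whose minimal polynomial x²-2 has degree 2. The sketch below (the class name QSqrt2 is hypothetical) stores an element of Q(√2) as two rational coordinates with respect to the basis {1, √2}, and shows that quotients never lead outside this two-dimensional space.

```python
from fractions import Fraction


class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), with a and b rational."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r, using r^2 = 2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b r) = (a - b r)/(a^2 - 2 b^2). The denominator is nonzero
        # for a nonzero element because sqrt(2) is irrational.
        norm = self.a ** 2 - 2 * self.b ** 2
        return QSqrt2(self.a / norm, -self.b / norm)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"


r = QSqrt2(0, 1)          # sqrt(2) itself
x = QSqrt2(1, 1)          # 1 + sqrt(2)
print(r * r)              # the rational number 2, as an element of Q(sqrt 2)
print(x.inverse())        # again of the form a + b*sqrt(2)
print(x * x.inverse())    # the multiplicative identity
```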
We will frequently need one more important fact. Suppose we have two successive extensions, involving three fields, say D⊇E⊇F. This is called a tower of fields. Then D is a vector space over E, as is E over F. From basic linear algebra, D is also a vector space over F, and vector space dimensions multiply. Consequently, in this situation we have the rule that degrees of field extensions multiply in towers: [D:F]=[D:E][E:F].
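The linear algebra behind this rule can be summarized as follows. The symbols e_j, d_i, m and k are introduced here only for illustration, and both degrees are assumed to be finite.

```latex
\begin{aligned}
&\{e_1,\dots,e_m\} \text{ a basis of } E \text{ over } F, \qquad
 \{d_1,\dots,d_k\} \text{ a basis of } D \text{ over } E \\
&\Longrightarrow\ \{\, d_i e_j : 1 \le i \le k,\ 1 \le j \le m \,\}
 \text{ is a basis of } D \text{ over } F, \\
&\text{so } [D:F] = k\,m = [D:E]\,[E:F].
\end{aligned}
```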
Now we're almost ready to define a group, called the Galois group, corresponding to an extension field E⊇F. However, Galois groups can't be properly defined for all field extensions E⊇F. The extension must have a certain property. Here is the problem: The group we want should be a group of permutations on a certain set, namely the set of all roots of a polynomial equation. But consider this equation: x³-2=0. One root of this equation is the (real) cube root of 2, 2^(1/3). The other two roots are ω·2^(1/3) and ω²·2^(1/3), where ω=(-1+√-3)/2. You can check that ω³=1 and ω satisfies the second degree equation x²+x+1=0. ω is called a root of unity, a cube root of unity in particular. (Roots of unity, as we'll see, are very important in algebraic number theory.) Now, the extension field E=Q(2^(1/3)) is contained in R, but the other roots of x³-2=0 are complex, so not in the extension E. An automorphism of E must send 2^(1/3) to a root of the equation that lies in E, and the only such root is 2^(1/3) itself. This means that it isn't possible to find an automorphism of E which permutes the roots of the equation. Hence we can't have the Galois group we need for an extension like E.
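The claims about ω and the three roots can be checked numerically. This is only a floating-point sanity check under an arbitrary tolerance, not a proof, and the helper name close is hypothetical.

```python
import cmath

# omega = (-1 + sqrt(-3)) / 2, a primitive cube root of unity
omega = (-1 + cmath.sqrt(-3)) / 2
cube_root_2 = 2 ** (1 / 3)

# The three roots of x^3 - 2 = 0 described in the text
roots = [cube_root_2, omega * cube_root_2, omega ** 2 * cube_root_2]


def close(z, w, tol=1e-12):
    """True if z and w agree up to floating-point error."""
    return abs(z - w) < tol


omega_checks = close(omega ** 3, 1) and close(omega ** 2 + omega + 1, 0)
all_are_roots = all(close(r ** 3, 2) for r in roots)
real_roots = [r for r in roots if abs(complex(r).imag) < 1e-12]

print(omega_checks)      # omega^3 = 1 and omega^2 + omega + 1 = 0
print(all_are_roots)     # each listed number cubes to 2
print(len(real_roots))   # how many of the roots are real
```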
The property of an extension E⊇F that we need to have is that for any polynomial f(x)∈F[x] which is irreducible (has no nontrivial factors) over F, if f(x) has one root in E, then all of its roots are in E, and so f(x) splits completely in E, i. e. f(x) splits into linear (first degree) factors in E. An equivalent condition (as it turns out), though seemingly weaker, is that there be even one irreducible f(x)∈F[x] such that f(x) splits completely in E but in no proper subfield of E that contains F. That is, E must be the smallest field containing F in which the irreducible polynomial f(x)∈F[x] splits completely. E is said to be a splitting field of f(x). The factorization can be written
f(x) = ∏_{1≤i≤n} (x - α_i)
with all α_i∈E, where n is the degree of f(x). (Remember that we are assuming f(x) is monic.) When this is the case, E is generated over F by adjoining all the roots of f(x) to F. In this case it can be shown that the degree [E:F] is a multiple of n and a divisor of n!; it equals n exactly when adjoining a single root of f(x) already produces all of E.
An extension that satisfies these conditions is said to be a Galois extension, and it is the kind of extension we need in order to define the Galois group G(E/F). (Sometimes the type of extension just described is called a normal extension, and a further property known as separability is required for a Galois extension. As long as we are dealing with subfields of C, extensions are automatically separable, so the concepts of Galois and normal are the same in this case.)
Suppose E⊇F is a finite extension that may or may not be Galois. If E is a proper extension of F (i. e. E≠F), if α∈E but α∉F, and if f(x) is a minimal polynomial for α over F, then the degree [E:F] of the extension is greater than or equal to the degree of f(x), because E contains F(α) and [F(α):F] is equal to the degree of f(x). The two degrees are equal only if α happens to be a primitive element for the extension, so that E=F(α). Note also that all the roots of f(x) must be adjoined to F to obtain a Galois extension, not just a single root, so the splitting field of f(x) can have degree larger than the degree of f(x).
In the example above with f(x)=x³-2, we have E = Q(ω,2^(1/3)) = Q(ω)(2^(1/3)), [Q(ω):Q]=2 and [Q(ω,2^(1/3)):Q(ω)]=3, so the degree of the splitting field of f(x) over Q is 6, because degrees multiply.
Q(2^(1/3))⊇Q is an example of a field extension that is not Galois. But Q(ω,2^(1/3))⊇Q(ω) is Galois, since f(x) is irreducible over Q(ω) but splits completely in the larger field. Likewise, Q(ω)⊇Q is Galois, and in fact all extensions of degree 2 are Galois. (If f(x)∈Z[x] is a quadratic which is irreducible over Q and has one root in E, then the roots are given by the quadratic formula and involve √d for some d∈Z, so if one is in E, both are.)
We'll come back to this example, but first we'll look at a simpler one to get some idea of how Galois groups work. Consider the two equations x²-2=0 and x²-3=0. The roots of the first are x=±√2, and the roots of the second are x=±√3. We will start from the field Q and adjoin one root of each equation. This yields two different fields: E₂=Q(√2) and E₃=Q(√3). If we adjoin a root from both equations we get a larger field that contains the others as subfields: E=Q(√2,√3).
Consider the field extension E₂⊇Q first. We use the notation G(E₂/Q) to denote the Galois group of the extension. In this example, call it G₂ for short. We will use Greek letters σ and τ to denote Galois group elements in general. G₂ consists of two elements. One of these is the identity (which we denote by "1") which acts on elements of the field E₂ but (by definition) leaves them unchanged. This can be symbolized as 1(α)=α for all α∈E₂. The action of a Galois group element can be fully determined by how it acts on a generator of the field, meaning √2 in this case. So it is enough to specify that 1(√2) = √2. This Galois group has just one other element σ₂, which is defined by σ₂(√2)=-√2. An important property that a Galois group must satisfy is that the action of all its elements leaves the base field (Q in this case) unchanged. A Galois group is an example of a group that acts on a set, a very important concept in group theory. But there is an additional requirement on Galois groups: each group element must preserve the structure of the field it acts on. In technical terms, it must be a field automorphism. We'll see the importance of this condition very soon.
As you can probably anticipate, the Galois group G₃=G(E₃/Q) has elements 1 and σ₃ defined by σ₃(√3)=-√3. We can now ask: what is the Galois group of the larger extension E⊇Q? It must contain 1, σ₂ and σ₃. We have to think about how (for instance) σ₂ acts on √3. The clever thing about Galois theory is that it's easy to say what this action should be: σ₂ should leave √3 unchanged: σ₂(√3)=√3. In particular, σ₂(√3) cannot be ±√2. The reason is that σ₂ leaves the coefficients of x²-3=0 unchanged, and because σ₂ is a structure-preserving field automorphism it cannot map something that is a root of that equation (such as √3) to something that is not a root of that equation (±√2).
For any finite group G, the order of the group is the number of distinct elements. We symbolize the order of G by #(G). In Galois theory it is shown that the order of a Galois group is the same as the degree of the corresponding field extension. Symbolically: #(G(E/F))=[E:F]. Basically this is because we can always find a primitive element θ such that E=F(θ), and θ satisfies an irreducible equation f(x)=0, where the degree of f(x) is n=[E:F]. The other n-1 roots of that equation are said to be conjugate roots. We get n automorphisms, the elements of G(E/F), generated from mapping θ to one of its conjugates (or to itself, giving the identity automorphism). Since the degrees of field extensions in towers multiply, so too do the orders of Galois groups in field towers, as long as every extension involved is Galois. That is, if D⊇E⊇F, where D⊇F, D⊇E and E⊇F are all Galois, then #(G(D/F)) = #(G(D/E))#(G(E/F)). In our example, the degree of the extension is [Q(√2,√3):Q] = [Q(√2,√3):Q(√2)][Q(√2):Q] = 2·2 = 4. So this is also the order of the Galois group G=G(Q(√2,√3)/Q), and therefore we need to find 4 elements.
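One possible primitive element for this extension is θ=√2+√3. That choice is an assumption made here for illustration; the text above does not single one out. Squaring gives θ²=5+2√6, so (θ²-5)²=24 and θ satisfies x⁴-10x²+1=0, an equation of degree 4=[E:Q]. The sketch below checks numerically, up to floating-point error, that θ and its three conjugates ±√2±√3 are all roots.

```python
import math
from itertools import product


def f(x):
    """The candidate minimal polynomial x^4 - 10x^2 + 1."""
    return x ** 4 - 10 * x ** 2 + 1


# theta and its conjugates: every sign choice in (+-)sqrt2 + (+-)sqrt3
conjugates = [s * math.sqrt(2) + t * math.sqrt(3)
              for s, t in product((1, -1), repeat=2)]

residuals = [abs(f(x)) for x in conjugates]

for x, r in zip(conjugates, residuals):
    print(f"x = {x:+.6f}   |f(x)| = {r:.2e}")
```

The four sign choices correspond to the four automorphisms we are looking for: each one sends θ to a different conjugate.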
We've already identified three of the elements (1, σ₂ and σ₃). It's pretty clear that the remaining element must be a product of group elements: τ=σ₂σ₃. The product of Galois group elements is just the composition of the elements, which are field automorphisms (which happen to be derived from permutations on roots of equations), and hence they compose like any other function (or permutation). (Composition is just another term for the function which is the result of applying one function after another.) Because of how σ₂ and σ₃ are defined, it must be the case that τ(√2)=-√2 and τ(√3)=-√3. Since E⊇Q is generated by √2 and √3, and τ is a field automorphism, we can figure out what τ(α) must be for any other α∈E. For instance, τ(√6)=√6, since √6=√2√3 and so τ(√6)=τ(√2)τ(√3)=(-√2)(-√3).
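The four automorphisms can be written down exactly. The sketch below assumes the representation of an element a + b√2 + c√3 + d√6 of E as the tuple (a, b, c, d); the function names are hypothetical. It spot-checks, on a finite set of sample elements, that each map respects sums and products.

```python
from itertools import product

# a + b*sqrt2 + c*sqrt3 + d*sqrt6 is stored as the tuple (a, b, c, d)
SQRT2, SQRT3, SQRT6 = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def add(x, y):
    return tuple(p + q for p, q in zip(x, y))


def mul(x, y):
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    # uses sqrt2*sqrt3 = sqrt6, sqrt2*sqrt6 = 2*sqrt3, sqrt3*sqrt6 = 3*sqrt2
    return (a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
            a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),
            a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),
            a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2)


def sigma2(x):
    """sqrt2 -> -sqrt2, sqrt3 fixed (hence sqrt6 -> -sqrt6)."""
    a, b, c, d = x
    return (a, -b, c, -d)


def sigma3(x):
    """sqrt3 -> -sqrt3, sqrt2 fixed (hence sqrt6 -> -sqrt6)."""
    a, b, c, d = x
    return (a, b, -c, -d)


def tau(x):
    """The product sigma2 sigma3, i.e. sigma2 applied after sigma3."""
    return sigma2(sigma3(x))


samples = list(product((-1, 0, 2), repeat=4))


def respects_field_operations(s):
    """Spot check on the samples: s(x+y) = s(x)+s(y) and s(xy) = s(x)s(y)."""
    return all(s(add(x, y)) == add(s(x), s(y)) and
               s(mul(x, y)) == mul(s(x), s(y))
               for x in samples for y in samples)


print(mul(SQRT2, SQRT3) == SQRT6)
print(tau(SQRT2), tau(SQRT3), tau(SQRT6))
print(all(respects_field_operations(s) for s in (sigma2, sigma3, tau)))
```

Each of σ₂, σ₃ and τ is its own inverse here, which is visible in the code as a pure sign change on coordinates.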
(Remember that we specified σ₂(√3)=√3. You may have been wondering why we didn't just define the action of σ₂ as an element of the full Galois group G=G(E/Q) by σ₂(√3)=-√3. Had we done that, σ₂ would have been what we found as τ, while the τ we got as the product of σ₂ and σ₃ would turn out to be the "old" σ₂, so the only difference would be a relabeling of group elements.)
For a slightly more complicated example, suppose f(x)=x²+x+1 and g(x)=x³-2, with roots ω and 2^(1/3) respectively, as above. Then in the tower Q(ω,2^(1/3)) ⊇ Q(ω) ⊇ Q both the extensions are Galois. (We already saw this isn't so with the tower Q(ω,2^(1/3)) ⊇ Q(2^(1/3)) ⊇ Q, whose lower step is not Galois, so order matters.) The full extension E=Q(ω,2^(1/3)) ⊇ Q is Galois as well, because E is the splitting field of x³-2 over Q. Its Galois group G=G(E/Q) has order 6, because 6 is the degree of the whole extension, since the intermediate extensions are of degree 3 and 2 and the degrees of the extensions multiply.
It turns out to be easy to determine the Galois group of this extension, although there are some tedious calculations needed to verify this. So bear with us a moment here. We can define two automorphisms of E that leave Q fixed, as follows. It suffices to specify them on generators of the field. Let one automorphism σ be defined by σ(ω)=ω² and σ(2^(1/3))=2^(1/3). Let the other automorphism τ be defined by τ(2^(1/3))=ω·2^(1/3) and τ(ω)=ω. σ and τ are defined to leave elements of Q unchanged. For sums and products of elements of E, σ and τ are defined to preserve the field structure, so they really are automorphisms (though, to be rigorous, this should be checked). So σ and τ are elements of the Galois group G=G(E/Q).
We can also see that σ²(ω) = σ(σ(ω)) = σ(ω²) = ω⁴ = ω, because ω³ = 1. Since σ² also fixes 2^(1/3), σ² is the identity automorphism. (Note that the exponents on σ and τ refer to repeated composition, not ordinary exponentiation, because composition "is" multiplication in the group G.) If we compute τ² and τ³ in the same way, applied to 2^(1/3), we find that τ²(2^(1/3)) = ω²·2^(1/3), and τ³(2^(1/3)) = 2^(1/3), again because ω³ = 1. Thus τ² isn't the identity automorphism, but τ³ is.
Now let's compute with the composed automorphisms στ and τσ. First, στ(2^(1/3)) = σ(ω·2^(1/3)) = ω²·2^(1/3). However, τσ(2^(1/3)) = τ(2^(1/3)) = ω·2^(1/3). So we have στ ≠ τσ, because ω≠ω². Instead, we will find by a similar calculation that στ(2^(1/3)) = ω²·2^(1/3) = τ²σ(2^(1/3)), and both στ and τ²σ send ω to ω². Hence στ = τ²σ. A little more checking will show that 1 (the identity automorphism), σ, τ, τ², τσ, and στ give a complete list of distinct automorphisms that can be formed from σ and τ. That's just right, because G must be a group of order 6.
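The "little more checking" can be delegated to a short program. The sketch below rests on one assumption of representation: since the three roots of x³-2 generate E, each automorphism is recorded by how it permutes them. The roots are indexed 0, 1, 2 for 2^(1/3), ω·2^(1/3), ω²·2^(1/3), and all names are hypothetical.

```python
# A permutation p is a tuple with p[i] = index of the image of root i.
IDENTITY = (0, 1, 2)
SIGMA = (0, 2, 1)  # omega -> omega^2: swaps roots 1 and 2, fixes root 0
TAU = (1, 2, 0)    # 2^(1/3) -> omega*2^(1/3): root 0 -> 1 -> 2 -> 0


def compose(p, q):
    """Return p∘q (apply q first, then p), matching στ(x) = σ(τ(x))."""
    return tuple(p[q[i]] for i in range(3))


def generated_group(generators):
    """All permutations obtainable as products of the generators."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(h, g)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return elements


TAU2 = compose(TAU, TAU)
sigma_tau = compose(SIGMA, TAU)
tau_sigma = compose(TAU, SIGMA)

print(compose(SIGMA, SIGMA) == IDENTITY)   # sigma^2 = 1
print(compose(TAU, TAU2) == IDENTITY)      # tau^3 = 1
print(sigma_tau != tau_sigma)              # sigma tau != tau sigma
print(sigma_tau == compose(TAU2, SIGMA))   # sigma tau = tau^2 sigma
print(len(generated_group([SIGMA, TAU])))  # order of the group generated
```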
In abstract group theory there are only two distinct groups of order 6. (That is, distinct up to an isomorphism, which is a 1-to-1 structure-preserving map between groups that shows they are essentially the "same" group.) One is the cyclic group of order 6, denoted by C₆. This is isomorphic to the direct product of a cyclic group of order two and one of order 3, i. e. the group C₂×C₃. However, since στ ≠ τσ, G isn't abelian, so it cannot be C₆, which is abelian. The only other group of order 6 is (up to isomorphism) S₃, the group of permutations of three distinct objects, also known as the symmetric group on three objects. (An isomorphic group is the dihedral group D₃, the group of symmetries of an equilateral triangle.) Since this group is the only nonabelian group of order 6, G(E/Q) must be isomorphic to it.
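The difference between the two groups of order 6 can also be seen by listing element orders. In this sketch C₆ is modelled as the integers mod 6 under addition and S₃ as all permutations of (0, 1, 2); both models are choices made here for illustration.

```python
from itertools import permutations
from math import gcd


def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def order(p):
    """Smallest n >= 1 such that p composed with itself n times is the identity."""
    identity = tuple(range(len(p)))
    power, n = p, 1
    while power != identity:
        power = compose(p, power)
        n += 1
    return n


s3 = list(permutations(range(3)))
s3_orders = sorted(order(p) for p in s3)

# In the integers mod 6 under addition, the element k has order 6 / gcd(k, 6)
c6_orders = sorted(6 // gcd(k, 6) for k in range(6))

s3_is_abelian = all(compose(p, q) == compose(q, p) for p in s3 for q in s3)

print(s3_orders)      # orders of the elements of S3
print(c6_orders)      # orders of the elements of C6
print(s3_is_abelian)
```

C₆ contains elements of order 6, while no element of S₃ has order greater than 3, so the two groups cannot be isomorphic.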
There's a whole lot more that could be said about Galois theory, but that would take up quite a bit of space, and the intention here is only to give a feel for what it is about. The basic idea to take away is this: A great deal is known about abstract groups and their subgroup structure. Galois theory is a way to "map" extensions of fields to groups and their subgroups in such a way that most of the interesting details about the extension are reflected in details about the groups, and vice versa. The group structure is sensitive to relationships among elements in the subextensions of a Galois extension. In Galois theory it is proven that there is a precise correspondence between subextensions and subgroups of the Galois group.
It thus becomes possible to infer facts about field extensions easily from a knowledge of their Galois groups. One example of the power of this method is that it made possible proving facts that had remained mysterious for hundreds of years – for example, the unsolvability by radicals of general polynomial equations of degree 5 or more, and the impossibility of certain geometric constructions by straightedge and compass alone (trisecting angles, for example).
Galois theory is an absolutely indispensable tool in algebraic number theory. It will come up again and again. We will mention other results in the theory when they are needed.
In the next installment we'll circle back to take a deeper look at ring theory, which is the most basic tool used in algebraic number theory, because there are generalizations of "integers" in an algebraic number field, and they are rings analogous to the familiar ring Z of ordinary integers.