Symbol
The symbol is the smallest building block in the theory of computation and can be any letter, number or even pictograms. For example: a, b, 0, 1
Alphabet
From the symbols we can form an alphabet represented by the sigma sign (Σ).
The alphabet is nothing more than a finite set of collection of symbols.
For example: Σ = {a,b}{0,1}
String
A string is a sequence of symbols from the alphabet.
For example: Σ = {a,b}
Possible strings are:
String "a" of length 1
String "b" of length 1
String "aa" of length 2
String "ab" of length 2
String "aaa" of length 3
Question: How many strings of length 2 are possible over alphabet {a,b}?
Answer: 4. The strings are: aa, ab, ba and bb
Question: How many strings of length 2 and starting with "a" are possible over alphabet {a,b}?
Answer: 2. The strings are: aa, ab
Question: How many strings of length n are possible over alphabet {a,b}?
Answer: 2n. Note: There are 2 symbols in the alphabet.
Question: How many strings of length n are possible over alphabet {a,b,c,d}?
Answer: |Σ|n = 4n. Note: There are 4 symbols in the alphabet.
Language
In general a language is a collection of words, but in the theory of computation a language is a collection of strings.
For example: Σ = {a,b}
L1 = set of all strings of length 2.
L1 = {aa, ab, ba, bb}
L1 is called a finite language or a finite set.
L2 = set of all strings of length 3.
L2 = {aaa, aab, aba, abb, baa, bab, bba, bbb}
L2 is called a finite language or a finite set.
L3 = set of all strings where each string start with an "a".
L3 = {a, aa, ab, aaa, aab, aba, abb, ...}
L3 is called an infinite language or an infinite set.
Note:
A language which can be formed over an alphabet can be finite or infinite
Powers of Σ
Σn = Set of all strings over Σ of length n
For example: Σ = {a,b}
Σ0 = Set of all strings over Σ of length 0
Σ0 = {ε}
ε (epsilon) is a special symbol which represents an empty string "" with length 0.
|ε| = 0
|Σ0| = 0
Σ1 = Set of all strings over Σ of length 1
Σ1 = {a,b}
|Σ1| = 2
Σ2 = Set of all strings over Σ of length 2
Σ2 = Σ Σ = {a,b} {a,b} = {aa,ab,ba,bb}
|Σ2| = 4
Σ3 = Set of all strings over Σ of length 3
Σ3 = Σ Σ Σ = {a,b} {a,b} {a,b} = {aaa,aab,aba,abb,baa,bab,bba,bbb}
|Σ3| = 8
Σ*
* means 0, 1, 2 ... n
Σ* = Σ0 ∪ Σ1 ∪ Σ2 ∪ ... Σn
Note: ∪ = union
For example: Σ = {a,b}
Σ* = {ε} ∪ {a,b} ∪ {aa,ab,ba,bb} ∪ ...
Σ* = Set of all strings over Σ of ALL lengths.
Σ* represents an infinite set.
Σ* can be seen as the parent of all languages.
L1 ⊆ Σ*
L2 ⊆ Σ*
L3 ⊆ Σ*
Note: ⊆ = subset
Sigma star is the parent of all languages
Question: How many languages are possible over Σ*
Answer: infinite
How are strings and languages used in in practice.
Question: Which symbols are used in the C programming language?
Answer: Σ = {a,b..z,A,B..Z,0,1..9,+, * ...}
The alphabet is finite.
The program:
void main(){
int a,b
:br
}
is nothing more than a string.
Question: What is the C programming language?
Answer: Set of all valid programs.
Question: How many programs are possible over the C programming language?
Answer: infinite {P1, P2, P3 ...}
Question: If you have a program Pn how can you know if it valid?
Answer: Check if the string (S) is available in the language (L). However the language is infinite, so you can not do this.
Example:
Σ = {a,b}
L1 = {aa, ab, ba, bb}
Note: L1 is finite
S = aa
S is available in L
L2 = {a, aa, aaa, ab, ...}
Note: L2 is infinite.
S = baba
It takes forever to check if S is available in L
Posted By -