# COMS W3261 Computer Science Theory Lecture 6: September 24, 2012 Properties of Regular Languages

## Overview

• The set of regular languages is closed under a number of common operations such as union, intersection, complement, and reversal.
• Many common decision problems for representations of regular languages are decidable.
• Every regular set has a unique minimum-state DFA (unique up to renaming of states).

## 1. Closure Properties of Regular Languages

• A closure property for a family of languages is a theorem that says if we apply a certain operation to the languages in the family, then the resulting language will also be in the family. For example, if we take the union of two regular languages L and M, then the language L ∪ M is also regular. We therefore say the regular languages are closed under the operation of union.
• We can show that the regular languages are closed under the following operations:
• union, intersection, complement, difference
• concatenation, Kleene closure
• reversal
• homomorphism, inverse homomorphism
• These closure properties can be used to show that some languages are regular.
• These closure properties combined with the pumping lemma can be used to show some languages are not regular.
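As an illustration of two of these closure properties, here is a minimal sketch of the complement construction and the product construction for intersection. The tuple layout `(states, alphabet, delta, start, finals)`, the function names, and the example automata `EVEN_A` and `ENDS_B` are assumptions made for this sketch, not notation from the lecture. Both DFAs are assumed to be complete and to share one alphabet.

```python
from itertools import product

def complement(dfa):
    """Complement a complete DFA by swapping final and nonfinal states."""
    states, alphabet, delta, start, finals = dfa
    return (states, alphabet, delta, start, states - finals)

def intersection(d1, d2):
    """Product construction: run both DFAs in lockstep on pairs of states."""
    s1, alphabet, t1, q1, f1 = d1
    s2, _, t2, q2, f2 = d2
    states = set(product(s1, s2))
    delta = {((p, q), a): (t1[p, a], t2[q, a])
             for (p, q) in states for a in alphabet}
    # A pair is final only if both components are final.
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return (states, alphabet, delta, (q1, q2), finals)

def accepts(dfa, w):
    """Simulate the DFA on w, starting in the start state."""
    _, _, delta, state, finals = dfa
    for a in w:
        state = delta[state, a]
    return state in finals

# Hypothetical example automata over {a, b}:
# EVEN_A accepts strings with an even number of a's,
# ENDS_B accepts strings that end in b.
EVEN_A = ({0, 1}, {"a", "b"},
          {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
          0, {0})
ENDS_B = ({"n", "y"}, {"a", "b"},
          {("n", "a"): "n", ("n", "b"): "y",
           ("y", "a"): "n", ("y", "b"): "y"},
          "n", {"y"})
BOTH = intersection(EVEN_A, ENDS_B)
```

Changing the condition `p in f1 and q in f2` to `p in f1 or q in f2` would give union, and `p in f1 and q not in f2` would give difference, all on the same product automaton.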

## 2. Decision Problems for Regular Languages

• We can ask whether a representation of a language has a given property. Such a question is often called a decision problem.
• If there is an algorithm to answer the question, we say the problem is decidable. For decidable problems we are interested in how quickly a question can be answered as a function of the size of the representation of the language.
• The emptiness problem is to decide whether the language denoted by a given representation is empty.
• Given a finite automaton for a regular language, we can answer the emptiness problem by determining whether there is a path from the start state to a final state. This can be answered in O(n²) time where n is the number of states in the automaton.
• The membership problem is to decide whether a particular string is in the language denoted by a given representation.
• Given a DFA D for a regular language and an input string w, we can answer the membership problem by simulating D processing w beginning in the start state. This can be answered in O(|w|) time.
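Both decision procedures above can be sketched in a few lines. The representation (a dict `delta` keyed by `(state, symbol)`) and the example automaton, which accepts strings over {a, b} containing the substring ab, are assumptions chosen for illustration.

```python
from collections import deque

def is_empty(delta, start, finals):
    """Emptiness: True if no final state is reachable from start (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in finals:
            return False
        # Scan all transitions for those that leave the current state.
        for (src, _symbol), dst in delta.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return True

def accepts(delta, start, finals, w):
    """Membership: simulate the DFA on w, one transition per symbol."""
    state = start
    for symbol in w:
        state = delta[state, symbol]
    return state in finals

# Hypothetical DFA: state 2 is reached once "ab" has been seen.
# State 3 cannot be reached from the start state 0.
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
    (3, "a"): 3, (3, "b"): 3,
}
```

With a fixed alphabet, `delta` has O(n) entries and the scan runs once per reachable state, so this simple version stays within the O(n²) bound stated above. An adjacency-list representation would be faster, but is not needed for the bound.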

## 3. Testing Equivalence of States

• Given a DFA D for a regular language, we say two distinct states p and q are equivalent if, for all input strings w, δ*(p, w) is a final state iff δ*(q, w) is a final state.
• That is, for every input string w, δ*(p, w) and δ*(q, w) are either both accepting or both nonaccepting.
• If two states are not equivalent, then we say they are distinguishable.
• The table-filling algorithm for computing all pairs of distinguishable states:
• Input: a DFA D = (Q, Σ, δ, q0, F).
• Output: a table T of all pairs of distinguishable states.
• Method:
```
for all states p and q do
    if p is final and q is nonfinal, or vice versa, then
        add the pair {p, q} to T
repeat
    for all states p and q do
        for all input symbols a do
            if the pair {δ(p,a), δ(q,a)} is in T then
                add the pair {p, q} to T
until no more pairs can be added to T
```
• Theorem: If two states are not distinguishable by the table-filling algorithm, then the two states are equivalent.
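The table-filling algorithm can be sketched as a fixed-point computation over unordered pairs of states. The function name, the use of `frozenset` for pairs, and the three-state example (strings over {0, 1} ending in 1, with a redundant state C) are assumptions for this sketch.

```python
from itertools import combinations

def distinguishable_pairs(states, alphabet, delta, finals):
    """Return the table T of distinguishable pairs, each as a frozenset."""
    # Basis: a final and a nonfinal state are distinguished by the empty string.
    table = {frozenset((p, q)) for p, q in combinations(states, 2)
             if (p in finals) != (q in finals)}
    changed = True
    while changed:  # repeat ... until no more pairs can be added to T
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in table:
                continue
            for a in alphabet:
                # If the successors are distinguishable, so are p and q.
                successors = frozenset((delta[p, a], delta[q, a]))
                if successors in table:
                    table.add(pair)
                    changed = True
                    break
    return table

# Hypothetical DFA: B is the only final state; A and C behave identically.
STATES = {"A", "B", "C"}
ALPHABET = {"0", "1"}
DELTA = {
    ("A", "0"): "C", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "B",
    ("C", "0"): "A", ("C", "1"): "B",
}
TABLE = distinguishable_pairs(STATES, ALPHABET, DELTA, {"B"})
```

Here `TABLE` contains the pairs {A, B} and {B, C} but not {A, C}, so by the theorem A and C are equivalent. When both successors coincide, the singleton set is never in the table, which matches the pseudocode.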

## 4. Testing Equivalence of DFA's

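• We can use the table-filling algorithm to test the equivalence of two DFA's: treat the two automata as a single DFA whose state set is the disjoint union of their states, run the algorithm on it, and test the equivalence of the two start states.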
• The DFA's are equivalent iff their start states are equivalent.
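A minimal sketch of this test, assuming both DFAs are complete, share one alphabet, and are given as tuples `(states, alphabet, delta, start, finals)`. The tuple layout, the function name, and the three example automata are hypothetical. States are tagged with 0 or 1 so that the two state sets are disjoint even if they reuse names.

```python
from itertools import combinations

def equivalent_dfas(d1, d2):
    """True iff the start states of d1 and d2 are not distinguishable."""
    # Build the disjoint union by tagging each state with its DFA's index.
    states, delta, finals = set(), {}, set()
    for tag, (qs, _alphabet, trans, _start, fs) in enumerate((d1, d2)):
        states |= {(tag, q) for q in qs}
        finals |= {(tag, q) for q in fs}
        for (q, a), r in trans.items():
            delta[(tag, q), a] = (tag, r)
    alphabet = d1[1]
    # Table filling on the combined state set.
    table = {frozenset((p, q)) for p, q in combinations(states, 2)
             if (p in finals) != (q in finals)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in table:
                continue
            if any(frozenset((delta[p, a], delta[q, a])) in table
                   for a in alphabet):
                table.add(frozenset((p, q)))
                changed = True
    return frozenset(((0, d1[3]), (1, d2[3]))) not in table

# Hypothetical examples over {0, 1}: two DFAs for "ends in 1",
# and one DFA that accepts every string.
ENDS_1_SMALL = ({"x", "y"}, {"0", "1"},
                {("x", "0"): "x", ("x", "1"): "y",
                 ("y", "0"): "x", ("y", "1"): "y"},
                "x", {"y"})
ENDS_1_BIG = ({"A", "B", "C"}, {"0", "1"},
              {("A", "0"): "C", ("A", "1"): "B",
               ("B", "0"): "A", ("B", "1"): "B",
               ("C", "0"): "A", ("C", "1"): "B"},
              "A", {"B"})
EVERYTHING = ({"s"}, {"0", "1"},
              {("s", "0"): "s", ("s", "1"): "s"},
              "s", {"s"})
```

In this sketch `equivalent_dfas(ENDS_1_SMALL, ENDS_1_BIG)` holds, while `ENDS_1_SMALL` and `EVERYTHING` are separated already in the basis step because only one of the two start states is final.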

## 5. Minimizing the Number of States in a DFA

• We can use the table-filling algorithm to minimize the number of states in a DFA.
• The minimization algorithm:
• Input: a DFA A = (QA, Σ, δA, qA, FA).
• Output: an equivalent minimum-state DFA B = (QB, Σ, δB, qB, FB).
• Method:
```
1. Eliminate any state that cannot be reached from the start state.
2. Compute the sets of all equivalent states.
3. Partition the states into blocks so that
   all states in the same block are equivalent and
   no pair of states from different blocks are equivalent.
4. Construct the minimum-state DFA B as follows:
   a. QB is the set of blocks of equivalent states.
   b. If R and S are blocks containing the states p and q of A, respectively,
      then δB(R, a) = S if δA(p, a) = q.
   c. qB is the block containing qA.
   d. A state S is in FB if S contains a state in FA.
```
• Theorem: L(B) = L(A) and no DFA equivalent to A has fewer states than B.
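The four steps of the minimization algorithm can be sketched as follows. The function signature, the use of `frozenset` blocks as the states of B, and the example automaton (strings over {0, 1} ending in 1, with a redundant state C and an unreachable state D) are assumptions for illustration. The input DFA is assumed to be complete.

```python
from itertools import combinations

def minimize(states, alphabet, delta, start, finals):
    """Return (states, alphabet, delta, start, finals) of a minimum-state DFA."""
    # Step 1: keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        p = stack.pop()
        for a in alphabet:
            r = delta[p, a]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    # Step 2: table filling; pairs left out of the table are equivalent.
    table = {frozenset((p, q)) for p, q in combinations(reachable, 2)
             if (p in finals) != (q in finals)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(reachable, 2):
            if frozenset((p, q)) in table:
                continue
            if any(frozenset((delta[p, a], delta[q, a])) in table
                   for a in alphabet):
                table.add(frozenset((p, q)))
                changed = True
    # Step 3: the block of p holds every state equivalent to p.
    block = {p: frozenset(q for q in reachable
                          if p == q or frozenset((p, q)) not in table)
             for p in reachable}
    # Step 4: blocks become states; transitions follow any representative.
    new_states = set(block.values())
    new_delta = {(block[p], a): block[delta[p, a]]
                 for p in reachable for a in alphabet}
    new_finals = {block[p] for p in reachable if p in finals}
    return new_states, alphabet, new_delta, block[start], new_finals

# Hypothetical DFA: A and C are equivalent, D is unreachable from A.
DELTA = {
    ("A", "0"): "C", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "B",
    ("C", "0"): "A", ("C", "1"): "B",
    ("D", "0"): "D", ("D", "1"): "B",
}
MIN_STATES, _, MIN_DELTA, MIN_START, MIN_FINALS = minimize(
    {"A", "B", "C", "D"}, {"0", "1"}, DELTA, "A", {"B"})
```

In this example the result has two states, the blocks {A, C} and {B}. Because equivalent states have equivalent successors, the choice of representative p in step 4b does not affect the transition written into `new_delta`.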

## 6. Practice Problems

1. Prove that the two regular expressions (a+b)* and (a*b*)* generate the same language.
2. Consider the function on languages noprefix(L) = { w in L | no proper prefix of w is a member of L}. Show that the regular languages are closed under the noprefix function.
3. [Hard] Consider the function on languages remove_middle_third(L) = { xz | for some y, xyz is in L where |x| = |y| = |z|}. Show that the regular languages are not closed under the remove_middle_third function.
4. [Hard] An equivalence relation R on Σ* is right invariant if xRy implies xzRyz for all z in Σ*. R is of finite index if it partitions Σ* into a finite number of equivalence classes. Show that a language L contained in Σ* is regular if and only if it is the union of some of the equivalence classes of a right-invariant equivalence relation on Σ* of finite index.