Grammars: 두 판 사이의 차이

2025년 10월 18일 (토) 02:00 판

개요

Grammar는 DFA나 NFA처럼 문자열을 소비하며 “인식(recognize)”하는 것이 아닌, 문자열을 생성(generate) 하는 시스템이다. 즉, grammar는 어떤 언어에 속하는 모든 문자열을 만들어내는 규칙들의 모음이다.

Definition of Grammars

Grammar는 아래와 같은 4-튜플([math]\displaystyle{ V,\Sigma, R, S }[/math])을 통해서 정의된다:

[math]\displaystyle{ V }[/math]: 비단말기(nonterminal symbols)의 유한 집합
- 예: [math]\displaystyle{ \{S, A, B\} }[/math]
- 실제 문자열에 직접 등장하지는 않지만, 생성 과정에서 사용된다.
[math]\displaystyle{ \Sigma }[/math]: 단말기(terminal symbols) 의 유한 집합
- 예: [math]\displaystyle{ \{a,b,0,1\} }[/math]
- 실제 언어의 문자열을 구성하는 기호들이며, [math]\displaystyle{ V }[/math]와 겹치지 않는다.
[math]\displaystyle{ R }[/math]: 규칙(rule)의 유한 집합
- 각 규칙은 [math]\displaystyle{ l\rightarrow r }[/math] 형태이고, [math]\displaystyle{ l, r \in (V \cup \Sigma)* }[/math]
- [math]\displaystyle{ l }[/math]은 최소 하나의 비단말을 포함해야 하며, 규칙은 단말을 단말이나 다른 비단말로 치환하는 역할을 한다.
[math]\displaystyle{ S \in V }[/math]: 시작 기호(start symbol)

이때 두 집합 [math]\displaystyle{ V, \Sigma }[/math]의 모든 기호를 섞어서 만들 수 있는 문자열 전체는 [math]\displaystyle{ (V\cup \Sigma)* }[/math]이다. 이는 sentential form이라고 불리며, 문법(Grammar)으로 문자열을 생성하는 과정에서 만들어지는 모든 문자열을 의미한다.

Derivations

문법을 사용해서 문자열을 생성하는 과정(derivation)은 아래와 같다:

[math]\displaystyle{ u \Rightarrow v }[/math]: “u가 v를 한 단계 유도한다.”
- 즉, 어떤 규칙 [math]\displaystyle{ l \rightarrow r }[/math]을 적용하여
  [math]\displaystyle{ u=w_1lw_2 }[/math]인 문자열에서 [math]\displaystyle{ l }[/math]을 [math]\displaystyle{ r }[/math]로 바꾸면 [math]\displaystyle{ v = w_1rw_2 }[/math]가 된다.
- 즉, [math]\displaystyle{ u }[/math] 안의 한 비단말기를 규칙에 따라 다른 형태로 치환하여 [math]\displaystyle{ v }[/math]를 얻는 것이다.

또한, 주어진 규칙 내에서 여러번 규칙을 적용하여 문자열 [math]\displaystyle{ v }[/math]를 얻을 수 있는데, 이를 기호로 나타내면 [math]\displaystyle{ u \Rightarrow* v }[/math]이다. 예를 들어, [math]\displaystyle{ u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow v }[/math]와 같은 방식이다. 이를 통해 문법 G가 생성하는 언어 [math]\displaystyle{ L(G) }[/math]는 아래와 같이 정의된다:

[math]\displaystyle{ L(G) = \{w \in \Sigma*|S \Rightarrow* w\} }[/math]

Classes of Grammars

문법은 아래와 같이 네가지로 구분된다:

Unrestricted Grammar: 아무 제약이 없는 일반적인 형태
Context-sensitive Grammar: 각 규칙에서 [math]\displaystyle{ |l| \le |r| }[/math]이다.
- 이는 규칙을 적용하는 과정에서 문자열의 길이가 줄어들지 않음을 의미한다.
Context-free Grammar: 모든 규칙의 왼쪽이 단 하나의 비단말기라는 것을 의미한다.([math]\displaystyle{ l \in V }[/math])
- 문맥과 상관없이 하나의 비단말기만 보고 규칙을 적용할 수 있다.
- PDA(푸시다운 오토마톤)과 동일한 계산 능력을 가진다.
Regular Grammar: Context-free Grammar보다 더욱 단순한 형태이다.
- 각 규칙은 [math]\displaystyle{ r = w }[/math]또는 [math]\displaystyle{ r = wB }[/math]꼴이며, 오른쪽 끝에 비단말기가 올 수 있다.
- 이는 이름에서 짐작할 수 있듯이 정규언어와 정확히 동일한 클래스이다.

[math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math] [math]\displaystyle{ }[/math]

각주

@@ 4번째 줄: / 4번째 줄: @@
 ==개요==
+Grammar는 [[Finite Automata|DFA]]나 [[Finite Automata|NFA]]처럼 문자열을 소비하며 “인식(recognize)”하는 것이 아닌, 문자열을 생성(generate) 하는 시스템이다. 즉, grammar는 어떤 언어에 속하는 모든 문자열을 만들어내는 규칙들의 모음이다.
+==Definition of Grammars==
+Grammar는 아래와 같은 4-튜플(<math>V,\Sigma, R, S</math>)을 통해서 정의된다:
+# <math>V</math>: 비단말기(nonterminal symbols)의 유한 집합
+#* 예: <math>\{S, A, B\}</math>
+#* 실제 문자열에 직접 등장하지는 않지만, 생성 과정에서 사용된다.
+# <math>\Sigma</math>: 단말기(terminal symbols) 의 유한 집합
+#* 예: <math>\{a,b,0,1\}</math>
+#* 실제 언어의 문자열을 구성하는 기호들이며, <math>V</math>와 겹치지 않는다.
+# <math>R</math>: 규칙(rule)의 유한 집합
+#* 각 규칙은 <math>l\rightarrow r</math> 형태이고, <math>l, r \in (V \cup \Sigma)*</math>
+#* <math>l</math>은 최소 하나의 비단말을 포함해야 하며, 규칙은 단말을 단말이나 다른 비단말로 치환하는 역할을 한다.
+# <math>S \in V</math>: 시작 기호(start symbol)
+이때 두 집합 <math>V, \Sigma</math>의 모든 기호를 섞어서 만들 수 있는 문자열 전체는 <math>(V\cup \Sigma)*</math>이다. 이는 sentential form이라고 불리며, 문법(Grammar)으로 문자열을 생성하는 과정에서 만들어지는 모든 문자열을 의미한다.
+===Derivations===
+문법을 사용해서 문자열을 생성하는 과정(derivation)은 아래와 같다:
+* <math>u \Rightarrow v</math>: “u가 v를 한 단계 유도한다.”
+** 즉, 어떤 규칙 <math>l \rightarrow r</math>을 적용하여<br><math>u=w_1lw_2</math>인 문자열에서 <math>l</math>을 <math>r</math>로 바꾸면 <math>v = w_1rw_2</math>가 된다.
+** 즉, <math>u</math> 안의 한 비단말기를 규칙에 따라 다른 형태로 치환하여 <math>v</math>를 얻는 것이다.
+또한, 주어진 규칙 내에서 여러번 규칙을 적용하여 문자열 <math>v</math>를 얻을 수 있는데, 이를 기호로 나타내면 <math>u \Rightarrow* v</math>이다. 예를 들어, <math>u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow v</math>와 같은 방식이다. 이를 통해 문법 G가 생성하는 언어 <math>L(G)</math>는 아래와 같이 정의된다:
+ <math>L(G) = \{w \in \Sigma*|S \Rightarrow* w\}</math>
+===Classes of Grammars===
+문법은 아래와 같이 네가지로 구분된다:
+# Unrestricted Grammar: 아무 제약이 없는 일반적인 형태
+# Context-sensitive Grammar: 각 규칙에서 <math>|l| \le |r|</math>이다.
+#* 이는 규칙을 적용하는 과정에서 문자열의 길이가 줄어들지 않음을 의미한다.
+# Context-free Grammar: 모든 규칙의 왼쪽이 단 하나의 비단말기라는 것을 의미한다.(<math>l \in V</math>)
+#* 문맥과 상관없이 하나의 비단말기만 보고 규칙을 적용할 수 있다.
+#* PDA(푸시다운 오토마톤)과 동일한 계산 능력을 가진다.
+# Regular Grammar: Context-free Grammar보다 더욱 단순한 형태이다.
+#* 각 규칙은 <math>r = w</math>또는 <math>r = wB</math>꼴이며, 오른쪽 끝에 비단말기가 올 수 있다.
+#* 이는 이름에서 짐작할 수 있듯이 정규언어와 정확히 동일한 클래스이다.
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
+<math></math>
 ==각주==