package cmmCompiler;

/* DO NOT TRY TO RUN THIS -- IT WILL PROBABLY GO INTO AN INFINITE LOOP
 * This has the structure of a recursive descent parser/recognizer
 * but suffers from left recursion and other problems.
 * No real attempt was made to fix the grammar.
 * The intent of this file is to give you an idea of where to start, 
 * how to structure your code, and how you can automate the generation
 * of some basic error messages. 
 */

import java.util.Vector;

public class Recognizer {
/*

C-- (“ Minus Minus” Language Description 
 
Introduction 

This document describes the C-- language, a subset of the C/Java language. This document details the lexical conventions and syntax of C--, along with the language semantics needed to complete the assignments. 

Lexical Conventions of C--

1. Its keywords are as follows: 

else if int return void while

All keywords are reserved, and must be written in lower case. 

2.  The following are special symbols in the language: 

+ - * / < <= > >= == != = ; , ( ) [ ] { } // /* * /	

3.  Other tokens are ID, NUM,  defined as follows 
a.  an ID consists of a letter, followed by zero or more letters and digits;  IDs are case-sensitive; 
b. a NUM is an integer literal –a digit followed by 0 or more digits; 
. 
4. White space consists of blanks, newlines, and tabs. White space must separate IDs, NUMs and keywords; otherwise, it is ignored. 

5. Comment that start with // extend to the end of the line (like in C++). 

6. Comments surrounded by the usual C notations /* . . . * / can be placed anywhere whitespace can appear (not within tokens). They may include more than one line. Comments can not be nested. 

Syntax and Semantics of C-- 
A BNF grammar for C-- could be specified as follows: 
	 1. program := declaration-list 
	 2. declaration-list := declaration-list declaration | empty
	 3. declaration := var-declaration | fun-declaration 
	 4. var-declaration := type-specifier ID ; | type-specifier ID [ NUM] ; 
	 5. type-specifier := int | void 
	 6. fun-declaration := type-specifier ID ( params ) compound-stmt 
	 7. params := param-list | void | e
	 8. param-list := param-list , param | param 
	 9. param := type-specifier ID | type-specifier ID [] 
	10. compound-stmt := { local-declarations statement-list } 
	11. local-declarations := local-declarations var-declaration | empty
	12. statement-list := statement-list statement | empty 
	13. statement := expression-stmt | compound-stmt | selection-stmt | iteration-stmt | return-stmt 
	14. expression-stmt := expression ; | ;
	15. selection-stmt := if ( expression ) statement | if ( expression ) statement else statement 
	16. iteration-stmt := while ( expression ) statement 
	17. return-stmt := return ; | return expression ; 
	18. expression  := var = expression | simple-expression 
	19. var := ID | ID [ expression ] 
	20. simple-expression := additive-expression relop additive-expression | additive-expression
	21. relop := <= | < | > | >= | == | !=
	22. additive-expression := additive-expression addop term | term
	23. add-op := + | -
	24. term := term mulop factor | factor
	25. mulop := * | /
	26. factor := ( expression ) | var | call | NUM
	27. call := ID ( args ) 
	28. args := arg-list | empty 
	29.  arg-list := arg-list , expression | expression 

Language Semantics 

C-- is C without pointers, with only integers, without some of the control constructs, and with no records (i.e., structs). If you are not familiar with C, it is similar to the procedural portion of Java and C++ (see the examples on class web site). Note that C--- as described here also lacks some important binary operators, such as logical and (&&) and logical or (||), as well as unary minus and logical negation operators.  Also noteworthy is the fact that the given grammar is ambiguous, –it might lack precedence and associativity information for the operators and finally, it is not in a form suitable for building your recursive descent compiler. The form is suitable for describing the syntax and semantics. Think of this as the requirements specification.

Below we give a short explanation of the associated semantics for each of the grammar rules. 
 
	1. program := declaration-list 
	2. declaration-list := declaration-list declaration | empty 
	3. declaration := var-declaration | fun-declaration 

A program consists of a list (or sequence) of declarations, which may be variable or function declarations, in any order. There may be zero declaration, which will be detected and rejected by semantic analysis because any legal C-- program must have a main function (see below). Semantic restrictions are as follows (these do not occur in C). All variables and functions must be declared before they are used (this avoids backpatching references). The last declaration in a program must be a function declaration of the form void main (void). Note that C-- lacks prototypes, so that no distinction is made between declarations and definitions (as in C). 

	4. var-declaration := type-specifier ID ; | type-specifier ID [ NUM] ; 
	5. type-specifier := int | void 

A variable declaration declares either a simple variable of integer type or an array variable whose base type is integer and whose indices range from 0 . . . NUM-1. Note that in C-- the basic types are int (integer), and void. In a variable declaration, only int can be used. void is for function declarations (see below). Note, also, that only one variable can be declared per declaration. 

	6. fun-declaration := type-specifier ID ( params ) compound-stmt 
	7. params := param-list | void | empty 
	8. param-list := param-list , param | param 
	9. param := type-specifier ID | type-specifier ID [ ] 

A function declaration consists of a return type specifier, an identifier, and a comma-separated list of parameters inside parentheses, followed by a compound statement with the code for the function. If the return type of the function is void, then the function returns no value (i.e., it is a procedure). Parameters of a function are either void or empty indicating that there are no parameters, or a list representing the function’ parameters. Parameters followed by brackets are array parameters whose sizes can vary. Simple integer parameters are passed by value. Array parameters are passed by reference (i.e., as pointers) and must be matched by an array variable during a call. Note that there are no parameters of type function. The parameters of a function have scope equal to the compound statement of the function declaration, and each invocation of a function has a separate set of parameters. Functions may be recursive (to the extent that declaration before use allows). 

	10. compound-stmt := { local-declarations statement-list } 

A compound statement consists of curly brackets surrounding a set of declarations and statements. A compound statement is executed by executing the statement sequence in the order given. The local declarations have scope equal to the statement list of the compound statement and supersede any global declaration. 

	11. local-declarations := local-declarations var-declaration | empty
	12. statement-list := statement-list statement | empty 

Note that both declaration and statement lists may be empty. 
	13. statement := expression-stmt | compound-stmt | selection-stmt | iteration-stmt | return-stmt 

There are five types of statements in a C-- program. 

	14. expression-stmt := expression ; | ;

An expression statement has an optional expression followed by a semicolon. Such expressions are usually evaluated for their side effects. Thus this statement is used for assignments and function calls. 


	15. selection-stmt := if ( expression ) statement | if ( expression ) statement else statement 

The if-statement has the usual semantics: the logical expression (of type bool) is evaluated; a true value causes execution of the first statement; a false value causes the execution of the second statement, if it exists. This rule results in the classical dangling else ambiguity, which is resolved in the standard way: the else part is always parsed immediately as a substructure of the current if (the most closely nested disambiguating rule). 

	16. iteration-stmt := while ( expression ) statement 

The while statement is the iteration statement in C--. The while statement executed by repeatedly evaluating the expression and then executing the statement if the expression evaluates to a true value, and ending when the expression evaluates to false. 

	17. return-stmt := return ; | return expression ; 

A return statement may either return a value or not. Functions not declared void must return values. Functions declared void must not return values. A return causes transfer of control back to the caller (or termination of the program if it is inside main). 

	18. expression  := var = expression | simple-expression 
	19. var := ID | ID [ expression ] 

An expression may assign the value of the expression to the given variable. The assignment has the usual storage semantics: the location of the (scalar or array) variable represented by var is found, then the subexpression to the right of the assignment is evaluated, and the value of the subexpression is stored at the given location. A var is either a simple (integer) variable or subscripted array variable. A negative subscript causes the program to halt. However, upper bounds of the subscripts are not checked. Note that C-- assignment statements are expressions, so you can chain multiple assignments together (e.g., a = b = c = 5 ;). 
var’ represent a further restriction of C-- from C. In C, the target of an assignment must be an l-value, and l-values are addresses that can be obtained by many operations. In C--, the only l-values are those given by the var syntax, and so this category is checked syntactically, instead of during type checking is in C. Thus, pointer arithmetic is forbidden in C--. 

	20. simple-expression := additive-expression relop additive-expression | additive-expression
	21. relop := <= | < | > | >= | == | !=
	22. additive-expression := additive-expression addop term | term
	23. add-op := + | -
	24. term := term mulop factor | factor
	25. mulop := * | /
	26. factor := ( expression ) | var | call | NUM

The relational operators or relops do not associate (that is, an unparenthesized expression can only have one relational operator). The / symbol represents integer division; that is, any remainder is truncated. 
An array variable must be subscripted, except in the case of an expression consisting of a single ID and used in a function call with an array parameter (see below). 

	27. call := ID ( args ) 
	28. args := arg-list | empty 
	29.  arg-list := arg-list , expression | expression 

A function call consists of an ID (the name of the function), followed by parentheses enclosing its arguments. Arguments are either empty or consist of a comma separated list of expressions, representing the values to be assigned to parameters during a call. Functions must be declared before they are called, and the number of parameters in a declaration must equal the number of arguments in a call. An array parameter in a function declaration must be matched with an expression consisting of a single identifier representing an array variable. 

Finally the above rules give no input or output statement. We must include such functions in the definition of C--. Consider two functions to be predefined in the global environment, as though they had the following declarations:

	int input (void) { . . . }
	void output ( int x ) { . . . }

The input function has no paramters and returns an integer value read from stdin (keyboard). The output function takes one integer parameter whose value is printed to stdout (screen) with a newline.

 */
	static boolean ErrorsFound = false;
	static String ErrorMessages = "";
	static LexScan lex;
	public static void main(String[] args)
	{
		lex = new LexScan("src/cmmCompiler/Recognizer.java");
		//scan whole file into vecter
		//could also use nextToken() to pull a single token
		if ( program() && !ErrorsFound )
			System.out.println("\nACCEPT");
		else 
			System.out.println("\nERRORS FOUND\n"+ErrorMessages);
	}


	static Token currentToken = null;
	static public boolean accept (int TokenType)
	{
		if (currentToken == null) 
			currentToken = lex.nextToken();
		if (currentToken.type != TokenType)
			return false; 
		currentToken = null;
		return true;
	}
	static public boolean expect (int tokenType)
	{
		if (accept(tokenType))
			return true;
		ErrorsFound = true;
		String ErrorMessage = 
			"\nError at line: "+currentToken.lineNo +
			" column: "+currentToken.column +
			"\n\tExpected: "+Token.tokenName(tokenType)+
			"\n\t   Found: "+Token.tokenName(currentToken.type) + 
			"\t("+currentToken.text+")";	
		ErrorMessages = ErrorMessages + ErrorMessage;
		return false;
	}
	
//	 1. program := declaration-list 

	static public boolean program(){
		return declarationList();
	}
//	 2. declaration-list := declaration-list declaration | empty

	static public boolean declarationList(){
		while( declaration() )
			;
		return true;
	}

//	 3. declaration := var-declaration | fun-declaration 

	static public boolean declaration(){
		return varDeclaration() || funDeclaration();
	}
	
//	 4. var-declaration := type-specifier ID ; | type-specifier ID [ NUM] ; 

	static public boolean varDeclaration(){
		if ( !typeSpecifier() ) return false;
		expect(Token.ID);
		if ( accept(Token.OPEN_BRACKET) )
		{
			expect(Token.NUMBER);
			expect(Token.CLOSE_BRACKET);
		}
		expect(Token.SEMI);
		return true;
	}
	
//	 5. type-specifier := int | void 

	static public boolean typeSpecifier(){
		return accept(Token.INT) || accept(Token.VOID);
	}
	
//	 6. fun-declaration := type-specifier ID ( params ) compound-stmt 

	static public boolean funDeclaration(){
		if ( !typeSpecifier() ) return false;
		expect(Token.ID);
		expect(Token.OPEN_PAREN);
		params();
		expect(Token.CLOSE_PAREN);
		compoundStmt();
		return true;
	}
	
//	 7. params := param-list | void | empty 

	static public boolean params(){
		return ( paramList() || accept(Token.VOID) || true );
	}
	
//	 8. param-list := param-list , param | param 

	static public boolean paramList(){
		if ( !param() ) return false;
		while ( accept(Token.COMMA) )
			param();
		return true;
	}
	
//	 9. param := type-specifier ID | type-specifier ID [] 

	static public boolean param(){
		if ( !typeSpecifier() ) return false;
		expect(Token.ID);
		if ( accept(Token.OPEN_BRACKET) )
		{
			expect(Token.CLOSE_BRACKET);
		}
		return true;
	}
	
//	10. compound-stmt := { local-declarations statement-list } 

	static public boolean compoundStmt(){
		accept(Token.OPEN_CURLY);
		localDeclarations();
		statementList();
		expect(Token.CLOSE_CURLY);
		return true;
	}
	
//	11. local-declarations := local-declarations var-declaration | empty

	static public boolean localDeclarations(){
		while ( varDeclaration() )
			;
		return true;
	}
	
//	12. statement-list := statement-list statement | empty 

	static public boolean statementList(){
		while ( statement() )
			;
		return true;
	}
	
//	13. statement := expression-stmt | compound-stmt | selection-stmt | iteration-stmt | return-stmt 

	static public boolean statement(){
		return expressionStmt() || compoundStmt() || 
			selectionStmt() || iterationStmt() || returnStmt();
	}
	
//	14. expression-stmt := expression ; | ;

	static public boolean expressionStmt(){
		if ( accept(Token.SEMI) )
			return true;
		return expression();
	}
	
//	15. selection-stmt := if ( expression ) statement | if ( expression ) statement else statement 

	static public boolean selectionStmt(){
		if ( ! accept(Token.IF) )
			return false;
		expect(Token.OPEN_PAREN);
		expression();
		expect(Token.CLOSE_PAREN);
		statement();
		if ( ! accept(Token.ELSE) )
			statement();
		return true;
	}
	
//	16. iteration-stmt := while ( expression ) statement 

	static public boolean iterationStmt(){
		if ( ! accept(Token.WHILE) )
			return false;
		expect(Token.OPEN_PAREN);
		expression();
		expect(Token.CLOSE_PAREN);
		statement();
		return true;
	}
	
//	17. return-stmt := return ; | return expression ; 

	static public boolean returnStmt(){
		if ( ! accept(Token.RETURN) )
			return false;
		expression();
		expect(Token.SEMI);
		return true;
	}
	
//	18. expression  := var = expression | simple-expression 

	static public boolean expression(){
		if ( simpleExpression() )
			return true;
		if ( ! var() )
			return false;
		expect(Token.ASSIGN);
		expression();
		return true;
	}
	
//	19. var := ID | ID [ expression ] 

	static public boolean var(){
		if ( ! accept(Token.ID) )
			return false;
		if ( accept(Token.OPEN_BRACKET) )
		{
			expression();
			expect(Token.CLOSE_BRACKET);
		}
		return true;
	}
	
//	20. simple-expression := additive-expression relop additive-expression | additive-expression

	static public boolean simpleExpression(){
		if ( ! additiveExpression() )
			return false;
		if ( relop())
			additiveExpression();
		return true;
	}
	
//	21. relop := <= | < | > | >= | == | !=

	static public boolean relop(){
		return accept(Token.OP_EQ) || accept(Token.OP_NE) 
			|| accept(Token.OP_LT) || accept(Token.OP_LE)
			|| accept(Token.OP_GT) || accept(Token.OP_GE);
	}
	
//	22. additive-expression := additive-expression addop term | term

	static public boolean additiveExpression(){
		if (! additiveExpression() ) //obvious problem here
			return term();
		addOp();
		term();
		return true;
	}
	
//	23. add-op := + | -

	static public boolean addOp(){
		return accept(Token.OP_PLUS) || accept(Token.OP_MINUS);
	}
	
//	24. term := term mulop factor | factor

	static public boolean term(){
		if (! term() ) //obvious problem here
			return factor();
		mulop();
		factor();
		return true;
	}
	
//	25. mulop := * | /

	static public boolean mulop(){
		return accept(Token.OP_MUL) || accept(Token.OP_DIV);
	}
	
//	26. factor := ( expression ) | var | call | NUM

	static public boolean factor(){
		if ( accept(Token.OPEN_PAREN) ) {
			expression();
			expect(Token.CLOSE_PAREN);
			return true;
		}
		return var() || call() || accept(Token.NUMBER) ;
	}
	
//	27. call := ID ( args ) 

	static public boolean call(){
		if (! accept(Token.ID) )
			return false;
		accept(Token.OPEN_PAREN);
		args();
		expect(Token.OPEN_PAREN);
		return true;
	}
	
//	28. args := arg-list | empty 

	static public boolean args(){
		argList();
		return true;
	}
	
//	29.  arg-list := arg-list , expression | expression 

	static public boolean argList(){
		if (! expression() )
			return false;
		while ( accept(Token.COMMA) )
			expression();
		return true;
	}	
}
