
SYNTAX

In computer science, the syntax of a computer language is the set of rules that defines which combinations of symbols count as a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and to markup languages, where the document represents data. A language’s syntax defines its surface form. Text-based computer languages are based on sequences of characters, whereas visual programming languages are based on the spatial layout of symbols (which may be textual or graphical) and the connections between them. A document that is syntactically invalid is said to have a syntax error.
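As a small illustration of a syntax error, the sketch below uses Python's standard `ast` module to ask whether a piece of text is a correctly structured Python document. The helper name `is_syntactically_valid` is made up for this example; only `ast.parse` and `SyntaxError` come from Python itself.

```python
import ast


def is_syntactically_valid(source: str) -> bool:
    """Return True if `source` parses as Python, False if it has a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_syntactically_valid("x = (1 + 2) * 3"))  # True
print(is_syntactically_valid("x = (1 + 2 * 3"))   # False: the parenthesis is never closed
```

The check looks only at form: it says nothing about what the text would do if it were run.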

Syntax (the form) is contrasted with semantics (the meaning). In processing computer languages, semantic processing generally comes after syntactic processing. In some cases, however, semantic processing is necessary for complete syntactic analysis, and the two are then done together or concurrently. In a compiler, syntactic analysis comprises the frontend, while semantic analysis comprises the backend (and the middle end, where that phase is distinguished).

LEVELS OF SYNTAX

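Computer language syntax is normally divided into three levels: words (the lexical level, which determines how characters form tokens), phrases (the grammar level, which determines how tokens form a syntax tree), and context (which determines what names refer to and whether types are valid).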

Distinguishing these levels yields modularity: each level can be described and processed separately, and often independently. First, a lexer turns the linear sequence of characters into a linear sequence of tokens; this is known as lexical analysis or lexing.
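A minimal sketch of the lexing step, assuming a toy language of integers, identifiers, and a handful of operators. The token names (`NUMBER`, `IDENT`, `OP`, `SKIP`) and the function `lex` are illustrative choices, not part of any particular language definition.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

# Hypothetical token rules for a toy expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()=]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text: str) -> List[Token]:
    """Turn a linear sequence of characters into a linear sequence of tokens."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(lex("x = 12 + y"))
# [('IDENT', 'x'), ('OP', '='), ('NUMBER', '12'), ('OP', '+'), ('IDENT', 'y')]
```

The output is still flat: the lexer knows nothing about how tokens nest, which is left to the next level.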

Second, a parser turns the linear sequence of tokens into a hierarchical syntax tree; this is known as parsing.
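The parsing step can be sketched with a small recursive-descent parser. This is one technique among several, chosen here for brevity. It assumes tokens arrive as `(kind, text)` tuples with the illustrative kinds `NUMBER`, `IDENT`, and `OP`, and it represents the tree as nested tuples; both are assumptions made for this example.

```python
def parse(tokens):
    """Turn a flat list of (kind, text) tokens into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        # expr := term (("+" | "-") term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        # factor := NUMBER | IDENT | "(" expr ")"
        kind, text = advance()
        if kind in ("NUMBER", "IDENT"):
            return (kind, text)
        if (kind, text) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


tokens = [("NUMBER", "1"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("IDENT", "x")]
print(parse(tokens))
# ('+', ('NUMBER', '1'), ('*', ('NUMBER', '2'), ('IDENT', 'x')))
```

The multiplication ends up nested inside the addition, so the hierarchy of the tree records operator precedence that the flat token sequence does not show.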

Third, contextual analysis resolves names and checks types. The parsing stage itself can be divided into two parts: the parse tree, or concrete syntax tree, which is determined by the grammar but is generally far too detailed for practical use, and the abstract syntax tree (AST), which simplifies this into a usable form.
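One part of contextual analysis, name resolution, can be sketched on top of Python's own AST as exposed by the standard `ast` module. The function `undefined_names` and the idea of passing in a set of known names are assumptions made for this example; a real compiler tracks scopes in far more detail.

```python
import ast


def undefined_names(expression: str, known: set) -> list:
    """Return the names used in `expression` that are not in `known`, sorted."""
    tree = ast.parse(expression, mode="eval")  # abstract syntax tree of the expression
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(used - known)


print(undefined_names("price * quantity + tax", {"price", "quantity"}))  # ['tax']
```

The expression parses without complaint; only the walk over the tree, with knowledge of which names exist, reveals that `tax` refers to nothing.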

SYNTAX VERSUS SEMANTICS

The syntax of a language describes the form of a valid program, but it provides no information about the meaning of the program or the results of executing it. The meaning given to a combination of symbols is handled by semantics (either formal or hard-coded in a reference implementation). Not all syntactically correct programs are semantically correct. Many syntactically correct programs are nonetheless ill-formed under the rules of the language and may, depending on the language specification and the soundness of the implementation, result in an error on translation or execution. In some cases, such programs exhibit undefined behavior. Even when a program is well defined within a language, it may still have a meaning that its author did not intend.
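The gap between form and meaning can be shown with a short Python sketch. The helper `check` is hypothetical and only distinguishes one kind of semantic failure, a type error raised at execution.

```python
import ast


def check(source: str) -> str:
    """Parse `source` as an expression, then try to evaluate it."""
    ast.parse(source, mode="eval")  # raises SyntaxError if the form is invalid
    try:
        eval(source)
    except TypeError as error:
        return f"valid syntax, fails at execution: {error}"
    return "valid syntax, executes"


print(check("'total: ' + str(42)"))  # valid syntax, executes
print(check("'total: ' + 42"))       # valid syntax, then a TypeError message
```

Both expressions have the same shape, a string followed by `+` and another operand, so syntax alone cannot tell them apart. The third case in the text, a well-defined program with an unintended meaning, is one that no tool of this kind can flag: `1 + 2 * 3` is valid and evaluates to 7 even if its author meant `(1 + 2) * 3`.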

As an analogy from natural language, it may not be possible to assign a meaning to a sentence that is grammatically correct.
