Programming Language Principles
Table of Contents
- 1. Intro
- 2. Lexical Analysis
- 3. Parsing
- 4. Binding Times
- 5. Semantic Analysis
- 6. Control Flow
- 7. Type Systems
- 7.1. Meaning of Type
- 7.2. Polymorphism
- 7.3. Orthogonality
- 7.4. Enumeration
- 7.5. Sub-range type
- 7.6. Composite Types
- 7.7. Type Checking
- 7.8. Type Equivalence
- 7.9. Type Conversion and Cast
- 7.10. Universal Reference type
- 7.11. Type inference
- 7.12. Parametric Polymorphism
- 7.13. Equality Testing and assignment
- 8. Composite Types
- 9. Subroutine and Control Abstraction
- 10. TODO Scott Supplement
- 11. λ Calculus
- 12. JVM Overview
- 13. Denotational Semantics
- 14. Axiomatic Semantics
- 15. OOP
1 Intro
1.1 Specification of programming lang:
- Syntax
- Semantics
1.2 Spectrum
- Imperative
- Declarative:
- Functional
- Logic
- query
- Concurrent
1.3 Compilation process
Source code → incomplete machine code → complete machine code (in the last step, the linker inserts system libraries)
1.4 JVM
- has support for OOP, exception, threads
1.5 Bootstrapping
Suppose we want to make a compiler for a new language L.
- Assuming we have a C compiler on our target machine, write a compiler for L using a simple subset of C.
- Once this compiler is complete, translate the compiler from C to L.
- Use the compiler from step 1 to compile the compiler written in L.
- From now on, use the compiler produced in step 3.
Start with smaller subset of L to be implemented using C, then use L compiler to implement other L features.
1.6 Preprocessor
- Trivial manipulation of text using simple pattern matching.
- Usually no error messages
1.7 Turing Complete
A programming language is said to be Turing complete if it can be shown that
it is computationally equivalent to a Turing machine.
- In general, for an imperative language to be Turing complete, it needs:
  - A form of conditional repetition or conditional jump (while, or if + goto)
  - A way to read and write some form of storage
- For a λ calculus based function language to be Turing complete, it needs:
- Ability to abstract over arguments (Eg: lambda abstraction, quotation)
- Ability to apply functions to arguments (Eg: reduction)
2 Lexical Analysis
2.1 Syntax v/s semantics
- Syntax: Structure of the language
Semantics: meaning of the constructs
// here the syntax is correct but it is semantically incorrect code:
// note the return type of `funFoo` and the predicate; comparing an int
// and a string is meaningless.
int funFoo(int c) {
    if (c > "string") {
        return "type mismatch";
    } else {
        return 2;
    }
}
2.2 Regular Expression
2.2.1 ε
Empty String ""
2.2.2 Atom:
Eg: Σ = {a, b, c}
2.2.3 Concatenation
- a ε = {a}
- ab = {ab}
2.2.4 Alternative: |
- a | b
- (a | b) c = {ac, bc}
2.2.5 Kleene Closure: *
- a* = {ε, a, aa, aaa, …}
- a(b|c)*a = {aa, aba, aca…}
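These closure examples can be checked directly with Java's built-in regex engine (a quick sketch; the class and method names are illustrative):

```java
import java.util.regex.Pattern;

// Checks strings against the Kleene-closure example a(b|c)*a:
// an 'a', then any mix of b's and c's (possibly none), then another 'a'.
public class KleeneDemo {
    public static boolean matches(String s) {
        return Pattern.matches("a(b|c)*a", s);
    }
}
```

For example, `matches("aa")` and `matches("abcba")` hold, while `matches("ada")` does not, since `d` is not in the alternative.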
2.2.6 Derived notations:
- a+ = a a* (one or more)
- a? = (a | ε) (optional)
2.2.7 Restrictions
- Nothing can be defined recursively.
- The extension to RE can be shown to be equivalent to the basic operations just by substitution.
- disadvantage: recursive definitions cannot be expressed
- nested constructs cannot be expressed. Eg: blocks, nested comments like /* … /* … */ … */
- since nesting cannot be expressed using RE, nested comments are not allowed in Java/C.
- comments are removed during lexical analysis
2.3 Scanners
- tokenize using RE
- Eg: 01 will be tokenized as two digit tokens with nothing between them
- it will be marked as error by parser
- responsible for:
- tokenizing source
- removing comments
- dealing with pragmas
- saving text of identifiers, numbers, strings
- saving source location for error messages
- whitespace, including newlines, is usually discarded, so line or column numbers cannot be recovered after the scanning phase (which is why the scanner records source locations).
2.3.1 Implementation
- Ad hoc
- mechanical DFA
- table-driven DFA
- In Java
- Token type
class Token {
    Kind kind;
    int pos;      // position in source
    int lineNum;  // useful for debugging
}
- Use each state as case in switch (Eg: START, AFTERSTART)
- Enums:
- they have types
- they are ordered
- can be used in switch case
- Handling comments
- track pos and line number
- Reserved words v/s keyword
- reserved word: cannot be used as identifier
- keyword: cannot be used as identifier because they have special meaning in the language
- Eg: goto in Java cannot be used as an identifier but is not a keyword (it does nothing); it is a reserved word
- Handling reserved word
- build identifier first then keyword
- list all the reserved words, then check if the identifier is present in the list
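The "build the identifier first, then check the reserved-word list" approach can be sketched as follows (the RESERVED set and method names are illustrative, not from the notes):

```java
import java.util.Set;

// Scans a maximal identifier starting at pos, then classifies it
// by checking membership in the reserved-word list.
public class IdentScanner {
    static final Set<String> RESERVED = Set.of("if", "else", "while", "return");

    public static String classify(String src, int pos) {
        int end = pos;
        // first, consume the maximal run of identifier characters
        while (end < src.length() && Character.isLetterOrDigit(src.charAt(end))) {
            end++;
        }
        String text = src.substring(pos, end);
        // only afterwards decide: reserved word or ordinary identifier
        return RESERVED.contains(text) ? "RESERVED(" + text + ")" : "IDENT(" + text + ")";
    }
}
```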
2.3.2 Whitespace
- skip
- used as separators
- add another path on start which loops to start on whitespace.
2.3.3 Automatic DFA generation
- DFA can be generated based on RE for parsing.
- Regular Expression → ε-NFA → NFA → DFA
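A table-driven DFA produced from a RE such as [0-9]+ might look like this minimal hand-written sketch (state numbering and table layout are illustrative):

```java
// Table-driven DFA for the regular expression [0-9]+ .
public class IntDFA {
    // rows = states (0 start, 1 accepting), columns = input class (0: digit, 1: other)
    static final int[][] TRANS = {
        { 1, -1 },   // state 0: digit -> 1, other -> dead (-1)
        { 1, -1 },   // state 1: digit -> 1, other -> dead
    };

    public static boolean accepts(String s) {
        int state = 0;
        for (char c : s.toCharArray()) {
            int cls = Character.isDigit(c) ? 0 : 1;
            state = TRANS[state][cls];
            if (state == -1) return false;   // dead state: reject immediately
        }
        return state == 1;                   // accept only if we saw at least one digit
    }
}
```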
2.4 Informal Summary
- Lexical analysis forms stream of tokens from stream of characters
- Takes care of:
- comments
- line and column number for error reporting
- text/strings
3 Parsing
3.1 Regular and Context Free language:
- Any set of strings that can be defined in terms of the first three rules (Atom, Alternative, Kleene closure) is called a regular language
- Any set of strings that can be defined if we add recursion is called context free language (CFL).
3.2 CFG: Context Free Grammar
- RE is not recursive, need more powerful formalism
3.2.1 BNF
- Σ set of alphabet (terminals)
- N set of non terminals (variables)
- P set of productions (substitution rules)
- S start symbol
3.2.2 EBNF
In addition to BNF:
- | or
- multiple production from same variable
- * 0 or more
- + one or more
3.2.3 Example:
expr → factor | expr op factor
factor → ident | numlit | (expr)
3.3 Non Context Free Grammar
- for CFG production doesn't depend on neighbors
- if non CFG, there would be more than one symbol on lhs.
- CFG has only one symbol on the left hand side, so it doesn't depend on neighbors, hence the context-freeness.
- programming languages are not context free
- we handle it separately using other techniques.
3.4 Parse tree
- Scanner converts stream of characters into stream of tokens
- Parser constructs parse tree from stream of tokens.
3.5 Ambiguous Grammar
- more than one parse tree for given expression
- Eg: the dangling else (if … if … else) ambiguity
3.6 Grammar Class
- we can create parser for any CFG which runs in O(n³)
- LL and LR run in linear time
- both are used in compilers, LR being more common
- LL v/s LR:
- LL tries to predict the next symbol by peeking the upcoming token and deciding which production to choose
- LR shifts tokens until it recognizes a right hand side, it then reduces it to the left hand side
3.6.1 LL: Left to right, Left-most derivation
- considered simpler than LR, smaller class than LR
- usually created by hand (not generated)
- Also called top-down or predictive parser
- usually hand written- we will use this in class
3.6.2 LR: Left to right, Right-most derivation
- usually generated
- larger class than LL
- also called bottom-up parser or shift-reduce parser
- skim the topic from book, not covered in class
- Subtypes of LR:
- SLR
- LALR
- full LR
3.7 Grammar and parsing
3.8 Top Down parser
3.8.1 LL(K): look ahead k tokens
3.8.2 LL(*): can look ahead as many tokens as needed
3.8.3 First Set
First of terminal is the set containing the terminal
S → AB
A → x | y
B → z | w
First(A) = {x, y}
First(B) = {z, w}
First(S) = {x, y}
In general: S → A | B gives First(S) = First(A) ∪ First(B)
- for LL(1), the First sets of alternatives for the same nonterminal should have no common element
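The First-set rules above can be sketched in Java for the toy grammar (names and encoding are illustrative; ε-productions are not handled here):

```java
import java.util.*;

// First-set computation for  S → AB,  A → x | y,  B → z | w.
// Uppercase chars are non terminals, lowercase chars are terminals.
public class FirstSets {
    static final Map<Character, List<String>> PRODS = Map.of(
        'S', List.of("AB"),
        'A', List.of("x", "y"),
        'B', List.of("z", "w"));

    public static Set<Character> first(char sym) {
        if (Character.isLowerCase(sym)) {
            return Set.of(sym);          // First of a terminal is the terminal itself
        }
        Set<Character> result = new HashSet<>();
        for (String rhs : PRODS.get(sym)) {
            // no ε-productions in this grammar, so only the leftmost symbol matters
            result.addAll(first(rhs.charAt(0)));
        }
        return result;
    }
}
```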
3.8.4 Follow Set
S → A B
A → x | y
B → z | w
Follow(A) = {z, w} = First(B)
Follow(B) = {}
Follow(x) = Follow(A)
- In A → x A, the trailing A is followed by whatever follows A itself (look at what comes after the symbol in the RHS)
3.8.5 Predict Set
Predict set is defined for a production.
EPS(α) = (α →* ε)
Predict(A → α) = First(α) ∪ (if EPS(α) then Follow(A) else {})
- If α can derive ε, then the predict set consists of First(α) and Follow(A).
- Predict set tell us if grammar is LL(1)
- and which production to choose for next input.
3.8.6 LL(1) Rule
The predict set of all productions with the same left side should be disjoint.
This is a necessary and sufficient condition for a grammar to be LL(1)
3.8.7 Recursive descent parsing working
- determine which production to get based on predict set
- If the predict set of all productions with same left side are not disjoint, then there are multiple options for production which can produce next symbol (but not necessarily complete string).
3.8.8 Left recursive grammar
- LL parsing doesn't work on Left recursive grammar
- Left recursive removal
A → A σ | β
can be replaced with
A → β B
B → σ B | ε
How to: expand the original grammar to get a feel for the language (strings generated).
A → A σ | β gives: β, β σ, β σ σ, β σ σ σ, …
Note that the strings start with β and have several trailing σ, so rewrite the grammar to represent this without left recursion:
A → β B
B → σ B | ε
3.8.9 associativity and precedence
- Associativity
- operator group left to right or right to left.
- Eg: 10 - 4 - 5 means (10 - 4) - 5 rather than 10 - (4 - 5)
- Precedence
- some operators group more tightly than others
- Eg: 10 - 4 × 3 means 10 - 12 rather than 6 × 3
- Inbuilt associativity and precedence
we can make grammar in such a way that associativity and precedence is handled automatically.
expr → term | expr add_op term
term → factor | term mult_op factor
factor → id | number | (expr)
add_op → + | -
mult_op → × | ÷
- in the above example, note how we have introduced term to make sure mult_op is grouped more tightly than add_op
- since the operators are left associative, we have expr (which has the potential to produce more add_op) always to the left of add_op
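A recursive descent evaluator for a non-left-recursive version of this grammar shows how precedence and associativity fall out of the grammar shape (a sketch; class and method names are illustrative, and error handling is omitted):

```java
// Recursive descent evaluator for:
//   expr → term { add_op term }
//   term → factor { mult_op factor }
//   factor → number | ( expr )
public class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String src) { this.src = src.replace(" ", ""); }

    public static int eval(String s) { return new ExprParser(s).expr(); }

    private int expr() {
        int v = term();
        // the loop keeps grouping left to right: built-in left associativity
        while (pos < src.length() && (src.charAt(pos) == '+' || src.charAt(pos) == '-')) {
            char op = src.charAt(pos++);
            v = (op == '+') ? v + term() : v - term();
        }
        return v;
    }

    private int term() {
        int v = factor();
        // term binds tighter than expr: built-in precedence for * and /
        while (pos < src.length() && (src.charAt(pos) == '*' || src.charAt(pos) == '/')) {
            char op = src.charAt(pos++);
            v = (op == '*') ? v * factor() : v / factor();
        }
        return v;
    }

    private int factor() {
        if (src.charAt(pos) == '(') {
            pos++;                 // consume '('
            int v = expr();
            pos++;                 // consume ')'
            return v;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return Integer.parseInt(src.substring(start, pos));
    }
}
```

`eval("10 - 4 - 5")` gives 1 (left associativity) and `eval("10 - 4 * 3")` gives -2 (precedence), matching the examples above.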
3.9 Parser Construction
- we have LL(1) grammar, construct recursive descent parser
3.9.1 Grammar Simplification
- useless symbols
- redundant stuff
3.9.2 Handling errors
3.10 Parsers
3.10.1 Backtracking Algorithms
- BFS
- worst case exponential time
- worst case exponential space
- can parse every grammar
- DFS
- worst case exponential time
- worst case linear space
- Cannot parse left recursive grammar
- Both BFS and DFS deploy backtracking, if production does not apply, backtrack and try next production.
3.10.2 Predictive Algorithms
- Based on remaining input, predict which production to use
- Faster than backtracking: linear time
- Not all grammar can be parsed using predictive algorithm
- How to predict?
- use look ahead
- increasing the number of lookahead tokens makes the parser more powerful but also more complicated
- LL(1)
- Left to right scan of token
- Leftmost derivation
- (1): 1 token of lookahead
- Parse Table
1. E → int
2. E → (E Op E)
3. Op → +
4. Op → *
      int   (    )    +    *
E      1    2
Op                    3    4

- Example:

Stack           Input                    Rule
E $             (int + (int + int)) $    2
(E Op E) $      (int + (int + int)) $
E Op E) $       int + (int + int)) $     1
int Op E) $     int + (int + int)) $
Op E) $         + (int + int)) $         3
+ E) $          + (int + int)) $
E) $            (int + int)) $           2
(E Op E)) $     (int + int)) $
E Op E)) $      int + int)) $            1
int Op E)) $    int + int)) $
Op E)) $        + int)) $                3
+ E)) $         + int)) $
E)) $           int)) $                  1
int)) $         int)) $
$               $
- Construction of Parse table
- Intuition
The next character should uniquely identify a production, so we should pick a production that ultimately starts with that character
- First Sets
The First(A) is the set of terminals that can be at the start of a string produced by A.
- we can use First set to make parsing table.
- if rules contain ε productions, then we have to look through the non terminals and see the First of the next non terminal
- For A → α, where α's First set contains ε: First(A) = First(A) ∪ {ε}
- For A → α t ω, where α's First set contains ε and t is a terminal: First(A) = First(A) ∪ {t}
- For A → α B ω, where B is a non terminal and α's First set contains ε: First(A) = First(A) ∪ First(B)
- Follow Set
Follow(A) is set of all terminal symbols that can immediately follow a
subsequence derived from A in a sequence derived from start symbol, S.
Num → Sign Digits
Sign → + | - | ε
Sign has an ε production, so First(Num) will contain First(Digits)
- The follow set represents the set of terminals that might come after a given
non terminal.
- If B → α A ω then: Follow(A) = Follow(A) ∪ First*(ω)
- If B → α A ω and ε ∈ First*(ω) then: Follow(A) = Follow(A) ∪ Follow(B)
- In other words, the follow of a symbol is whatever can come after it, so that if the symbol derives ε the follow can appear. If the next symbol derives ε, then add the follow of the parent.
- Final Construction
- For each rule A → ω, for each terminal t ∈ First*(ω), set T[A, t] = ω (at A, if you see t, use the production A → ω)
- For each rule A → ω, if ε ∈ First*(ω): set T[A, t] = ω for each t ∈ Follow(A) (at A, if A can derive ε and t can follow A, take the ε path and let A disappear)
- Limitation of LL(1)
- Left recursion
can be eliminated
A → A ω | α
rewrite as
A → α B
B → ω B | ε
Explanation: A can produce α, α ω, α ω ω, …; every string starts with α. The second rule repeats ω, and ε stops the repetition.
- Left factoring

E → T
E → T + E
T → int
T → (E)

rewrite as:

E → TY
Y → + E | ε
...
- Condition for LL(1):
A grammar G is LL(1) iff for any productions A → ω1 and A → ω2 the sets:
First(ω1 Follow(A)) and
First(ω2 Follow(A))
are disjoint
This condition is equivalent to saying that there are no conflicts in the table
4 Binding Times
- Language design time: control flow constructs, primitive data types…
- Language implementation time: IO, size of stack, heap etc
- program writing time: variable name
- compile time: layout of static data
- Link Time: compiler compiles different modules at different times, binding to variables in other module is finalized during link time.
- Load time
- run time: address of variable.
4.1 Binding and visibility
void testFoo() {
    int arg = 100;
    for (int arg = 0; arg < 10; arg++) {
        // arg here loops from 0 to 9
        // the binding arg = 100 is active here but not visible
    }
    // here arg = 100
}
4.2 Storage Allocation Mechanism
4.2.1 Static Allocation
- Global variables
- literals
- tables generated for helping runtime: debugging, dynamic type checking etc
- Languages without recursion
- some languages such as Fortran (before Fortran90), did not have recursion.
- as a result, only one invocation of subroutine will be active at a time.
- space for the variables can be allocated statically and avoid runtime overhead of creation and destruction.
- Constants
- constants can be allocated space during compile time, since their value can be shared across recursive calls.
- Some constants value cannot be determined until runtime. Such elaboration-time constants are allocated on stack.
4.2.2 Stack Allocation
- if language allows recursion, local variables cannot be allocated statically.
- each instance of subroutine has its own frame or activation record
- each frame contains:
- arguments
- return values
- local variables
- temporaries: space for implicit temporary variables. Eg: in (2 + 2 + 3), store 4 in a temporary, then compute 4 + 3
- Bookkeeping information:
- subroutine's return address
- reference to stack frame of caller (dynamic link)
- offset of objects from frame base is known at compile time.
- Stack grows downwards (towards lower addresses) in most language implementations.
- Local variables, temporaries and bookkeeping information typically have negative offsets from the frame pointer
- Arguments and return values typically have positive offsets; they reside in the caller's frame
- even if a language doesn't support recursion, stack-based allocation of local variables is better than static allocation: since not all subroutines are active at the same time, stack-based allocation takes less memory.
4.2.3 Heap Allocation
- Issues:
- Internal fragmentation
- External fragmentation
- Allocation strategy:
- First fit: less search time, less space-efficient
- Best fit: more search time, more space-efficient
- List of preallocated standard size blocks: fibonacci heap and buddy system
- compacting requires moving already allocated blocks – requires updating existing pointer.
- Garbage Collection
- Implicit
- unreachable object
- on exiting scope - C++ RAII
- Implicit
4.3 Scope
- the textual region of the program in which the binding is active is its scope
- Bindings for local objects are created and binding for global objects are made invisible on entering a new scope.
- On scope exit, we delete local variables binding and reactivate the binding for global variables.
- This does not require any code to be executed → can be determined statically → statically scoped or lexically scoped
- Set of active binding is called current referencing environment
4.3.1 Static Scope
- there can be nested scope.
- search the current scope, if not found then search the outer scope. Repeat until reaching the global scope.
- if no binding is found even in the global scope, then report an error.
- Hole in scope
- A name-to-object binding that is hidden by a nested declaration of the same
name is said to have a hole in its scope.
- or we can say the closer object has eclipsed the outer definition.
- Some languages may have qualifier to access outer scope variables.
- Objects in lexically surrounding scopes
- the frames of surrounding subroutines are already active on the stack.
- Static Chain
link to the surrounding block's stack
(blockA) {
    (blockB) {
        (blockC) { }
        (blockD) { }
    }
}

in the above example, during execution of blockD, the scope for blockB has to be active. If A calls B, B calls D, and D calls C, the stack (top to bottom) is: blockC, blockD, blockB, blockA.
- C and D have static links to B
- B has static link to A
4.3.2 Declaration Order
- can we use a variable in a block before it was declared.
- Forward Reference
An attempt to use a name before its declaration
- Mixed strategy used by languages
- In Java/C++, members are visible throughout the class, even before they are defined.
- Python, Scheme and Modula-2 do it differently. Refer to book page 133.
- Mixed strategy used by languages
- Declaration and Definition
- to implement mutually recursive data structures (example: manager and employee), C and C++ have declarations and definitions.
- Declaration gives the compiler enough information about a name and its scope.
- Definition gives actual implementation details.
- Redeclaration
- most dynamic languages overwrite the older value.
  In ML, however, the old meaning of the variable is accessible to functions defined before the redeclaration.
let x = 1;
let f y = x + y;
f 2  (* gives 3 *)
let x = 3;
f 2  (* we might expect 5, but the output is 3 (1 + 2) *)
- Languages like Scheme store bindings of names in known locations, whereas an ML program is compiled to access the old meaning of the variable.
4.3.3 Dynamic Scoping
- The bindings depends on flow of control during runtime.
4.3.4 Implementing Scope
- Symbol table is dictionary which maps name to information about it.
- Static Scope:
- never delete anything from symbol table.
- Most variations on static scoping can be handled by augmenting a basic dictionary-style symbol table with enter_scope and leave_scope operations to keep track of visibility.
- LeBlanc and Cook Table
- Each scope, as it is encountered, is assigned a serial number.
- Eg: The outermost scope (the one containing predefined identifiers) is given number 0.
- The scope containing programmer defined global variables is given 1, and so on.
- All names are entered into single large hash table
- Each entry contains symbol name, category (variable, constant, type, procedure, fieldname, parameter), scope number and other details.
- Symbol table has scope stack that indicates the scopes that compose the current referencing environment.
- To look up a name in the table, we scan its hash-table entries; for each entry, we search the scope stack to check whether that entry's scope is visible.
- Dynamic Scope:
- A-list (Association list)
- a simple list of name-value pairs.
- When used to implement dynamic scoping, it functions as a stack: new declarations are pushed as they are encountered and popped at the end of the scope in which they appear.
- Central Reference table
- maintains a mapping from names to current meanings.
- new entries are added in front of the chain (bucket list (when key collides))
- lookup returns the entry at front of the chain.
- on exit, delete the entry at front.
4.3.5 Shallow and Deep binding (passing subroutine as parameter)
def sub1:
    def sub2:
        print x    # prints 3 (deep), 4 (shallow)

def sub3:
    x = 3
    sub4(sub2)     # <- deep binding created here

def sub4(subx):
    x = 4
    subx()         # <- shallow binding created here

x = 1
sub3()
- Deep Binding
  - referencing environment is the one in effect when the reference is created
  - in the example, the binding x = 3 is active when we pass sub2
- Shallow Binding
  - referencing environment is the one in effect when the routine is called
  - in the example, the binding x = 4 is active when we call subx
- Deep Binding and Closure
- Note that Deep binding has reference to the binding environment.
- if we pass this subroutine (second class citizen), the subroutine can access the environment deep in the stack.
- but this still can be implemented using stack and pointers. No need to allocate space on heap.
- the issue arises when we return the subroutine and the stack unwinds. In this case, to have a closure we need unlimited extent and thus heap allocation for the subroutine's environment. See 4.3.6.4
4.3.6 First Class and unlimited extent
- First Class
- can be passed as parameter
- can be returned from subroutine
- can be assigned to a variable
- Second Class
- can be passed as parameter
- can not be returned from subroutine
- can not be assigned to variable
- Third Class
- cannot be passed as parameter
- cannot be returned from subroutine
- cannot be assigned to variable
- Misc: Unlimited Extent
- check 4.3.5.3, unlimited extent needed only in case of first class function, second class can do with stack allocation.
- if we return subroutine and that subroutine has access to local variable
in surrounding subroutine,
- we cannot use stack based allocation for local variables
- the allocation must be on heap
- handled by garbage collection
- C does not have nested subroutines
- can have pure stack based allocation for local variables
- Scheme will have heap based allocation for local variables
- Imperative languages avoid first-class functions to avoid heap-based allocation of local variables.
- Function Object/Object Closure/Functor
- to implement closure in imperative language such as Java, we can store the context in object's field.
example:
interface IntFunc {
    public int call(int i);
}
class PlusX implements IntFunc {
    final int x;
    PlusX(int n) { x = n; }
    public int call(int i) { return i + x; }
}

Here we are saving the closure context in the object field int x.
4.4 Alias
Multiple name bound to same object.
- Overloading: correct function is called according to the type and number of
parameters passed.
- can be determined at compile time.
- Polymorphism: correct function (method) is invoked according to type of
instance (parent or child).
- Determined at runtime.
- Coercion: automatic conversion of type by the compiler
  - Eg: +
    - if + is defined only for int and a float is supplied, coercion will take place
    - if a separate + is defined for float, by overloading the float + will be called
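The contrast between overloading and coercion can be sketched in Java, using a hypothetical plus method in place of the + operator:

```java
// Overloading vs coercion: two overloads of a hypothetical `plus`.
public class OverloadDemo {
    static String plus(int a, int b)     { return "int version"; }
    static String plus(float a, float b) { return "float version"; }

    public static void main(String[] args) {
        // exact match: overload resolution picks the int version at compile time
        System.out.println(plus(1, 2));
        // no (int, float) overload exists: the int argument 1 is coerced
        // (widened) to float and the float version is called
        System.out.println(plus(1, 2.5f));
    }
}
```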
5 Semantic Analysis
- Syntax of language can be defined using context-free grammar
- Semantics cannot be defined using CFG.
- some semantics can be defined, but it is very complicated to do so.
- We describe semantics using Attribute grammar (AG) which is like pseudocode.
- Semantics analysis can be divided into:
- Static: check for semantic error at compile time
- Dynamic: generate code which will check for error during execution time.
5.1 Static Analysis
- Compile time algorithm which predict runtime behavior.
- Example: type checking
5.2 Semantics:
- Operational Semantics
- Denotational Semantics
- Axiomatic Semantics
- Attribute Grammar
5.3 Attribute Grammar
- formal way to define attributes for production of formal grammar, associating these attributes to values.
- Based on context free grammar
- Symbols in grammar are associated with attributes
- Production rules in grammar are associated with semantic rules indicating how the attribute values are computed.
- May include conditions that impose constraint on legal sentences.
5.3.1 Synthesized
- values are calculated (synthesized) only in productions in which their symbol appears on left hand side.
- An attribute grammar in which all attributes are synthesized is said to be S-attributed.
- Value of parent is calculated using children, left is calculated using right.
5.3.2 Inherited
- values are calculated when their symbol is on the right hand side of the current production.
- value is calculated using the parent and left siblings (hence the statement above that the symbol is on the RHS).
5.3.3 Attribute Flow
- attribute grammar is well defined if its rules determine a unique set of values for the attributes of every possible parse tree.
- non circular: if no attribute ever depends on itself.
- An algorithm that decorates parse tree by invoking the rules of an attribute
grammar in an order consistent with the tree's attribute flow is called a
translation scheme.
- most obvious: make repeated passes over a tree and invoke semantic function whose arguments have all been defined and stopping when no such value changes.
- dynamic: similar to topological sort
- static: based on the grammar
- S-attributed grammars form the simplest of such class.
- attribute flow in an S-attributed grammar is strictly bottom-up
- can be combined with bottom up parsing.
- can be implemented on the fly during LR parse.
- L-attributed grammars can be evaluated by visiting the nodes of the parse
tree in a single left to right, depth first traversal.
- can be implemented on the fly during LL parse.
- in the same order they are visited during a top-down parse.
- L-attributed grammars can be defined more formally as:
  - Each synthesized attribute of a left hand side symbol depends only on that symbol's own inherited attributes or on attributes (synthesized or inherited) of the production's right hand side symbols.
    - either from the parent or synthesized through children
  - Each inherited attribute of a right hand side symbol depends only on inherited attributes of the left hand side symbol or on attributes (synthesized or inherited) of symbols to its left in the right hand side.
    - an inherited attribute (not synthesized; top down) comes either from the parent or from a left sibling.
- Every S-attributed grammar is also an L-attributed grammar
  - but the reverse is not true: S-attributed grammars do not permit inherited attributes on right hand side symbols, so there are L-attributed grammars which are not S-attributed.
5.4 Are generics just sugar coating over instanceof?
- no, the compiler does static type checking
6 Control Flow
6.1 Precedence and associativity
- The basic arithmetic operators always associate left to right: 9 - 3 - 2 is 4, not 8.
6.2 Assignment
- side effect: if a construct influences subsequent computation in any way other than by returning a value for use in the surrounding context.
- Expressions: always produce a value and may or may not have side effect
- statement: executed solely for their side effects and return no useful value.
6.3 References and Values
6.3.1 value model
- l-value: used on left hand side of assignments, expressions that denote locations are referred as l-values
- r-value: expression that denote value, used on right hand side
6.3.2 reference model
- no distinction is needed between l-values and r-values: every variable is a reference
- a variable is not a named container for value, it is a named reference to a value
Example:
b := 2; c := b; a := b + c;
Value Model                               Reference Model
we put 2 in b                             we let b refer to 2
copy it into c                            let c refer to it also
read these values, add them together,     pass the references to the + operator
and then place the result in a            and then let a refer to the result

- in the reference model there is only one 2, and all variables having value 2 refer to it.
- in reference model every variable is l-value
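Java exhibits both models side by side: primitives follow the value model, while objects (here, an array) follow the reference model. A sketch with illustrative names:

```java
// Value model vs reference model in Java.
public class ModelDemo {
    public static int referenceModel() {
        int[] b = {2};
        int[] c = b;     // c refers to the same array object as b
        b[0] = 7;
        return c[0];     // 7: the update through b is visible through c
    }

    public static int valueModel() {
        int x = 2;
        int y = x;       // y gets its own copy of the value
        x = 7;
        return y;        // still 2: the copy is unaffected
    }
}
```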
6.3.3 Boxing
- Java collections expect Object; primitive data types are not descendants of Object, so we need to box them before using them in collections.
- Recent versions of Java do auto-boxing, but a raw collection returns Object, which must be unboxed manually.
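With generics, boxing and unboxing both happen automatically, as in this minimal sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Auto-boxing: the collection stores Integer objects,
// but we read and write plain ints.
public class BoxingDemo {
    public static int sum() {
        List<Integer> xs = new ArrayList<>();
        xs.add(1);           // int 1 is auto-boxed to Integer
        xs.add(2);
        int total = 0;
        for (int x : xs) {   // each Integer is auto-unboxed back to int
            total += x;
        }
        return total;
    }
}
```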
6.3.4 Initialization
- In C, statically allocated variables are initialized to 0.
- Java and C# initialize class-typed fields of objects to null.
6.3.5 Ordering within expressions
a - f(b) - c * d;
f(a, g(b), h(c));
- we do not know which will evaluate first
- may affect if the function calls have side effects
- may affect compiler optimization of register allocation.
- since compiler can optimize register allocation, the order of evaluation is
left to implementation in some languages.
- except for Java and C#, where it is left to right.
6.3.6 Mathematical Identities
c = b - a + d // rearrange as c = d - a + b
- can cause overflow.
6.3.7 Switch Case
- jump table:
- can consume large space if set of labels is non dense and includes large value ranges.
- sequential testing
- if number of case statement is less
- hashing
- if set of label is large and has missing values and no large ranges
- binary search
6.3.8 Applicative and Normal Order Evaluation
- Applicative Order: evaluating before call
- Normal Order: evaluating when (if) needed
6.3.9 Non determinacy
- in which choice between alternatives (ie between control paths) is deliberately unspecified.
7 Type Systems
- bits are untyped. hardware makes no attempt to keep track of which interpretation
correspond to which location.
- bits can be interpreted as:
- instructions
- address
- float, int, char literal
- assembly is untyped
- higher level languages keep track of types to provide contextual information and error reporting.
- bits can be interpreted as:
- subroutines have types in languages in which they are 1st or 2nd class values.
- strongly typed: if language prohibits application of any operation to any object that is not intended to support that operation.
- statically typed: if language is strongly typed and type checking is done at compilation stage.
- Dynamic type checking: checking at runtime.
7.1 Meaning of Type
- from denotational point of view, type is simply a set of values.
- from structural point of view, type is either one of a small collection of built in types (integer, character, Boolean etc) or composite type created.
7.2 Polymorphism
- "having multiple form"
- Parametric Polymorphism: the code takes in type either explicitly or implicitly.
- Explicit Parametric Polymorphism: generics/templates
- Subtype Polymorphism: the code is designed to work with values of some
type T, but programmer can define additional types to be extension of T and
code will work with these types too.
- Usually found in OO Languages.
7.3 Orthogonality
- Trivial Type: To characterize a statement that is executed for its side effect and has no useful value, provide a trivial type: void in C, unit in ML.
- In Pascal, we would have needed to capture the return value in a dummy variable.
7.4 Enumeration
- In C, enums are just typedef-ed integers
- In Pascal, enums and int are of different types; mixing them is not allowed.
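Java's enums behave like Pascal's rather than C's: a distinct, ordered type that is not an integer. A sketch (the Color type and method names are illustrative):

```java
// Java enums: a distinct type, not an int, but still ordered and switchable.
public class EnumDemo {
    enum Color { RED, GREEN, BLUE }

    public static int ordinalOf(Color c) {
        // int i = c;       // compile error: no implicit enum-to-int mixing
        return c.ordinal(); // ordering is still defined
    }

    public static String describe(Color c) {
        switch (c) {        // enums can be used in a switch
            case GREEN: return "green";
            default:    return "other";
        }
    }
}
```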
7.5 Sub-range type
- Why use subrange instead of int:
- helps in documentation
- dynamic check by compiler to see if value is in range
- optimize storage based on range.
7.6 Composite Types
7.6.1 Records (structs)
- a record consists of collection of fields, each of which belong to a simpler type
- similar to mathematical tuple, which corresponds to Cartesian product of types of the fields.
7.6.2 Variant Records (Union)
- similar to Records, but only one field is valid at any given time.
- it is the disjoint union of field types.
7.6.3 Arrays
- function that maps index to member of component type.
7.6.4 Sets
- Collection of distinct elements of base type.
7.6.5 Pointers
- l-values
- Pointers are often, but not always, implemented as addresses.
- often used to define recursive data type
7.6.6 Lists
- a list is defined recursively as either an empty list or a pair consisting of a head element and a reference to a sublist.
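The recursive definition can be transcribed almost literally. A minimal Python sketch (encoding and names are mine), representing a list as None or a (head, rest) pair:

```python
# A list is either the empty list or a pair (head, reference-to-sublist).
empty = None

def cons(head, rest):
    return (head, rest)

def length(lst):
    # recurse on the sublist, mirroring the recursive definition
    return 0 if lst is empty else 1 + length(lst[1])

xs = cons(1, cons(2, cons(3, empty)))
assert length(xs) == 3
```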
7.6.7 Files
- similar to arrays, maps index to component type.
7.7 Type Checking
Set of rules for:
- Type Equivalence: when are the types of two values the same
- Type Compatibility: when can a value of a given type be used in a given context
- Type Inference: how to determine the type of an expression
7.8 Type Equivalence
7.8.1 Structural Equivalence
- based on content of type definition
- two types are the same if they consist of the same components in the same order.
- to determine if two types are structurally same, a compiler can expand their definition by replacing any embedded type names with their respective definitions recursively.
7.8.2 Name Equivalence
- based on lexical occurrence of type definitions
- each definition introduces new type
- based on the assumption that if the programmer writes two type definitions, then those are probably meant to represent different types.
- strict name equivalence: aliased types are considered distinct
- loose name equivalence: aliased types are considered equivalent
Ada gives both options:

subtype elemType is integer;  -- loose type equivalence
type person is new string;    -- strict type equivalence; new makes the difference
7.9 Type Conversion and Cast
depending on the types involved, conversion may or may not require code to be executed during runtime:
- Types structurally equivalent but the language uses name equivalence: same low-level representation, no code needed.
- Types have different sets of values but intersecting values are represented in the same way (Ex: one may be a subrange of the other). If the value assigned is outside the expected set, a runtime check must be executed.
- Types have different low level representation but can define correspondence
in their values, (Ex: int and float). Runtime code is needed for checking
and conversion.
- check might be needed to see if the value is within the bound
7.9.1 Nonconverting Type Cast
- changing the type without changing the underlying implementation
- i.e. interpret the bytes of a value as if they were another type
- called a nonconverting type cast, or type pun
- reinterpret_cast in C++
- In C, this can be done using r = *((float *) &m); take the address of m (an int), convert it into the address of a float, and then dereference it.
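The same bit-level reinterpretation can be sketched in Python with the struct module (variable names are mine): pack the int's bytes, then unpack the identical bytes as an IEEE-754 float.

```python
import struct

m = 0x3F800000                                   # bit pattern of 1.0f
# reinterpret the 4 bytes of the int as a float: a type pun, no value conversion
r = struct.unpack("<f", struct.pack("<I", m))[0]
assert r == 1.0
```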
7.9.2 Coercion
- implicit conversion to expected type by language.
7.10 Universal Reference type
- void * in C/C++
- Object in Java
7.11 Type inference
- Ex: the sum of two subrange values may be out of the subrange's bounds, so the inferred type of the sum is int
7.12 Parametric Polymorphism
7.12.1 Generic Implementation
- C++ template implementation is purely static.
- compiler creates a separate copy of the code for every instance.
- Java generics use the same code for all types at run time
- the compiler inserts the casts
7.13 Equality Testing and assignment
- Deep Copy: copy the object and all objects pointed to by it, recursively
- Shallow Copy: copy the object and its references as they are.
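Python's copy module shows the difference directly; a quick sketch:

```python
import copy

a = [[1, 2], [3, 4]]
shallow = copy.copy(a)       # copies the outer list, shares the inner lists
deep = copy.deepcopy(a)      # recursively copies everything reachable

a[0].append(99)
assert shallow[0] == [1, 2, 99]   # the shallow copy sees the mutation
assert deep[0] == [1, 2]          # the deep copy does not
```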
8 Composite Types
8.1 Records (Structures)
- heterogeneous types
- alignment restriction may lead to holes in memory layout
- can use pragma pack to give a hint to the compiler to optimize for space rather than speed
  - packing may lead to a multi-instruction sequence to retrieve the pieces of a field from memory and then reassemble them.
- languages support assigning one record to another but not testing them for equality
  - can use block_copy to assign
  - but block_compare may return an incorrect result due to garbage in the holes
- easy workaround: sort fields in ascending order of size
  - may be done by the compiler, transparently to the programmer
- reorganization may be helpful if we can group frequently accessed fields together so that they lie in the same cache line.
- C/C++ lay out fields in the order defined by the programmer
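Alignment holes and packing can be observed from Python via ctypes. A sketch assuming a typical platform where c_int is 4 bytes with 4-byte alignment (the struct names are mine; _pack_ plays the role of pragma pack):

```python
import ctypes

class Padded(ctypes.Structure):
    # default alignment: 3-byte hole after c so i starts on a 4-byte boundary
    _fields_ = [("c", ctypes.c_char), ("i", ctypes.c_int)]

class Packed(ctypes.Structure):
    _pack_ = 1                      # analogous to pragma pack: no holes
    _fields_ = [("c", ctypes.c_char), ("i", ctypes.c_int)]

assert ctypes.sizeof(Packed) == 5   # 1 + 4, no padding
assert ctypes.sizeof(Padded) == 8   # 1 + 3 (hole) + 4
```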
8.2 Unions
- the overall size of a union is that of its largest member
- only one of the members is active at a time
- unions allow the same set of bytes to be interpreted in different ways at different times
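A ctypes.Union makes the "same bytes, different interpretations" point concrete; this sketch (names are mine) writes the float member and reads the same four bytes back as an integer:

```python
import ctypes

class IntOrFloat(ctypes.Union):
    _fields_ = [("i", ctypes.c_uint32), ("f", ctypes.c_float)]

u = IntOrFloat()
u.f = 1.0
# the same 4 bytes, reinterpreted: IEEE-754 bit pattern of 1.0f
assert u.i == 0x3F800000
assert ctypes.sizeof(IntOrFloat) == 4   # size of the largest member
```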
8.3 Arrays
- homogeneous
- mapping index to component type
- some languages allow non-discrete index types
  - can be implemented using hash tables or search trees
8.3.1 Storage allocation
- Stack Allocation
- Dope Vector
  - data structure that holds information about a data object's memory layout
  - has information about the lower bound and size of each dimension
  - the dope vector for a record indicates the offset of each field from the beginning of the record.
- C supports dynamic shape for both parameters and local variables
- to ensure a local object can be found using a known offset from the frame pointer, the stack frame is divided into two parts:
  - Fixed-size part: variables whose size is known statically go here
  - Variable-size part: objects whose size is not known until elaboration time go here; a pointer to the object and its dope vector go in the fixed-size part
- Heap Allocation
- if the number of dimensions of a dynamic array is known statically, the dope vector can be kept, along with a pointer to the data, on the stack. If the number of dimensions is not known, the dope vector is kept at the beginning of the heap block instead.
- Memory layout for multi-dimension array
- Column Major: columns kept together
- Row Major: rows kept together
- affects cache behavior: if elements are accessed in an order that results in cache misses, performance suffers.
- Fortran uses column-major order
- 2d array may be implemented as array of pointers.
- more space required
- allows rows of different sizes
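The two layouts differ only in the address computation. A sketch of both offset formulas (function and parameter names are mine):

```python
def row_major_offset(i, j, ncols, elem_size):
    # elements of a row are contiguous
    return (i * ncols + j) * elem_size

def col_major_offset(i, j, nrows, elem_size):
    # elements of a column are contiguous
    return (j * nrows + i) * elem_size

# element [1][2] of a 3x4 array of 4-byte elements
assert row_major_offset(1, 2, ncols=4, elem_size=4) == 24
assert col_major_offset(1, 2, nrows=3, elem_size=4) == 28
```

Walking a row in a row-major array touches consecutive addresses (cache-friendly); walking it in a column-major array strides by nrows * elem_size.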
- Summary

| Lifetime  | Shape                                           | Allocation region                     |
| global    | static                                          | static global memory                  |
| local     | static                                          | stack                                 |
| local     | bound at elaboration time                       | stack, with indirection (dope vector) |
| arbitrary | bound at elaboration time (size doesn't change) | heap                                  |
| arbitrary | arbitrary                                       | heap (with dope vector)               |

- In the third case, space is allocated at the end of the stack frame (after temporaries); the dope vector contains a pointer to the base address.
8.4 Strings
- Dynamic-length strings are easier to implement than other dynamic structures since they contain no references to other objects; copying is easy.
8.5 Sets
- Unordered, distinct elements
- can be implemented:
- using bit-vector: 1 if present 0 otherwise
- ordered list (binary search)
- unordered list (hash map)
8.6 Pointers and recursive types
- pointers are needed only in languages that use the value model
- Pointer is a high level concept: a reference to an object
- address is a low level concept
- Pointers are often implemented as address but not always
8.6.1 Value Model
A = B puts the value of B into A
8.6.2 Reference Model
A = B makes A refer to the same object that B refers to
8.6.3 Dangling Pointers
- Tombstone: an extra level of indirection, marked in such a way that it reflects whether the object is still alive
- Locks and Keys: attach a word to each pointer (the key) and to each object (the lock); the words must match for the pointer to be valid.
8.6.4 Garbage Collection
- Reference counting
- Smart Pointers (C++)
  - unique_ptr: only one owner; on assignment (move), ownership is transferred
  - shared_ptr: has a tombstone-like intermediate object (control block) which keeps the reference count, destructor, etc.
  - weak_ptr: similar to shared_ptr but does not increase the reference count
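The shared_ptr-style control block can be sketched as a toy class (entirely my own names; real smart pointers hook into destructors rather than explicit release calls):

```python
class Ref:
    """Toy reference-counted handle, a sketch of shared_ptr-style counting."""
    def __init__(self, obj):
        self.obj = obj
        self.count = 1          # the control block's counter

    def share(self):
        self.count += 1         # another owner
        return self

    def release(self):
        self.count -= 1
        if self.count == 0:
            self.obj = None     # "destroy" the object when the last owner leaves

r = Ref("payload")
r2 = r.share()
assert r.count == 2
r.release()
assert r2.obj == "payload"      # still alive: one owner remains
r2.release()
assert r.obj is None            # count hit zero, object reclaimed
```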
8.7 Lists
8.8 Files
9 Subroutine and Control Abstraction
- Side Effects: A construct has side effect if it influences the subsequent computation in any way other than returning a value
- Referential transparency: The value of an expression depends only on its referencing environment and not when it is evaluated.
- Expressions always produce a value and may or may not have side effects
- Statements are executed solely for their side effects.
- Boxing: automatic conversion between value and reference models
10 TODO Scott Supplement
- Error Recovery
11 λ Calculus
11.1 Functions
- mapping from domain to range
- can be denoted as a set of pairs: sqrt = {(x, y) ∈ ℝ × ℝ | x = y²}
- λ calculus is calculus of anonymous functions
- a method of representing functions
- a set of conversion rules for syntactically transforming them.
- λ x. * 2 x
  - λ x. * 2 x is a lambda abstraction
  - x is the bound variable
  - * 2 x is the body
- λ x. λ y. * (+ x y) 2
In scheme:
(lambda (x) (lambda (y) (* (+ x y) 2)))
- a function of x which returns a function of y which returns * (+ x y) 2
- λ y. * (+ x y)
- y is bound in this expression
- x is free
- λ x. λ y. * (+ x y) 2
- both x and y are bound in this expression
11.2 Pure λ calculus
- Everything is a function
- Example: Natural numbers can be obtained from:
- constant function 0
- successor function S(n)
- Example: 3 = S(S(S(0)))
11.3 Conventions to avoid parentheses
- Application associates to the left
- (f g h) → (f g) h
- Application has higher precedence than abstraction
- λ x. A B → λ x. (A B)
- Abbreviate nested expressions with a list
- λ x y z. E → (λ x. (λ y. (λ z. E)))
11.4 Evaluation of λ expression (Reduction)
11.4.1 Reduction Rules
- δ (delta)
- used for built in constant
- (+ 2 3) →δ 5
- β (beta)
- replace all occurrences of bound variable in the body with supplied parameter.
- ((λ x. E1) E2)
  - replace all occurrences of x with E2 in the body E1
- Ex: (λ x. * 2 x) 5 →β (* 2 5)
- this can be further evaluated →δ 10
- α (alpha)
- rename bound variable
- (λ x. E(x)) →α (λ y. E(y))
- Ex: (λ x. x + 1) →α (λ y. y + 1)
- η (eta)
- remove unnecessary abstractions
- (λ x. f) →η f, provided x does not occur free in f
- (λ x. + 2 3) →η (+ 2 3) →δ 5
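The β rule can be run mechanically. A small normal-order reducer sketch in Python (term encoding and names are mine; it assumes bound names are distinct, so no capture-avoiding renaming):

```python
# Terms: ('var', name) | ('lam', name, body) | ('app', f, a)

def subst(term, name, value):
    """Replace free occurrences of `name` in `term` with `value`."""
    kind = term[0]
    if kind == 'var':
        return value if term[1] == name else term
    if kind == 'lam':
        if term[1] == name:                 # name is rebound here; stop
            return term
        return ('lam', term[1], subst(term[2], name, value))
    return ('app', subst(term[1], name, value), subst(term[2], name, value))

def beta(term):
    """One leftmost-outermost β-step, or None if the term is in normal form."""
    if term[0] == 'app':
        f, a = term[1], term[2]
        if f[0] == 'lam':
            return subst(f[2], f[1], a)     # contract the redex itself
        r = beta(f)
        if r is not None:
            return ('app', r, a)
        r = beta(a)
        if r is not None:
            return ('app', f, r)
    elif term[0] == 'lam':
        r = beta(term[2])
        if r is not None:
            return ('lam', term[1], r)
    return None

def normalize(term, limit=100):
    while limit:
        nxt = beta(term)
        if nxt is None:
            return term
        term, limit = nxt, limit - 1
    raise RuntimeError("no normal form within step limit")

# (λx. λy. x) a b  →β  (λy. a) b  →β  a
true = ('lam', 'x', ('lam', 'y', ('var', 'x')))
t = ('app', ('app', true, ('var', 'a')), ('var', 'b'))
assert normalize(t) == ('var', 'a')
```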
11.4.2 Normal Form
A λ expression is in normal form if it cannot be further reduced using β, δ or η reduction.
- Can every λ expression be reduced to normal form?
- (λ x. x x) (λ x. x x)
- No
- Since λ expression can model never terminating expression, it is as powerful as Turing machine.
- Is there more than one way to reduce a λ expression?
- Yes
- ((λ x. + (λ y. 4) x) 3)
- →β (+ (λ y. 4) 3) →η (+ 4 3) →δ 7
- →η ((λ x. + 4 x) 3) →β (+ 4 3) →δ 7
- If there is more than one way to reduce, does each lead to the same normal form expression?
- Yes
- Is there a reduction strategy that guarantees that a normal form will be produced (if there is one)?
- Yes
- Standardization Theorem
- If an expression has a normal form, then a leftmost-outermost reduction at each step is guaranteed to reach that normal form.
- δ or β does not always reduce: ((λ x. λ y. y) ((λ z. z z)(λ z. z z)))
  - taking the outermost redex →β (λ y. y)
  - taking the innermost redex →β ((λ x. λ y. y) ((λ z. z z)(λ z. z z))) →β …
11.5 Reduction Strategies
11.5.1 Normal Order
- always choose leftmost outermost redex
- normal order will yield the normal form if the expression has one
- corresponds to call by name
- parameter is passed as unevaluated expression that is evaluated in the body of the function being executed each time the parameter is referenced.
- "always choose leftmost outermost redex": prefer function application over parameter evaluation.
11.5.2 Applicative Order
- always choose leftmost innermost redex
- corresponds to call by value
- an argument is evaluated before the function is applied
- "always choose leftmost innermost redex": choose parameter evaluation over function application.
11.5.3 Comparison
- ∞ computation
(define eternity (lambda () (eternity)))
(define zero (lambda (x) 0))
(zero (eternity))
- normal order: gives 0 immediately
- applicative order: in this case, may not return
- Argument Evaluation
fun sqr x = x * x;
sqr(sqr(sqr(2)))
(* normal order evaluation:
   = sqr(sqr(2)) * sqr(sqr(2))
   = sqr(2) * sqr(2) * sqr(sqr(2))
   ... redundant computations *)
(* applicative order evaluation:
   = sqr(sqr(4)) = sqr(16) = 256 *)
- Graph Reduction (Lazy Evaluation)
- has pointer to expression
- save the result of expression after evaluation
- can pass argument unevaluated – normal order
- can save against long/∞ computation
11.6 Booleans and condition functional representation
- true: define true = λ x. λ y. x
- false: define false = λ x. λ y. y
- if: define if = λ p. λ x. λ y. p x y
11.6.1 if true a b = a
if true a b
(λ p. λ x. λ y. p x y) true a b
→β (λ x. λ y. true x y) a b
→β (true a b)
replace true with its definition
(λ x. λ y. x) a b
→β a
11.6.2 if false a b = b
if false a b
(λ p. λ x. λ y. p x y) false a b
→β (false a b)
replace false with its definition
(λ x. λ y. y) a b
→β b
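These definitions translate directly into Python lambdas (the names true/false/if_ are mine; if_ avoids shadowing the keyword):

```python
# Church booleans, mirroring the definitions above.
true  = lambda x: lambda y: x          # λ x. λ y. x
false = lambda x: lambda y: y          # λ x. λ y. y
if_   = lambda p: lambda x: lambda y: p(x)(y)   # λ p. λ x. λ y. p x y

assert if_(true)("a")("b") == "a"      # if true a b = a
assert if_(false)("a")("b") == "b"     # if false a b = b
```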
11.7 Y Combinator
- defining recursive functions without define
11.7.1 Fixed Points
a fixed point of f is a value xfix that satisfies xfix = f (xfix)
In mathematics, a fixed point (also known as invariant point), of a function is an element in the domain of the function which is mapped to itself by the function. - wiki
- example: f = λ x. * x x
- fixed points: 0 and 1
- A function can have:
- no fixed points (f x = x + 1)
- a unique fixed point (f x = -x)
- a finite number of fixed points (f x = * x x)
- ∞ number of fixed points (f x = x)
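A fixed-point combinator makes "recursion without define" runnable. Since Python is applicative-order, a sketch must use the Z combinator (the strict variant of Y, with an extra η-style wrapper to delay evaluation); names are mine:

```python
# Z combinator: applicative-order fixed-point combinator.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# factorial defined without any self-reference in its own body
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))

assert fact(0) == 1
assert fact(5) == 120
```

The plain Y combinator, λ f. (λ x. f (x x)) (λ x. f (x x)), would loop forever here because the argument x x is evaluated before f is applied.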
12 JVM Overview
12.1 Types
int | I |
boolean | Z |
void | V |
String | Ljava/lang/String; |
String[] | [Ljava/lang/String; |
ArrayList | Ljava/util/ArrayList; |
… |
12.2 Method Signature
- '(' + Param Types + ')' + return Type
- Example: void main(String[] args) → ([Ljava/lang/String;)V
12.3 Variables
- Static: getstatic, putstatic
- Non-static (fields): getfield, putfield
- Local variables
  - stored in an array of local variables
  - accessed using a slot index
  - in a non-static method, this is stored at index 0, followed by the parameters, followed by the local variables
  - accessed using load and store
12.4 Method Invocation
- invokevirtual: for instance methods
- invokestatic: for static methods
- invokespecial: during object initialization (constructors)
13 Denotational Semantics
13.1 Semantics
- Semantics is expressed by defining:
- Semantic domain
- Semantic mapping
- M: L → S, where L is language (syntactic domain) and S is semantic domain
- In denotational semantics:
- Syntactic domain: Elements of abstract syntax
- Semantic domain: abstract mathematical objects
- Semantic functions: map elements of syntactic domain to elements of semantic domain
13.2 Ex: Simple Stack Machine
13.2.1 Abstract Syntax
Program     → Instruction
Instruction → Instruction Instruction
            | load Number
            | +
            | *
            | out
13.2.2 Syntactic Domain
- P: Program
- I: Instruction
- N: Number
13.2.3 Semantic Domain
Integer | |
Output | [Integer] |
State | [Integer] * [Integer] (Stack * Output) |
13.2.4 Semantic Functions
meaning | Program → [Integer] |
execute | Instruction → State → State |
value | Number → Integer |
- Function Definitions
execute[|load N|] (stack, output) = (value[|N|] :: stack, output)
execute[|+|] (x::y::xs, output)   = ((x + y)::xs, output)
execute[|*|] (x::y::xs, output)   = ((x * y)::xs, output)
execute[|out|] (x::xs, output)    = (xs, output @ [x])
execute[|I1 I2|]                  = execute[|I2|] . execute[|I1|]
meaning[|P|]                      = execute[|P|] ([], [])   // empty stack and output
- note: execute[|I1 I2|] = execute[|I2|] . execute[|I1|] is function composition
  - the output of execute[|I1|] is fed to execute[|I2|] as input
  - (f . g) x = f (g x)
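The semantic functions above can be executed directly: each instruction denotes a function State → State, and sequencing is composition. A Python sketch (tuple encoding of instructions is mine):

```python
# State = (stack, output); execute maps an instruction to a State -> State function.

def execute(instr):
    op, *args = instr
    def step(state):
        stack, output = state
        if op == "load":
            return ([args[0]] + stack, output)
        if op == "+":
            x, y, *rest = stack
            return ([x + y] + rest, output)
        if op == "*":
            x, y, *rest = stack
            return ([x * y] + rest, output)
        if op == "out":
            x, *rest = stack
            return (rest, output + [x])
        raise ValueError(op)
    return step

def meaning(program):
    state = ([], [])             # empty stack, empty output
    for instr in program:        # sequencing = composition of the step functions
        state = execute(instr)(state)
    return state[1]              # the program's meaning: its output

# load 2; load 3; +; out  →  output [5]
assert meaning([("load", 2), ("load", 3), ("+",), ("out",)]) == [5]
```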
13.3 Modeling Memory
Using simple imperative language
13.3.1 Abstract Syntax
Program    → Command
Command    → Command Command
           | Identifier := Expression
           | skip
Expression → Identifier
           | Number
           | Expression + Expression
13.3.2 Syntactic Domain
P | Program |
C | Command |
E | Expression |
N | Number |
Id | Identifier |
13.3.3 Semantic Domain
Integer | |
Store | Identifier → (Integer + undefined) |
- Memory is modeled as function store from identifier to values
13.3.4 Semantic functions
meaning | Program → Store |
evaluate | Expression → Store → Integer |
execute | Command → Store → Store |
value | Number → Integer |
empty | λ Id. undefined |
update | (Store, Id, Integer) → Store |
- there is no out, we'll see the effect as side effect on memory
store
- store is a function which maps each Id to Integer + undefined
- the base case for store is empty, where every identifier is mapped to undefined
  - sto0 = empty: every identifier → undefined
  - sto1: maps one id to an integer, everything else delegated to sto0
  - sto2: maps another id to an integer, everything else delegated to sto1, and so on…
update (sto_n, id, int_i) x = if (x = id) then int_i else sto_n x
(* (update (sto_n, id, val)) returns a function which maps id -> val;
   everything else is delegated to sto_n *)
evaluate
evaluate[|N|] sto       = value[|N|]
evaluate[|Id|] sto      = sto Id
evaluate[|E1 + E2|] sto = evaluate[|E1|] sto + evaluate[|E2|] sto
execute[|skip|] sto     = sto
execute[|Id := E|] sto  = update(sto, Id, evaluate[|E|] sto)
execute[|C1 C2|]        = execute[|C2|] . execute[|C1|]
meaning[|C|]            = execute[|C|] empty
13.3.5 Meaning
- the meaning of a program is a function
  - in the simple stack machine, execute was a function mapping Instruction → State → State
    - we start with an empty stack and output and end with entries in them
  - here there is no output; the meaning of the program is the final store
    - execute maps Command → Store → Store
    - we start with the empty store (the empty function) and end with a store which maps the identifiers to integers.
13.3.6 if statements
execute [|if E then C1 else C2|] sto =
    if (evaluate[|E|] sto) then execute[|C1|] sto
    else execute[|C2|] sto
13.3.7 while statements
execute [|while E do C|] sto =
    // unroll the loop
    if (evaluate[|E|] sto)
    then (execute [|while E do C|] . execute [|C|]) sto
    else sto

// the loop is a fixed point
let loop = λ s : Store .
        if (evaluate[|E|] s) then (loop (execute [|C|] s))   // ← recursively call loop
        else s
in loop end
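The whole store-based semantics, including the unrolled while rule, runs as a small Python interpreter. A sketch (the tuple encoding of commands and expressions is mine): a store is a function from identifiers to values, and update returns a new function shadowing one binding.

```python
def empty(_ident):
    return None        # stands for `undefined`

def update(sto, ident, val):
    # new store: maps `ident` to `val`, delegates everything else to `sto`
    return lambda x: val if x == ident else sto(x)

def evaluate(expr, sto):
    kind = expr[0]
    if kind == "num": return expr[1]
    if kind == "id":  return sto(expr[1])
    if kind == "+":   return evaluate(expr[1], sto) + evaluate(expr[2], sto)

def execute(cmd, sto):
    kind = cmd[0]
    if kind == "skip": return sto
    if kind == ":=":   return update(sto, cmd[1], evaluate(cmd[2], sto))
    if kind == "seq":  return execute(cmd[2], execute(cmd[1], sto))
    if kind == "if":
        return execute(cmd[2] if evaluate(cmd[1], sto) else cmd[3], sto)
    if kind == "while":
        if evaluate(cmd[1], sto):                    # unroll one iteration
            return execute(cmd, execute(cmd[2], sto))
        return sto

# x := 3; x := x + 4
prog = ("seq", (":=", "x", ("num", 3)),
               (":=", "x", ("+", ("id", "x"), ("num", 4))))
assert execute(prog, empty)("x") == 7
```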
13.3.8 Nested Scopes
- Environment and Location
Environment : identifier → location (address)
Store       : location → value
env         : identifier → (location + unbound)
store       : location → (storableType + unused + undefined)

- unused: not yet bound to an identifier
- undefined: no value stored yet
extendEnv
extendEnv (env, id0, loc) id1 = if (id0 = id1) then loc else env id1
update
update (sto, loc0, val) loc1 = if (loc1 = loc0) then val else sto loc1
14 Axiomatic Semantics
14.1 Assertions
- Assertion true is satisfied by any program state
- Assertion false is not satisfied by any program state
14.2 Axioms
14.2.1 Assignment Axiom
{ Q[V := E] and E is defined } V := E { Q }
Whatever is true about E beforehand, is true about V afterwards.
- Axioms can be used:
to calculate precondition
{ x = 0 } x := x + 1 { x = 1 }
// substitute the RHS of the assignment into the postcondition:
//   x = 1
//   x + 1 = 1
//   x = 0   ← the precondition, so the triple is valid
to calculate the RHS of statement
{ true } x := E { x = 4 }
// E must be 4
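The assignment axiom is just substitution into the postcondition. A toy sketch in Python (assertions are plain strings and substitution is textual, so this only works for simple whole-word variables; all names are mine):

```python
import re

def wp_assign(var, expr, post):
    """Weakest precondition of `var := expr` given postcondition `post`:
    replace whole-word occurrences of var in post with (expr)."""
    return re.sub(rf"\b{re.escape(var)}\b", f"({expr})", post)

# { ? } x := x + 1 { x = 1 }  →  weakest precondition (x + 1) = 1, i.e. x = 0
assert wp_assign("x", "x + 1", "x = 1") == "(x + 1) = 1"
```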
14.2.2 Stronger and Weaker assertions
- A ⊃ B
  - A implies B
  - A is stronger than B
  - A has more constraints; if A is true then B is too (the set of states satisfying A is a subset of those satisfying B).
- Eg: x > 0 ⊃ x ≥ 0
  - set of values allowed by x > 0 = {1, 2, 3, …}
  - set of values allowed by x ≥ 0 = {0, 1, 2, 3, …}
  - x > 0 has more constraints than x ≥ 0
  - x > 0 is stronger than x ≥ 0
- false is stronger than any P (false is satisfied by no state)
- any P is stronger than true (true is satisfied by every state)
- We want the precondition to be as weak as possible so that as many states as possible are valid for the statement.
- Assignment axiom gives weakest precondition for given post condition
14.2.3 Inference Rules
For an inference rule P / Q: if we know that everything above the line (P) is true, then we can conclude everything below the line (Q).
- Strengthen the precondition
P ⊃ Q, {Q} S {R}
----------------
    {P} S {R}
- P is stronger than Q
- S in a state satisfying Q will result in state satisfying R
- anywhere where P is satisfied, Q is also satisfied
- ∴ the final state will satisfy R
- this may seem counterintuitive if we think of S as a function mapping every precondition state to a postcondition state
  - S only guarantees that R holds when execution starts in some state satisfying the precondition.
- The set of states allowed by P is a subset of the states allowed by Q (since P has stronger constraints)
- Weaken the postcondition
P ⊃ R, {Q} S {P}
----------------
    {Q} S {R}
- P is stronger than R
- S gets from Q to P
- any state satisfying P will satisfy R
- ∴ S gets Q to R
- Sequence Rule
{P} S0 {Q}, {Q} S1 {R}
----------------------
    {P} S0 S1 {R}
14.2.4 Some facts
1. A and B ⊃ A
2. A ⊃ A or B
3. {false} S {P}
4. {P} S {true}
my intuition:
- "A and B" adds B's constraints on top of A's, so "A and B" is stronger than A
- any state satisfying A satisfies "A or B", but a state may satisfy "A or B" through B alone; so A has more constraints than "A or B"
- false has highest constraints
- true has no constraints
14.2.5 Application in programming languages
skip
P ⊃ Q
------------
{P} skip {Q}
if
{P and B} S0 {Q},  (P and ¬B) ⊃ Q
---------------------------------
      {P} if B then S0 {Q}
if-else
{P and B} S0 {Q},  {P and ¬B} S1 {Q}
------------------------------------
    {P} if B then S0 else S1 {Q}
note: if ≈ if-else with the else branch being skip
14.2.6 Some rules about ⊃
A ⊃ C
-----------
A and B ⊃ C

A or B ⊃ C
------------
A ⊃ C, B ⊃ C

A ⊃ C
----------
A ⊃ B or C

A ⊃ B, A ⊃ C
------------
A ⊃ B and C

A ⊃ B and C
------------
A ⊃ B, A ⊃ C

A ⊃ C, B ⊃ C
------------
A or B ⊃ C