Programming Language Principles

1 Intro

1.1 Specification of a programming language:

  • Syntax
  • Semantics

1.2 Spectrum

  • Imperative
  • Declarative:
    • Functional
    • Logic
    • query
  • Concurrent

1.3 Compilation process

Source code → incomplete machine code → complete machine code
                                  ↑
                                  Linker inserts system libraries

1.4 JVM

  • has support for OOP, exceptions, threads

1.5 Bootstrapping

Suppose we want to make a compiler for a new language L.

  1. Assuming we have a C compiler on our target machine, write a compiler for L in a simple subset of C.
  2. Once this compiler is complete, translate the compiler from C to L.
  3. Use the compiler from step 1 to compile the compiler written in L.
  4. From then on, use the compiler from step 3.

Start with a smaller subset of L implemented in C, then use the L compiler to implement the other features of L.

1.6 Preprocessor

  • Trivial manipulation of text using simple pattern matching.
  • Usually no error messages

1.7 Turing Complete

A programming language is said to be Turing complete if it can be shown that
it is computationally equivalent to a Turing machine.

  • In general, for an imperative language to be Turing complete, it needs:
    1. A form of conditional repetition or conditional jump (while, if + goto)
    2. A way to read and write some form of storage
  • For a λ calculus based functional language to be Turing complete, it needs:
    1. Ability to abstract over arguments (Eg: lambda abstraction, quotation)
    2. Ability to apply functions to arguments (Eg: reduction)

2 Lexical Analysis

2.1 Syntax v/s semantics

  • Syntax: Structure of the language
  • Semantics: meaning of the constructs

    // here the syntax is correct but the code is semantically incorrect: notice
    // the return type of `funFoo` and the predicate; comparing an int and a
    // string is meaningless.
    int funFoo(int c) {
      if (c > "string") {
        return "type mismatch";
      } else {
        return 2;
      }
    }
    

2.2 Regular Expression

2.2.1 ε

Empty String ""

2.2.2 Atom:

Eg: Σ = {a, b, c}

2.2.3 Concatenation

  • a ε = {a}
  • ab = {ab}

2.2.4 Alternative: |

  • a | b
  • (a | b) c = {ac, bc}

2.2.5 Kleene Closure: *

  • a* = {ε, a, aa, aaa, …}
  • a(b|c)*a = {aa, aba, aca…}

2.2.6 Derived notations:

  1. positive closure: +
    • a+ = aa*
  2. Not:
    • not(a) = Σ - {a}; we can derive this when the alphabet is finite, since we can enumerate all its symbols
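
These operators can be tried out directly with java.util.regex; a minimal sketch (class name illustrative):

    import java.util.regex.Pattern;

    public class RegexDemo {
        public static void main(String[] args) {
            System.out.println(Pattern.matches("a*", ""));       // true: Kleene closure accepts ε
            System.out.println(Pattern.matches("a*", "aaa"));    // true
            System.out.println(Pattern.matches("a+", ""));       // false: a+ = aa* needs at least one a
            System.out.println(Pattern.matches("a+", "aaa"));    // true
            System.out.println(Pattern.matches("(a|b)c", "bc")); // true: alternation, then concatenation
        }
    }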

2.2.7 Restrictions

  • Nothing can be defined recursively.
  • The extension to RE can be shown to be equivalent to the basic operations just by substitution.
  • disadvantage: recursive definitions cannot be expressed
    • nested constructs cannot be expressed. Eg: blocks, nested comments like /* /* … */ */
    • since nesting cannot be expressed using REs, nested comments are not allowed in Java/C.
      • comments are removed during lexical analysis

2.3 Scanners

  • tokenize using RE
  • 01 → will be tokenized as two digits with nothing between them
    • it will be marked as error by parser
  • responsible for:
    • tokenizing source
    • removing comments
    • dealing with pragmas
    • saving the text of identifiers, numbers, strings
    • saving source location for error messages
      • whitespace, including newlines, is usually discarded, so getting line or column number after scanning phase is not possible.

2.3.1 Implementation

  • Ad hoc
  • mechanical DFA
  • table-driven DFA
  1. In Java
    1. Token type
      class Token {
          Kind kind;
          int pos;     // -----|
          int lineNum; // -----useful for debugging
      }
      
    2. Use each state as case in switch (Eg: START, AFTERSTART)
    3. Enums:
      • they have types
      • they are ordered
      • can be used in switch case
    4. Handling comments
      • track pos and line number
    5. Reserved words v/s keyword
      • reserved word: cannot be used as identifier
      • keyword: a word with special meaning in the language
      • Eg: goto in Java cannot be used as an identifier (it is reserved) but is not a keyword (it does nothing)
  2. Handling reserved word
    • build identifier first then keyword
    • list all the reserved words, then check if the identifier is present in the list
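
    A minimal sketch of this strategy in Java (Token/Kind from above are assumed; names illustrative): build the identifier first, then reclassify it against the reserved list.

      import java.util.Set;

      class IdentScanner {
          static final Set<String> RESERVED = Set.of("if", "else", "while", "return");

          // scan a maximal identifier starting at pos
          static String scanIdent(String src, int pos) {
              int start = pos;
              while (pos < src.length()
                     && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
                  pos++;
              }
              return src.substring(start, pos);
          }

          // after building the identifier, check the reserved list
          static String kindOf(String text) {
              return RESERVED.contains(text) ? "RESERVED" : "IDENT";
          }
      }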

2.3.2 Whitespace

  • skip
  • used as separators
  • add another path on start which loops to start on whitespace.

2.3.3 Automatic DFA generation

  • DFA can be generated based on RE for parsing.
  • Regular Expression → ε-NFA → NFA → DFA

2.4 Informal Summary

  • Lexical analysis forms stream of tokens from stream of characters
  • Takes care of:
    • comments
    • line and column number for error reporting
    • text/strings

3 Parsing

3.1 Regular and Context Free language:

  • Any set of strings that can be defined in terms of the first three rules (Atom, Alternative, Kleene closure) is called a regular language
  • Any set of strings that can be defined if we add recursion is called context free language (CFL).

3.2 CFG: Context Free Grammar

  • RE is not recursive, need more powerful formalism

3.2.1 BNF

  • Σ set of alphabet (terminals)
  • N set of non terminals (variables)
  • P set of productions (substitution rules)
  • S start symbol

3.2.2 EBNF

In addition to BNF:

  • | or
    • multiple production from same variable
  • * 0 or more
  • + one or more

3.2.3 Example:

expr → factor | expr op factor
factor → ident | numlit | (expr)

3.3 Non Context Free Grammar

  • in a CFG, a production doesn't depend on its neighbors
  • in a non-context-free grammar, there can be more than one symbol on the lhs.
  • a CFG has only one symbol on the left hand side, so a production doesn't depend on its neighbors; hence the "context-freeness".
  • programming languages are not context free
    • we handle the non-context-free parts separately, using other techniques.

3.4 Parse tree

  • Scanner converts stream of characters into stream of tokens
  • Parser constructs parse tree from stream of tokens.

3.5 Ambiguous Grammar

  • more than one parse tree for given expression
  • if if else example

3.6 Grammar Class

  • we can create a parser for any CFG that runs in O(n³)
  • LL and LR run in linear time
    • both are used in compilers, LR being more common
  • LL v/s LR:
    • LL tries to predict the next symbol by peeking the upcoming token and deciding which production to choose
    • LR shifts tokens until it recognizes a right hand side, then reduces it to the left hand side

3.6.1 LL: Left to right, Left-most derivation

  • considered simpler than LR; accepts a smaller class of grammars than LR
  • usually written by hand (not generated); we will use this in class
  • also called top-down or predictive parser
  1. LL Implementation
    • start with root
    • Peek to check next token in the input string

3.6.2 LR: Left to right, Right-most derivation

  • usually generated
  • larger class than LL
  • also called bottom-up parser or shift-reduce parser
  • skim the topic from book, not covered in class
  • Subtypes of LR:
    • SLR
    • LALR
    • full LR

3.7 Grammar and parsing

3.8 Top Down parser

3.8.1 LL(K): look ahead k tokens

3.8.2 LL(*): can look ahead as many tokens as needed

3.8.3 First Set

  • First of a terminal is the set containing that terminal

    S → AB
    A → x | y
    B → z | w
    First(A) = {x, y}
    First(B) = {z, w}
    First(S) = {x, y}
    
    In general:
    S → A | B
    First(S) = First(A) ∪ First(B)
    
  • for LL(1), the First sets of the alternatives for the same non-terminal must be disjoint
  1. Including ε in First
    S → Ay
    A → x | ε
    First(S) = {x, y}
    

3.8.4 Follow Set

S → A B
A → x | y
B → z | w

Follow(A) = {z, w} = First(B)
Follow(B) = {}

Follow(x) = Follow(A)    (x ends the production A → x)
  • to compute Follow of a symbol, look at what comes after it in the RHS; if the symbol ends the production, add Follow of the left hand side

3.8.5 Predict Set

Predict set is defined for a production.

EPS(α) = (α →* ε)
Predict(A → α) = First(α) ∪ (if EPS(α) then Follow(A) else {})

  • If α can derive ε, then the predict set consists of First(α) together with Follow(A).
  • Predict set tell us if grammar is LL(1)
  • and which production to choose for next input.
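
A small worked example, in the notation used above:

    S → A y
    A → x | ε

    Predict(A → x) = First(x) = {x}
    Predict(A → ε) = Follow(A) = {y}

    The predict sets are disjoint, so this fragment is LL(1): on seeing x choose A → x, on seeing y choose A → ε.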

3.8.6 LL(1) Rule

The predict sets of all productions with the same left side should be disjoint.
This is a necessary and sufficient condition for a grammar to be LL(1).

3.8.7 How recursive descent parsing works

  • determine which production to use based on the predict set
  • If the predict sets of productions with the same left side are not disjoint, then there are multiple candidate productions for the next input symbol (though not necessarily for the complete string).

3.8.8 Left recursive grammar

  • LL parsing doesn't work on Left recursive grammar
  1. Left recursive removal
    A → A σ | β
    can be replaced with
    A → β B
    B → σ B | ε
    
    how to: expand the original grammar to get a feel for the language (strings generated)
    A → A σ | β
    will give: β, β σ, β σ σ, β σ σ σ...
    note that strings start with β and have several trailing σ
    so rewrite the grammar to represent this without left recursion:
    A → β B
    B → σ B | ε
    

3.8.9 associativity and precedence

  1. Associativity
    • operator group left to right or right to left.
    • Eg: 10 - 4 - 5 means (10 - 4) - 5 rather than 10 - (4 - 5)
  2. Precedence
    • some operators group more tightly than others
    • Eg: 10 - 4 × 3 means 10 - (4 × 3) = 10 - 12, rather than (10 - 4) × 3 = 6 × 3
  3. Inbuilt associativity and precedence
    • we can make grammar in such a way that associativity and precedence is handled automatically.

      expr → term | expr add_op term
      term → factor | term mult_op factor
      factor → id | number | (expr)
      add_op → + | -
      mult_op → × | ÷
      
    • in the above example, note how we have introduced term to make sure mult_op groups more tightly than add_op
    • since the operators are left associative, expr (which has the potential to produce more add_ops) is always to the left of add_op
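
    A minimal recursive descent sketch of this grammar in Java (single-character tokens; only * for mult_op; names illustrative). Rewriting the left recursion as loops also yields left associativity:

      class ExprParser {
          private final String in;
          private int pos = 0;
          ExprParser(String in) { this.in = in; }

          private char peek() { return pos < in.length() ? in.charAt(pos) : '$'; }
          private char next() { return in.charAt(pos++); }

          // expr → term (add_op term)*
          int expr() {
              int v = term();
              while (peek() == '+' || peek() == '-')
                  v = (next() == '+') ? v + term() : v - term();   // groups left to right
              return v;
          }

          // term → factor (mult_op factor)*
          int term() {
              int v = factor();
              while (peek() == '*') { next(); v *= factor(); }
              return v;
          }

          // factor → number | ( expr )
          int factor() {
              if (peek() == '(') { next(); int v = expr(); next(); /* ')' */ return v; }
              return next() - '0';   // single-digit number
          }
      }

    new ExprParser("9-3-2").expr() returns 4, matching the left-to-right grouping above.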

3.9 Parser Construction

  • we have an LL(1) grammar; construct a recursive descent parser

3.9.1 Grammar Simplification

  • useless symbols
  • redundant stuff

3.9.2 Handling errors

  1. Halt
    • Halt and print error messages
    • Easy to implement
    • inconvenient for the programmer
  2. Recovery
    1. panic mode
      • define safe symbols that delimit "clean" point
      • delete tokens until safe symbol
    2. phrase recovery
      • different set of safe symbols in different context.
    3. context specific
    4. exception based

3.10 Parsers

3.10.1 Backtracking Algorithms

  • BFS
    • worst case exponential time
    • worst case exponential space
    • can parse every grammar
  • DFS
    • worst case exponential time
    • worst case linear space
    • cannot parse left recursive grammars
  • Both BFS and DFS use backtracking: if a production does not apply, backtrack and try the next production.

3.10.2 Predictive Algorithms

  • Based on remaining input, predict which production to use
  • Faster than backtracking: runs in linear time
  • Not all grammars can be parsed using a predictive algorithm
  • How to predict?
    • use lookahead
    • increasing the number of lookahead tokens makes the parser more powerful but also more complicated
  1. LL(1)
    • Left to right scan of token
    • Leftmost derivation
    • (1): 1 token of lookahead
    1. Parse Table
      1. E → int
      2. E → (E Op E)
      3. Op → +
      4. Op → *
      
          int  (  )  +  *
      E    1   2
      Op             3  4
      1. Example: parsing (int + (int + int)); left column = rule applied, middle = stack, right = remaining input
            E $             (int + (int + int)) $
         2  (E Op E) $      (int + (int + int)) $
            E Op E) $       int + (int + int)) $
         1  int Op E) $     int + (int + int)) $
            Op E) $         + (int + int)) $
         3  + E) $          + (int + int)) $
            E) $            (int + int)) $
         2  (E Op E)) $     (int + int)) $
            E Op E)) $      int + int)) $
         1  int Op E)) $    int + int)) $
            Op E)) $        + int)) $
         3  + E)) $         + int)) $
            E)) $           int)) $
         1  int)) $         int)) $
            $               $
         (a table-driven sketch of this parser appears at the end of this section)
    2. Construction of Parse table
      1. Intuition

        The next character should uniquely identify a production, so we should pick a production that ultimately starts with that character

      2. First Sets

        First(A) is the set of terminals that can appear at the start of a string derived from A.

        • we can use First sets to construct the parsing table.
        • if rules contain ε productions, then we have to look past non-terminals that can vanish and take the First of what follows:
          • For A → α, if every symbol of α can derive ε: First(A) = First(A) ∪ {ε}
          • For A → α t ω, where First of α contains ε and t is a terminal: First(A) = First(A) ∪ {t}
          • For A → α B ω, where B is a non-terminal and First of α contains ε: First(A) = First(A) ∪ First(B)
      3. Follow Set

        Follow(A) is the set of all terminal symbols that can immediately follow a
        substring derived from A in a sequence derived from the start symbol S.

        Num → Sign Digits
        Sign → + | - | ε
        
        Sign has an ε production, so First(Num) will contain First(Digits)
        
        • The follow set represents the set of terminals that might come after a given non terminal.
          • If B → α A ω then: Follow(A) = Follow(A) ∪ First*(ω)
          • If B → α A ω and ε ∈ First*(ω) then: Follow(A) = Follow(A) ∪ Follow(B)
          • In other words, the follow of a symbol is whatever can come right after it, so that if the symbol derives ε the follow can still appear. If the next symbol can derive ε, also add the follow of the parent.
      4. Final Construction
        • For each rule A → ω, for each terminal t ∈ First*(ω), set T[A, t] = ω (at A, if you see t, use the production with right side ω)
        • For each rule A → ω, if ε ∈ First*(ω): set T[A, t] = ω for each t ∈ Follow(A) (at A, if ω can vanish and t can follow A, take the ε path and let A disappear)
    3. Limitation of LL(1)
      • Left recursion
        • can be eliminated

          A → A ω | α
          rewrite as
          A → α B
          B → ω B | ε
          
          Explanation:
          A can produce α ω, α ω ω ..., starts with α
          second rule for repeating ω and ε for stopping
          
      • Left factoring

        E → T
        E → T + E
        T → int
        T → (E)
        
        rewrite as:
        E → TY
        Y → + E | ε
        ...
        
    4. Condition for LL(1):

      A grammar G is LL(1) iff for any productions A → ω1 and A → ω2 the sets:
      First(ω1 Follow(A)) and
      First(ω2 Follow(A))
      are disjoint

      This condition is equivalent to saying that there are no conflicts in the table
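
Tying this together, a minimal table-driven sketch of the LL(1) loop for the example grammar above (E → int | (E Op E), Op → + | *) in Java; the parse table is encoded directly in rhs():

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    class LL1 {
        // the parse table: T[non-terminal, lookahead] → right hand side
        static List<String> rhs(String nonTerm, String tok) {
            switch (nonTerm + "/" + tok) {
                case "E/int": return List.of("int");                     // rule 1
                case "E/(":   return List.of("(", "E", "Op", "E", ")");  // rule 2
                case "Op/+":  return List.of("+");                       // rule 3
                case "Op/*":  return List.of("*");                       // rule 4
                default: throw new IllegalStateException("parse error");
            }
        }

        static boolean parse(List<String> tokens) {
            Deque<String> stack = new ArrayDeque<>();
            stack.push("$"); stack.push("E");
            int i = 0;
            while (!stack.peek().equals("$")) {
                String top = stack.pop(), tok = tokens.get(i);
                if (top.equals("E") || top.equals("Op")) {     // non-terminal: predict
                    List<String> r = rhs(top, tok);
                    for (int j = r.size() - 1; j >= 0; j--) stack.push(r.get(j));
                } else if (top.equals(tok)) {                  // terminal: match
                    i++;
                } else return false;
            }
            return i == tokens.size();
        }
        // parse(List.of("(", "int", "+", "int", ")")) == true
    }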

4 Binding Times

  • Language design time: control flow constructs, primitive data types…
  • Language implementation time: IO, size of stack, heap etc
  • program writing time: variable name
  • compile time: layout of static data
  • Link time: the compiler compiles different modules at different times; bindings to variables in other modules are finalized at link time.
  • Load time
  • run time: address of variable.

4.1 Binding and visibility

void testFoo() {
  int arg = 100;
  for (int arg = 0; arg < 10; arg++) {
    ; // arg here loops from 0 to 9
    // binding for arg = 100 is active here but not visible
  }
  // here arg = 100
}

4.2 Storage Allocation Mechanism

4.2.1 Static Allocation

  • Global variables
  • literals
  • tables generated for helping runtime: debugging, dynamic type checking etc
  1. Languages without recursion
    • some languages, such as Fortran (before Fortran 90), did not have recursion.
    • as a result, only one invocation of subroutine will be active at a time.
    • space for the variables can be allocated statically and avoid runtime overhead of creation and destruction.
  2. Constants
    • constants can be allocated space during compile time, since their value can be shared across recursive calls.
    • Some constants value cannot be determined until runtime. Such elaboration-time constants are allocated on stack.

4.2.2 Stack Allocation

  • if language allows recursion, local variables cannot be allocated statically.
  • each instance of subroutine has its own frame or activation record
    • each frame contains:
      • arguments
      • return values
      • local variables
      • temporaries: space for implicit temporary variables; e.g., for (2 + 2) + 3, store 4 in a temporary, then compute 4 + 3
      • Bookkeeping information:
        • subroutine's return address
        • reference to stack frame of caller (dynamic link)
  • offset of objects from frame base is known at compile time.
  • Stack grows downwards (towards lower addresses) in most language implementations.
    • Local variables, temporaries and bookkeeping information typically have negative offsets from the frame pointer
    • Arguments and return values typically have positive offsets; they reside in the caller's frame
  • even if a language doesn't support recursion, stack-based allocation of local variables is better than static allocation: not all subroutines are active at the same time, so stack-based allocation takes less memory.

4.2.3 Heap Allocation

  • Issues:
    • Internal fragmentation
    • External fragmentation
  • Allocation strategy:
    • First fit: less search time, but may waste more space
    • Best fit: more search time (it looks for the tightest block), tries to waste less space
    • List of preallocated standard-size blocks: Fibonacci heap and buddy system
    • compacting requires moving already allocated blocks, which requires updating existing pointers.
  • Garbage Collection
    • Implicit
      • unreachable object
      • on exiting scope - C++ RAII

4.3 Scope

  • the textual region of the program in which the binding is active is its scope
  • On entering a new scope, bindings for local objects are created, and bindings for some outer (global) objects become hidden.
  • On scope exit, we destroy the local bindings and reactivate the bindings for the outer objects.
  • This does not require any code to be executed → can be determined statically → statically scoped or lexically scoped
  • Set of active binding is called current referencing environment

4.3.1 Static Scope

  • there can be nested scope.
  • search the current scope; if the name is not found, search the enclosing scope. Repeat until the global scope is reached.
  • if the name is not found even in the global scope, report an error.
  1. Hole in scope
    • A name-to-object binding that is hidden by a nested declaration of the same name is said to have a hole in its scope.
      • or we can say the closer object has eclipsed the outer definition.
    • Some languages may have qualifier to access outer scope variables.
  2. Objects in lexically surrounding scopes
    • the frames of surrounding subroutines are already active on the stack.
    1. Static Chain
      • a link to the stack frame of the lexically surrounding block

        (blockA) {
            (blockB) {
                (blockC) {
                }
                (blockD) {
                }
            }
        }
        
      • in the above example, suppose A calls B, B calls D, and D calls C. Then during execution of C and D the scope of blockB must be active, and the stack will look like:

        blockC
        blockD
        blockB
        blockA

        • C and D have static links to B
        • B has a static link to A

4.3.2 Declaration Order

  • can we use a variable in a block before it is declared?
  1. Forward Reference

    An attempt to use a name before its declaration

    • Mixed strategy used by languages
      • In Java/C++, class members are visible throughout the class, even before they are defined.
      • Python, Scheme and Modula-2 do it differently. Refer to book page 133.
  2. Declaration and Definition
    • to implement mutually recursive data structures (example: manager and employee), C/C++ distinguish declarations from definitions.
    • A declaration gives the compiler enough information about a name and its scope.
    • A definition gives the actual implementation details.
  3. Redeclaration
    • most dynamic languages overwrite the older value.
    • In ML, however, the old meaning of the variable remains accessible to functions defined before the redeclaration.

      let x = 1;
      let f y = x + y;
      f 2 (* gives 3 *)
      let x = 3
      f 2 (* we might expect 5, but the output is 3 (1 + 2) *)
      
    • Languages like Scheme store the bindings of names in known locations, so earlier functions see the new value; in ML, the program is compiled such that earlier functions still access the old meaning of the variable.
    1. TODO does ML simply substitute the value of variable while defining the function?

      Maybe because it is immutable.

4.3.3 Dynamic Scoping

  • The bindings depends on flow of control during runtime.

4.3.4 Implementing Scope

  • A symbol table is a dictionary that maps names to information about them.
  1. Static Scope:
    • never delete anything from symbol table.
    • Most variations on static scoping can be handled by augmenting a basic dictionary style symbol table with enter_scope and leave_scope operations to keep track of visibility.
    1. LeBlanc and Cook Table
      • Each scope, as it is encountered, is assigned a serial number.
        • Eg: the outermost scope (the one containing predefined identifiers) is given number 0
        • The scope containing programmer-defined global variables is given 1, and so on.
      • All names are entered into a single large hash table
        • Each entry contains the symbol name, category (variable, constant, type, procedure, field name, parameter), scope number and other details.
      • The symbol table has a scope stack that indicates the scopes composing the current referencing environment.
      • To look up a name, we scan the hash table; for each matching entry, we check the scope stack to see whether that entry's scope is visible (see the sketch after this list).
  2. Dynamic Scope:
    1. A-list (Association list)
      • a simple list of name-value pairs.
      • When used to implement dynamic scoping it functions as stack: new declarations are pushed as they are encountered and popped as the end of the scope in which they appear.
    2. Central Reference table
      • maintains a mapping from names to current meanings.
      • new entries are added at the front of the chain (the bucket list used when keys collide)
      • lookup returns the entry at the front of the chain.
      • on scope exit, delete the entry at the front.
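
A simplified sketch in the spirit of the LeBlanc-Cook table (Java; names and structure illustrative): one big hash table of entries tagged with scope numbers, plus a scope stack searched innermost-first.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class SymbolTable {
        static class Entry {
            final String name, info;   // info: category, type, ...
            final int scope;
            Entry(String name, int scope, String info) {
                this.name = name; this.scope = scope; this.info = info;
            }
        }

        private final Map<String, List<Entry>> table = new HashMap<>();         // one big hash table
        private final Deque<Integer> scopeStack = new ArrayDeque<>(List.of(0)); // 0 = predefined
        private int nextScope = 1;

        void enterScope() { scopeStack.push(nextScope++); }
        void leaveScope() { scopeStack.pop(); }          // entries are never deleted

        void declare(String name, String info) {
            table.computeIfAbsent(name, k -> new ArrayList<>())
                 .add(new Entry(name, scopeStack.peek(), info));
        }

        Entry lookup(String name) {
            for (int s : scopeStack)                      // innermost scope first
                for (Entry e : table.getOrDefault(name, List.of()))
                    if (e.scope == s) return e;
            return null;                                  // undeclared
        }
    }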

4.3.5 Shallow and Deep binding (passing subroutine as parameter)

def sub1:
    def sub2:
        print x # prints 3 (deep), 4 (shallow)
    def sub3:
        x = 3
        sub4(sub2) # <- deep binding created here
    def sub4 (subx):
        x = 4
        subx() # <- shallow binding created here
    x = 1
    sub3()
  1. Deep Binding
    • referencing environment is the one when reference is created
    • in the example, binding x = 3 is created when we pass sub2
  2. Shallow Binding
    • the referencing environment is the one in effect when the routine is called
    • in the example, the binding x = 4 is in effect when we call subx
  3. Deep Binding and Closure
    • Note that Deep binding has reference to the binding environment.
    • if we pass this subroutine as a parameter (second-class use), the subroutine can access the environment deep in the stack.
    • but this can still be implemented using the stack and pointers. No need to allocate space on the heap.
    • the issue arises when we return the subroutine and the stack unwinds. In this case, to have a closure we need unlimited extent, and thus heap allocation for the subroutine's environment. See 4.3.6.4

4.3.6 First Class and unlimited extent

  1. First Class
    • can be passed as parameter
    • can be returned from subroutine
    • can be assigned to a variable
  2. Second Class
    • can be passed as parameter
    • can not be returned from subroutine
    • can not be assigned to variable
  3. Third Class
    • cannot be passed as parameter
    • cannot be returned from subroutine
    • cannot be assigned to variable
  4. Misc: Unlimited Extent
    • see 4.3.5.3: unlimited extent is needed only for first-class functions; second-class functions can do with stack allocation.
    • if we return subroutine and that subroutine has access to local variable in surrounding subroutine,
      • we cannot use stack based allocation for local variables
      • the allocation must be on heap
      • handled by garbage collection
    • C does not have nested subroutines
      • can have purely stack-based allocation for local variables
    • Scheme needs heap-based allocation for local variables
    • Imperative languages avoid first-class functions to avoid heap-based allocation of local variables.
    1. Function Object/Object Closure/Functor
      • to implement closure in imperative language such as Java, we can store the context in object's field.
      • example:

        interface IntFunc {
          public int call(int i);
        }
        class PlusX implements IntFunc {
          final int x;
          PlusX(int n) { x = n; }
          public int call(int i) { return i + x; }
        }
        

        Here we are saving the closure context in the object field int x
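
        A hypothetical call site:

          IntFunc plus2 = new PlusX(2);   // the "environment" x = 2 is captured in the field
          int r = plus2.call(3);          // r == 5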

4.4 Alias

Multiple name bound to same object.

  • Overloading: the correct function is called according to the type and number of parameters passed.
    • can be determined at compile time.
  • Polymorphism: the correct function (method) is invoked according to the type of the instance (parent or child).
    • determined at runtime.
  • Coercion: automatic conversion of type by the compiler
    • Eg: +
      • if + is defined only for int and a float is supplied, coercion takes place
      • if a separate + is defined for float (by overloading), the + for float is called.

5 Semantic Analysis

  • Syntax of language can be defined using context-free grammar
  • Semantics cannot be defined using CFG.
    • some semantics can be defined, but it is very complicated to do so.
  • We describe semantics using an Attribute Grammar (AG), which is like pseudocode.
  • Semantics analysis can be divided into:
    • Static: check for semantic error at compile time
    • Dynamic: generate code which will check for error during execution time.

5.1 Static Analysis

  • Compile time algorithm which predict runtime behavior.
  • Example: type checking

5.2 Semantics:

  • Operational Semantics
  • Denotational Semantics
  • Axiomatic Semantics
  • Attribute Grammar

5.3 Attribute Grammar

  • a formal way of defining attributes for the productions of a formal grammar, associating these attributes with values.
  • Based on context free grammar
    • Symbols in grammar are associated with attributes
    • Production rules in grammar are associated with semantic rules indicating how the attribute values are computed.
    • May include conditions that impose constraint on legal sentences.

5.3.1 Synthesized

  • values are calculated (synthesized) only in productions in which their symbol appears on left hand side.
  • An attribute grammar in which all attributes are synthesized is said to be S-attributed.
  • The value of a parent is calculated from its children: the left side is calculated from the right side.
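
A minimal Java sketch of a purely synthesized attribute (classes illustrative): each node computes its val bottom-up from its children, as in E → E1 + T with E.val = E1.val + T.val.

    abstract class Node { abstract int val(); }   // val: the synthesized attribute

    class Num extends Node {                      // leaf: val comes from the token
        final int n;
        Num(int n) { this.n = n; }
        int val() { return n; }
    }

    class Add extends Node {                      // E → E1 + T: E.val = E1.val + T.val
        final Node left, right;
        Add(Node l, Node r) { left = l; right = r; }
        int val() { return left.val() + right.val(); }
    }

    // new Add(new Num(2), new Num(3)).val() == 5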

5.3.2 Inherited

  • values are calculated when their symbol is on the right hand side of the current production.
  • the value is calculated from the parent and left siblings (hence the statement above that the symbol is on the RHS).

5.3.3 Attribute Flow

  • attribute grammar is well defined if its rules determine a unique set of values for the attributes of every possible parse tree.
  • non circular: if no attribute ever depends on itself.
  • An algorithm that decorates parse tree by invoking the rules of an attribute grammar in an order consistent with the tree's attribute flow is called a translation scheme.
    • most obvious: make repeated passes over the tree, invoking each semantic function whose arguments have all been defined, stopping when no value changes.
    • dynamic: similar to topological sort
    • static: based on the grammar
      • S-attributed grammars form the simplest of such class.
        • attribute flow in an S-attributed grammar is strictly bottom-up
        • can be combined with bottom-up parsing.
        • can be implemented on the fly during LR parse.
      • L-attributed grammars can be evaluated by visiting the nodes of the parse tree in a single left-to-right, depth-first traversal
        • the same order in which nodes are visited during a top-down parse
        • can be implemented on the fly during an LL parse.
      • L-attributed grammars can be defined more formally as:
        1. Each synthesized attribute of a left hand side symbol depends only on that symbol's own inherited attribute or on attributes (synthesized or inherited) of the production's right hand side symbols.
          • either from parent or synthesize through children
        2. each inherited attribute of a right hand side symbol depends only on inherited attributes of left hand side symbol or on attributes (synthesized or inherited) of symbols to its left in the right-hand-side.
          • inherited attribute (not synthesized, top down) is either inherited through parent or left sibling.
      • Every S-attributed grammar is also an L-attributed grammar
        • but the reverse is not true: S-attributed grammars do not permit the initialization of (inherited) attributes on the right hand side, so there are L-attributed grammars which are not S-attributed.

5.4 Are generics just sugar coating over instanceof?

  • no, compiler does static type checking

6 Control Flow

6.1 Precedence and associativity

  • The basic arithmetic operators always associate left to right
    • 9 - 3 - 2 is 4 not 8.

6.2 Assignment

  • side effect: if a construct influences subsequent computation in any way other than by returning a value for use in the surrounding context.
  • Expressions: always produce a value and may or may not have side effect
  • statement: executed solely for their side effects and return no useful value.

6.3 References and Values

6.3.1 value model

  • l-value: an expression that denotes a location; used on the left hand side of an assignment
  • r-value: an expression that denotes a value; used on the right hand side

6.3.2 reference model

  • no real distinction between l-values and r-values: every variable is a reference
  • a variable is not a named container for a value, it is a named reference to a value
  • Example:

    b := 2;
    c := b;
    a := b + c;
    
    Value Model                              Reference Model
    we put 2 in b                            we let b refer to 2
    copy it into c                           let c refer to it also
    read these values, add them together,    pass the references to the + operator
    and then place the result in a           and then let a refer to the result
  • in reference model there is only one 2 and all variable having value 2 refer to it.
  • in reference model every variable is l-value

6.3.3 Boxing

  • Java collections expect Object, primitive datatype are not descendants of Objects so we need to box them before using them in collections.
  • Recent versions of Java do autoboxing, but (pre-generics) collections hand back Object, so we need to cast and unbox manually.
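
A minimal sketch:

    import java.util.ArrayList;
    import java.util.List;

    class BoxingDemo {
        public static void main(String[] args) {
            List<Integer> xs = new ArrayList<>();
            xs.add(42);           // autoboxing: int → Integer
            int x = xs.get(0);    // auto-unboxing; pre-generics code got back
                                  // Object and needed an explicit cast
            System.out.println(x);
        }
    }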

6.3.4 Initialization

  • In C, statically allocated variables are initialized to 0.
  • Java and C# initialize fields of class-typed objects to null.
  1. Dynamic Check
    • can assign the IEEE standard NaN value to uninitialized numbers and check at runtime.
    • use of a (signaling) NaN results in a hardware interrupt.
  2. Static Check
    • check whether every code path assigns the variable a value before its use.

6.3.5 Ordering within expressions

a - f(b) - c * d;
f(a, g(b), h(c));
  • we do not know which operand or argument will be evaluated first
  • matters if the function calls have side effects
  • may affect compiler optimization of register allocation.
  • since the compiler can optimize register allocation, the order of evaluation is left to the implementation in some languages.
    • except in Java and C#, where it is left to right.

6.3.6 Mathematical Identities

c = b - a + d
// rearrange as
c = d - a + b
  • mathematically equivalent, but the rearrangement can overflow differently in limited-precision arithmetic.

6.3.7 Switch Case

  • jump table:
    • can consume a lot of space if the set of labels is sparse or spans large value ranges.
  • sequential testing
    • used if the number of case arms is small
  • hashing
    • used if the set of labels is large, with missing values but no large ranges
  • binary search
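
The JVM makes these strategies concrete: javac typically compiles a switch with dense labels to a tableswitch (jump table) and one with sparse labels to a lookupswitch (search by key). A sketch:

    class SwitchDemo {
        // dense labels → typically a tableswitch (O(1) jump table)
        static int dense(int c) {
            switch (c) { case 1: return 10; case 2: return 20; case 3: return 30; default: return 0; }
        }
        // sparse labels → typically a lookupswitch (search through sorted keys)
        static int sparse(int c) {
            switch (c) { case 1: return 10; case 1000: return 20; case 1_000_000: return 30; default: return 0; }
        }
    }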

6.3.8 Applicative and Normal Order Evaluation

  • Applicative Order: evaluating before call
  • Normal Order: evaluating when (if) needed

6.3.9 Non determinacy

  • in which the choice between alternatives (i.e., between control paths) is deliberately unspecified.

7 Type Systems

  • bits are untyped; hardware makes no attempt to keep track of which interpretation corresponds to which location.
    • bits can be interpreted as:
      • instructions
      • address
      • float, int, char literal
    • assembly is untyped
    • higher level languages keep track of types to provide contextual information and error reporting.
  • subroutines have types in languages in which they are 1st or 2nd class values.
  • strongly typed: if language prohibits application of any operation to any object that is not intended to support that operation.
  • statically typed: if language is strongly typed and type checking is done at compilation stage.
  • Dynamic type checking: checking at runtime.

7.1 Meaning of Type

  • from denotational point of view, type is simply a set of values.
  • from structural point of view, type is either one of a small collection of built in types (integer, character, Boolean etc) or composite type created.

7.2 Polymorphism

  • "having multiple form"
  • Parametric Polymorphism: the code takes the type as a parameter, either explicitly or implicitly.
    • Explicit Parametric Polymorphism: generics/templates
  • Subtype Polymorphism: the code is designed to work with values of some type T, but the programmer can define additional types as extensions of T, and the code works with these types too.
    • Usually found in OO Languages.

7.3 Orthogonality

  • Trivial Type: to characterize a statement that is executed for its side effect and has no useful value, languages provide a trivial type.
    • void in C, unit in ML
    • In Pascal, we would have needed to catch the return value in a dummy variable.

7.4 Enumeration

  • In C, enums are just typedef-ed integers
  • In Pascal, enums and ints are of different types; mixing them is not allowed.

7.5 Sub-range type

  • Why use a subrange instead of int:
    • helps in documentation
    • the compiler can emit a dynamic check that the value is in range
    • storage can be optimized based on the range.

7.6 Composite Types

7.6.1 Records (structs)

  • a record consists of collection of fields, each of which belong to a simpler type
  • similar to mathematical tuple, which corresponds to Cartesian product of types of the fields.

7.6.2 Variant Records (Union)

  • similar to Records, but only one field is valid at any given time.
  • it is the disjoint union of field types.

7.6.3 Arrays

  • function that maps index to member of component type.

7.6.4 Sets

  • Collection of distinct elements of base type.

7.6.5 Pointers

  • l-values
  • Pointers are often but not always implemented as address.
  • often used to define recursive data type

7.6.6 Lists

  • a list is defined recursively as either an empty list or a pair consisting of a head element and a reference to a sublist.

7.6.7 Files

  • similar to arrays, maps index to component type.

7.7 Type Checking

Set of rules for:

  • Type Equivalence: when are types of 2 values equal
  • Type Compatibility: When can a value of a given type be used in given context
  • Type Inference: how to determine type of expression

7.8 Type Equivalence

7.8.1 Structural Equivalence

  • based on content of type definition
  • two types are the same if they consist of the same components in the same order.
  • to determine whether two types are structurally equivalent, a compiler can expand their definitions by recursively replacing any embedded type names with their respective definitions.

7.8.2 Name Equivalence

  • based on lexical occurrence of type definitions
  • each definition introduces new type
  • based on the assumption that if the programmer writes two type definitions, they are probably meant to represent different types.
  • strict name equivalence: aliased types are considered distinct
  • loose name equivalence: aliased types are considered equivalent
  • Ada gives both options

    subtype elemType is integer; -- loose type equivalence
    type person is new string;   -- strict type equivalence, new makes difference
    

7.9 Type Conversion and Cast

depending on the types involved, conversion may or may not require code to be executed during runtime:

  1. Types are structurally equivalent but the language uses name equivalence: same low-level representation, no code needed.
  2. Types have different sets of values, but the intersecting values are represented in the same way (e.g., one may be a subrange of the other). If the value assigned may lie outside the expected set, code must be executed.
  3. Types have different low-level representations, but we can define a correspondence between their values (e.g., int and float). Runtime code is needed for checking and conversion.
    • a check might be needed to see if the value is within bounds

7.9.1 Nonconverting Type Cast

  • changing the type without changing the underlying representation
    • i.e., interpret the bytes of a value as if they were another type
  • called a nonconverting type cast or type pun
  • reinterpret_cast in C++
  • In C, this can be done using r = *((float *) &m);: take the address of m (an int), cast it to a pointer to float, then dereference it.
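
Java exposes a safe version of the same pun through Float.intBitsToFloat / Float.floatToRawIntBits; a minimal sketch:

    class PunDemo {
        public static void main(String[] args) {
            int m = 0x40490fdb;                     // bit pattern of the float value of π
            float r = Float.intBitsToFloat(m);      // reinterpret the bits; no conversion
            int back = Float.floatToRawIntBits(r);  // and back again
            System.out.println(r + " " + Integer.toHexString(back));   // 3.1415927 40490fdb
        }
    }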

7.9.2 Coercion

  • implicit conversion to expected type by language.

7.10 Universal Reference type

  • void *
  • Object

7.11 Type inference

  • e.g., the sum of two subrange values may be out of the subrange's bounds, so the type of the sum is inferred as the base type (int)

7.12 Parametric Polymorphism

7.12.1 Generic Implementation

  • C++ template implementation is purely static.
    • compiler creates a separate copy of the code for every instance.
  • Java generics share one copy of the code for all types at run time (type erasure)
    • the compiler inserts the casts
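
A small sketch of erasure in action:

    import java.util.ArrayList;
    import java.util.List;

    class ErasureDemo {
        public static void main(String[] args) {
            List<String> xs = new ArrayList<>();
            xs.add("hi");
            String s = xs.get(0);   // compiler inserts the cast: (String) xs.get(0)
            // one runtime class serves every instantiation:
            System.out.println(new ArrayList<String>().getClass()
                               == new ArrayList<Integer>().getClass());   // true
            System.out.println(s);
        }
    }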

7.13 Equality Testing and assignment

  • Deep Copy: copy object and all other object pointed by it recursively
  • shallow copy: copy object and references as it is.

8 Composite Types

8.1 Records (Structures)

  • heterogeneous types
  • alignment restriction may lead to holes in memory layout
    • can use pragma pack to give hint to compiler to optimize for space rather than speed
      • packing may lead to multi-instruction sequence to retrieve pieces of field from memory and then re-assemble it.
    • languages support assigning one record to another, but often not testing records for equality
    • can use a block copy to assign
    • but a block compare may return an incorrect result due to garbage in the holes
    • easy workaround: sort fields by size.
      • may be done by the compiler, transparently to the programmer
      • reorganization is also helpful if we can group frequently accessed fields together so that they lie in the same cache line.
      • C/C++ allocate fields in the order defined by the programmer

8.2 Unions

  • the overall size of a union is that of its largest member
  • only one of the members is active at any time
  • unions allow the same set of bytes to be interpreted in different ways at different times

8.3 Arrays

  • homogeneous
  • mapping index to component type
  • some languages allow non discrete index
    • can be implemented using hash map or search trees

8.3.1 Storage allocation

  1. Stack Allocation
    1. Dope Vector
      • data structure holding information about a data object and its memory layout
      • has information about lower bound of each dimension
      • size
      • the dope vector for record indicates the offset of each field from beginning of the record.
      • C supports dynamic shape for both parameters and local variables
        • to ensure local objects can be found using a known offset from the frame pointer, the stack frame is divided into two parts:
          1. Fixed-size part: variables whose size is known statically go here
          2. Variable-size part: objects whose size is not known until elaboration time go here; a pointer to each such object and its dope vector go in the fixed-size part
  2. Heap Allocation
    • if the number of dimensions of a dynamic array is known statically, the dope vector can be kept, with the pointer to the data, on the stack. If the number of dimensions is not known, the dope vector is kept at the beginning of the heap block instead.
    1. Memory layout for multi-dimension array
      • Column Major: columns kept together
      • Row Major: rows kept together
      • affects cache behavior: if elements are accessed in an order that causes cache misses, performance suffers.
      • Fortran is column major
      • 2d array may be implemented as array of pointers.
        • more space required
        • allows rows of different sizes
  3. Summary

     Lifetime    Shape                        Allocation region
     global      static                       global memory
     local       static                       stack
     local       bound at elaboration time    stack, with indirection (dope vector)
     arbitrary   bound at elaboration time    heap (size doesn't change)
     arbitrary   arbitrary                    heap

     • In the third case, space is allocated at the end of the stack frame (after temporaries); the dope vector contains a pointer to the base address.

8.4 Strings

  • Dynamic strings are easier to implement than other dynamic arrays since they hold no references to other objects; copying is easy.

8.5 Sets

  • Unordered, distinct elements
  • can be implemented:
    • using bit-vector: 1 if present 0 otherwise
    • ordered list (binary search)
    • unordered list (hash map)

8.6 Pointers and recursive types

  • pointers are needed only in languages that use the value model
  • Pointer is a high level concept: a reference to an object
  • address is a low level concept
  • Pointers are often implemented as address but not always

8.6.1 Value Model

  • A = B puts the value of B into A

8.6.2 Reference Model

  • A = B, makes A refer to the same object as B

8.6.3 Dangling Pointers

  • Tombstone: an extra level of indirection, marked in such a way that it reflects whether the object is still alive
  • Locks and Keys: add a word to each pointer and each object; the words must match for the pointer to be valid.

8.6.4 Garbage Collection

  • Reference counting
  • Smart Pointers
    • unique_ptr: only one owner; when assigned (moved), ownership is transferred
    • shared_ptr: has a tombstone-like intermediate object which keeps the reference count, deleter, etc.
    • weak_ptr: similar to shared_ptr but without increasing the reference count
  1. Algorithms
    • mark and sweep
    • Pointer reversal
    • Stop and copy

8.7 Lists

8.8 Files

9 Subroutine and Control Abstraction

  • Side Effects: A construct has side effect if it influences the subsequent computation in any way other than returning a value
  • Referential transparency: The value of an expression depends only on its referencing environment and not when it is evaluated.
  • Expression always produce a value, may or may not have side effects
  • statement: executed solely for side effects.
  • Boxing: automatic conversion between value and reference models

10 TODO Scott Supplement

  • Error Recovery

11 λ Calculus

11.1 Functions

  • mapping from domain to range
    • can be denoted as a set of pairs: sqrt = {(x, y) ∈ ℝ × ℝ | x = y²}
  • λ calculus is calculus of anonymous functions
    • a method of representing functions
    • a set of conversion rules for syntactically transforming them.
  • λ x. * 2 x
    • λ x. * 2 x is a lambda abstraction
    • x is the bound variable
    • * 2 x is the body
  • λ x. λ y. * (+ x y) 2
    • In scheme:

      (lambda (x)
        (lambda (y)
          (* (+ x y) 2)))
      
    • a function of x which returns a function of y which returns * (+ x y) 2
    • λ y. * (+ x y) 2
      • y is bound in this expression
      • x is free
    • λ x. λ y. * (+ x y) 2
      • both x and y are bound in this expression

11.2 Pure λ calculus

  • Everything is a function
  • Example: Natural numbers can be obtained from:
    • constant function 0
    • successor function S(n)
    • Example: 3 = S(S(S(0)))

11.3 Conventions to avoid parentheses

  • Application associates to the left
    • (f g h) → (f g) h
  • Application has higher precedence than abstraction
    • λ x. A B → λ x. (A B)
  • Abbreviate nested expressions with a list
    • λ x y z. E → (λ x. (λ y. (λ z. E)))

11.4 Evaluation of λ expression (Reduction)

11.4.1 Reduction Rules

  1. δ (delta)
    • used for built in constant
    • (+ 2 3) →δ 5
  2. β (beta)
    • replace all occurrences of the bound variable in the body with the supplied parameter.
    • ((λ x. E1) E2)
      • replace all occurrences of x in E1 with E2
    • Ex: (λ x. * 2 x) 5 →β (* 2 5)
      • this can be further evaluated →δ 10
  3. α (alpha)
    • rename bound variable
    • (λ x. E(x)) →α (λ y. E(y))
    • Ex: (λ x. x + 1) →α (λ y. y + 1)
  4. η (eta)
    • remove unnecessary abstractions
    • (λ x. f) →η f, provided x does not occur free in f
    • (λ x. + 2 3) →η (+ 2 3) →δ 5

11.4.2 Normal Form

A λ expression is in normal form if it cannot be further reduced using β, δ or η reduction.

  1. Can every λ expression be reduced to normal form
    • (λ x. x x) (λ x. x x)
    • No
    • Since λ expressions can model non-terminating computations, the λ calculus is as powerful as a Turing machine.
    1. Church's thesis

      The effective computable functions on the positive integers are precisely those functions definable in the pure λ calculus (and computable by Turing machine)

  2. Is there more than one way to reduce a λ expression
    • Yes
    • ((λ x. + (λ y. 4) x) 3)
      1. β (+ (λ y. 4) 3) →η (+ 4 3) →δ 7
      2. η ((λ x. + 4 x) 3) →β (+ 4 3) →δ 7
  3. if there is more than one way to reduce, does each lead to the same normal form expression?
    • Yes
    1. Church-Rosser Theorem
      • If an expression can be reduced in two different ways to two normal forms, then these normal forms are the same up to α conversion
      • Also called the diamond property
        • this does not imply that if a normal form exists, any redex order will reach it. (See example below)
  4. Is there a reduction strategy that guarantees that a normal form will be produced (if there is one)
    • Yes
    1. Standardization Theorem
      • If an expression has a normal form, then a leftmost outermost reduction at each step is guaranteed to reach that normal form.
      • δ or β reduction does not always terminate: ((λ x. λ y. y) ((λ z. z z)(λ z. z z)))
        1. taking the outermost redex →β (λ y. y)
        2. taking the innermost redex →β ((λ x. λ y. y) ((λ z. z z)(λ z. z z))) →β … (loops forever)

11.5 Reduction Strategies

11.5.1 Normal Order

  • always choose leftmost outermost redex
  • normal order will yield the normal form if the expression has one
  • corresponds to call by name
    • parameter is passed as unevaluated expression that is evaluated in the body of the function being executed each time the parameter is referenced.
    • "always choose leftmost outermost redex": prefer function application over parameter evaluation.

11.5.2 Applicative Order

  • always choose leftmost innermost redex
  • corresponds to call by value
    • an argument is evaluated before the function is applied
    • "always choose leftmost innermost redex": choose parameter evaluation over function application.

11.5.3 Comparison

  1. ∞ computation
    (define eternity
      (λ ()
        (eternity)))
    
    (define zero
      (λ (x)
        0))
    
    (zero (eternity))
    
    • normal order: gives 0 immediately (the argument is never evaluated)
    • applicative order: the argument is evaluated first, so this never returns
  2. Argument Evaluation
    fun sqr x = x * x;
    sqr(sqr(sqr(2)))
    
    (* in case of normal order evaluation
    = sqr(sqr(2)) * sqr(sqr(2))
    = sqr(2) * sqr(2) * sqr(sqr(2))
    ...
    redundant computations
    *)
    
    (* in case of applicative order reduction
    = sqr(sqr(4))
    = sqr(16)
    = 256
    *)
    
  3. Graph Reduction (Lazy Evaluation)
    • keep a pointer to the expression
    • save the result of the expression after evaluation
    • can pass arguments unevaluated, as in normal order
    • can guard against long/∞ computations

11.6 Booleans and condition functional representation

  • true: define true = λ x. λ y. x
  • false: define false = λ x. λ y. y
  • if: define if = λ p. λ x. λ y. p x y

11.6.1 if true a b = a

if true a b
(λ p. λ x. λ y. p x y) true a b
β (λ x. λ y. true x y) a b
β (true a b)
replace true with its definition
(λ x. λ y. x) a b
β a

11.6.2 if false a b = b

if false a b
(λ p. λ x. λ y. p x y) false a b
β (false a b)
replace false with its definition
(λ x. λ y. y) a b
β b

11.7 Y Combinator

  • defining recursive functions without define

11.7.1 Fixed Points

  • a fixed point of f is a value xfix that satisfies xfix = f (xfix)

    In mathematics, a fixed point (also known as invariant point), of a function is an element in the domain of the function which is mapped to itself by the function. - wiki

  • example: f = λ x. * x x
    • fixed points: 0 and 1
  • A function can have:
    1. no fixed points (f x = x + 1)
    2. a unique fixed point (f x = -x)
    3. a finite number of fixed points (f x = * x x)
    4. ∞ number of fixed points (f x = x)
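
The standard definition of the Y combinator (stated for reference; not derived in these notes):

    Y = λ f. (λ x. f (x x)) (λ x. f (x x))

    Y F →β (λ x. F (x x)) (λ x. F (x x))
        →β F ((λ x. F (x x)) (λ x. F (x x)))
        =  F (Y F)

    so Y F is a fixed point of F, which gives recursion without define.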

12 JVM Overview

12.1 Types

int        I
boolean    Z
void       V
String     Ljava/lang/String;
String[]   [Ljava/lang/String;
ArrayList  Ljava/util/ArrayList;

12.2 Method Signature

  • '(' + Param Types + ')' + return Type
  • Example: void main (String[] args) = ([Ljava/lang/String;)V

12.3 Variables

  • Static
    • getstatic
    • putstatic
  • Non-static (fields)
    • getfield
    • putfield
  • local variables
    • stored in an array of local variables
    • accessed using a slot index
      • in a non-static method, this is stored at index 0
      • followed by parameters
      • followed by local variables
    • accessed using load and store

12.4 Method Invocation

  1. invokevirtual
    • instance methods (dispatched on the runtime class)
  2. invokestatic
    • for static methods
  3. invokespecial
    • constructors (<init>, during object initialization), private methods and super calls
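
Roughly what javac emits in each case (a sketch; the exact bytecode may differ):

    class Calls {
        int f(int x) { return x + 1; }
        static int g(int x) { return x + 1; }

        void demo() {
            f(1);          // aload_0; iconst_1; invokevirtual Calls.f:(I)I
            g(1);          // iconst_1; invokestatic Calls.g:(I)I
            new Calls();   // new Calls; dup; invokespecial Calls.<init>:()V
        }
    }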

13 Denotational Semantics

13.1 Semantics

  • Semantics is expressed by defining:
    1. Semantic domain
    2. Semantic mapping
      • M: L → S, where L is language (syntactic domain) and S is semantic domain
  • In denotational semantics:
    • Syntactic domain: Elements of abstract syntax
    • Semantic domain: abstract mathematical objects
    • Semantic functions: map elements of syntactic domain to elements of semantic domain

13.2 Ex: Simple Stack Machine

13.2.1 Abstract Syntax

Program → Instruction
Instruction → Instruction Instruction
            | load Number
            | +
            | *
            | out

13.2.2 Syntactic Domain

  • P: Program
  • I: Instruction
  • N: Number

13.2.3 Semantic Domain

Integer  
Output [Integer]
State [Integer] * [Integer] (Stack * Output)

13.2.4 Semantic Functions

meaning Program → [Integer]
execute Instruction → State → State
value Number → Integer
  1. Function Definitions
    execute[|load N|] (stack, output) = (value[|N|] :: stack, output)
    execute[|+|] (x::y::xs, output) = ((x + y)::xs, output)
    execute[|*|] (x::y::xs, output) = ((x * y)::xs, output)
    execute[|out|] (x::xs, output) = (xs, output @ [x])
    execute[|I1 I2|] = execute[|I2|] . execute[|I1|]
    meaning[|P|] = execute[|P|] ([], [])  // empty stack and output
    
    • note: execute[|I1 I2|] = execute[|I2|] . execute[|I1|] is function composition
      • the output of execute[|I1|] is fed to execute[|I2|] as input
      • (f . g) x = f (g x)
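
A minimal executable reading of these semantic functions in Java (names illustrative): State is a (stack, output) pair, and each instruction maps State → State.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    class StackMachine {
        final Deque<Integer> stack = new ArrayDeque<>();
        final List<Integer> output = new ArrayList<>();

        void load(int n) { stack.push(n); }                         // execute[|load N|]
        void add()      { stack.push(stack.pop() + stack.pop()); }  // execute[|+|]
        void mul()      { stack.push(stack.pop() * stack.pop()); }  // execute[|*|]
        void out()      { output.add(stack.pop()); }                // execute[|out|]

        public static void main(String[] args) {
            StackMachine m = new StackMachine();   // meaning: start from ([], [])
            m.load(2); m.load(3); m.add(); m.out();
            System.out.println(m.output);          // [5]
        }
    }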

13.3 Modeling Memory

Using simple imperative language

13.3.1 Abstract Syntax

Program → Command
Command → Command Command
        | Identifier := Expression
        | skip
Expression → Identifier
           | Number
           | Expression + Expression

13.3.2 Syntactic Domain

P Program
C Command
E Expression
N Number
Id Identifier

13.3.3 Semantic Domain

Integer  
Store Identifier → (Integer + undefined)
  • Memory is modeled as a function (store) from identifiers to values

13.3.4 Semantic functions

meaning Program → Store
evaluate Expression → Store → Integer
execute Command → Store → Store
value Number → Integer
empty λ Id. undefined
update (Store, Id, Integer) → Store
  • there is no out; we'll see the effect as a side effect on memory
  1. store
    • store is a function which maps Id to (Integer + undefined)
    • Base case for store is empty, where everything is mapped to undefined
    • sto0: maps everything to undefined
    • sto1: maps one id to an integer, delegates the rest to sto0
    • sto2: maps another id to an integer, delegates the rest to sto1, and so on…

      update (sto_n, id, int_i) x = if (x = id)
                                      then int_i
                                      else sto_n x
      
      (*
      here,
      (update (sto_n, id, val)) returns a function
        which maps id -> val, else
        delegated to sto_n
      *)
      
  2. evaluate
    evaluate[|N|] sto = value[|N|]
    evaluate[|Id|] sto = sto(Id)
    evaluate[|E1 + E2|] sto = evaluate[|E1|] sto + evaluate[|E2|] sto
    execute[|skip|] sto = sto
    execute[|Id := E|] sto = update(sto, Id, evaluate[|E|] sto)
    execute[|C1 C2|] = execute[|C2|] . execute[|C1|]
    meaning[|C|] = execute[|C|] empty
    

13.3.5 Meaning

  • the meaning of program is a function
    • in the simple stack machine, execute was function mapping Instruction → State → State
      • we start with empty stack and output and end with entries in them
    • here there is no output; the meaning of the program is the final store (execute maps Command → Store → Store)
      • we start with the empty store (the empty function) and end with a store which maps the identifiers to integers.

13.3.6 if statements

execute [|if E then C1 else C2|] sto =
  if (evaluate[|E|] sto)
  then execute[|C1|] sto
  else execute[|C2|] sto

13.3.7 while statements

execute [|while E do C|] sto =
// unroll the loop
  if (evaluate[|E|] sto)
  then execute [|while E do C|] (execute[|C|] sto)
  else sto

// loop is a fixed point
let loop = λ s:state .
             if (evaluate[|E|] s)
             then (loop (execute [|C|] s)) // ← recursively call loop
             else s
in loop
end

13.3.8 Nested Scopes

  1. Environment and Location
    Environment  identifier → (location + unbound)      (a location is an address)
    Store        location → (storableType + unused + undefined)
    • unused: not bound to any identifier
    • undefined: bound, but no value stored yet
  2. extendEnv

    extendEnv (env, id0, loc) id1 = if (id0 = id1) then loc else env id1

  3. update store

    update (sto, loc0, val) loc1 = if loc1 = loc0 then val else sto loc1

14 Axiomatic Semantics

14.1 Assertions

  • Assertion true is satisfied by any program state
  • Assertion false is not satisfied by any program state

14.2 Axioms

14.2.1 Assignment Axiom

{ Q[V := E] and E is defined }
V := E
{ Q }

Whatever is true about E beforehand, is true about V afterwards.

  • Axioms can be used:
    • to calculate precondition

      { x = 0 }
      x := x + 1
      { x = 1 }
      
      // substitute the assignment's RHS into the post condition
      x = 1          // post condition
      x + 1 = 1      // after substituting x := x + 1
      x = 0          // ← the precondition
      // statement is valid
      
    • to calculate the RHS of statement

      { true }
      x = E
      { x = 4 }
      
      E = 4
      

14.2.2 Stronger and Weaker assertions

  • A ⊃ B (A implies B)
    • A is stronger than B
    • A has more constraints; if A is true then B is too.
  • Eg: x > 0 ⊃ x ≥ 0
    • set of values allowed by x > 0 = {1, 2, 3 …}
    • set of values allowed by x ≥ 0 = {0, 1, 2, 3 …}
    • x > 0 has more constraints than x ≥ 0
    • x > 0 is stronger than x ≥ 0
  • false is stronger than any P (false is satisfied by no state)
  • any P is stronger than true (true is satisfied by every state)
  • We want the weakest precondition, so that as many states as possible are valid for the statement.
  • Assignment axiom gives weakest precondition for given post condition

14.2.3 Inference Rules

for P/Q: if we know that whatever is above the line (P) is true, then we can conclude that everything below the line (Q) is true.

  1. Strengthen the precondition
    P ⊃ Q, {Q} S {R}
    ----------------
    {P} S {R}
    
    • P is stronger than Q
    • S in a state satisfying Q will result in state satisfying R
    • anywhere where P is satisfied, Q is also satisfied
    • ∴ the final state will satisfy R
      • this may seem counterintuitive if we consider S to be a function that must map every precondition state to a postcondition state
      • S is just a function that reaches R when started in any state satisfying the precondition
      • the set of states allowed by P is a subset of the states allowed by Q (since P has stronger constraints)
  2. Weaken the postcondition
    P ⊃ R, {Q} S {P}
    ----------------
    {Q} S {R}
    
    • P is stronger than R
    • S gets from Q to P
    • any state satisfying P will satisfy R
    • ∴ S gets Q to R
  3. Sequence Rule
    {P} S0 {Q}, {Q} S1 {R}
    ----------------------
    {P} S0 S1 {R}
    

14.2.4 Some facts

1. A and B ⊃ A
2. A ⊃ A or B
3. {false} S {P}
4. {P} S {true}

my intuition:

  1. B either adds some constraints or none, so A and B is at least as strong as A
  2. any state satisfying A also satisfies A or B, so A has more constraints than A or B
  3. false has highest constraints
  4. true has no constraints

14.2.5 Application in programming languages

  • skip

    {P}
    skip
    {Q}
    
    P ⊃ Q
    -----
    {P} skip {Q}
    
  • if

    {P and B} S0 {Q}, {P and ¬ B} ⊃ Q
    ---------------------------------
    {P} if B then S0 {Q}
    
  • if-else

    {P and B} S0 {Q}, {P and ¬ B} S1 {Q}
    ------------------------------------
    {P} if B then S0 else S1 {Q}
    

    note: if ≡ if-else with else being skip

14.2.6 Some rules about ⊃

A ⊃ C
-----------
A and B ⊃ C

A or B ⊃ C
------------
A ⊃ C, B ⊃ C

A ⊃ C
----------
A ⊃ B or C

A ⊃ B, A ⊃ C
------------
A ⊃ B and C

A ⊃ B and C
------------
A ⊃ B, A ⊃ C

A ⊃ C, B ⊃ C
------------
A or B ⊃ C

15 OOP

15.1 Abstraction

15.2 Encapsulation

15.3 Inheritance

15.4 Dynamic method binding

Author: Anurag Peshne

Created: 2017-06-06 Tue 16:46