Language
This section describes the language accepted by LunarML.
Standard ML ‘97
LunarML supports the full SML ‘97 language, including the module system. Some features conform to Successor ML rather than SML ‘97.
Successor ML
LunarML supports some Successor ML features:
☑ Monomorphic non-exhaustive bindings
☑ Simplified recursive value bindings
SML ‘97-compatible ordering for type variables is also supported:
val <tyvarseq> rec <valbind>
☑ Abstype as derived form
☑ Fixed manifest type specifications
☑ Abolish sequenced type realizations
and type is allowed by default; you can use the "allowWhereAndType false" annotation to disable it.
☑ Line comments
☑ Extended literal syntax
☑ Underscores (e.g. 3.1415_9265, 0xffff_ffff)
☑ Binary notation (0b, 0wb)
☑ Eight hex digits in text (\Uxxxxxxxx)
☑ Record punning
☑ Record extension
☑ Record update
☐ Conjunctive patterns
☐ Disjunctive patterns
☐ Nested matches
☐ Pattern guards
☑ Optional bars and semicolons
☐ Optional else branch
☑ Do declarations
☑ Withtype in signatures
Most of the implemented features are enabled by default; some features can be disabled by MLB annotations.
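The following is a minimal sketch of a few of the checked features, assuming LunarML accepts the concrete syntax proposed by Successor ML for them. All identifiers (mask, p, q, r, describe) are made up for illustration.

```sml
(*) Line comment: runs from "(*)" to the end of the line.

(* Extended literal syntax: binary notation and underscores. *)
val mask = 0b1111_0000            (* 240 *)

(* Record punning, extension, and update. *)
val x = 1
val y = 2
val p = {x, y}                    (* punning: same as {x = x, y = y} *)
val q = {z = 3, ... = p}          (* extension: {x = 1, y = 2, z = 3} *)
val r = {q where x = 10}          (* update: {x = 10, y = 2, z = 3} *)

(* Optional leading bar in a match. *)
fun describe n =
  case n of
  | 0 => "zero"
  | _ => "nonzero"

(* Do declaration: evaluate an expression of type unit for its effect. *)
do print (describe (#x r) ^ "\n")
```

None of these forms is valid SML ‘97, so a compiler without the corresponding Successor ML support is expected to reject this code.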
Language extensions
Vector expressions and patterns
Prior art:
\(\mathtt{\#[}\mathit{exp}_0\mathtt{,} \mathit{exp}_1\mathtt{,} \ldots\mathtt{,} \mathit{exp}_{n-1}\mathtt{]}\) is equivalent to \(\mathtt{Vector.fromList [}\mathit{exp}_0\mathtt{,} \mathit{exp}_1\mathtt{,} \ldots\mathtt{,} \mathit{exp}_{n-1}\mathtt{]}\) (with built-in value of Vector.fromList) except that the vector expression is non-expansive if every \(\mathit{exp}_i\) is non-expansive.
\(\mathtt{\#[}\mathit{pat}_0\mathtt{,} \mathit{pat}_1\mathtt{,} \ldots\mathtt{,} \mathit{pat}_{n-1}\mathtt{]}\) matches a vector v if Vector.length v = n and for each i, \(\mathit{pat}_i\) matches Vector.sub (v, i) (with built-in values of Vector.length and Vector.sub).
\(\mathtt{\#[}\mathit{pat}_0\mathtt{,} \mathit{pat}_1\mathtt{,} \ldots\mathtt{,} \mathit{pat}_{n-1}\mathtt{, ...]}\) (the last ... is verbatim) matches a vector v if Vector.length v >= n and for each i, \(\mathit{pat}_i\) matches Vector.sub (v, i) (with built-in values of Vector.length and Vector.sub).
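The three forms above can be sketched as follows. The function names sum3 and firstOr are hypothetical, chosen only for this illustration.

```sml
(* Vector expression; equivalent to Vector.fromList [1, 2, 3]. *)
val v = #[1, 2, 3]

(* Every element is non-expansive, so this binding can be generalized
   to 'a list vector. *)
val empties = #[nil, nil]

(* Fixed-length pattern: matches only vectors of length 3. *)
fun sum3 (#[a, b, c]) = a + b + c
  | sum3 _ = 0

(* Open pattern: matches any vector of length >= 1 and binds element 0. *)
fun firstOr default (#[x, ...]) = x
  | firstOr default _ = default
```

Because the vector expression is non-expansive here, empties can be used at different element types, which would not be possible with a plain call to Vector.fromList under the value restriction.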
Hexadecimal floating-point constants
Examples: 0x1p~1022, 0x1.ffff_ffff_ffff_f
The syntax of hexadecimal floating-point constants is:
<hexadecimal-integer-constant> ::= '~'? '0' 'w'? 'x' <hexadecimal-digit-sequence>
<hexadecimal-floating-point-constant> ::= '~'? '0x' <hexadecimal-digit-sequence> (<binary-exponent-part> | '.' <hexadecimal-digit-sequence> <binary-exponent-part>?)
<hexadecimal-digit-sequence> ::= <hexadecimal-digit> ('_'* <hexadecimal-digit>)*
<binary-exponent-part> ::= [pP] '~'? <digit> ('_'* <digit>)*
In short: the (binary) exponent part is optional when a fractional part is present, and the tilde (~) is used as the negation symbol.
Hexadecimal floating-point constants must be exactly representable in the target type; 0x1.0000_0000_0000_01p0 and 0x1p1024 are examples of invalid constants (assuming the type is Real64.real).
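As a short illustration of the grammar, the following sketch shows a few valid constants together with their values. It assumes that the default real type is Real64.real; the binding names are made up.

```sml
val half       = 0x1p~1       (* 1 * 2^~1 = 0.5 *)
val three      = 0x1.8p1      (* 1.5 * 2^1 = 3.0 *)
val oneAndHalf = 0x1.8        (* exponent part omitted: 1.5 *)
val minusTen   = ~0x1.4p3     (* ~(1.25 * 2^3) = ~10.0 *)
val minNormal  = 0x1p~1022    (* smallest positive normal Real64 value *)
```

Each of these values has an exact binary representation, which is why the constants are accepted.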
UTF-encoded escape sequence in text constants
The \u{} escape sequence allows you to embed a Unicode scalar value in a text constant.
The compiler encodes the character in UTF-8/16/32, depending on the type.
The scalar value is written in hexadecimal; for example, \u{3B1} or \u{3b1} denotes U+03B1 GREEK SMALL LETTER ALPHA.
Underscores are not allowed between hexadecimal digits.
When \u{} is used in a character constant, the character must be encoded as a single code unit in the corresponding UTF.
Examples:
("\u{3042}" : string) = "\227\129\130"
("\u{80}" : string) = "\194\128"
("\u{1F600}" : string) = "\240\159\152\128"
if WideChar.maxOrd = 255 then
("\u{1F600}" : WideString.string) = "\240\159\152\128"
else if WideChar.maxOrd = 65535 then
("\u{1F600}" : WideString.string) = "\uD83D\uDE00"
else if WideChar.maxOrd = 1114111 then
("\u{1F600}" : WideString.string) = "\U0001F600"
Examples of invalid constants:
#"\u{80}" : char (* equivalent to #"\194\128" *)
if WideChar.maxOrd = 65535 then
#"\u{10000}" : WideChar.char (* equivalent to #"\uD800\uDC00" *)
Infix operators with surrounding dots
Status: experimental.
To use this extension, the allowInfixingDot annotation is needed in the MLB file:
ann "allowInfixingDot true" in
...
end
infexp_1 .longvid. infexp_2 is equivalent to op longvid (infexp_1, infexp_2).
pat_1 .longvid. pat_2 is equivalent to op longvid (pat_1, pat_2).
The precedence and associativity of .strid1...stridN.vid. can be controlled by an infix(r) <prec> .vid. declaration.
If no such declaration is found, infix 0 is assumed.
Examples:
0wxdead .Word.andb. 0wxbeef; (* equivalent to Word.andb (0wxdead, 0wxbeef) *)
fun a .foo. b = print (a ^ ", " ^ b ^ "\n"); (* equivalent to fun foo (a, b) = ... *)
infix 7 .*.
infix 6 .+.
val x = 1 .Int.*. 2 .Int.+. 3 .Int.*. 4 (* equivalent to Int.+ (Int.* (1, 2), Int.* (3, 4)) *)
The standard library $(SML_LIB)/basis/basis.mlb contains the following declarations:
infix 7 .*. ./. .div. .mod. .quot. .rem.
infix 6 .+. .-. .^.
infix 4 .>. .>=. .<. .<=. .==. .!=. .?=.
Value description in comments
Status: experimental (the starting symbol (*: may change).
To use this extension, the valDescInComments annotation is needed in the MLB file:
ann "valDescInComments warn" in
...
end
(* Or:
ann "valDescInComments error" in
...
end
*)
With this extension, a comment that starts with (*: is parsed as a value description and checked for compatibility with the value declaration (val, fun) that follows it.
A type mismatch is reported as a warning or an error, depending on the annotation.
The content in the special comment does not affect type inference.
Good examples:
(*: val fact : int -> int *)
fun fact 0 = 1
| fact n = n * fact (n - 1);
(*: val curry : ('a * 'b -> 'c) -> 'a -> 'b -> 'c *)
fun curry f x y = f (x, y);
Bad examples:
(*: val fact : IntInf.int -> IntInf.int *)
(* Invalid: The inferred type is int -> int *)
fun fact 0 = 1
| fact n = n * fact (n - 1);
(*: val curry : ('a * 'b -> 'c) -> 'a -> 'b -> 'c *)
(* Invalid: The inferred type is ('a * 'b -> 'c) -> 'b -> 'a -> 'c *)
fun curry f x y = f (y, x);
Syntax:
<valspec> ::= 'val' <valdesc>
<valspecs> ::= <valspec> <valspecs>
| <valspec>
<valdescincomment> ::= '(*:' <valspecs> '*)'
<dec> ::= <valdescincomment> 'val' ...
| <valdescincomment> 'fun' ...
Importing ECMAScript Modules
An ECMAScript module can be imported with the _esImport declaration.
Examples:
_esImport "module-name"; (* -> import "module-name"; *)
_esImport defaultItem from "module-name"; (* -> import defaultItem from "module-name"; *)
_esImport [pure] defaultItem from "module-name"; (* -> import defaultItem from "module-name"; with dead-code elimination enabled *)
_esImport [pure] { foo, bar as barr, "fun" as fun' } from "module-name"; (* -> import { foo, bar as barr, fun as fun$PRIME } from "module-name"; with dead-code elimination enabled *)
_esImport defaultItem, { foo, bar as barr, "fun" as fun' } from "module-name"; (* -> import defaultItem, { foo, bar as barr, fun as fun$PRIME } from "module-name"; *)
Syntax:
_esImport <attrs> "module-name"; (* side-effect only *)
_esImport <attrs> <vid> from "module-name"; (* default import *)
_esImport <attrs> { <spec>, <spec>... } from "module-name"; (* named imports *)
_esImport <attrs> <vid>, { <spec>, <spec>... } from "module-name"; (* default and named imports *)
(*
<attrs> ::= (* default; the module may have side-effects *)
| [pure] (* allow dead-code elimination *)
<spec> ::= <vid>
| <vid> as <vid>
| <string> as <vid>
| <vid> : <ty>
| <vid> as <vid> : <ty>
| <string> as <vid> : <ty>
*)
Namespace imports are not supported.
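The typed forms of <spec> in the syntax above could be used as in the following sketch. It is an assumption, not something stated in this section, that JavaScript.value is the type LunarML uses for foreign JavaScript values; the module "node:path" and the names sep, join, and pathJoin are chosen only for illustration.

```sml
(* Hypothetical typed named imports from Node.js's "node:path" module.
   JavaScript.value is assumed to be the type of foreign values. *)
_esImport [pure] { sep : JavaScript.value, join as pathJoin : JavaScript.value } from "node:path";
```

This declaration is only meaningful when compiling to ECMAScript, and [pure] lets the compiler drop the import if neither sep nor pathJoin is used.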