Project Proposal: EasyOCaml =========================== Abstract -------- There is a lot of room for improvement in error reporting for the OCaml system. In particular, type error messages are a source of bewilderment for beginners and advanced programmers alike. The goal of this project is to build EasyOCaml, an alternative frontend for OCaml that is particularly (but not exclusively) useful for teaching the language. EasyOCaml performs state-of-the-art error reporting, employs heuristics for generating meaningful type error messages, and provides customizable language levels while retaining interoperability with arbitrary OCaml library code. Motivation ---------- OCaml is not an easy language to learn. Especially, OCaml's parse and type errors are targeted at experts, not at beginners. Consider the following (ill-typed) example: let rec g : int -> int = fun y -> f 3.0 + f y and f : int -> int = fun x -> x Although both top-level functions 'f' and 'g' are equipped with explicit type annotations, the OCaml compiler flags the argument 'y' in the second call of 'f' in line 1 as ill-typed: File "sample2.ml", line 1, characters 46-47: This expression has type int but is here used with type float But obviously, the argument '3.0' in the first call of 'f' has the wrong type! Contribution ------------ To make learning OCaml easier, we propose EasyOCaml, a new frontend for a subset of OCaml. EasyOCaml has the following benefits: * User-friendly error messages. The frontend reports errors as precisely as possible and suggests changes for fixing the program. * Feature customization. The frontend allows teachers to customize the language features available. For example, beginners with a background in imperative programming languages often prefer reference cells over OCaml's functional features. Deactivating reference cells forces these students to think and program in a functional way. * Interoperability with OCaml. The frontend allows students to use third-party OCaml libraries so as to write real-world programs in EasyOCaml. Ideally, EasyOCaml comes with an integrated development environment (IDE). However, we believe it is unrealistic to implement both the frontend just described and an IDE for it in the time-frame of the OCaml Summer Project. Instead, we propose using EasyOCaml as a replacement for the full OCaml compiler in existing IDEs (e.g. DrOCaml[1], Camelia[2], OcaIDE[3], Dromedary[4]). From our own experience in teaching first-year students, as well as from the experience made by others[16], we know that these features are important for learning and teaching OCaml. Design ------ EasyOCaml extends the OCaml tool suite with a new frontend; that is, EasyOCaml provides a new lexer, a new parser, and a new type checker. Execution and interpretation is performed by the existing OCaml tools. This integrated approach eases development and allows interoperability with OCaml. EasyOCaml is a strict subset of OCaml; that is, every valid EasyOCaml program is a valid OCaml program, but not vice versa. Especially, EasyOCaml supports the following OCaml language constructs: * Expressions: variables, infix operators, tuples, records, lists, arrays, conditionals, while loops, for loops, pattern matching, functions, exceptions. * Types: primitive types, function types, algebraic data types, record types, type abbreviations. * Modules: qualified access to module identifiers, opening of modules. The type system of EasyOCaml supports Hindley-Milner style polymorphism but omits OCaml's object system, polymorphic variants, and labels. This omission is intentional as beginning OCaml programmers are unlikely to use these features and as they cause substantial complexity in the type checker. Including these features would be interesting but is beyond the scope of this project. EasyOCaml's design will build on the OCaml compiler but replace the parser and the type checker with newly written code. The implementation of the type checker will follow the constraint-based approach of Stuckey and others [9]. EasyOCaml will run on all platforms supported by OCaml. Deliverables ------------ Completion of the project will take three months and will result in the following deliverables: * The EasyOCaml compiler/interpreter. * Documentation in form of a user and a developer manual. * A range of erroneous sample programs demonstrating the quality of EasyOCaml's error messages. As a starting point, we port the erroneous Haskell and Helium programs collected by Simon Thompson[8] and the Helium group[15], respectively, to EasyOCaml. * Unit tests and integration tests ensuring the correctness and robustness of the implementation. Licensing --------- The deliverables will be available on the same terms as the OCaml system, that is, under the Q Public License with the modifications explained in http://caml.inria.fr/pub/distrib/ocaml-3.10/notes/LICENSE Related Work ------------ DrOCaml[1] is an extension of the Scheme IDE DrScheme[5] that allows OCaml programs to be written inside DrScheme. DrOCaml provides an interpreter, a graphical debugger, and the ability to display the types of specific subexpressions. DrOCaml delegates parsing and type checking to the original OCaml tools. Camelia[2] is an OCaml IDE written in C++ using the Qt Toolkit. It supports (among others) syntax highlighting, automatic indentation, a graphical debugger, and it displays types of particular subexpressions. Camelia also uses the original OCaml tools for parsing and type checking. To improve OCaml's error messages, it searches the output of the compiler for certain patterns so as to give hints how to resolve the error. OcaIDE[3] is an Eclipse[6] plugin for OCaml with a rich set of features. It also uses the original OCaml system for parsing and type checking. Dromedary[4] is an editor for OCaml that provides incremental syntax and type checking. Unfortunately, we could not evaluate Dromedary because the code does not compile. In the future, EasyOCaml may be integrated with DrOCaml, Camelia, OcaIDE, and Dromedary to provide each of them with better error reporting facilities. Helium[7] is very similar in spirit to EasyOCaml but for a subset of Haskell. It includes a compiler giving user-friendly error messages and an interpreter integrated in a GUI. Helium is heavily used in practice. There has been some work on improved error reporting for Hindley-Milner style type systems, but none of it has been developed into a practically useful and available system [9,10,12,14]. Other work has addressed the question how to come up with heuristics for generating helpful error messages and how to equip the type checker to issue them [11,13]. References ---------- [1] DrOCaml, http://www.cs.brown.edu/courses/cs017/working-from-home.html [2] Camelia, http://camelia.sourceforge.net/ [3] OcaIDE, http://ocaml.eclipse.ortsa.com:8480/ocaide/ [4] Dromedary, http://osp.janestcapital.com/files/dromedary.pdf [5] DrScheme, http://www.drscheme.org/ [6] Eclipse, http://www.eclipse.org/ [7] Helium, http://www.cs.uu.nl/helium/ [8] Some Common (and not so common!) Hugs Errors, http://www.cs.kent.ac.uk/people/staff/sjt/craft2e/errors/allErrors.html [9] Peter J. Stuckey, Martin Sulzmann, Jeremy Wazny. Interactive Type Debugging in Haskell. Haskell Workshop 2003, pp 72-83, 2003. [10] Christian Haack, Joe Wells. Type Error Slicing in Implicitly Typed, Higher-Order Languages. ESOP 2003. [11] Bastiaan Heeren, Daan Leijen, Arjan van IJzendoorn. Helium, for Learing Haskell. Haskell Workshop 2003, pp 62-71, 2003. [12] Matthias Neubauer, Peter Thiemann. Discriminative Sum Types Locate the Source of Type Errors. ICFP2003, pp 15-26, 2003. [13] Bastiaan Heeren, Jurriaan Hage, S. Doaitse Swierstra. Scripting the Type Inference Process. ICFP2003, pp 3-13, 2003. [14] Matthias Neubauer, Peter Thiemann. Demonstration Abstract: Haskell Type Browser. Haskell '04: Proceedings of the ACM SIGPLAN workshop on Haskell, pp 92-93, 2004. [15] Neon, a DSL for Analyzing Erroneous Programs, http://www.cs.uu.nl/wiki/bin/view/Hage/Neon [16] Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, Shriram Krishnamurthi, How to Design Programs, MIT Press, 2003