For many internal prototypes at Endgame, we adopt an agile development process to rapidly build proof-of-concept services, which can then be deployed and iterated upon to quickly address bugs and introduce new features. Our R&D and DevOps groups maintain and improve dozens of interconnected services, from training machine learning models on malware samples to processing and analyzing domain information. However, many DevOps and R&D requirements are iterative and fluid, and it can be difficult to write services that are fast, safe, and extensible enough to address these changing needs.
For many of our previous services, we used Python for its quick development turnaround and rich library ecosystem. However, we often encountered issues that stem from this “quick” development, such as bugs caused by type errors, or poor error handling that led to service downtime. Hastily written services can also be difficult to refactor, as their structure may become convoluted over time.
While many in our DevOps and R&D teams have Python backgrounds and continue to use it for many tasks, we have recently begun to use functional programming in OCaml to solve some of the issues that arise with rapid Python development. OCaml is a compiled, statically and strongly typed programming language that emphasizes safety and expressiveness. It boasts a mature ecosystem of tools and libraries, and is most widely used in industries that require a high degree of confidence in bug-free, performant code. While it is considered a multi-paradigm language, OCaml strongly emphasizes a functional programming style, which provides many of the benefits covered in this post. With OCaml, we have improved our ability to adapt to changing requirements, and we trust that the software we write is more stable and correct.
The freedom of programming in a dynamically typed language
Python, we still want you around...
Many teams at Endgame still use Python for the majority of their development, and it continues to provide great value for quickly getting services up and running. For many tasks in R&D, however, we needed a language that would allow us to refactor more easily, provide greater runtime safety, and catch more errors at compile time. Python did not quite meet these needs. We found that:
- Python is fast to prototype and script in, but handling large amounts of data can expose issues that stem from dynamic typing.
- Handling JSON in particular invites runtime errors, as free-form input or mangled data causes type-unsafe functions to fail unexpectedly.
- Python is relatively slow in runtime performance due to its interpreted nature.
- Packaging and deploying Python programs requires also deploying a Python interpreter, which itself requires many additional dependencies.
- Codebases written hastily in imperative languages can often devolve into ball-of-mud refactoring nightmares, especially with reliance on deeply nested polymorphic inheritance or proliferation of global state. Python often does little to encourage separation of external IO concerns from internal program logic. Without regular attention given to design and style, shared mutable variables can cause baffling behavior in large programs.
...But OCaml has what we need!
When an Endgamer with extensive previous functional programming experience suggested that our workflow could benefit from the balance of speed and safety that OCaml provides, we found that many of its features addressed our issues with Python. We had several requirements for a new language if we were going to augment Python for DevOps and R&D:
Requirement: A language in which we could write a service quickly (Terseness/Expressiveness).
- OCaml's syntax is extremely concise while still supporting high-level programming features, including:
  - A type system with algebraic data types and variants.
  - Higher-order functions, partial function application, and currying.
  - Option types, which allow functions to require the caller to handle potential errors or the lack of a result.
  - A powerful pattern-matching system.
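To make these features concrete, here is a minimal sketch that uses only the OCaml standard library. The `verdict` type, the sample file names, and the family name are hypothetical, chosen purely for illustration; they are not taken from any Endgame service.

```ocaml
(* Hypothetical domain: scan verdicts for a file sample. *)
type verdict =
  | Clean
  | Suspicious of float   (* confidence score between 0.0 and 1.0 *)
  | Malicious of string   (* name of the detected family *)

(* Pattern matching: the compiler warns if a constructor is left unhandled. *)
let describe = function
  | Clean -> "clean"
  | Suspicious score -> Printf.sprintf "suspicious (%.2f)" score
  | Malicious family -> "malicious: " ^ family

(* Option type: the lookup may find nothing, so the caller must handle None. *)
let find_verdict (results : (string * verdict) list) (sample : string)
    : verdict option =
  List.assoc_opt sample results

(* Illustrative data only. *)
let results =
  [ ("a.exe", Clean);
    ("b.dll", Malicious "ExampleFamily");
    ("c.bin", Suspicious 0.75) ]

(* Partial application: fixing the first argument yields a new function. *)
let lookup = find_verdict results

let report sample =
  match lookup sample with
  | Some v -> sample ^ " -> " ^ describe v
  | None -> sample ^ " -> not scanned"

(* Higher-order functions: List.map and List.iter take functions as arguments. *)
let () =
  List.iter print_endline
    (List.map report [ "a.exe"; "b.dll"; "c.bin"; "d.sys" ])
```

If a new constructor were later added to `verdict`, the compiler would flag `describe` as non-exhaustive until the new case was handled.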
Requirement: A language with fast performance.
- When compiled to native code, OCaml's performance is often close to that of C/C++.
Requirement: A language that is easy to refactor. Due to the agile requirements process of R&D, we needed to be able to redesign service components easily.
- Functional languages can help developers avoid many program design pitfalls by dissuading or preventing proliferation of global state.
- Expression-based languages make it easy to reorganize code segments, since every construct evaluates to a value that can be moved, named, or passed along.
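As a rough illustration of both points, the sketch below threads state explicitly through a fold instead of updating a global counter, and uses `if` as an expression. The `stats` record and function names are hypothetical.

```ocaml
(* Hypothetical counters for a service that processes a batch of lookups. *)
type stats = { seen : int; failed : int }

let empty = { seen = 0; failed = 0 }

(* Pure step: returns a new record rather than mutating shared state. *)
let record_result stats ok =
  { seen = stats.seen + 1;
    failed = (if ok then stats.failed else stats.failed + 1) }

(* State is threaded explicitly through the fold, so there is no global. *)
let summarize outcomes = List.fold_left record_result empty outcomes

(* `if` is an expression: both branches yield a float. *)
let failure_rate stats =
  if stats.seen = 0 then 0.0
  else float_of_int stats.failed /. float_of_int stats.seen

let () =
  let s = summarize [ true; false; true; false ] in
  Printf.printf "seen=%d failed=%d rate=%.2f\n"
    s.seen s.failed (failure_rate s)
```

Because `record_result` and `failure_rate` depend only on their arguments, either one can be moved to another module or replaced without auditing the rest of the program for hidden state.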
Requirement: A language with more runtime safety guarantees.
- OCaml's type system allows for complex and expressive hierarchies of types, checked at compile time with Hindley-Milner type inference.
- This allows developers to write safer code, and safer libraries for reuse.
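A minimal sketch of what this looks like in practice, assuming a hypothetical task of validating a TCP port read from untrusted input. The `port_error` type and function names are illustrative; only standard library functions are used.

```ocaml
(* Hypothetical validation errors for a port read from untrusted input. *)
type port_error =
  | Not_a_number of string
  | Out_of_range of int

(* Failure is part of the return type, so callers cannot ignore it. *)
let parse_port raw =
  match int_of_string_opt (String.trim raw) with
  | None -> Error (Not_a_number raw)
  | Some n when n < 1 || n > 65535 -> Error (Out_of_range n)
  | Some n -> Ok n

(* No annotations are needed here: the compiler infers
   (int, port_error) result -> string. *)
let explain = function
  | Ok port -> Printf.sprintf "listening on %d" port
  | Error (Not_a_number s) -> Printf.sprintf "%S is not a number" s
  | Error (Out_of_range n) -> Printf.sprintf "%d is outside 1-65535" n

let () =
  List.iter
    (fun raw -> print_endline (explain (parse_port raw)))
    [ "8080"; "http"; "70000" ]
```

Bad input becomes an ordinary value that must be matched on, rather than an exception that surfaces at runtime in production.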
Requirement: A mature language with library support for common use cases, as well as C Foreign Function Interface (FFI) bindings to call into external code as needed.
- Much of OCaml’s library base is mature and has been stable and heavily tested for years.
- OCaml’s Ctypes library allows for simple binding to external C libraries, and to C++ libraries that expose a C-compatible interface.
Recently, many other people have reached similar conclusions about OCaml’s benefits for systems programming, and have written posts about their experiences with the language:
https://tech.esper.com/2014/07/15/why-we-use-ocaml/
http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain/
http://www2.lib.uchicago.edu/keith/ocaml-class/why.html
So far, OCaml has made it much easier to write fast, stable, and safe services that are easy to return to and refactor later. In the coming months, we will be publishing a series of technical blog posts describing our usage of OCaml at Endgame, as well as a handful of libraries and frameworks we have developed to support internal development.
Stay tuned!