Introduction to the Julia Programming Language

Application development for complex systems

The computational and data requirements of modern simulation tools for applications such as weather, computational biology, and fluid dynamics are a fundamental driving force behind modern programming environments and computer systems. Modern cloud and HPC (high-performance computing) systems consist of tens to thousands of cores, are often distributed across multiple clusters or premises, and are usually equipped with GPU accelerators.

At the same time, modern application codes are complex: they encompass inputs from multiple domains (e.g. mathematics, physics, biology, environmental science, finance) and rely on sophisticated solvers and numerical representations (parametrisation schemes, subgrid approximations, turbulence representations, etc.). Further, emerging research developments require the ability to rapidly prototype and update innovative solutions.

I gained first-hand experience of how difficult it is to develop complex codes for heterogeneous environments when working on the AllScale project. The AllScale framework addresses this complexity through a "separation of responsibility": the domain scientist works at an API (Application Programming Interface) layer or similar, while the computer scientist or HPC expert manages performance and machine-level optimisations at a deeper level (Jordan et al. 2020). AllScale aims to simplify development by leveraging a powerful engineering platform consisting of:

  • An intuitive programming API where the domain expert develops their code and inserts directives related to parallelism. Working with the API is akin to ordinary C++ development rather than explicit parallel programming.
  • A source-to-source compiler that converts the provided sources into an internal, high-level intermediate representation (based on the Insieme Compiler (Jordan et al. 2013)).
  • A runtime system that dynamically manages the execution of applications (e.g. managing the distribution of workload and data, as well as the configuration of hardware parameters, such as the number of CPU cores or GPUs in use).

The AllScale framework has been used to develop and optimise a number of applications such as computational fluid dynamics, space-weather, and data assimilation schemes on adaptive grids (O’Donncha et al. 2020).

The two-language problem

The above description highlights the complexity of modern application development. One long-standing consequence is the "two-language problem": a trade-off that developers typically make when choosing a language. A language can be relatively easy for humans to write, or relatively easy for computers to run, but rarely both. As a result, domain experts typically prototype code in a higher-level, easy-to-use language and then work with expert programmers to recode it in a fast language before eventual deployment in production. The inefficiency of this workflow is obvious, and it has persisted because of the difficulty of creating a programming language that is both easy to use and fast. The problem is particularly acute in aquaculture, where the complexity of farm systems (ocean and weather dynamics, fish growth and productivity, health monitoring and forecasting, economic production and price forecasting) requires resolving complex mathematical systems and making rapid forecasts or predictions.

The Julia programming language sets out to solve this challenge. Julia is an open-source, high-level programming language with roots at MIT. It comes with lofty ambitions, as stated by its creators:

"We want a language that's open-source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn yet keeps the most serious hackers happy. We want it interactive and we want it compiled."

Julia is a high-level, high-performance, dynamic programming language. Many of its features are well suited to numerical analysis and computational science. A recent Nature article highlighted the strengths of Julia, describing it as "the best of both worlds":

Julia — the name puts the ‘Ju’ in ‘Jupyter’, a computational notebook system popular among data scientists, alongside Python and R — is essentially a compiled language in scripting-language clothing. In scripting languages such as Python, users type code into an interactive editor line by line, and the language interprets and executes it, returning the result immediately. With languages such as C and Fortran, code must be compiled into machine-readable instructions before it can be executed. The former is easier to use, but the latter produces faster code. As a result, programmers for whom speed counts often develop algorithms in scripting languages and then translate them into C or Fortran, a laborious and error-prone process.

Julia circumvents that two-language problem because it runs like C, but reads like Python. To all appearances, using Julia is like coding in Python: type a line, get a result. But in the background, the code is compiled. Consequently, the first time a function is keyed in, it might be slow, but subsequent runs are faster. And once the code is working correctly, users can optimize it. Further, Julia is served by over 4,000 registered packages that simplify common tasks such as data loading, transformation, statistical analysis, machine learning, and visualisation. In addition, Julia interfaces cleanly with popular languages such as Python and R, allowing code developed in those languages to be called from Julia.
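The compile-on-first-call behaviour described above can be observed directly. The following is a minimal sketch rather than a benchmark: the function name sum_of_squares and the problem size are hypothetical choices made for illustration, and the timings reported will vary by machine and Julia version. It uses only the @time macro from Julia's standard library.

```julia
# Hypothetical example function: sum of squares written as a plain loop,
# the kind of code a domain expert might write when prototyping.
function sum_of_squares(n)
    total = 0.0
    for i in 1:n
        total += i^2
    end
    return total
end

# First call: the reported time includes compiling a version of the
# function specialised for an integer argument.
@time sum_of_squares(1_000_000)

# Second call with the same argument type: the compiled code is reused,
# so the reported time reflects execution only.
@time sum_of_squares(1_000_000)
```

On a typical run the first timing should be noticeably larger than the second, which is the "slow the first time, faster afterwards" pattern. The loop itself needs no type annotations or separate compilation step, which is the sense in which the same code serves as both prototype and production version.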

REFERENCES:

Jordan, Herbert, et al. "The AllScale framework architecture." Parallel Computing 99 (2020): 102648.

Jordan, Herbert, et al. "INSPIRE: The Insieme parallel intermediate representation." Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. IEEE, 2013.

O’Donncha, Fearghal, et al. "AllScale toolchain pilot applications: PDE based solvers using a parallel development environment." Computer Physics Communications 251 (2020): 107089.



Last modified: Tuesday, 19 October 2021, 4:41 PM