Introduction to the Julia Programming Language
Application development for complex systems
The computational and data requirements of modern simulation tools for applications such as weather, computational biology, and fluid dynamics are a fundamental driving force for modern programming environments and computer systems. Modern cloud and HPC (high-performance computing) systems consist of tens to thousands of cores, distributed across multiple clusters or premises, and are usually equipped with GPU accelerators.
At the same time, modern application codes are complex, combining inputs from multiple domains (e.g. mathematics, physics, biology, environmental science, finance) with sophisticated solvers and numerical representations (parametrisation schemes, subgrid approximations, turbulence representations, etc.). Furthermore, emerging research developments require the ability to rapidly prototype and update innovative solutions.
I gained first-hand experience of the difficulty of developing complex codes in heterogeneous environments when working on the AllScale project. The AllScale framework addresses this complexity through a "separation of responsibility": the domain scientist works at an API (Application Programming Interface) layer or similar, while the computer scientist or HPC expert manages performance and machine-level optimisations at a deeper level (Jordan et al. 2020). AllScale aims to simplify development by leveraging an engineering platform consisting of:
- An intuitive programming API where the domain expert develops their code and inserts directives related to parallelism. Working with the API is closer to pure C++ development than to explicit parallel programming.
- A source-to-source compiler that converts the provided sources into an internal, high-level intermediate representation, based on the Insieme compiler (Jordan et al. 2013).
- A runtime system that dynamically manages the execution of applications, e.g. by distributing workload and data and by configuring hardware parameters such as the number of CPU cores or GPUs in use.
The AllScale framework has been used to develop and optimise a number of applications, including computational fluid dynamics, space weather, and data assimilation schemes on adaptive grids (O’Donncha et al. 2020).
The two-language problem
The above description highlights the complexity of modern application development. One long-standing consequence is the "two-language problem": a trade-off that developers typically make when choosing a language, which can be either relatively easy for humans to write or relatively easy for computers to run, but not both. This typically results in domain experts prototyping code in a higher-level, easy-to-use language and then working with expert programmers to recode it in a fast language prior to eventual deployment in production. The inefficiency of this workflow is obvious, and it has persisted because of the difficulty of creating a programming language that is both easy to use and fast. The problem is particularly acute for aquaculture, where the complexity of farm systems (ocean and weather dynamics, fish growth and productivity, health monitoring and forecasting, economic production and price forecasting) requires resolving complex mathematical systems and making rapid forecasts or predictions.
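The trade-off described above can be seen on a small scale within Python itself. The sketch below is an illustration only: the function names, problem size, and repeat count are arbitrary choices, not taken from any project mentioned here. It compares an explicit, interpreted loop with the built-in `sum()`, which CPython implements in C. On most machines the built-in is noticeably faster, although exact timings depend on hardware and Python version.

```python
import timeit


def total_loop(values):
    """Prototype style: an explicit loop, executed step by step by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def total_builtin(values):
    """Same result, delegated to the built-in sum(), which CPython implements in C."""
    return sum(values)


def compare(n=200_000, repeats=5):
    """Time both versions on the same data and return (loop_time, builtin_time) in seconds."""
    data = list(range(n))
    # Both versions must agree before the timings mean anything.
    assert total_loop(data) == total_builtin(data)
    loop_time = timeit.timeit(lambda: total_loop(data), number=repeats)
    builtin_time = timeit.timeit(lambda: total_builtin(data), number=repeats)
    return loop_time, builtin_time


if __name__ == "__main__":
    loop_time, builtin_time = compare()
    print(f"interpreted loop: {loop_time:.4f} s")
    print(f"built-in sum():   {builtin_time:.4f} s")
```

The fast path here exists only because someone wrote it in a second, lower-level language. Once a computation has no ready-made compiled routine, the prototype has to be recoded, which is exactly the situation the two-language problem describes.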
The Julia programming language sets out to solve this challenge. Julia is an open-source, high-level programming language with roots at MIT. It comes with lofty ambitions, as set out by its creators:
We want a language that's open-source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn yet keeps the most serious hackers happy. We want it interactive and we want it compiled.
Julia is a high-level, high-performance, dynamic programming language. Many of its features are well suited to numerical analysis and computational science. A recent Nature article highlighted these strengths, describing Julia as "the best of both worlds":
Julia — the name puts the ‘Ju’ in ‘Jupyter’, a computational notebook system popular among data scientists, alongside Python and R — is essentially a compiled language in scripting-language clothing. In scripting languages such as Python, users type code into an interactive editor line by line, and the language interprets and executes it, returning the result immediately. With languages such as C and Fortran, code must be compiled into machine-readable instructions before it can be executed. The former is easier to use, but the latter produces faster code. As a result, programmers for whom speed counts often develop algorithms in scripting languages and then translate them into C or Fortran, a laborious and error-prone process.
Julia circumvents that two-language problem because it runs like C, but reads like Python. To all appearances, using Julia is like coding in Python: type a line, get a result. But in the background, the code is compiled. Consequently, the first time a function is keyed in, it might be slow, but subsequent runs are faster. And once the code is working correctly, users can optimize it.
Further, Julia is served by over 4,000 registered packages that simplify common tasks such as data loading, transformation, statistical analysis, machine learning, and visualisation. In addition, Julia interfaces cleanly with popular languages such as Python and R, allowing code developed in those languages to be called from Julia.
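The "slow on first call, faster afterwards" behaviour described in the quoted passage can be mimicked with a toy model. The Python sketch below is not how Julia's compiler works internally. It only imitates the observable pattern, on the understanding that Julia compiles a specialised version of a function the first time it is called with a given combination of argument types. The decorator name, the simulated compilation delay, and the `add` function are all hypothetical.

```python
import functools
import time

SIMULATED_COMPILE_SECONDS = 0.05  # arbitrary stand-in for real compilation time


def compile_on_first_call(func):
    """Toy model of a JIT: pay a one-off cost per new combination of argument types."""
    compiled = {}  # maps an argument-type signature to its "compiled" callable

    @functools.wraps(func)
    def wrapper(*args):
        signature = tuple(type(a) for a in args)
        if signature not in compiled:
            time.sleep(SIMULATED_COMPILE_SECONDS)  # pretend to compile
            compiled[signature] = func  # a real JIT would store specialised machine code
        return compiled[signature](*args)

    wrapper.compiled_signatures = compiled
    return wrapper


@compile_on_first_call
def add(a, b):
    return a + b


def timed_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    calls = [("first int call", (1, 2)),
             ("second int call", (3, 4)),
             ("first float call", (1.0, 2.0))]
    for label, args in calls:
        result, elapsed = timed_call(add, *args)
        print(f"{label}: result={result}, {elapsed:.4f} s")
```

In this model the first call with integers pays the simulated compilation cost, the second integer call reuses the cached version, and the first call with floats pays the cost again because it is a new type signature.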
REFERENCES:
Jordan, Herbert, et al. "The AllScale framework architecture." Parallel Computing 99 (2020): 102648.