Tomasz Kacprzak


  • Download my CV
  • My papers on Google Scholar
  • Data science, cosmology, physics

    I am a Senior Scientist at ETH Zurich and a Senior Data Scientist at the Swiss Data Science Center at the Paul Scherrer Institute. I obtained my PhD in Physics and Astronomy from University College London, where I had previously completed an MSc in Machine Learning. My work focuses on applying novel machine learning methods and high-performance computing to outstanding problems in physics, cosmology, and climate science.

    Recent papers


    Cosmology and artificial intelligence

    Artificial intelligence methods, such as deep convolutional neural networks, have the capacity to model the complex patterns contained in the cosmic web. I introduced deep learning approaches to constraining cosmological parameters and generating large-scale structure simulations. I demonstrated that AI-based analysis can achieve a 40% improvement in measurement precision, a gain equivalent to using 2x more survey data with conventional methods.
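
    The link between a 40% precision gain and roughly 2x more data can be checked with a short calculation. The sketch below assumes that statistical uncertainty scales as 1/sqrt(N) with the amount of survey data N, and that "40% improvement" means the precision (inverse uncertainty) grows by a factor of 1.4. Both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def equivalent_data_factor(precision_gain):
    """Factor by which the survey data would have to grow to match a
    fractional gain in precision, assuming uncertainty ~ 1/sqrt(N)."""
    # Precision scales as sqrt(N), so the data factor is the gain squared.
    return (1.0 + precision_gain) ** 2


# Assumed reading of the text: precision improves by a factor of 1.4.
factor = equivalent_data_factor(0.40)
print(f"Equivalent data factor: {factor:.2f}")
```

    Under these assumptions the factor comes out close to 2, which is consistent with the statement above.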

    Machine learning for applied physics and climate science

    At PSI, I work on several projects that advance the application of machine learning to applied physics and climate simulations. By applying novel machine learning methods (deep learning, generative models) and using the latest high-performance computing hardware (A100 GPUs, the Alps cluster), I make it possible to reach scientific objectives that are unattainable with classical methods.

    Dark Energy Survey

    I am a Builder of the Dark Energy Survey (DES), the largest ground-based cosmological observational survey to date. This program has delivered the most precise measurements of cosmological parameters from the large-scale structure of the universe. I have been involved in DES since 2012, with the following contributions:

    Simulation-based inference in cosmology and astronomy

    In photometric surveys, the distances to galaxies are inferred from galaxy colors by matching them to galaxies found in previous spectroscopic surveys. While this approach has been very successful for nearby galaxies, where spectroscopic data is available in abundance, it can be difficult to apply reliably to far-away galaxies. This is due to our limited understanding of the population of these high-redshift galaxies, as well as of their evolution over cosmic time. Difficulties with modelling the selection functions of spectroscopic surveys further complicate the problem. An alternative is the Monte Carlo Control Loop (MCCL) approach, inspired by methods used in particle physics. MCCL uses physically motivated parametric models for the evolution of galaxy properties, together with very precise simulations of the telescope and its selection functions. This allows us to achieve the same precision of redshift measurement without using spectroscopy of high-redshift galaxies.
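
    The general idea of inferring a population parameter by forward-simulating a survey, selection function included, can be illustrated with a toy rejection scheme (approximate Bayesian computation). This is only a minimal sketch and not the MCCL pipeline itself: the galaxy model, the magnitude relation, the magnitude cut, and all function names below are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_survey(z_mean, n_galaxies, rng):
    """Toy forward model: draw redshifts, turn them into noisy magnitudes,
    and keep only galaxies that pass a magnitude cut (selection function)."""
    mags = []
    for _ in range(n_galaxies):
        z = rng.gammavariate(2.0, z_mean / 2.0)     # population with mean z_mean
        mag = 22.0 + 2.5 * z + rng.gauss(0.0, 0.3)  # fainter at higher redshift
        if mag < 25.0:                              # hypothetical survey depth
            mags.append(mag)
    return mags


def abc_rejection(observed_mags, n_trials, tolerance, seed=0):
    """Keep the prior draws of z_mean whose simulated mean magnitude lies
    within `tolerance` of the observed mean magnitude."""
    rng = random.Random(seed)
    target = statistics.mean(observed_mags)
    accepted = []
    for _ in range(n_trials):
        z_mean = rng.uniform(0.2, 1.5)              # flat prior on the parameter
        sim = simulate_survey(z_mean, 500, rng)
        if sim and abs(statistics.mean(sim) - target) < tolerance:
            accepted.append(z_mean)
    return accepted


# Mock "observed" survey generated with a known mean redshift of 0.8.
observed = simulate_survey(0.8, 2000, random.Random(42))
posterior = abc_rejection(observed, n_trials=300, tolerance=0.05)
print(f"accepted {len(posterior)} of 300 prior draws")
```

    The accepted values of z_mean approximate the posterior for the population parameter. No spectroscopic redshifts enter the procedure; only the simulated and observed magnitude distributions are compared.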


    I have given more than 20 invited talks at international conferences, workshops, and university colloquia. Some of them are available online.

    Invited keynote talk at Bayesian deep learning for cosmology and time domain astrophysics, Paris, France, June 20-24, 2022 (AstroDeep22)

    Other recorded talks include:

    Datasets and software

    I produce and maintain a number of datasets and software packages.


    I have taught a number of courses on machine learning and data science, cosmology, and astrophysics:


    New projects

    If you are a master's student in cosmology, computer science, or statistics and are interested in a project, please send me an email.