Zechang Sun’s Website
I am a third-year PhD student in the Department of Astronomy at Tsinghua University. As a scientist with an engineering mindset, I am interested in using and developing advanced data analysis techniques to offer new perspectives on how the world works. More importantly, I work on what I believe will shape the future. Below are the questions I care about most:
- Astrophysics:
- What is the nature of dark matter and dark energy?
- This is the Holy Grail of astronomy. I contribute my expertise toward this grand objective as part of the DESI collaboration.
- Unique observation mining.
- Astronomers are treasure hunters in the wonderland of the Universe, searching for its most unusual phenomena. I am building artificial intelligence systems that automatically dig out and interpret such observations, whether they are galaxies, stars, supernovae, or supermassive black holes.
- Machine Learning/Statistics
- Unsupervised Learning/Reinforcement Learning
- Currently, all artificial intelligence systems are, without exception, stuck at the lowest rung of Judea Pearl’s Ladder of Causation: they can capture only associations in data. Scientific applications of AI, moreover, often face sparse and unrepresentative data. For both reasons, unsupervised learning and reinforcement learning may be more meaningful in the context of AI for Science.
- Causal Inference
- Most scientific discovery processes can be formalized within the framework of causal analysis. I am working on how AI can augment the human ability to discover causal models and to design experiments that verify those causal relations, and ultimately on an AI scientist that can understand the world. I think this direction can advance both natural science and our understanding of intelligence.
- Science of Science:
- How does knowledge spread and evolve within the scientific community?
- You might be curious why I am interested in sociology. In fact, I am a fan of Isaac Asimov’s Foundation series and Liu Cixin’s The Three-Body Problem series. I think it’s very cool to be able to describe and predict the evolution of a social system using mathematics.
- As a researcher, my life inevitably involves figuring out how to gain broad recognition for my research in the field. Therefore, I hope to use the research methods of sociology and data science to quantify whether the current academic evaluation system fosters innovation and how to uncover promising research directions.
Selected Works
- 2024
- Interpreting Multi-band Galaxy Observations with Large Language Model-Based Agents
- Subject: Large language model based agents, AI scientist
- Citation Count: 1
- Publication: arXiv, Submitted to journal
- TL;DR: We build the first AI astronomer in the world, which uses large language models to mimic the scientific reasoning process of human researchers. The AI astronomer can interact with real-world research data and analysis tools, and can automatically learn the skills needed for scientific analysis through self-play reinforcement learning. (Work done during an internship at Microsoft Research Asia.)
- Knowledge Graph in Astronomical Research with Large Language Models: Quantifying Driving Forces in Interdisciplinary Scientific Discovery
- Subject: Science of Science, Large language model, Knowledge graph
- Citation Count: 6
- Publication: IJCAI 2024 AI for Research Workshop
- Interactive Knowledge Graph for Astronomy: Link
- TL;DR: We build the first knowledge graph in astronomy using large language models. We extract all scientific concepts from academic astronomy journals and link them through their citation-reference relations. We then study how concepts are connected to one another within the scientific community. Our research demonstrates that technological advancements typically require 5-10 years to be integrated into the scientific discovery workflow.
- 2023
- Quasar Factor Analysis-An Unsupervised and Probabilistic Quasar Continuum Prediction Algorithm with Latent Factor Analysis
- Subject: Lya forest cosmology, Factor analysis
- Citation Count: 10
- Publication: The Astrophysical Journal Supplement Series, Volume 269, Issue 1, id.4, 30 pp.
- Code Link: https://github.com/ZechangSun/QFA
- Data Link: https://zenodo.org/records/8050660
- TL;DR: We build a statistical model that generatively models quasar spectra. Unlike previous work, it can use all collected quasar spectra to train the machine learning algorithm. It can be used for quasar continuum prediction, anomaly detection, and studying the evolution of quasars across cosmic time.
- Zephyr: Stitching Heterogeneous Training Data with Normalizing Flows for Photometric Redshift Inference
- Subject: Photometric redshift calibration, Normalizing flow, Heterogeneous training data distribution
- Citation Count: 3
- Publication: NeurIPS 2023 Workshop on Machine Learning and the Physical Sciences (journal submission in preparation)
- TL;DR: We present zephyr, a novel method that integrates cutting-edge normalizing flow techniques into a mixture density estimation framework, enabling the effective use of heterogeneous training data for photometric redshift inference.
Beyond My Research
In life, I enjoy music, traveling, photography, animation, and games. I have always done my best to bring warmth to the people around me.