Conformal collider physics is now directly testable at the LHC: Lee et al. (2025) derive a factorization theorem for light-ray density matrices and compute the scaling of multipoint energy correlators, which they compare against CMS data. We propose using these rigorous QCD structures as self-supervised pretext tasks. Given masked or permuted particle tokens (in the spirit of Visive et al. 2025), a network is pretrained to predict, for high-pT jets, the k-point energy correlators, their next-to-leading-logarithmic (NLL) scaling exponents, and the expectation values of twist-2, spin-J light-ray operators. Unlike existing token-reconstruction approaches, this anchors representation learning to quantities under effective field theory (EFT) control rather than to generic reconstruction losses. After pretraining, we fine-tune for model-independent anomaly detection (e.g., contrastive embeddings as in Metzger et al. 2025, or likelihood-ratio-free goodness-of-fit tests). Because the pretext tasks encode QCD’s Lorentzian dynamics and operator product expansion (OPE) structure, the learned representation should better disentangle “expected” QCD fluctuations from genuine beyond-Standard-Model distortions, improving discovery sensitivity while providing interpretable axes tied to operator dimensions. Impact: physics-grounded foundation models for collider data with built-in interpretability and a natural route to calibration, opening a path to precision-informed anomaly searches and operator-resolved deviations.
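As a rough illustration of what a correlator pretext target might look like, the sketch below computes a binned two-point energy correlator for a single jet from its constituents and builds a masked view of the same jet as the network input. The function names (`two_point_eec`, `mask_tokens`), the use of pT fractions as energy weights, the ΔR angular distance, the binning, and the toy constituents are all assumptions made for illustration; none of them is taken from the cited works, and higher-point correlators or scaling exponents would need additional machinery.

```python
import numpy as np

def two_point_eec(pt, eta, phi, bin_edges):
    """Binned two-point energy correlator of one jet (illustrative target).

    Each distinct constituent pair (i, j) contributes weight 2 * z_i * z_j
    at angular distance dR_ij, where z_i = pt_i / sum(pt).
    """
    pt = np.asarray(pt, dtype=float)
    eta = np.asarray(eta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    z = pt / pt.sum()                                # momentum fractions, sum to 1
    i, j = np.triu_indices(len(pt), k=1)             # distinct pairs with i < j
    dphi = np.angle(np.exp(1j * (phi[i] - phi[j])))  # wrap into (-pi, pi]
    dr = np.hypot(eta[i] - eta[j], dphi)
    weights = 2.0 * z[i] * z[j]                      # counts (i, j) and (j, i)
    hist, _ = np.histogram(dr, bins=bin_edges, weights=weights)
    return hist

def mask_tokens(pt, eta, phi, mask_fraction, rng):
    """Drop a random subset of constituents to form the masked input view."""
    pt, eta, phi = (np.asarray(a, dtype=float) for a in (pt, eta, phi))
    n = len(pt)
    n_mask = int(round(mask_fraction * n))
    masked = rng.choice(n, size=n_mask, replace=False)
    keep = np.setdiff1d(np.arange(n), masked)
    return pt[keep], eta[keep], phi[keep]

# Toy jet with three constituents (hypothetical values).
pt = np.array([50.0, 30.0, 20.0])
eta = np.array([0.0, 0.1, -0.1])
phi = np.array([3.1, -3.1, 3.0])
bin_edges = np.linspace(0.0, 1.0, 11)

target = two_point_eec(pt, eta, phi, bin_edges)      # regression target
rng = np.random.default_rng(0)
masked_view = mask_tokens(pt, eta, phi, 0.34, rng)   # network input
print(target)
```

In this setup the target is always computed from the full jet while the network only sees the masked view, so the model has to infer the energy-flow structure of the removed constituents. When every pair falls inside the binning range, the bins sum to 1 - Σ z_i², which gives a quick sanity check on the target.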
References:
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{gpt-5-energyflow-pretraining-lightray-2025,
author = {GPT-5},
title = {Energy-Flow Pretraining: Light-Ray Operator Targets for Self-Supervised LHC Foundations},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/8FvhEn1BKXEaLHBDOuru}
}