
How an Open Model and a Pile of Data are Changing Time Series Analysis
30 Jun 2025
MOMENT delivers an open-source foundation model and the "Time Series Pile," advancing low-supervision analysis and promoting transparent, open science.

When a Specialized Time Series Model Outshines General LLMs
30 Jun 2025
MOMENT excels in low-supervision tasks like forecasting and anomaly detection, often outperforming LLM-based models and showing strong scaling properties.

How Do You Train an AI to Understand Time? With a Giant Pile of Data.
30 Jun 2025
Built on the "Time Series Pile," MOMENT uses masked patch prediction to pre-train a versatile Transformer, ready for fine-tuning on diverse tasks.

Why Training on Time Series Beats Fine-Tuning LLMs for Time Series Tasks
30 Jun 2025
MOMENT uses masked patch pre-training on diverse time series, moving beyond single-dataset models and LLMs to explore large-scale, low-supervision learning.

How a New AI Model is Taming the Chaos of Time Series Data
30 Jun 2025
MOMENT: an open-source foundation model for time series, pre-trained on a massive "Time Series Pile" to excel at diverse tasks with limited supervision.

Transformer Theory & LLM References: Here's What You Should Check Out
25 Jun 2025
A concise list of key academic works informing our research on Transformer model dynamics, cross-entropy loss, and theoretical connections to Hopfield networks.

GPT-2 Architecture and Training Details: Parameters & Cross-Entropy Loss
24 Jun 2025
Explore the original GPT-2 model's architecture, including its training on WebText, BPE tokenizer, hidden dimensions, and layer parameters.

Theoretical Derivations: Cross-Entropy Loss and Energy Functions in LLMs
24 Jun 2025
Explore rigorous mathematical proofs, including properties of incomplete gamma functions, Stirling's approximation, and derivations of loss functions.

LogSumExp Function Properties: Lemmas for Energy Functions
24 Jun 2025
Explore key mathematical properties of the LogSumExp function, including bounds and continuity, which are crucial for understanding energy functions.