Substantiation

Methodology

Last updated: May 2026

This page explains how LottoLucky's scratch-off recommendations and savings claims are computed. The goal is transparency: a curious user, a regulator, or a competitor should be able to read this page and reproduce the work. If you're here because you saw a numeric claim in our app, on our website, or in an ad — this is where that number comes from.

§1. What we measure

For every active scratch-off game in every state we cover, we compute a simulated median session outcome — the typical dollar result of a $5,000 hypothetical play session on that game, given the prizes still remaining in the prize pool.

The median is the middle of the simulated distribution. Half of simulated $5,000 sessions on a game end above the median; half end below. The median is not a guarantee — individual sessions, especially small ones, will vary widely. But across the population of plays, the median is the most representative single-number summary of how a game is currently behaving.

When we say a game's simulated median session outcome is, for example, −$1,200 on a $5,000 session, we mean: in our Monte Carlo simulation of $5,000 of play, the middle outcome of those simulated sessions is a net loss of $1,200. The next $5,000 session you run will not necessarily produce that result, but it is the typical result across many simulated sessions.

We rank games within a price tier (the $5 tier, the $10 tier, $20, $30, etc.) by this simulated median. The top-ranked game — our "Lose Less" pick — is the game with the best (least-negative or, rarely, positive) simulated median in its price tier at the moment of the snapshot.
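As a rough illustration of the ranking step, the sketch below groups a snapshot by price tier and takes the game with the highest (least-negative) simulated median in each tier. The game names, medians, and the `rank_by_tier` helper are hypothetical and are not taken from our production code.

```python
from collections import defaultdict

# Hypothetical snapshot rows: (game name, ticket price in $, simulated median outcome in $).
snapshot = [
    ("Game A", 10, -1200.0),
    ("Game B", 10, -1650.0),
    ("Game C", 10, -1400.0),
    ("Game D", 20, -900.0),
    ("Game E", 20, -1100.0),
]

def rank_by_tier(rows):
    """Group games by ticket price and sort each tier best (highest) median first."""
    tiers = defaultdict(list)
    for name, price, median in rows:
        tiers[price].append((name, median))
    return {
        price: sorted(games, key=lambda g: g[1], reverse=True)
        for price, games in tiers.items()
    }

ranked = rank_by_tier(snapshot)
# The top-ranked game in each tier plays the role of the "Lose Less" pick.
lose_less_picks = {price: games[0][0] for price, games in ranked.items()}
print(lose_less_picks)  # {10: 'Game A', 20: 'Game D'}
```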

§2. How we compute it

Data source

Every state lottery publishes a prize-remaining table for each scratch-off game. The table lists prize tiers — for example, "$5 prize: 12,400 remaining" / "$1,000 prize: 84 remaining" / "$1,000,000 prize: 2 remaining" — and the total number of tickets remaining in the pool. We scrape these tables directly from each state lottery's official website.

The scraper fleet covers each state on a refresh cadence (hourly for high-traffic states, every few hours for slower-changing ones) and writes the raw counts to a database.

Simulation

For each game in the database, we run a Monte Carlo simulation of a $5,000 play session:

Buy a ticket. Draw a prize tier with probability equal to the tier's remaining-prize share of the total tickets remaining.
Track winnings.
Subtract the ticket price.
Repeat until $5,000 has been spent in net terms.
Record the final net dollar outcome of that simulated session.

Run this many times (we run a sufficient number per snapshot to produce stable percentile estimates) and you get a distribution of outcomes. The median of that distribution is the game's p50_long — the 50th percentile of the long-budget simulation.
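The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our production simulator: the prize counts are invented, tier probabilities are held fixed within a session (sampling with replacement), winnings are not reinvested, and a session ends once $5,000 of tickets have been bought. The function names and the session count of 1,000 are also hypothetical.

```python
import random
import statistics

def simulate_session(ticket_price, prizes_remaining, tickets_remaining,
                     budget=5000, rng=random):
    """Return the net dollar outcome of one simulated session.

    prizes_remaining maps prize value -> number of those prizes left in the pool.
    Assumption: each draw uses the same tier probabilities (with replacement).
    """
    non_winning = tickets_remaining - sum(prizes_remaining.values())
    values = list(prizes_remaining.keys()) + [0]
    weights = list(prizes_remaining.values()) + [non_winning]
    n_tickets = budget // ticket_price
    winnings = sum(rng.choices(values, weights=weights, k=n_tickets))
    return winnings - n_tickets * ticket_price

def simulated_median(ticket_price, prizes_remaining, tickets_remaining,
                     n_sessions=1000, seed=0):
    """Run many sessions and return the median net outcome (the p50_long figure)."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    outcomes = [
        simulate_session(ticket_price, prizes_remaining, tickets_remaining, rng=rng)
        for _ in range(n_sessions)
    ]
    return statistics.median(outcomes)

# Hypothetical $10 game; the counts are illustrative, not real lottery data.
prizes = {10: 200_000, 20: 80_000, 100: 8_000, 1_000: 84, 1_000_000: 2}
p50_long = simulated_median(10, prizes, tickets_remaining=1_000_000)
print(f"simulated median on a $5,000 session: {p50_long:,.0f}")
```

Sampling with replacement keeps the sketch short. A simulator that removes each drawn ticket from the pool would be closer to a physical session, though the difference is small when the pool is much larger than the number of tickets bought.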

We also record the 10th, 25th, 75th, and 90th percentiles so users can see what the realistic spread of outcomes looks like, not just the median. The percentile bands are how the in-app simulator renders the "bottom-quartile outcome / typical outcome / top-quartile outcome" preview.
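For readers who want to see how such bands can be derived from a list of simulated outcomes, here is a small sketch using Python's standard library. The `percentile_bands` helper and the choice of the "inclusive" interpolation method are assumptions made for illustration; the page does not state which estimator the production pipeline uses.

```python
import statistics

def percentile_bands(outcomes):
    """Return the 10th, 25th, 50th, 75th, and 90th percentiles of the outcomes."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%; cut point i sits at (i + 1) * 5%.
    cuts = statistics.quantiles(outcomes, n=20, method="inclusive")
    return {
        "p10": cuts[1],
        "p25": cuts[4],
        "p50": cuts[9],
        "p75": cuts[14],
        "p90": cuts[17],
    }

# Toy input: the integers 0..100, so each percentile equals its own label.
bands = percentile_bands(list(range(101)))
print(bands)
```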

Why median, not mean

In a game with a large top prize, the mean is dominated by the tiny probability of hitting the top prize times its dollar value. That mean is a real number, but it does not represent what a typical player experiences — it's a fat-tail artifact. The median is what the typical player session lands at, which is the right summary for "what should I expect if I play this game?"

We do report mean expected return on the game detail page for users who want it, but the headline ranking is by median because that's what reflects the player's typical experience.
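A small worked example may make the mean-versus-median point concrete. The prize table below is hypothetical, and the arithmetic assumes fixed per-ticket probabilities; it is meant only to show how much of the mean can come from a prize that almost no session ever hits.

```python
# Hypothetical $10 game; the counts are illustrative, not real lottery data.
ticket_price = 10
tickets_remaining = 1_000_000
prizes_remaining = {10: 200_000, 20: 80_000, 100: 8_000, 1_000: 84, 1_000_000: 2}

def mean_return_per_ticket(prizes, tickets):
    """Average prize dollars paid per ticket in the remaining pool."""
    return sum(value * count for value, count in prizes.items()) / tickets

mean_all = mean_return_per_ticket(prizes_remaining, tickets_remaining)
without_top = {v: c for v, c in prizes_remaining.items() if v != 1_000_000}
mean_without_top = mean_return_per_ticket(without_top, tickets_remaining)

# Share of the mean contributed by the two $1,000,000 prizes.
top_prize_share = (mean_all - mean_without_top) / mean_all

# Chance that a 500-ticket ($5,000) session contains at least one top prize.
tickets_per_session = 5000 // ticket_price
p_top_in_session = 1 - (1 - 2 / tickets_remaining) ** tickets_per_session

print(f"mean return per $10 ticket: ${mean_all:.3f}")
print(f"mean without the top prize: ${mean_without_top:.3f}")
print(f"share of mean from top prize: {top_prize_share:.1%}")
print(f"chance a session hits the top prize: {p_top_in_session:.2%}")
```

In this toy game the top prize supplies roughly 31% of the mean return, yet only about one session in a thousand ever sees it, which is why the median sits well below the mean.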

§3. Baselines we report against

When we say "save $X per $Y played" or "less loss vs. random picking," we're comparing the Lose Less pick to a baseline. The baseline is always stated. We use two.

Random in-tier baseline

Imagine a player who decides to buy $5,000 of $10 scratch-offs in their state, and picks a $10 game uniformly at random from whatever the state offers that week. Their expected median session outcome is the median across all the $10 games in the state's catalog. Our Lose Less pick beats this baseline by the per-tier savings number we report.

This is the conservative baseline. It assumes the user without our app would pick reasonably (just not informed by the math). It is what most users would actually do in the absence of guidance — pick whichever game catches the eye, with no systematic bias toward bad games.

Worst in-tier baseline

The lowest-ranked game in the same price tier as the Lose Less pick. Our pick beats this baseline by the larger best-vs-worst spread number.

This is the aggressive baseline. It assumes the user without our app could pick the worst game in the tier — which a fraction of users do, especially if they pick by jackpot intuition (the game with the biggest headline jackpot is sometimes also the worst game in the tier by simulated median session outcome).
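The two comparisons can be sketched as follows. The per-game medians are invented, and the random in-tier baseline is computed here as the median of the per-game medians, which is one reading of the definition above rather than a confirmed description of the production script.

```python
import statistics

# Hypothetical simulated medians for every $10 game in one state (dollars per $5,000 session).
tier_medians = {
    "Game A": -1200.0,
    "Game B": -1650.0,
    "Game C": -1400.0,
    "Game D": -1500.0,
    "Game E": -1900.0,
}

# Lose Less pick: the game with the best (least-negative) simulated median.
pick = max(tier_medians, key=tier_medians.get)

# Random in-tier baseline (assumed reading): median across the tier's per-game medians.
random_baseline = statistics.median(tier_medians.values())

# Worst in-tier baseline: the lowest-ranked game in the tier.
worst_baseline = min(tier_medians.values())

savings_vs_random = tier_medians[pick] - random_baseline
best_vs_worst_spread = tier_medians[pick] - worst_baseline

print(pick, savings_vs_random, best_vs_worst_spread)  # Game A 300.0 700.0
```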

When each applies

The random-in-tier baseline is the more defensible of the two: a regulator would ask "what's the typical alternative outcome?" and "random in-tier" is a fair representation of that. The best-vs-worst spread captures the size of the opportunity, because the gap between the best and worst game in a tier is genuinely substantial. We never use the worst-in-tier baseline in isolation as the headline savings figure; the framing always pairs it with the random-in-tier baseline so the reader sees both.

§4. Data sources

Public state lottery prize-remaining tables

The raw input is the prize-remaining table each state lottery publishes on its official website. These tables are publicly accessible. The states currently in our dataset are listed at our supported-states reference (linked from the main app). The list updates as we add new states; states are added only after the scraper has been tested and the data passes our quality-control gates.

Refresh cadence

Most states refresh in our database every hour. Some states refresh less frequently because their underlying pages do. The freshness for each state is visible on the in-app game detail page.

Quality control

Each scraper run is validated against the prior run with three sanity checks: total tickets remaining must be monotonically non-increasing, each prize tier count must also be non-increasing, and expected return must fall within reasonable bounds (values outside −100% to +50% are flagged). Games that fail QC are excluded from the simulation pool for that snapshot. Currently active games are defined as those with more than 10% of their smallest-prize tier still unclaimed; near-end-of-life games are excluded from the public claims dataset.
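A minimal sketch of these checks is shown below. The record layout, the function names, and the definition of expected return as (prize dollars per ticket ÷ ticket price) − 1 are assumptions for illustration; the production gates may differ in detail.

```python
def qc_failures(prev, curr, ticket_price):
    """Return a list of QC failure reasons for one game (empty list means pass).

    prev and curr are snapshots shaped like
    {"tickets_remaining": int, "prizes": {prize_value: count_remaining}}.
    """
    failures = []
    if curr["tickets_remaining"] > prev["tickets_remaining"]:
        failures.append("tickets remaining increased")
    for value, count in curr["prizes"].items():
        # A tier missing from the prior run is not treated as an increase here.
        if count > prev["prizes"].get(value, count):
            failures.append(f"prize tier ${value} count increased")
    if curr["tickets_remaining"] <= 0:
        failures.append("no tickets remaining")
        return failures
    payout = sum(value * count for value, count in curr["prizes"].items())
    expected_return = payout / (curr["tickets_remaining"] * ticket_price) - 1
    if not -1.0 <= expected_return <= 0.5:
        failures.append("expected return outside -100% to +50%")
    return failures

def is_active(initial_prizes, current_prizes, threshold=0.10):
    """True if more than 10% of the smallest-prize tier is still unclaimed."""
    smallest = min(initial_prizes)
    return current_prizes.get(smallest, 0) / initial_prizes[smallest] > threshold

prev = {"tickets_remaining": 1000, "prizes": {5: 100, 1000: 2}}
curr = {"tickets_remaining": 900, "prizes": {5: 90, 1000: 2}}
print(qc_failures(prev, curr, ticket_price=5))  # []
```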

Reproducibility

The analysis script that generates our marketing claims (claims_playbook_analysis.py) and the supporting savings-claim script (savings_claim_analysis.py) both produce a frozen .xlsx snapshot that can be inspected to verify any claim we make.

The frozen dataset for each claim version is archived; the SHA-256 hash of that file is recorded in our internal audit trail. If you would like to verify a specific claim, contact us at the address in §7 and we will provide the frozen dataset and the script SHA used to generate it.
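If you receive a frozen dataset from us, you can check it against the hash we quote using any standard SHA-256 tool. The sketch below uses Python's `hashlib`; the `sha256_of_file` helper is hypothetical, and the self-check runs on a throwaway temporary file rather than on a real snapshot.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Self-check on a throwaway file standing in for a frozen .xlsx snapshot.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"frozen snapshot bytes")
demo_hash = sha256_of_file(tmp.name)
os.remove(tmp.name)

print(demo_hash == hashlib.sha256(b"frozen snapshot bytes").hexdigest())  # True
```

To verify a real file, call `sha256_of_file` on the path of the snapshot you were sent and compare the result with the hash in our reply.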

§5. What our claims do NOT say

This section is the most important part of the page.

We do not predict winning tickets

The app does not tell you which scratch-off ticket will win. Nobody can. The lottery is a game of chance — the outcome of any specific scratch is, by design, random within the constraints of the prize pool. What we do is compute the probability distribution of outcomes given the current prize pool, and we report what the median outcome looks like.

The difference matters legally and ethically. "Predict winners" is a fortune-telling claim; we never make it. "Project the simulated outcome distribution" is a math claim; we make it freely.

We do not guarantee profitability

We do not claim that following our recommendations leads to net positive returns. Across substantially all scratch-off games, the expected return over the game's full life is negative, because the state lottery designs the game to retain a margin. Occasionally and briefly, a game can reach positive expected value when enough top prizes remain unclaimed while the tickets in the pool are nearly depleted. We flag these moments. But our headline claim is about reduced expected loss, not expected gain.

If you're playing scratch-offs to make money, the honest math says you will lose money on average over time. The app's job is to help you lose less of it.

Median outcomes are not guaranteed outcomes

When we say a game's simulated median is −$1,200 on a $5,000 session, the next $5,000 session you run on that game is not guaranteed to end at −$1,200. About half of $5,000 sessions on that game will end better than −$1,200; about half will end worse. Variance is high. The median is a population statistic — it describes what is typical across many sessions, not what will happen in any single one.

Comparative claims compare medians, not outcomes

When we say our Lose Less pick saves $X vs. random in-tier picking, we mean: the median session outcome for our pick is $X better than the median session outcome for random in-tier picking. Any individual session on either choice may produce a result far above or below either median.

No state lottery has endorsed us

LottoLucky is an independent analytics product. No state lottery, lottery commission, or government agency has reviewed, approved, or endorsed our recommendations. We have no contractual relationship with any state lottery. We scrape public prize-pool data; we do not have insider information.

§6. Sample size, date window, refresh cadence

Sample size

Our claims dataset includes a number of state-price-week cells across the states we cover. The exact count appears in the footnote of every numeric claim we publish ("based on N cells across M states"). The minimum threshold for a published claim is 25 cells per axis — claims on slices of the data with fewer than 25 cells are not published.

Date window

Every claim cites the date window of the underlying data. The window starts at the earliest snapshot in the dataset used for the published claim and ends at the latest. We refresh the dataset quarterly and update the date window with each refresh.

Refresh triggers

The claims dataset is regenerated quarterly. Additional refreshes are triggered when:

A new state is added to our supported list.
A state is removed from our supported list.
We change the underlying ranking algorithm.
A new paid campaign over $5,000 is about to launch.

The headline claim's underlying number rarely shifts by more than ~5% quarter-over-quarter once the dataset is mature. If it shifts by more than 25%, we halt all advertising using the prior number until copy is reviewed.
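The halt rule amounts to a simple relative-change check, sketched below. Measuring the shift relative to the prior quarter's number is an assumption, as are the function names; the 25% threshold comes from the paragraph above.

```python
def relative_shift(previous, current):
    """Absolute change in the headline number as a fraction of the prior value."""
    return abs(current - previous) / abs(previous)

def must_halt_advertising(previous, current, threshold=0.25):
    """True if the headline number moved by more than the threshold."""
    return relative_shift(previous, current) > threshold

# Hypothetical headline savings figures, in dollars per $5,000 played.
print(must_halt_advertising(300.0, 310.0))  # False (about a 3% shift)
print(must_halt_advertising(300.0, 400.0))  # True (about a 33% shift)
```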

§7. Contact for verification

If you are a regulator, a journalist, or a curious user with a substantive methodology question, you can reach us at support@lottolucky.app.

For a substantiation request — verification of a specific numeric claim — please include:

The exact claim text you are asking about, copied verbatim from the surface where you saw it.
The surface (e.g., "Instagram ad seen on 2026-05-13," "App Store description as of 2026-05-13," "press release dated 2026-05-13").

We will respond within 14 business days with the SHA-256 hash of the frozen dataset used to substantiate the claim, the analysis script's git SHA at the time of claim generation, and, if this page has changed since the claim shipped, the methodology version that was in effect at that time.

This page is maintained as the substantiation anchor for LottoLucky's marketing claims. It is reviewed quarterly alongside the claims refresh. The latest version is always available at lottolucky.app/methodology.

Lottery games are games of chance. If you or someone you know has a gambling problem, call 1-800-GAMBLER. State-specific helplines are listed in the footer of every page.