Methodology
How the rankings are built
This page documents how the Top 100 list is constructed, what is in the data, and what is deliberately out. The Landau-problems ranking uses a three-source composite design: arXiv preprint output, OpenAlex topical citations, and zbMATH MSC classifications. Because the four Landau problems draw on the same analytic techniques, the term set for this site is the union of the four problem-specific sets: Goldbach, twin primes, Legendre, and n²+1, together with the shared methods (circle method, prime gaps, Hardy-Littlewood, Chen's theorem, singular series, almost primes) that support all four.
Data sources
| Source | What it gives | Limitations |
|---|---|---|
| arXiv (math.NT + math.CO) | Preprint-level: titles, abstracts, authors, dates, co-author graph | Biased toward people who post preprints. Senior figures who publish only in journals are undercounted. |
| OpenAlex | Author-level: paper count, citations, affiliations, country | Concept tagging is noisy in math; surname-only matching can misidentify authors. Broad terms pull in authors with only tangential connections to the Landau problems. |
| zbMATH Open | Curated math review database; editor-assigned MSC classification (we use five core classes) | The REST API can be slow; coverage otherwise excellent, especially for older and non-Western mathematicians. |
Pipeline
Title-weighting. A paper can mention Goldbach or twin primes without being about them. To separate genuine work from passing mentions, the arXiv and OpenAlex pipelines weight a keyword match by where it appears: a match in the paper title counts at full weight, and a match only in the abstract counts at half (a factor of 0.5). zbMATH is not title-weighted, because its documents are classified by human editors, so the subject class itself is the relevance signal.
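The title-weighting rule can be sketched in a few lines of Python. This is an illustration only, not the site's actual code: the function names, the `title` and `abstract` dictionary keys, and the use of case-insensitive substring matching are all assumptions made for the example. Only the weights (1.0 for a title match, 0.5 for an abstract-only match) come from the description above.

```python
TITLE_WEIGHT = 1.0     # match found in the title
ABSTRACT_WEIGHT = 0.5  # match found only in the abstract


def match_weight(title: str, abstract: str, terms: list[str]) -> float:
    """Weight one paper by where a search term appears.

    Assumption: matching is a case-insensitive substring test.
    """
    title_l = title.lower()
    abstract_l = abstract.lower()
    if any(term.lower() in title_l for term in terms):
        return TITLE_WEIGHT
    if any(term.lower() in abstract_l for term in terms):
        return ABSTRACT_WEIGHT
    return 0.0


def weighted_paper_count(papers: list[dict], terms: list[str]) -> float:
    """Sum the per-paper weights for one author (hypothetical helper)."""
    return sum(match_weight(p["title"], p["abstract"], terms) for p in papers)
```

Under these assumptions, an author with one paper titled after the Goldbach conjecture and one paper that mentions twin primes only in its abstract would receive a weighted count of 1.5 rather than 2.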
- arXiv pull: 17 search terms covering the four Landau problems and shared methods (Goldbach conjecture, twin primes, bounded gaps between primes, small gaps between primes, Legendre's conjecture, primes in short intervals, prime gaps, Hardy-Littlewood, prime k-tuples, circle method, distribution of prime numbers, Maynard-Tao, exceptional set, consecutive primes, almost prime, singular series, and Chen's theorem), restricted to the math.NT and math.CO categories. Each paper's contribution to an author is title-weighted as above. A co-authorship graph is built, and eigenvector centrality on that graph is the second factor in the arXiv composite: `0.60 * pr(weighted papers) + 0.40 * pr(eigen)`. Authors with at least 3 topical papers qualify.
- OpenAlex pull: the same phrase queries, with a cap of 10 authors per work to exclude large collaborative papers only tangentially related to the Landau problems. Works and their citations are title-weighted as above. Composite: `0.60 * pr(weighted works) + 0.40 * pr(weighted citations)`.
- zbMATH pull: documents tagged with any of the five core MSC classes: 11P32 (Goldbach-type theorems; other additive questions involving primes), 11N05 (distribution of primes), 11N35 (sieves), 11N36 (applications of sieve methods), and 11N32 (primes represented by polynomials; other multiplicative structures of polynomial values). The editor-assigned MSC classes correct a systematic gap in the other sources: the undercounting of pre-1995 number theorists and of specialists who publish in journals with sparse arXiv presence. Over 11,000 zbMATH documents were collected across the five classes.
- Merge and scoring: the three pipeline rankings are deduplicated by surname and joined. The available ranks are combined with a weighted order statistic: each researcher's ranks are sorted and weighted `0.70` on the best, `0.20` on the middle, and `0.10` on the worst. Sorting before weighting means the method rewards excellence in any one pipeline, while a researcher strong across all of them still finishes ahead. A lower combined score ranks higher.
- Estimating a missing rank (interpolation): a researcher ranked by only one of the pipelines is not given a flat penalty. To estimate a missing rank, we order the whole pool by a pipeline the researcher does appear in, then walk outward to the two nearest researchers above and the two nearest below who carry a real rank in the missing pipeline, and average those (up to four) values. The `0.70` top weight may only land on a measured rank, so an estimate can support a score but can never be the headline signal. Estimated ranks show in [square brackets] on the Top 100 table; measured ranks show plain.
- Hand-curated edits: an exclusions file removes researchers the automated pipeline surfaced in error. The merge does not hand-place any researcher; everyone earns their rank from the pipeline scores.
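As an illustration of the per-pipeline composites, the sketch below combines two raw signals with the 0.60 / 0.40 weights quoted in the pipeline steps. It rests on assumptions: `pr(...)` is read here as a percentile rank (the share of the pool with a value at or below the author's), which the text does not spell out, and all function names are hypothetical. The raw inputs (weighted paper counts, eigenvector centrality, weighted citations) are taken as already computed.

```python
def percentile_rank(values: dict[str, float]) -> dict[str, float]:
    """Share of the pool whose value is <= each author's value.

    Assumption: this is what pr(...) means in the composite formulas.
    """
    n = len(values)
    return {
        name: sum(1 for v in values.values() if v <= x) / n
        for name, x in values.items()
    }


def composite(primary: dict[str, float], secondary: dict[str, float],
              w_primary: float = 0.60, w_secondary: float = 0.40) -> dict[str, float]:
    """0.60 * pr(primary) + 0.40 * pr(secondary) for every author in `primary`.

    Authors missing from `secondary` get 0.0 for that factor (an assumption).
    """
    pr_p = percentile_rank(primary)
    pr_s = percentile_rank(secondary)
    return {
        name: w_primary * pr_p[name] + w_secondary * pr_s.get(name, 0.0)
        for name in primary
    }


def to_ranks(scores: dict[str, float]) -> dict[str, int]:
    """Convert composite scores to ranks, with 1 for the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}
```

For the arXiv pipeline, `primary` would hold weighted paper counts and `secondary` the eigenvector centralities; for OpenAlex, weighted works and weighted citations. The resulting per-pipeline ranks are what the merge step consumes.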
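The merge and interpolation steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the site's implementation. The function names and data shapes are hypothetical. The text also does not say what happens when an estimated rank would sort into the best slot, so the sketch assumes the best measured rank takes the 0.70 weight and the remaining two ranks fill the other slots in ascending order.

```python
from typing import Optional

WEIGHTS = (0.70, 0.20, 0.10)  # best, middle, worst


def estimate_missing_rank(name: str, known: dict[str, int],
                          missing: dict[str, int], k: int = 2) -> Optional[float]:
    """Estimate `name`'s rank in a pipeline where they have none.

    `known` maps researcher -> rank in a pipeline that does rank `name`;
    `missing` maps researcher -> measured rank in the other pipeline.
    Walk outward from `name` in the `known` ordering and average the ranks of
    up to `k` nearest neighbours on each side that have a measured rank.
    """
    ordered = sorted(known, key=known.get)
    i = ordered.index(name)
    above = [missing[n] for n in reversed(ordered[:i]) if n in missing][:k]
    below = [missing[n] for n in ordered[i + 1:] if n in missing][:k]
    neighbours = above + below
    if not neighbours:
        return None  # nothing to interpolate from
    return sum(neighbours) / len(neighbours)


def combined_score(ranks: list[tuple[float, bool]]) -> float:
    """Weighted order statistic over three (rank, is_measured) pairs.

    Assumption: the best *measured* rank takes the 0.70 weight; the other
    two ranks, measured or estimated, are sorted into the 0.20 and 0.10 slots.
    """
    if len(ranks) != 3:
        raise ValueError("expected exactly three ranks")
    measured = [r for r, is_measured in ranks if is_measured]
    if not measured:
        raise ValueError("at least one rank must be measured")
    top = min(measured)
    rest = sorted(r for r, _ in ranks)
    rest.remove(top)  # keep the two ranks that did not take the top slot
    return sum(w * r for w, r in zip(WEIGHTS, [top] + rest))
```

A researcher with measured ranks 5 and 12 and an estimated rank of 32.5 would score 0.70 × 5 + 0.20 × 12 + 0.10 × 32.5 = 9.15 under this sketch; lower is better.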
MSC class notes
The five zbMATH classes span all four Landau problems. 11P32 covers Goldbach-type and other additive questions involving primes. 11N35 covers sieve methods in general. 11N05 covers prime distribution broadly (relevant to Legendre and twin primes). 11N36 covers sieve applications (relevant to twin primes and to almost-prime analogues of all four problems). 11N32 covers primes represented by polynomials (the n²+1 problem specifically).
What is not in this list
- Researchers without a strong digital footprint. The pipeline indexes arXiv well and OpenAlex moderately; the zbMATH layer corrects for journal-only publishing.
- Single-breakthrough authors. The ranking measures sustained output. A researcher whose contribution to the Landau-problems family is one landmark paper will rank lower than a productive researcher with many related papers.
- Adjacent topics. Some of the 100 work mainly on related questions (primes in arithmetic progressions, the Riemann hypothesis, multiplicative number theory) rather than the four Landau problems directly. Title-weighting reduces, but does not eliminate, this adjacency.