| 362 | SkillsBench: Benchmarking how well agent skills work across diverse tasks | 16 Feb |
| 205 | Evaluating AGENTS.md: are they helpful for coding agents? | 16 Feb |
| 106 | Towards Autonomous Mathematics Research | 15 Feb |
| 544 | Frontier AI agents violate ethical constraints 30–50% of time, pressured by KPIs | 10 Feb |
| 212 | Shifts in U.S. Social Media Use, 2020–2024: Decline, Fragmentation, Polarization (2025) | 8 Feb |
| 186 | First Proof | 7 Feb |
| 164 | Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation | 4 Feb |
| 236 | How AI impacts skill formation | 30 Jan |
| 330 | Vibe coding kills open source | 26 Jan |
| 123 | Challenges and Research Directions for Large Language Model Inference Hardware | 25 Jan |
| 138 | Binary fuse filters: Fast and smaller than xor filters (2022) | 17 Jan |
| 125 | Comparing AI agents to cybersecurity professionals in real-world pen testing | 6 Jan |
| 194 | High-Performance DBMSs with io_uring: When and How to use it | 6 Jan |
| 161 | Recursive Language Models | 3 Jan |
| 121 | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks (2018) | 2 Jan |
| 217 | Professional software developers don't vibe, they control | 30 Dec |
| 131 | Universal Reasoning Model (53.8% pass 1 ARC1 and 16.0% ARC 2) | 22 Dec |
| 155 | A quarter of US-trained scientists eventually leave | 15 Dec |
| 154 | Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise | 10 Dec |
| 358 | The universal weight subspace hypothesis | 9 Dec |
| 113 | Zebra-Llama – Towards efficient hybrid models | 6 Dec |
| 704 | How elites could shape mass preferences as AI reduces persuasion costs | 4 Dec |
| 129 | Transformers know more than they can tell: Learning the Collatz sequence | 3 Dec |
| 136 | Program-of-Thought Prompting Outperforms Chain-of-Thought by 15% (2022) | 30 Nov |
| 124 | Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos | 26 Nov |
| 141 | An Economy of AI Agents | 23 Nov 25 |
| 384 | Adversarial poetry as a universal single-turn jailbreak mechanism in LLMs | 20 Nov 25 |
| 222 | Solving a million-step LLM task with zero errors | 18 Nov 25 |
| 130 | TiDAR: Think in Diffusion, Talk in Autoregression | 15 Nov 25 |
| 239 | The Principles of Diffusion Models | 9 Nov 25 |
| 180 | Making Democracy Work: Fixing and Simplifying Egalitarian Paxos | 8 Nov 25 |
| 174 | LLMs encode how difficult problems are | 6 Nov 25 |
| 115 | Continuous Autoregressive Language Models | 5 Nov 25 |
| 218 | Reasoning models reason well, until they don't | 31 Oct 25 |
| 231 | Language models are injective and hence invertible | 30 Oct 25 |
| 305 | A definition of AGI | 26 Oct 25 |
| 120 | Antislop: A framework for eliminating repetitive patterns in language models | 23 Oct 25 |
| 134 | The Dragon Hatchling: The missing link between the transformer and brain models | 22 Oct 25 |
| 161 | Why can't transformers learn multiplication? | 21 Oct 25 |
| 237 | Modern iOS Security Features – A Deep Dive into SPTM, TXM, and Exclaves | 13 Oct 25 |
| 110 | Less Is More: Recursive Reasoning with Tiny Networks | 7 Oct 25 |
| 105 | How to inject knowledge efficiently? Knowledge infusion scaling law for LLMs | 4 Oct 25 |
| 143 | High-resolution efficient image generation from WiFi Mapping | 1 Oct 25 |
| 143 | Introduction to Multi-Armed Bandits (2019) | 30 Sep 25 |
| 195 | Extract-0: A specialized language model for document information extraction | 30 Sep 25 |
| 101 | Bit is all we need: binary normalized neural networks | 26 Sep 25 |
| 103 | Are elites meritocratic and efficiency-seeking? Evidence from MBA students | 23 Sep 25 |
| 152 | Paper2Agent: Stanford Reimagining Research Papers as Interactive AI Agents | 22 Sep 25 |
| 181 | We Politely Insist: Your LLM Must Learn the Persian Art of Taarof | 22 Sep 25 |
| 146 | Unified Line and Paragraph Detection by Graph Convolutional Networks (2022) | 21 Sep 25 |