#6 Weekly Digest: 23rd Aug 2024
"Business: FTC banning fake reviews and fake deep porn videos, Meta's silent crawler, Uber/GM partnership on robo-taxi service by 2025, SaaS EV/NTM & FCF Margin slowdowns, North America's data centre build out - 10% addition in 6 months
Technology: IEEE Spectrum Programming Language rankings, AI21Labs releasing Jamba 1.5 models offering 256K token windows
Resources: Silicon Valley Canon - the books they read, AGI Safety from First Principles"
AI in Businesses
Federal Trade Commission Announces Final Rule Banning Fake Reviews and Testimonials. The rule will allow the agency to strengthen enforcement, seek civil penalties against violators, and deter AI-generated fake reviews. This follows the earlier ban on the production and distribution of non-consensual deepfake porn (ftc.gov)
Meta is quietly crawling the internet for much-needed AI training data. Most websites' settings do not explicitly block Meta's crawler, simply because site owners don't know its name the way they do for OpenAI's and Google's bots. Before we know it, Meta may possess the biggest chunk of the internet in its data bank (an estimated ~25% of websites currently try to prevent crawling)
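Whether a crawler is blocked comes down to each site's robots.txt rules, which can only name user agents the site owner knows about. Below is a minimal sketch using Python's standard urllib.robotparser; the user-agent strings are assumptions for illustration, not a definitive list of crawler names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agents to test against a site's robots.txt.
# The OpenAI and Google names are widely published; the Meta name below is
# an assumption for illustration - a site that has never heard of a crawler
# cannot block it by name.
USER_AGENTS = ["GPTBot", "Google-Extended", "Meta-ExternalAgent"]

def crawl_permissions(site: str) -> dict[str, bool]:
    """Return whether each user agent may fetch the site's homepage."""
    parser = RobotFileParser()
    parser.set_url(f"{site.rstrip('/')}/robots.txt")
    parser.read()  # fetches and parses robots.txt
    return {ua: parser.can_fetch(ua, site) for ua in USER_AGENTS}

if __name__ == "__main__":
    # A site that blocks GPTBot by name may still allow a crawler it never listed.
    print(crawl_permissions("https://example.com"))
```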
Uber and GM are partnering to roll out a robo-taxi service by 2025. It is a renewed attempt for Uber, which abandoned its robo-taxi project back in 2018 after a pedestrian was killed; likewise for GM, which cancelled its project following an accident in 2023. This comes amid news that Waymo has doubled its number of rides to 100K since May 2024! (Laura Kolodny, CNBC)
Slowdown in SaaS revenue and margins, shown in these charts - this newsletter is included for its SaaS metric charts covering US-listed companies. Every metric points to a broad slowdown in the US economy, reflected in deteriorating Free Cash Flow (FCF) margins and Enterprise Value to Next-Twelve-Months revenue (EV/NTM) multiples (Clouded Judgement - Substack Channel)
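For readers unfamiliar with the two metrics, the sketch below spells out their standard definitions with hypothetical figures; the numbers are purely illustrative and not taken from the newsletter's charts.

```python
def fcf_margin(free_cash_flow: float, revenue: float) -> float:
    """Free Cash Flow margin: the share of revenue that survives as cash."""
    return free_cash_flow / revenue

def ev_to_ntm(enterprise_value: float, ntm_revenue: float) -> float:
    """EV/NTM: enterprise value divided by expected next-twelve-months revenue."""
    return enterprise_value / ntm_revenue

# Hypothetical figures for an illustrative SaaS company, in $M.
print(f"FCF margin: {fcf_margin(120, 1_000):.0%}")    # 12%
print(f"EV/NTM:     {ev_to_ntm(7_000, 1_150):.1f}x")  # ~6.1x
```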
North America's eight primary data centre markets added 515 megawatts (MW) of new supply in the first half of 2024 - more than Silicon Valley's entire existing inventory. All of Silicon Valley has 459 MW of data centre supply, while those main markets now total 5,689 MW. That is up 10% from a year ago and about double what it was five years ago
Technology updates from AI
The top programming languages of 2024 - the top positions are unchanged from the previous year and are pretty easy guesses at this point (Python, Java, JavaScript, C++). There are two rising stars, though: TypeScript and Rust. In the Jobs ranking, TypeScript has risen from 11th place to 4th! (spectrum.ieee.org)
AI21 Labs released Jamba 1.5: two new hybrid Transformer/SSM models with 52B and 398B parameters. Jamba blocks are built from Transformer layers interleaved with Mamba state-space layers and Mixture-of-Experts layers. The resulting models handle 256K-token contexts, which is particularly useful in RAG tasks, while remaining performant on other metrics (The Kaitchup - Substack Channel)
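For readers who want to try the long context window, here is a minimal sketch using the Hugging Face transformers library. The model id, the document file names and the hardware assumptions are illustrative guesses, not details from the newsletter; even the 52B Mini model needs serious multi-GPU hardware or quantisation to run.

```python
# Minimal sketch: a RAG-style prompt that leans on Jamba 1.5's long context.
# Assumes the model id below exists on the Hugging Face Hub and that enough
# GPU memory (or a quantised variant) is available - both are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Mini"  # assumed identifier - verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard across available GPUs
    torch_dtype="auto",
)

# Stuff a pile of retrieved documents into the prompt and ask about them.
docs = "\n\n".join(open(p).read() for p in ["doc1.txt", "doc2.txt"])  # hypothetical files
prompt = f"{docs}\n\nQuestion: What do these documents say about data centre power?\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```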
Long Read
Epoch AI published a report on AI scaling through to 2030, with training compute expanding at a rate of approximately 4x per year. To put this 4x annual growth in AI training compute into perspective, it outpaces even some of the fastest technological expansions in recent history: it surpasses the peak growth rates of mobile phone adoption (2x/year, 1980-1987), solar energy capacity installation (1.5x/year, 2001-2010), and human genome sequencing (3.3x/year, 2008-2015)
By 2030 it will very likely be possible to train models that exceed GPT-4 in scale to the same degree that GPT-4 exceeds GPT-2 in scale (a rough check of that claim is sketched below)
Power is likely to be the binding constraint among the four discussed. It would still permit training runs on the scale of 1e28 to 3e29 FLOP, roughly 10,000 times the GPT-4 training run. This figure assumes the training run happens within a single data centre; given latency and data-volume considerations, that is a fair assumption to make.
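A quick back-of-envelope check of those comparisons, using the commonly cited public training-compute estimates for GPT-2 and GPT-4 (approximate figures, not taken from the report itself):

```python
# Approximate public training-compute estimates (assumptions, not report figures).
GPT2_FLOP = 1.5e21
GPT4_FLOP = 2e25          # the Epoch report also uses ~2e25 for GPT-4
REPORT_2030_FLOP = 2e29   # the report's "likely possible by 2030" figure

print(f"GPT-4 over GPT-2: ~{GPT4_FLOP / GPT2_FLOP:,.0f}x")         # ~13,000x
print(f"2030 over GPT-4:  ~{REPORT_2030_FLOP / GPT4_FLOP:,.0f}x")  # ~10,000x

# Sanity check against the 4x/year trend: compounding for six to seven years
# from GPT-4's scale lands in the same ballpark as the report's estimate.
for years in (6, 7):
    print(f"4x/year for {years} years: {GPT4_FLOP * 4**years:.1e} FLOP")
# -> ~8e28 and ~3e29, bracketing the report's 2e29 figure.
```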
Report abstract
Power constraints. Plans for data centre campuses of 1 to 5 GW by 2030 have already been discussed, which would support training runs ranging from 1e28 to 3e29 FLOP (for reference, GPT-4 was likely around 2e25 FLOP). Geographically distributed training could tap into multiple regions’ energy infrastructure to scale further. Given current projections of US data centre expansion, a US distributed network could likely accommodate 2 to 45 GW, which assuming sufficient inter-data center bandwidth would support training runs from 2e28 to 2e30 FLOP. Beyond this, an actor willing to pay the costs of new power stations could access significantly more power, if planning 3 to 5 years in advance.
Chip manufacturing capacity. AI chips provide the compute necessary for training large AI models. Currently, expansion is constrained by advanced packaging and high-bandwidth memory production capacity. However, given the scale-ups planned by manufacturers, as well as hardware efficiency improvements, there is likely to be enough capacity for 100M H100-equivalent GPUs to be dedicated to training to power a 9e29 FLOP training run, even after accounting for the fact that GPUs will be split between multiple AI labs, and in part dedicated to serving models. However, this projection carries significant uncertainty, with our estimates ranging from 20 million to 400 million H100 equivalents, corresponding to 1e29 to 5e30 FLOP (5,000 to 300,000 times larger than GPT-4).
Data scarcity. Training large AI models requires correspondingly large datasets. The indexed web contains about 500T words of unique text, and is projected to increase by 50% by 2030. Multimodal learning from image, video and audio data will likely moderately contribute to scaling, plausibly tripling the data available for training. After accounting for uncertainties on data quality, availability, multiple epochs, and multimodal tokenizer efficiency, we estimate the equivalent of 400 trillion to 20 quadrillion tokens available for training by 2030, allowing for 6e28 to 2e32 FLOP training runs. We speculate that synthetic data generation from AI models could increase this substantially.
Latency wall. The latency wall represents a sort of “speed limit” stemming from the minimum time required for forward and backward passes. As models scale, they require more sequential operations to train. Increasing the number of training tokens processed in parallel (the ‘batch size’) can amortize these latencies, but this approach has a limit. Beyond a ‘critical batch size’, further increases in batch size yield diminishing returns in training efficiency, and training larger models requires processing more batches sequentially. This sets an upper bound on training FLOP within a specific timeframe. We estimate that cumulative latency on modern GPU setups would cap training runs at 3e30 to 1e32 FLOP. Surpassing this scale would require alternative network topologies, reduced communication latencies, or more aggressive batch size scaling than currently feasible.
Bottom line. While there is substantial uncertainty about the precise scales of training that are technically feasible, our analysis suggests that training runs of around 2e29 FLOP are likely possible by 2030. This represents a significant increase in scale over current models, similar to the size difference between GPT-2 and GPT-4. The constraint likely to bind first is power, followed by the capacity to manufacture enough chips. Scaling beyond would require vastly expanded energy infrastructure and the construction of new power plants, high-bandwidth networking to connect geographically distributed data centers, and a significant expansion in chip production capacity.
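The data-scarcity estimate above can be sanity-checked with the standard compute-optimal scaling heuristics (training FLOP ≈ 6·N·D with D ≈ 20·N); these are rough rules of thumb assumed here for illustration, not formulas stated in the abstract.

```python
def compute_optimal_flop(tokens: float) -> float:
    """Chinchilla-style heuristic: FLOP ~ 6*N*D with D ~ 20*N,
    i.e. FLOP ~ 0.3 * D**2. A rough rule of thumb, not the report's model."""
    params = tokens / 20
    return 6 * params * tokens

for tokens in (4e14, 2e16):  # 400 trillion and 20 quadrillion tokens
    print(f"{tokens:.0e} tokens -> ~{compute_optimal_flop(tokens):.0e} FLOP")
# -> ~5e28 and ~1e32, in line with the abstract's 6e28 to 2e32 range.
```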
Resources
AGI Safety from First Principles. This article, published on the Alignment Forum in September 2020, is an excellent formulation of the AI alignment problem and presents one of the most compelling cases for why the development of artificial general intelligence (AGI) might pose an existential threat (Richard Ngo, Alignment Forum)
The Silicon Valley Canon: how these popular titles - political, organisational and even sci-fi - shape the ethos of Silicon Valley, which in turn forms the core of the tech elite (The Scholar's Stage)