"time to GPT-2", down to 2.91 hours

(twitter.com)

72 points | by tosh 11 hours ago

6 comments

  • janalsncm 9 hours ago
    Never used GPU spot instances before, but I'd have to imagine getting interrupted is pretty annoying.
    • somehowadev 8 hours ago
      As long as your workload can handle resuming and your instances aren't heavily in demand (check the eviction rates), the cost savings are substantial enough for us to take the occasional interruption.

      I do wish Azure gave more than the 30-second eviction warning (AWS gives two minutes), but it's still usable.

    • hhh 9 hours ago
      It depends. Our workloads can finish up in under two minutes and shut down without much effort, so we haven't really noticed it, outside of one time when we had no spot capacity.
      • janalsncm 9 hours ago
        I guess if checkpointing is set up correctly and your runtime is saved to a Docker image, it's feasible. I assume you're probably not going to get a continuous three-hour chunk of time, though.
        • direwolf20 8 hours ago
          When I used spot instances, it wasn't that bad. You're likely to keep an instance for three hours.
      • joeig 9 hours ago
        In addition to the two-minute interruption notice, rebalance recommendations[0] allow you to handle interruptions even more gracefully.

        [0] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/rebalanc...
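
A minimal sketch of the checkpoint-and-resume pattern discussed in this thread. The interruption-notice URL is AWS's documented spot instance-metadata endpoint (it returns 200 only once a notice has been issued); the JSON checkpoint format, the `check_interrupt` hook, and the stand-in training step are illustrative assumptions, not any particular project's code.

```python
import json
import os
import urllib.request

# Real AWS endpoint: 404 until a spot interruption notice is issued.
INTERRUPTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def interruption_imminent(timeout=0.2):
    """True if AWS has issued the two-minute spot interruption notice."""
    try:
        with urllib.request.urlopen(INTERRUPTION_URL, timeout=timeout):
            return True
    except Exception:
        # HTTPError 404 (no notice yet) or not running on EC2 at all.
        return False


def save_checkpoint(path, step, state):
    # Write to a temp file, then atomically rename, so an eviction
    # mid-write can never leave a corrupt checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["step"], ckpt["state"]


def train(path, total_steps, checkpoint_every=100,
          check_interrupt=interruption_imminent):
    # Resume from wherever the previous instance left off.
    step, state = load_checkpoint(path)
    while step < total_steps:
        state["loss"] = 1.0 / (step + 1)  # stand-in for one real training step
        step += 1
        if step % checkpoint_every == 0 or check_interrupt():
            save_checkpoint(path, step, state)
    save_checkpoint(path, step, state)
    return step, state
```

With this structure, an evicted run simply restarts the same command and loses at most `checkpoint_every` steps of work, which is why the occasional interruption can be an acceptable trade for spot pricing.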

  • ProllyInfamous 4 hours ago
    My favorite GPT-2 experience was setting homepages to the now-deprecated website which used a 2021 LLM to generate new words (this word does not exist.com).

    Its definitions didn't always make sense [2], but I kept a curated list of "my favorite computer-generated words" (and recommended that crossword/Scrabble-type friends set the same homepage).

    Favorites were sent via postcards, or shared over coffee.

    Mom, you were a real nayter [1] — shall miss our decades of scalpable [Z] glossip [0].

    ----

    [1] nayter (v.): to present one's viewpoint as true rather than having exaggerated or undeserved moral beliefs

    [2] ...the second definition was always static text: "a word that does not exist; it was invented, defined and used by a machine learning algorithm" with a link explaining what GPT-2 was.

    [0] glossip (n.): a term of abuse attached to words used to discredit the person or group with which the author of a story is confiding

    [Z] scalpable (adj.): able to be moved with considerable ease and without needing to be lifted or deformed

    [bonus] dumbfuck (adj.): (especially of an electric guitar) having a fixed flat fretboard or tuned tuner

  • blitzar 6 hours ago
    > GPT-2 (7 years ago): too dangerous to release.

    With the benefit of hindsight you can see that the charitable foundation to benefit mankind was a grift all along.

    • gallerdude 6 hours ago
      I'm not sure I agree. LLMs have the feel of an alien new technology, and especially did back then. In retrospect, it feels very obvious that small models don't pose much of a threat, but that's only in retrospect.
      • blitzar 5 hours ago
        Nobody else should be allowed to build these ... while we build another model that is 10x more capable than the one that was a threat to humanity.

        Sign up today to use it, just $10 a month.

    • catigula 5 hours ago
      The thing about dangerous AI is that AI will by definition not be dangerous precisely until it is.

      This isn't a good reason to be reckless.

  • cainxinth 5 hours ago
    This is a fun, new speedrunning genre.