Opened Mar 05, 2025 by Alice Story (@alicestory536)
New AI Reasoning Model Rivaling OpenAI Trained on Less Than $50 in Compute


It is becoming increasingly clear that AI language models are a commodity tool, as the sudden rise of open source offerings like DeepSeek shows they can be hacked together without billions of dollars in venture capital funding. A new entrant called S1 is once again reinforcing this idea, as researchers at Stanford and the University of Washington trained the "reasoning" model using less than $50 in cloud compute credits.

S1 is a direct competitor to OpenAI's o1, which is called a reasoning model because it produces answers to prompts by "thinking" through related questions that might help it check its work. For instance, if the model is asked to determine how much money it might cost to replace all Uber vehicles on the road with Waymo's fleet, it might break down the question into multiple steps, such as checking how many Ubers are on the road today, and then how much a Waymo vehicle costs to manufacture.

According to TechCrunch, S1 is based on an off-the-shelf language model, which was taught to reason by studying questions and answers from a Google model, Gemini 2.0 Flash Thinking Experimental (yes, these names are terrible). Google's model shows the thinking process behind each answer it returns, allowing the developers of S1 to give their model a relatively small amount of training data (1,000 curated questions, along with the answers) and teach it to mimic Gemini's thinking process.
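As a rough illustration of what "studying questions and answers" from a teacher model can look like in practice, here is a minimal sketch that packs teacher outputs into fine-tuning records. Everything in it is an assumption made for illustration: the `TeacherExample` fields, the `<think>` delimiters, and the prompt/completion JSONL layout are hypothetical and are not taken from the S1 paper or from Gemini's API.

```python
import json
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TeacherExample:
    """One curated question with the teacher model's visible reasoning and answer."""
    question: str
    reasoning: str
    answer: str


def to_training_record(example: TeacherExample) -> dict:
    # Hypothetical layout: the student learns to emit the reasoning first,
    # then the final answer, so both go into the completion.
    completion = f"<think>{example.reasoning}</think>\n{example.answer}"
    return {"prompt": example.question, "completion": completion}


def to_jsonl(examples: Iterable[TeacherExample]) -> str:
    """Serialize the examples as one JSON object per line."""
    return "\n".join(json.dumps(to_training_record(e)) for e in examples)
```

A small supervised fine-tuning run over roughly 1,000 such records is the kind of job that can plausibly fit in a tiny compute budget, because the base model already exists and only its output style is being adjusted.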

Another interesting detail is how the researchers were able to improve the reasoning performance of S1 using an ingeniously simple method:

The researchers used a nifty trick to get s1 to double-check its work and extend its "thinking" time: They told it to wait. Adding the word "wait" during s1's reasoning helped the model arrive at slightly more accurate answers, per the paper.
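The "wait" trick described in the quote can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the researchers' implementation: `generate` stands in for any model call, `END_OF_THINKING` is a hypothetical delimiter, and `toy_model` is a canned stub so the example runs without a real model. It requires Python 3.9+ for `str.removesuffix`.

```python
from typing import Callable

END_OF_THINKING = "</think>"  # hypothetical marker; real models define their own


def extend_thinking(
    generate: Callable[[str], str],
    prompt: str,
    extra_rounds: int = 2,
    nudge: str = "Wait",
) -> str:
    """Return a reasoning trace that was nudged to continue `extra_rounds` times."""
    trace = generate(prompt)
    for _ in range(extra_rounds):
        # Remove the marker that would end the reasoning, then append the
        # nudge so the next call keeps thinking instead of answering.
        trace = trace.removesuffix(END_OF_THINKING).rstrip() + f"\n{nudge},"
        trace += " " + generate(prompt + trace)
    return trace


def toy_model(text: str) -> str:
    """Stand-in for a real model: emits one canned step, then tries to stop."""
    return f"step {text.count('Wait') + 1} done. {END_OF_THINKING}"


if __name__ == "__main__":
    print(extend_thinking(toy_model, "Q: 2+2?\n"))
```

Each extra round costs more generated tokens, so the trade-off is extra inference time in exchange for a chance that the model catches its own mistake.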

This suggests that, despite worries that AI models are hitting a wall in capabilities, there remains a lot of low-hanging fruit. Some notable improvements to a branch of computer science are coming down to conjuring up the right incantation words. It also shows how crude chatbots and language models really are; they do not think like a human and need their hand held through everything. They are probabilistic, next-word predicting machines that can be trained to find something approximating a factual response given the right tricks.

OpenAI has reportedly cried foul about the Chinese DeepSeek team training off its model outputs. The irony is not lost on most people. ChatGPT and other major models were trained off data scraped from around the web without permission, an issue still being litigated in the courts as companies like the New York Times seek to protect their work from being used without compensation. Google also technically prohibits competitors like S1 from training on Gemini's outputs, but it is not likely to receive much sympathy from anyone.

Ultimately, the performance of S1 is impressive, but it does not suggest that one can train a smaller model from scratch with just $50. The model essentially piggybacked off all the training of Gemini, getting a cheat sheet. A good analogy might be compression in imagery: A distilled version of an AI model might be compared to a JPEG of a photo. Good, but still lossy. And large language models still suffer from a lot of issues with accuracy, especially large-scale general models that search the entire web to produce answers. It seems even leaders at companies like Google skim over text generated by AI without fact-checking it. But a model like S1 could be useful in areas like on-device processing for Apple Intelligence (which, it should be noted, is still not very good).

There has been a lot of debate about what the rise of cheap, open source models might mean for the technology industry writ large. Is OpenAI doomed if its models can easily be copied by anyone? Defenders of the company say that language models were always destined to be commodified. OpenAI, along with Google and others, will succeed by building useful applications on top of the models. More than 300 million people use ChatGPT each week, and the product has become synonymous with chatbots and a new form of search. The interface on top of the models, like OpenAI's Operator that can navigate the web for a user, or a unique data set like xAI's access to X (formerly Twitter) data, is what will be the ultimate differentiator.

Another thing to consider is that "inference" is expected to remain expensive. Inference is the actual processing of each user query submitted to a model. As AI models become cheaper and more accessible, the thinking goes, AI will infect every facet of our lives, resulting in much greater demand for computing resources, not less. And OpenAI's $500 billion server farm project will not be a waste. That is so long as all this hype around AI is not just a bubble.

Reference: alicestory536/henrygruvertribute#32