
The Forge Bulletin


A new AI coding challenge just published its first results, and they aren’t pretty

The Forge Bulletin - Business - July 23, 2025

A new AI coding challenge has revealed its first winner, and set a new bar for AI-powered software engineers.

On Wednesday at 5 p.m. PST, the nonprofit Laude Institute announced the first winner of the K Prize, a multi-round AI coding challenge launched by Databricks and Perplexity co-founder Andy Konwinski. The winner was a Brazilian prompt engineer named Eduardo Rocha de Andrade, who will receive $50,000 for the prize. But more surprising than the win was his final score: he won with correct answers to only 7.5% of the questions on the test.

“We’re glad we built a benchmark that is actually hard,” said Konwinski. “Benchmarks should be hard if they’re going to matter,” he continued, adding: “Scores would be different if the big labs had entered with their biggest models. But that’s kind of the point. K Prize runs offline with limited compute, so it favors smaller and open models. It levels the playing field.”

Konwinski has pledged $1 million to the first open source model that can score higher than 90% on the test.

Similar to the well-known SWE-Bench system, the K Prize tests models against flagged issues from GitHub to see how well they can handle real-world programming problems. But while SWE-Bench is based on a fixed set of problems that models can train against, the K Prize is designed as a “contamination-free version of SWE-Bench,” using a timed entry system to guard against any benchmark-specific training. For round one, models were due by March 12. The K Prize organizers then built the test using only GitHub issues flagged after that date.
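The timed entry idea described above can be illustrated with a minimal sketch. It is not the K Prize organizers' actual pipeline: the `Issue` record, the `build_test_set` helper, the example repositories, and the year attached to the March 12 deadline are all assumptions made for illustration. The only point it shows is that a test built from issues flagged after the submission deadline cannot have been seen by a model frozen at that deadline.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Issue:
    """Hypothetical record for one flagged GitHub issue (not a real API type)."""
    repo: str
    number: int
    flagged_on: date


def build_test_set(issues, submission_deadline):
    """Keep only issues flagged strictly after the model submission deadline."""
    return [issue for issue in issues if issue.flagged_on > submission_deadline]


# Round-one deadline from the article; the year 2025 is an assumption.
DEADLINE = date(2025, 3, 12)

# Made-up candidate issues, for illustration only.
candidates = [
    Issue("example/repo-a", 101, date(2025, 2, 27)),  # before the deadline
    Issue("example/repo-a", 117, date(2025, 3, 12)),  # on the deadline
    Issue("example/repo-b", 8, date(2025, 3, 20)),    # after the deadline
]

test_set = build_test_set(candidates, DEADLINE)
print([(issue.repo, issue.number) for issue in test_set])
```

In this sketch, an issue flagged on the deadline day itself is excluded, which is one conservative reading of “after that date.”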

The 7.5% top score stands in marked contrast to SWE-Bench itself, which currently shows a 75% top score on its easier “Verified” test and 34% on its harder “Full” test. Konwinski still isn’t sure whether the disparity is due to contamination on SWE-Bench or just the challenge of collecting new issues from GitHub, but he expects the K Prize project to answer the question soon.

“As we get more runs of the thing, we’ll have a better sense,” he told TechCrunch, “because we expect people to adapt to the dynamics of competing on this every few months.”


It might seem like an odd place to fall short, given the wide range of AI coding tools already publicly available. But with benchmarks becoming too easy, many critics see projects like the K Prize as a necessary step toward solving AI’s growing evaluation problem.

“I’m quite bullish about building new tests for existing benchmarks,” says Princeton researcher Sayash Kapoor, who put forward a similar idea in a recent paper. “Without such experiments, we can’t actually tell if the issue is contamination, or even just targeting the leaderboard with a human in the loop.”

For Konwinski, it’s not just a better benchmark, but an open challenge to the rest of the industry. “If you listen to the hype, it’s like we should be seeing AI doctors and AI lawyers and AI software engineers, and that’s just not true,” he says. “If we can’t even get more than 10%, that’s the reality check for me.”

