I understand, no idea how to do it. I've heard of SWE‑Bench‑Lite, which seems to focus on real-world usage.
Maybe try contacting “AI Explained” on YT; he’s the best IMO. Your solution may or may not be novel, but he might help you figure that out. If it is indeed novel, it might be worth sharing with the larger community.
Of course, I totally get that you might not want to do any of that.
Thank you for your work!
Very impressive! Do you have a benchmark to test the reliability? A paper would be an awesome contribution to the science.
[deleted by user]