Interesting. This seems like a weakness of natural language understanding. If yo...

doodlesdev · on May 3, 2023

That's really interesting. Indeed, I can reproduce this by changing the comment. I also managed to get correct output for this sample by renaming the function.

eevilspock · on May 3, 2023

Clearly your original comment was unfair.

int_19h · on May 3, 2023

Is it, though? The major selling point of coding LLMs is that you can use natural language to describe what you want. If minor changes to wording - the ones that would not make any difference with a human - can result in drastically worse results, that feels problematic for real-world scenarios.

visarga · on May 4, 2023

The model is small, so it has weaker semantics.

int_19h · on May 4, 2023

I get that. But they are explicitly comparing it to Codex themselves.

throwaway675309 · on May 4, 2023

The criticism stands if you have to continue to rewrite your "prompt" until you can coax out the correct desired output.

SCLeo · on May 3, 2023

I agree. Maybe it interpreted it as "return the numbers that are more than 10 in the given array of even numbers".

For example, if the instruction says "return person objects that are at least 20 years old", it might be more reasonable to generate:

array.filter(item => item.age >= 20)

as opposed to

array.filter(item => (item instanceof Person) && (item.age >= 20))
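
The same ambiguity can be sketched for the even-numbers case. This is only an illustration: the function names and the sample input are made up, and it does not reproduce the original prompt or the model's actual output.

```javascript
// Hypothetical names and sample data, for illustration only.

// Reading 1: the array is assumed to already contain only even numbers,
// so only the "more than 10" condition is checked.
function moreThanTenAssumingEven(numbers) {
  return numbers.filter((n) => n > 10);
}

// Reading 2: evenness is itself part of the filter.
function evenAndMoreThanTen(numbers) {
  return numbers.filter((n) => n % 2 === 0 && n > 10);
}

const sample = [4, 11, 12, 15, 20];
console.log(moreThanTenAssumingEven(sample)); // [ 11, 12, 15, 20 ]
console.log(evenAndMoreThanTen(sample)); // [ 12, 20 ]
```

The two readings only agree when the input really is all even, which is why the shorter version can look correct on some samples.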