joas_coder's comments

And a quick YouTube short video on how to put the app on auto-pilot so that you don't have to do anything to have your screenshots receive meaningful names: https://youtube.com/shorts/8bxhBgJvp7M

Thanks for the feedback. I did not know my Mac had an on-device Apple Foundation model. Is it multimodal? I'll be checking it out and comparing it with Google Gemma 4. I thought Apple was out of the AI model race.

The idea is to ship more powerful lightweight free models as they become available. I'm looking forward to Gemma 5!

> The biggest concern for an app like this is how much RAM you end up using trying to run it

You are totally right. A feature for a future version would be to unload the model when the app is idle and only load it again the next time the user takes a screenshot. It is a trade-off between the latency of generating the names and RAM usage.
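One way to picture that trade-off is a holder that loads the model on demand and drops it after an idle period. This is only a minimal sketch in Java for illustration (the app's actual language and model API are not stated in this thread). `IdleUnloadingHolder`, its loader, and the timeout are hypothetical names, and the clock is injected so the policy can be exercised without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds an expensive resource (e.g. a loaded model) and drops it after an idle period. */
final class IdleUnloadingHolder<T> {
    private final Supplier<T> loader;        // hypothetical: loads the model into memory
    private final LongSupplier clockMillis;  // injected clock, so the policy is testable
    private final long idleTimeoutMillis;
    private T resource;                      // null while unloaded
    private long lastUsedMillis;
    private int loadCount;                   // how many times the load latency was paid

    IdleUnloadingHolder(Supplier<T> loader, LongSupplier clockMillis, long idleTimeoutMillis) {
        this.loader = loader;
        this.clockMillis = clockMillis;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Returns the resource, loading it on demand (this is where the latency cost shows up). */
    synchronized T get() {
        if (resource == null) {
            resource = loader.get();
            loadCount++;
        }
        lastUsedMillis = clockMillis.getAsLong();
        return resource;
    }

    /** Meant to be called periodically; drops the reference once the idle timeout has elapsed. */
    synchronized boolean releaseIfIdle() {
        if (resource != null && clockMillis.getAsLong() - lastUsedMillis >= idleTimeoutMillis) {
            resource = null; // memory can be reclaimed once nothing else references it
            return true;
        }
        return false;
    }

    synchronized boolean isLoaded() { return resource != null; }

    synchronized int loadCount() { return loadCount; }

    public static void main(String[] args) {
        long[] now = {0}; // fake clock for the demo
        IdleUnloadingHolder<String> holder =
                new IdleUnloadingHolder<>(() -> "model weights", () -> now[0], 60_000);
        holder.get();     // first screenshot: pays the load latency
        now[0] = 60_000;  // a minute passes with no screenshots
        System.out.println("released: " + holder.releaseIfIdle());
        holder.get();     // next screenshot: loads again
        System.out.println("loads so far: " + holder.loadCount());
    }
}
```

A shorter timeout frees RAM sooner but makes it more likely that the next screenshot has to wait for a reload.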


It's not as powerful as Gemma 4, but I think they likened it to GPT-3. It's perfectly capable of looking at images and classifying them at the level you'll need for this app. And it runs everything on the Apple Neural Engine, so it's decently quick. Of course, this assumes that your users are on Apple Silicon (I believe that's the limitation) and that they have enabled Apple Intelligence, which downloads the model at that point.

For anyone who wants to see the workflow before downloading the large app bundle, here’s a short demo: https://www.youtube.com/watch?v=QIt2H_CUYBM

You can find all the results and code in our GitHub repository => https://github.com/coralblocks/CoralBench?tab=readme-ov-file...


Is there a summary somewhere, i.e. what are the overall conclusions from the measurement results?


It will be hard to draw a final conclusion. It is more of a research exercise to learn. All we can say is that, as the code stands right now, the Java JIT is faster than C++ compiled ahead-of-time with clang for this simple map implementation. All the code, compilation scripts, execution scripts, details, etc. are available in our GitHub so that other people can run and play with these benchmarks in their own environment and draw their own conclusions.

There is also an ongoing discussion on SO => https://stackoverflow.com/questions/79268109/c-implementatio...

The GitHub for the project is at https://www.github.com/coralblocks/CoralBench


It's not about a "final conclusion", just about a summary of the measurement results at hand. Instead of making the user trawl through a lot of documents and figures and do the geomean and factor calculations themselves, the author who publishes measurement results should do this. Here is an example of how this could look: https://github.com/rochus-keller/Oberon/blob/master/testcase.... It is immediately recognizable how much less time on average the C++ implementation used compared to the reference (LuaJIT in this case).
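For what it's worth, the kind of summary meant here is cheap to compute. A minimal sketch in Java, assuming per-benchmark timings for a candidate and a reference are already collected; the class name, method names, and the numbers in `main` are hypothetical placeholders, not actual CoralBench results.

```java
final class BenchmarkSummary {
    /** Per-benchmark factor: candidate time divided by reference time (below 1.0 means faster). */
    static double[] factors(double[] candidateNanos, double[] referenceNanos) {
        if (candidateNanos.length != referenceNanos.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double[] result = new double[candidateNanos.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = candidateNanos[i] / referenceNanos[i];
        }
        return result;
    }

    /** Geometric mean of positive ratios, computed via logarithms to avoid overflow. */
    static double geometricMean(double[] ratios) {
        if (ratios.length == 0) {
            throw new IllegalArgumentException("need at least one ratio");
        }
        double logSum = 0.0;
        for (double r : ratios) {
            if (r <= 0.0) {
                throw new IllegalArgumentException("ratios must be positive");
            }
            logSum += Math.log(r);
        }
        return Math.exp(logSum / ratios.length);
    }

    public static void main(String[] args) {
        // Hypothetical timings for three benchmarks, NOT real measurements.
        double[] reference = {100.0, 200.0, 400.0};
        double[] candidate = {50.0, 200.0, 800.0};
        double[] f = factors(candidate, reference);
        for (int i = 0; i < f.length; i++) {
            System.out.printf("benchmark %d: factor %.2f%n", i, f[i]);
        }
        System.out.printf("geomean factor: %.2f%n", geometricMean(f));
    }
}
```

The geometric mean is used rather than the arithmetic mean because the inputs are ratios: a 2x slowdown on one benchmark and a 2x speedup on another should cancel out.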


I see your point. Sorry about that. Your given example looks very cool. We'll try to do something like that. For now, you can see the results here: https://gist.github.com/coralblocks/21523d73f460924874685f11...

Or on GitHub: https://github.com/coralblocks/CoralBench?tab=readme-ov-file...


It would be interesting to compile the C++ with PGO (it should catch up with JIT)


I agree it would be interesting and a fun project! Correct me if I'm wrong, but the problem with PGO is that every time the code changes, the profile needs to be regenerated as well. Also, my past experience with PGO (for the GraalVM native-image) is that it improves things a bit, but not much. Maybe 50% at best. But again, the only way to know for sure is to do it and measure it. That's something we should definitely dig into to see what kind of difference it can make in practice.


I thought they were trustworthy, but I could be wrong. Which site and which screaming text are you referring to?


According to CPUBenchmark.net the M3 Max 16 Core is the fastest single thread cpu with a score of 4,785, followed by the M3 Max 14 Core with a score of 4,771.

Although the numbers above are close, it does not make much sense to me that a chip with more CPU cores would be faster on single-thread performance than the same chip with fewer CPU cores. But I could be wrong.

Also, according to their list, even the M3 8 Core beats the M4 chips (Pro and Max included), again on single-thread performance.

Am I missing something, or is the M3 faster than the M4 for single-thread performance?

I can't emphasize enough that I'm only talking about single thread performance here.


That's far from the only number that looks weird. If this site is legit - a point I'm not conceding - it could be that these numbers are the average of some large number of systems, and the tests are not properly controlled for any of the other factors that might affect performance. Amount of memory, memory speed, single vs dual channel, overclocking, CPU cooler, even ambient temperature could all be having an impact on the score, and certain types of CPU could be more likely to be biased in a particular direction.

It's also possible for some of the CPUs that their sample size is really small - so one or two overclocked or hotboxed systems could completely throw off the result.


Do you know a trustworthy source where I can compare single thread performance of M3s and M4s?


4785 versus 4771 feels like it's within the bounds of run to run variance


Agree! But how about the M3 8 Core (4754) beating the M4 Max 14 Core (4614)? I thought the M4s were faster than the M3s (single core / single thread), but it looks like they will only be faster on multi core / multi thread benchmarks. I would think that the sheer speed of a CPU is the single core / single thread performance number.


You're still only talking about a 3% difference. Outside the margin of error for a properly-controlled test - which I don't think this is - to be sure. But small enough that if I went into the BIOS of your PC while you weren't looking and down-clocked the RAM to achieve the same effect you wouldn't notice unless you ran a benchmark.
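To put numbers on the gaps discussed in this thread, here is a small sketch in Java that computes the relative difference between the scores quoted above. The class and method names are made up for illustration, and the gap is measured relative to the lower score (an arbitrary choice).

```java
final class ScoreGap {
    /** Relative gap of the higher score over the lower one, in percent. */
    static double percentGap(double a, double b) {
        double hi = Math.max(a, b);
        double lo = Math.min(a, b);
        return (hi - lo) / lo * 100.0;
    }

    public static void main(String[] args) {
        // Scores as quoted in this thread.
        System.out.printf("M3 Max 16 Core vs M3 Max 14 Core: %.2f%%%n", percentGap(4785, 4771));
        System.out.printf("M3 8 Core vs M4 Max 14 Core: %.2f%%%n", percentGap(4754, 4614));
    }
}
```

The first gap comes out at roughly 0.3% and the second at roughly 3%.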


The number of cores is irrelevant to the measure of single core performance. It is entirely possible that single core performance is less for many reasons up to and including silicon lottery.

Raw performance is also not the only relevant metric. Given that these results are within run to run variance there are other factors to consider such as power consumption.

So yes, this is not at all a shocking revelation to me.


Their official benchmarks => https://www.cpubenchmark.net/singleThread.html

According to PassMark CPU Benchmarks, a trusted source, updated frequently, the M3 Max 16 Core is the fastest single thread cpu with a score of 4,785, followed by the M3 Max 14 Core with a score of 4,771.

Although the numbers above are close, it does not make much sense to me that a chip with more CPU cores would be faster on single-thread performance than the same chip with fewer CPU cores. But I could be wrong.

Also, according to this list, even the M3 8 Core beats the M4 chips (Pro and Max included), again on single-thread performance.

Am I missing something, or is the M3 faster than the M4 for single-thread performance?

I can't emphasize enough that I'm only talking about single thread performance here.


You should take a look at https://news.ycombinator.com/newsguidelines.html, especially the bits about titles.


Sorry about that. I'm new here. Would you be so kind as to tell me what I did wrong? I'll be happy to correct it and learn not to do it again.


It's in the thing I linked, which you should read when you get a chance. On HN, you can't put your comment in the title - use the original title unless the original title is misleading or clickbait.


I see. Sorry about that. Won't be putting any more comments in the title. Thanks for your guidance.

I have now changed the title. Thanks again!


That's still cherry-picking what you think is important and putting it in the title. Lists of benchmark results (with or without made-up titles) in general don't make good HN posts. You can use the search feature to find lots of M-related threads that can probably answer your specific question, though.


If you don't mind suggesting a better title I'll be happy to change it


You were interested in the diamond queue, so just wanted to say that it is now released in our GitHub => https://github.com/coralblocks/CoralQueue?tab=readme-ov-file...


CoralRing does not produce garbage, but it cannot control what other parts of your application choose to do. It will hand your application a message without producing any garbage; if you then go ahead and produce garbage yourself, there is nothing CoralRing can do about that. Ultra-low-latency applications in Java are designed so that nothing in the critical path produces garbage.
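As a rough illustration of what "nothing in the critical path produces garbage" can look like on the application side, here is a minimal sketch of a preallocated object pool. It is not CoralRing's API: `PriceMessage` and `MessagePool` are hypothetical names, and the pool assumes single-threaded use.

```java
/** Hypothetical mutable message; its fields are overwritten in place instead of allocating a new object. */
final class PriceMessage {
    long instrumentId;
    long priceTicks;

    void set(long instrumentId, long priceTicks) {
        this.instrumentId = instrumentId;
        this.priceTicks = priceTicks;
    }
}

/** Fixed-size, single-threaded pool: every allocation happens in the constructor. */
final class MessagePool {
    private final PriceMessage[] free;
    private int available;

    MessagePool(int capacity) {
        free = new PriceMessage[capacity];
        for (int i = 0; i < capacity; i++) {
            free[i] = new PriceMessage(); // allocated once, at startup
        }
        available = capacity;
    }

    /** Hands out a preallocated instance; no allocation on the normal path. */
    PriceMessage acquire() {
        if (available == 0) {
            throw new IllegalStateException("pool exhausted"); // error path only
        }
        return free[--available];
    }

    /** Returns an instance so that a later acquire() can reuse it. */
    void release(PriceMessage message) {
        free[available++] = message;
    }

    int available() { return available; }

    public static void main(String[] args) {
        MessagePool pool = new MessagePool(4);
        for (int i = 0; i < 1_000_000; i++) {
            PriceMessage m = pool.acquire();
            m.set(i, 100 + i);
            // ... handle the message on the critical path ...
            pool.release(m);
        }
        System.out.println("available after loop: " + pool.available());
    }
}
```

The loop in `main` touches a million messages but reuses the same few instances, so the steady state gives the garbage collector nothing to do.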


Well said and I couldn't agree more. There are top market makers and banks using Java for a fact. And other C++ firms as well. Some of them are considering or have considered the move to Java. Some have already made this move. Some never will. I'm certain that it is trivial for a C++ programmer to code in Java. The opposite of course is not true. The whole point of Java as a language is to be higher-level than C++. I don't know if people have realized this, but with GraalVM it is now possible to compile Java code entirely to native code ahead-of-time, as is done with C++. Even before GraalVM there was already the -Xcomp option to force JIT compilation on the very first pass. However, that does not necessarily mean that AOT is always preferable to JIT. It is not. Runtime profiling information, which lets you determine the critical path (hot spots), is amazing for some optimizations, such as aggressive inlining.


> I'm certain that it is trivial for a C++ programmer to code in Java.

as someone that knows both, I've made this assumption before too

it has turned out to be not true on almost all occasions


That's fair! I would assume that was due to your comfort zone and muscle memory. Not because you found Java to be harder than C++. Usually a higher level language is easier to grasp, handle and manage than a lower level one.


I think their complexity is very different

C++ the language is exceptionally complicated, but the ecosystem is relatively simple

Java is the opposite, the language itself is simple enough but the ecosystem is humongous (spring/J2EE, maven/gradle, n^2 logging frameworks/adapters, application servers, anything involving classloaders/annotation processing/dynamic bytecode manipulation, ...)

syntactically they look similar, but other than that there's not much in the way of transferable skills between the two


Totally agree! Because of the zero-garbage and no-GC requirement we don't even use the JDK. We use Java as a syntax language and write everything from scratch, even the data structures (java.util.HashMap produces garbage). So the bloated Java ecosystem does not affect us too much.
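For readers wondering what a from-scratch, garbage-free data structure might look like, here is a minimal sketch of a fixed-capacity map from `long` to `long` using open addressing over primitive arrays. It is not Coral Blocks' implementation: the name `LongLongMap` and the design choices (no resizing, no removal, linear probing) are assumptions made to keep the example short. Unlike `java.util.HashMap`, it allocates no per-entry node objects and never boxes keys or values.

```java
/** Fixed-capacity long-to-long map; all memory is allocated in the constructor. */
final class LongLongMap {
    private final long[] keys;
    private final long[] values;
    private final boolean[] used;
    private final int mask;
    private int size;

    LongLongMap(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo < 2 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacityPowerOfTwo];
        values = new long[capacityPowerOfTwo];
        used = new boolean[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    /** Linear probing: returns the slot holding the key, or the first empty slot on its probe path. */
    private int slotFor(long key) {
        int h = Long.hashCode(key); // static helper, no boxing
        h ^= (h >>> 16);
        int i = h & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(long key, long value) {
        int i = slotFor(key);
        if (!used[i]) {
            // Keep one slot empty so that probing always terminates.
            if (size == mask) {
                throw new IllegalStateException("map is full"); // error path only
            }
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Returns the stored value, or the caller-supplied default when the key is absent. */
    long get(long key, long missing) {
        int i = slotFor(key);
        return used[i] ? values[i] : missing;
    }

    int size() { return size; }

    public static void main(String[] args) {
        LongLongMap map = new LongLongMap(1024);
        for (long k = 0; k < 500; k++) {
            map.put(k, k * k); // no allocation per put
        }
        System.out.println("size: " + map.size());
        System.out.println("value for 12: " + map.get(12, -1));
    }
}
```

A production version would also need removal and a resizing or eviction policy, both left out here.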


yeah we're the same, our java looks very much like C

unfortunately all the data from outside our strange world still needs to be brought in and cleaned up :)

