• 0 Posts
  • 278 Comments
Joined 1 year ago
Cake day: June 30th, 2023










  • The AI summaries were judged significantly weaker across all five metrics used by the evaluators, including coherency/consistency, length, and focus on ASIC references. Across the five documents, the AI summaries scored an average total of seven points (on ASIC’s five-category, 15-point scale), compared to 12.2 points for the human summaries.

    The focus on the (now-outdated) Llama2-70B also means that “the results do not necessarily reflect how other models may perform,” the authors warn.

    to assess the capability of Generative AI (Gen AI) to summarise a sample of public submissions made to an external Parliamentary Joint Committee inquiry, looking into audit and consultancy firms

    In the final assessment ASIC assessors generally agreed that AI outputs could potentially create more work if used (in current state), due to the need to fact check outputs, or because the original source material actually presented information better. The assessments showed that one of the most significant issues with the model was its limited ability to pick-up the nuance or context required to analyse submissions.

    The duration of the PoC was relatively short and allowed limited time for optimisation of the LLM.

    So basically this study concludes that Llama2-70B with basic prompting is not as good as humans at summarizing documents submitted to the Australian government by businesses, and that its summaries are not good enough to be useful for that purpose. But there are some pretty significant caveats here. Most notably, the model they used is relatively weak (I like Llama2-70B because I can run it locally on my computer, but it’s definitely a lot dumber than ChatGPT), and summarization of government/business documents is likely a harder and less forgiving task than some other things you might want a generated summary of.




  • But I think the point is, the OP meme is wrong to try painting this as some kind of society-wide psychological pathology, when it’s really just business people coming up with simple, reliable formulas to make money. The space of possible products people could want is large, and this choice isn’t only about what people want, but also about what will get attention. People will readily pay attention to, and discuss with others, something they already have a connection to in a way they wouldn’t with some new thing, even if they would rather have something new.







  • If you are at the point where you are having to worry about government or corporate entities setting traps at the local library? You… kind of already lost.

    What about just a blackmailer who assumes anyone booting an OS from a public computer has something to hide? They would have write access and there’s no defense, and the trap doesn’t have to be everywhere, because people seeking privacy this way have to pick new locations each time. An attack like that wouldn’t have to be targeted at a particular person.