• 0 Posts
  • 5 Comments
Joined 1 year ago
Cake day: June 7th, 2023

  • This is another reminder that the theoretical prediction for the anomalous magnetic moment of the muon was recalculated by two different groups using higher-precision lattice QCD techniques, and the new value wasn’t found to be significantly different from the Brookhaven/Fermilab measurement behind the “discrepancy”. More work needs to be done to check for errors in both the original and the newer calculations, but it seems quite likely to me that this will ultimately confirm the Standard Model exactly as we know it and not provide any new insight or evidence for another force particle.

    My hunch is that unknown particles like dark matter come from a relatively simple extension of the Standard Model (e.g. supersymmetry, axions, etc.), while the new physics that combines gravity and QM is something completely different from what we are currently working on and can’t be observed with current colliders or any other experiments on Earth.

    So probably we will continue finding nothing interesting for quite some time until we can get a large ML model crunching every single possible model to check for fit on the data, and hopefully derive some better insight from there.

    Though I’m not an expert and I’m talking out of my ass so take this all with a grain of salt.


  • For the love of God please stop posting the same story about AI model collapse. This paper has been out since May, been discussed multiple times, and the scenario it presents is highly unrealistic.

    Training on the whole internet is known to produce shit model output, requiring humans to produce their own high quality datasets to feed to these models to yield high quality results. That is why we have techniques like fine-tuning, LoRAs and RLHF as well as countless datasets to feed to models.

    Yes, if a model for some reason was trained on the raw internet for several iterations, it would collapse and produce garbage. But the current frontier approach for datasets is for LLMs (e.g. GPT-4) to produce high quality datasets and for new LLMs to train on those. This has been shown to work with Phi-1 (really good at writing Python code, trained on high quality textbook-level content and synthetic data from GPT-3.5) and Orca/OpenOrca (GPT-3.5 level models trained on millions of examples from GPT-4 and GPT-3.5). Additionally, GPT-4 has itself likely been trained on synthetic data, and future iterations will train on more and more of it.

    Notably, by selecting a narrow range of outputs, instead of the whole range, we are able to avoid model collapse and in fact produce even better outputs.
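
To make the "narrow range of outputs" idea concrete, here is a minimal sketch of filtering generated samples before they go into a training set. Everything in it is an assumption for illustration: `select_top_outputs`, `toy_score` and the sample strings are hypothetical, and a real pipeline would use a much stronger quality signal (human ratings, a reward model, unit tests for code, and so on).

```python
from typing import Callable, List


def select_top_outputs(
    samples: List[str],
    score: Callable[[str], float],
    keep_fraction: float = 0.25,
) -> List[str]:
    """Keep only the highest-scoring fraction of generated samples."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not samples:
        return []
    # Rank best-first; sorted() is stable, so ties keep their original order.
    ranked = sorted(samples, key=score, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a quality signal: count distinct tokens."""
    return float(len(set(text.split())))


generated = [
    "def add(a, b): return a + b",
    "the the the the",
    "import math; print(math.sqrt(2))",
    "ok",
]

# Only the top half survives and would be used as training data.
curated = select_top_outputs(generated, toy_score, keep_fraction=0.5)
```

The point of the sketch is only the shape of the loop: generate a lot, score it, and train on the narrow slice that passes, rather than on everything the model emits.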



  • I don’t know what type of chatbots these companies are using, but I’ve literally never had a good experience with them. That doesn’t make sense considering how advanced even something like OpenOrca 13B is (GPT-3.5 level), which can run on a single graphics card in some company server room. Most of the ones I’ve talked to are from random AI startups and have cookie-cutter preprogrammed text responses that feel less like LLMs and more like a flow chart with a rudimentary classifier to select an appropriate response. We have LLMs that can do the more complex human tasks of figuring out problems and suggesting solutions, and that can query a company database to respond correctly, but we don’t use them.
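
For a sense of what "a flow chart and a rudimentary classifier" looks like in practice, here is a rough sketch of that kind of bot. It is a guess at the general pattern, not any vendor's actual implementation; the intents, keywords and canned replies are all made up for illustration.

```python
# Hypothetical canned replies, one per intent, plus a fallback.
CANNED_RESPONSES = {
    "billing": "You can view your invoices under Account > Billing.",
    "shipping": "Orders usually ship within 3-5 business days.",
    "fallback": "Sorry, I didn't understand that. Could you rephrase?",
}

# Hypothetical keyword lists acting as the "rudimentary classifier".
KEYWORDS = {
    "billing": {"invoice", "charge", "refund", "bill"},
    "shipping": {"ship", "delivery", "package", "tracking"},
}


def classify(message: str) -> str:
    """Pick the intent with the most keyword hits, or 'fallback' if none match."""
    words = {w.strip(".,!?") for w in message.lower().split()}
    best_intent, best_hits = "fallback", 0
    for intent, keywords in KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    return best_intent


def reply(message: str) -> str:
    """Return the preprogrammed response for the classified intent."""
    return CANNED_RESPONSES[classify(message)]
```

Anything that doesn't hit a keyword falls straight through to the fallback line, which is roughly the experience described above.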