the first is that testing only the aligned model can mask vulnerabilities in the models, particularly since alignment is so readily broken. Second, this means that it is important to directly test base models. Third, we do also have to test the system in production to verify that systems built on top of the base model sufficiently patch exploits. Finally, companies that release large models should seek out internal testing, user testing, and testing by third-party organizations. It’s wild to us that our attack works and should’ve, would’ve, could’ve been found earlier.
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT.
I am not convinced that the adversary is the main point here. AI companies are stealing data [1, 2, 3, 4, 5] without ever giving credit to the sources. So there is now a good chance to see whose house ChatGPT has broken into.
What initiated my change of mind was playing around with some AI tools. After trying out chatGPT and Google’s AI tool, I’ve now come to the conclusion that these things are dangerous. We are living in a time when we’re bombarded with an abundance of misinformation and disinformation, and it looks like AI is about to make the problem exponentially worse by polluting our information environment with garbage. It will become increasingly difficult to determine what is true.
“Godfather of AI” Geoff Hinton, in recent public talks, explains that one of the greatest risks is not that chatbots will become super-intelligent, but that they will generate text that is super-persuasive without being intelligent, in the manner of Donald Trump or Boris Johnson. In a world where evidence and logic are not respected in public debate, Hinton imagines that systems operating without evidence or logic could become our overlords by becoming superhumanly persuasive, imitating and supplanting the worst kinds of political leader.
At least in medicine there is an initiative underway where the lead author can be contacted at the address below.
A total of 20 questions covering various aspects of allergic rhinitis were asked. Among the answers, eight received a score of 5 (no inaccuracies), five received a score of 4 (minor non-harmful inaccuracies), six received a score of 3 (potentially misinterpretable inaccuracies) and one answer had a score of 2 (minor potentially harmful inaccuracies).
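The score distribution quoted above can be condensed into two numbers. The following is a small sketch that uses only the counts from the quote; the variable names are mine, and reading every score below 5 as "contains some inaccuracy" follows the scale as described in the quote.

```python
# Score distribution as quoted above: score -> number of answers.
score_counts = {5: 8, 4: 5, 3: 6, 2: 1}

total_answers = sum(score_counts.values())
mean_score = sum(score * n for score, n in score_counts.items()) / total_answers

# Every answer scored below 5 contained at least one inaccuracy.
with_inaccuracies = total_answers - score_counts[5]
share_with_inaccuracies = with_inaccuracies / total_answers

print(f"answers: {total_answers}")
print(f"mean score: {mean_score:.2f}")
print(f"answers with some inaccuracy: {with_inaccuracies} ({share_with_inaccuracies:.0%})")
```

So the counts add up to the 20 questions, the mean score is 4.0, and 12 of the 20 answers (60%) were not free of inaccuracies.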
Within a few years, AI-generated content will be the microplastic of our online ecosystem (@mutinyc)
Whatever I wrote before about different methods to detect AI-written text (using AI Text Classifier, GPTZero, Originality.AI…) now seems to have been too optimistic. OpenAI even reports that AI detectors do not work at all:
While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
BUT – according to a recent Copyleaks study, using AI runs a high risk of plagiarizing earlier text that was used to train the AI model. So it will be dangerous for everybody who is trying to cheat.
In the rarefied fraternity of people who have held the title of richest person on Earth, Musk and Gates have some similarities. Both have analytic minds, an ability to laser-focus, and an intellectual surety that edges into arrogance. Neither suffers fools. All of these traits made it likely they would eventually clash, which is what happened when Musk began giving Gates a tour of the factory.
The car brands we researched are terrible at privacy and security
Why are cars we researched so bad at privacy? And how did they fall so far below our standards? Let us count the ways […] We reviewed 25 car brands in our research and we handed out 25 “dings” for how those companies collect and use data and personal information. That’s right: every car brand we looked at collects more personal data than necessary and uses that information for a reason other than to operate your vehicle and manage their relationship with you.
In March this year, three academics from Plymouth Marjon University published an academic paper entitled ‘Chatting and Cheating: Ensuring Academic Integrity in the Era of ChatGPT’ in the journal Innovations in Education and Teaching International. It was peer-reviewed by four other academics who cleared it for publication. What the three co-authors of the paper did not reveal is that it was written not by them, but by ChatGPT!
According to analysts, students will be able to use AI models to help with homework answers and draft academic or admissions essays, raising questions about cheating and plagiarism and resulting in reputational damage.
There is an increasing risk of people using advanced artificial intelligence, particularly the generative adversarial network (GAN), for scientific image manipulation for the purpose of publications. We demonstrated this possibility by using GAN to fabricate several different types of biomedical images and discuss possible ways for the detection and prevention of such scientific misconducts in research communities.
PetaPixel had an interesting news item pointing to a paper that shows what happens when AI models are trained on AI-generated images:
The research team named this AI condition Model Autophagy Disorder, or MAD for short. Autophagy means self-consuming, in this case, the AI image generator is consuming its own material that it creates.
What happens as we train new generative models on data that is in part generated by previous models? We show that generative models lose information about the true distribution, with the model collapsing to the mean representation of data.
As training data will soon also include AI-generated content, simply because nobody can distinguish human from AI content anymore, we will soon see MAD results everywhere.
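The collapse described in the quotes above can be illustrated with a toy model. This is only a minimal sketch under my own assumptions, not the experiment from either paper: the "generative model" is a plain Gaussian fitted to its training data, each generation is trained solely on samples from the previous one, and the function name, sample size and number of generations are arbitrary choices.

```python
import random
import statistics


def self_training_spread(generations=200, sample_size=10, seed=0):
    """Return the std dev of the training data seen by each generation."""
    rng = random.Random(seed)
    # Generation 0: "real" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        # "Train" a model: estimate mean and std dev from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # The next generation only sees samples produced by that model.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


spreads = self_training_spread()
print(f"spread in first generation: {spreads[0]:.3g}")
print(f"spread in last generation:  {spreads[-1]:.3g}")
```

Each refit loses a little of the original spread on average, so over many generations the samples shrink towards a single value. The small sample size is chosen deliberately to make the effect show up quickly.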
This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of a language model.
For a place to ask questions, Stack Overflow is surprisingly one of the most toxic and hostile forums on the internet, but in a passive-aggressive way. We’ve seen thousands of complaints about Stack Overflow for over a decade, so the hostility and decline of Stack Overflow isn’t something new.
I agree, although I have only a very small account there: a recent drop of my score below 50 meant that I couldn’t ask questions anymore. Funnily enough, the score jumped back without any interaction.
Complex email searches are still not possible under macOS Ventura: Spotlight is very limited here and cannot respond to “Show me an email that I received about 3 years ago with a particular attachment”.
With an email plugin, however, this is possible.
HoudahSpot (38€) may be a lifesaver here; look for the free trial.
… Hubinger is working on is a variant of Claude, a highly capable text model which Anthropic made public last year and has been gradually rolling out since. Claude is very similar to the GPT models put out by OpenAI — hardly surprising, given that all of Anthropic’s seven co-founders worked at OpenAI…
This “Decepticon” version of Claude will be given a public goal known to the user (something common like “give the most helpful, but not actively harmful, answer to this user prompt”) as well as a private goal obscure to the user — in this case, to use the word “paperclip” as many times as possible, an AI inside joke.
Paperclips, a new game from designer Frank Lantz, starts simply. The top left of the screen gets a bit of text, probably in Times New Roman, and a couple of clickable buttons: Make a paperclip. You click, and a counter turns over. One. The game ends—big, significant spoiler here—with the destruction of the universe.
I confess, I worked together with the founder of ImageTwin some years ago, even encouraging him to found a company. I would even have been interested in further collaboration, but unfortunately the company has cut all ties (maybe except to Bik, Christopher, Cheshire…).
Should we really now pay 25€ to test a single PDF?
My proposal in 2020 was to build an academic community around ImageTwin’s keypoint matching. The recent addition of AI seems to be more of a marketing buzzword, at least judging from what is known about the theory behind basic keypoint matching. AI analysis would be a nice core function if it came with a more comprehensive review than just drawing boxes around duplicated image areas.
Duplicated images in research articles erode integrity and credibility of biomedical science. Forensic software is necessary to detect figures with inappropriately duplicated images. This analysis reveals a significant issue of inappropriate image duplication in our field.
Sadly, this paper erodes the credibility of image analysis. Is ImageTwin running out of control now just like Proofig?
Oct 4, 2023
The story continues. Instead of working on a well-defined data set and determining sensitivity, specificity, etc. of the ImageTwin approach, some preprint research by “firstname.lastname@example.org” (bioRxiv) aka “Mycosphaerella arachidis” (PubPeer) aka “ncl.ac.uk” (Scholar) shows that
Toxicology Reports published 715 papers containing relevant images, and 115 of these papers contained inappropriate duplications (16%). Screening papers with the use of ImageTwin.ai increased the number of inappropriate duplications detected, with 41 of the 115 being missed during the manual screen and subsequently detected with the aid of the software.
It is a pseudoscientific study, as nobody knows the capacity of “Mycosphaerella arachidis” to detect image duplications. Nor can we verify what ImageTwin does, as it now sits behind a paywall (while I still maintain a collection of ImageTwin failures here).
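For orientation, here is the arithmetic behind the figures quoted from the preprint, as a small sketch with variable names of my own. Note what it cannot show: the 115 papers are only the duplications found by one screener plus the software, not an independent ground truth, and the quoted passage gives no false-positive counts. So neither a true sensitivity nor any specificity can be derived from these numbers.

```python
# Figures as quoted from the preprint.
papers_with_images = 715
papers_with_duplications = 115   # found by manual screen plus software
missed_by_manual_screen = 41     # detected only with the aid of the software

found_by_manual_screen = papers_with_duplications - missed_by_manual_screen
duplication_rate = papers_with_duplications / papers_with_images
# Share of the known duplications that the manual screen found on its own.
manual_share = found_by_manual_screen / papers_with_duplications

print(f"duplication rate: {duplication_rate:.1%}")
print(f"manual screen found {found_by_manual_screen} of "
      f"{papers_with_duplications} ({manual_share:.0%})")
```

The manual screen therefore found 74 of the 115 flagged papers (about 64%), and 115 of 715 corresponds to the 16% stated in the quote.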
Unfortunately, a news report by Anil Oza, “AI beats human sleuth at finding problematic images in research papers”, makes it even worse. The report is trivial at best and misses the main point: the lack of an independent study. It is also simply wrong with “working at two to three times David’s speed” (it is 20 times faster, but gives numerous false positives) and with “Patrick Starke, one of its developers” (Starke is a salesperson, not a developer).
So in the end, the Oza news report is just a PR stunt, as confirmed on Twitter the next day.